- Description: When SQLDatabase.from_databricks is ran from a Databricks
Workflow job, line 205 (default_host = context.browserHostName) throws
an ``AttributeError`` as the ``context`` object has no
``browserHostName`` attribute. The fix handles the exception and sets
the ``default_host`` variable to null
---------
Co-authored-by: lmorosdb <lmorosdb>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
**Description:** At the moment neo4j wrapper is using setVectorProperty,
which is deprecated
([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)).
I replaced with the non-deprecated version.
Neo4j recently introduced a new cypher method to associate embeddings
into relations using "setRelationshipVectorProperty" method. In this PR
I also implemented a new method to perform this association maintaining
the same format used in the "add_embeddings" method which is used to
associate embeddings into Nodes.
I also included a test case for this new method.
Thank you for contributing to LangChain!
- [X] *ApertureDB as vectorstore**: "community: Add ApertureDB as a
vectorestore"
- **Description:** this change provides a new community integration that
uses ApertureData's ApertureDB as a vector store.
- **Issue:** none
- **Dependencies:** depends on ApertureDB Python SDK
- **Twitter handle:** ApertureData
- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
Integration tests rely on a local run of a public docker image.
Example notebook additionally relies on a local Ollama server.
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
All lint tests pass.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Gautam <gautam@aperturedata.io>
On using TavilySearchAPIRetriever with any conversation chain getting
error :
`TypeError: Client.__init__() got an unexpected keyword argument
'api_key'`
It is because the retreiver class is using the depreciated `Client`
class, `TavilyClient` need to be used instead.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
**Description:**
Databricks Vector Search recently added support for hybrid
keyword-similarity search.
See [usage
examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint)
from their documentation.
This PR updates the Langchain vectorstore interface for Databricks to
enable the user to pass the *query_type* parameter to
*similarity_search* to make use of this functionality.
By default, there will not be any changes for existing users of this
interface. To use the new hybrid search feature, it is now possible to
do
```python
# ...
dvs = DatabricksVectorSearch(index)
dvs.similarity_search("my search query", query_type="HYBRID")
```
Or using the retriever:
```python
retriever = dvs.as_retriever(
search_kwargs={
"query_type": "HYBRID",
}
)
retriever.invoke("my search query")
```
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
You.com is releasing two new conversational APIs — Smart and Research.
This PR:
- integrates those APIs with Langchain, as an LLM
- streaming is supported
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:** This pull request introduces two new methods to the
Langchain Chroma partner package that enable similarity search based on
image embeddings. These methods enhance the package's functionality by
allowing users to search for images similar to a given image URI. Also
introduces a notebook to demonstrate it's use.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** @mrugank9009
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
In some lines its trying to read a key that do not exists yet. In this
cases I changed the direct access to dict.get() method
Thank you for contributing to LangChain!
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
**Description:** Spell check fixes for docs, comments, and a couple of
strings. No code change e.g. variable names.
**Issue:** none
**Dependencies:** none
**Twitter handle:** hmartin
Thank you for contributing to LangChain!
- [ ] **HuggingFaceEndpoint**: "Skip Login to HuggingFaceHub"
- Where: langchain, community, llm, huggingface_endpoint
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Skip login to huggingface hub when when
`huggingfacehub_api_token` is not set. This is needed when using custom
`endpoint_url` outside of HuggingFaceHub.
- **Issue:** the issue # it fixes
https://github.com/langchain-ai/langchain/issues/20342 and
https://github.com/langchain-ai/langchain/issues/19685
- **Dependencies:** None
- [ ] **Add tests and docs**:
1. Tested with locally available TGI endpoint
2. Example Usage
```python
from langchain_community.llms import HuggingFaceEndpoint
llm = HuggingFaceEndpoint(
endpoint_url='http://localhost:8080',
server_kwargs={
"headers": {"Content-Type": "application/json"}
}
)
resp = llm.invoke("Tell me a joke")
print(resp)
```
Also tested against HF Endpoints
```python
from langchain_community.llms import HuggingFaceEndpoint
huggingfacehub_api_token = "hf_xyz"
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceEndpoint(
huggingfacehub_api_token=huggingfacehub_api_token,
repo_id=repo_id,
)
resp = llm.invoke("Tell me a joke")
print(resp)
```
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
If you use `refresh_schema=False`, then the metadata constraint doesn't
exist. ATM, we used default `None` in the constraint check, but then
`any` fails because it can't iterate over None value
Thank you for contributing to LangChain!
**Description**:
This PR fixes a bug described in the issue in #24064, when using the
AzureSearch Vectorstore with the asyncronous methods to do search which
is also the method used for the retriever. The proposed change includes
just change the access of the embedding as optional because is it not
used anywhere to retrieve documents. Actually, the syncronous methods of
retrieval do not use the embedding neither.
With this PR the code given by the user in the issue works.
```python
vectorstore = AzureSearch(
azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"),
azure_search_key=os.getenv("AI_SEARCH_API_KEY"),
index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"),
fields=fields,
embedding_function=encoder,
)
retriever = vectorstore.as_retriever(search_type="hybrid", k=2)
await vectorstore.avector_search("what is the capital of France")
await retriever.ainvoke("what is the capital of France")
```
**Issue**:
The Azure Search Vectorstore is not working when searching for documents
with asyncronous methods, as described in issue #24064
**Dependencies**:
There are no extra dependencies required for this change.
---------
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Description: ImagePromptTemplate for Multimodal llms like llava when
using Ollama
Twitter handle: https://x.com/a7ulr
Details:
When using llava models / any ollama multimodal llms and passing images
in the prompt as urls, langchain breaks with this error.
```python
image_url_components = image_url.split(",")
^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'split'
```
From the looks of it, there was bug where the condition did check for a
`url` field in the variable but missed to actually assign it.
This PR fixes ImagePromptTemplate for Multimodal llms like llava when
using Ollama specifically.
@hwchase17
This adds an extractor interface and an implementation for HTML pages.
Extractors are used to create GraphVectorStore Links on loaded content.
**Twitter handle:** cbornet_
**Description:** There was missing some documentation regarding the
`filter` and `params` attributes in similarity search methods.
---------
Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>
This PR moves the in memory implementation to langchain-core.
* The implementation remains importable from langchain-community.
* Supporting utilities are marked as private for now.
- **Description:** Support PGVector in PebbloRetrievalQA
- Identity and Semantic Enforcement support for PGVector
- Refactor Vectorstore validation and name check
- Clear the overridden identity and semantic enforcement filters
- **Issue:** NA
- **Dependencies:** NA
- **Tests**: NA(already added)
- **Docs**: Updated
- **Twitter handle:** [@Raj__725](https://twitter.com/Raj__725)
**Description:** Fix for source path mismatch in PebbloSafeLoader. The
fix involves storing the full path in the doc metadata in VectorDB
**Issue:** NA, caught in internal testing
**Dependencies:** NA
**Add tests**: Updated tests
This PR introduces a GraphStore component. GraphStore extends
VectorStore with the concept of links between documents based on
document metadata. This allows linking documents based on a variety of
techniques, including common keywords, explicit links in the content,
and other patterns.
This works with existing Documents, so it’s easy to extend existing
VectorStores to be used as GraphStores. The interface can be implemented
for any Vector Store technology that supports metadata, not only graph
DBs.
When retrieving documents for a given query, the first level of search
is done using classical similarity search. Next, links may be followed
using various traversal strategies to get additional documents. This
allows documents to be retrieved that aren’t directly similar to the
query but contain relevant information.
2 retrieving methods are added to the VectorStore ones :
* traversal_search which gets all linked documents up to a certain depth
* mmr_traversal_search which selects linked documents using an MMR
algorithm to have more diverse results.
If a depth of retrieval of 0 is used, GraphStore is effectively a
VectorStore. It enables an easy transition from a simple VectorStore to
GraphStore by adding links between documents as a second step.
An implementation for Apache Cassandra is also proposed.
See
https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb
for a notebook explaining how to use GraphStore and that shows that it
can answer correctly to questions that a simple VectorStore cannot.
**Twitter handle:** _cbornet
This PR rolls out part of the new proposed interface for vectorstores
(https://github.com/langchain-ai/langchain/pull/23544) to existing store
implementations.
The PR makes the following changes:
1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert
methods to the vectorstore.
2. Updates `add_texts` and `aadd_texts` to be non required with a
default implementation that delegates to `upsert` and `aupsert` if those
have been implemented. The original `add_texts` and `aadd_texts` methods
are problematic as they spread object specific information across
document and **kwargs. (e.g., ids are not a part of the document)
3. Adds a default implementation to `add_documents` and `aadd_documents`
that delegates to `upsert` and `aupsert` respectively.
4. Adds standard unit tests to verify that a given vectorstore
implements a correct read/write API.
A downside of this implementation is that it creates `upsert` with a
very similar signature to `add_documents`.
The reason for introducing `upsert` is to:
* Remove any ambiguities about what information is allowed in `kwargs`.
Specifically kwargs should only be used for information common to all
indexed data. (e.g., indexing timeout).
*Allow inheriting from an anticipated generalized interface for indexing
that will allow indexing `BaseMedia` (i.e., allow making a vectorstore
for images/audio etc.)
`add_documents` can be deprecated in the future in favor of `upsert` to
make sure that users have a single correct way of indexing content.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
The `langchain_common.vectostore.Redis.delete()` must not be a
`@staticmethod`.
With the current implementation, it's not possible to have multiple
instances of Redis vectorstore because all versions must share the
`REDIS_URL`.
It's not conform with the base class.
- **Description:** Enhance JiraAPIWrapper to accept the 'cloud'
parameter through an environment variable. This update allows more
flexibility in configuring the environment for the Jira API.
- **Twitter handle:** Andre_Q_Pereira
---------
Co-authored-by: André Quintino <andre.quintino@tui.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>