In this commit we update the documentation for Google El Carro for Oracle Workloads. We amend the documentation in the Google Providers page to use the correct name which is El Carro for Oracle Workloads. We also add changes to the document_loaders and memory pages to reflect changes we made in our repo.
- **Description**:
[`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for
running LLM on Intel XPU (from Laptop to GPU to Cloud) using
INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR
adds bigdl-llm integrations to langchain.
- **Issue**: NA
- **Dependencies**: `bigdl-llm` library
- **Contribution maintainer**: @shane-huang
Examples added:
- docs/docs/integrations/llms/bigdl.ipynb
Nvidia provider page is missing a Triton Inference Server package
reference.
Changes:
- added the Triton Inference Server reference
- copied the example notebook from the package into the doc files.
- added the Triton Inference Server description and links, the link to
the above example notebook
- formatted page to the consistent format
NOTE:
It seems that the [example
notebook](https://github.com/langchain-ai/langchain/blob/master/libs/partners/nvidia-trt/docs/llms.ipynb)
was originally created in wrong place. It should be in the LangChain
docs
[here](https://github.com/langchain-ai/langchain/tree/master/docs/docs/integrations/llms).
So, I've created a copy of this example. The original example is still
in the nvidia-trt package.
This PR migrates the existing MongoDBAtlasVectorSearch abstraction from
the `langchain_community` section to the partners package section of the
codebase.
- [x] Run the partner package script as advised in the partner-packages
documentation.
- [x] Add Unit Tests
- [x] Migrate Integration Tests
- [x] Refactor `MongoDBAtlasVectorStore` (autogenerated) to
`MongoDBAtlasVectorSearch`
- [x] ~Remove~ deprecate the old `langchain_community` VectorStore
references.
## Additional Callouts
- Implemented the `delete` method
- Included any missing async function implementations
- `amax_marginal_relevance_search_by_vector`
- `adelete`
- Added new Unit Tests that test for functionality of
`MongoDBVectorSearch` methods
- Removed [`del
res[self._embedding_key]`](e0c81e1cb0/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L218))
in `_similarity_search_with_score` function as it would make the
`maximal_marginal_relevance` function fail otherwise. The `Document`
needs to store the embedding key in metadata to work.
Checklist:
- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
- Example: "community: add foobar LLM"
- [x] PR message
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. Existing tests supplied in docs/docs do not change. Updated
docstrings for new functions like `delete`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory. (This already exists)
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
---------
Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
Co-authored-by: Erick Friis <erick@langchain.dev>
This PR adds links to some more free resources for people to get
acquainted with Langhchain without having to configure their system.
<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, hwchase17. -->
Co-authored-by: Filip Schouwenaars <filipsch@users.noreply.github.com>
**Description:**
In this PR, I am adding a `PolygonFinancials` tool, which can be used to
get financials data for a given ticker. The financials data is the
fundamental data that is found in income statements, balance sheets, and
cash flow statements of public US companies.
**Twitter**:
[@virattt](https://twitter.com/virattt)
Several URL-s were broken (in the yesterday PR). Like
[Integrations/platforms/google/Document
Loaders](https://python.langchain.com/docs/integrations/platforms/google#document-loaders)
page, Example link to "Document Loaders / Cloud SQL for PostgreSQL" and
most of the new example links in the Document Loaders, Vectorstores,
Memory sections.
- fixed URL-s (manually verified all example links)
- sorted sections in page to follow the "integrations/components" menu
item order.
- fixed several page titles to fix Navbar item order
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Update to the list of partner packages in the list of
providers
**Issue:** Google & Nvidia had two entries each, both pointing to the
same page
**Dependencies:** None
- **Description:** A generic document loader adapter for SQLAlchemy on
top of LangChain's `SQLDatabaseLoader`.
- **Needed by:** https://github.com/crate-workbench/langchain/pull/1
- **Depends on:** GH-16655
- **Addressed to:** @baskaryan, @cbornet, @eyurtsev
Hi from CrateDB again,
in the same spirit like GH-16243 and GH-16244, this patch breaks out
another commit from https://github.com/crate-workbench/langchain/pull/1,
in order to reduce the size of this patch before submitting it, and to
separate concerns.
To accompany the SQLAlchemy adapter implementation, the patch includes
integration tests for both SQLite and PostgreSQL. Let me know if
corresponding utility resources should be added at different spots.
With kind regards,
Andreas.
### Software Tests
```console
docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up
```
```console
cd libs/community
pip install psycopg2-binary
pytest -vvv tests/integration_tests -k sqldatabase
```
```
14 passed
```
![image](https://github.com/langchain-ai/langchain/assets/453543/42be233c-eb37-4c76-a830-474276e01436)
---------
Co-authored-by: Andreas Motl <andreas.motl@crate.io>
**Description**: This PR adds support for using the [LLMLingua project
](https://github.com/microsoft/LLMLingua) especially the LongLLMLingua
(Enhancing Large Language Model Inference via Prompt Compression) as a
document compressor / transformer.
The LLMLingua project is an interesting project that can greatly improve
RAG system by compressing prompts and contexts while keeping their
semantic relevance.
**Issue**: https://github.com/microsoft/LLMLingua/issues/31
**Dependencies**: [llmlingua](https://pypi.org/project/llmlingua/)
@baskaryan
---------
Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
### Description
This PR moves the Elasticsearch classes to a partners package.
Note that we will not move (and later remove) `ElasticKnnSearch`. It
were previously deprecated.
`ElasticVectorSearch` is going to stay in the community package since it
is used quite a lot still.
Also note that I left the `ElasticsearchTranslator` for self query
untouched because it resides in main `langchain` package.
### Dependencies
There will be another PR that updates the notebooks (potentially pulling
them into the partners package) and templates and removes the classes
from the community package, see
https://github.com/langchain-ai/langchain/pull/17468
#### Open question
How to make the transition smooth for users? Do we move the import
aliases and require people to install `langchain-elasticsearch`? Or do
we remove the import aliases from the `langchain` package all together?
What has worked well for other partner packages?
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description**
Adding different threshold types to the semantic chunker. I’ve had much
better and predictable performance when using standard deviations
instead of percentiles.
![image](https://github.com/langchain-ai/langchain/assets/44395485/066e84a8-460e-4da5-9fa1-4ff79a1941c5)
For all the documents I’ve tried, the distribution of distances look
similar to the above: positively skewed normal distribution. All skews
I’ve seen are less than 1 so that explains why standard deviations
perform well, but I’ve included IQR if anyone wants something more
robust.
Also, using the percentile method backwards, you can declare the number
of clusters and use semantic chunking to get an ‘optimal’ splitting.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:** Update the example fiddler notebook to use community
path, instead of langchain.callback
**Dependencies:** None
**Twitter handle:** @bhalder
Co-authored-by: Barun Halder <barun@fiddler.ai>
I tried to configure MongoDBChatMessageHistory using the code from the
original documentation to store messages based on the passed session_id
in MongoDB. However, this configuration did not take effect, and the
session id in the database remained as 'test_session'. To resolve this
issue, I found that when configuring MongoDBChatMessageHistory, it is
necessary to set session_id=session_id instead of
session_id=test_session.
Issue: DOC: Ineffective Configuration of MongoDBChatMessageHistory for
Custom session_id Storage
previous code:
```python
chain_with_history = RunnableWithMessageHistory(
chain,
lambda session_id: MongoDBChatMessageHistory(
session_id="test_session",
connection_string="mongodb://root:Y181491117cLj@123.56.224.232:27017",
database_name="my_db",
collection_name="chat_histories",
),
input_messages_key="question",
history_messages_key="history",
)
config = {"configurable": {"session_id": "mmm"}}
chain_with_history.invoke({"question": "Hi! I'm bob"}, config)
```
![image](https://github.com/langchain-ai/langchain/assets/83388493/c372f785-1ec1-43f5-8d01-b7cc07b806b7)
Modified code:
```python
chain_with_history = RunnableWithMessageHistory(
chain,
lambda session_id: MongoDBChatMessageHistory(
session_id=session_id, # here is my modify code
connection_string="mongodb://root:Y181491117cLj@123.56.224.232:27017",
database_name="my_db",
collection_name="chat_histories",
),
input_messages_key="question",
history_messages_key="history",
)
config = {"configurable": {"session_id": "mmm"}}
chain_with_history.invoke({"question": "Hi! I'm bob"}, config)
```
Effect after modification (it works):
![image](https://github.com/langchain-ai/langchain/assets/83388493/5776268c-9098-4da3-bf41-52825be5fafb)
**Description:** Update the azure search notebook to have more
descriptive comments, and an option to choose between OpenAI and
AzureOpenAI Embeddings
---------
Co-authored-by: Matt Gotteiner <[email protected]>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Callback handler to integrate fiddler with langchain.
This PR adds the following -
1. `FiddlerCallbackHandler` implementation into langchain/community
2. Example notebook `fiddler.ipynb` for usage documentation
[Internal Tracker : FDL-14305]
**Issue:**
NA
**Dependencies:**
- Installation of langchain-community is unaffected.
- Usage of FiddlerCallbackHandler requires installation of latest
fiddler-client (2.5+)
**Twitter handle:** @fiddlerlabs @behalder
Co-authored-by: Barun Halder <barun@fiddler.ai>
- **Description:** Added the `return_sparql_query` feature to the
`GraphSparqlQAChain` class, allowing users to get the formatted SPARQL
query along with the chain's result.
- **Issue:** NA
- **Dependencies:** None
Note: I've ensured that the PR passes linting and testing by running
make format, make lint, and make test locally.
I have added a test for the integration (which relies on network access)
and I have added an example to the notebook showing its use.
https://github.com/langchain-ai/langchain/issues/17657
Thank you for contributing to LangChain!
Checklist:
- [ ] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
- Example: "community: add foobar LLM"
- [ ] PR message: **Delete this entire template message** and replace it
with the following bulleted list
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [ ] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
**Description:** Initial pull request for Kinetica LLM wrapper
**Issue:** N/A
**Dependencies:** No new dependencies for unit tests. Integration tests
require gpudb, typeguard, and faker
**Twitter handle:** @chad_juliano
Note: There is another pull request for Kinetica vectorstore. Ultimately
we would like to make a partner package but we are starting with a
community contribution.
- **Description:** Update the Azure Search vector store notebook for the
latest version of the SDK
---------
Co-authored-by: Matt Gotteiner <[email protected]>
**Description:** Clean up Google product names and fix document loader
section
**Issue:** NA
**Dependencies:** None
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Update IBM watsonx.ai docs and add IBM as a provider
docs
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
- **Tag maintainer:** :
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally. ✅
**Description:** This PR changes the module import path for SQLDatabase
in the documentation
**Issue:** Updates the documentation to reflect the move of integrations
to langchain-community
- **Description:** The URL in the tigris tutorial was htttps instead of
https, leading to a bad link.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** Speucey
In this pull request, we introduce the add_images method to the
SingleStoreDB vector store class, expanding its capabilities to handle
multi-modal embeddings seamlessly. This method facilitates the
incorporation of image data into the vector store by associating each
image's URI with corresponding document content, metadata, and either
pre-generated embeddings or embeddings computed using the embed_image
method of the provided embedding object.
the change includes integration tests, validating the behavior of the
add_images. Additionally, we provide a notebook showcasing the usage of
this new method.
---------
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Issue in the API Reference:
If the `Classes` of `Functions` section is empty, it still shown in API
Reference. Here is an
[example](https://api.python.langchain.com/en/latest/core_api_reference.html#module-langchain_core.agents)
where `Functions` table is empty but still presented.
It happens only if this section has only the "private" members (with
names started with '_'). Those members are not shown but the whole
member section (empty) is shown.
This way we can document APIs in methods signature only where they are
checked by the typing system and we get them also in the param
description without having to duplicate in the docstrings (where they
are unchecked).
Twitter: @cbornet_
Description:
In this PR, I am adding a PolygonTickerNews Tool, which can be used to
get the latest news for a given ticker / stock.
Twitter handle: [@virattt](https://twitter.com/virattt)
**Description**: CogniSwitch focusses on making GenAI usage more
reliable. It abstracts out the complexity & decision making required for
tuning processing, storage & retrieval. Using simple APIs documents /
URLs can be processed into a Knowledge Graph that can then be used to
answer questions.
**Dependencies**: No dependencies. Just network calls & API key required
**Tag maintainer**: @hwchase17
**Twitter handle**: https://github.com/CogniSwitch
**Documentation**: Please check
`docs/docs/integrations/toolkits/cogniswitch.ipynb`
**Tests**: The usual tool & toolkits tests using `test_imports.py`
PR has passed linting and testing before this submission.
---------
Co-authored-by: Saicharan Sridhara <145636106+saiCogniswitch@users.noreply.github.com>
Hi, I'm from the LanceDB team.
Improves LanceDB integration by making it easier to use - now you aren't
required to create tables manually and pass them in the constructor,
although that is still backward compatible.
Bug fix - pandas was being used even though it's not a dependency for
LanceDB or langchain
PS - this issue was raised a few months ago but lost traction. It is a
feature improvement for our users kindly review this , Thanks !
This PR replaces the imports of the Astra DB vector store with the
newly-released partner package, in compliance with the deprecation
notice now attached to the community "legacy" store.
- **Description:** Update documentation for RunnableWithMessageHistory
- **Issue:** https://github.com/langchain-ai/langchain/issues/16642
I don't have access to an Anthropic API key so I updated things to use
OpenAI. Let me know if you'd prefer another provider.
**Description:** This PR introduces a new "Astra DB" Partner Package.
So far only the vector store class is _duplicated_ there, all others
following once this is validated and established.
Along with the move to separate package, incidentally, the class name
will change `AstraDB` => `AstraDBVectorStore`.
The strategy has been to duplicate the module (with prospected removal
from community at LangChain 0.2). Until then, the code will be kept in
sync with minimal, known differences (there is a makefile target to
automate drift control. Out of convenience with this check, the
community package has a class `AstraDBVectorStore` aliased to `AstraDB`
at the end of the module).
With this PR several bugfixes and improvement come to the vector store,
as well as a reshuffling of the doc pages/notebooks (Astra and
Cassandra) to align with the move to a separate package.
**Dependencies:** A brand new pyproject.toml in the new package, no
changes otherwise.
**Twitter handle:** `@rsprrs`
---------
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>