langchain

Commit Graph

Author	SHA1	Message	Date
Aayush Kataria	71811e0547	community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676 ) This PR add supports for Azure Cosmos DB for NoSQL vector store. Summary: Description: added vector store integration for Azure Cosmos DB for NoSQL Vector Store, Dependencies: azure-cosmos dependency, Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	3 months ago
am-kinetica	ad101adec8	community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724 ) - [ ] Miscellaneous updates and fixes: - Description: Handled error in querying; quotes in table names; updated gpudb API - Issue: Threw an error with an error message difficult to understand if a query failed or returned no records - Dependencies: Updated GPUDB API version to `7.2.0.9` @baskaryan @hwchase17	3 months ago
Eugene Yurtsev	05d31a2f00	community[patch]: Add missing type annotations (#22758 ) Add missing type annotations to objects in community. These missing type annotations will raise type errors in pydantic 2.	4 months ago
X-HAN	34edfe4a16	community[minor]: add Volcengine Rerank (#22700 ) Description: this PR adds Volcengine Rerank capability to Langchain, you can find Volcengine Rerank API from [here](https://www.volcengine.com/docs/84313/1254474) & [here](https://www.volcengine.com/docs/84313/1254605). [Volcengine](https://www.volcengine.com/) is a cloud service platform developed by ByteDance, the parent company of TikTok. You can obtain Volcengine API AK/SK from [here](https://www.volcengine.com/docs/84313/1254553). Dependencies: VolcengineRerank depends on `volcengine` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thank you! Tests and docs 1. integration test: `test_volcengine_rerank.py` 2. example notebook: `volcengine_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified.	4 months ago
Philippe PRADOS	9aabb446c5	community[minor]: Add SQL storage implementation (#22207 ) Hello @eyurtsev - package: langchain-comminity - Description: Add SQL implementation for docstore. A new implementation, in line with my other PR ([async PGVector](https://github.com/langchain-ai/langchain-postgres/pull/32), [SQLChatMessageMemory](https://github.com/langchain-ai/langchain/pull/22065)) - Twitter handler: pprados --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Piotr Mardziel <piotrm@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	4 months ago
Erick Friis	a24a9c6427	multiple: get rid of pyproject extras (#22581 ) They cause `poetry lock` to take a ton of time, and `uv pip install` can resolve the constraints from these toml files in trivial time (addressing problem with #19153) This allows us to properly upgrade lockfile dependencies moving forward, which revealed some issues that were either fixed or type-ignored (see file comments)	4 months ago
X-HAN	62f13f95e4	community[minor]: add DashScope Rerank (#22403 ) Description: this PR adds DashScope Rerank capability to Langchain, you can find DashScope Rerank API from [here](https://help.aliyun.com/document_detail/2780058.html?spm=a2c4g.2780059.0.0.6d995024FlrJ12) & [here](https://help.aliyun.com/document_detail/2780059.html?spm=a2c4g.2780058.0.0.63f75024cr11N9). [DashScope](https://dashscope.aliyun.com/) is the generative AI service from Alibaba Cloud (Aliyun). You can create DashScope API key from [here](https://bailian.console.aliyun.com/?apiKey=1#/api-key). Dependencies: DashScopeRerank depends on `dashscope` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thanks you! Tests and docs 1. integration test: `test_dashscope_rerank.py` 2. example notebook: `dashscope_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	4 months ago
leila-messallem	3280a5b49b	community[patch]: improve test setup to accurately test filtering of labels in neo4j (#22531 ) Description: This PR addresses an issue with an existing test that was not effectively testing the intended functionality. The previous test setup did not adequately validate the filtering of the labels in neo4j, because the nodes and relationship in the test data did not have any properties set. Without properties these labels would not have been returned, regardless of the filtering. --------- Co-authored-by: Oskar Hane <oh@oskarhane.com>	4 months ago
Stefano Lottini	328d0c99f2	community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548 ) This PR adds a constructor `metadata_indexing` parameter to the Cassandra vector store to allow optional fine-tuning of which fields of the metadata are to be indexed. This is a feature supported by the underlying CassIO library. Indexing mode of "all", "none" or deny- and allow-list based choices are available. The rationale is, in some cases it's advisable to programmatically exclude some portions of the metadata from the index if one knows in advance they won't ever be used at search-time. this keeps the index more lightweight and performant and avoids limitations on the length of _indexed_ strings. I added a integration test of the feature. I also added the possibility of running the integration test with Cassandra on an arbitrary IP address (e.g. Dockerized), via `CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or similar. While I was at it, I added a line to the `.gitignore` since the mypy _test_ cache was not ignored yet. My X (Twitter) handle: @rsprrs.	4 months ago
Ofer Mendelevitch	ad502e8d50	community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334 ) Thank you for contributing to LangChain! Description: update to the Vectara / Langchain integration to integrate new Vectara capabilities: - Full RAG implemented as a Runnable with as_rag() - Vectara chat supported with as_chat() - Both support streaming response - Updated documentation and example notebook to reflect all the changes - Updated Vectara templates Twitter handle: ofermend Add tests and docs: no new tests or docs, but updated both existing tests and existing docs	4 months ago
Yuwen Hu	ba0dca46d7	community[minor]: Add IPEX-LLM BGE embedding support on both Intel CPU and GPU (#22226 ) Description: [IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low latency. This PR adds ipex-llm integrations to langchain for BGE embedding support on both Intel CPU and GPU. Dependencies: `ipex-llm`, `sentence-transformers` Contribution maintainer: @Oscilloscope98 tests and docs: - langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb - langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb - langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py --------- Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>	4 months ago
maang-h	596c062cba	community[patch]: Standardize qianfan model init args name (#22322 ) - Description: - Standardize qianfan chat model intialization arguments name - qianfan_ak (qianfan api key) -> api_key - qianfan_sk (qianfan secret key) -> secret_key - Delete unuse variable - Issue: #20085	4 months ago
Christophe Bornet	74947ec894	community[minor]: Add Cassandra ByteStore (#22064 )	4 months ago
Christophe Bornet	fea6b99b16	community[minor]: Add async methods to CassandraChatMessageHistory (#21975 )	4 months ago
Bagatur	50186da0a1	infra: rm unused # noqa violations (#22049 ) Updating #21137	4 months ago
SaschaStoll	709664a079	community[patch]: Performant filter columns option for Hanavector (#21971 ) Description: Backwards compatible extension of the initialisation interface of HanaDB to allow the user to specify specific_metadata_columns that are used for metadata storage of selected keys which yields increased filter performance. Any not-mentioned metadata remains in the general metadata column as part of a JSON string. Furthermore switched to executemany for batch inserts into HanaDB. Issue: N/A Dependencies: no new dependencies added Twitter handle: @sapopensource --------- Co-authored-by: Martin Kolb <martin.kolb@sap.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Eric Zhang	e7e41eaabe	langchain: add RankLLM Reranker (#21171 ) Integrate RankLLM reranker (https://github.com/castorini/rank_llm) into LangChain An example notebook is given in `docs/docs/integrations/retrievers/rankllm-reranker.ipynb` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	4 months ago
Matthew Hoffman	4f2e3bd7fd	community[patch]: fix public interface for embeddings module (#21650 ) ## Description The existing public interface for `langchain_community.emeddings` is broken. In this file, `__all__` is statically defined, but is subsequently overwritten with a dynamic expression, which type checkers like pyright do not support. pyright actually gives the following diagnostic on the line I am requesting we remove: [reportUnsupportedDunderAll](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportUnsupportedDunderAll): ``` Operation on "__all__" is not supported, so exported symbol list may be incorrect ``` Currently, I get the following errors when attempting to use publicablly exported classes in `langchain_community.emeddings`: ```python import langchain_community.embeddings langchain_community.embeddings.HuggingFaceEmbeddings(...) # error: "HuggingFaceEmbeddings" is not exported from module "langchain_community.embeddings" (reportPrivateImportUsage) ``` This is solved easily by removing the dynamic expression.	4 months ago
Tomaz Bratanic	d8a1f1114d	community[patch]: Handle exceptions where node props aren't consistent in neo4j schema (#22027 )	4 months ago
Pengcheng Liu	4cf523949a	community[patch]: Update model client to support vision model in Tong… (#21474 ) - Description: Tongyi uses different client for chat model and vision model. This PR chooses proper client based on model name to support both chat model and vision model. Reference [tongyi document](https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.27404c9a7upm11) for details. ``` from langchain_core.messages import HumanMessage from langchain_community.chat_models import ChatTongyi llm = ChatTongyi(model_name='qwen-vl-max') image_message = { "image": "https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png" } text_message = { "text": "summarize this picture", } message = HumanMessage(content=[text_message, image_message]) llm.invoke([message]) ``` - Issue: None - Dependencies: None - Twitter handle: None	4 months ago
Sevin F. Varoglu	1bc0ea5496	community[patch]: update OctoAIEmbeddings to subclass OpenAIEmbeddings (#21805 )	4 months ago
Jesse S	fc79b372cb	community[minor]: add aerospike vectorstore integration (#21735 ) Please let me know if you see any possible areas of improvement. I would very much appreciate your constructive criticism if time allows. Description: - Added a aerospike vector store integration that utilizes [Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/) add-on. - Added both unit tests and integration tests - Added a docker compose file for spinning up a test environment - Added a notebook Dependencies: any dependencies required for this change - aerospike-vector-search Twitter handle: - No twitter, you can use my GitHub handle or LinkedIn if you'd like Thanks! --------- Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Param Singh	d07885f8b7	community[patch]: standardized sparkllm init args (#21633 ) Related to #20085 @baskaryan Thank you for contributing to LangChain! community:sparkllm[patch]: standardized init args updated `spark_api_key` so that aliased to `api_key`. Added integration test for `sparkllm` to test that it continues to set the same underlying attribute. updated temperature with Pydantic Field, added to the integration test. Ran `make format`,`make test`, `make lint`, `make spell_check`	4 months ago
Liuww	332ffed393	community[patch]: Adopting the lighter-weight xinference_client (#21900 ) While integrating the xinference_embedding, we observed that the downloaded dependency package is quite substantial in size. With a focus on resource optimization and efficiency, if the project requirements are limited to its vector processing capabilities, we recommend migrating to the xinference_client package. This package is more streamlined, significantly reducing the storage space requirements of the project and maintaining a feature focus, making it particularly suitable for scenarios that demand lightweight integration. Such an approach not only boosts deployment efficiency but also enhances the application's maintainability, rendering it an optimal choice for our current context. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	4 months ago
Stefano Lottini	040597e832	community: init signature revision for Cassandra LLM cache classes + small maintenance (#17765 ) This PR improves on the `CassandraCache` and `CassandraSemanticCache` classes, mainly in the constructor signature, and also introduces several minor improvements around these classes. ### Init signature A (sigh) breaking change is tentatively introduced to the constructor. To me, the advantages outweigh the possible discomfort: the new syntax places the DB-connection objects `session` and `keyspace` later in the param list, so that they can be given a default value. This is what enables the pattern of _not_ specifying them, provided one has previously initialized the Cassandra connection through the versatile utility method `cassio.init(...)`. In this way, a much less unwieldy instantiation can be done, such as `CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`, everything else falling back to defaults. A downside is that, compared to the earlier signature, this might turn out to be breaking for those doing positional instantiation. As a way to mitigate this problem, this PR typechecks its first argument trying to detect the legacy usage. (And to make this point less tricky in the future, most arguments are left to be keyword-only). If this is considered too harsh, I'd like guidance on how to further smoothen this transition. Our plan is to make the pattern of optional session/keyspace a standard across all Cassandra classes, so that a repeatable strategy would be ideal. A possibility would be to keep positional arguments for legacy reasons but issue a deprecation warning if any of them is actually used, to later remove them with 0.2 - please advise on this point. ### Other changes - class docstrings: enriched, completely moved to class level, added note on `cassio.init(...)` pattern, added tiny sample usage code. - semantic cache: revised terminology to never mention "distance" (it is in fact a similarity!). Kept the legacy constructor param with a deprecation warning if used. - `llm_caching` notebook: uniform flow with the Cassandra and Astra DB separate cases; better and Cassandra-first description; all imports made explicit and from community where appropriate. - cache integration tests moved to community (incl. the imported tools), env var bugfix for `CASSANDRA_CONTACT_POINTS`. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	4 months ago
ccurme	19e6bf814b	community: fix CI (#21766 )	4 months ago
Cheese	0ead09f84d	community: Implement `bind_tools` for ChatTongyi (#20725 ) ## Description Implement `bind_tools` in ChatTongyi. Usage example: ```py from langchain_core.tools import tool from langchain_community.chat_models.tongyi import ChatTongyi @tool def multiply(first_int: int, second_int: int) -> int: """Multiply two integers together.""" return first_int * second_int llm = ChatTongyi(model="qwen-turbo") llm_with_tools = llm.bind_tools([multiply]) msg = llm_with_tools.invoke("What's 5 times forty two") print(msg) ``` Streaming is also supported. ## Dependencies No Dependency is required for this change. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	4 months ago
Eugene Yurtsev	25fbe356b4	community[patch]: upgrade to recent version of mypy (#21616 ) This PR upgrades community to a recent version of mypy. It inserts type: ignore on all existing failures.	4 months ago
Oguz Vuruskaner	5b35f077f9	[community][fix](DeepInfraEmbeddings): Implement chunking for large batches (#21189 ) Description: This PR introduces chunking logic to the `DeepInfraEmbeddings` class to handle large batch sizes without exceeding maximum batch size of the backend. This enhancement ensures that embedding generation processes large batches by breaking them down into smaller, manageable chunks, each conforming to the maximum batch size limit. Issue: Fixes #21189 Dependencies: No new dependencies introduced.	5 months ago
Eugene Yurtsev	f92006de3c	multiple: langchain 0.2 in master (#21191 ) 0.2rc migrations - [x] Move memory - [x] Move remaining retrievers - [x] graph_qa chains - [x] some dependency from evaluation code potentially on math utils - [x] Move openapi chain from `langchain.chains.api.openapi` to `langchain_community.chains.openapi` - [x] Migrate `langchain.chains.ernie_functions` to `langchain_community.chains.ernie_functions` - [x] migrate `langchain/chains/llm_requests.py` to `langchain_community.chains.llm_requests` - [x] Moving `langchain_community.cross_enoders.base:BaseCrossEncoder` -> `langchain_community.retrievers.document_compressors.cross_encoder:BaseCrossEncoder` (namespace not ideal, but it needs to be moved to `langchain` to avoid circular deps) - [x] unit tests langchain -- add pytest.mark.community to some unit tests that will stay in langchain - [x] unit tests community -- move unit tests that depend on community to community - [x] mv integration tests that depend on community to community - [x] mypy checks Other todo - [x] Make deprecation warnings not noisy (need to use warn deprecated and check that things are implemented properly) - [x] Update deprecation messages with timeline for code removal (likely we actually won't be removing things until 0.4 release) -- will give people more time to transition their code. - [ ] Add information to deprecation warning to show users how to migrate their code base using langchain-cli - [ ] Remove any unnecessary requirements in langchain (e.g., is SQLALchemy required?) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	5 months ago
Jorge Piedrahita Ortiz	e65652c3e8	community: add SambaNova embeddings integration (#21227 ) - Description: SambaNova hosted embeddings integration	5 months ago
Mark Cusack	060987d755	community[minor]: Add indexing via locality sensitive hashing to the Yellowbrick vector store (#20856 ) - Description: Add LSH-based indexing to the Yellowbrick vector store module - Twitter handle: @markcusack --------- Co-authored-by: markcusack <markcusack@markcusacksmac.lan> Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	5 months ago
Rohan Aggarwal	8021d2a2ab	community[minor]: Oraclevs integration (#21123 ) Thank you for contributing to LangChain! - Oracle AI Vector Search Oracle AI Vector Search is designed for Artificial Intelligence (AI) workloads that allows you to query data based on semantics, rather than keywords. One of the biggest benefit of Oracle AI Vector Search is that semantic search on unstructured data can be combined with relational search on business data in one single system. This is not only powerful but also significantly more effective because you don't need to add a specialized vector database, eliminating the pain of data fragmentation between multiple systems. - Oracle AI Vector Search is designed for Artificial Intelligence (AI) workloads that allows you to query data based on semantics, rather than keywords. One of the biggest benefit of Oracle AI Vector Search is that semantic search on unstructured data can be combined with relational search on business data in one single system. This is not only powerful but also significantly more effective because you don't need to add a specialized vector database, eliminating the pain of data fragmentation between multiple systems. This Pull Requests Adds the following functionalities Oracle AI Vector Search : Vector Store Oracle AI Vector Search : Document Loader Oracle AI Vector Search : Document Splitter Oracle AI Vector Search : Summary Oracle AI Vector Search : Oracle Embeddings - We have added unit tests and have our own local unit test suite which verifies all the code is correct. We have made sure to add guides for each of the components and one end to end guide that shows how the entire thing runs. - We have made sure that make format and make lint run clean. Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com> Co-authored-by: hroyofc <harichandan.roy@oracle.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
Jakub Pawłowski	b0b1a67771	community[patch]: Skip unexpected 404 HTTP Error in Arxiv download (#21042 ) ### Description: When attempting to download PDF files from arXiv, an unexpected 404 error frequently occurs. This error halts the operation, regardless of whether there are additional documents to process. As a solution, I suggest implementing a mechanism to ignore and communicate this error and continue processing the next document from the list. Proposed Solution: To address the issue of unexpected 404 errors during PDF downloads from arXiv, I propose implementing the following solution: - Error Handling: Implement error handling mechanisms to catch and handle 404 errors gracefully. - Communication: Inform the user or logging system about the occurrence of the 404 error. - Continued Processing: After encountering a 404 error, continue processing the remaining documents from the list without interruption. This solution ensures that the application can handle unexpected errors without terminating the entire operation. It promotes resilience and robustness in the face of intermittent issues encountered during PDF downloads from arXiv. ### Issue: #20909 ### Dependencies: none --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	5 months ago
Cahid Arda Öz	cc6191cb90	community[minor]: Add support for Upstash Vector (#20824 ) ## Description Adding `UpstashVectorStore` to utilize [Upstash Vector](https://upstash.com/docs/vector/overall/getstarted)! #17012 was opened to add Upstash Vector to langchain but was closed to wait for filtering. Now filtering is added to Upstash vector and we open a new PR. Additionally, [embedding feature](https://upstash.com/docs/vector/features/embeddingmodels) was added and we add this to our vectorstore aswell. ## Dependencies [upstash-vector](https://pypi.org/project/upstash-vector/) should be installed to use `UpstashVectorStore`. Didn't update dependencies because of [this comment in the previous PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450). ## Tests Tests are added and they pass. Tests are naturally network bound since Upstash Vector is offered through an API. There was [a discussion in the previous PR about mocking the unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567). We didn't make changes to this end yet. We can update the tests if you can explain how the tests should be mocked. --------- Co-authored-by: ytkimirti <yusuftaha9@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
chyroc	3e241956d3	community[minor]: add coze chat model (#20770 ) add coze chat model, to call coze.com apis	5 months ago
Tomaz Bratanic	67428c4052	community[patch]: Neo4j enhanced schema (#20983 ) Scan the database for example values and provide them to an LLM for better inference of Text2cypher	5 months ago
Pengcheng Liu	1fad39be1c	community[minor]: Add LarkSuite wiki document loader. (#21016 ) Description: Add LarkSuite wiki document loader. Refer to [LarkSuite api document ](https://open.feishu.cn/document/server-docs/docs/wiki-v2/space-node/list)for details. Issue: None Dependencies: None Twitter handle: None	5 months ago
Jorge Piedrahita Ortiz	40b2e2916b	community[minor]: Sambanova llm integration (#20955 ) - Description: Added [Sambanova systems](https://sambanova.ai/) integration, including sambaverse and sambastudio LLMs - Dependencies: sseclient-py (optional) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
Mayank Solanki	8c085fc697	community[patch]: Added a function `from_existing_collection` in `Qdrant` vector database. (#20779 ) Issue: #20514 The current implementation of `construct_instance` expects a `texts: List[str]` that will call the embedding function. This might not be needed when we already have a client with collection and `path, you don't want to add any text. This PR adds a class method that returns a qdrant instance with an existing client. Here everytime `cb6e5e56c2/libs/community/langchain_community/vectorstores/qdrant.py (L1592)` `construct_instance` is called, this line sends some text for embedding generation. --------- Co-authored-by: Anush <anushshetty90@gmail.com>	5 months ago
Lei Zhang	9281841cfe	community[patch]: fix integrated test case test_recursive_url_loader.py assertions (issue-20919) (#20920 ) Description: Fix integrated test case test_recursive_url_loader.py Local testing successful ```shell (venv) lei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py ================================================================================ test session starts ================================================================================ platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python cachedir: .pytest_cache rootdir: /Users/zhanglei/Work/github/langchain/libs/community configfile: pyproject.toml plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0 asyncio: mode=Mode.AUTO collected 6 items tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED [ 16%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED [ 33%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader FAILED [ 50%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED [ 66%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED [ 83%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED [100%] ===================================================================================== FAILURES ====================================================================================== __________________________________________________________________________ test_sync_recursive_url_loader ___________________________________________________________________________ def test_sync_recursive_url_loader() -> None: url = "https://docs.python.org/3.9/" loader = RecursiveUrlLoader( url, extractor=lambda _: "placeholder", use_async=False, max_depth=2 ) docs = loader.load() > assert len(docs) == 23 E AssertionError: assert 24 == 23 E + where 24 = len([Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/', 'content_type': 'text/html', 'title': '3.9.18 Documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/py-modindex.html', 'content_type': 'text/html', 'title': 'Python Module Index — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/download.html', 'content_type': 'text/html', 'title': 'Download — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/howto/index.html', 'content_type': 'text/html', 'title': 'Python HOWTOs — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/whatsnew/index.html', 'content_type': 'text/html', 'title': 'Whatâ\x80\x99s New in Python — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/c-api/index.html', 'content_type': 'text/html', 'title': 'Python/C API Reference Manual — Python 3.9.18 documentation', 'language': None}), ...]) tests/integration_tests/document_loaders/test_recursive_url_loader.py:38: AssertionError ================================================================================= warnings summary ================================================================================== tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor. k = self.parse_starttag(i) -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ================================================================================ slowest 5 durations ================================================================================ 56.75s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic 38.99s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader 31.20s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties 30.37s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent 15.44s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader ============================================================================== short test summary info ============================================================================== FAILED tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader - AssertionError: assert 24 == 23 ================================================================ 1 failed, 5 passed, 5 warnings in 172.97s (0:02:52) ================================================================ (venv) zhanglei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py ================================================================================ test session starts ================================================================================ platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python cachedir: .pytest_cache rootdir: /Users/zhanglei/Work/github/langchain/libs/community configfile: pyproject.toml plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0 asyncio: mode=Mode.AUTO collected 6 items tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED [ 16%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED [ 33%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader PASSED [ 50%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED [ 66%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED [ 83%] tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED [100%] ================================================================================= warnings summary ================================================================================== tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor. k = self.parse_starttag(i) -- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html ================================================================================ slowest 5 durations ================================================================================ 46.99s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic 32.43s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader 31.23s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent 30.75s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties 15.89s call tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader ===================================================================== 6 passed, 5 warnings in 157.42s (0:02:37) ===================================================================== (venv) lei@LeideMacBook-Pro community % ``` Issue: https://github.com/langchain-ai/langchain/issues/20919 Twitter handle: @coolbeevip	5 months ago
Shengsheng Huang	fd1061e7bf	community[patch]: add more data types support to ipex-llm llm integration (#20833 ) - Description: - add support for more data types: by default `IpexLLM` will load the model in int4 format. This PR adds more data types support such as `sym_in5`, `sym_int8`, etc. Data formats like NF3, NF4, FP4 and FP8 are only supported on GPU and will be added in future PR. - Fix a small issue in saving/loading, update api docs - Dependencies: `ipex-llm` library - Document: In `docs/docs/integrations/llms/ipex_llm.ipynb`, added instructions for saving/loading low-bit model. - Tests: added new test cases to `libs/community/tests/integration_tests/llms/test_ipex_llm.py`, added config params. - Contribution maintainer: @shane-huang	5 months ago
Jingpan Xiong	1202017c56	community[minor]: Add relyt vector database (#20316 ) Co-authored-by: kaka <kaka@zbyte-inc.cloud> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: jingsi <jingsi@leadincloud.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	5 months ago
Andres Algaba	05ae8ca7d4	community[patch]: deprecate persist method in Chroma (#20855 ) Thank you for contributing to LangChain! - [x] PR title - [x] PR message: - Description: Deprecate persist method in Chroma no longer exists in Chroma 0.4.x - Issue: #20851 - Dependencies: None - Twitter handle: AndresAlgaba1 - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	5 months ago
ccurme	b8db73233c	core, community: deprecate tool.__call__ (#20900 ) Does not update docs.	5 months ago
Tomaz Bratanic	520972fd0f	community[patch]: Support passing graph object to Neo4j integrations (#20876 ) For driver connection reusage, we introduce passing the graph object to neo4j integrations	5 months ago
Lei Zhang	748a6ae609	community[patch]: add HTTP response headers Content-Type to metadata of RecursiveUrlLoader document (#20875 ) Description: The RecursiveUrlLoader loader offers a link_regex parameter that can filter out URLs. However, this filtering capability is limited, and if the internal links of the website change, unexpected resources may be loaded. These resources, such as font files, can cause problems in subsequent embedding processing. > https://blog.langchain.dev/assets/fonts/source-sans-pro-v21-latin-ext_latin-regular.woff2?v=0312715cbf We can add the Content-Type in the HTTP response headers to the document metadata so developers can choose which resources to use. This allows developers to make their own choices. For example, the following may be a good choice for text knowledge. - text/plain - simple text file - text/html - HTML web page - text/xml - XML format file - text/json - JSON format data - application/pdf - PDF file - application/msword - Word document and ignore the following - text/css - CSS stylesheet - text/javascript - JavaScript script - application/octet-stream - binary data - image/jpeg - JPEG image - image/png - PNG image - image/gif - GIF image - image/svg+xml - SVG image - audio/mpeg - MPEG audio files - video/mp4 - MP4 video file - application/font-woff - WOFF font file - application/font-ttf - TTF font file - application/zip - ZIP compressed file - application/octet-stream - binary data Twitter handle: @coolbeevip --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
ccurme	481d3855dc	patch: remove usage of llm, chat model __call__ (#20788 ) - `llm(prompt)` -> `llm.invoke(prompt)` - `llm(prompt=prompt` -> `llm.invoke(prompt)` (same with `messages=`) - `llm(prompt, callbacks=callbacks)` -> `llm.invoke(prompt, config={"callbacks": callbacks})` - `llm(prompt, kwargs)` -> `llm.invoke(prompt, kwargs)`	5 months ago
Raghav Dixit	9b7fb381a4	community[patch]: LanceDB integration patch update (#20686 ) Description : - added functionalities - delete, index creation, using existing connection object etc. - updated usage - Added LaceDB cloud OSS support make lint_diff , make test checks done	5 months ago
volodymyr-memsql	493afe4d8d	community[patch]: add hybrid search to singlestoredb vectorstore (#20793 ) Implemented the ability to enable full-text search within the SingleStore vector store, offering users a versatile range of search strategies. This enhancement allows users to seamlessly combine full-text search with vector search, enabling the following search strategies: * Search solely by vector similarity. * Conduct searches exclusively based on text similarity, utilizing Lucene internally. * Filter search results by text similarity score, with the option to specify a threshold, followed by a search based on vector similarity. * Filter results by vector similarity score before conducting a search based on text similarity. * Perform searches using a weighted sum of vector and text similarity scores. Additionally, integration tests have been added to comprehensively cover all scenarios. Updated notebook with examples. CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
Tomaz Bratanic	9efab3ed66	community[patch]: Add driver config param for neo4j graph (#20772 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
Martin Kolb	0186e4e633	community[patch]: Advanced filtering for HANA Cloud Vector Engine (#20821 ) - Description: This PR adds support for advanced filtering to the integration of HANA Vector Engine. The newly supported filtering operators are: $eq, $ne, $gt, $gte, $lt, $lte, $between, $in, $nin, $like, $and, $or - Issue: N/A - Dependencies: no new dependencies added Added integration tests to: `libs/community/tests/integration_tests/vectorstores/test_hanavector.py` Description of the new capabilities in notebook: `docs/docs/integrations/vectorstores/hanavector.ipynb`	5 months ago
ccurme	c010ec8b71	patch: deprecate (a)get_relevant_documents (#20477 ) - `.get_relevant_documents(query)` -> `.invoke(query)` - `.get_relevant_documents(query=query)` -> `.invoke(query)` - `.get_relevant_documents(query, callbacks=callbacks)` -> `.invoke(query, config={"callbacks": callbacks})` - `.get_relevant_documents(query, kwargs)` -> `.invoke(query, kwargs)` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	5 months ago
shumway743	cb6e5e56c2	community[minor]: add graph store implementation for apache age (#20582 ) Description: implemented GraphStore class for Apache Age graph db Dependencies: depends on psycopg2 Unit and integration tests included. Formatting and linting have been run. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
Christophe Bornet	c909ae0152	community[minor]: Add async methods to CassandraVectorStore (#20602 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	5 months ago
Tomaz Bratanic	8c08cf4619	community: Add support for relationship indexes in neo4j vector (#20657 ) Neo4j has added relationship vector indexes. We can't populate them, but we can use existing indexes for retrieval	5 months ago
Christophe Bornet	d2d01370bc	community[minor]: Add async methods to CassandraLoader (#20609 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	5 months ago
Pengcheng Liu	ecd19a9e58	community[patch]: Add function call support in Tongyi chat model. (#20119 ) - [ ] PR message: - Description: This pr adds function calling support in Tongyi chat model. - Issue: None - Dependencies: None - Twitter handle: None Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	5 months ago
Sevin F. Varoglu	3f156e0ece	community[minor]: add ChatOctoAI (#20059 ) This PR adds ChatOctoAI, a chat model integration for OctoAI.	5 months ago
pjb157	479be3cc91	community[minor]: Unify Titan Takeoff Integrations and Adding Embedding Support (#18775 ) Community: Unify Titan Takeoff Integrations and Adding Embedding Support Description: Titan Takeoff no longer reflects this either of the integrations in the community folder. The two integrations (TitanTakeoffPro and TitanTakeoff) where causing confusion with clients, so have moved code into one place and created an alias for backwards compatibility. Added Takeoff Client python package to do the bulk of the work with the requests, this is because this package is actively updated with new versions of Takeoff. So this integration will be far more robust and will not degrade as badly over time. Issue: Fixes bugs in the old Titan integrations and unified the code with added unit test converge to avoid future problems. Dependencies: Added optional dependency takeoff-client, all imports still work without dependency including the Titan Takeoff classes but just will fail on initialisation if not pip installed takeoff-client Twitter @MeryemArik9 Thanks all :) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	5 months ago
Guangdong Liu	b78ede2f96	community[patch]: standardize init args (#20166 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	5 months ago
Guangdong Liu	3729bec1a2	community[patch]: standardize init args (#20210 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	5 months ago
sdan	a7c5e41443	community[minor]: Added VLite as VectorStore (#20245 ) Support [VLite](https://github.com/sdan/vlite) as a new VectorStore type. Description: vlite is a simple and blazing fast vector database(vdb) made with numpy. It abstracts a lot of the functionality around using a vdb in the retrieval augmented generation(RAG) pipeline such as embeddings generation, chunking, and file processing while still giving developers the functionality to change how they're made/stored. Before submitting: Added tests [here](`c09c2ebd5c/libs/community/tests/integration_tests/vectorstores/test_vlite.py`) Added ipython notebook [here](`c09c2ebd5c/docs/docs/integrations/vectorstores/vlite.ipynb`) Added simple docs on how to use [here](`c09c2ebd5c/docs/docs/integrations/providers/vlite.mdx`) Profiles Maintainers: @sdan Twitter handles: [@sdand](https://x.com/sdand) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	5 months ago
Sevin F. Varoglu	54d388d898	community[patch]: update OctoAI endpoint to subclass BaseOpenAI (#19757 ) This PR updates OctoAIEndpoint LLM to subclass BaseOpenAI as OctoAI is an OpenAI-compatible service. The documentation and tests have also been updated.	5 months ago
Benito Geordie	57b226532d	community[minor]: Added integrations for ThirdAI's NeuralDB as a Retriever (#17334 ) Description: Adds ThirdAI NeuralDB retriever integration. NeuralDB is a CPU-friendly and fine-tunable text retrieval engine. We previously added a vector store integration but we think that it will be easier for our customers if they can also find us under under langchain-community/retrievers. --------- Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com> Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>	5 months ago
Martín Gotelli Ferenaz	b48add4353	community[patch]: Fix pgvector deprecated filter clause usage with OR and AND conditions (#20446 ) Description: Support filter by OR and AND for deprecated PGVector version Issue: #20445 Dependencies: N/A Twitter handle: @martinferenaz	5 months ago
Juan Carlos José Camacho	450c458f8f	community[minor]: Add Datahareld tool (#19680 ) Description: Integrate [dataherald](https://www.dataherald.com) tool, It is a natural language-to-SQL tool. Dependencies: Install dataherald sdk to use it, ``` pip install dataherald ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com>	5 months ago
Guangdong Liu	4be7ca7b4c	community[patch]:sparkllm standardize init args (#20194 ) Related to https://github.com/langchain-ai/langchain/issues/20085 @baskaryan	5 months ago
Leonid Ganeline	4cb5f4c353	community[patch]: import flattening fix (#20110 ) This PR should make it easier for linters to do type checking and for IDEs to jump to definition of code. See #20050 as a template for this PR. - As a byproduct: Added 3 missed `test_imports`. - Added missed `SolarChat` in to __init___.py Added it into test_import ut. - Added `# type: ignore` to fix linting. It is not clear, why linting errors appear after ^ changes. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	6 months ago
Prince Canuma	1f9f4d8742	community[minor]: Add support for MLX models (chat & llm) (#18152 ) Description: This PR adds support for MLX models both chat (i.e., instruct) and llm (i.e., pretrained) types/ Dependencies: mlx, mlx_lm, transformers Twitter handle: @Prince_Canuma --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Alex Sherstinsky	5f563e040a	community: extend Predibase integration to support fine-tuned LLM adapters (#19979 ) - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: Langchain-Predibase integration was failing, because it was not current with the Predibase SDK; in addition, Predibase integration tests were instantiating the Langchain Community `Predibase` class with one required argument (`model`) missing. This change updates the Predibase SDK usage and fixes the integration tests. - Twitter handle: `@alexsherstinsky` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	6 months ago
Marlene	2f03bc397e	Community: Updating Azure Retriever and Docs to be Azure AI Search instead of Azure Cognitive Search (#19925 ) Last year Microsoft [changed the name](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search) of Azure Cognitive Search to Azure AI Search. This PR updates the Langchain Azure Retriever API and it's associated docs to reflect this change. It may be confusing for users to see the name Cognitive here and AI in the Microsoft documentation which is why this is needed. I've also added a more detailed example to the Azure retriever doc page. There are more places that need a similar update but I'm breaking it up so the PRs are not too big 😄 Fixing my errors from the previous PR. Twitter: @marlene_zw Two new tests added to test backward compatibility in `libs/community/tests/integration_tests/retrievers/test_azure_cognitive_search.py` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	6 months ago
Tomaz Bratanic	df25829f33	community[minor]: Add metadata filtering support for neo4j vector (#20001 )	6 months ago
Tomaz Bratanic	09a0ecd000	langchain[minor]: Tests update metadata filtering examples of documents (#19963 ) Removing metadata properties that are dicts as some databases don't support that, and those properties aren't used in tests anyhow..	6 months ago
Cheng, Penghui	cc407e8a1b	community[minor]: weight only quantization with intel-extension-for-transformers. (#14504 ) Support weight only quantization with intel-extension-for-transformers. [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) is an innovative toolkit to accelerate Transformer-based models on Intel platforms, in particular effective on 4th Intel Xeon Scalable processor [Sapphire Rapids](https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html) (codenamed Sapphire Rapids). The toolkit provides the below key features: * Seamless user experience of model compressions on Transformer-based models by extending [Hugging Face transformers](https://github.com/huggingface/transformers) APIs and leveraging [Intel® Neural Compressor](https://github.com/intel/neural-compressor) * Advanced software optimizations and unique compression-aware runtime. * Optimized Transformer-based model packages. * [NeuralChat](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat), a customizable chatbot framework to create your own chatbot within minutes by leveraging a rich set of plugins and SOTA optimizations. * [Inference](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/graph) of Large Language Model (LLM) in pure C/C++ with weight-only quantization kernels. This PR is an integration of weight only quantization feature with intel-extension-for-transformers. Unit test is in lib/langchain/tests/integration_tests/llm/test_weight_only_quantization.py The notebook is in docs/docs/integrations/llms/weight_only_quantization.ipynb. The document is in docs/docs/integrations/providers/weight_only_quantization.mdx. --------- Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Erick Friis	f0d5b59962	core[patch]: remove requests (#19891 ) Removes required usage of `requests` from `langchain-core`, all of which has been deprecated. - removes Tracer V1 implementations - removes old `try_load_from_hub` github-based hub implementations Removal done in a way where imports will still succeed, and usage will fail with a `RuntimeError`.	6 months ago
Jamsheed Mistri	4f70bc119d	community[minor]: add Layerup Security integration (#19787 ) Description: adds integration with [Layerup Security](https://uselayerup.com). Docs can be found [here](https://docs.uselayerup.com). Integrates directly with our Python SDK. Dependencies: [LayerupSecurity](https://pypi.org/project/LayerupSecurity/) Note: all methods for our product require a paid API key, so I only included 1 test which checks for an invalid API key response. I have tested extensively locally. Twitter handle: [@layerup_](https://twitter.com/layerup_) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Anıl Berk Altuner	4384fa8e49	community[minor]: Add Dria retriever (#17098 ) [Dria](https://dria.co/) is a hub of public RAG models for developers to both contribute and utilize a shared embedding lake. This PR adds a retriever that can retrieve documents from Dria.	6 months ago
Chenhui Zhang	a1f3e9f537	community[minor]: Update ChatZhipuAI to support GLM-4 model (#16695 ) Description: Update `ChatZhipuAI` to support the latest `glm-4` model. Issue: N/A Dependencies: httpx, httpx-sse, PyJWT The previous `ChatZhipuAI` implementation requires the `zhipuai` package, and cannot call the latest GLM model. This is because - The old version `zhipuai==1.` doesn't support the latest model. - `zhipuai==2.` requires `pydantic V2`, which is incompatible with 'langchain-community'. This re-implementation invokes the GLM model by sending HTTP requests to [open.bigmodel.cn](https://open.bigmodel.cn/dev/api) via the `httpx` package, and uses the `httpx-sse` package to handle stream events. --------- Co-authored-by: zR <2448370773@qq.com>	6 months ago
Kenneth Choe	f98d7f7494	langchain[minor], community[minor]: add CrossEncoderReranker with HuggingFaceCrossEncoder and SagemakerEndpointCrossEncoder (#13687 ) - Description: Support reranking based on cross encoder models available from HuggingFace. - Added `CrossEncoder` schema - Implemented `HuggingFaceCrossEncoder` and `SagemakerEndpointCrossEncoder` - Implemented `CrossEncoderReranker` that performs similar functionality to `CohereRerank` - Added `cross-encoder-reranker.ipynb` to demonstrate how to use it. Please let me know if anything else needs to be done to make it visible on the table-of-contents navigation bar on the left, or on the card list on [retrievers documentation page](https://python.langchain.com/docs/integrations/retrievers). - Issue: N/A - Dependencies: None other than the existing ones. --------- Co-authored-by: Kenny Choe <kchoe@amazon.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
anshaneel	0884e5de7f	community[minor]: Add Alpha Vantage API Tool (#14332 ) ### Description This implementation adds functionality from the AlphaVantage API, renowned for its comprehensive financial data. The class encapsulates various methods, each dedicated to fetching specific types of financial information from the API. ### Implemented Functions - `search_symbols`: - Searches the AlphaVantage API for financial symbols using the provided keywords. - `_get_market_news_sentiment`: - Retrieves market news sentiment for a specified stock symbol from the AlphaVantage API. - `_get_time_series_daily`: - Fetches daily time series data for a specific symbol from the AlphaVantage API. - `_get_quote_endpoint`: - Obtains the latest price and volume information for a given symbol from the AlphaVantage API. - `_get_time_series_weekly`: - Gathers weekly time series data for a particular symbol from the AlphaVantage API. - `_get_top_gainers_losers`: - Provides details on top gainers, losers, and most actively traded tickers in the US market from the AlphaVantage API. ### Issue: - #11994 ### Dependencies: - 'requests' library for HTTP requests. (import requests) - 'pytest' library for testing. (import pytest) --------- Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Alex Sherstinsky	a9bc212bf2	community[minor]: fix failing Predibase integration (#19776 ) - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: Langchain-Predibase integration was failing, because it was not current with the Predibase SDK; in addition, Predibase integration tests were instantiating the Langchain Community `Predibase` class with one required argument (`model`) missing. This change updates the Predibase SDK usage and fixes the integration tests. - Twitter handle: `@alexsherstinsky` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
M.Abdulrahman Alnaseer	ba54f1577f	community[minor]: add support for llmsherpa (#19741 ) Thank you for contributing to LangChain! - [x] PR title: "community: added support for llmsherpa library" - [x] Add tests and docs: 1. Integration test: 'docs/docs/integrations/document_loaders/test_llmsherpa.py'. 2. an example notebook: `docs/docs/integrations/document_loaders/llmsherpa.ipynb`. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Hrvoje Milković	b7344e3347	community[minor]: Infobip tool integration (#16805 ) Description: Adding Tool that wraps Infobip API for sending sms or emails and email validation. Dependencies: None, Twitter handle: @hmilkovic Implementation: ``` libs/community/langchain_community/utilities/infobip.py ``` Integration tests: ``` libs/community/tests/integration_tests/utilities/test_infobip.py ``` Example notebook: ``` docs/docs/integrations/tools/infobip.ipynb ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
morgana	074ad5095f	community[patch]: mmr search for Rockset vectorstore integration (#16908 ) - Description: Adding support for mmr search in the Rockset vectorstore integration. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_` --------- Co-authored-by: Rockset API Bot <admin@rockset.io> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Tomaz Bratanic	dec00d3050	community[patch]: Add the ability to pass maps to neo4j retrieval query (#19758 ) Makes it easier to flatten complex values to text, so you don't have to use a lot of Cypher to do it.	6 months ago
Jiaming	3d3cc71287	community[patch]: fix bugs for bilibili Loader (#18036 ) - Description: 1. Fix the BiliBiliLoader that can receive cookie parameters, it requires 3 other parameters to run. The change is backward compatible. 2. Add test; 3. Add example in docs - Issue: [#14213] Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Sachin Paryani	25c9f3d1d1	community[patch]: Support Streaming in Azure Machine Learning (#18246 ) - [x] PR title: "community: Support streaming in Azure ML and few naming changes" - [x] PR message: - Description: Added support for streaming for azureml_endpoint. Also, renamed and AzureMLEndpointApiType.realtime to AzureMLEndpointApiType.dedicated. Also, added new classes CustomOpenAIChatContentFormatter and CustomOpenAIContentFormatter and updated the classes LlamaChatContentFormatter and LlamaContentFormatter to now show a deprecated warning message when instantiated. --------- Co-authored-by: Sachin Paryani <saparan@microsoft.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
wulixuan	b7c8bc8268	community[patch]: fix yuan2 errors in LLMs (#19004 ) 1. fix yuan2 errors while invoke Yuan2. 2. update tests.	6 months ago
Davide Menini	f7042321f1	community[patch]: gather token usage info in BedrockChat during generation (#19127 ) This PR allows to calculate token usage for prompts and completion directly in the generation method of BedrockChat. The token usage details are then returned together with the generations, so that other downstream tasks can access them easily. This allows to define a callback for tokens tracking and cost calculation, similarly to what happens with OpenAI (see [OpenAICallbackHandler](https://api.python.langchain.com/en/latest/_modules/langchain_community/callbacks/openai_info.html#OpenAICallbackHandler). I plan on adding a BedrockCallbackHandler later. Right now keeping track of tokens in the callback is already possible, but it requires passing the llm, as done here: https://how.wtf/how-to-count-amazon-bedrock-anthropic-tokens-with-langchain.html. However, I find the approach of this PR cleaner. Thanks for your reviews. FYI @baskaryan, @hwchase17 --------- Co-authored-by: taamedag <Davide.Menini@swisscom.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Shengsheng Huang	ac1dd8ad94	community[minor]: migrate `bigdl-llm` to `ipex-llm` (#19518 ) - Description: `bigdl-llm` library has been renamed to [`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR migrates the `bigdl-llm` integration to `ipex-llm` . - Issue: N/A. The original PR of `bigdl-llm` is https://github.com/langchain-ai/langchain/pull/17953 - Dependencies: `ipex-llm` library - Contribution maintainer: @shane-huang Updated doc: docs/docs/integrations/llms/ipex_llm.ipynb Updated test: libs/community/tests/integration_tests/llms/test_ipex_llm.py	6 months ago
Chaunte W. Lacewell	a31f692f4e	community[minor]: Add VDMS vectorstore (#19551 ) - Description: Add support for Intel Lab's [Visual Data Management System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store - Dependencies: `vdms` library which requires protobuf = "4.24.2". There is a conflict with dashvector in `langchain` package but conflict is resolved in `community`. - Contribution maintainer: [@cwlacewe](https://github.com/cwlacewe) - Added tests: libs/community/tests/integration_tests/vectorstores/test_vdms.py - Added docs: docs/docs/integrations/vectorstores/vdms.ipynb - Added cookbook: cookbook/multi_modal_RAG_vdms.ipynb --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
yongheng.liu	7e29b6061f	community[minor]: integrate China Mobile Ecloud vector search (#15298 ) - Description: integrate China Mobile Ecloud vector search, - Dependencies: elasticsearch==7.10.1 Co-authored-by: liuyongheng <liuyongheng@cmss.chinamobile.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
CaroFG	cf96060ab7	community[patch]: update for compatibility with latest Meilisearch version (#18970 ) - Description: Updates Meilisearch vectorstore for compatibility with v1.6 and above. Adds embedders settings and embedder_name which are now required. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	6 months ago
Adrian Valente	2763d8cbe5	community: add len() implementation to Chroma (#19419 ) Thank you for contributing to LangChain! - [x] Add len() implementation to Chroma: "package: community" - [x] PR message: - Description: add an implementation of the __len__() method for the Chroma vectostore, for convenience. - Issue: no exposed method to know the size of a Chroma vectorstore - Dependencies: None - Twitter handle: lowrank_adrian - [x] Add tests and docs - [x] Lint and test --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Aayush Kataria	03c38005cb	community[patch]: Fixing some caching issues for AzureCosmosDBSemanticCache (#18884 ) Fixing some issues for AzureCosmosDBSemanticCache - Added the entry for "AzureCosmosDBSemanticCache" which was missing in langchain/cache.py - Added application name when creating the MongoClient for the AzureCosmosDBVectorSearch, for tracking purposes. @baskaryan, can you please review this PR, we need this to go in asap. These are just small fixes which we found today in our testing.	6 months ago
Anindyadeep	b2a11ce686	community[minor]: Prem AI langchain integration (#19113 ) ### Prem SDK integration in LangChain This PR adds the integration with [PremAI's](https://www.premai.io/) prem-sdk with langchain. User can now access to deployed models (llms/embeddings) and use it with langchain's ecosystem. This PR adds the following: ### This PR adds the following: - [x] Add chat support - [X] Adding embedding support - [X] writing integration tests - [X] writing tests for chat - [X] writing tests for embedding - [X] writing unit tests - [X] writing tests for chat - [X] writing tests for embedding - [X] Adding documentation - [X] writing documentation for chat - [X] writing documentation for embedding - [X] run `make test` - [X] run `make lint`, `make lint_diff` - [X] Final checks (spell check, lint, format and overall testing) --------- Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Marlene	f1313339ac	community[patch]: Fixing incorrect base URLs for Azure Cognitive Search Retriever (#19352 ) This PR adds code to make sure that the correct base URL is being created for the Azure Cognitive Search retriever. At the moment an incorrect base URL is being generated. I think this is happening because the original code was based on a depreciated API version. No dependencies need to be added. I've also added more context to the test doc strings. I should also note that ACS is now Azure AI Search. I will open a separate PR to make these changes as that would be a breaking change and should potentially be discussed. Twitter: @marlene_zw - No new tests added, however the current ACS retriever tests are now passing when I run them. - Code was linted. Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	6 months ago
Martin Kolb	e5bdb26f76	community[patch]: More flexible handling for entity names in vector store "HANA Cloud" (#19523 ) - Description: Added support for lower-case and mixed-case names The names for tables and columns previouly had to be UPPER_CASE. With this enhancement, also lower_case and MixedCase are supported, - Issue: N/A - Dependencies: no new dependecies added - Twitter handle: @sapopensource	6 months ago
Igor Muniz Soares	743f888580	community[minor]: Dappier chat model integration (#19370 ) Description: This PR adds [Dappier](https://dappier.com/) for the chat model. It supports generate, async generate, and batch functionalities. We added unit and integration tests as well as a notebook with more details about our chat model. Dependencies: No extra dependencies are needed.	6 months ago

1 2 3 4 5 ...

278 Commits (35ebd2620c56d8a109321d7eb3f7676e20175d69)