langchain

Commit Graph

Author	SHA1	Message	Date
Junlin Zhou	5b9bdcac1b	docs: fix link url (#9643 ) This pull request corrects the URL links in the Async API documentation to align with the updated project layout. The links had not been updated despite the changes in layout.	1 year ago
Aashish Saini	eb92da84a1	Fixings grammatical errors in Doc Files (#9647 ) Fixing some typos and grammatical errors in doc files. @eyurtsev , @baskaryan Thanks --------- Co-authored-by: Aayush <142384656+AayushShorthillsAI@users.noreply.github.com> Co-authored-by: Ishita Chauhan <136303787+IshitaChauhanShortHillsAI@users.noreply.github.com>	1 year ago
Joseph McElroy	2a06e7b216	ElasticsearchStore: improve error logging for adding documents (#9648 ) It is not obvious what the error is when you cannot index. This PR adds the ability to log the first error's reason, to help the user diagnose the issue. Also added some more documentation for when you want to use the vectorstore with an embedding model deployed in Elasticsearch. Credit: @elastic and @phoey1	1 year ago
Julien Salinas	f1072cc31f	Merge branch 'master' into master	1 year ago
Leonid Ganeline	e1f4f9ac3e	docs: `integrations/providers` (#9631 ) Added missed pages for `integrations/providers` from `vectorstores`. Updated several `vectorstores` notebooks.	1 year ago
anifort	900c1f3e8d	Add support for structured data sources with google enterprise search (#9037 ) - Description: Added the capability to handle structured data from Google Enterprise Search - Issue: Retriever failed when the underlying search engine was integrated with structured data - Dependencies: google-api-core - Tag maintainer: @jarokaz - Twitter handle: anifort --------- Co-authored-by: Christos Aniftos <aniftos@google.com> Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	1 year ago
Jacob Lee	632a83c48e	Update ChatOpenAI docs with fine-tuning example (#9632 )	1 year ago
Adilkhan Sarsen	f29312eb84	Fixing deeplake.mdx file as it uses outdates links (#9602 ) deeplake.mdx was using old links and was not working properly; this PR fixes the issue.	1 year ago
klae01	b868ef23bc	Add AINetwork blockchain toolkit integration (#9527 ) # Description This PR introduces a new toolkit for interacting with the AINetwork blockchain. The toolkit provides a set of tools for performing various operations on the AINetwork blockchain, such as transferring AIN, reading and writing values to the blockchain database, managing apps, setting rules and owners. # Dependencies [ain-py](https://github.com/ainblockchain/ain-py) >= 1.0.2 # Misc The example notebook (langchain/docs/extras/integrations/toolkits/ainetwork.ipynb) is in the PR --------- Co-authored-by: kriii <kriii@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Vanessa Arndorfer	1ea2f9adf4	Document AzureML Deployment Example (#9571 ) Description: Link an example of deploying a Langchain app to an AzureML online endpoint to the deployments documentation page. Co-authored-by: Vanessa Arndorfer <vaarndor@microsoft.com>	1 year ago
toddkim95	fba29f203a	Add to support polars (#9610 ) ### Description Polars is a DataFrame interface on top of an OLAP Query Engine implemented in Rust. Polars is faster to read than pandas, so I'm looking forward to seeing it added to the document loader. ### Dependencies polars (https://pola-rs.github.io/polars-book/user-guide/) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Julien Salinas	4d0b7bb8e1	Remove Dolphin and GPT-J from the embeddings docs. These models are not proposed anymore.	1 year ago
Jeremy Suriel	0fa4516ce4	Fix typo (#9565 ) Corrected a minor documentation typo here: https://python.langchain.com/docs/modules/model_io/models/llms/#generate-batch-calls-richer-outputs	1 year ago
Jacob Lee	0fea987dd2	Add missing param to parent document retriever notebook (#9569 )	1 year ago
Zizhong Zhang	00eff8c4a7	feat: Add PromptGuard integration (#9481 ) Add PromptGuard integration ------- There are two approaches to integrating PromptGuard with a LangChain application. 1. PromptGuardLLMWrapper 2. functions that can be used in LangChain expression. ----- - Dependencies: `promptguard` Python package, which is a runtime requirement if you want to try out the demo. - @baskaryan @hwchase17 Thanks for the ideas and suggestions along the development process. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Oleksandr Ichenskyi	8bc1a3dca8	docs: Add memgraph notebook (#9448 ) - Description: added graph_memgraph_qa.ipynb which shows how to use LLMs to provide a natural language interface to a Memgraph database using [MemgraphGraph](https://github.com/langchain-ai/langchain/pull/8591) class. - Dependencies: given that the notebook utilizes the MemgraphGraph class, it relies on both this class and several Python packages that are installed in the notebook using pip (langchain, openai, neo4j, gqlalchemy). The notebook is dependent on having a functional Memgraph instance running, as it requires this instance to establish a connection.	1 year ago
Bagatur	d09cdb4880	update data connection -> retrieval (#9561 )	1 year ago
Matthew Zeiler	949b2cf177	Improvements to the Clarifai integration (#9290 ) - Improved docs - Improved performance in multiple ways through batching, threading, etc. - fixed error message - Added support for metadata filtering during similarity search. @baskaryan PTAL	1 year ago
ricki-epsilla	66a47d9a61	add Epsilla vectorstore (#9239 ) [Epsilla](https://github.com/epsilla-cloud/vectordb) vectordb is an open-source vector database that leverages the advanced academic parallel graph traversal techniques for vector indexing. This PR adds basic integration with [pyepsilla](https://github.com/epsilla-cloud/epsilla-python-client)(Epsilla vectordb python client) as a vectorstore. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	4999e8af7e	pin pydantic api ref build (#9556 )	1 year ago
axiangcoding	05aa02005b	feat(llms): support ERNIE Embedding-V1 (#9370 ) - Description: support [ERNIE Embedding-V1](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/alj562vvu), which is part of ERNIE ecology - Issue: None - Dependencies: None - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
José Ferraz Neto	f116e10d53	Add SharePoint Loader (#4284 ) - Added a loader (`SharePointLoader`) that can pull documents (`pdf`, `docx`, `doc`) from the [SharePoint Document Library](https://support.microsoft.com/en-us/office/what-is-a-document-library-3b5976dd-65cf-4c9e-bf5a-713c10ca2872). - Added a Base Loader (`O365BaseLoader`) to be used for all Loaders that use [O365](https://github.com/O365/python-o365) Package - Code refactoring on `OneDriveLoader` to use the new `O365BaseLoader`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Utku Ege Tuluk	bb4f7936f9	feat(llms): add streaming support to textgen (#9295 ) - Description: Added streaming support to the textgen component in the llms module. - Dependencies: websocket-client = "^1.6.1"	1 year ago
Harrison Chase	9930ddc555	beef up retrieval docs (#9518 )	1 year ago
Leonid Ganeline	fdbeb52756	`Qwen` model example (#9516 ) added an example for `Qwen-7B` model on `HuggingFaceHub` 🤗	1 year ago
Martin Schade	0c8a88b3fa	AmazonTextractPDFLoader documentation updates (#9415 ) Description: Updating documentation to add AmazonTextractPDFLoader according to [comment](https://github.com/langchain-ai/langchain/pull/8661#issuecomment-1666572992) from [baskaryan](https://github.com/baskaryan) Adding one notebook and instructions to the modules/data_connection/document_loaders/pdf.mdx --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Asif Ahmad	08feed3332	Changed the NIBittensorLLM API URL to the correct one (#9419 ) Changed https://api.neuralinterent.ai/ to https://api.neuralinternet.ai/ which is the valid URL for the API of NIBittensorLLM.	1 year ago
EpixMan	103094286e	Fixing class calling error in the documentation of connecting_to_a_feature_store.ipynb (#9508 )	1 year ago
IlyaKIS1	fd8fe209cb	Added In-Depth Langchain Agent Execution Guide (#9507 ) Made a Notion document describing how LangChain executes agents, method by method, in the codebase. It can be helpful for developers who have just started working with the LangChain codebase.	1 year ago
Rosário P. Fernandes	09a92bb9bf	chatbots use case - fix broken collab URL (#9491 ) The current Colab URL returns a 404, since there is no `chatbots` directory under `use_cases`.	1 year ago
bsenst	a956b69720	fix typo in huggingface_hub.ipynb (#9499 )	1 year ago
Bagatur	d87cfd33e8	Update pydantic compatibility guide (#9496 )	1 year ago
Taqi Jaffri	069c0a041f	comment update for poetry install	1 year ago
Taqi Jaffri	5cd244e9b7	CR feedback	1 year ago
Ikko Eltociear Ashimine	0808949e54	Fix typo in apis.ipynb (#9490 ) funtions -> functions	1 year ago
RajneeshSinghShorthillsAI	129d056085	fixed spelling mistake and added missing bracket in parent_document_r… (#9380 ) …etriever.ipynb Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	1 year ago
Matt Robinson	83d2a871eb	fix: apply unstructured preprocess functions (#9473 ) ### Summary Fixes a bug from #7850 where post processing functions in Unstructured loaders were not applied. Adds an assertion to the test to verify the post processing function was applied and also updates the explanation in the example notebook.	1 year ago
NavanitDubeyShorthillsAI	b58d492e05	Update pydantic_compatibility.md (#9382 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	1 year ago
bsenst	083726ecda	fix small typo (#9464 )	1 year ago
Leonid Ganeline	99e5eaa9b1	`InternLM` example (#9465 ) Added `InternLM` model example to the HuggingFace Hub notebook	1 year ago
William FH	d4f790fd40	Fix imports in notebook (#9458 )	1 year ago
AmitSinghShorthillsAI	2b06792c81	Fixing spelling mistakes in fallbacks.ipynb (#9376 ) Fix spelling errors in the text: 'Therefore' and 'Retrying'. I want to stress that your feedback is invaluable to us and is genuinely cherished. With gratitude, @baskaryan @hwchase17	1 year ago
PuneetDhimanShorthillsAI	61e4a06447	Corrected Sentence in router.ipynb (#9377 ) Added missing question marks in the lines in the router.ipynb @baskaryan @hwchase17	1 year ago
呂安	ead04487fd	doc: make install from source more clearer (#9433 ) Description: just running `pip install -e .` will not install anything; we have to find the right directory in which to run `pip install -e .`	1 year ago
Leonid Ganeline	edcb03943e	👀 docs: updated `dependents` (#9426 ) Updated statistics (the previous statistics were taken more than a month ago). A lot of new dependents and more stars.	1 year ago
Holmodi	89a8121eaa	Fix a dead loop bug caused by assigning two variables with opposite values. (#9447 ) - Description: Fix a dead loop bug caused by assigning two variables with opposite values.	1 year ago
Lance Martin	589927e9e1	Update figure in OSS model guide (#9399 )	1 year ago
Bagatur	5d60ced7b3	pydantic compatibility guide fix (#9418 )	1 year ago
Bagatur	0c4683ebcc	Revert "Update compatibility guide for pydantic (#9396 )" (#9417 )	1 year ago
Eugene Yurtsev	b11c233304	Update compatibility guide for pydantic (#9396 ) Use langchain.pydantic_v1 instead of pydantic_v1	1 year ago
Leonid Kuligin	019aa04b06	fixed a pal chain reference (#9387 ) #9386 Co-authored-by: Leonid Kuligin <kuligin@google.com>	1 year ago
Sanskar Tanwar	c194828be0	Fixed Typo in Fallbacks.ipynb (#9373 ) Removed extra "the" in the sentence about the chicken crossing the road in fallbacks.ipynb. The sentence now reads correctly: "Why did the chicken cross the road?" This resolves the grammatical error and improves the overall quality of the content. @baskaryan , @hinthornw , @hwchase17	1 year ago
AashutoshPathakShorthillsAI	c71afb46d1	Corrected Sentence in .ipynb File (#9372 ) Fixed grammatical errors in the sentence by repositioning the word "are" for improved clarity and readability. @baskaryan @hwchase17 @hinthornw	1 year ago
Akshay Tripathi	de8dfde7f7	Corrected Grammatical errors in tutorials.mdx (#9358 ) I want to extend my heartfelt gratitude to the creator for masterfully crafting this remarkable application. 🙌 I am truly impressed by the meticulous attention to grammar and spelling in the documentation, which undoubtedly contributes to a polished and seamless reader experience. As always, your feedback holds immense value and is greatly appreciated. @baskaryan , @hwchase17	1 year ago
Md Nazish Arman	e842131425	Fixed Grammatical errors in tutorials.mdx (#9359 ) I want to convey my deep appreciation to the creator for their expert craftsmanship in developing this exceptional application. 👏 The remarkable dedication to upholding impeccable grammar and spelling in the documentation significantly enhances the polished and seamless experience for readers. I want to stress that your feedback is invaluable to us and is genuinely cherished. With gratitude, @baskaryan, @hwchase17	1 year ago
AnujMauryaShorthillsAI	6dedd94ba4	Update "Langchain" to "LangChain" in the tutorials.mdx file (#9361 ) In this commit, I have made a modification to the term "Langchain" to correctly reflect the project's name as "LangChain". This change ensures consistency and accuracy throughout the codebase and documentation. @baskaryan , @hwchase17	1 year ago
Adarsh Shrivastav	c5e23293f8	Corrected Typo in MultiPromptChain Example in router.ipynb (#9362 ) Refined the example in router.ipynb by addressing a minor typographical error. The typo "rins" has been corrected to "rains" in the code snippet that demonstrates the usage of the MultiPromptChain. This change ensures accuracy and consistency in the provided code example. This improvement enhances the readability and correctness of the notebook, making it easier for users to understand and follow the demonstration. The commit aims to maintain the quality and accuracy of the content within the repository. Thank you for your attention to detail, and please review the change at your convenience. @baskaryan , @hwchase17	1 year ago
AbhishekYadavShorthillsAI	90d7c55343	Fix Typo in "community.md" (#9360 ) Corrected a typographical error in the "community.md" file by removing an extra word from the sentence. @baskaryan , @hwchase17	1 year ago
Angel Luis	2e8733cf54	Fix typo in huggingface_textgen_inference.ipynb (#9313 ) Replaced incorrect `stream` parameter by `streaming` on Integrations docs.	1 year ago
Lance Martin	b04e472acf	Open source LLM guide (#9266 ) Guide for using open source LLMs locally.	1 year ago
Eugene Yurtsev	090411842e	Fix API reference docs (#9321 ) Do not document members nested within any private component	1 year ago
Eugene Yurtsev	0f9f213833	Pydantic Compatibility (#9327 ) Pydantic Compatibility Guidelines for migration plan + debugging	1 year ago
Chandler May	15f1af8ed6	Fix variable case in code snippet in docs (#9311 ) - Description: Fix a minor variable naming inconsistency in a code snippet in the docs - Issue: N/A - Dependencies: none - Tag maintainer: N/A - Twitter handle: N/A	1 year ago
Michael Bianco	23928a3311	docs: remove multiple code blocks from comma-separated docs (#9323 )	1 year ago
Navanit Dubey	3e6cea46e2	Guide import readable json (#9291 )	1 year ago
axiangcoding	63601551b1	fix(llms): improve the ernie chat model (#9289 ) - Description: improve the ernie chat model. - fix missing kwargs to payload - new test cases - add some debug level log - improve description - Issue: None - Dependencies: None - Tag maintainer: @baskaryan	1 year ago
Daniel Chalef	1d55141c50	zep/new ZepVectorStore (#9159 ) - new ZepVectorStore class - ZepVectorStore unit tests - ZepVectorStore demo notebook - update zep-python to ~1.0.2 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	b9ca5cc5ea	update guide import (#9279 )	1 year ago
Bagatur	afba2be3dc	update openai functions docs (#9278 )	1 year ago
Bagatur	9abf60acb6	Bagatur/vectara regression (#9276 ) Co-authored-by: Ofer Mendelevitch <ofer@vectara.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	1 year ago
Xiaoyu Xee	b30f449dae	Add dashvector vectorstore (#9163 ) ## Description Add `Dashvector` vectorstore for langchain - [dashvector quick start](https://help.aliyun.com/document_detail/2510223.html) - [dashvector package description](https://pypi.org/project/dashvector/) ## How to use ```python from langchain.vectorstores.dashvector import DashVector dashvector = DashVector.from_documents(docs, embeddings) ``` --------- Co-authored-by: smallrain.xuxy <smallrain.xuxy@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	bfbb97b74c	Bagatur/deeplake docs fixes (#9275 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>	1 year ago
Kunj-2206	1b3942ba74	Added BittensorLLM (#9250 ) Description: Adding NIBittensorLLM via Validator Endpoint to langchain llms Tag maintainer: @Kunj-2206 Maintainer responsibilities: Models / Prompts: @hwchase17, @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Toshish Jawale	852722ea45	Improvements in Nebula LLM (#9226 ) - Description: Added improvements in Nebula LLM to perform auto-retry; more generation parameters supported. Conversation is no longer required to be passed in the LLM object. Examples are updated. - Issue: N/A - Dependencies: N/A - Tag maintainer: @baskaryan - Twitter handle: symbldotai --------- Co-authored-by: toshishjawale <toshish@symbl.ai>	1 year ago
Bagatur	1aae77f26f	fix context nb (#9267 )	1 year ago
Alex Gamble	cf17c58b47	Update documentation for the Context integration with new URL and features (#9259 ) Update documentation and URLs for the Langchain Context integration. We've moved from getcontext.ai to context.ai \o/ Thanks in advance for the review!	1 year ago
Joseph McElroy	5e9687a196	Elasticsearch self-query retriever (#9248 ) Now that the ElasticsearchStore VectorStore is merged, I've added support for the self-query retriever. I've also added a notebook to demonstrate the capability, along with unit tests. Credit @elastic and @phoey1 on twitter.	1 year ago
Anthony Mahanna	0a04e63811	docs: Update ArangoDB Links (#9251 ) ready for review - mdx link update - colab link update	1 year ago
Hech	4b505060bd	fix: max_marginal_relevance_search and docs in Dingo (#9244 )	1 year ago
axiangcoding	664ff28cba	feat(llms): support ernie chat (#9114 ) Description: support ernie (文心一言) chat model Related issue: #7990 Dependencies: None Tag maintainer: @baskaryan	1 year ago
fanyou-wbd	5e43768f61	docs: update LlamaCpp max_tokens args (#9238 ) This PR updates documentation only; `max_length` should be `max_tokens` according to the latest LlamaCpp API doc: https://api.python.langchain.com/en/latest/llms/langchain.llms.llamacpp.LlamaCpp.html	1 year ago
Bagatur	a8aa1aba1c	nit (#9243 )	1 year ago
Bagatur	68d8f73698	consolidate redirects (#9242 )	1 year ago
Joshua Sundance Bailey	ef0664728e	ArcGISLoader update (#9240 ) Small bug fixes and added metadata based on user feedback. This PR is from the author of https://github.com/langchain-ai/langchain/pull/8873 .	1 year ago
Joseph McElroy	eac4ddb4bb	Elasticsearch Store Improvements (#8636 ) Todo: - [x] Connection options (cloud, localhost url, es_connection) support - [x] Logging support - [x] Customisable field support - [x] Distance Similarity support - [x] Metadata support - [x] Metadata Filter support - [x] Retrieval Strategies - [x] Approx - [x] Approx with Hybrid - [x] Exact - [x] Custom - [x] ELSER (excluding hybrid as we are working on RRF support) - [x] integration tests - [x] Documentation 👋 this is a contribution to improve Elasticsearch integration with Langchain. It's based loosely on the changes that are in master but with some notable changes: ## Package name & design improvements The import name is now `ElasticsearchStore`, to aid discoverability of the VectorStore. ```py ## Before from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch, ElasticKnnSearch ## Now from langchain.vectorstores.elasticsearch import ElasticsearchStore ``` ## Retrieval Strategy support Before we had a number of classes, depending on the strategy you wanted. `ElasticKnnSearch` for approx, `ElasticVectorSearch` for exact / brute force. With `ElasticsearchStore` we have retrieval strategies: ### Approx Example Default strategy for the vast majority of developers who use Elasticsearch will be inferring the embeddings from outside of Elasticsearch. Uses KNN functionality of _search. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index" ) output = docsearch.similarity_search("foo", k=1) ``` ### Approx, with hybrid Developers who want to search, using both the embedding and the text bm25 match. It's simple to enable. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ApproxRetrievalStrategy(hybrid=True) ) output = docsearch.similarity_search("foo", k=1) ``` ### Approx, with `query_model_id` Developers who want to infer within Elasticsearch, using the model loaded in the ml node. This relies on the developer to set up the pipeline and index if they wish to embed the text in Elasticsearch. Example of this in the test. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ApproxRetrievalStrategy( query_model_id="sentence-transformers__all-minilm-l6-v2" ), ) output = docsearch.similarity_search("foo", k=1) ``` ### I want to provide my own custom Elasticsearch Query You might want to have more control over the query, to perform multi-phase retrieval such as LTR, linearly boosting on document parameters like recently updated or geo-distance. You can do this with `custom_query_fn` ```py def my_custom_query(query_body: dict, query: str) -> dict: return {"query": {"match": {"text": {"query": "bar"}}}} texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), **elasticsearch_connection, index_name=index_name ) docsearch.similarity_search("foo", k=1, custom_query=my_custom_query) ``` ### Exact Example Developers who have a small dataset in Elasticsearch, don't want the cost of indexing the dims vs tradeoff on cost at query time. Uses script_score. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.ExactRetrievalStrategy(), ) output = docsearch.similarity_search("foo", k=1) ``` ### ELSER Example Elastic provides its own sparse vector model called ELSER. With these changes, it's really easy to use. The vector store creates a pipeline and index that's set up for ELSER. All the developer needs to do is configure, ingest and query via langchain tooling. ```py texts = ["foo", "bar", "baz"] docsearch = ElasticsearchStore.from_texts( texts, FakeEmbeddings(), es_url="http://localhost:9200", index_name="sample-index", strategy=ElasticsearchStore.SparseVectorStrategy(), ) output = docsearch.similarity_search("foo", k=1) ``` ## Architecture In future, we can introduce new strategies and allow us to not break bwc as we evolve the index / query strategy. ## Credit On release, could you credit @elastic and @phoey1 please? Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	71d5b7c9bf	Harrison/fallbacks (#9233 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Lance Martin	41279a3ae1	Move self-check use case to "more" section (#9137 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Lance Martin	22858d99b5	Move code-writing use case to "more" section (#9134 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	249d7d06a2	adapter doc nit (#9234 )	1 year ago
Lance Martin	969e1683de	Move graph use case to "more" section (#8997 ) Clean `use_cases` by moving the `GraphDB` to `integrations`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Lance Martin	d0a0d560ad	Minor formatting on Web Research Use Case (#9221 )	1 year ago
Lance Martin	17ae2998e7	Update Ollama docs (#9220 ) Based on discussion w/ team.	1 year ago
Krish Dholakia	49f1d8477c	Adding ChatLiteLLM model (#9020 ) Description: Adding a langchain integration for the LiteLLM library Tag maintainer: @hwchase17, @baskaryan Twitter handle: @krrish_dh / @Berri_AI --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Emmanuel Gautier	f11e5442d6	docs: update LlamaCpp input args (#9173 ) This PR only updates the LlamaCpp args documentation. The input arg has been flattened.	1 year ago
Massimiliano Pronesti	d95eeaedbe	feat(llms): support vLLM's OpenAI-compatible server (#9179 ) This PR aims at supporting [vLLM's OpenAI-compatible server feature](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server), i.e. allowing vLLM's LLMs to be called as if they were OpenAI's. I've also updated the related notebook providing an example usage. At the moment, vLLM only supports the `Completion` API.	1 year ago
Michael Goin	621da3c164	Adds DeepSparse as an LLM (#9184 ) Adds [DeepSparse](https://github.com/neuralmagic/deepsparse) as an LLM backend. DeepSparse supports running various open-source sparsified models hosted on [SparseZoo](https://sparsezoo.neuralmagic.com/) for performance gains on CPUs. Twitter handles: @mgoin_ @neuralmagic --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	0fa69d8988	Bagatur/zep python 1.0 (#9186 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	1 year ago
Harrison Chase	8d69dacdf3	multiple retreival in parralel (#9174 )	1 year ago
Eugene Yurtsev	aca8cb5fba	API Reference: Do not document private modules (#9042 ) This PR prevents documentation of private modules in the API reference	1 year ago
UmerHA	8aab39e3ce	Added SmartGPT workflow (issue #4463 ) (#4816 ) # Added SmartGPT workflow by providing SmartLLM wrapper around LLMs Edit: As @hwchase17 suggested, this should be a chain, not an LLM. I have adapted the PR. It is used like this: ``` from langchain.prompts import PromptTemplate from langchain.chains import SmartLLMChain from langchain.chat_models import ChatOpenAI hard_question = "I have a 12 liter jug and a 6 liter jug. I want to measure 6 liters. How do I do it?" hard_question_prompt = PromptTemplate.from_template(hard_question) llm = ChatOpenAI(model_name="gpt-4") prompt = PromptTemplate.from_template(hard_question) chain = SmartLLMChain(llm=llm, prompt=prompt, verbose=True) chain.run({}) ``` Original text: Added SmartLLM wrapper around LLMs to allow for SmartGPT workflow (as in https://youtu.be/wVzuvf9D9BU). SmartLLM can be used wherever LLM can be used. E.g: ``` smart_llm = SmartLLM(llm=OpenAI()) smart_llm("What would be a good company name for a company that makes colorful socks?") ``` or ``` smart_llm = SmartLLM(llm=OpenAI()) prompt = PromptTemplate( input_variables=["product"], template="What is a good name for a company that makes {product}?", ) chain = LLMChain(llm=smart_llm, prompt=prompt) chain.run("colorful socks") ``` SmartGPT consists of 3 steps: 1. Ideate - generate n possible solutions ("ideas") to user prompt 2. Critique - find flaws in every idea & select best one 3. Resolve - improve upon best idea & return it Fixes #4463 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	45741bcc1b	Bagatur/vectara nit (#9140 ) Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>	1 year ago
Dominick DEV	9b64932e55	Add LangChain utility for real-time crypto exchange prices (#4501 ) This commit adds the LangChain utility which allows for the real-time retrieval of cryptocurrency exchange prices. With LangChain, users can easily access up-to-date pricing information by running the command ".run(from_currency, to_currency)". This new feature provides a convenient way to stay informed on the latest exchange rates and make informed decisions when trading crypto. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Joshua Sundance Bailey	eaa505fb09	Create ArcGISLoader & example notebook (#8873 ) - Description: Adds the ArcGISLoader class to `langchain.document_loaders` - Allows users to load data from ArcGIS Online, Portal, and similar - Users can authenticate with `arcgis.gis.GIS` or retrieve public data anonymously - Uses the `arcgis.features.FeatureLayer` class to retrieve the data - Defines the most relevant keywords arguments and accepts `**kwargs` - Dependencies: Using this class requires `arcgis` and, optionally, `bs4.BeautifulSoup`. Tagging maintainers: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Aashish Saini	0aabded97f	Updating interactive walkthrough link in index.md to resolve 404 error (#9063 ) Updated interactive walkthrough link in index.md to resolve 404 error. Also, expressing deep gratitude to LangChain library developers for their exceptional efforts 🥇 . --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Hai The Dude	e4418d1b7e	Added new use case docs for Web Scraping, Chromium loader, BS4 transformer (#8732 ) - Description: Added a new use case category called "Web Scraping", and a tutorial to scrape websites using OpenAI Functions Extraction chain to the docs. - Tag maintainer:@baskaryan @hwchase17 , - Twitter handle: https://www.linkedin.com/in/haiphunghiem/ (I'm on LinkedIn mostly) --------- Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
niklub	16af5f8690	Add LabelStudio integration (#8880 ) This PR introduces [Label Studio](https://labelstud.io/) integration with LangChain via `LabelStudioCallbackHandler`: - sending data to the Label Studio instance - labeling dataset for supervised LLM finetuning - rating model responses - tracking and displaying chat history - support for custom data labeling workflow ### Example ``` chat_llm = ChatOpenAI(callbacks=[LabelStudioCallbackHandler(mode="chat")]) chat_llm([ SystemMessage(content="Always use emojis in your responses."), HumanMessage(content="Hey AI, how's your day going?"), AIMessage(content="🤖 I don't have feelings, but I'm running smoothly! How can I help you today?"), HumanMessage(content="I'm feeling a bit down. Any advice?"), AIMessage(content="🤗 I'm sorry to hear that. Remember, it's okay to seek help or talk to someone if you need to. 💬"), HumanMessage(content="Can you tell me a joke to lighten the mood?"), AIMessage(content="Of course! 🎭 Why did the scarecrow win an award? Because he was outstanding in his field! 🌾"), HumanMessage(content="Haha, that was a good one! Thanks for cheering me up."), AIMessage(content="Always here to help! 😊 If you need anything else, just let me know."), HumanMessage(content="Will do! By the way, can you recommend a good movie?"), ]) ``` <img width="906" alt="image" src="https://github.com/langchain-ai/langchain/assets/6087484/0a1cf559-0bd3-4250-ad96-6e71dbb1d2f3"> ### Dependencies - [label-studio](https://pypi.org/project/label-studio/) - [label-studio-sdk](https://pypi.org/project/label-studio-sdk/) https://twitter.com/labelstudiohq --------- Co-authored-by: nik <nik@heartex.net>	1 year ago
Bagatur	8cb2594562	Bagatur/dingo (#9079 ) Co-authored-by: gary <1625721671@qq.com>	1 year ago
Manuel Soria	31cfc00845	Code understanding use case (#8801 ) Code understanding docs --------- Co-authored-by: Manuel Soria <manuel.soria@greyscaleai.com> Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
Alvaro Bartolome	f7ae183f40	`ArgillaCallbackHandler` to properly use default values for `api_url` and `api_key` (#9113 ) As of the recent PR at #9043, after some testing we've realised that the default values were not being used for `api_key` and `api_url`. Besides that, the default for `api_key` was set to `argilla.apikey`, but since the default values are intended for people using the Argilla Quickstart (easy to run and setup), the defaults should be instead `owner.apikey` if using Argilla 1.11.0 or higher, or `admin.apikey` if using a lower version of Argilla. Additionally, we've removed the f-string replacements from the docstrings. --------- Co-authored-by: Gabriel Martin <gabriel@argilla.io>	1 year ago
Bagatur	0e5d09d0da	dalle nb fix (#9125 )	1 year ago
Francisco Ingham	9249d305af	tagging docs refactor (#8722 ) refactor of tagging use case according to new format --------- Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
Aayush Shah	a429145420	Minor grammatical error (#9102 ) Have corrected a grammatical error in: https://python.langchain.com/docs/modules/model_io/models/llms/ document 😄	1 year ago
Ashutosh Sanzgiri	991b448dfc	minor edits (#9093 ) Description: Minor edit to PR#845 Thanks!	1 year ago
Chenyu Zhao	c0acbdca1b	Update Fireworks model names (#9085 )	1 year ago
Charles Lanahan	a2588d6c57	Update openai embeddings notebook with correct embedding model in section 2 (#5831 ) In second section it looks like a copy/paste from the first section and doesn't include the specific embedding model mentioned in the example so I added it for clarity. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Josh Phillips	5fc07fa524	change id column type to uuid to match function (#7456 ) The table creation process in these examples commands do not match what the recently updated functions in these example commands is looking for. This change updates the type in the table creation command. Issue Number for my report of the doc problem #7446 @rlancemartin and @eyurtsev I believe this is your area Twitter: @j1philli Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bidhan Roy	02430e25b6	BagelDB (bageldb.ai), VectorStore integration. (#8971 ) - Description: [BagelDB](bageldb.ai) a collaborative vector database. Integrated the bageldb PyPi package with langchain with related tests and code. - Issue: Not applicable. - Dependencies: `betabageldb` PyPi package. - Tag maintainer: @rlancemartin, @eyurtsev, @baskaryan - Twitter handle: bageldb_ai (https://twitter.com/BagelDB_ai) We ran `make format`, `make lint` and `make test` locally. Followed the contribution guideline thoroughly https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --------- Co-authored-by: Towhid1 <nurulaktertowhid@gmail.com>	1 year ago
Harrison Chase	bb6fbf4c71	openai adapters (#8988 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io>	1 year ago
Piyush Jain	8eea46ed0e	Bedrock embeddings async methods (#9024 ) ## Description This PR adds the `aembed_query` and `aembed_documents` async methods for improving the embeddings generation for large documents. The implementation uses asyncio tasks and gather to achieve concurrency as there is no bedrock async API in boto3. ### Maintainers @agola11 @aarora79 ### Open questions To avoid throttling from the Bedrock API, should there be an option to limit the concurrency of the calls?	1 year ago
Nicolas	e3fb11bc10	docs: (Mendable Search) Fixes stuck when tabbing out issue (#9074 ) This fixes Mendable not completing when tabbing out and fixes the duplicate message issue as well.	1 year ago
Bagatur	1edead28b8	Add docs community page (#8992 ) Co-authored-by: briannawolfson <brianna.wolfson@gmail.com>	1 year ago
Eugene Yurtsev	a5a4c53280	RedisStore: Update init and Documentation updates (#9044 ) * Update Redis Store to support init from parameters * Update notebook to show how to use redis store, and some fixes in documentation	1 year ago
Bagatur	f3f5853e9f	update api ref exampels (#9065 ) manually update for now	1 year ago
Blake (Yung Cher Ho)	8d351bfc20	Takeoff integration (#9045 ) ## Description: This PR adds the Titan Takeoff Server to the available LLMs in LangChain. Titan Takeoff is an inference server created by [TitanML](https://www.titanml.co/) that allows you to deploy large language models locally on your hardware in a single command. Most generative model architectures are included, such as Falcon, Llama 2, GPT2, T5 and many more. Read more about Titan Takeoff here: - [Blog](https://medium.com/@TitanML/introducing-titan-takeoff-6c30e55a8e1e) - [Docs](https://docs.titanml.co/docs/titan-takeoff/getting-started) #### Testing As Titan Takeoff runs locally on port 8000 by default, no network access is needed. Responses are mocked for testing. - [x] Make Lint - [x] Make Format - [x] Make Test #### Dependencies No new dependencies are introduced. However, users will need to install the titan-iris package in their local environment and start the Titan Takeoff inferencing server in order to use the Titan Takeoff integration. Thanks for your help and please let me know if you have any questions. cc: @hwchase17 @baskaryan	1 year ago
Aashish Saini	8a320e55a0	Corrected grammatical errors and spelling mistakes in the index.mdx file. (#9026 ) Expressing gratitude to the creator for crafting this remarkable application. 🙌, Would like to Enhance grammar and spelling in the documentation for a polished reader experience. Your feedback is valuable as always @baskaryan , @hwchase17 , @eyurtsev	1 year ago
Eugene Yurtsev	5e05ba2140	Add embeddings cache (#8976 ) This PR adds the ability to temporarily cache or persistently store embeddings. A notebook has been included showing how to set up the cache and how to use it with a vectorstore.	1 year ago
Lance Martin	2380492c8e	API use case (#8546 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Luca Foppiano	dfb93dd2b5	Improved grobid documentation (#9025 ) - Description: Improvement in the Grobid loader documentation, typos and suggesting to use the docker image instead of installing Grobid in local (the documentation was also limited to Mac, while docker allow running in any platform) - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @whitenoise	1 year ago
Hiroshige Umino	2c7297d243	Fix a broken code block display (#9034 ) - Description: Fix a broken code block in this page: https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/ - Issue: N/A - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: yaotti	1 year ago
Piyush Jain	3b51817706	Updating port and ssl use in sample notebook (#8995 ) ## Description This PR updates the sample notebook to use the default port (8182) and the ssl for the Neptune database connection.	1 year ago
Michael Shen	c2f46b2cdb	Fixed wrong paper reference (#8970 ) The ReAct reference references to MRKL paper. Corrected so that it points to the actual ReAct paper #8964.	1 year ago
Jerzy Czopek	539672a7fd	Feature/fix azureopenai model mappings (#8621 ) This pull request aims to ensure that the `OpenAICallbackHandler` can properly calculate the total cost for Azure OpenAI chat models. The following changes have resolved this issue: - The `model_name` has been added to the ChatResult llm_output. Without this, the default values of `gpt-35-turbo` were applied. This was causing the total cost for Azure OpenAI's GPT-4 to be significantly inaccurate. - A new parameter `model_version` has been added to `AzureChatOpenAI`. Azure does not include the model version in the response. With the addition of `model_name`, this is not a significant issue for GPT-4 models, but it's an issue for GPT-3.5-Turbo. Version 0301 (default) of GPT-3.5-Turbo on Azure has a flat rate of 0.002 per 1k tokens for both prompt and completion. However, version 0613 introduced a split in pricing for prompt and completion tokens. - The `OpenAICallbackHandler` implementation has been updated with the proper model names, versions, and cost per 1k tokens. Unit tests have been added to ensure the functionality works as expected; the Azure ChatOpenAI notebook has been updated with examples. Maintainers: @hwchase17, @baskaryan Twitter handle: @jjczopek --------- Co-authored-by: Jerzy Czopek <jerzy.czopek@avanade.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	7de6a1b78e	parent document retriever (#8941 )	1 year ago
Taqi Jaffri	5919c0f4a2	notebook cleanup	1 year ago
Taqi Jaffri	bcdf3be530	Merge branch 'master' into tjaffri/docugami_loader_source	1 year ago
arjunbansal	a2681f950d	add instructions on integrating Log10 (#8938 ) - Description: Instruction for integration with Log10: an [open source](https://github.com/log10-io/log10) proxiless LLM data management and application development platform that lets you log, debug and tag your Langchain calls - Tag maintainer: @baskaryan - Twitter handle: @log10io @coffeephoenix Several examples showing the integration included [here](https://github.com/log10-io/log10/tree/main/examples/logging) and in the PR	1 year ago
Aarav Borthakur	3f64b8a761	Integrate Rockset as a chat history store (#8940 ) Description: Adds Rockset as a chat history store Dependencies: no changes Tag maintainer: @hwchase17 This PR passes linting and testing. I added a test for the integration and an example notebook showing its use.	1 year ago
Bagatur	0a1be1d501	document lcel fallbacks (#8942 )	1 year ago
Molly Cantillon	99b5a7226c	Weaviate: adding auth example + fixing spelling in ReadME (#8939 ) Added basic auth example to Weaviate notebook @baskaryan	1 year ago
Joe Reuter	8f0cd91d57	Airbyte based loaders (#8586 ) This PR adds 8 new loaders: * `AirbyteCDKLoader` This reader can wrap and run all python-based Airbyte source connectors. * Separate loaders for the most commonly used APIs: * `AirbyteGongLoader` * `AirbyteHubspotLoader` * `AirbyteSalesforceLoader` * `AirbyteShopifyLoader` * `AirbyteStripeLoader` * `AirbyteTypeformLoader` * `AirbyteZendeskSupportLoader` ## Documentation and getting started I added the basic shape of the config to the notebooks. This increases the maintenance effort a bit, but I think it's worth it to make sure people can get started quickly with these important connectors. This is also why I linked the spec and the documentation page in the readme as these two contain all the information to configure a source correctly (e.g. it won't suggest using oauth if that's avoidable even if the connector supports it). ## Document generation The "documents" produced by these loaders won't have a text part (instead, all the record fields are put into the metadata). If a text is required by the use case, the caller needs to do custom transformation suitable for their use case. ## Incremental sync All loaders support incremental syncs if the underlying streams support it. By storing the `last_state` from the reader instance away and passing it in when loading, it will only load updated records. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	7543a3d70e	Harrison/image (#845 ) Co-authored-by: Ashutosh Sanzgiri <sanzgiri@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Leonid Ganeline	33a2f58fbf	`tensoflow_datasets` document loader (#8721 ) This PR adds `tensoflow_datasets` document loader	1 year ago
Leonid Ganeline	2d078c7767	`PubMed` document loader (#8893 ) - added `PubMed Document Loader` artifacts; ut-s; examples - fixed `PubMed utility`; ut-s @hwchase17	1 year ago
Jeremy W	c5c0735fc4	Remove Evaluation from Modules page (#8926 ) Remove Evaluation link (which gives 404 now) from Modules page, since it lives under Guides page now	1 year ago
Seif	6327eecdaf	Fix typo in Vectara docs (#8925 ) Fixed a typo in the Vectara docs description.	1 year ago
Chris Pappalardo	beab637f04	added filter kwarg to VectorStoreIndexWrapper query and query_with_so… (#8844 ) - Description: added filter to query methods in VectorStoreIndexWrapper for filtering by metadata (i.e. search_kwargs) - Tag maintainer: @rlancemartin, @eyurtsev Updated the doc snippet on this topic as well. It took me a long while to figure out how to filter the vectorstore by filename, so this might help someone else out. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Apurv Agarwal	4a63533216	addition to docs at 'Store and reference chat history' (#8910 ) - Description: I have added an example showing how to pass a custom template to ConversationRetrievalChain. Instead of CONDENSE_QUESTION_PROMPT we can pass any prompt in the argument condense_question_prompt. Look in Use cases -> QA over Documents -> How to -> Store and reference chat history, - Issue: #8864, - Dependencies: NA, - Tag maintainer: @hinthornw, - Twitter handle: --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
David vonThenen	bf4a112aa6	Fixes to the Nebula LLM Integration (#8918 ) This addresses some issues with introducing the Nebula LLM to LangChain in this PR: https://github.com/langchain-ai/langchain/pull/8876 This fixes the following: - Removes `SYMBLAI` from variable names - Fixes bug with `Bearer` for the API KEY Thanks again in advance for your help! cc: @hwchase17, @baskaryan --------- Co-authored-by: dvonthenen <david.vonthenen@gmail.com>	1 year ago
Jacob Lee	d1e305028f	Automatically set docs appearance to system default (#8924 ) @baskaryan	1 year ago
Josh Hart	6116cbf0de	Fix imports in awslambda docs (#8916 ) Minor doc fix to awslambda tool notebook. Add missing import for initialize_agent to awslambda agent example Co-authored-by: Josh Hart <josharj@amazon.com>	1 year ago
Maurits de Groot	61c2d918c6	Fixed inaccurate import in integrations:providers:bedrock documentation (#8915 ) Description: Fixed inaccurate import in integrations:providers:bedrock documentation In the current version of the bedrock documentation, page https://python.langchain.com/docs/integrations/providers/bedrock it states that the import is from langchain import Bedrock This has been changed to from langchain.llms.bedrock import Bedrock as stated in https://python.langchain.com/docs/integrations/llms/bedrock Issue: Not applicable Dependencies No dependencies required Tag maintainer @baskaryan Twitter handle: Not applicable	1 year ago
Manuel Soria	e74a605379	SQL use case docs (#8513 )	1 year ago
Jacob Lee	fa30a57034	Adds Ollama as an LLM (#8829 ) Adds Ollama as an LLM. Ollama can run various open source models locally e.g. Llama 2 and Vicuna, automatically configuring and GPU-optimizing them. @rlancemartin @hwchase17 --------- Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
Ash Vardanian	1f9124ceaa	Add: USearch Vector Store (#8835 ) ## Description I am excited to propose an integration with USearch, a lightweight vector-search engine available for both Python and JavaScript, among other languages. ## Dependencies It introduces a new PyPi dependency - `usearch`. I am unsure if it must be added to the Poetry file, as this would make the PR too clunky. Please let me know. ## Profiles - Maintainers: @ashvardanian @davvard - Twitter handles: @ashvardanian @unum_cloud --------- Co-authored-by: Davit Vardanyan <78792753+davvard@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Leonid Kuligin	b52a3785c9	Allow to specify a custom loader for GcsFileLoader (#8868 ) Co-authored-by: Leonid Kuligin <kuligin@google.com>	1 year ago
Jeffrey Wang	ff44fe4e16	Change default Metaphor search example to use prompt optimizer (#8890 ) - fix install command - change example notebook to use Metaphor autoprompt by default	1 year ago
Jeffrey Wang	ce3666c28b	Fix metaphor install command in guide (#8888 )	1 year ago
Harrison Chase	bbd22b9b76	update metaphor docs (#8886 )	1 year ago
Carson	cc908d49a3	Fixes typo in documentation (#8882 ) Fixes a simple typo in the google search engine tool documentation @baskaryan	1 year ago
Joshua Sundance Bailey	7fc07ba5df	Create ChatAnyscale (#8770 ) - Description: Adds the ChatAnyscale class with llama-2 7b, llama-2 13b, and llama-2 70b on [Anyscale Endpoints](https://app.endpoints.anyscale.com/) - It inherits from ChatOpenAI and requires openai (probably unnecessary but it made for a quick and easy implementation) - Inspired by https://github.com/langchain-ai/langchain/pull/8434 (@kylehh and @baskaryan )	1 year ago
David vonThenen	40079d4936	Introduce Nebula LLM to LangChain (#8876 ) ## Description This PR adds Nebula to the available LLMs in LangChain. Nebula is an LLM focused on conversation understanding and enables users to extract conversation insights from video, audio, text, and chat-based conversations. These conversations can occur between any mix of human or AI participants. Examples of some questions you could ask Nebula from a given conversation are: - What could be the customer’s pain points based on the conversation? - What sales opportunities can be identified from this conversation? - What best practices can be derived from this conversation for future customer interactions? You can read more about Nebula here: https://symbl.ai/blog/extract-insights-symbl-ai-generative-ai-recall-ai-meetings/ #### Integration Test An integration test is added, but it requires network access. Since Nebula is fully managed like OpenAI, network access is required to exercise the integration test. #### Linting - [x] make lint - [x] make test (TODO: there seems to be a failure in another non-related test??? Need to check on this.) - [x] make format ### Dependencies No new dependencies were introduced. ### Twitter handle [@symbldotai](https://twitter.com/symbldotai) [@dvonthenen](https://twitter.com/dvonthenen) If you have any questions, please let me know. cc: @hwchase17, @baskaryan --------- Co-authored-by: dvonthenen <david.vonthenen@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Lance Martin	84c1ad7eaa	Fix colab link for extraction ntbk (#8878 )	1 year ago
Nuno Campos	9892e95d03	Add flush=True to stream examples (#8862 )	1 year ago
manmax31	40096c73cd	Add BGE embeddings support (#8848 ) - Description: [BGE-large](https://huggingface.co/BAAI/bge-large-en) embeddings from BAAI are at the top of [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard). Hence adding support for it. - Tag maintainer: @baskaryan - Twitter handle: @ManabChetia3 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Tudor Golubenco	aeaef8f3a3	Add support for Xata as a vector store (#8822 ) This adds support for [Xata](https://xata.io) (data platform based on Postgres) as a vector store. We have recently added [Xata to Langchain.js](https://github.com/hwchase17/langchainjs/pull/2125) and would love to have the equivalent in the Python project as well. The PR includes integration tests and a Jupyter notebook as docs. Please let me know if anything else would be needed or helpful. I have added the xata python SDK as an optional dependency. ## To run the integration tests You will need to create a DB in xata (see the docs), then run something like: ``` OPENAI_API_KEY=sk-... XATA_API_KEY=xau_... XATA_DB_URL='https://....xata.sh/db/langchain' poetry run pytest tests/integration_tests/vectorstores/test_xata.py ``` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Philip Krauss <35487337+philkra@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	472f00ada7	add moderation example (#8718 )	1 year ago
Massimiliano Pronesti	a616e19975	feat(llms): add support for vLLM (#8806 ) Hello langchain maintainers, this PR aims at integrating [vllm](https://vllm.readthedocs.io/en/latest/#) into langchain. This PR closes #8729. This feature clearly depends on `vllm`, but I've seen other models supported here depend on packages that are not included in the pyproject.toml (e.g. `gpt4all`, `text-generation`) so I thought it was the case for this as well. @hwchase17, @baskaryan --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Karthik Raja A	5a9765b1b5	MultiOn client toolkit update 2.0 (#8750 ) - Updated to use newer better function interaction - Previous version had only one callback - @hinthornw @hwchase17 Can you look into this - Shout out to @MultiON_AI @DivGarg9 on twitter --------- Co-authored-by: Naman Garg <ngarg3@binghamton.edu> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	0adc282d70	Harrison/as retriever docstring (#8840 ) Co-authored-by: Bytestorm <31070777+Bytestorm5@users.noreply.github.com>	1 year ago
Zend	bd4865b6fe	Async Recursive URL loader (#8502 ) Description: This PR improves the function of recursive_url_loader, such as limiting the depth of the access, and customizable extractors(from the raw webpage to the text of the Document object), so that users can use other tools to extract the webpage. This PR also includes the document and test for the new loader. Old PR closed due to project structure change. #7756 Because socket requests are not allowed, the old unit test was removed. Issue: N/A Dependencies: asyncio, aiohttp Tag maintainer: @rlancemartin Twitter handle: @ Zend_Nihility --------- Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
fqassemi	485d716c21	Feature faiss delete (#8135 ) - Description: docstore had two main method: add and search, however, dealing with docstore sometimes requires deleting an entry from docstore. So I have added a simple delete method that deletes items from docstore. Additionally, I have added the delete method to faiss vectorstore for the very same reason. - Issue: NA - Dependencies: NA - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Nicolas	b57fa1a39c	docs: Improvements on Mendable Search (#8808 ) - Balancing prioritization between keyword / AI search - Show snippets of highlighted keywords when searching - Improved keyword search - Fixed bugs and issues Shoutout to @calebpeffer for implementing and gathering feedback on it cc: @dev2049 @rlancemartin @hwchase17	1 year ago
Ikko Eltociear Ashimine	6b93670410	Fix typo in long_context_reorder.ipynb (#8811 ) begining -> beginning	1 year ago
Harrison Chase	2bb1d256f3	add example of memory and returning retrieved docs (#8830 )	1 year ago
Kshitij Wadhwa	5f1aab5487	Fix docs for Rockset (#8807 ) * remove error output for notebook * add comment about vector length for ingest transformation * change OPENAI_KEY -> OPENAI_API_KEY cc @baskaryan	1 year ago
Bagatur	d7b613a293	Bagatur/revert revert nuclia (#8833 )	1 year ago
Bagatur	2f309a4ce6	Revert "Bagatur/nuclia (#8404 )" (#8832 )	1 year ago
Snehil Kumar	1bd4890506	Update links on QA Use Case docs (#8784 ) - Description: 2 links were not working on Question Answering Use Cases documentation page. Hence, changed them to nearest useful links, - Issue: NA, - Dependencies: NA, - Tag maintainer: @baskaryan, - Twitter handle: NA	1 year ago
Bal Narendra Sapa	a22d502248	added the embeddings part (#8805 ) Description: forgot to add the embeddings part in the documentation. sorry 😅 @baskaryan	1 year ago
Bagatur	9fc9018951	Bagatur/nuclia (#8404 ) Co-authored-by: Eric BREHAULT <ebrehault@gmail.com>	1 year ago
Francisco Ingham	ef5bc1fef1	Refactor for extraction docs (#8465 ) Refactor for the extraction use case documentation --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
Bagatur	21771a6f1c	rm sklearn links (#8773 )	1 year ago
Joshua Carroll	e5fed7d535	Extend the StreamlitChatMessageHistory docs with a fuller example and… (#8774 ) Add more details to the [notebook for StreamlitChatMessageHistory](https://python.langchain.com/docs/integrations/memory/streamlit_chat_message_history), including a link to a [running example app](https://langchain-st-memory.streamlit.app/). Original PR: https://github.com/langchain-ai/langchain/pull/8497	1 year ago
Eugene Yurtsev	19dfe166c9	Update documentation for prompts (#8381 ) * Documentation to favor creation without declaring input_variables * Cut out obvious examples, but add more description in a few places --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	1 year ago
Dayou Liu	91a0817e39	docs: llamacpp minor fixes (#8738 ) - Description: minor updates on llama cpp doc	1 year ago
Eugene Yurtsev	003e1ca9a0	Update api references (#8646 ) Update API reference documentation. This PR will pick up a number of missing classes, it also applies selective formatting based on the class / object type.	1 year ago
Snehil Kumar	a6ee646ef3	Update get_started.mdx (#8744 ) - Description: Added a missing word and rearranged a sentence in the documentation of Self Query Retrievers., - Issue: NA, - Dependencies: NA, - Tag maintainer: @baskaryan, - Twitter handle: NA Thanks for your time.	1 year ago
Bal Narendra Sapa	bd61757423	add documentation for serializer function (#8769 ) Description: Added necessary documentation for serializer functions @baskaryan	1 year ago
rjanardhan3	affaaea87b	Updates fireworks (#8765 ) - Description: Updates to Fireworks Documentation, - Issue: N/A, - Dependencies: N/A, - Tag maintainer: @rlancemartin, --------- Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>	1 year ago
Bagatur	8c35fcb571	update rss doc (#8761 )	1 year ago
Bagatur	0d5a90f30a	Revert "add filter to sklearn vector store functions (#8113 )" (#8760 )	1 year ago
Lance Martin	be638ad77d	Chatbots use case (#8554 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Ruiqi Guo	6aee589eec	Add ScaNN support in vectorstore. (#8251 ) Description: Add ScaNN vectorstore to langchain. ScaNN is an open-source, high-performance vector similarity library optimized for AVX2-enabled CPUs. https://github.com/google-research/google-research/tree/master/scann - Dependencies: scann Python notebook to illustrate the usage: docs/extras/integrations/vectorstores/scann.ipynb Integration test: libs/langchain/tests/integration_tests/vectorstores/test_scann.py @rlancemartin, @eyurtsev for review. Thanks!	1 year ago
shibuiwilliam	0f0ccfe7f6	add filter to sklearn vector store functions (#8113 ) # What - This is to add filter option to sklearn vector store functions - Description: Add filter to sklearn vector store functions. - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @MlopsJ --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
shibuiwilliam	2759e2d857	add save and load tfidf vectorizer and docs for TFIDFRetriever (#8112 ) This is to add save_local and load_local to tfidf_vectorizer and docs in tfidf_retriever to make the vectorizer reusable. - Description: add save_local and load_local to tfidf_vectorizer and docs in tfidf_retriever - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @MlopsJ --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Lance Martin	d1b95db874	Retriever that can re-phase user inputs (#8026 ) Simple retriever that applies an LLM between the user input and the query passed to the retriever. It can be used to pre-process the user input in any way. The default prompt: ``` DEFAULT_QUERY_PROMPT = PromptTemplate( input_variables=["question"], template="""You are an assistant tasked with taking a natural languge query from a user and converting it into a query for a vectorstore. In this process, you strip out information that is not relevant for the retrieval task. Here is the user query: {question} """ ) ``` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	6c3573e7f6	Harrison/aleph alpha (#8735 ) Co-authored-by: PiotrMazurek <piotr.mazurek@aleph-alpha.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Ilya	6f0bccfeb5	Add regex control over separators in character text splitter (#7933 ) #7854 Added the ability to use the `separator` as a regex or a simple character. Fixed a bug where `start_index` was incorrectly counting from -1. Who can review? @eyurtsev @hwchase17 @mmz-001	1 year ago
Ofer Mendelevitch	29f51055e8	Updates to Vectara documentation (#8699 ) - Description: updates to Vectara documentation with more details on how to get started. - Issue: NA - Dependencies: NA - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @vectara, @ofermend --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
ruze	8ef7e14a85	RSS Feed / OPML loader (#8694 ) Replace this comment with: - Description: added a document loader for a list of RSS feeds or OPML. It iterates through the list and uses NewsURLLoader to load each article. - Issue: N/A - Dependencies: feedparser, listparser - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @ruze --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	b2b71b0d35	Bagatur/eden llm (#8670 ) Co-authored-by: RedhaWassim <rwasssim@gmail.com> Co-authored-by: KyrianC <ckyrian@protonmail.com> Co-authored-by: sam <melaine.samy@gmail.com>	1 year ago
axa99	1f54ec899b	updated interface jupyter notebook explanations (#8689 ) Updated the documentation in the interface.ipynb to clearly show the _input_ and _output_ types for various components @baskaryan	1 year ago
Lance Martin	37aade19da	Minor formatting and additional figure for summarization use case (#8663 )	1 year ago
Harrison Chase	43dffe39fb	Harrison/conversational retrieval agent (#8639 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
ruze	71f98db2fe	Newspaper (#8647 ) - Description: Added newspaper3k based news article loader. Provide a list of urls. - Issue: N/A - Dependencies: newspaper3k, - Tag maintainer: @rlancemartin , @eyurtsev - Twitter handle: @ruze --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
millerick	5018af8839	docs: fix some grammar (#8654 ) ### Description Fixes a grammar issue I noticed when reading through the documentation. ### Maintainers @baskaryan Co-authored-by: mmillerick <mmillerick@blend.com>	1 year ago
Lance Martin	59194c2214	Add summarization use-case (#8376 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Will Thompson	ee1d13678e	🐛 Docs Fixes [2 one-liners, examples broken] (#8519 ) ## Description: 1)Map reduce example in docs is missing an important import statement. Figured other people would benefit from being able to copy 🍝 the code. 2)RefineDocumentsChain example also broken. ## Issue: None ## Dependencies: None. One liner. ## Tag maintainer: @baskaryan ## Twitter handle: I mean, it's a one line fix lol. But @will_thompson_k is my twitter handle.	1 year ago
Leonid Ganeline	1335f2b9f8	`MLflow` examples (#8642 ) Updated `MLflow` examples with links to the examples from MLflow @baskaryan	1 year ago
Comendeiro	5c516945d0	Add local support for audio models (PR #7329 ) (#7591 ) - Description: run the poetry dependencies - Issue: #7329 - Tag maintainer: @rlancemartin --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
rjanardhan3	68113348cc	Fireworks integration (#8322 ) Description - Integrates Fireworks within Langchain LLMs to allow users to use Fireworks models with Langchain, mainly for summarization. Issue - Not applicable Dependencies - None Tag maintainer - @rlancemartin --------- Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>	1 year ago
Taqi Jaffri	4806504ebc	Fixed one last key name	1 year ago
Joshua Carroll	6705928b9d	Add StreamlitChatMessageHistory (#8497 ) Add a StreamlitChatMessageHistory class that stores chat messages in [Streamlit's Session State](https://docs.streamlit.io/library/api-reference/session-state). Note: The integration test uses a currently-experimental Streamlit testing framework to simulate the execution of a Streamlit app. Marking this PR as draft until I confirm with the Streamlit team that we're comfortable supporting it. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Matt Robinson	8961c720b8	docs: update `unstructured` install instructions (#8596 ) ### Summary Updates the `unstructured` install instructions. For `unstructured>=0.9.0`, dependencies are broken out by document type and the base `unstructured` package includes fewer dependencies. `pip install "unstructured[local-inference]"` has been replaced by `pip install "unstructured[all-docs]"`, though the `local-inference` extra is still supported for the time being. ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	1 year ago
Bagatur	73072d3db8	mv (#8595 )	1 year ago
Tesfagabir Meharizghi	a7000ee89e	Callback handler for Amazon SageMaker Experiments (#8587 ) ## Description This PR implements a callback handler for SageMaker Experiments which is similar to that of mlflow. * When creating the callback handler, it takes the experiment's run object as an argument. All the callback outputs are then logged to the run object. * The output of each callback action (e.g., `on_llm_start`) is saved to an S3 bucket as a JSON file. * Optionally, you can also log additional information such as the LLM hyper-parameters to the same run object. * Once the callback object is no longer needed, you will need to call the `flush_tracker()` method. This makes sure that any intermediate files are deleted. * A separate notebook example is provided to show how the callback is used. @3coins @agola11 --------- Co-authored-by: Tesfagabir Meharizghi <mehariz@amazon.com>	1 year ago
Taqi Jaffri	96843f3bd4	Fixed source key name for docugami loader	1 year ago
mpb159753	7df2dfc4c2	Add Support for Loading Documents from Huawei OBS (#8573 ) Description: This PR adds support for loading documents from Huawei OBS (Object Storage Service) in Langchain. OBS is a cloud-based object storage service provided by Huawei Cloud. With this enhancement, Langchain users can now easily access and load documents stored in Huawei OBS directly into the system. Key Changes: - Added a new document loader module specifically for Huawei OBS integration. - Implemented the necessary logic to authenticate and connect to Huawei OBS using access credentials. - Enabled the loading of individual documents from a specified bucket and object key in Huawei OBS. - Provided the option to specify custom authentication information or obtain security tokens from Huawei Cloud ECS for easy access. How to Test: 1. Ensure the required package "esdk-obs-python" is installed. 2. Configure the endpoint, access key, secret key, and bucket details for Huawei OBS in the Langchain settings. 3. Load documents from Huawei OBS using the updated document loader module. 4. Verify that documents are successfully retrieved and loaded into Langchain for further processing. Please review this PR and let us know if any further improvements are needed. Your feedback is highly appreciated! @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	66226d1d4d	add example for memory (#8552 )	1 year ago
Shantanu Nair	53f3793504	Fast load conversationsummarymemory from existing summary (#7533 ) - Description: Adds an optional buffer arg to the memory's from_messages() method. If provided the existing memory will be loaded instead of regenerating a summary from the loaded messages. Why? If we have past messages to load from, it is likely we also have an existing summary. This is particularly helpful in cases where the chat is ephemeral and/or is backed by serverless where the chat history is not stored but where the updated chat history is passed back and forth between a backend/frontend. Eg: Take a stateless qa backend implementation that loads messages on every request and generates a response — without this addition, each time the messages are loaded via from_messages, the summaries are recomputed even though they may have just been computed during the previous response. With this, the previously computed summary can be passed in and avoid: 1) spending extra $$$ on tokens, and 2) increased response time by avoiding regenerating previously generated summary. Tag maintainer: @hwchase17 Twitter handle: https://twitter.com/ShantanuNair --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
DJ Atha	ec40ead980	Fixed bug7445 where a duplicate restuld_id is added to the vectorstore. (#7573 ) - Description: updated BabyAGI examples to append the iteration to the result id to fix error storing data to vectorstore. - Issue: 7445 - Dependencies: no - Tag maintainer: @eyurtsev This fix worked for me locally. Happy to take some feedback and iterate on a better solution. I was considering appending a uuid instead but didn't want to overcomplicate the example.	1 year ago
Kenny	1e8fca5518	Add ConcurrentLoader (#7512 ) Works just like the GenericLoader but concurrently for those who choose to optimize their workflow. @rlancemartin @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Danny Davenport	8d2344db43	updates some spelling mistakes (#8537 ) Just updating some spelling / grammar issues in the documentation. No code changes. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Leonid Kuligin	b4a126ae71	Updated docs on Vertex AI going GA (#8531 ) #8074 Co-authored-by: Leonid Kuligin <kuligin@google.com>	1 year ago
Bharat Raghunathan	c19a0b9c10	doc(prompts): Follow up on broken Prompt Sublink pages (#8530 ) - Description: Follow up of #8478 - Issue: #8477 - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: [@BharatR123](twitter.com/BharatR123) The links were still broken after #8478 and sadly the issue was not caught by either the Vercel app build or `make docs_linkcheck`	1 year ago
Harrison Chase	bca0749a11	conversational retrieval chain in lcel (#8532 )	1 year ago
Jeff Huber	07d6d1ca38	fix error in chroma docker instructions (#8533 ) This makes the Chroma instructions for Docker work! https://python.langchain.com/docs/integrations/vectorstores/chroma#basic-example-using-the-docker-container	1 year ago
Matthew DeGuzman	844eca98d5	Add LLaMa Formatter and AzureML Chat Endpoint (#8382 ) ## Description Microsoft and Meta recently [announced their collaboration](https://blogs.microsoft.com/blog/2023/07/18/microsoft-and-meta-expand-their-ai-partnership-with-llama-2-on-azure-and-windows/) on LLaMa2. This PR extends the current LLM wrapper and introduces a new Chat Model wrapper for AzureML to support LLaMa2. ## Dependencies No dependencies added :) ## Twitter Handles [@matthew_d13](https://twitter.com/matthew_d13) [@prakhar_in](https://twitter.com/prakhar_in) maintainers - @hwchase17, @baskaryan	1 year ago
Anthony Mahanna	1ab773c742	docs: Update ArangoDB Colab URL (#8547 ) 1-commit PR to update the Google Colab URL of the ArangoDB Graph QA Chain notebook	1 year ago
Harrison Chase	5e3b968078	router runnable (#8496 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	1 year ago
Anubhav Bindlish	913a156cff	Minor improvements to rockset vectorstore (#8416 ) This PR makes minor improvements to our python notebook, and adds support for `Rockset` workspaces in our vectorstore client. @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	893f3014af	add xml agent notebook	1 year ago
Harrison Chase	6556a8fcfd	add initial anthropic agent (#8468 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	1 year ago
Muhammed Al-Dulaimi	9975ba4124	Fix ChromaDB integration -> docker container instructions (#8447 ) ## Description This PR handles modifying the Chroma DB integration's documentation. It modifies the Docker container example to fix the instructions mentioned in the documentation. In the current documentation, the below `client.reset()` line causes a runtime error: ```py ... client = chromadb.HttpClient(settings=Settings(allow_reset=True)) client.reset() # resets the database collection = client.create_collection("my_collection") ... ``` `Exception: {"error":"ValueError('Resetting is not allowed by this configuration')"}` This is due to the Chroma DB server needing to have the `allow_reset` flag set to `true` there as well. This is fixed by adding the `ALLOW_RESET=TRUE` environment variable for the docker container to the `docker-compose` file before spinning it up ## Issue This fixes the runtime error that occurs when running the docker container example code ## Tag Maintainer @rlancemartin, @eyurtsev	1 year ago
Nicolas Raoul	7f9c6c3baa	Fixed typo: papaer -> paper (#8500 )	1 year ago
Piyush Jain	b2f8a5bae9	Fixed exports for NeptuneOpenCypherQAChain (#8439 ) ## Description The imports for `NeptuneOpenCypherQAChain` are failing. This PR adds the chain class to the `__init__.py` file to fix this issue. ## Maintainers @dev2049 @krlawrence	1 year ago
Bharat Raghunathan	04ebdbe98f	doc(prompts): Add redirects in Prompt subcategories pages (#8478 ) - Description: Fixes broken links in some Prompts subcategories in documentation (Example Selectors, Prompt Templates) - Issue: #8477 (Fixes #8477) - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: [@BharatR123](https://twitter.com/BharatR123)	1 year ago
Ludwig Hubert	08f5e6b801	Fix documentation for from_documents signature (#8482 ) Docs for from_documents() were outdated as seen in https://github.com/langchain-ai/langchain/issues/8457 . fixes #8457	1 year ago
Muneeb Ahmad	4923cf029a	Added Proper Documentation for `faiss-gpu` Installation (#8492 ) ### Description In the LangChain documentation and comments, I noticed that `pip install faiss` was mentioned instead of `pip install faiss-gpu`; running `pip install faiss` results in an error. I've gone ahead and updated the documentation and `faiss.ipynb`. This change will ensure ease of use for the end user trying to install `faiss-gpu`. ### Issue: Documentation / Comments Related. ### Dependencies: No dependencies were changed; only the files with the wrong reference were updated. ### Tag maintainer: @rlancemartin, @eyurtsev (Thank You for your contributions 😄 )	1 year ago
Harrison Chase	8f14ddefdf	add anthropic functions wrapper (#8475 ) a cheeky wrapper around claude that adds in function calling support (kind of, hence it going in experimental)	1 year ago
Harrison Chase	490ad93b3c	fix links generation (#8471 )	1 year ago
Harrison Chase	ae4638aa35	improve notebooks (#8461 )	1 year ago
Harrison Chase	412fa4e1db	add guide notebook (#8258 ) --------- Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	1 year ago
William FH	b7c0eb9ecb	Wfh/ref links (#8454 )	1 year ago
William FH	7d79178827	Wfh/update guide imports (#8452 )	1 year ago
Harrison Chase	17953ab61f	add notebook for sql query (#8442 )	1 year ago
Zack Proser	3892cefac6	Minor fixes to enhance notebook usability: (#8389 ) - Install langchain - Set Pinecone API key and environment as env vars - Create Pinecone index if it doesn't already exist --- - Description: Fix a couple minor issues I came across when running this notebook, - Dependencies: none, - Tag maintainer: @rlancemartin @eyurtsev, - Twitter handle: @zackproser (certainly not necessary!)	1 year ago
Amélie	8ee56b9a5b	Feature: Add support for meilisearch vectorstore (#7649 ) Description: Add support for Meilisearch vector store. Resolve #7603 - No external dependencies added - A notebook has been added @rlancemartin https://twitter.com/meilisearch Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bharat Raghunathan	62b8b459c6	doc(prompts): Add redirect to fix broken link on Prompts Page (#8408 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	2311d57df4	mv dropbox (#8438 )	1 year ago
Bagatur	2db2987b1b	add experimental ref (#8435 )	1 year ago
HeTaoPKU	d5884017a9	Add Minimax llm model to langchain (#7645 ) - Description: Minimax is a great AI startup from China, recently they released their latest model and chat API, and the API is widely-spread in China. As a result, I'd like to add the Minimax llm model to Langchain. - Tag maintainer: @hwchase17, @baskaryan --------- Co-authored-by: the <tao.he@hulu.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	1b0bfa54cf	cr	1 year ago
Jiayi Ni	1efb9bae5f	FEAT: Integrate Xinference LLMs and Embeddings (#8171 ) - [Xorbits Inference(Xinference)](https://github.com/xorbitsai/inference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models. Xinference supports a variety of GGML-compatible models including chatglm, whisper, and vicuna, and utilizes heterogeneous hardware and a distributed architecture for seamless cross-device and cross-server model deployment. - This PR integrates Xinference models and Xinference embeddings into LangChain. - Dependencies: To install the dependencies for this integration, run `pip install "xinference[all]"` - Example Usage: To start a local instance of Xinference, run `xinference`. To deploy Xinference in a distributed cluster, first start an Xinference supervisor using `xinference-supervisor`: `xinference-supervisor -H "${supervisor_host}"` Then, start the Xinference workers using `xinference-worker` on each server you want to run them on. `xinference-worker -e "http://${supervisor_host}:9997"` To use Xinference with LangChain, you also need to launch a model. You can use the command line interface (CLI) to do so. For example: `xinference launch -n vicuna-v1.3 -f ggmlv3 -q q4_0`. This launches a model named vicuna-v1.3 with `model_format="ggmlv3"` and `quantization="q4_0"`. A model UID is returned for you to use. Now you can use Xinference with LangChain: ```python from langchain.llms import Xinference llm = Xinference( server_url="http://0.0.0.0:9997", # suppose the supervisor_host is "0.0.0.0" model_uid = {model_uid} # model UID returned from launching a model ) llm( prompt="Q: where can we visit in the capital of France? A:", generate_config={"max_tokens": 1024}, ) ``` You can also use the RESTful client to launch a model: ```python from xinference.client import RESTfulClient client = RESTfulClient("http://0.0.0.0:9997") model_uid = client.launch_model(model_name="vicuna-v1.3", model_size_in_billions=7, quantization="q4_0") ``` The following code block demonstrates how to use Xinference embeddings with LangChain: ```python from langchain.embeddings import XinferenceEmbeddings xinference = XinferenceEmbeddings( server_url="http://0.0.0.0:9997", model_uid = model_uid ) ``` ```python query_result = xinference.embed_query("This is a test query") ``` ```python doc_result = xinference.embed_documents(["text A", "text B"]) ``` Xinference is still under rapid development. Feel free to [join our Slack community](https://xorbitsio.slack.com/join/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA) to get the latest updates! - Request for review: @hwchase17, @baskaryan - Twitter handle: https://twitter.com/Xorbitsio --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Gordon Clark	e66759cc9d	Github add "Create PR" tool + Docs update (#8235 ) Added a new tool to the Github toolkit called Create Pull Request. Now we can make our own langchain contributor in langchain 😁 In order to have somewhere to pull from, I also added a new env var, "GITHUB_BASE_BRANCH." This will allow the existing env var, "GITHUB_BRANCH," to be a working branch for the bot (so that it doesn't have to always commit on the main/master). For example, if you want the bot to work in a branch called `bot_dev` and your repo base is `main`, you would set up the vars like: ``` GITHUB_BASE_BRANCH = "main" GITHUB_BRANCH = "bot_dev" ``` Maintainer responsibilities: - Agents / Tools / Toolkits: @hinthornw	1 year ago
William FH	ecd4aae818	Few Shot Chat Prompt (#8038 ) Proposal for a few shot chat message example selector --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	1 year ago
Karan V	a003a0baf6	fix(petals) allows to run models that aren't Bloom (Support for LLama and newer models) (#8356 ) In this PR: - Removed restricted model loading logic for Petals-Bloom - Removed petals imports (DistributedBloomForCausalLM, BloomTokenizerFast) - Instead imported more generalized versions of loader (AutoDistributedModelForCausalLM, AutoTokenizer) - Updated the Petals example notebook to allow for a successful installation of Petals in Apple Silicon Macs - Tag maintainer: @hwchase17, @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	25b8cc7e3d	Harrison/update memory docs (#8384 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Taozhi Wang	594f195e54	Add embeddings for AwaEmbedding (#8353 ) - Description: Adds AwaEmbeddings class for embeddings, which provides users with a convenient way to do fine-tuning, as well as the potential need for multimodality - Tag maintainer: @baskaryan Create `Awa.ipynb`: an example notebook for AwaEmbeddings class Modify `embeddings/__init__.py`: Import the class Create `embeddings/awa.py`: The embedding class Create `embeddings/test_awa.py`: The test file. --------- Co-authored-by: taozhiwang <taozhiwa@gmail.com>	1 year ago
bheroder	dc3ca44e05	Add an example for azure ml managed feature store (#8324 ) We are adding an example of how one can connect to azure ml managed feature store and use such a prompt template in a llm chain. @baskaryan	1 year ago
evelynmitchell	539574670c	Update tot.ipynb (#8387 ) Spelling error fix	1 year ago
Sachin Varghese	01217b2247	Update sql database agent example (#8354 ) This PR fixes a minor documentation issue on the SQL database toolkit example notebook.	1 year ago
Bagatur	55beab326c	cleanup warnings (#8379 )	1 year ago
William FH	41524304bf	Update local script for docs build (#8377 )	1 year ago
Bagatur	68763bd25f	mv popular and additional chains to use cases (#8242 )	1 year ago
William FH	94a693e2ee	Link to use cases from tutorials (#8371 )	1 year ago
Rubén Barragán	ef6332ead6	Support loading files from Dropbox (#8271 ) ## Description This commit introduces the `DropboxLoader` class, a new document loader that allows loading files from Dropbox into the application. The loader relies on a Dropbox app, which requires creating an app on Dropbox, obtaining the necessary scope permissions, and generating an access token. Additionally, the dropbox Python package is required. The `DropboxLoader` class is designed to be used as a document loader for processing various file types, including text files, PDFs, and Dropbox Paper files. ## Dependencies `pip install dropbox` and `pip install unstructured` for PDF reading. ## Tag maintainer @rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some feedback here 🙏 . ## Social Networks https://github.com/rubenbarragan https://www.linkedin.com/in/rgbarragan/ https://twitter.com/RubenBarraganP --------- Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>	1 year ago
Ikko Eltociear Ashimine	934ea80780	Fix typo in Etherscan.ipynb (#8340 ) specifc -> specific	1 year ago
Vadim Gubergrits	e7e5cb9d08	Tree of Thought introducing a new ToTChain. (#5167 ) # [WIP] Tree of Thought introducing a new ToTChain. This PR adds a new chain called ToTChain that implements the ["Large Language Model Guided Tree-of-Thought"](https://arxiv.org/pdf/2305.08291.pdf) paper. There's a notebook example `docs/modules/chains/examples/tot.ipynb` that shows how to use it. Implements #4975 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @vowelparrot --------- Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
William FH	412e29d436	Fix notebook that 'cannot convert' via nbdoc_build (#8333 )	1 year ago
William FH	9eb7e6e27f	Delete Old Evals Examples (#8252 ) Still retain: - Comparison Examples - Data + QA walkthrough - QA (but really minimize it)	1 year ago
Fabrizio Ruocco	ddc353a768	Azure Cognitive Search: Custom index and scoring profile support (#6843 ) Description: Adding support for custom index and scoring profile support in Azure Cognitive Search @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Leonid Ganeline	ed24de8467	removed namespace title (#8208 ) This change compacts the left-side Navbar (ToC) of the [API Reference](https://api.python.langchain.com/en/latest/api_reference.html). Currently almost every namespace item is split into two lines, for example `langchain.chat_models: Chat Models`. We remove the `Chat Models` title and leave only `langchain.chat_models`. This compacts the navbar and improves the main page's usability. On my screen, it reduces the number of lines in the ToC from 28 to 18, which is huge. Removing the namespace "title" (like `Chat Models`) does not remove any information because the title is composed directly from the namespace. API Reference users are developers. Usability for them is very important. We see less text => we find faster.	1 year ago
Kacper Łukawski	c5988c1d4b	Implement async support for Cohere (#8237 ) This PR introduces async API support for Cohere, both LLM and embeddings. It requires updating `cohere` package to `^4`. Tagging @hwchase17, @baskaryan, @agola11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	ceab0a7c1f	update api ref style (#8318 )	1 year ago
William FH	01a9b06400	Add api cross ref linking (#8275 ) Example of how it would show up in our python docs: ![image](https://github.com/langchain-ai/langchain/assets/13333726/0f0a88cc-ba4a-4778-bc47-118c66807f15) Examples added to the reference docs: https://api.python.langchain.com/en/wfh-api_crosslink/vectorstores/langchain.vectorstores.chroma.Chroma.html#langchain.vectorstores.chroma.Chroma ![image](https://github.com/langchain-ai/langchain/assets/13333726/dcd150de-cb56-4d42-b49a-a76a002a5a52)	1 year ago
Riche Akparuorji	f3d2fdd54c	Fix for code snippet in documentation (#8290 ) - Description: I fixed an issue in the code snippet related to the variable name and the evaluation of its length. The original code used the variable "docs," but the correct variable name is "docs_svm" after using the SVMRetriever. - maintainer: @baskaryan - Twitter handle: @iamreechi_ Co-authored-by: iamreechi <richieakparuorji>	1 year ago
Bagatur	f27176930a	fix geopandas link (#8305 )	1 year ago
Timon Palm	70604e590f	DuckDuckGoSearch News Tool (#8292 ) Description: I wanted to use the DuckDuckGoSearch tool in an agent to let it get the latest news for a topic. DuckDuckGoSearch already has a function for retrieving news articles, but there wasn't a tool to use it. I simply adapted the SearchResult class with an extra argument "backend". You can set it to "news" to only get news articles. Furthermore, I added an example to the DuckDuckGo Notebook on how to further customize the results by using the DuckDuckGoSearchAPIWrapper. Dependencies: no new dependencies --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Aarav Borthakur	8ce661d5a1	Docs: Fix Rockset links (#8214 ) Fix broken Rockset links. Right now links at https://python.langchain.com/docs/integrations/providers/rockset are broken.	1 year ago
Jon Bennion	ad38eb2d50	correction to reference to code (#8301 ) - Description: fixes typo referencing code --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Naveen Tatikonda	9cbefcc56c	[ OpenSearch ] : Add AOSS Support to OpenSearch (#8256 ) ### Description This PR includes the following changes: - Adds AOSS (Amazon OpenSearch Service Serverless) support to OpenSearch. Please refer to the documentation on how to use it. - While creating an index, AOSS only supports Approximate Search with `nmslib` and `faiss` engines. During Search, only Approximate Search and Script Scoring (on doc values) are supported. - This PR also adds support to `efficient_filter` which can be used with `faiss` and `lucene` engines. - The `lucene_filter` is deprecated. Instead please use the `efficient_filter` for the lucene engine. Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	1 year ago
Lance Martin	7a00f17033	Web research retriever (#8102 ) Given a user question, this will - * Use LLM to generate a set of queries. * Query for each. * The URLs from search results are stored in self.urls. * A check is performed for any new URLs that haven't been processed yet (not in self.url_database). * Only these new URLs are loaded, transformed, and added to the vectorstore. * The vectorstore is queried for relevant documents based on the questions generated by the LLM. * Only unique documents are returned as the final result. This code will avoid reprocessing of URLs across multiple runs of similar queries, which should improve the performance of the retriever. It also keeps track of all URLs that have been processed, which could be useful for debugging or understanding the retriever's behavior. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
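The URL bookkeeping described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the retriever's actual code: `select_new_urls` and `unique_docs` are hypothetical helper names, documents are plain strings, and only `url_database` mirrors a name from the commit message.

```python
from typing import Iterable, List, Set


def select_new_urls(found: Iterable[str], url_database: Set[str]) -> List[str]:
    """Return URLs not seen before (in order) and record them as processed.

    `url_database` stands in for the `self.url_database` mentioned in the
    commit message; the helper name itself is hypothetical.
    """
    new_urls: List[str] = []
    for url in found:
        if url not in url_database:
            url_database.add(url)
            new_urls.append(url)
    return new_urls


def unique_docs(docs: Iterable[str]) -> List[str]:
    """Drop duplicate documents while preserving first-seen order."""
    return list(dict.fromkeys(docs))


url_database: Set[str] = set()
# First run: both URLs are new, so both would be loaded and indexed.
first_run = select_new_urls(["https://a.example", "https://b.example"], url_database)
# Second run: only the URL not processed earlier is returned.
second_run = select_new_urls(["https://b.example", "https://c.example"], url_database)
```

Keeping the set of processed URLs outside the function is what lets repeated runs of similar queries skip work already done.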
Byron Saltysiak	68a906bb31	added lxml to the pip install example since it is required (#8260 ) - Description: The trello dataloader example didn't work without an additional dependency installed - lxml - Issue: na	1 year ago
Emory Petermann	7734a2b5ab	update golden-query notebook and fix typo in golden docs (#8253 ) updating the documentation to be consistent for Golden query tool and have a better introduction to the tool	1 year ago
William FH	dd87275dde	Add LLMChain example of memory with chat models (#8250 )	1 year ago
William FH	1f40d3e094	Update Broken Links (#8247 )	1 year ago
William FH	30c2d3cd06	Update references (#8243 )	1 year ago
William FH	0a16b3d84b	Update Integrations links (#8206 )	1 year ago
Dayuan Jiang	125ae6d9de	add Hybrid retriever that does not require any external service (#8108 ) - Until now, hybrid search was limited to modules requiring external services, such as Weaviate/Pinecone Hybrid Search. However, I have developed a hybrid retriever that can merge a list of retrievers using the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm. This new approach, similar to Weaviate hybrid search, does not require the initialization of any external service. - Dependencies: No - Twitter handle: dayuanjian21687 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
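For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the scoring idea, assuming each retriever returns an ordered list of document identifiers. The function name and the sample data are illustrative only, and the retriever added in this commit may weight its inputs differently; `k = 60` is the smoothing constant used in the linked paper.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Fuse several ranked lists into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with ranks starting at 1; documents are returned by descending score.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative result lists from two hypothetical retrievers.
keyword_hits = ["a", "b", "c"]
vector_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Because only ranks are used, the retrievers' raw scores do not need to be on comparable scales.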
Dario Ruben	04e45f9cde	Fixed grammar in LLM models documentation (#8210 ) Description: I fixed a typo in the documentation related to LLMs (https://python.langchain.com/docs/modules/model_io/models/llms/)	1 year ago
Taqi Jaffri	8f158b72fc	Added stop sequence support to replicate (#8107 ) Stop sequences are useful if you are doing long-running completions and need to early-out rather than running for the full max_length... not only does this save inference cost on Replicate, it is also much faster if you are going to truncate the output later anyway. Other LLMs support stop sequences natively (e.g. OpenAI) but I didn't see this for Replicate so adding this via their prediction cancel method. Housekeeping: I ran `make format` and `make lint`, no issues reported in the files I touched. I did update the replicate integration test and ran `poetry run pytest tests/integration_tests/llms/test_replicate.py` successfully. Finally, I am @tjaffri https://twitter.com/tjaffri for feature announcement tweets... or if you could please tag @docugami https://twitter.com/docugami we would really appreciate that :-) Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	1 year ago
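The early-out behaviour described above can be illustrated with a small sketch. It assumes the model output arrives as an iterable of text chunks. The function name is hypothetical and no Replicate API is called here; in a real client, the early return is where the prediction would be cancelled.

```python
from typing import Iterable, List


def generate_until_stop(chunks: Iterable[str], stop: List[str]) -> str:
    """Accumulate streamed chunks and return early at the first stop sequence.

    The returned text is truncated just before the earliest stop sequence
    found, so remaining chunks are never consumed.
    """
    text = ""
    for chunk in chunks:
        text += chunk
        # Position of the earliest stop sequence present so far, or -1.
        cut = min((text.find(s) for s in stop if s in text), default=-1)
        if cut != -1:
            return text[:cut]  # a real client would cancel the prediction here
    return text


# A plain iterator stands in for the model's output stream.
stream = iter(["Hello ", "wor", "ld\nObs", "ervation: ignored"])
result = generate_until_stop(stream, stop=["\nObs"])
```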
glaze	f7ad14acfa	Add etherscan document loader (#7943 ) @rlancemartin The modification includes: * etherscanLoader * test_etherscan * document ipynb I have run the test, lint, format, and spell check. I do encounter a linting error on ipynb, I am not sure how to address that.
```
docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:55: error: Name "null" is not defined [name-defined]
docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:76: error: Name "null" is not defined [name-defined]
Found 2 errors in 1 file (checked 1 source file)
```
- Description: The Etherscan loader uses etherscan api to load transaction histories under specific accounts on Ethereum Mainnet. - No dependency is introduced by this PR. - Twitter handle: glazecl --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	483f6c2fe3	mv eval docs (#8209 )	1 year ago
Liu Ming	24f889f2bc	Change with_history option to False for ChatGLM by default (#8076 ) The ChatGLM LLM integration by default accumulates conversation history (with_history=True) and sends it to the ChatGLM backend API, which is not expected in most cases. This PR sets with_history=False by default; users should explicitly set llm.with_history=True to turn this feature on. Related PR: #8048 #7774 --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Anthony Mahanna	76102971c0	ArangoDB/AQL support for Graph QA Chain (#7880 ) Description: Serves as an introduction to LangChain's support for [ArangoDB](https://github.com/arangodb/arangodb), similar to https://github.com/hwchase17/langchain/pull/7165 and https://github.com/hwchase17/langchain/pull/4881 Issue: No issue has been created for this feature Dependencies: `python-arango` has been added as an optional dependency via the `CONTRIBUTING.md` guidelines Twitter handle: [at]arangodb - Integration test has been added - Notebook has been added: [graph_arangodb_qa.ipynb](https://github.com/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb) [![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb)
```
docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD= arangodb/arangodb
```
```
pip install git+https://github.com/amahanna/langchain.git
```
```python
from arango import ArangoClient
from langchain.chat_models import ChatOpenAI
from langchain.graphs import ArangoGraph
from langchain.chains import ArangoGraphQAChain

db = ArangoClient(hosts="localhost:8529").db(name="_system", username="root", password="", verify=True)
graph = ArangoGraph(db)
chain = ArangoGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph)
chain.run("Is Ned Stark alive?")
```
--------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Adilkhan Sarsen	3e7d2a1b64	SelfQuery support for deeplake (#7888 ) Added support SelfQuery for Deeplake	1 year ago
Juan José Torres	1cc7d4c9eb	Update SageMaker Endpoint Embeddings docs to be up to date with current requirements (#8103 ) - Description: Simple change of the Class that ContentHandler inherits from. To create an object of type SagemakerEndpointEmbeddings, the property content_handler must be of type EmbeddingsContentHandler not ContentHandlerBase anymore, - Twitter handle: @Juanjo_Torres11 Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	1a7d8667c8	Bagatur/gateway chat (#8198 ) Signed-off-by: dbczumar <corey.zumar@databricks.com> Co-authored-by: dbczumar <corey.zumar@databricks.com>	1 year ago
Ettore Di Giacinto	ae28568e2a	Add embeddings for LocalAI (#8134 ) Description: This PR adds embeddings for LocalAI ( https://github.com/go-skynet/LocalAI ), a self-hosted OpenAI drop-in replacement. As LocalAI can re-use OpenAI clients it is mostly following the lines of the OpenAI embeddings, however when embedding documents, it just uses string instead of sending tokens as sending tokens is best-effort depending on the model being used in LocalAI. Sending tokens is also tricky as token id's can mismatch with the model - so it's safer to just send strings in this case. Partly related to: https://github.com/hwchase17/langchain/issues/5256 Dependencies: No new dependencies Twitter: @mudler_it --------- Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Mike Nitsenko	d983046f90	Extend Cube Semantic Loader functionality (#8186 ) PR Description: This pull request introduces several enhancements and new features to the `CubeSemanticLoader`. The changes include the following: 1. Added imports for the `json` and `time` modules. 2. Added new constructor parameters: `load_dimension_values`, `dimension_values_limit`, `dimension_values_max_retries`, and `dimension_values_retry_delay`. 3. Updated the class documentation with descriptions for the new constructor parameters. 4. Added a new private method `_get_dimension_values()` to retrieve dimension values from Cube's REST API. 5. Modified the `load()` method to load dimension values for string dimensions if `load_dimension_values` is set to `True`. 6. Updated the API endpoint in the `load()` method from the base URL to the metadata endpoint. 7. Refactored the code to retrieve metadata from the response JSON. 8. Added the `column_member_type` field to the metadata dictionary to indicate if a column is a measure or a dimension. 9. Added the `column_values` field to the metadata dictionary to store the dimension values retrieved from Cube's API. 10. Modified the `page_content` construction to include the column title and description instead of the table name, column name, data type, title, and description. These changes improve the functionality and flexibility of the `CubeSemanticLoader` class by allowing the loading of dimension values and providing more detailed metadata for each document. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	4928f7a9f5	undo bump (#8192 )	1 year ago
Bagatur	14aa27b5f4	redirect (#8189 )	1 year ago
Bagatur	e7d64f8b15	Bagatur/vercel test 3 (#8188 )	1 year ago
Bagatur	026269bfa9	redirects (#8183 )	1 year ago
Harrison Chase	3caccf304c	Harrison/hugginggpt (#8162 ) Co-authored-by: Yongliang Shen <withsyl@163.com>	1 year ago
Bagatur	c8c8635dc9	mv module integrations docs (#8101 )	1 year ago
Adarsh Shirawalmath	8ea840432f	Generalize Comment on Streaming Support for LLM Implementations and add examples (#8115 ) The example provided demonstrates the usage of the HuggingFaceTextGenInference implementation with streaming enabled.	1 year ago
Harrison Chase	33fd6184ba	beef up getting started (#8139 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Lawrence Lim	fa8906a9b7	fix typo: Entity Summary Memory documentation (#8145 ) Fixed a small typo I came across in the Memory documentation.	1 year ago
Fielding Johnston	fb62f2be70	nit: small typo in evaluation module docs (#8155 ) Hopefully, this doesn't come across as nitpicky! That isn't the intention. I only noticed it, because I enjoy reading the documentation and when I hit a mental road bump it is usually due to a missing word or something =) @baskaryan	1 year ago
SlapDrone	961a0e200f	Implement AgentExecutorIterator (#6929 ) - Description: Implements a `.iter()` method for the `AgentExecutor` class. This allows hooking into and intercepting intermediate agent steps. - Issue: #6925 - Dependencies: None - Tag maintainer: @vowelparrot @agola11 - Twitter handle: @SlapDron3 @lacicocodes --------- Co-authored-by: Lacico <Lacicocodes@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	e46126eac6	add llamaapi (#8140 )	1 year ago
Harrison Chase	f0eb5db670	Harrison/agent intro (#8138 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	cbf2fc8af8	prompt ergonomics (#7799 )	1 year ago
Samuel Berthe	d81d6e874f	doc(sqldatabasechain): use views when jsonb column description is not available (#8133 ) I think the PR diff is self explaining ;) @baskaryan	1 year ago
Karthik Raja A	8b08687fc4	MultiOn client toolkit (#8110 ) Addition of MultiOn Client Agent Toolkit Dependencies: multion pip package This PR consists of the following: - MultiOn utility,tools and integration with agent - sample jupyter notebook. Request @hwchase17 , @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	aa0e69bc98	Harrison/official pre release (#8106 )	1 year ago
Bagatur	58f65fcf12	use top nav docs (#8090 )	1 year ago
Bagatur	08c658d3f8	fix api ref (#8083 )	1 year ago
Harrison Chase	f35db9f43e	(WIP) set up experimental (#7959 )	1 year ago
Lance Martin	5a084e1b20	Async HTML loader and HTML2Text transformer (#8036 ) New HTML loader that asynchronously loads a list of URLs. New transformer using [HTML2Text](https://github.com/Alir3z4/html2text/) to convert HTML to clean, easy-to-read plain ASCII text (valid Markdown).	1 year ago
Wey Gu	cf60cff1ef	feat: Add with_history option for chatglm (#8048 ) In certain 0-shot scenarios, the existing stateful language model can unintentionally send/accumulate the .history. This commit adds the "with_history" option to chatglm, allowing users to control the behavior of .history and prevent unintended accumulation. Possible reviewers @hwchase17 @baskaryan @mlot Refer to discussion over this thread: https://twitter.com/wey_gu/status/1681996149543276545?s=20	1 year ago
Harrison Chase	1f3b987860	Harrison/GitHub toolkit (#8047 ) Co-authored-by: Trevor Dobbertin <trevordobbertin@gmail.com>	1 year ago
Harrison Chase	f99f497b2c	Harrison/predibase (#8046 ) Co-authored-by: Abhay Malik <32989166+Abhay-765@users.noreply.github.com>	1 year ago
Jacob Lee	56c6ab1715	Fix bad docs sidebar header (#7966 ) Quick fix for: <img width="283" alt="Screenshot 2023-07-19 at 2 49 44 PM" src="https://github.com/hwchase17/langchain/assets/6952323/91e4868c-b75e-413d-9f8f-d34762abf164"> CC @baskaryan	1 year ago
Kacper Łukawski	ed6a5532ac	Implement async support in Qdrant local mode (#8001 ) I've extended the support of async API to local Qdrant mode. It is faked but allows prototyping without spinning a container. The tests are improved to test the in-memory case as well. @baskaryan @rlancemartin @eyurtsev @agola11	1 year ago
Taqi Jaffri	973593c5c7	Added streaming support to Replicate (#8045 ) Streaming support is useful if you are doing long-running completions or need interactivity e.g. for chat... adding it to replicate, using a similar pattern to other LLMs that support streaming. Housekeeping: I ran `make format` and `make lint`, no issues reported in the files I touched. I did update the replicate integration test but ran into some issues, specifically: 1. The original test was failing for me due to the model argument not being specified... perhaps this test is not regularly run? I fixed it by adding a call to the lightweight hello world model which should not be burdensome for replicate infra. 2. I couldn't get the `make integration_tests` command to pass... a lot of failures in other integration tests due to missing dependencies... however I did make sure the particular test file I updated does pass, by running `poetry run pytest tests/integration_tests/llms/test_replicate.py` Finally, I am @tjaffri https://twitter.com/tjaffri for feature announcement tweets... or if you could please tag @docugami https://twitter.com/docugami we would really appreciate that :-) Tagging model maintainers @hwchase17 @baskaryan Thanks for all the awesome work you folks are doing. --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	1 year ago
Piyush Jain	31b7ddc12c	Neptune graph and openCypher QA Chain (#8035 ) ## Description This PR adds a graph class and an openCypher QA chain to work with the Amazon Neptune database. ## Dependencies `requests` which is included in the LangChain dependencies. ## Maintainers for Review @krlawrence @baskaryan ### Twitter handle pjain7	1 year ago
Emory Petermann	7239d57a53	Update Golden integration documentation (#8030 ) fixes some typos and cleans up onboarding for golden, thank you! @hinthornw	1 year ago
Jonathon Belotti	021bb9be84	Update Modal.com integration docs (#8014 ) Hey, I'm a Modal Labs engineer and I'm making this docs update after getting a user question in [our beta Slack space](https://join.slack.com/t/modalbetatesters/shared_invite/zt-1xl9gbob8-1QDgUY7_PRPg6dQ49hqEeQ) about the Langchain integration docs. 🔗 [Modal beta-testers link to docs discussion thread](https://modalbetatesters.slack.com/archives/C031Z7DBQFL/p1689777700594819?thread_ts=1689775859.855849&cid=C031Z7DBQFL)	1 year ago
Jeffrey Wang	62d0475c29	Add Metaphor new field and reformat docs (#8022 ) This PR reformats our python notebook example and also adds a new field we have. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	1 year ago
vrushankportkey	5f10d2ea1d	Add Portkey LLMOps integration (#7877 ) Integrating Portkey, which adds production features like caching, tracing, tagging, retries, etc. to langchain apps. - Dependencies: None - Twitter handle: https://twitter.com/portkeyai - test_portkey.py added for tests - example notebook added in new utilities folder in modules Also fixed a bug with OpenAIEmbeddings where headers weren't passing. cc @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Dwai Banerjee	d8c40253c3	Adding endpoint_url to embeddings/bedrock.py and updated docs (#7927 ) BedrockEmbeddings does not have endpoint_url so that switching to custom endpoint is not possible. I have access to Bedrock custom endpoint and cannot use BedrockEmbeddings --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Constantin Musca	d593833e4d	Add Golden Query Tool (#7930 ) Description: Golden Query is a wrapper on top of the [Golden Query API](https://docs.golden.com/reference/query-api) which enables programmatic access to query results on entities across Golden's Knowledge Base. For more information about Golden API, please see the [Golden API Getting Started](https://docs.golden.com/reference/getting-started) page. Issue: None Dependencies: requests(already present in project) Tag maintainer: @hinthornw Signed-off-by: Constantin Musca <constantin.musca@gmail.com>	1 year ago
Santiago Delgado	c416dbe8e0	Amadeus Flight and Travel Search Tool (#7890 ) ## Background With the addition of email and calendar tools, LangChain is continuing to complete its functionality to automate business processes. ## Challenge One of the pieces of business functionality that LangChain currently doesn't have is the ability to search for flights and travel in order to book business travel. ## Changes This PR implements an integration with the [Amadeus](https://developers.amadeus.com/) travel search API for LangChain, enabling seamless search for flights with a single authentication process. ## Who can review? @hinthornw ## Appendix @tsolakoua and @minjikarin, I utilized your [amadeus-python](https://github.com/amadeus4dev/amadeus-python) library extensively. Given the rising popularity of LangChain and similar AI frameworks, the convergence of libraries like amadeus-python and tools like this one is likely. So, I wanted to keep you updated on our progress. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Yun Kim	54e02e4392	Add datadog-langchain integration doc (#7955 ) ## Description Added a doc about the [Datadog APM integration for LangChain](https://github.com/DataDog/dd-trace-py/pull/6137). Note that the integration is on `ddtrace`'s end and so no code is introduced/required by this integration into the langchain library. For that reason I've refrained from adding an example notebook (although I've added setup instructions for enabling the integration in the doc) as no code is technically required to enable the integration. Tagging @baskaryan as reviewer on this PR, thank you very much! ## Dependencies Datadog APM users will need to have `ddtrace` installed, but the integration is on `ddtrace` end and so does not introduce any external dependencies to the LangChain project. Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Jithin James	493cbc9410	docs: fix a couple of small indentation errors in the strings (#7951 ) Fixed a few indentations I came across in the docs @baskaryan	1 year ago
Bhashithe Abeysinghe	73901ef132	Added windows specific instructions to Llama.cpp documentation. (#8000 ) - Description: Added windows specific instructions on llama.cpp in the notebook file - Issue: #6356 - Dependencies: None - Tag maintainer: @baskaryan	1 year ago
Jeff Huber	5694e7b8cf	Update chroma notebook (#7978 ) Fix up the Chroma notebook - remove `.persist()` -- this is no longer in Chroma as of `0.4.0` - update output to match `0.4.0` - other cleanup work	1 year ago
Harutaka Kawamura	4a5894db47	Fix incorrect field name in MLflow AI Gateway config example (#7983 )	1 year ago
Kacper Łukawski	19e8472521	Add async Qdrant to async_agent.ipynb (#7993 ) I added Qdrant to the async API docs. This is the only vector store that supports full async API. @baskaryan @rlancemartin, @eyurtsev	1 year ago
Bagatur	5d021c0962	nb fix (#7962 )	1 year ago
Julien Salinas	3adab5e5be	Integrate NLP Cloud embeddings endpoint (#7931 ) Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings). --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
Bagatur	854a2be0ca	Add debugging guide (#7956 )	1 year ago
Brendan Collins	9aef79c2e3	Add Geopandas.GeoDataFrame Document Loader (#3817 ) Work in Progress. WIP Not ready... Adds Document Loader support for [Geopandas.GeoDataFrames](https://geopandas.org/) Example: - [x] stub out `GeoDataFrameLoader` class - [x] stub out integration tests - [ ] Experiment with different geometry text representations - [ ] Verify CRS is successfully added in metadata - [ ] Test effectiveness of searches on geometries - [ ] Test with different geometry types (point, line, polygon with multi-variants). - [ ] Add documentation --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>	1 year ago
Lance Martin	dfc533aa74	Add llama-v2 to local document QA (#7952 )	1 year ago
Bagatur	f97535b33e	fix (#7947 )	1 year ago
Harutaka Kawamura	f6839a8682	Add integration for MLflow AI Gateway (#7113 ) - Adds integration for MLflow AI Gateway (this will be shipped in MLflow 2.5 this week). Manual testing:
```sh
# Move to mlflow repo
cd /path/to/mlflow
# install langchain
pip install git+https://github.com/harupy/langchain.git@gateway-integration
# launch gateway service
mlflow gateway start --config-path examples/gateway/openai/config.yaml
# Then, run the examples in this PR
```
	1 year ago
William FH	9d7e57f5c0	Docs Nit (#7918 )	1 year ago
Jarek Kazmierczak	f2ef3ff54a	Google Cloud Enterprise Search retriever (#7857 ) Added a retriever that encapsulates Google Cloud Enterprise Search. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Zizhong Zhang	bdf0c2267f	docs(custom_chain) fix typo (#7898 ) Fix typo in the document of custom_chain	1 year ago
Jeff Huber	2139d0197e	upgrade chroma to 0.4.0 (#7749 ) This should land Monday the 17th. Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to build, more durable, faster, smaller, and more extensible. This comes with a few changes: 1. A simplified and improved client setup. Instead of having to remember weird settings, users can just do `EphemeralClient`, `PersistentClient` or `HttpClient` (the underlying direct `Client` implementation is also still accessible) 2. We migrated data stores away from `duckdb` and `clickhouse`. This changes the api for the `PersistentClient` that used to reference `chroma_db_impl="duckdb+parquet"`. Now we simply set `is_persistent=true`. `is_persistent` is set for you to `true` if you use `PersistentClient`. 3. Because we migrated away from `duckdb` and `clickhouse` - this also means that users need to migrate their data into the new layout and schema. Chroma is committed to providing extensive notification and tooling around any schema and data migrations (for example - this PR!). After upgrading to `0.4.0` - if users try to access their data that was stored in the previous regime, the system will throw an `Exception` and instruct them how to use the migration assistant to migrate their data. The migration assistant is a pip installable CLI: `pip install chroma_migrate`. And is runnable by calling `chroma_migrate` -- TODO ADD here is a short video demonstrating how it works. Please reference the readme at [chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate) to see a full write-up of our philosophy on migrations as well as more details about this particular migration. Please direct any users facing issues upgrading to our Discord channel called [#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883). We have also created an [email listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers directly in the future about breaking changes. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Lance Martin	41c841ec85	Add Llama-v2 to Llama.cpp notebook (#7913 )	1 year ago
Bagatur	b9639f6067	fix docs (#7911 )	1 year ago
Jeff Huber	dc8b790214	Improve vector store onboarding exp (#6698 ) This PR - fixes the `similarity_search_by_vector` example, makes the code run and adds the example to mirror `similarity_search` - reverts back to chroma from faiss to remove sharp edges / create a happy path for new developers. (1) real metadata filtering, (2) expected functionality like `update`, `delete`, etc to serve beyond the most trivial use cases @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Lance Martin	862268175e	Add llama-v2 to docs (#7893 )	1 year ago
Filip Michalsky	69b9db2b5e	Notebook update: sales agent with tools (#7753 ) - Description: This is an update to a previously published notebook. Sales Agent now has access to tools, and this notebook shows how to use a Product Knowledge base to reduce hallucinations and act as a better sales person! - Issue: N/A - Dependencies: `chromadb openai tiktoken` - Tag maintainer: @baskaryan @hinthornw - Twitter handle: @FilipMichalsky	1 year ago
Orgil	75d3f1e5e6	remove unused import in voice assistant doc (#7757 ) Description: Removed unused import in voice_assistant doc. Tag maintainer: @baskaryan	1 year ago
maciej-skorupka	c6d1d6d7fc	feat: moving azure OpenAI API version to the latest 2023-05-15 (#7764 ) Moving to the latest non-preview Azure OpenAI API version=2023-05-15. The previous 2023-03-15-preview doesn't have support, SLA etc. For instance, OpenAI SDK has moved to this version https://github.com/openai/openai-python/releases/tag/v0.27.7 @baskaryan	1 year ago
satorioh	259a409998	docs(zilliz): connection_args add token description for serverless cl… (#7810 ) Description: Currently, Zilliz only supports connecting to dedicated clusters with a username and password pair. Serverless clusters are connected to using API keys ([see the official note for details](https://docs.zilliz.com/docs/manage-cluster-credentials)), so I added an API key (token) description to the Zilliz docs to make this more obvious and convenient for users of serverless clusters. No changes were made to code. --------- Co-authored-by: Robin.Wang <3Jg$94sbQ@q1> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
maciej-skorupka	5de7815310	docs: added comment from azure llm to azure chat about GPT-4 (#7884 ) Azure GPT-4 models can't be accessed via LLM model. It's easy to miss that and a lot of discussions about that are on the Internet. Therefore I added a comment in Azure LLM docs that mentions that and points to Azure Chat OpenAI docs. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bill Zhang	dda11d2a05	WeaviateHybridSearchRetriever option to enable scores. (#7861 ) Description: This PR adds the option to retrieve scores and explanations in the WeaviateHybridSearchRetriever. This feature improves the usability of the retriever by allowing users to understand the scoring logic behind the search results and further refine their search queries. Issue: This PR is a solution to the issue #7855 Dependencies: This PR does not introduce any new dependencies. Tag maintainer: @rlancemartin, @eyurtsev I have included a unit test for the added feature, ensuring that it retrieves scores and explanations correctly. I have also included an example notebook demonstrating its use.	1 year ago
Jonathan Pedoeem	c460c29a64	Adding Docs for `PromptLayerCallbackHandler` (#7860 ) Here I am adding documentation for the `PromptLayerCallbackHandler`. When we created the initial PR for the callback handler the docs were causing issues, so we merged without the docs.	1 year ago
German Martin	f1eaa9b626	Lost in the middle: We have been ordering documents the WRONG way. (for long context) (#7520 ) Motivation: it seems that when dealing with a long context and a "big" number of relevant documents, we must avoid using the out-of-the-box score ordering from vector stores. See: https://arxiv.org/pdf/2306.01150.pdf So, I added an additional parameter that allows you to reorder the retrieved documents so we can work around this performance degradation. The reordering respects the original search scores but places the least relevant documents in the middle of the context. Extract from the paper (one image speaks 1000 tokens): ![image](https://github.com/hwchase17/langchain/assets/1821407/fafe4843-6e18-4fa6-9416-50cc1d32e811) This seems to be common to all the different architectures, so I think we need a good generic way to implement this reordering and run some tests on our already running retrievers. It could be that my approach is not the best one from the architecture point of view; happy to have a discussion about that. For me this was the best place to introduce the change and start retesting the different implementations. @rlancemartin, @eyurtsev --------- Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
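The reordering described in #7520 can be illustrated with a minimal sketch. This is not the code from the PR: the function name `reorder_lost_in_middle` and the sample document labels are hypothetical, and the input is assumed to be sorted from most to least relevant.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def reorder_lost_in_middle(docs_by_relevance: Sequence[T]) -> List[T]:
    """Put the most relevant items at both ends, the least relevant in the middle.

    Assumes `docs_by_relevance` is sorted from most to least relevant.
    """
    front: List[T] = []
    back: List[T] = []
    # Alternate items between the front half and the back half.
    for i, doc in enumerate(docs_by_relevance):
        if i % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    # Reverse the back half so relevance rises again towards the end.
    return front + back[::-1]


ranked = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the most relevant
reordered = reorder_lost_in_middle(ranked)
print(reordered)
```

With five ranked documents, the two best end up first and last, and the weakest one sits in the middle position.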
Bagatur	6a32f93669	add ls link (#7847 )	1 year ago
William FH	c6f2d27789	Docs Nits (#7874 ) Add links to reference docs	1 year ago
William FH	3179ee3a56	Evals docs (#7460 ) Still don't have good "how-tos", and the guides / examples section could be further pruned and improved, but this PR adds a couple of examples for each of the common evaluator interfaces. - [x] Example docs for each implemented evaluator - [x] "how to make a custom evaluator" notebook for each low level API (comparison, string, agent) - [x] Move docs to modules area - [x] Link to reference docs for more information - [X] Still need to finish the evaluation index page - ~[ ] Don't have good data generation section~ - ~[ ] Don't have good how to section for other common scenarios / FAQs like regression testing, testing over similar inputs to measure sensitivity, etc.~	1 year ago
Nicolas	46330da2e7	docs: Mendable: Fixes pretty sources not working (#7863 ) This new version fixes the "Verified Sources" display that got broken. Instead of displaying the full URL, it shows the title of the page the source is from.	1 year ago
Jasper	5b4d53e8ef	Add text_content kwarg to BrowserlessLoader (#7856 ) Added keyword argument to toggle between getting the text content of a site versus its HTML when using the `BrowserlessLoader`	1 year ago
William FH	2aa3cf4e5f	update notebook (#7852 )	1 year ago
Matt Robinson	3c489be773	feat: optional post-processing for Unstructured loaders (#7850 ) ### Summary Adds a post-processing method for Unstructured loaders that allows users to optionally modify or clean extracted elements. ### Testing
```python
from langchain.document_loaders import UnstructuredFileLoader
from unstructured.cleaners.core import clean_extra_whitespace

# Apply the cleaner to every extracted element after partitioning.
loader = UnstructuredFileLoader(
    "./example_data/layout-parser-paper.pdf",
    mode="elements",
    post_processors=[clean_extra_whitespace],
)
docs = loader.load()
docs[:5]
```
### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	1 year ago
Bagatur	2a315dbee9	fix nb (#7843 )	1 year ago
Bagatur	98c48f303a	fix (#7838 )	1 year ago
Dayuan Jiang	ee40d37098	add bm25 module (#7779 ) - Description: Add a BM25 Retriever that does not need Elasticsearch - Dependencies: rank_bm25 (if it is not installed, it will be installed using pip, just like TFIDFRetriever does) - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: DayuanJian21687 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
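For readers unfamiliar with BM25, the scoring that such a retriever relies on can be sketched with the standard library alone. This is an illustrative implementation of the Okapi BM25 formula with a non-negative IDF variant, not the `rank_bm25` code or the retriever added in #7779. The function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the toy corpus are assumptions.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str], corpus: List[List[str]], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Score every tokenized document in `corpus` against the tokenized `query`."""
    if not corpus:
        return []
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in corpus for term in set(doc))
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant; rare terms weigh more.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = [["hello", "world"], ["foo", "bar"], ["hello", "foo", "foo"]]
scores = bm25_scores(["foo"], corpus)
best = max(range(len(corpus)), key=lambda i: scores[i])
print(scores, best)
```

Documents that do not contain any query term score zero, and repeating a term helps less and less as its count grows.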
Liu Ming	fa0a9e502a	Add LLM for ChatGLM(2)-6B API (#7774 ) Description: Add LLM for ChatGLM-6B & ChatGLM2-6B API Related Issue: Will the langchain support ChatGLM? #4766 Add support for self-hosted models like ChatGLM or transformer models #1780 Dependencies: No extra library install required. It wraps API calls to a ChatGLM(2)-6B server (started with api.py), so an API endpoint is required to run it. Tag maintainer: @mlot Any comments on this PR would be appreciated. --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
sseide	25e3d3f283	Support Redis Sentinel database connections (#5196 ) # Support Redis Sentinel database connections This PR adds support for connecting not only to standalone Redis servers but also to High Availability replication sets (https://redis.io/docs/management/sentinel/). Redis replica sets have one master that allows writing data and 2+ replicas with read-only access to the data. The additional Redis Sentinel instances monitor all servers and reconfigure the read-write master on the fly if it becomes unavailable. Therefore all connections must be made through the Sentinels, which are queried for the current master to obtain a read-write connection. This PR adds basic support for a Redis connection URL that specifies a Sentinel as the Redis connection. The Redis documentation and the Jupyter notebook with Redis examples are updated to mention how to connect to a Redis replica set with Sentinels. - Remark - I did not find test cases for Redis server connections to which new cases could be added. Therefore I tested the new utility class locally with different kinds of setups to make sure different connection URLs work as expected, but no test case is included as part of this PR.	1 year ago
Yifei Song	2e47412073	Add Xorbits agent (#7647 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits agent, which allows langchain to interact with Xorbits Pandas dataframe and Xorbits Numpy array. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @hinthornw - Twitter handle: https://twitter.com/Xorbitsio	1 year ago
Ankush Gola	ff3aada0b2	minor langsmith notebook fixes (#7814 )	1 year ago
William FH	c58d35765d	Add examples to docstrings (#7796 ) and: - remove dataset name from autogenerated project name - print out project name to view	1 year ago
Ankush Gola	c4ece52dac	update LangSmith notebook (#7767 )	1 year ago
Lance Martin	1d06eee3b5	Fix ntbk link in docs (#7755 ) Minor fix to running to [docs](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).	1 year ago
Mohammad Mohtashim	b8b8a138df	Simple Import fix in Tools Exception Docs (#7740 ) Issue: #7720 @hinthornw	1 year ago
Nicolas	43f900fd38	docs: Mendable Search Improvements (#7744 ) - New pin-to-side (button). This functionality allows you to search the docs while asking the AI for questions - Fixed the search bar in Firefox that won't detect a mouse click - Fixes and improvements overall in the model's performance	1 year ago
Lance Martin	b015647e31	Add GPT4All embeddings (#7743 ) Support for [GPT4All embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Kacper Łukawski	1ff5b67025	Implement async API for Qdrant vector store (#7704 ) Inspired by #5550, I implemented full async API support in Qdrant. The docs were extended to mention the existence of asynchronous operations in Langchain. I also used that chance to restructure the tests of Qdrant and provided a suite of tests for the async version. Async API requires the GRPC protocol to be enabled. Thus, it doesn't work on local mode yet, but we're considering including the support to be consistent.	1 year ago
Bearnardd	275b926cf7	add missing import (#7730 ) Just a nit documentation fix @baskaryan	1 year ago
Lorenzo	77e6bbe6f0	fix typo in deeplake.ipynb (#7718 ) - Fixing typos in deeplake documentation - @baskaryan	1 year ago
Samuel Berthe	2be3515a66	SQLDatabase: adding security disclaimer (#7710 ) It might be obvious to most engineers, but I think everybody should be cautious when using such a chain. ![image](https://github.com/hwchase17/langchain/assets/2951285/a1df6567-9d56-4c12-98ea-767401ae2ac8)	1 year ago
Bagatur	bae93682f6	update docs (#7714 )	1 year ago
Bagatur	b065da6933	Bagatur/docs nit (#7712 )	1 year ago
Bagatur	87d81b6acc	Redirect old text splitter page (#7708 ) related to #7665	1 year ago
Aarav Borthakur	210296a71f	Integrate Rockset as a document loader (#7681 ) Integrate [Rockset](https://rockset.com/docs/) as a document loader. Issue: None Dependencies: Nothing new (rockset's dependency was already added [here](https://github.com/hwchase17/langchain/pull/6216)) Tag maintainer: @rlancemartin I have added a test for the integration and an example notebook showing its use. I ran `make lint` and everything looks good. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Samuel Berthe	7d4843fe84	feat(chains): adding ElasticsearchDatabaseChain for interacting with analytics database (#7686 ) This pull request adds an ElasticsearchDatabaseChain chain for interacting with an analytics database, in the manner of the SQLDatabaseChain. Maintainer: @samber Twitter handle: samuelberthe --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Daniel	6d88b23ef7	Update pgembedding.ipynb (#7699 ) Update the extension name. It changed from pg_hnsw to pg_embedding. Thank you. I missed this in my previous commit.	1 year ago
Richy Wang	45bb414be2	Add LLM for Alibaba's Damo Academy's Tongyi Qwen API (#7477 ) - Add langchain.llms.Tongyi for text completion, with examples for the Tongyi Text API, - Add system tests. Note that async completion for the Text API is not yet supported and will be included in a future PR. Dependencies: dashscope. It must be installed manually because it is not needed by everyone. Happy for feedback on any aspect of this PR @hwchase17 @baskaryan.	1 year ago
Lance Martin	6325a3517c	Make recursive loader yield while crawling (#7568 ) Support actual lazy_load since it can take a while to crawl larger directories.	1 year ago
UmerHA	82f3e32d8d	[Small upgrade] Allow document limit in AzureCognitiveSearchRetriever (#7690 ) Multiple people have asked in #5081 for a way to limit the documents returned from an AzureCognitiveSearchRetriever. This PR adds the `top_n` parameter to allow that. Twitter handle: [@UmerHAdil](twitter.com/umerHAdil)	1 year ago
Daniel	854f3fe9b1	Update pgembedding.ipynb (#7682 ) Correct links to the pg_embedding repository and the Neon documentation.	1 year ago
William FH	051fac1e66	Improve walkthrough links for sphinx (#7672 ) Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	1 year ago
Bagatur	5db4dba526	add integrations hub link to docs (#7675 )	1 year ago
Jasper	fbc97a77ed	add browserless loader (#7562 ) # Browserless Added support for Browserless' `/content` endpoint as a document loader. ### About Browserless Browserless is a cloud service that provides access to headless Chrome browsers via a REST API. It allows developers to automate Chromium in a serverless fashion without having to configure and maintain their own Chrome infrastructure. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
frangin2003	c7b687e944	Simplify GraphQL Tool Initialization documentation by Removing 'llm' Argument (#7651 ) This PR is aimed at enhancing the clarity of the documentation in the langchain project. Description: In the graphql.ipynb file, I have removed the unnecessary 'llm' argument from the initialization process of the GraphQL tool (of type _EXTRA_OPTIONAL_TOOLS). The 'llm' argument is not required for this process. Its presence could potentially confuse users. This modification simplifies the understanding of tool initialization and minimizes potential confusion. Issue: Not applicable, as this is a documentation improvement. Dependencies: None. I kindly request a review from the following maintainer: @hinthornw, who is responsible for Agents / Tools / Toolkits. No new integration is being added in this PR, hence no need for a test or an example notebook. Please see the changes for more detail and let me know if any further modification is necessary.	1 year ago
William FH	a673a51efa	[Breaking] Update Evaluation Functionality (#7388 ) - Migrate from deprecated langchainplus_sdk to `langsmith` package - Update the `run_on_dataset()` API to use an eval config - Update a number of evaluators, as well as the loading logic - Update docstrings / reference docs - Update tracer to share single HTTP session	1 year ago
Matt Adams	98e1bbfbbd	Add missing dependencies to apify.ipynb (#6331 ) Fixes errors caused by missing dependencies when running the notebook.	1 year ago
Francisco Ingham	488d2d5da9	Entity extraction improvements (#6342 ) Added a fix to avoid irrelevant attributes being returned, plus an example of extracting unrelated entities and an example of using an 'extra_info' attribute to extract unstructured data for an entity. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Bagatur	7f8ff2a317	add tagger nb (#7637 )	1 year ago
Jacob Lee	cdb93ab5ca	Adds OpenAI functions powered document metadata tagger (#7521 ) Adds a new document transformer that automatically extracts metadata for a document based on an input schema. I also moved `document_transformers.py` to `document_transformers/__init__.py` to group it with this new transformer - it didn't seem to cause issues in the notebook, but let me know if I've done something wrong there. Also had a linter issue I couldn't figure out: ``` MacBook-Pro:langchain jacoblee$ make lint poetry run mypy . docs/dist/conf.py: error: Duplicate module named "conf" (also at "./docs/api_reference/conf.py") docs/dist/conf.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info docs/dist/conf.py: note: Common resolutions include: a) using `--exclude` to avoid checking one of them, b) adding `__init__.py` somewhere, c) using `--explicit-package-bases` or adjusting MYPYPATH Found 1 error in 1 file (errors prevented further checking) make: *** [lint] Error 2 ``` @rlancemartin @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Jason Fan	8effd90be0	Add new types of document transformers (#7379 ) - Description: Add two new document transformers that translate documents into different languages and convert documents into Q&A format to improve vector search results. Uses OpenAI function calling via the [doctran](https://github.com/psychic-api/doctran/tree/main) library. - Issue: N/A - Dependencies: `doctran = "^0.0.5"` - Tag maintainer: @rlancemartin @eyurtsev @hwchase17 - Twitter handle: @psychicapi or @jfan001 Notes - Adheres to the `DocumentTransformer` abstraction set by @dev2049 in #3182 - Refactored `EmbeddingsRedundantFilter` to put it in a file under a new `document_transformers` module - Added basic docs for `DocumentInterrogator`, `DocumentTransformer` as well as the existing `EmbeddingsRedundantFilter` --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Jamie Broomall	0e1d7a27c6	WhyLabsCallbackHandler updates (#7621 ) Updates to the WhyLabsCallbackHandler and example notebook - Update dependency to langkit 0.0.6 which defines new helper methods for callback integrations - Update WhyLabsCallbackHandler to use the new `get_callback_instance` so that the callback is mostly defined in langkit - Remove much of the implementation of the WhyLabsCallbackHandler here in favor of the callback instance This does not change the behavior of the whylabs callback handler implementation but is a reorganization that moves some of the implementation externally to our optional dependency package, and should make future updates easier. @agola11	1 year ago
Gaurang Pawar	53722dcfdc	Fixed a typo in pinecone_hybrid_search.ipynb (#7627 ) Fixed a small typo in documentation	1 year ago
Bagatur	ee70d4a0cd	mv tutorials (#7614 )	1 year ago
Yaohui Wang	d85c33a5c3	Fix the markdown rendering issue with a code block inside a markdown code block (#6625 ) ### Description - Fix the markdown rendering issue with a code block inside a markdown, using a different number of backticks for the delimiters. Current doc site: <https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/code_splitter#markdown> After fix: <img width="480" alt="image" src="https://github.com/hwchase17/langchain/assets/3115235/d9921d59-64e6-4a34-9c62-79743667f528"> ### Who can review PTAL @dev2049 Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>	1 year ago
Yaroslav Halchenko	0d92a7f357	codespell: workflow, config + some (quite a few) typos fixed (#6785 ) Probably the most boring PR to review ;) Individual commits might be easier to digest --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	1 year ago
Sam	931e68692e	Adds a chain around sympy for symbolic math (#6834 ) - Description: Adds a new chain that acts as a wrapper around Sympy to give LLMs the ability to do some symbolic math. - Dependencies: SymPy --------- Co-authored-by: sreiswig <sreiswig@github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Subsegment	6e1000dc8d	docs : Use more meaningful cnosdb examples (#7587 ) This change makes the ecosystem integrations cnosdb documentation more realistic and easy to understand. - change examples of question and table - modify typo and format	1 year ago
ausboss	50316f6477	Adding LLM wrapper for Kobold AI (#7560 ) - Description: add a wrapper that lets you use the KoboldAI API in langchain - Issue: n/a - Dependencies: none extra, just what exists in langchain - Tag maintainer: @baskaryan - Twitter handle: @zanzibased --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
os1ma	2667ddc686	Fix `make docs_build` and related scripts (#7276 ) Description: Fixed `make docs_build` and related scripts, which caused errors. There are several changes. First, I split the build of the documentation and the API Reference into two separate commands, because each then takes less time to build. The commands for the documentation are `make docs_build`, `make docs_clean`, and `make docs_linkcheck`. The commands for the API Reference are `make api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`. It looked like `docs/.local_build.sh` could be used to build the documentation, so I used that. Since `.local_build.sh` was also building the API Reference internally, I removed that process. I also added some Bash options to `.local_build.sh` so that it stops on errors. Furthermore, I added `cd "${SCRIPT_DIR}"` at the beginning so that the script works no matter which directory it is executed in. `docs/api_reference/api_reference.rst` was removed because it is generated by `docs/api_reference/create_api_rst.py`, and it was added to .gitignore. Finally, the description in CONTRIBUTING.md was modified. Issue: https://github.com/hwchase17/langchain/issues/6413 Dependencies: `nbdoc` was missing in group docs so it was added. I installed it with the `poetry add --group docs nbdoc` command. I am concerned about whether any modifications are needed to poetry.lock. I would greatly appreciate it if you could pay close attention to this file during the review. Tag maintainer - General / Misc / if you don't know who to tag: @baskaryan If this PR needs any additional changes, I'll be happy to make them! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
schop-rob	e811c5e8c6	Add OpenAI organization ID to docs (#7398 ) Description: I added an example of how to reference the OpenAI API Organization ID, because I couldn't find it before. In the example, it is mentioned how to achieve this using environment variables as well as parameters for the OpenAI()-class Issue: - Dependencies: - Twitter @schop-rob	1 year ago
Kenny	8741e55e7c	Template formats documentation (#7404 ) Simple addition to the documentation, adding the correct import statement & showcasing using Python FStrings.	1 year ago
OwenElliott	9cb2347453	Fix broken link from Marqo Ecosystem (#7510 ) Small fix to a link from the Marqo page in the ecosystem. The link was not updated correctly when the documentation structure changed to html pages instead of links to notebooks.	1 year ago
Kacper Łukawski	1f83b5f47e	Reuse the existing collection if configured properly in Qdrant.from_texts (#7530 ) This PR changes the behavior of `Qdrant.from_texts` so the collection is reused if not requested to recreate it. Previously, calling `Qdrant.from_texts` or `Qdrant.from_documents` resulted in removing the old data which was confusing for many.	1 year ago
Felix Brockmeier	406a9dc11f	Add notebook example for Lemon AI NLP Workflow Automation (#7556 ) - Description: Added notebook to LangChain docs that explains how to use Lemon AI NLP Workflow Automation tool with Langchain - Issue: not applicable - Dependencies: not applicable - Tag maintainer: @agola11 - Twitter handle: felixbrockm	1 year ago
Lance Martin	9e067b8cc9	Add env setup (#7550 ) Include setup	1 year ago
Bagatur	d2137eea9f	fix cpal docs (#7545 )	1 year ago
Boris	9129318466	CPAL (#6255 ) # Causal program-aided language (CPAL) chain ## Motivation This builds on the recent [PAL](https://arxiv.org/abs/2211.10435) to stop LLM hallucination. The problem with the [PAL](https://arxiv.org/abs/2211.10435) approach is that it hallucinates on a math problem with a nested chain of dependence. The innovation here is that this new CPAL approach includes causal structure to fix hallucination. For example, using the below word problem, PAL answers with 5, and CPAL answers with 13. "Tim buys the same number of pets as Cindy and Boris." "Cindy buys the same number of pets as Bill plus Bob." "Boris buys the same number of pets as Ben plus Beth." "Bill buys the same number of pets as Obama." "Bob buys the same number of pets as Obama." "Ben buys the same number of pets as Obama." "Beth buys the same number of pets as Obama." "If Obama buys one pet, how many pets total does everyone buy?" The CPAL chain represents the causal structure of the above narrative as a causal graph or DAG, which it can also plot, as shown below. ![complex-graph](https://github.com/hwchase17/langchain/assets/367522/d938db15-f941-493d-8605-536ad530f576) . The two major sections below are: 1. Technical overview 2. Future application Also see [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 1. Technical overview ### CPAL versus PAL Like [PAL](https://arxiv.org/abs/2211.10435), CPAL intends to reduce large language model (LLM) hallucination. The CPAL chain is different from the PAL chain for a couple of reasons. * CPAL adds a causal structure (or DAG) to link entity actions (or math expressions). * The CPAL math expressions are modeling a chain of cause and effect relations, which can be intervened upon, whereas for the PAL chain math expressions are projected math identities. PAL's generated python code is wrong. It hallucinates when complexity increases. 
```python
def solution():
    """Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?"""
    obama_pets = 1
    tim_pets = obama_pets
    cindy_pets = obama_pets + obama_pets
    boris_pets = obama_pets + obama_pets
    total_pets = tim_pets + cindy_pets + boris_pets
    result = total_pets
    return result

# math result is 5
```
CPAL's generated python code is correct.
```python
story outcome data
   name   code                                   value  depends_on
0  obama  pass                                   1.0    []
1  bill   bill.value = obama.value               1.0    [obama]
2  bob    bob.value = obama.value                1.0    [obama]
3  ben    ben.value = obama.value                1.0    [obama]
4  beth   beth.value = obama.value               1.0    [obama]
5  cindy  cindy.value = bill.value + bob.value   2.0    [bill, bob]
6  boris  boris.value = ben.value + beth.value   2.0    [ben, beth]
7  tim    tim.value = cindy.value + boris.value  4.0    [cindy, boris]

query data
{
    "question": "how many pets total does everyone buy?",
    "expression": "SELECT SUM(value) FROM df",
    "llm_error_msg": ""
}
# query result is 13
```
Based on the comments below, CPAL's intended location in the library is `experimental/chains/cpal` and PAL's location is `chains/pal`. ### CPAL vs Graph QA Both the CPAL chain and the Graph QA chain extract entity-action-entity relations into a DAG. The CPAL chain is different from the Graph QA chain for a few reasons. * Graph QA does not connect entities to math expressions * Graph QA does not associate actions in a sequence of dependence. * Graph QA does not decompose the narrative into these three parts: 1. Story plot or causal model 2. Hypothetical question 3. 
Hypothetical condition ### Evaluation Preliminary evaluation on simple math word problems shows that this CPAL chain hallucinates less than the PAL chain when answering questions about a causal narrative. Two examples are in [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 2. Future application ### "Describe as Narrative, Test as Code" The thesis here is that the Describe as Narrative, Test as Code approach allows you to represent a causal mental model both as code and as a narrative, giving you the best of both worlds. #### Why describe a causal mental model as a narrative? The narrative form is quick. At a consensus building meeting, people use narratives to persuade others of their causal mental model, a.k.a. their plan. You can share, version control and index a narrative. #### Why test a causal mental model as code? Code is testable; complex narratives are not. Though fast, narratives become problematic as their complexity increases. The problem is that LLMs and humans are prone to hallucination when predicting the outcomes of a narrative. The cost of building a consensus around the validity of a narrative outcome grows as its narrative complexity increases. Code does not require tribal knowledge or social power to validate. Code is composable; complex narratives are not. The answer of one CPAL chain can be the hypothetical conditions of another CPAL chain. For stochastic simulations, a composable plan can be integrated with the [DoWhy library](https://github.com/py-why/dowhy). Lastly, for the futuristic folk, a composable plan as code allows ordinary community folk to design a plan that can be integrated with a blockchain for funding. An explanation of a dependency planning application is [here.](https://github.com/borisdev/cpal-llm-chain-demo) --- Twitter handle: @boris_dev --------- Co-authored-by: Boris Dev <borisdev@Boriss-MacBook-Air.local>	1 year ago
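The causal table in the CPAL entry above can be replayed in plain Python to see where the total of 13 comes from. This is only an illustrative sketch, not the CPAL chain's code: the `dependencies` mapping and the `evaluate` helper are hypothetical names, and the "same number as X plus Y" rule is modeled as a simple sum over parents.

```python
# Illustrative sketch only: not the CPAL chain's actual implementation.
# Each entity maps to the entities whose pet counts it adds up.
dependencies = {
    "obama": [],
    "bill": ["obama"],
    "bob": ["obama"],
    "ben": ["obama"],
    "beth": ["obama"],
    "cindy": ["bill", "bob"],
    "boris": ["ben", "beth"],
    "tim": ["cindy", "boris"],
}


def evaluate(dependencies, roots):
    """Resolve every node's value by walking the DAG depth-first."""
    values = dict(roots)

    def resolve(name):
        if name not in values:
            values[name] = sum(resolve(parent) for parent in dependencies[name])
        return values[name]

    for name in dependencies:
        resolve(name)
    return values


values = evaluate(dependencies, {"obama": 1})
total = sum(values.values())
print(values["tim"], total)  # 4 13
```

Changing the root condition (for example `{"obama": 2}`) re-evaluates the whole graph, which is the kind of intervention the hypothetical-condition part of the narrative describes.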
Alejandra De Luna	2e4047e5e7	feat: support generate as an early stopping method for `OpenAIFunctionsAgent` (#7229 ) This PR proposes an implementation to support `generate` as an `early_stopping_method` for the new `OpenAIFunctionsAgent` class. The motivation is to let the user set a maximum number of actions the agent can take with `max_iterations` and force a final response with this new agent (as with the `Agent` class). The following changes were made: - The `OpenAIFunctionsAgent.return_stopped_response` method was overridden to support `generate` as an `early_stopping_method`. - A boolean `with_functions` parameter was added to the `OpenAIFunctionsAgent.plan` method. This way the `OpenAIFunctionsAgent.return_stopped_response` method can call the `OpenAIFunctionsAgent.plan` method with `with_functions=False` when the `early_stopping_method` is set to `generate`, making a call to the LLM with no functions and forcing a final response from the `"assistant"`. - Relevant maintainer: @hinthornw - Twitter handle: @aledelunap --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
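The control flow described in this entry can be sketched without LangChain at all. Everything below is a hypothetical stand-in (`fake_llm`, `plan`, `run_agent`, and the stop message are invented for illustration); it only shows the idea of calling the planner once more without functions when the iteration budget runs out.

```python
# Hypothetical stand-ins; none of these are LangChain classes or functions.
def fake_llm(messages, functions=None):
    """Pretend model: requests a tool while functions are offered, else answers."""
    if functions:
        return {"function_call": "search"}
    return {"content": f"Final answer after {len(messages)} steps"}


def plan(steps, with_functions=True):
    """Ask the model what to do next, optionally hiding the functions."""
    functions = ["search"] if with_functions else None
    return fake_llm(steps, functions=functions)


def run_agent(max_iterations, early_stopping_method="force"):
    steps = []
    for _ in range(max_iterations):
        decision = plan(steps)
        if "content" in decision:
            return decision["content"]
        steps.append(decision["function_call"])
    # Iteration budget exhausted: decide how to stop.
    if early_stopping_method == "generate":
        # One last call with no functions, so the model has to answer in text.
        return plan(steps, with_functions=False)["content"]
    return "Agent stopped due to iteration limit."


print(run_agent(3, "generate"))
print(run_agent(3))
```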
Lance Martin	4a94f56258	Minor edits to QA docs (#7507 ) Small clean-ups	1 year ago
Lance Martin	bd0c6381f5	Minor update to clarify map-reduce custom prompt usage (#7453 ) Update docs for map-reduce custom prompt usage	1 year ago
Lance Martin	28d2b213a4	Update landing page for "question answering over documents" (#7152 ) Improve documentation for a central use-case, qa / chat over documents. This will be merged as an update to `index.mdx` [here](https://python.langchain.com/docs/use_cases/question_answering/). Testing w/ local Docusaurus server: ``` From `docs` directory: mkdir _dist cp -r {docs_skeleton,snippets} _dist cp -r extras/* _dist/docs_skeleton/docs cd _dist/docs_skeleton yarn install yarn start ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Adilkhan Sarsen	5debd5043e	Added deeplake use case examples of the new features (#6528 ) 1. Added use cases of the new features 2. Done some code refactoring --------- Co-authored-by: Ivo Stranic <istranic@gmail.com>	1 year ago
Kazuki Maeda	92b4418c8c	Datadog logs loader (#7356 ) ### Description Created a Loader to get a list of specific logs from Datadog Logs. ### Dependencies `datadog_api_client` is required. ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Yifei Song	7d29bb2c02	Add Xorbits Dataframe as a Document Loader (#7319 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits document loader, which allows langchain to leverage Xorbits to parallelize and distribute the loading of data. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @rlancemartin, @eyurtsev - Twitter handle: https://twitter.com/Xorbitsio Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Paul-Emile Brotons	d2cf0d16b3	adding max_marginal_relevance_search method to MongoDBAtlasVectorSearch (#7310 ) Adding a maximal_marginal_relevance method to the MongoDBAtlasVectorSearch vectorstore enhances the user experience by providing more diverse search results. Issue: #7304	1 year ago
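For readers unfamiliar with why maximal marginal relevance yields more diverse results, here is a minimal pure-Python sketch of the general technique. It is not the MongoDBAtlasVectorSearch code; the `cosine` and `mmr` helpers and the toy vectors are assumptions made up for this illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, candidates, k=2, lambda_mult=0.5):
    """Return indices of k candidates, trading relevance against redundancy."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
print(mmr(query, docs, k=2, lambda_mult=0.3))  # [0, 2]
print(mmr(query, docs, k=2, lambda_mult=1.0))  # [0, 1]
```

With `lambda_mult=1.0` the ranking is pure similarity and returns the two near-duplicate vectors; lowering it penalizes the near-duplicate and surfaces the third vector instead.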
Matt Robinson	bcab894f4e	feat: Add `UnstructuredTSVLoader` (#7367 ) ### Summary Adds an `UnstructuredTSVLoader` for TSV files. Also updates the doc strings for `UnstructuredCSV` and `UnstructuredExcel` loaders. ### Testing ```python from langchain.document_loaders.tsv import UnstructuredTSVLoader loader = UnstructuredTSVLoader( file_path="example_data/mlb_teams_2012.csv", mode="elements" ) docs = loader.load() ```	1 year ago
nikkie	dfc3f83b0f	docs(vectorstores/integrations/chroma): Fix loading and saving (#7437 ) - Description: Fix loading and saving code about Chroma - Issue: the issue #7436 - Dependencies: - - Twitter handle: https://twitter.com/ftnext	1 year ago
Daniel Chalef	c7f7788d0b	Add ZepMemory; improve ZepChatMessageHistory handling of metadata; Fix bugs (#7444 ) Hey @hwchase17 - This PR adds a `ZepMemory` class, improves handling of Zep's message metadata, and makes it easier for folks building custom chains to persist metadata alongside their chat history. We've had plenty of confused users unfamiliar with ChatMessageHistory classes and how to wrap the `ZepChatMessageHistory` in a `ConversationBufferMemory`. So we've created the `ZepMemory` class as a light wrapper for `ZepChatMessageHistory`. Details: - add ZepMemory, modify notebook to demo use of ZepMemory - Modify summary to be SystemMessage - add metadata argument to add_message; add Zep metadata to Message.additional_kwargs - support passing in metadata	1 year ago
Saurabh Chaturvedi	8f8e8d701e	Fix info about YouTube (#7447 ) (Unintentionally mean 😅) nit: YouTube wasn't created by Google, this PR fixes the mention in docs.	1 year ago
Jeroen Van Goey	f5bd88757e	Fix typo (#7416 ) `quesitons` -> `questions`.	1 year ago
Nolan	5da9f9abcb	docs(agents/toolkits): Fix error in document_comparison_toolkit.ipynb (#7417 ) - Description: Removes unneeded output warning in documentation at https://python.langchain.com/docs/modules/agents/toolkits/document_comparison_toolkit - Issue: - - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: @finnless	1 year ago
nikkie	2eb4a2ceea	docs(retrievers/get-started): Fix broken state_of_the_union.txt link (#7399 ) Thank you for this awesome library. - Description: Fix broken link in documentation - Issue: - https://python.langchain.com/docs/modules/data_connection/retrievers/#get-started - the URL: https://github.com/hwchase17/langchain/blob/master/docs/modules/state_of_the_union.txt - I think the right one is https://github.com/hwchase17/langchain/blob/master/docs/extras/modules/state_of_the_union.txt - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: -	1 year ago
Delgermurun	a1603fccfb	integrate JinaChat (#6927 ) Integration with https://chat.jina.ai/api. It is OpenAI compatible API. - Twitter handle: [https://twitter.com/JinaAI_](https://twitter.com/JinaAI_) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Roger Yu	633b673b85	Update pinecone.ipynb (#7382 ) Fix typo	1 year ago
William FH	4789c99bc2	Add String Distance and Embedding Evaluators (#7123 ) Add a string evaluator and pairwise string evaluator implementation for: - Embedding distance - String distance Update docs	1 year ago
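The two metrics named in this entry can be illustrated with a small standalone sketch. This is not the evaluator code added in the PR; `levenshtein` and `cosine_distance` are generic textbook implementations written here for illustration, and the vectors stand in for embeddings.

```python
import math


def levenshtein(a, b):
    """Edit distance between two strings via the two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm


print(levenshtein("kitten", "sitting"))         # 3
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

A string evaluator would score one prediction against a reference with such a metric, while a pairwise evaluator would apply the same metric to two predictions.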
Bagatur	1ac347b4e3	update databerry-chaindesk redirect (#7378 )	1 year ago
Joshua Carroll	705d2f5b92	Update the API Reference link in Streamlit integration docs (#7377 ) This page: https://python.langchain.com/docs/modules/callbacks/integrations/streamlit Has a bad API Reference link currently. This PR fixes it to the correct link. Also updates the embedded app link to https://langchain-mrkl.streamlit.app/ (better name) which is hosted in langchain-ai/streamlit-agent repo	1 year ago
Georges Petrov	ec033ae277	Rename Databerry to Chaindesk (#7022 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Harrison Chase	7cdf97ba9b	Harrison/add to imports (#7370 ) pgvector cleanup	1 year ago
Bagatur	4d427b2397	Base language model docstrings (#7104 )	1 year ago
Alex Gamble	df746ad821	Add a callback handler for Context (https://getcontext.ai ) (#7151 ) ### Description Adding a callback handler for Context. Context is a product analytics platform for AI chat experiences to help you understand how users are interacting with your product. I've added the callback library + an example notebook showing its use. ### Dependencies Requires the user to install the `context-python` library. The library is lazily-loaded when the callback is instantiated. ### Announcing the feature We spoke with Harrison a few weeks ago about also doing a blog post announcing our integration, so will coordinate this with him. Our Twitter handle for the company is @getcontextai, and the founders are @_agamble and @HenrySG. Thanks in advance!	1 year ago
German Martin	3ce4e46c8c	The Fellowship of the Vectors: New Embeddings Filter using clustering. (#7015 ) Continuing with the Tolkien-inspired series of langchain tools, I bring to you: The Fellowship of the Vectors, AKA EmbeddingsClusteringFilter. This document filter uses embeddings to group vectors together into clusters, then allows you to pick an arbitrary number of document vectors based on proximity to the cluster centers. That gives a representative sample of each cluster. The original idea is from [Greg Kamradt](https://github.com/gkamradt) from this video (Level4): https://www.youtube.com/watch?v=qaPMdcCqtWk&t=365s I added a few tricks to make it a bit more versatile, so you can parametrize what to do with duplicate documents in case of cluster overlap: replace the duplicates with the next closest document or remove them. This allows you to use it as a special kind of redundant filter too. Additionally you can choose between 2 different orders: grouped by cluster or respecting the original retriever scores. In my use case I was using the docs grouped by cluster to run refine chains per cluster to generate summaries over a large corpus of documents. Let me know if you want to change anything! @rlancemartin, @eyurtsev, @hwchase17, --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
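The selection step described in this entry (pick the documents nearest each cluster center, then either replace or drop duplicates) can be sketched as follows. This is an assumption-laden illustration, not the EmbeddingsClusteringFilter source: the cluster centers are taken as given rather than computed by a clustering algorithm, and `closest_per_center` and its parameters are hypothetical names.

```python
import math


def closest_per_center(vectors, centers, num_closest=1, remove_duplicates=False):
    """Pick indices of the vectors nearest each cluster center.

    remove_duplicates=False: a vector already claimed by an earlier cluster
    is replaced by the next closest one.
    remove_duplicates=True: the duplicate is simply dropped.
    """
    chosen = []
    for center in centers:
        ranked = sorted(
            range(len(vectors)), key=lambda i: math.dist(vectors[i], center)
        )
        taken = 0
        for index in ranked:
            if taken == num_closest:
                break
            if index in chosen:
                if remove_duplicates:
                    taken += 1  # the slot is used up, nothing is added
                continue
            chosen.append(index)
            taken += 1
    return chosen


vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 5.0]]
overlapping_centers = [[0.0, 0.0], [0.04, 0.0]]  # both closest to vector 0
print(closest_per_center(vectors, overlapping_centers))  # [0, 1]
print(closest_per_center(vectors, overlapping_centers, remove_duplicates=True))  # [0]
```

The returned list is grouped by cluster; sorting it would restore the original retriever order, which corresponds to the second ordering option mentioned above.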
Leonid Ganeline	b489466488	docs: `dependents` update 4 (#7360 ) Updated links and counters of the `dependents` page.	1 year ago
Bagatur	d1c7237034	openai fn update nb (#7352 )	1 year ago
Bagatur	1c8cff32f1	Generic OpenAI fn chain (#7270 ) Add loading functions for openai function chains and add docs page	1 year ago
Bagatur	fd7145970f	Output parser redirect (#7330 ) Related to #7311	1 year ago
OwenElliott	3074306ae1	Marqo Vector Store Examples & Type Hints (#7326 ) This PR improves the example notebook for the Marqo vectorstore implementation by adding a new RetrievalQAWithSourcesChain example. The `embedding` parameter in `from_documents` has its type updated to `Union[Embeddings, None]` and a default parameter of None because this is ignored in Marqo. This PR also upgrades the Marqo version to 0.11.0 to remove the device parameter after a breaking change to the API. Related to #7068 @tomhamer @hwchase17 --------- Co-authored-by: Tom Hamer <tom@marqo.ai>	1 year ago
Bagatur	a9c5b4bcea	Bagatur/clarifai update (#7324 ) This PR improves upon the Clarifai LangChain integration with improved docs, errors, args and the addition of embedding model support in LangChain for Clarifai's embedding models and an overview of the various ways you can integrate with Clarifai added to the docs. --------- Co-authored-by: Matthew Zeiler <zeiler@clarifai.com>	1 year ago
John Landahl	e047541b5f	Corrected a typo in elasticsearch.ipynb (#7318 ) Simple typo fix	1 year ago
Subsegment	152dc59060	docs : add cnosdb to Ecosystem Integrations (#7316 ) - Implement a `from_cnosdb` method for the `SQLDatabase` class - Write CnosDB documentation and add it to Ecosystem Integrations	1 year ago
Bagatur	a6b39afe0e	rm side nav (#7297 )	1 year ago
Leonid Ganeline	6ff9e9b34a	updated `huggingface_hub` examples (#7292 ) Added examples for models: - Google `Flan` - TII `Falcon` - Salesforce `XGen`	1 year ago
Dídac Sabatés	e0cb3ea90c	Fix sql_database.ipynb link (#6525 ) Looks like the [SQLDatabaseChain](https://langchain.readthedocs.io/en/latest/modules/chains/examples/sqlite.html) link in the SQL Database Agent page was broken. I've changed it to the SQL Chain page.	1 year ago
Leonid Ganeline	4450791edd	docs: tutorials update (#7230 ) updated `tutorials.mdx`: - added a link to new `Deeplearning AI` course on LangChain - added links to other tutorial videos - fixed format @baskaryan, @hwchase17	1 year ago
hayao-k	c23e16c459	docs: Fixed typos in Amazon Kendra Retriever documentation (#7261 ) ## Description Fixed to the official service name Amazon Kendra. ## Tag maintainer @baskaryan	1 year ago
zhaoshengbo	e8f24164f0	Improve the alibaba cloud opensearch vector store documentation (#6964 ) Based on user feedback, we have improved the Alibaba Cloud OpenSearch vector store documentation. Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	1 year ago
emarco177	b9d6d4cd4c	added template repo for CI/CD deployment on Google Cloud Run (#7218 ) - Description: added documentation for a template repo that helps with dockerizing and deploying a LangChain app to serverless Google Cloud Run using a Cloud Build CI/CD pipeline - Issue: None, - Dependencies: None, - Tag maintainer: @baskaryan, - Twitter handle: EdenEmarco177	1 year ago
Stefano Lottini	e61cfb6e99	FLARE Example notebook: switch to named arg to pass pydantic validation (#7267 ) Adding the name of the parameter to comply with latest requirements by Pydantic usage for BaseModels.	1 year ago
os1ma	b151d4257a	docs: Update documentation for Wikipedia tool to use WikipediaQueryRun (#7258 ) Description In the following page, "Wikipedia" tool is explained. https://python.langchain.com/docs/modules/agents/tools/integrations/wikipedia However, the WikipediaAPIWrapper being used is not a tool. This PR updated the documentation to use a tool WikipediaQueryRun. Issue None Tag maintainer Agents / Tools / Toolkits: @hinthornw	1 year ago
Jeroen Van Goey	887bb12287	Use correct Language for html_splitter (#7274 ) `html_splitter` was using `Language.MARKDOWN`.	1 year ago
Shantanu Nair	f773c21723	Update supabase match_docs ddl and notebook to use expected id type (#7257 ) - Description: Switch supabase match function DDL to use expected uuid type instead of bigint - Issue: https://github.com/hwchase17/langchain/issues/6743, https://github.com/hwchase17/langchain/issues/7179 - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: https://twitter.com/ShantanuNair	1 year ago
Myeongseop Kim	0e878ccc2d	Add HumanInputChatModel (#7256 ) - Description: This is a chat model equivalent of HumanInputLLM. An example notebook is also added. - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: N/A	1 year ago
Harrison Chase	52b016920c	Harrison/update anthropic (#7237 ) Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>	1 year ago
Hashem Alsaket	6aa66fd2b0	Update Hugging Face Hub notebook (#7236 ) Description: `flan-t5-xl` hangs, updated to `flan-t5-xxl`. Tested all stabilityai LLMs- all hang so removed from tutorial. Temperature > 0 to prevent unintended determinism. Issue: #3275 Tag maintainer: @baskaryan	1 year ago
Harrison Chase	d6541da161	remove arize nb (#7238 ) was causing some issues with docs build	1 year ago
Mike Nitsenko	d669b9ece9	Document loader for Cube Semantic Layer (#6882 ) ### Description This pull request introduces the "Cube Semantic Layer" document loader, which demonstrates the retrieval of Cube's data model metadata in a format suitable for passing to LLMs as embeddings. This enhancement aims to provide contextual information and improve the understanding of data. Twitter handle: @the_cube_dev --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
Tom	e533da8bf2	Adding Marqo to vectorstore ecosystem (#7068 ) This PR brings in a vectorstore interface for [Marqo](https://www.marqo.ai/). The Marqo vectorstore exposes some of Marqo's functionality in addition to the VectorStore base class. The Marqo vectorstore also makes the embedding parameter optional because inference for embeddings is an inherent part of Marqo. Docs, notebook examples and integration tests included. Related PR: https://github.com/hwchase17/langchain/pull/2807 --------- Co-authored-by: Tom Hamer <tom@marqo.ai> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	47e7d09dff	fix arize nb (#7227 )	1 year ago
Harrison Chase	6711854e30	Harrison/dataforseo (#7214 ) Co-authored-by: Alexander <sune357@gmail.com>	1 year ago
Hakan Tekgul	61938a02a1	Create arize_llm_observability.ipynb (#7000 ) Adding documentation and notebook for Arize callback handler. - @dev2049 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11	1 year ago
Leonid Ganeline	ecee4d6e92	docs: update `youtube` videos and tutorials (#6515 ) added tutorials.mdx; updated youtube.mdx Rationale: the Tutorials section in the documentation is top-priority (for example, https://pytorch.org/docs/stable/index.html). Not every project has the resources to make tutorials. We have such a privilege. Community experts created several tutorials on YouTube. But the tutorial links are now hidden on the YouTube page and not easily discovered by first-time visitors. - Added new videos and tutorials that were created since the last update. - Reprioritized videos on the basis of view counts. #### Who can review? - @hwchase17 - @dev2049	1 year ago
Josh Reini	30d8d1d3d0	add trulens integration (#7096 ) Description: Add TruLens integration. Twitter: @trulensml For review: - Tracing: @agola11 - Tools: @hinthornw	1 year ago
Conrad Fernandez	6eff0fa2ca	Added documentation for add_texts function for Pinecone integration (#7134 ) - Description: added some documentation to the Pinecone vector store docs page. - Issue: #7126 - Dependencies: None - Tag maintainer: @baskaryan I can add more documentation on the Pinecone integration functions as I am going to go in great depth into this area. Just wanted to check with the maintainers whether this is all good.	1 year ago
felixocker	db98c44f8f	Support for SPARQL (#7165 ) # [SPARQL](https://www.w3.org/TR/rdf-sparql-query/) for [LangChain](https://github.com/hwchase17/langchain) ## Description LangChain support for knowledge graphs relying on W3C standards using RDFlib: SPARQL/ RDF(S)/ OWL with special focus on RDF \ * Works with local files, files from the web, and SPARQL endpoints * Supports both SELECT and UPDATE queries * Includes both a Jupyter notebook with an example and integration tests ## Contribution compared to related PRs and discussions * [Wikibase agent](https://github.com/hwchase17/langchain/pull/2690) - uses SPARQL, but specifically for wikibase querying * [Cypher qa](https://github.com/hwchase17/langchain/pull/5078) - graph DB question answering for Neo4J via Cypher * [PR 6050](https://github.com/hwchase17/langchain/pull/6050) - tries something similar, but does not cover UPDATE queries and supports only RDF * Discussions on [w3c mailing list](mailto:semantic-web@w3.org) related to the combination of LLMs (specifically ChatGPT) and knowledge graphs ## Dependencies * [RDFlib](https://github.com/RDFLib/rdflib) ## Tag maintainer Graph database related to memory -> @hwchase17	1 year ago
Prakul Agarwal	38f853dfa3	Fixed typos in MongoDB Atlas Vector Search documentation (#7174 ) Fix for typos in MongoDB Atlas Vector Search documentation	1 year ago
Raouf Chebri	6fc24743b7	Add pg_hnsw vectorstore integration (#6893 ) Hi @rlancemartin, @eyurtsev! - Description: Adding HNSW extension support for Postgres. Similar to pgvector vectorstore, with 3 differences 1. it uses HNSW extension for exact and ANN searches, 2. Vectors are of type array of real 3. Only supports L2 - Dependencies: [HNSW](https://github.com/knizhnik/hnsw) extension for Postgres - Example: ```python db = HNSWVectoreStore.from_documents( embedding=embeddings, documents=docs, collection_name=collection_name, connection_string=connection_string ) query = "What did the president say about Ketanji Brown Jackson" docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query) ``` The example notebook is in the PR too.	1 year ago
Max Cembalest	2984803597	cleaned Arthur tracking demo notebook (#7147 ) Cleaned title and reduced clutter for integration demo notebook for the Arthur callback handler	1 year ago
Deepankar Mahapatro	da69a6771f	docs: update Jina ecosystem (#7149 ) Documentation update for [Jina ecosystem](https://python.langchain.com/docs/ecosystem/integrations/jina) and `langchain-serve` in the deployments section to latest features. @hwchase17	1 year ago
Simon Cheung	81eebc4070	Add HugeGraphQAChain to support gremlin generating chain (#7132 ) [Apache HugeGraph](https://github.com/apache/incubator-hugegraph) is a convenient, efficient, and adaptable graph database, compatible with the Apache TinkerPop3 framework and the Gremlin query language. In this PR, the HugeGraph and HugeGraphQAChain provide the same functionality as the existing integration with Neo4j and enable query generation and question answering over a HugeGraph database. The difference is that the graph query language supported by HugeGraph is not Cypher but another very popular graph query language, [Gremlin](https://tinkerpop.apache.org/gremlin.html). A notebook example and a simple test case have also been added. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Saverio Proto	5585607654	Improve Bing Search example (#7128 ) # Description Improve Bing Search example:	1 year ago
Lance Martin	265c285057	Fix GPT4All bug w/ "n_ctx" param (#7093 ) Running `GPT4All` per the [docs](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/gpt4all), I see: ``` $ from langchain.llms import GPT4All $ model = GPT4All(model=local_path) $ model("The capital of France is ", max_tokens=10) TypeError: generate() got an unexpected keyword argument 'n_ctx' ``` It appears `n_ctx` is [no longer a supported param](https://docs.gpt4all.io/gpt4all_python.html#gpt4all.gpt4all.GPT4All.generate) in the GPT4All API from https://github.com/nomic-ai/gpt4all/pull/1090. It now uses `max_tokens`, so I set this. And I also set other defaults used in GPT4All client [here](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/python/gpt4all/gpt4all.py). Confirm it now works: ``` $ from langchain.llms import GPT4All $ model = GPT4All(model=local_path) $ model("The capital of France is ", max_tokens=10) < Model logging > "....Paris." ``` --------- Co-authored-by: R. Lance Martin <rlm@Rs-MacBook-Pro.local>	1 year ago
Stefano Lottini	6631fd5168	Align cassio versions between examples for Cassandra integration (#7099 ) Just reducing confusion by requiring cassio>=0.0.7 consistently across examples.	1 year ago
Ruixi Fan	0b69a7e9ab	[Document fix] Fix an expired link qa_benchmarking_pg.ipynb (#7110 ) ## Change description - Description: Fix an expired link that points to the readthedocs site. - Dependencies: No	1 year ago
Lance Martin	9ca4c54428	Minor updates to notebook for MultiQueryRetriever (#7102 ) * Add an easier-to-run example. * Add logging per https://github.com/hwchase17/langchain/pull/6891. * Updated params per https://github.com/hwchase17/langchain/pull/5962. --------- Co-authored-by: R. Lance Martin <rlm@Rs-MacBook-Pro.local> Co-authored-by: Lance Martin <lance@langchain.dev>	1 year ago
Nicolas	490fcf9d98	docs: New experimental UI for Mendable Search (#6558 ) This PR introduces a new Mendable UI tailored to a better search experience. We're more closely integrating our traditional search with our AI generation. With this change, you won't have to tab back and forth between the mendable bot and the keyword search. Both types of search are handled in the same bar. This should make the docs easier to navigate, while still letting users get code generations or AI-summarized answers if they so wish. Also, it should reduce the cost. Would love to hear your feedback :) Cc: @dev2049 @hwchase17	1 year ago
genewoo	e49abd1277	Add Metal support to llama.cpp doc (#7092 ) - Description: Add Metal support to llama.cpp doc - Issue: #7091 - Dependencies: N/A - Twitter handle: gene_wu	1 year ago
rjarun8	e2d61ab85a	Add SpacyEmbeddings class (#6967 ) - Description: Added a new SpacyEmbeddings class for generating embeddings using the Spacy library. - Issue: Sentencebert/Bert/Spacy/Doc2vec embedding support #6952 - Dependencies: This change requires the Spacy library and the 'en_core_web_sm' Spacy model. - Tag maintainer: @dev2049 - Twitter handle: N/A This change includes a new SpacyEmbeddings class, but does not include a test or an example notebook. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	1 year ago
Leonid Ganeline	16fbd528c5	docs: commented out `editUrl` option (#6440 )	1 year ago
adam91holt	80e86b602e	Remove duplicate mongodb integration doc (#7006 )	1 year ago
joaomsimoes	c669d98693	Update get_started.mdx (#7005 ) typo in chat = ChatOpenAI(open_api_key="...") should be openai_api_key	1 year ago
Johnny Lim	a081e419a0	Fix sample in FAISS section (#7050 ) This PR fixes a sample in the FAISS section in the reference docs.	1 year ago
Leonid Ganeline	200be43da6	added `Brave Search` document_loader (#6989 ) - Added `Brave Search` document loader. - Refactored BraveSearch wrapper - Added a Jupyter Notebook example - Added `Ecosystem/Integrations` BraveSearch page Please review: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev	1 year ago
Sergey Kozlov	6d15854cda	Add JSON Lines support to JSONLoader (#6913 ) Description: The JSON Lines format is used by some services such as OpenAI and HuggingFace. It's also a convenient alternative to CSV. This PR adds JSON Lines support to `JSONLoader` and also updates related tests. Tag maintainer: @rlancemartin, @eyurtsev. PS I was not able to build docs locally so didn't update related section.	1 year ago
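As background for this entry, JSON Lines stores one JSON object per line, so it can be parsed with the standard library alone. The sketch below is not the `JSONLoader` implementation; `load_json_lines`, the `content_key` parameter, and the output dictionary shape are hypothetical choices made for illustration.

```python
import json


def load_json_lines(text, content_key):
    """Parse one JSON object per non-empty line and pull out a text field."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        row = json.loads(line)
        records.append(
            {"page_content": row[content_key], "metadata": {"seq_num": line_number}}
        )
    return records


sample = (
    '{"prompt": "Hello", "completion": "Hi"}\n'
    '{"prompt": "Bye", "completion": "See you"}\n'
)
docs = load_json_lines(sample, "prompt")
print([doc["page_content"] for doc in docs])  # ['Hello', 'Bye']
```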
Ofer Mendelevitch	153b56d19b	Vectara upd2 (#6506 ) Update to Vectara integration - By user request added "add_files" to take advantage of Vectara capabilities to process files on the backend, without the need for separate loading of documents and chunking in the chain. - Updated vectara.ipynb example notebook to be broader and added testing of add_file() @hwchase17 - project lead --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
Leonid Ganeline	77ae8084a0	docstrings `document_loaders` 1 (#6847 ) - Updated docstrings in `document_loaders` - several code fixes. - added `docs/extras/ecosystem/integrations/airtable.md` @rlancemartin, @eyurtsev	1 year ago
Bagatur	7acd524210	Rm retriever kwargs (#7013 ) Doesn't actually limit the Retriever interface but hopefully in practice it does	1 year ago
Johnny Lim	9dc77614e3	Polish reference docs (#7045 ) This PR fixes broken links in the reference docs.	1 year ago
Johnny Lim	052c797429	Fix typo (#7023 ) This PR fixes a typo.	1 year ago
Stefano Lottini	8d2281a8ca	Second Attempt - Add concurrent insertion of vector rows in the Cassandra Vector Store (#7017 ) Retrying with the same improvements as in #6772, this time trying not to mess up the branches. @rlancemartin doing a fresh new PR from a branch with a new name. This should do. Thank you for your help! --------- Co-authored-by: Jonathan Ellis <jbellis@datastax.com> Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
Matt Robinson	0498dad562	feat: enable `UnstructuredEmailLoader` to process attachments (#6977 ) ### Summary Updates `UnstructuredEmailLoader` so that it can process attachments in addition to the e-mail content. The loader will process attachments if the `process_attachments` kwarg is passed when the loader is instantiated. ### Testing ```python file_path = "fake-email-attachment.eml" loader = UnstructuredEmailLoader( file_path, mode="elements", process_attachments=True ) docs = loader.load() docs[-1] ``` ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	1 year ago
Matthew Foster Walsh	59697b406d	Fix typo in quickstart.mdx (#6985 ) Removed an extra "to" from a sentence. @dev2049 very minor documentation fix.	1 year ago
Paul Grillenberger	aa37b10b28	Fix: Correct typo (#6988 ) Description: Correct a minor typo in the docs. @dev2049	1 year ago
Zander Chase	b0859c9b18	Add New Retriever Interface with Callbacks (#5962 ) Handle the new retriever events in a way that (I think) is entirely backwards compatible? Needs more testing for some of the chain changes and all. This creates an entirely new run type, however. We could also just treat this as an event within a chain run presumably (same with memory) Adds a subclass initializer that upgrades old retriever implementations to the new schema, along with tests to ensure they work. First commit doesn't upgrade any of our retriever implementations (to show that we can pass the tests along with additional ones testing the upgrade logic). Second commit upgrades the known universe of retrievers in langchain. - [X] Add callback handling methods for retriever start/end/error (open to renaming to 'retrieval' if you want that) - [X] Update BaseRetriever schema to support callbacks - [X] Tests for upgrading old "v1" retrievers for backwards compatibility - [X] Update existing retriever implementations to implement the new interface - [X] Update calls within chains to .[a]get_relevant_documents to pass the child callback manager - [X] Update the notebooks/docs to reflect the new interface - [X] Test notebooks thoroughly Not handled: - Memory pass throughs: retrieval memory doesn't have a parent callback manager passed through the method --------- Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>	1 year ago
William FH	a5b206caf3	Remove Promptlayer Notebook (#6996 ) It's breaking our docs build	1 year ago
Daniel Chalef	b26cca8008	Zep Authentication (#6728 ) ## Description: Add Zep API Key argument to ZepChatMessageHistory and ZepRetriever - correct docs site links - add zep api_key auth to constructors ZepChatMessageHistory: @hwchase17, ZepRetriever: @rlancemartin, @eyurtsev	1 year ago
William FH	e4625846e5	Add Flyte Callback Handler (#6139 ) (#6986 ) Signed-off-by: Samhita Alla <aallasamhita@gmail.com> Co-authored-by: Samhita Alla <aallasamhita@gmail.com>	1 year ago
Davis Chase	eb180e321f	Page per class-style api reference (#6560 ) can make it prettier, but what do we think of overall structure? https://api.python.langchain.com/en/dev2049-page_per_class/api_ref.html --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io>	1 year ago
William FH	64039b9f11	Promptlayer Callback (#6975 ) Co-authored-by: Saleh Hindi <saleh.hindi.one@gmail.com> Co-authored-by: jped <jonathanped@gmail.com>	1 year ago
William FH	13c62cf6b1	Arthur Callback (#6972 ) Co-authored-by: Max Cembalest <115359769+arthuractivemodeling@users.noreply.github.com>	1 year ago
William FH	8c73037dff	Simplify eval arg names (#6944 ) It'll be easier to switch between these if the names of predictions are consistent	1 year ago
Davis Chase	bd6a0ee9e9	Redirect vecstores (#6948 )	1 year ago
Davis Chase	f780678910	Add back in clickhouse mongo vecstore notebooks (#6949 )	1 year ago
Jacob Lee	73831ef3d8	Change code block color scheme (#6945 ) Adds contrast, makes code blocks more readable.	1 year ago
Hashem Alsaket	5861770a53	Updated QA notebook (#6801 ) Description: `all_metadatas` was not defined, `OpenAIEmbeddings` was not imported, Issue: #6723, Dependencies: lark, Tag maintainer: @vowelparrot , @dev2049 --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
Kacper Łukawski	140ba682f1	Support named vectors in Qdrant (#6871 ) # Description This PR makes it possible to use named vectors from Qdrant in Langchain. That was requested multiple times, as people want to reuse externally created collections in Langchain. It doesn't change anything for the existing applications. The changes were covered with some integration tests and included in the docs. ## Example ```python Qdrant.from_documents( docs, embeddings, location=":memory:", collection_name="my_documents", vector_name="custom_vector", ) ``` ### Issue: #2594 Tagging @rlancemartin & @eyurtsev. I'd appreciate your review.	1 year ago
corranmac	20c6ade2fc	Grobid parser for Scientific Articles from PDF (#6729 ) ### Scientific Article PDF Parsing via Grobid `Description:` This change adds the GrobidParser class, which uses the Grobid library to parse scientific articles into a universal XML format containing the article title, references, sections, section text etc. The GrobidParser uses a local Grobid server to return PDF documents as XML and parses the XML to optionally produce documents of individual sentences or of whole paragraphs. Metadata includes the text, paragraph number, pdf relative bboxes, pages (text may overlap over two pages), section title (Introduction, Methodology etc), section_number (e.g. 1.1, 2.3), the title of the paper and finally the file path. Grobid parsing is useful beyond standard pdf parsing as it accurately outputs sections and paragraphs within them. This allows for post-filtering of results for specific sections, e.g. limiting results to the methodology section or results. While sections are split via headings, ideally they could be classified specifically into introduction, methodology, results, discussion, conclusion. I'm currently experimenting with chatgpt-3.5 for this function, which could later be implemented as a textsplitter. `Dependencies:` For use, the grobid repo must be cloned and Java must be installed, for colab this is: ``` !apt-get install -y openjdk-11-jdk -q !update-alternatives --set java /usr/lib/jvm/java-11-openjdk-amd64/bin/java !git clone https://github.com/kermitt2/grobid.git os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-11-openjdk-amd64" os.chdir('grobid') !./gradlew clean install ``` Once installed the server is run on localhost:8070 via ``` get_ipython().system_raw('nohup ./gradlew run > grobid.log 2>&1 &') ``` @rlancemartin, @eyurtsev Twitter Handle: @Corranmac Grobid Demo Notebook is [here](https://colab.research.google.com/drive/1X-St_mQRmmm8YWtct_tcJNtoktbdGBmd?usp=sharing). --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
Harrison Chase	0ba175e13f	move octo notebook (#6901 )	1 year ago
Stefano Lottini	75fb9d2fdc	Cassandra support for chat history using CassIO library (#6771 ) ### Overview This PR aims at building on #4378, expanding the capabilities and building on top of the `cassIO` library to interface with the database (as opposed to using the core drivers directly). Usage of `cassIO` (a library abstracting Cassandra access for ML/GenAI-specific purposes) is already established since #6426 was merged, so no new dependencies are introduced. In the same spirit, we try to unify the interface for using Cassandra instances throughout LangChain: all our appreciation of the work by @jj701 notwithstanding, who paved the way for this incremental work (thank you!), we identified a few reasons for changing the way a `CassandraChatMessageHistory` is instantiated. Advocating a syntax change is something we don't take lightly, so we add some explanations about this below. Additionally, this PR expands on integration testing, enables use of Cassandra's native Time-to-Live (TTL) features and improves the phrasing around the notebook example and the short "integrations" documentation paragraph. We would kindly request @hwchase to review (since this is an elaboration and proposed improvement of #4378, which had the same reviewer). ### About the __init__ breaking changes There are [many](https://docs.datastax.com/en/developer/python-driver/3.28/api/cassandra/cluster/) options when creating the `Cluster` object, and new ones might be added at any time. Choosing some of them and exposing them as `__init__` parameters of `CassandraChatMessageHistory` will prove to be insufficient for at least some users. On the other hand, working through `kwargs` or adding a long, long list of arguments to `__init__` is not a desirable option either. For this reason, (as done in #6426), we propose that whoever instantiates the Chat Message History class provide a Cassandra `Session` object, ready to use. 
This also enables easier injection of mocks and usage of Cassandra-compatible connections (such as those to the cloud database DataStax Astra DB, obtained with a different set of init parameters than `contact_points` and `port`). We feel that a breaking change might still be acceptable since LangChain is at `0.*`. However, while maintaining that the approach we propose will be more flexible in the future, room could be made for a "compatibility layer" that respects the current init method. Honestly, we would do that only if there are strong reasons for it, as that would entail an additional maintenance burden. ### Other changes We propose to remove the keyspace creation from the class code for two reasons: first, production Cassandra instances often employ RBAC so that the database user reading/writing from tables does not necessarily (and generally shouldn't) have permission to create keyspaces, and second that programmatic keyspace creation is not a best practice (it should be done more or less manually, with extra care about schema mismatches among nodes, etc). Removing this (usually unnecessary) operation from the `__init__` path would also improve initialization performance (shorter time). We suggest, likewise, to remove the `__del__` method (which would close the database connection), for the following reason: it is the recommended best practice to create a single Cassandra `Session` object throughout an application (it is a resource-heavy object capable of handling concurrency internally), so in case Cassandra is used in other ways by the app there is the risk of truncating the connection for all usages when the history instance is destroyed. Moreover, the `Session` object, in typical applications, is best left to garbage-collect itself automatically. 
As mentioned above, we defer the actual database I/O to the `cassIO` library, which is designed to encode practices optimized for LLM applications (among others) without the need to expose LangChain developers to the internals of CQL (Cassandra Query Language). CassIO is already employed by LangChain's Vector Store support for Cassandra. We added a few more connection options in the companion notebook example (most notably, Astra DB) to encourage usage by anyone who cannot run their own Cassandra cluster. We surface the `ttl_seconds` option for automatic handling of an expiration time to chat history messages, a likely useful feature given that very old messages generally may lose their importance. We elaborated a bit more on the integration testing (Time-to-live, separation of "session ids", ...). ### Remarks from linter & co. We reinstated `cassio` as a dependency both in the "optional" group and in the "integration testing" group of `pyproject.toml`. This might not be the right thing to do, in which case the author of this PR offers his apologies (lack of confidence with Poetry - happy to be pointed in the right direction, though!). During linter tests, we were hit by some errors which appear unrelated to the code in the PR. 
We left them untouched and report them here for awareness: ``` langchain/vectorstores/mongodb_atlas.py:137: error: Argument 1 to "insert_many" of "Collection" has incompatible type "List[Dict[str, Sequence[object]]]"; expected "Iterable[Union[MongoDBDocumentType, RawBSONDocument]]" [arg-type] langchain/vectorstores/mongodb_atlas.py:186: error: Argument 1 to "aggregate" of "Collection" has incompatible type "List[object]"; expected "Sequence[Mapping[str, Any]]" [arg-type] langchain/vectorstores/qdrant.py:16: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:19: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:20: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:22: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:23: error: Name "grpc" is not defined [name-defined] ``` In the same spirit, we observe that to even get `import langchain` to run, it seems that a `pip install bs4` is missing from the minimal package installation path. Thank you!	1 year ago
Harrison Chase	3ac08c3de4	Harrison/octo ml (#6897 ) Co-authored-by: Bassem Yacoube <125713079+AI-Bassem@users.noreply.github.com> Co-authored-by: Shotaro Kohama <khmshtr28@gmail.com> Co-authored-by: Rian Dolphin <34861538+rian-dolphin@users.noreply.github.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Shashank Deshpande <shashankdeshpande18@gmail.com>	1 year ago
Shashank Deshpande	99cfe192da	added example notebook - use custom functions with openai agent (#6865 )	1 year ago
Robert Lewis	c9c8d2599e	Update Zapier Jupyter notebook to include brief OAuth example (#6892 ) Description: Adds a brief example of using an OAuth access token with the Zapier wrapper. Also links to the Zapier documentation to learn more about OAuth flows. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Davis Chase	f07dd02b50	Docs /redirects (#6790 ) Auto-generated a bunch of redirects from initial docs refactor commit	1 year ago
Yaohui Wang	9d1bd18596	feat (documents): add LarkSuite document loader (#6420 ) ### Summary This PR adds a LarkSuite (FeiShu) document loader. > [LarkSuite](https://www.larksuite.com/) is an enterprise collaboration platform developed by ByteDance. ### Tests - an integration test case is added - an example notebook showing usage is added. [Notebook preview](https://github.com/yaohui-wyh/langchain/blob/master/docs/extras/modules/data_connection/document_loaders/integrations/larksuite.ipynb) ### Who can review? - PTAL @eyurtsev @hwchase17 --------- Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>	1 year ago
Jingsong Gao	a435a436c1	feat(document_loaders): add tencent cos directory and file loader (#6401 ) - add tencent cos directory and file support for document-loader #### Who can review? @eyurtsev	1 year ago
Shashank Deshpande	1db266b20d	Update link in apis.mdx (#6812 )	1 year ago
Lance Martin	3f9900a864	Create MultiQueryRetriever (#6833 ) Distance-based vector database retrieval embeds (represents) queries in high-dimensional space and finds similar embedded documents based on "distance". But retrieval may produce different results with subtle changes in query wording or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious. The `MultiQueryRetriever` automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` might be able to overcome some of the limitations of the distance-based retrieval and get a richer set of results. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Tim Asp	3ca1a387c2	Web Loader: Add proxy support (#6792 ) Proxies are helpful, especially when you start querying against more anti-bot websites. [Proxy services](https://developers.oxylabs.io/advanced-proxy-solutions/web-unblocker/making-requests) (of which there are many) and `requests` make it easy to rotate IPs to prevent banning by just passing along a simple dict to `requests`. CC @rlancemartin, @eyurtsev	1 year ago
Matt Robinson	dd2a151543	Docs/unstructured api key (#6781 ) ### Summary The Unstructured API will soon begin requiring API keys. This PR updates the Unstructured integrations docs with instructions on how to generate Unstructured API keys. ### Reviewers @rlancemartin @eyurtsev @hwchase17	1 year ago
Matt Robinson	b24472eae3	feat: Add `UnstructuredOrgModeLoader` (#6842 ) ### Summary Adds `UnstructuredOrgModeLoader` for processing [Org-mode](https://en.wikipedia.org/wiki/Org-mode) documents. ### Testing ```python from langchain.document_loaders import UnstructuredOrgModeLoader loader = UnstructuredOrgModeLoader( file_path="example_data/README.org", mode="elements" ) docs = loader.load() print(docs[0]) ``` ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	1 year ago
Cristóbal Carnero Liñán	e494b0a09f	feat (documents): add a source code loader based on AST manipulation (#6486 ) #### Summary A new approach to loading source code is implemented: Each top-level function and class in the code is loaded into separate documents. Then, an additional document is created with the top-level code, but without the already loaded functions and classes. This could improve the accuracy of QA chains over source code. For instance, having this script: ``` class MyClass: def __init__(self, name): self.name = name def greet(self): print(f"Hello, {self.name}!") def main(): name = input("Enter your name: ") obj = MyClass(name) obj.greet() if __name__ == '__main__': main() ``` The loader will create three documents with this content: First document: ``` class MyClass: def __init__(self, name): self.name = name def greet(self): print(f"Hello, {self.name}!") ``` Second document: ``` def main(): name = input("Enter your name: ") obj = MyClass(name) obj.greet() ``` Third document: ``` # Code for: class MyClass: # Code for: def main(): if __name__ == '__main__': main() ``` A threshold parameter is added to control whether small scripts are split in this way or not. At this moment, only Python and JavaScript are supported. The appropriate parser is determined by examining the file extension. #### Tests This PR adds: - Unit tests - Integration tests #### Dependencies Only one dependency was added as optional (needed for the JavaScript parser). #### Documentation A notebook is added showing how the loader can be used. #### Who can review? @eyurtsev @hwchase17 --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
Robert Lewis	da462d9dd4	Zapier update oauth support (#6780 ) Description: Update documentation to 1) point to updated documentation links at Zapier.com (we've revamped our help docs and paths), and 2) To provide clarity how to use the wrapper with an access token for OAuth support Demo: Initializing the Zapier Wrapper with an OAuth Access Token `ZapierNLAWrapper(zapier_nla_oauth_access_token="<redacted>")` Using LangChain to resolve the current weather in Vancouver BC leveraging Zapier NLA to lookup weather by coords. ``` > Entering new chain... I need to use a tool to get the current weather. Action: The Weather: Get Current Weather Action Input: Get the current weather for Vancouver BC Observation: {"coord__lon": -123.1207, "coord__lat": 49.2827, "weather": [{"id": 802, "main": "Clouds", "description": "scattered clouds", "icon": "03d", "icon_url": "http://openweathermap.org/img/wn/03d@2x.png"}], "weather[]icon_url": ["http://openweathermap.org/img/wn/03d@2x.png"], "weather[]icon": ["03d"], "weather[]id": [802], "weather[]description": ["scattered clouds"], "weather[]main": ["Clouds"], "base": "stations", "main__temp": 71.69, "main__feels_like": 71.56, "main__temp_min": 67.64, "main__temp_max": 76.39, "main__pressure": 1015, "main__humidity": 64, "visibility": 10000, "wind__speed": 3, "wind__deg": 155, "wind__gust": 11.01, "clouds__all": 41, "dt": 1687806607, "sys__type": 2, "sys__id": 2011597, "sys__country": "CA", "sys__sunrise": 1687781297, "sys__sunset": 1687839730, "timezone": -25200, "id": 6173331, "name": "Vancouver", "cod": 200, "summary": "scattered clouds", "_zap_search_was_found_status": true} Thought: I now know the current weather in Vancouver BC. Final Answer: The current weather in Vancouver BC is scattered clouds with a temperature of 71.69 and wind speed of 3 ```	1 year ago
Joshua Carroll	24e4ae95ba	Initial Streamlit callback integration doc (md) (#6788 ) Description: Add a documentation page for the Streamlit Callback Handler integration (#6315) Notes: - Implemented as a markdown file instead of a notebook since example code runs in a Streamlit app (happy to discuss / consider alternatives now or later) - Contains an embedded Streamlit app -> https://mrkl-minimal.streamlit.app/ Currently this app is hosted out of a Streamlit repo but we're working to migrate the code to a LangChain owned repo ![streamlit_docs](https://github.com/hwchase17/langchain/assets/116604821/0b7a6239-361f-470c-8539-f22c40098d1a) cc @dev2049 @tconkling	1 year ago
Zander Chase	e1fdb67440	Update description in Evals notebook (#6808 )	1 year ago
Zander Chase	ad028bbb80	Permit Constitutional Principles (#6807 ) In the criteria evaluator.	1 year ago
WaseemH	7ac9b22886	`RecusiveUrlLoader` to `RecursiveUrlLoader` (#6787 )	1 year ago
Leonid Ganeline	49c864fa18	docs: vectorstore upgrades 2 (#6796 ) updated vectorstores/ notebooks; added new integrations into ecosystem/integrations/ @dev2049 @rlancemartin, @eyurtsev	1 year ago
Zander Chase	d7dbf4aefe	Clean up agent trajectory interface (#6799 ) - Enable reference - Enable not specifying tools at the start - Add methods with keywords	1 year ago
Zander Chase	cc60fed3be	Add a Pairwise Comparison Chain (#6703 ) Notebook shows preference scoring between two chains and reports the Wilson score interval and p-value. I think I'll add the option to insert ground-truth labels, but that doesn't have to be in this PR.	1 year ago
Zander Chase	c460b04c64	Update String Evaluator (#6615 ) - Add protocol for `evaluate_strings` - Move the criteria evaluator out so it's not restricted to being applied on traced runs	1 year ago
Chris Pappalardo	70f7c2bb2e	align chroma vectorstore get with chromadb to enable where filtering (#6686 ) allows for where filtering on collection via get - Description: aligns langchain chroma vectorstore get with underlying [chromadb collection get](https://github.com/chroma-core/chroma/blob/main/chromadb/api/models/Collection.py#L103) allowing for where filtering, etc. - Issue: NA - Dependencies: none - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @pappanaka	1 year ago
Santiago Delgado	d84a3bcf7a	Office365 Tool (#6306 ) #### Background With the development of [structured tools](https://blog.langchain.dev/structured-tools/), the LangChain team expanded the platform's functionality to meet the needs of new applications. The GMail tool, empowered by structured tools, now supports multiple arguments and powerful search capabilities, demonstrating LangChain's ability to interact with dynamic data sources like email servers. #### Challenge The current GMail tool only supports GMail, while users often utilize other email services like Outlook in Office365. Additionally, the proposed calendar tool in PR https://github.com/hwchase17/langchain/pull/652 only works with Google Calendar, not Outlook. #### Changes This PR implements an Office365 integration for LangChain, enabling seamless email and calendar functionality with a single authentication process. #### Future Work With the core Office365 integration complete, future work could include integrating other Office365 tools such as Tasks and Address Book. #### Who can review? @hwchase17 or @vowelparrot can review this PR #### Appendix @janscas, I utilized your [O365](https://github.com/O365/python-o365) library extensively. Given the rising popularity of LangChain and similar AI frameworks, the convergence of libraries like O365 and tools like this one is likely. So, I wanted to keep you updated on our progress. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Pau Ramon Revilla	87802c86d9	Added a MHTML document loader (#6311 ) MHTML is a very interesting format since it's used both for emails and for archived webpages. Some scraping projects want to store pages on disk to process them later; MHTML is perfect for that use case. This is heavily inspired by the beautifulsoup html loader, but extracts the html part from the mhtml file. --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
Matt Robinson	be68f6f8ce	feat: Add `UnstructuredRSTLoader` (#6594 ) ### Summary Adds an `UnstructuredRSTLoader` for loading [reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText) files. ### Testing ```python from langchain.document_loaders import UnstructuredRSTLoader loader = UnstructuredRSTLoader( file_path="example_data/README.rst", mode="elements" ) docs = loader.load() print(docs[0]) ``` ### Reviewers - @hwchase17 - @rlancemartin - @eyurtsev	1 year ago
刘方瑞	9d1b3bab76	Fix Typo in LangChain MyScale Integration Doc (#6705 ) - Description: Fix Typo in LangChain MyScale Integration Doc @hwchase17	1 year ago
UmerHA	068142fce2	Add caching to BaseChatModel (issue #1644 ) (#5089 ) # Add caching to BaseChatModel Fixes #1644 (Sidenote: While testing, I noticed we have multiple implementations of Fake LLMs, used for testing. I consolidated them.) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord: RicChilligerDude#7589 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Baichuan Sun	9fbe346860	Amazon API Gateway hosted LLM (#6673 ) This PR adds a new LLM class for the Amazon API Gateway hosted LLM. The PR also includes example notebooks for using the LLM class in an Agent chain. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Akash	b7e1c54947	Just corrected a small inconsistency on a doc page (#6603 ) ### Just corrected a small inconsistency on a doc page (not exactly a typo, per se) - Description: There was an inconsistency due to the use of single quotes in one place on the [Sequential Chains](https://python.langchain.com/docs/modules/chains/foundational/sequential_chains) page of the docs, - Issue: NA, - Dependencies: NA, - Tag maintainer: @dev2049, - Twitter handle: kambleakash0	1 year ago
Davis Chase	f1e1ac2a01	chroma nb close img tag (#6669 )	1 year ago
Davis Chase	5e5b30b74f	openapi -> openai nit (#6667 )	1 year ago
Jeff Huber	2acf109c4b	update chroma notebook (#6664 ) @rlancemartin I updated the notebook for Chroma to hopefully be a lot easier for users.	1 year ago
Piyush Jain	b1de927f1b	Kendra retriever api (#6616 ) ## Description Replaces [Kendra Retriever](https://github.com/hwchase17/langchain/blob/master/langchain/retrievers/aws_kendra_index_retriever.py) with an updated version that uses the new [retriever API](https://docs.aws.amazon.com/kendra/latest/dg/searching-retrieve.html) which is better suited for retrieval augmented generation (RAG) systems. Note: This change requires the latest version (1.26.159) of boto3 to work. `pip install -U boto3` to upgrade the boto3 version. cc @hupe1980 cc @dev2049	1 year ago
ChrisLovejoy	4e5d78579b	fix minor typo in vector_db_qa.mdx (#6604 ) - Description: minor typo fixed - doesn't instead of does. No other changes.	1 year ago
Ikko Eltociear Ashimine	73da193a4b	Fix typo in myscale_self_query.ipynb (#6601 )	1 year ago
Aaron Pham	082976d8d0	fix(docs): broken link for OpenLLM (#6622 ) This link for the notebook of OpenLLM is not migrated to the new format Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	1 year ago
Lance Martin	c2b25c17c5	Recursive URL loader (#6455 ) We may want to load all URLs under a root directory. For example, let's look at the [LangChain JS documentation](https://js.langchain.com/docs/). This has many interesting child pages that we may want to read in bulk. Of course, the `WebBaseLoader` can load a list of pages. But the challenge is traversing the tree of child pages and actually assembling that list! We do this using the `RecursiveUrlLoader`. This also gives us the flexibility to exclude some children (e.g., the `api` directory with > 800 child pages).	1 year ago
Lance Martin	393f469eb3	Create merge loader that combines documents from a set of loaders (#6659 ) Simple utility loader that combines documents from a set of specified loaders.	1 year ago
Davis Chase	6988039975	openapi_openai docstring (#6661 )	1 year ago
Davis Chase	e013459b18	Openapi to openai (#6658 )	1 year ago
Lance Martin	6e69bfbb28	Loader for OpenCityData and minor cleanups to Pandas, Airtable loaders (#6301 ) Many cities have open data portals for events like crime, traffic, etc. Socrata provides an API for many, including SF (e.g., see [here](https://dev.socrata.com/foundry/data.sfgov.org/tmnf-yvry)). This is a new data loader for city data that uses Socrata API.	1 year ago
Christoph Kahl	9d42621fa4	added redis method to delete entries by keys (#6222 ) In addition to my last pr (return keys of added entries), we also need a method to delete the entries by keys. @dev2049	1 year ago
Harrison Chase	a9108c1809	add mongo (HOLD) (#6437 ) do not merge in	1 year ago
Lance Martin	30f7288082	MD header text splitter returns Documents (#6571 ) Return `Documents` from MD header text splitter to simplify UX. Updates the test as well as example notebooks.	1 year ago
minhajul-clarifai	6e57306a13	Clarifai integration (#5954 ) # Changes This PR adds [Clarifai](https://www.clarifai.com/) integration to Langchain. Clarifai is an end-to-end AI Platform. Clarifai offers users the ability to use many types of LLM (OpenAI, Cohere, etc., and other open source models). In addition, a Clarifai app can be treated as a vector database to upload and retrieve data. The integration includes: - Clarifai LLM integration: Clarifai supports many types of language model that users can utilize for their application - Clarifai VectorDB: A Clarifai application can hold data and embeddings. You can run semantic search with the embeddings #### Before submitting - [x] Added integration test for LLM - [x] Added integration test for VectorDB - [x] Added notebook for LLM - [x] Added notebook for VectorDB Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Jeroen Van Goey	7f6f5c2a6a	Add missing word in comment (#6587 ) Changed ``` # Do this so we can exactly what's going on under the hood ``` to ``` # Do this so we can see exactly what's going on under the hood ```	1 year ago
Davis Chase	d50de2728f	Add AzureML endpoint LLM wrapper (#6580 ) ### Description We have added a new LLM integration `azureml_endpoint` that allows users to leverage models from the AzureML platform. Microsoft recently announced the release of [Azure Foundation Models](https://learn.microsoft.com/en-us/azure/machine-learning/concept-foundation-models?view=azureml-api-2) which users can find in the AzureML Model Catalog. The Model Catalog contains a variety of open source and Hugging Face models that users can deploy on AzureML. The `azureml_endpoint` allows LangChain users to use the deployed Azure Foundation Models. ### Dependencies No added dependencies were required for the change. ### Tests Integration tests were added in `tests/integration_tests/llms/test_azureml_endpoint.py`. ### Notebook A Jupyter notebook demonstrating how to use `azureml_endpoint` was added to `docs/modules/llms/integrations/azureml_endpoint_example.ipynb`. ### Twitters [Prakhar Gupta](https://twitter.com/prakhar_in) [Matthew DeGuzman](https://twitter.com/matthew_d13) --------- Co-authored-by: Matthew DeGuzman <91019033+matthewdeguzman@users.noreply.github.com> Co-authored-by: prakharg-msft <75808410+prakharg-msft@users.noreply.github.com>	1 year ago
Davis Chase	4fabd02d25	Add OpenLLM wrapper (#6578 ) LLM wrapper for models served with OpenLLM --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> Authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Chaoyu <paranoyang@gmail.com>	1 year ago
Harrison Chase	937a7e93f2	add motherduck docs (#6572 )	1 year ago
Muhammad Vaid	ae81b96b60	Detailed using the Twilio tool to send messages with 3rd party apps incl. WhatsApp (#6562 ) Everything needed to support sending messages over WhatsApp Business Platform (GA), Facebook Messenger (Public Beta) and Google Business Messages (Private Beta) was present. Just added some details on leveraging it.	1 year ago
Gengliang Wang	0673245d0c	Remove duplicate databricks entries in ecosystem integrations (#6569 ) Currently, there are two Databricks entries in https://python.langchain.com/docs/ecosystem/integrations/ The reason is that there are duplicated notebooks for Databricks integration: * https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks.ipynb * https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks/databricks.ipynb This PR is to remove the second one for simplicity.	1 year ago
Andrey E. Vedishchev	a2a0715bd4	Minor Grammar Fixes in Docs and Comments (#6536 ) Just some grammar fixes: I found "retriver" instead of "retriever" in several comments across the documentation and in the comments. I fixed it. Co-authored-by: andrey.vedishchev <andrey.vedishchev@rgigroup.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
dirtysalt	57cc3d1d3d	[Feature][VectorStore] Support StarRocks as vector db (#6119 ) Here are some examples of using StarRocks as a vectordb ``` from langchain.vectorstores import StarRocks from langchain.vectorstores.starrocks import StarRocksSettings embeddings = OpenAIEmbeddings() # configure starrocks settings settings = StarRocksSettings() settings.port = 41003 settings.host = '127.0.0.1' settings.username = 'root' settings.password = '' settings.database = 'zya' # to fill new embeddings docsearch = StarRocks.from_documents(split_docs, embeddings, config = settings) # or to use already-built embeddings in database. docsearch = StarRocks(embeddings, settings) ``` #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Harrison Chase	ace442b992	bump to ver 208 (#6540 )	1 year ago
Harrison Chase	53c1f120a8	Harrison/multi tool (#6518 )	1 year ago
Naman Modi	37a89918e0	Infino integration for simplified logs, metrics & search across LLM data & token usage (#6218 ) ### Integration of Infino with LangChain for Enhanced Observability This PR aims to integrate [Infino](https://github.com/infinohq/infino), an open source observability platform written in Rust for storing metrics and logs at scale, with LangChain, providing users with a streamlined and efficient method of tracking and recording LangChain experiments. By incorporating Infino into LangChain, users will be able to gain valuable insights and easily analyze the behavior of their language models. #### Please refer to the following files related to integration: - `InfinoCallbackHandler`: A [callback handler](https://github.com/naman-modi/langchain/blob/feature/infino-integration/langchain/callbacks/infino_callback.py) specifically designed for storing chain responses within Infino. - Example `infino.ipynb` file: A comprehensive notebook named [infino.ipynb](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/modules/callbacks/integrations/infino.ipynb) has been included to guide users on effectively leveraging Infino for tracking LangChain requests. - [Integration Doc](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/ecosystem/integrations/infino.mdx) for Infino integration. By integrating Infino, LangChain users will gain access to powerful visualization and debugging capabilities. Infino enables easy tracking of inputs, outputs, token usage, and execution time of LLMs. This comprehensive observability ensures a deeper understanding of individual executions and facilitates effective debugging. Co-authors: @vinaykakade @savannahar68 --------- Co-authored-by: Vinay Kakade <vinaykakade@gmail.com>	1 year ago
Anubhav Bindlish	94c7899257	Integrate Rockset as Vectorstore (#6216 ) This PR adds Rockset as a vectorstore for langchain. [Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/) is a real time OLAP database which provides a fast and efficient vector search functionality. Further since it is entirely schemaless, it can store metadata in separate columns thereby allowing fast metadata filters during vector similarity search (as opposed to storing the entire metadata in a single JSON column). It currently supports three distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and `DOT_PRODUCT`. This PR adds `rockset` client as an optional dependency. We would love a twitter shoutout, our handle is https://twitter.com/RocksetCloud --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
ElReyZero	ab7ecc9c30	Feat: Add a prompt template parameter to qa with structure chains (#6495 ) This pull request introduces a new feature to the LangChain QA Retrieval Chains with Structures. The change involves adding a prompt template as an optional parameter for the RetrievalQA chains that utilize the recently implemented OpenAI Functions. The main purpose of this enhancement is to provide users with the ability to input a more customizable prompt to the chain. By introducing a prompt template as an optional parameter, users can tailor the prompt to their specific needs and context, thereby improving the flexibility and effectiveness of the RetrievalQA chains. ## Changes Made - Created a new optional parameter, "prompt", for the RetrievalQA with structure chains. - Added an example to the RetrievalQA with sources notebook. My twitter handle is @El_Rey_Zero --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Hassan Ouda	456ca3d587	Be able to use Codey models on Vertex AI (#6354 ) Added the functionality to leverage 3 new Codey models from Vertex AI: - code-bison - Code generation using the existing LLM integration - code-gecko - Code completion using the existing LLM integration - codechat-bison - Code chat using the existing chat_model integration --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
囧囧	0fce8ef178	Add KuzuQAChain (#6454 ) This PR adds `KuzuGraph` and `KuzuQAChain` for interacting with [Kùzu database](https://github.com/kuzudb/kuzu). Kùzu is an in-process property graph database management system (GDBMS) built for query speed and scalability. The `KuzuGraph` and `KuzuQAChain` provide the same functionality as the existing integrations with NebulaGraph and Neo4j and enable query generation and question answering over a Kùzu database. A notebook example and a simple test case have also been added. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	1 year ago
Chanin Nantasenamat	6e07283dd5	Update index.mdx (#6326 ) #### Fix Added the mention of "store" amongst the tasks that the data connection module can perform aside from the existing 3 (load, transform and query). Particularly, this implies the generation of embeddings vectors and the creation of vector stores.	1 year ago
TheOnlyWayUp	bb437646fc	typo(llamacpp.ipynb): 'condiser' -> 'consider' (#6474 )	1 year ago
hsparmar	834c3378af	Documentation Fix: Correct the example code output in the prompt templates doc (#6496 ) Documentation is showing the wrong example output for the prompt templates code snippet. This PR fixes that issue.	1 year ago
Davis Chase	c91cf68754	Fix link (#6501 )	1 year ago
Davis Chase	3298bf4f00	docs/fix links (#6498 )	1 year ago
Lance Martin	ae6196507d	Update notebook for MD header splitter and create new cookbook (#6399 ) Move MD header text splitter example to its own cookbook.	1 year ago
Stefano Lottini	22af93d851	Vector store support for Cassandra (#6426 ) This addresses #6291, adding support for using Cassandra (and compatible databases, such as DataStax Astra DB) as a [Vector Store](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor(ANN)+Vector+Search+via+Storage-Attached+Indexes). A new class `Cassandra` is introduced, which complies with the contract and interface for a vector store, along with the corresponding integration test, a sample notebook and modified dependency toml. Dependencies: the implementation relies on the library `cassio`, which simplifies interacting with Cassandra for ML- and LLM-oriented workloads. CassIO, in turn, uses the `cassandra-driver` low-level driver to communicate with the database. The former is added as an optional dependency (+ in `extended_testing`), the latter was already in the project. Integration testing relies on a locally-running instance of Cassandra. [Here](https://cassio.org/more_info/#use-a-local-vector-capable-cassandra) a detailed description can be found on how to compile and run it (at the time of writing the feature has not made it yet to a release). During development of the integration tests, I added a new "fake embedding" class for what I consider a more controlled way of testing the MMR search method. Likewise, I had to amend what looked like a glitch in the behaviour of `ConsistentFakeEmbeddings` whereby an `embed_query` call would have bypassed storage of the requested text in the class cache for use in later repeated invocations. @dev2049 might be the right person to tag here for a review. Thank you! --------- Co-authored-by: rlm <pexpresss31@gmail.com>	1 year ago
zhaoshengbo	ab44c24333	Add Alibaba Cloud OpenSearch as a new vector store (#6154 ) Hello Folks, Thanks for creating and maintaining this great project. I'm excited to submit this PR to add Alibaba Cloud OpenSearch as a new vector store. OpenSearch is a one-stop platform to develop intelligent search services. OpenSearch was built based on the large-scale distributed search engine developed by Alibaba. OpenSearch serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. OpenSearch helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises. OpenSearch provides the vector search feature. In specific scenarios, especially test question search and image search scenarios, you can use the vector search feature together with the multimodal search feature to improve the accuracy of search results. This PR includes: - An AlibabaCloudOpenSearch class that can connect to the Alibaba Cloud OpenSearch instance. - Adding embeddings and metadata into an OpenSearch datasource. - Querying by squared Euclidean distance and metadata. - Integration tests. - IPython notebook and docs. I have read your contributing guidelines. And I have passed the tests below - [x] make format - [x] make lint - [x] make coverage - [x] make test --------- Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	1 year ago
Davis Chase	b7ad4c4c30	fix openai qa chain (#6487 )	1 year ago
Harrison Chase	9eec7c3206	Harrison/unstructured page number (#6464 ) Co-authored-by: Reza Sanaie <reza@sanaie.ca>	1 year ago
Grayson Adkins	9f5f747dc3	Fix broken links in autonomous agents docs (#6398 ) Fixes broken links here: https://python.langchain.com/docs/use_cases/autonomous_agents.html #### Who can review? Tag maintainers/contributors who might be interested: Agents / Tools / Toolkits - @hwchase17	1 year ago
volodymyr-memsql	d2e9b621ab	Update SingleStoreDB vectorstore (#6423 ) 1. Introduced new distance strategies support: DOT_PRODUCT and EUCLIDEAN_DISTANCE for enhanced flexibility. 2. Implemented a feature to filter results based on metadata fields. 3. Incorporated connection attributes specifying "langchain python sdk" usage for enhanced traceability and debugging. 4. Expanded the suite of integration tests for improved code reliability. 5. Updated the existing notebook with the usage example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	1 year ago
Harrison Chase	02c0a1e77e	Harrison/functions in retrieval (#6463 )	1 year ago


2344 Commits (eb903e211c315596f0bb8028b6d17c24593d6f47)