langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-11 19:11:02 +00:00

Author	SHA1	Message	Date
berkedilekoglu	f907b62526	Scores are explained in vectorestore docs (#5613 ) # Scores in Vectorestores' Docs Are Explained Following vectorestores can return scores with similar documents by using `similarity_search_with_score`: - chroma - docarray_hnsw - docarray_in_memory - faiss - myscale - qdrant - supabase - vectara - weaviate However, in documents, these scores were either not explained at all or explained in a way that could lead to misunderstandings (e.g., FAISS). For instance in FAISS document: if we consider the score returned by the function as a similarity score, we understand that a document returning a higher score is more similar to the source document. However, since the scores returned by the function are distance scores, we should understand that smaller scores correspond to more similar documents. For the libraries other than Vectara, I wrote the scores they use by investigating from the source libraries. Since I couldn't be certain about the score metric used by Vectara, I didn't make any changes in its documentation. The links mentioned in Vectara's documentation became broken due to updates, so I replaced them with working ones. VectorStores / Retrievers / Memory - @dev2049 my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:49 -07:00
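The distinction this commit documents, that a distance score ranks in the opposite direction from a similarity score, can be illustrated with a minimal sketch. It uses only the standard library and plain Euclidean distance; the function names and the toy vectors are hypothetical and are not the API of FAISS or of any vector store listed above.

```python
import math


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query, docs):
    """Return (name, distance) pairs sorted so the closest document comes first.

    Because the score is a distance, a SMALLER value means a MORE similar
    document, so the sort is ascending.
    """
    scored = [(name, euclidean_distance(query, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy data (assumed for illustration only).
query = [1.0, 0.0]
docs = {"near": [0.9, 0.1], "far": [-1.0, 0.0]}
ranking = rank_by_distance(query, docs)
```

Reading the result as if it were a similarity score (higher is better) would put `far` first, which is the misunderstanding the documentation change is meant to prevent.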
Adil Ansari	233b52735e	feat: Support for `Tigris` Vector Database for vector search (#5703 ) ### Changes - New vector store integration - [Tigris](https://tigrisdata.com) - Adds [tigrisdb](https://pypi.org/project/tigrisdb/) optional dependency - Example notebook demonstrating usage Fixes #5535 Closes tigrisdata/tigris-client-python#40 #### Twitter handles We'd love a shoutout on our [@TigrisData](https://twitter.com/TigrisData) and [@adilansari](https://twitter.com/adilansari) twitter handles #### Who can review? @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:16 -07:00
Edrick Da Corte Henriquez	38dabdbb3a	Update tutorials.md (#5761 ) # Added an overview of LangChain modules Aimed at introducing newcomers to LangChain's main modules :) Twitter handle is @edrick_dch ## Who can review? @eyurtsev	2023-06-05 20:37:11 -07:00
Harrison Chase	25487fa5ee	Harrison/youtube multi language (#5758 ) Co-authored-by: rafly lesmana <raflylesmana111@gmail.com>	2023-06-05 16:38:07 -07:00
M Waleed Kadous	5124c1e0d9	Add aviary support (#5661 ) Aviary is an open source toolkit for evaluating and deploying open source LLMs. You can find out more about it at [http://github.com/ray-project/aviary](http://github.com/ray-project/aviary). You can try it out at [http://aviary.anyscale.com](http://aviary.anyscale.com). This code adds support for Aviary in LangChain. To minimize dependencies, it connects directly to the HTTP endpoint. The current implementation is not accelerated and uses the default implementation of `predict` and `generate`. It includes a test and a simple example. @hwchase17 and @agola11 could you have a look at this? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 16:28:42 -07:00
Leonid Ganeline	87ad4fc4b2	docs: updated `ecosystem/dependents` (#5753 ) updated `ecosystem/dependents` data (it was updated 2+ weeks ago) #### Who can review? @hwchase17 @eyurtsev @dev2049	2023-06-05 16:09:55 -07:00
Leonid Ganeline	92a5f00ffb	docs: `ecosystem/integrations` update 5 (#5752 ) - added missed integration to `docs/ecosystem/integrations/` - updated notebooks to consistent format: changed titles, file names; added descriptions #### Who can review? @hwchase17 @dev2049	2023-06-05 16:08:55 -07:00
Lance Martin	aea090045b	Create OpenAIWhisperParser for generating Documents from audio files (#5580 ) # OpenAIWhisperParser This PR creates a new parser, `OpenAIWhisperParser`, that uses the [OpenAI Whisper model](https://platform.openai.com/docs/guides/speech-to-text/quickstart) to perform transcription of audio files to text (`Documents`). Please see the notebook for usage.	2023-06-05 15:51:13 -07:00
Hao Chen	a4c9053d40	Integrate Clickhouse as Vector Store (#5650 ) #### Description This PR mainly integrates the open source version of ClickHouse as a Vector Store, as it is easy to use both for local development and for adoption of LangChain by enterprises that already have large-scale ClickHouse deployments. ClickHouse is an open source real-time OLAP database with full SQL support and a wide range of functions to assist users in writing analytical queries. Some of these functions and data structures perform distance operations between vectors, [enabling ClickHouse to be used as a vector database](https://clickhouse.com/blog/vector-search-clickhouse-p1). Recently added ClickHouse capabilities like [Approximate Nearest Neighbour (ANN) indices](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes) support faster approximate matching of vectors and provide a promising development aimed to further enhance the vector matching capabilities of ClickHouse. 
In LangChain, some ClickHouse-based commercial variant vector stores like [Chroma](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/chroma.py) and [MyScale](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/myscale.py), etc. are already integrated, but for some enterprises with large-scale ClickHouse cluster deployments, it will be more straightforward to upgrade existing ClickHouse infra instead of moving to another similar vector store solution, so we believe it's a valid requirement to integrate the open source version of ClickHouse as a vector store. As `clickhouse-connect` is already included by other integrations, this PR won't include any new dependencies. #### Before submitting 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md #### Who can review? 
Tag maintainers/contributors who might be interested: @hwchase17 @dev2049 Could you please help review? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 13:32:04 -07:00
George Geddes	019eb13681	Fix a typo in the documentation for the Slack document loader (#5745 ) Fixes a typo I noticed while reading the docs.	2023-06-05 13:30:24 -07:00
kourosh hakhamaneshi	625717daa8	docs: Added Deploying LLMs into production + a new ecosystem (#4047 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Co-authored-by: Kamil Kaczmarek <kaczmarek.poczta@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 12:47:27 -07:00
Jens Madsen	8d9e9e013c	refactor: extract token text splitter function (#5179 ) # Token text splitter for sentence transformers The current TokenTextSplitter only works with OpenAI models via the `tiktoken` package. This is not clear from the name `TokenTextSplitter`. In this (first) PR a token based text splitter for sentence transformer models is added. In the future I think we should work towards injecting a tokenizer into the TokenTextSplitter to make it more flexible. Could perhaps be reviewed by @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-04 14:41:44 -07:00
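The idea of injecting a tokenizer into a token-based splitter, mentioned above as future work, might look roughly like the following sketch. It is not the code from this PR: `split_by_tokens` and its parameters are hypothetical names, and a whitespace tokenizer stands in for `tiktoken` or a sentence-transformers tokenizer.

```python
from typing import Callable, List, Sequence


def split_by_tokens(
    text: str,
    tokenize: Callable[[str], List[str]],
    detokenize: Callable[[Sequence[str]], str],
    chunk_size: int,
    chunk_overlap: int = 0,
) -> List[str]:
    """Split text into chunks of at most `chunk_size` tokens.

    The tokenizer is passed in as a pair of callables, so the splitter
    itself does not depend on any particular model's vocabulary.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    tokens = tokenize(text)
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(detokenize(tokens[start:start + chunk_size]))
        # Stop once the current window already reaches the last token.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Whitespace tokenization is an assumption made purely for illustration.
chunks = split_by_tokens("a b c d e", str.split, " ".join, chunk_size=2)
overlapping = split_by_tokens("a b c d e", str.split, " ".join, chunk_size=3, chunk_overlap=2)
```

Swapping in a different model's tokenizer would then only require passing different `tokenize` and `detokenize` callables.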
Jason Weill	6c11f94013	Retitles Bedrock doc to appear in correct alphabetical order in site nav (#5639 ) Fixes #5638. Retitles "Amazon Bedrock" page to "Bedrock" so that the Integrations section of the left nav is properly sorted in alphabetical order. #### Who can review? Tag maintainers/contributors who might be interested:	2023-06-04 14:39:25 -07:00
Harrison Chase	b9040669a0	Harrison/pipeline prompt (#5540 ) idea is to make prompts more composable	2023-06-04 14:29:37 -07:00
mbchang	d3bdb8ea6d	FileCallbackHandler (#5589 ) # like [StdoutCallbackHandler](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/stdout.py), but writes to a file When running experiments I have found myself wanting to log the outputs of my chains in a more lightweight way than using WandB tracing. This PR contributes a callback handler that writes to file what `StdoutCallbackHandler` would print. ## Example Notebook See the included `filecallbackhandler.ipynb` notebook for usage. Would it be better to include this notebook under `modules/callbacks` or under `integrations/`? ![image](https://github.com/hwchase17/langchain/assets/6439365/c624de0e-343f-4eab-a55b-8808a887489f) ## Who can review? Community members can review the PR once tests pass. 
Tag maintainers/contributors who might be interested: @agola11	2023-06-03 16:48:48 -07:00
rajib	1c51d3db0f	Created fix for 5475 (#5659 ) Created fix for 5475 Currently in PGVector, we do not have any function that returns the instance of an existing store. The `from_documents` method always adds embeddings and then returns the store. This fix adds a function that returns the instance of an existing store. Also changed the Jupyter example for PGVector to show an example of using the function. Fixes # 5475 #### Before submitting #### Who can review? @dev2049 @hwchase17 Tag maintainers/contributors who might be interested: --------- Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 16:47:52 -07:00
Michael Landis	475007d63a	fix: correct momento chat history notebook typo and title (#5646 ) This PR corrects a minor typo in the Momento chat message history notebook and also expands the title from "Momento" to "Momento Chat History", in line with other chat history storage providers. #### Before submitting #### Who can review? cc @dev2049 who reviewed the original integration	2023-06-03 16:39:27 -07:00
Paul-Emile Brotons	92f218207b	removing client+namespace in favor of collection (#5610 ) removing client+namespace in favor of collection for an easier instantiation and to be similar to the typescript library @dev2049	2023-06-03 16:27:31 -07:00
Harrison Chase	ad09367a92	Harrison/pubmed integration (#5664 ) Co-authored-by: younis basher <71520361+younis-ba@users.noreply.github.com> Co-authored-by: Younis Bashir <younis@omicmd.com>	2023-06-03 16:25:28 -07:00
Harrison Chase	9921f8cc3a	Harrison/update azure nb (#5665 ) Co-authored-by: NEWTON MALLICK <38786893+N-E-W-T-O-N@users.noreply.github.com>	2023-06-03 16:25:08 -07:00
C.J. Jameson	4e71a1702b	nit: pgvector python example notebook, fix variable reference (#5595 ) # Your PR Title (What it does) Fixes the pgvector python example notebook : one of the variables was not referencing anything ## Before submitting ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-03 15:29:34 -07:00
Leonid Ganeline	b201cfaa0f	docs `ecosystem/integrations` update 4 (#5590 ) # docs `ecosystem/integrations` update 4 Added missed integrations. Fixed inconsistencies. ## Who can review? @hwchase17 @dev2049	2023-06-03 15:29:03 -07:00
UmerHA	44ad9628c9	QuickFix for FinalStreamingStdOutCallbackHandler: Ignore new lines & white spaces (#5497 ) # Make FinalStreamingStdOutCallbackHandler more robust by ignoring new lines & white spaces `FinalStreamingStdOutCallbackHandler` doesn't work out of the box with `ChatOpenAI`, as it tokenizes slightly differently than `OpenAI`. The response of `OpenAI` contains the tokens `["\nFinal", " Answer", ":"]` while `ChatOpenAI` contains `["Final", " Answer", ":"]`. This PR makes `FinalStreamingStdOutCallbackHandler` more robust by ignoring new lines & white spaces when determining if the answer prefix has been reached. Fixes #5433 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Tracing / Callbacks - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589	2023-06-03 15:05:58 -07:00
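The whitespace-insensitive comparison described in this commit can be sketched as follows. This is an illustration of the idea only, not the handler's actual implementation; `matches_answer_prefix` is a hypothetical helper, and the token lists are the two examples quoted in the message.

```python
from typing import List


def matches_answer_prefix(recent_tokens: List[str], answer_prefix_tokens: List[str]) -> bool:
    """Compare token sequences after stripping surrounding whitespace and newlines.

    Stripping both sides means "\nFinal" and "Final" are treated as the
    same token, so differences in how a model tokenizes leading spaces
    or line breaks no longer prevent a match.
    """
    normalized_recent = [token.strip() for token in recent_tokens]
    normalized_prefix = [token.strip() for token in answer_prefix_tokens]
    return normalized_recent == normalized_prefix


prefix = ["Final", "Answer", ":"]
openai_tokens = ["\nFinal", " Answer", ":"]  # token shape quoted for `OpenAI`
chat_tokens = ["Final", " Answer", ":"]      # token shape quoted for `ChatOpenAI`

openai_match = matches_answer_prefix(openai_tokens, prefix)
chat_match = matches_answer_prefix(chat_tokens, prefix)
```

With an exact string comparison, at most one of the two token shapes could match a given prefix; after normalization both do.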
Felipe Ferreira	ae2cf1f598	Implements support for Personal Access Token Authentication in the ConfluenceLoader (#5385 ) # Implements support for Personal Access Token Authentication in the ConfluenceLoader Fixes #5191 Implements a new optional parameter for the ConfluenceLoader: `token`. This allows the use of personal access authentication when using the on-prem server version of Confluence. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @Jflick58 Twitter Handle: felipe_yyc --------- Co-authored-by: Felipe <feferreira@ea.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 14:57:49 -07:00
mbchang	ce6dbe41a9	minor refactor GenerativeAgentMemory (#5315 ) # minor refactor of GenerativeAgentMemory - refactor `format_memories_detail` to be more reusable - modified prompts for getting topics for reflection and for generating insights - update `characters.ipynb` to reflect changes ## Before submitting ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot @hwchase17 @dev2049	2023-06-03 14:53:14 -07:00
Leonid Ganeline	95c6ed0568	docs: `modules` pages simplified (#5116 ) # docs: modules pages simplified Fixed issue #5627. Merged several repetitive sections in the `modules` pages. Some texts that were hard to understand were also simplified. ## Who can review? @hwchase17 @dev2049	2023-06-03 14:44:32 -07:00
Chandan Routray	bc875a9df1	Fixed multi input prompt for MapReduceChain (#4979 ) # Fixed multi input prompt for MapReduceChain Added `kwargs` support for inner chains of `MapReduceChain` via the `from_params` method. Currently the `from_params` method of initialising a `MapReduceChain` doesn't work if the prompt has multiple inputs. This happens because it uses `StuffDocumentsChain` and `MapReduceDocumentsChain` underneath, both of which require specifying `document_variable_name` if the `prompt` of their `llm_chain` has more than one `input`. With this PR, I have added support for passing their respective `kwargs` via the `from_params` method. ## Fixes https://github.com/hwchase17/langchain/issues/4752 ## Who can review? @dev2049 @hwchase17 @agola11 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-06-03 14:41:03 -07:00
Matt Robinson	a97e4252e3	feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617 ) # Unstructured Excel Loader Adds an `UnstructuredExcelLoader` class for `.xlsx` and `.xls` files. Works with `unstructured>=0.6.7`. A plain text representation of the Excel file will be available under the `page_content` attribute in the doc. If you use the loader in `"elements"` mode, an HTML representation of the Excel file will be available under the `text_as_html` metadata key. Each sheet in the Excel document is its own document. ### Testing ```python from langchain.document_loaders import UnstructuredExcelLoader loader = UnstructuredExcelLoader( "example_data/stanley-cups.xlsx", mode="elements" ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	2023-06-03 12:44:12 -07:00
Davis Chase	d784401215	Dev2049/add argilla callback (#5621 ) Co-authored-by: Alvaro Bartolome <alvarobartt@gmail.com> Co-authored-by: Daniel Vila Suero <daniel@argilla.io> Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com> Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>	2023-06-02 09:05:06 -07:00
Jeff Vestal	d1f65d8dc1	Es knn index search 5346 (#5569 ) # Create elastic_vector_search.ElasticKnnSearch class This extends `langchain/vectorstores/elastic_vector_search.py` by adding a new class `ElasticKnnSearch` Features: - Allow creating an index with the `dense_vector` mapping compatible with kNN search - Store embeddings in index for use with kNN search (correct mapping creates HNSW data structure) - Perform approximate kNN search - Perform hybrid BM25 (`query{}`) + kNN (`knn{}`) search - Perform kNN search by either providing a `query_vector` or passing a hosted `model_id` to use query_vector_builder to automatically generate a query_vector at search time Connection options - Using `cloud_id` from Elastic Cloud - Passing elasticsearch client object Search options - query - k - query_vector - model_id - size - source - knn_boost (hybrid search) - query_boost (hybrid search) - fields This also adds examples to `docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb` Fixes # [5346](https://github.com/hwchase17/langchain/issues/5346) cc: @dev2049 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-02 08:40:35 -07:00
Davis Chase	8b3df18bcc	human approval callback (#5581 ) ![Screenshot 2023-06-01 at 2 39 40 PM](https://github.com/hwchase17/langchain/assets/130488702/769f1480-7e51-46d9-bcde-698d0b091803)	2023-06-02 06:59:33 -07:00
Bharat Ramanathan	28d6277396	docs(integration): update colab and external links in WandbTracing docs (#5602 ) # Update Wandb Tracking documentation This PR updates the Wandb Tracking documentation for formatting, updated broken links and colab notebook links --------- Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>	2023-06-02 02:58:42 -07:00
Davis Chase	4c572ffe95	nit (#5578 )	2023-06-01 14:21:15 -07:00
sseide	001b147450	Documentation fixes (linting and broken links) (#5563 ) # Lint sphinx documentation and fix broken links This PR lints multiple warnings shown in generation of the project documentation (using "make docs_linkcheck" and "make docs_build"). Additionally documentation internal links to (now?) non-existent files are modified to point to existing documents as it seemed the new correct target. The documentation is not updated content wise. There are no source code changes. Fixes # (issue) - broken documentation links to other files within the project - sphinx formatting (linting) ## Before submitting No source code changes, so no new tests added. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 13:06:17 -07:00
Ikko Eltociear Ashimine	14a611775c	Fix typo in docugami.ipynb (#5571 ) # Fix typo in docugami.ipynb Fixed typo. infromation -> information	2023-06-01 11:45:56 -07:00
Davis Chase	6afb463e9b	Qdrant self query (#5567 ) Add self query abilities to qdrant vectorstore	2023-06-01 08:40:31 -07:00
Harrison Chase	342b671d05	add brave search util (#5538 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 01:11:51 -07:00
Davis Chase	983a213bdc	add maxcompute (#5533 ) cc @pengwork (fresh branch, no creds)	2023-06-01 00:54:42 -07:00
Bharat Ramanathan	22603d19e0	feat(integrations): Add WandbTracer (#4521 ) # WandbTracer This PR adds the `WandbTracer` and deprecates the existing `WandbCallbackHandler`. Added an example notebook under the docs section alongside the `LangchainTracer` Here's an example [colab](https://colab.research.google.com/drive/1pY13ym8ENEZ8Fh7nA99ILk2GcdUQu0jR?usp=sharing) with the same notebook and the [trace](https://wandb.ai/parambharat/langchain-tracing/runs/8i45cst6) generated from the colab run Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 00:01:19 -07:00
Leonid Ganeline	373ad49157	docs `ecosystem/integrations` update 3 (#5470 ) # docs: `ecosystem_integrations` update 3 Next cycle of updating the `ecosystem/integrations` * Added an integration `template` file * Added missed integration files * Fixed several document_loaders/notebooks ## Who can review? Is it possible to assign somebody to review PRs on docs? Thanks.	2023-05-31 17:54:05 -07:00
Tobias van der Werff	8d07ba0d51	Fix wrong class instantiation in docs MMR example (#5501 ) # Fix wrong class instantiation in docs MMR example When looking at the Maximal Marginal Relevance ExampleSelector example at https://python.langchain.com/en/latest/modules/prompts/example_selectors/examples/mmr.html, I noticed that there seems to be an error. Initially, the `MaxMarginalRelevanceExampleSelector` class is used as an `example_selector` argument to the `FewShotPromptTemplate` class. Then, according to the text, a comparison is made to regular similarity search. However, the `FewShotPromptTemplate` still uses the `MaxMarginalRelevanceExampleSelector` class, so the output is the same. To fix it, I added an instantiation of the `SemanticSimilarityExampleSelector` class, because this seems to be what is intended. ## Who can review? @hwchase17	2023-05-31 17:30:59 -07:00
Timothy Ji	bd9e0f3934	Add param requests_kwargs for WebBaseLoader (#5485 ) # Add param `requests_kwargs` for WebBaseLoader Fixes # (issue) #5483 ## Who can review? @eyurtsev	2023-05-31 15:27:38 -07:00
Matt Robinson	4c8aad0d1b	docs: unstructured no longer requires installing detectron2 from source (#5524 ) # Update Unstructured docs to remove the `detectron2` install instructions Removes `detectron2` installation instructions from the Unstructured docs because installing `detectron2` is no longer required for `unstructured>=0.7.0`. The `detectron2` model now runs using the ONNX runtime. ## Who can review? @hwchase17 @eyurtsev	2023-05-31 15:03:21 -07:00
Rithwik Ediga Lakhamsani	d765d77e9b	Add minor fixes for PySpark Document Loader Docs (#5525 ) # Add minor fixes for PySpark Document Loader Docs Renamed "PySpack" to "PySpark" and executed the notebook to show outputs.	2023-05-31 15:02:57 -07:00
James O'Dwyer	226a7521ed	Add Managed Motorhead (#5507 ) # Add Managed Motorhead This change enables MotorheadMemory to utilize Metal's managed version of Motorhead. We can easily enable this by passing in an `api_key` and `client_id` in order to hit the managed url and access the memory api on Metal. Twitter: [@softboyjimbo](https://twitter.com/softboyjimbo) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 14:55:41 -07:00
Leonid Ganeline	6b47aaab82	added DeepLearning.AI course link (#5518 ) # added DeepLearning.AI course link ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: not @hwchase17 - hehe	2023-05-31 14:53:14 -07:00
Piyush Jain	562fdfc8f9	Bedrock llm and embeddings (#5464 ) # Bedrock LLM and Embeddings This PR adds a new LLM and an Embeddings class for the [Bedrock](https://aws.amazon.com/bedrock) service. The PR also includes example notebooks for using the LLM class in a conversation chain and embeddings usage in creating an embedding for a query and document. Note: AWS is doing a private release of the Bedrock service on 05/31/2023; users need to request access and be added to an allowlist in order to start using the Bedrock models and embeddings. Please use the [Bedrock Home Page](https://aws.amazon.com/bedrock) to request access and to learn more about the models available in Bedrock.	2023-05-31 07:17:01 -07:00
Harrison Chase	5ce74b5958	code splitter docs (#5480 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 07:11:53 -07:00
Harrison Chase	470b2822a3	Add matching engine vectorstore (#3350 ) Co-authored-by: Tom Piaggio <tomaspiaggio@google.com> Co-authored-by: scafati98 <jupyter@matchingengine.us-central1-a.c.scafati-joonix.internal> Co-authored-by: scafati98 <scafatieugenio@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:28:02 -07:00
Kacper Łukawski	8bcaca435a	Feature: Qdrant filters supports (#5446 ) # Support Qdrant filters Qdrant has an [extensive filtering system](https://qdrant.tech/documentation/concepts/filtering/) with rich type support. This PR makes it possible to use the filters in Langchain by passing an additional param to both the `similarity_search_with_score` and `similarity_search` methods. ## Who can review? @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:26:16 -07:00
Harrison Chase	f72bb966f8	Harrison/html splitter (#5468 ) Co-authored-by: David Revillas <26328973+r3v1@users.noreply.github.com>	2023-05-30 21:06:07 -07:00
Ankush Gola	1671c2afb2	py tracer fixes (#5377 )	2023-05-30 18:47:06 -07:00
Jose Ignacio Hervás Díaz	ce8b7a2a69	SQLite-backed Entity Memory (#5129 ) # SQLite-backed Entity Memory Following the initiative of https://github.com/hwchase17/langchain/pull/2397 I think it would be helpful to be able to persist Entity Memory on disk by default Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 18:39:47 -07:00
Jeff Vestal	46e181aa8b	Allow ElasticsearchEmbeddings to create a connection with ES Client object (#5321 ) This PR adds a new method `from_es_connection` to the `ElasticsearchEmbeddings` class allowing users to use Elasticsearch clusters outside of Elastic Cloud. Users can create an Elasticsearch Client object and pass that to the new function. The returned object is identical to the one returned by calling `from_credentials` ``` # Create Elasticsearch connection es_connection = Elasticsearch( hosts=['https://es_cluster_url:port'], basic_auth=('user', 'password') ) # Instantiate ElasticsearchEmbeddings using es_connection embeddings = ElasticsearchEmbeddings.from_es_connection( model_id, es_connection, ) ``` I also added examples to the elasticsearch jupyter notebook Fixes # https://github.com/hwchase17/langchain/issues/5239 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 17:26:30 -07:00
Leonid Ganeline	1f11f80641	docs: cleaning (#5413 ) # docs cleaning Changed docs to consistent format (probably, we need an official doc integration template): - ClearML - added product descriptions; changed title/headers - Rebuff - added product descriptions; changed title/headers - WhyLabs - added product descriptions; changed title/headers - Docugami - changed title/headers/structure - Airbyte - fixed title - Wolfram Alpha - added descriptions, fixed title - OpenWeatherMap - added product descriptions; changed title/headers - Unstructured - changed description ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @dev2049	2023-05-30 13:58:16 -07:00
ByronHsu	9d658aaa5a	Add more code splitters (go, rst, js, java, cpp, scala, ruby, php, swift, rust) (#5171 ) As the title says, I added more code splitters. The implementation is trivial, so i don't add separate tests for each splitter. Let me know if any concerns. Fixes # (issue) https://github.com/hwchase17/langchain/issues/5170 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @hwchase17 --------- Signed-off-by: byhsu <byhsu@linkedin.com> Co-authored-by: byhsu <byhsu@linkedin.com>	2023-05-30 11:04:05 -04:00
Paul-Emile Brotons	a61b7f7e7c	adding MongoDBAtlasVectorSearch (#5338 ) # Add MongoDBAtlasVectorSearch for the python library Fixes #5337 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 07:59:01 -07:00
Harrison Chase	c4b502a470	Harrison/condense q llm (#5438 )	2023-05-30 07:15:37 -07:00
Lei Xu	ee57054d05	Rename and fix typo in lancedb (#5425 ) # Fix typo in LanceDB notebook filename	2023-05-30 00:24:17 -07:00
Harrison Chase	760632b292	Harrison/spark reader (#5405 ) Co-authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:23:17 -07:00
UmerHA	8259f9b7fa	DocumentLoader for GitHub (#5408 ) # Creates GitHubLoader (#5257) GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub. Fixes #5257 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:11:21 -07:00
German Martin	0b3e0dd1d2	New Trello document loader (#4767 ) # Added New Trello loader class and documentation Simple Loader on top of py-trello wrapper. With a board name you can pull cards and tweak some field parameters on load. I included documentation and examples. Included unit test cases using patch and a fixture for py-trello client class. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 19:47:56 -07:00
Harrison Chase	72f99ff953	Harrison/text splitter (#5417 ) adds support for keeping separators around when using recursive text splitter	2023-05-29 16:56:31 -07:00
小铭	cf5803e44c	Add ToolException that a tool can throw. (#5050 ) # Add ToolException that a tool can throw This is an optional exception that a tool can throw when an execution error occurs. When this exception is thrown, the agent does not stop working; instead, it handles the exception according to the tool's `handle_tool_error` variable, and the processing result is returned to the agent as an observation and printed in pink on the console. It can be used like this: ```python from langchain.schema import ToolException from langchain import LLMMathChain, SerpAPIWrapper, OpenAI from langchain.agents import AgentType, initialize_agent from langchain.chat_models import ChatOpenAI from langchain.tools import BaseTool, StructuredTool, Tool, tool llm = ChatOpenAI(temperature=0) llm_math_chain = LLMMathChain(llm=llm, verbose=True) class Error_tool: def run(self, s: str): raise ToolException('The current search tool is not available.') def handle_tool_error(error) -> str: return "The following errors occurred during tool execution:"+str(error) search_tool1 = Error_tool() search_tool2 = SerpAPIWrapper() tools = [ Tool.from_function( func=search_tool1.run, name="Search_tool1", description="useful for when you need to answer questions about current events.You should give priority to using it.", handle_tool_error=handle_tool_error, ), Tool.from_function( func=search_tool2.run, name="Search_tool2", description="useful for when you need to answer questions about current events", return_direct=True, ) ] agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_tool_errors=handle_tool_error) agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?") ``` ![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada) ## Who can review? 
- @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:05:58 +00:00
Harrison Chase	2da8c48be1	Harrison/datetime parser (#4693 ) Co-authored-by: Jacob Valdez <jacobfv@msn.com> Co-authored-by: Jacob Valdez <jacob.valdez@limboid.ai> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-05-29 07:52:30 -07:00
Leonid Ganeline	1837caa70d	docs: `ecosystem/integrations` update 1 (#5219 ) # docs: ecosystem/integrations update It is the first in a series of `ecosystem/integrations` updates. The ecosystem/integrations list is missing many integrations. I'm adding the missing integrations in a consistent format: 1. description of the integrated system 2. `Installation and Setup` section with `pip install ...`, Key setup, and other necessary settings 3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with links to corresponding examples and imports of the used classes. This PR adds docs for integrations that are present in `docs/modules/models/text_embedding/examples` but missing from `ecosystem/integrations`. The next PRs will cover the next example sections. Also updated `integrations.rst`: added the `Dependencies` section with a link to the packages used in LangChain. ## Who can review? @hwchase17 @eyurtsev @dev2049	2023-05-29 07:25:17 -07:00
Leonid Ganeline	a3598193a0	docs: `ecosystem/integrations` update 2 (#5282 ) # docs: ecosystem/integrations update 2 #5219 - part 1 The second part of this update (parts are independent of each other! no overlap): - added diffbot.md - updated confluence.ipynb; added confluence.md - updated college_confidential.md - updated openai.md - added blackboard.md - added bilibili.md - added azure_blob_storage.md - added azlyrics.md - added aws_s3.md ## Who can review? @hwchase17 @agola11 @vowelparrot @dev2049	2023-05-29 07:19:43 -07:00
Harrison Chase	d6fb25c439	Harrison/prediction guard update (#5404 ) Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>	2023-05-29 07:14:59 -07:00
Harrison Chase	416c8b1da3	Harrison/deep infra (#5403 ) Co-authored-by: Yessen Kanapin <yessenzhar@gmail.com> Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-05-29 07:10:50 -07:00
Timothy Ji	100d6655df	Reformat openai proxy setting as code (#5330 ) # Reformat the openai proxy setting as code Only affect the doc for openai Model - @hwchase17 - @agola11	2023-05-29 07:02:47 -07:00
Oleh Kuznetsov	f6615cac41	Update llamacpp demonstration notebook (#5344 ) # Update llamacpp demonstration notebook Add instructions to install with BLAS backend, and update the example of model usage. Fixes #5071. However, it is more like a prevention of similar issues in the future, not a fix, since there was no problem in the framework functionality ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11	2023-05-29 06:43:26 -07:00
Martin Holecek	44b48d9518	Fix update_document function, add test and documentation. (#5359 ) # Fix for `update_document` Function in Chroma ## Summary This pull request addresses an issue with the `update_document` function in the Chroma class, as described in [#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947). The issue was identified as an `AttributeError` raised when calling `update_document` due to a missing corresponding method in the `Collection` object. This fix refactors the `update_document` method in `Chroma` to correctly interact with the `Collection` object. ## Changes 1. Fixed the `update_document` method in the `Chroma` class to correctly call methods on the `Collection` object. 2. Added the corresponding test `test_chroma_update_document` in `tests/integration_tests/vectorstores/test_chroma.py` to reflect the updated method call. 3. Added an example and explanation of how to use the `update_document` function in the Jupyter notebook tutorial for Chroma. ## Test Plan All existing tests pass after this change. In addition, the `test_chroma_update_document` test case now correctly checks the functionality of `update_document`, ensuring that the function works as expected and updates the content of documents correctly. ## Reviewers @dev2049 This fix will ensure that users are able to use the `update_document` function as expected, without encountering the previous `AttributeError`. This will enhance the usability and reliability of the Chroma class for all users. Thank you for considering this pull request. I look forward to your feedback and suggestions.	2023-05-29 06:39:25 -07:00
Janos Tolgyesi	5f4552391f	Add SKLearnVectorStore (#5305 ) # Add SKLearnVectorStore This PR adds SKLearnVectorStore, a simple vector store based on NearestNeighbors implementations in the scikit-learn package. This provides a simple drop-in vector store implementation with minimal dependencies (scikit-learn is typically installed in a data scientist / ml engineer environment). The vector store can be persisted and loaded from json, bson and parquet format. SKLearnVectorStore has a soft (dynamic) dependency on the scikit-learn, numpy and pandas packages. Persisting to bson requires the bson package, persisting to parquet requires the pyarrow package. ## Before submitting Integration tests are provided under `tests/integration_tests/vectorstores/test_sklearn.py` Sample usage notebook is provided under `docs/modules/indexes/vectorstores/examples/sklear.ipynb` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-28 08:17:42 -07:00
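The idea behind the SKLearnVectorStore commit above (a small nearest-neighbor store that can be persisted to JSON) can be illustrated without any third-party packages. This is only a rough sketch under stated assumptions: it swaps in a brute-force cosine-distance search in pure Python in place of scikit-learn's `NearestNeighbors`, and the class `TinyVectorStore` and all of its methods are hypothetical names, not the `SKLearnVectorStore` API.

```python
import json
import math
from typing import List, Tuple


class TinyVectorStore:
    """Hypothetical in-memory store: brute-force search plus JSON persistence."""

    def __init__(self) -> None:
        self.texts: List[str] = []
        self.vectors: List[List[float]] = []

    def add(self, text: str, vector: List[float]) -> None:
        self.texts.append(text)
        self.vectors.append(vector)

    @staticmethod
    def _cosine_distance(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        # Treat zero-length vectors as maximally distant instead of dividing by zero.
        return 1.0 - dot / norm if norm else 1.0

    def search(self, query: List[float], k: int = 1) -> List[Tuple[str, float]]:
        # Smaller distance means a closer match, so sort ascending.
        scored = [
            (text, self._cosine_distance(query, vector))
            for text, vector in zip(self.texts, self.vectors)
        ]
        return sorted(scored, key=lambda pair: pair[1])[:k]

    def to_json(self) -> str:
        return json.dumps({"texts": self.texts, "vectors": self.vectors})

    @classmethod
    def from_json(cls, payload: str) -> "TinyVectorStore":
        data = json.loads(payload)
        store = cls()
        store.texts = data["texts"]
        store.vectors = data["vectors"]
        return store


store = TinyVectorStore()
store.add("cat", [1.0, 0.0])
store.add("dog", [0.0, 1.0])

# Round-trip through JSON, then query the restored copy.
restored = TinyVectorStore.from_json(store.to_json())
best_text, best_distance = restored.search([0.9, 0.1], k=1)[0]
print(best_text)
```

A real store would delegate the search to an indexed nearest-neighbor implementation; the brute-force loop here only keeps the example dependency-free.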
Kenton	881dfe8179	Sample Notebook for DynamoDB Chat Message History (#5351 ) # Sample Notebook for DynamoDB Chat Message History @dev2049 Adding a sample notebook for the DynamoDB Chat Message History class.	2023-05-27 21:16:24 -07:00
DanConstantini	c49c6ac97a	Add Chainlit to deployment options (#5314 ) # Add Chainlit to deployment options Add [Chainlit](https://github.com/Chainlit/chainlit) as deployment options Used links to Github examples and Chainlit doc on the LangChain integration Co-authored-by: Dan Constantini <danconstantini@Dan-Constantini-MacBook.local>	2023-05-27 21:12:53 -07:00
Harrison Chase	179ddbe88b	add enum output parser (#5165 )	2023-05-27 20:58:23 -07:00
Leonid Ganeline	465a970724	docs: added link to LangChain Handbook (#5311 ) # added a link to LangChain Handbook ## Who can review? Community members can review the PR once tests pass.	2023-05-27 20:57:40 -07:00
Russ	6e974b5f04	Fix typos (#5323 ) # Documentation typo fixes Fixes # (issue) Simple typos in the blockchain .ipynb documentation	2023-05-26 18:55:21 -07:00
Michael Landis	f75f0dbad6	docs: improve flow of llm caching notebook (#5309 ) # docs: improve flow of llm caching notebook The notebook `llm_caching` demos various caching providers. In the previous version, there was setup common to all examples but under the `In Memory Caching` heading. If a user comes and only wants to try a particular example, they will run the common setup, then the cells for the specific provider they are interested in. Then they will get import and variable reference errors. This commit moves the common setup to the top to avoid this. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-26 13:34:11 -04:00
Shukri	58e95cd11e	Better docs for weaviate hybrid search (#5290 ) # Better docs for weaviate hybrid search Fixes: NA ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-26 09:30:41 -07:00
Xiangrui Meng	aec642febb	LLM wrapper for Databricks (#5142 ) This PR adds LLM wrapper for Databricks. It supports two endpoint types: * serving endpoint * cluster driver proxy app An integration notebook is included to show how it works. Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Gengliang Wang <gengliang@apache.org> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:19:37 -07:00
Ted Martinez	1cb6498fdb	Tedma4/twilio tool (#5136 ) # Add twilio sms tool --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:19:22 -07:00
Moonsik Kang	a0281f5acb	Fixed typo: 'ouput' to 'output' in all documentation (#5272 ) # Fixed typo: 'ouput' to 'output' in all documentation In this instance, the typo 'ouput' was amended to 'output' in all occurrences within the documentation. There are no dependencies required for this change.	2023-05-25 19:18:31 -07:00
Michael Landis	7047a2c1af	feat: add Momento as a standard cache and chat message history provider (#5221 ) # Add Momento as a standard cache and chat message history provider This PR adds Momento as a standard caching provider. Implements the interface, adds integration tests, and documentation. We also add Momento as a chat history message provider along with integration tests, and documentation. [Momento](https://www.gomomento.com/) is a fully serverless cache. Similar to S3 or DynamoDB, it requires zero configuration or infrastructure management and is instantly available. Users sign up for free and get 50GB of data in/out for free every month. ## Before submitting ✅ We have added documentation, notebooks, and integration tests demonstrating usage. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:13:21 -07:00
Nicholas Liu	7652d2abb0	Add Multi-CSV/DF support in CSV and DataFrame Toolkits (#5009 ) Add Multi-CSV/DF support in CSV and DataFrame Toolkits * CSV and DataFrame toolkits now accept list of CSVs/DFs * Add default prompts for many dataframes in `pandas_dataframe` toolkit Fixes #1958 Potentially fixes #4423 ## Testing * Add single and multi-dataframe integration tests for `pandas_dataframe` toolkit with permutations of `include_df_in_prompt` * Add single and multi-CSV integration tests for csv toolkit --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-25 14:23:11 -07:00
Ravindra Marella	b3988621c5	Add C Transformers for GGML Models (#5218 ) # Add C Transformers for GGML Models I created Python bindings for the GGML models: https://github.com/marella/ctransformers Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See [Supported Models](https://github.com/marella/ctransformers#supported-models). It provides a unified interface for all models: ```python from langchain.llms import CTransformers llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2') print(llm('AI is going to')) ``` It can be used with models hosted on the Hugging Face Hub: ```py llm = CTransformers(model='marella/gpt-2-ggml') ``` It supports streaming: ```py from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()]) ``` Please see [README](https://github.com/marella/ctransformers#readme) for more details. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 13:42:44 -07:00
Davis Chase	ca88b25da6	Zep sdk version (#5267 ) zep-python's sync methods no longer need an asyncio wrapper. This was causing issues with FastAPI deployment. Zep also now supports putting and getting of arbitrary message metadata. Bump zep-python version to v0.30 Remove nest-asyncio from Zep example notebooks. Modify tests to include metadata. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-05-25 13:42:10 -07:00
Janil Wörst	5525602df0	Docs link custom agent page in getting started (#5250 ) # Docs: link custom agent page in getting started	2023-05-25 13:11:30 -07:00
Davis Chase	3be9ba14f3	OpenSearch top k parameter fix (#5216 ) For most queries it's the `size` parameter that determines final number of documents to return. Since our abstractions refer to this as `k`, set this to be `k` everywhere instead of expecting a separate param. Would be great to have someone more familiar with OpenSearch validate that this is reasonable (e.g. that having `size` and what OpenSearch calls `k` be the same won't lead to any strange behavior). cc @naveentatikonda Closes #5212	2023-05-25 09:51:23 -07:00
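The `size`/`k` relationship described in the OpenSearch fix above can be sketched as a plain request body. This is a minimal sketch, not the code from the PR: the helper `build_knn_query` and the field name `vector_field` are hypothetical, and it assumes the approximate k-NN query shape used by the OpenSearch k-NN plugin.

```python
from typing import Any, Dict, List


def build_knn_query(
    query_vector: List[float], k: int = 4, vector_field: str = "vector_field"
) -> Dict[str, Any]:
    """Build an approximate k-NN search body in which `size` mirrors `k`."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return {
        # `size` caps the number of hits returned to the caller.
        "size": k,
        # The k-NN clause carries its own `k`; here both are set to the same value.
        "query": {"knn": {vector_field: {"vector": query_vector, "k": k}}},
    }


body = build_knn_query([0.1, 0.2, 0.3], k=2)
print(body["size"], body["query"]["knn"]["vector_field"]["k"])
```

Deriving both values from one argument means a caller cannot ask for `k` neighbors and then receive a different number of documents because of a separate `size` default.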
Yves Maurer	88ed8e1cd6	Added the option of specifying a proxy for the OpenAI API (#5246 ) # Added the option of specifying a proxy for the OpenAI API Fixes #5243 Co-authored-by: Yves Maurer <>	2023-05-25 09:50:25 -07:00
mwinterde	9c0cb90997	Resolve error in StructuredOutputParser docs (#5240 ) # Resolve error in StructuredOutputParser docs Documentation for `StructuredOutputParser` is currently not reproducible; that is, `output_parser.parse(output)` raises an error because the LLM returns a response with an invalid format ```python _input = prompt.format_prompt(question="what's the capital of france") output = model(_input.to_string()) output # ? # # ```json # { # "answer": "Paris", # "source": "https://www.worldatlas.com/articles/what-is-the-capital-of-france.html" # } # ``` ``` This was fixed by adding a question mark to the prompt	2023-05-25 07:47:25 -07:00
Shukri	09e246f306	Weaviate: Add QnA with sources example (#5247 ) # Add QnA with sources example Fixes: see https://stackoverflow.com/questions/76207160/langchain-doesnt-work-with-weaviate-vector-database-getting-valueerror/76210017#76210017 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-25 09:58:33 -04:00
Archon	5cdd9ab7e1	Add MiniMax embeddings (#5174 ) - Add support for MiniMax embeddings Doc: [MiniMax embeddings](https://api.minimax.chat/document/guides/embeddings?id=6464722084cdc277dfaa966a) --------- Co-authored-by: Archon <archongum@outlook.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 06:57:49 -07:00
Eugene Yurtsev	5cfa72a130	Bibtex integration for document loader and retriever (#5137 ) # Bibtex integration Wrap bibtexparser to retrieve a list of docs from a bibtex file. * Get the metadata from the bibtex entries * `page_content` get from the local pdf referenced in the `file` field of the bibtex entry using `pymupdf` * If no valid pdf file, `page_content` set to the `abstract` field of the bibtex entry * Support Zotero flavour using regex to get the file path * Added usage example in `docs/modules/indexes/document_loaders/examples/bibtex.ipynb` --------- Co-authored-by: Sébastien M. Popoff <sebastien.popoff@espci.fr> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 00:21:31 -07:00
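The content-selection rule in the Bibtex commit above (use the PDF referenced in the `file` field when it exists, otherwise fall back to the `abstract` field) can be sketched as follows. This is a sketch under assumptions, not the loader's implementation: `pick_page_content` and the `read_pdf` callable are hypothetical names, the entry is modeled as a plain dict, and the Zotero-specific path parsing mentioned in the commit is left out.

```python
import os
from typing import Callable, Dict


def pick_page_content(entry: Dict[str, str], read_pdf: Callable[[str], str]) -> str:
    """Return PDF text when the entry points at a readable PDF, else the abstract."""
    path = entry.get("file", "")
    if path.lower().endswith(".pdf") and os.path.isfile(path):
        # `read_pdf` stands in for whatever PDF text extractor is used.
        return read_pdf(path)
    # No valid PDF on disk: fall back to the abstract (empty string if absent).
    return entry.get("abstract", "")


entry = {
    "title": "Example entry",
    "file": "/no/such/file.pdf",
    "abstract": "A short abstract.",
}
content = pick_page_content(entry, read_pdf=lambda path: "pdf text")
print(content)
```

Passing the extractor in as a callable keeps the fallback logic testable without a PDF library installed.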
Keno	eff31a3361	Remove API key from docs (#5223 ) I found an API key for `serpapi_api_key` while reading the docs. It seems to have been modified very recently. Removed it in this PR @hwchase17 - project lead	2023-05-24 22:25:39 -07:00
Leonid Ganeline	2ad29f410d	fix a mistake in concepts.md (#5222 ) # fix a mistake in concepts.md ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:	2023-05-24 21:47:22 -07:00
Harrison Chase	a775aa6389	Harrison/vertex (#5049 ) Co-authored-by: Leonid Kuligin <kuligin@google.com> Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru> Co-authored-by: sasha-gitg <44654632+sasha-gitg@users.noreply.github.com> Co-authored-by: Justin Flick <Justinjayflick@gmail.com> Co-authored-by: Justin Flick <jflick@homesite.com>	2023-05-24 15:51:12 -07:00
Davis Chase	dcee8936c1	nit (#5208 )	2023-05-24 12:52:20 -07:00
Alon Diament	44abe925df	Add Joplin document loader (#5153 ) # Add Joplin document loader [Joplin](https://joplinapp.org/) is an open source note-taking app. Joplin has a [REST API](https://joplinapp.org/api/references/rest_api/) for accessing its local database. The proposed `JoplinLoader` uses the API to retrieve all notes in the database and their metadata. Joplin needs to be installed and running locally, and an access token is required. - The PR includes an integration test. - The PR includes an example notebook. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 12:31:55 -07:00
Rodrigo Siqueira	f10be072ff	Add Iugu document loader (#5162 ) Create IUGU loader --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 11:47:01 -07:00


1138 Commits