langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-11 19:11:02 +00:00

Author	SHA1	Message	Date
Rohan Dey	41a4c06a94	Added support for a Pandas DataFrame OutputParser (#13257 ) Description: Added support for a Pandas DataFrame OutputParser with format instructions, along with unit tests and a demo notebook. Namely, we've added the ability to request data from a DataFrame, have the LLM parse the request, and then use that request to retrieve a well-formatted response. Within LangChain, it seamlessly integrates with language models like OpenAI's `text-davinci-003`, facilitating streamlined interaction using the format instructions (just like the other output parsers). This parser structures its requests as `<operation/column/row>[<optional_array_params>]`. The instructions detail permissible operations, valid columns, and array formats, ensuring clarity and adherence to the required format. For example: - When the LLM receives the input: "Retrieve the mean of `num_legs` from rows 1 to 3." - The provided format instructions guide the LLM to structure the request as: "mean:num_legs[1..3]". The parser processes this formatted request, leveraging the LLM's understanding to extract the mean of `num_legs` from rows 1 to 3 within the Pandas DataFrame. This integration allows users to communicate requests naturally, with the LLM transforming these instructions into structured commands understood by the `PandasDataFrameOutputParser`. The format instructions act as a bridge between natural language queries and precise DataFrame operations, optimizing communication and data retrieval. Issue: - https://github.com/langchain-ai/langchain/issues/11532 Dependencies: No additional dependencies :) Tag maintainer: @baskaryan Twitter handle: No need. :) --------- Co-authored-by: Wasee Alam <waseealam@protonmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 22:08:50 -05:00
Masanori Taniguchi	235bdb9fa7	Support Vald secure connection (#13269 ) Description: When using Vald, only insecure grpc connection was supported, so secure connection is now supported. In addition, grpc metadata can be added to Vald requests to enable authentication with a token. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-29 22:07:29 -05:00
sudranga	d1d693b2a7	Fix issue where response_if_no_docs_found is not implemented on async… (#13297 ) Response_if_no_docs_found is not implemented in ConversationalRetrievalChain for async code paths. Implemented it and added test cases Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 22:06:13 -05:00
AthulVincent	67c55cb5b0	Implemented MongoDB Atlas Self-Query Retriever (#13321 ) # Description This PR implements Self-Query Retriever for MongoDB Atlas vector store. I've implemented the comparators and operators that are supported by MongoDB Atlas vector store according to the section titled "Atlas Vector Search Pre-Filter" from https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/. Namely: ``` allowed_comparators = [ Comparator.EQ, Comparator.NE, Comparator.GT, Comparator.GTE, Comparator.LT, Comparator.LTE, Comparator.IN, Comparator.NIN, ] """Subset of allowed logical operators.""" allowed_operators = [ Operator.AND, Operator.OR ] ``` Translations from comparators/operators to MongoDB Atlas filter operators(you can find the syntax in the "Atlas Vector Search Pre-Filter" section from the previous link) are done using the following dictionary: ``` map_dict = { Operator.AND: "$and", Operator.OR: "$or", Comparator.EQ: "$eq", Comparator.NE: "$ne", Comparator.GTE: "$gte", Comparator.LTE: "$lte", Comparator.LT: "$lt", Comparator.GT: "$gt", Comparator.IN: "$in", Comparator.NIN: "$nin", } ``` In visit_structured_query() the filters are passed as "pre_filter" and not "filter" as in the MongoDB link above since langchain's implementation of MongoDB atlas vector store(libs\langchain\langchain\vectorstores\mongodb_atlas.py) in _similarity_search_with_score() sets the "filter" key to have the value of the "pre_filter" argument. ``` params["filter"] = pre_filter ``` Test cases and documentation have also been added. # Issue #11616 # Dependencies No new dependencies have been added. # Documentation I have created the notebook mongodb_atlas_self_query.ipynb outlining the steps to get the self-query mechanism working. I worked closely with [@Farhan-Faisal](https://github.com/Farhan-Faisal) on this PR. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 22:05:06 -05:00
Josef Zoller	c2e3963da4	Merriam-Webster Dictionary Tool (#12044 ) # Description We implemented a simple tool for accessing the Merriam-Webster Collegiate Dictionary API (https://dictionaryapi.com/products/api-collegiate-dictionary). Here's a simple usage example: ```py from langchain.llms import OpenAI from langchain.agents import load_tools, initialize_agent, AgentType llm = OpenAI() tools = load_tools(["serpapi", "merriam-webster"], llm=llm) # Serp API gives our agent access to Google agent = initialize_agent( tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True ) agent.run("What is the english word for the german word Himbeere? Define that word.") ``` Sample output: ``` > Entering new AgentExecutor chain... I need to find the english word for Himbeere and then get the definition of that word. Action: Search Action Input: "English word for Himbeere" Observation: {'type': 'translation_result'} Thought: Now I have the english word, I can look up the definition. Action: MerriamWebster Action Input: raspberry Observation: Definitions of 'raspberry': 1. rasp-ber-ry, noun: any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries 2. rasp-ber-ry, noun: a perennial plant (genus Rubus) of the rose family that bears raspberries 3. rasp-ber-ry, noun: a sound of contempt made by protruding the tongue between the lips and expelling air forcibly to produce a vibration; broadly : an expression of disapproval or contempt 4. black raspberry, noun: a raspberry (Rubus occidentalis) of eastern North America that has a purplish-black fruit and is the source of several cultivated varieties —called also blackcap Thought: I now know the final answer. Final Answer: Raspberry is an english word for Himbeere and it is defined as any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries. > Finished chain. ``` # Issue This closes #12039. # Dependencies We added no extra dependencies. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Lara <63805048+larkgz@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 20:28:29 -05:00
Mohammad Mohtashim	f3dd4a10cf	DROP BOX Loader Documentation Update (#14047 ) - Description: Update the document for drop box loader + made the messages more verbose when loading pdf file since people were getting confused - Issue: #13952 - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17, --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-29 17:25:35 -08:00
Cheng (William) Huang	a00db4b28f	Add multi-input Reddit search tool (#13893 ) - Description: Added a tool called RedditSearchRun and an accompanying API wrapper, which searches Reddit for posts with support for time filtering, post sorting, query string and subreddit filtering. - Issue: #13891 - Dependencies: `praw` module is used to search Reddit - Tag maintainer: @baskaryan , and any of the other maintainers if needed - Twitter handle: None. Hello, This is our first PR and we hope that our changes will be helpful to the community. We have run `make format`, `make lint` and `make test` locally before submitting the PR. To our knowledge, our changes do not introduce any new errors. Our PR integrates the `praw` package which is already used by RedditPostsLoader in LangChain. Nonetheless, we have added integration tests and edited unit tests to test our changes. An example notebook is also provided. These changes were put together by me, @Anika2000, @CharlesXu123, and @Jeremy-Cheng-stack Thank you in advance to the maintainers for their time. --------- Co-authored-by: What-Is-A-Username <49571870+What-Is-A-Username@users.noreply.github.com> Co-authored-by: Anika2000 <anika.sultana@mail.utoronto.ca> Co-authored-by: Jeremy Cheng <81793294+Jeremy-Cheng-stack@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 20:16:40 -05:00
Jawad Arshad	00a6e8962c	langchain[minor]: Add serpapi tools (#13934 ) - Description: Added some of the more endpoints supported by serpapi that are not suported on langchain at the moment, like google trends, google finance, google jobs, and google lens - Issue: [Add support for many of the querying endpoints with serpapi #11811](https://github.com/langchain-ai/langchain/issues/11811) --------- Co-authored-by: zushenglu <58179949+zushenglu@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Ian Xu <ian.xu@mail.utoronto.ca> Co-authored-by: zushenglu <zushenglu1809@gmail.com> Co-authored-by: KevinT928 <96837880+KevinT928@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 14:02:57 -08:00
h3l	dbaeb163aa	langchain[minor]: add volcengine endpoint as LLM (#13942 ) - Description: Volc Engine MaaS serves as an enterprise-grade, large-model service platform designed for developers. You can visit its homepage at https://www.volcengine.com/docs/82379/1099455 for details. This change will facilitate developers to integrate quickly with the platform. - Issue: None - Dependencies: volcengine - Tag maintainer: @baskaryan - Twitter handle: @he1v3tica --------- Co-authored-by: lvzhong <lvzhong@bytedance.com>	2023-11-29 13:16:42 -08:00
Mohammad Ahmad	1600ebe6c7	langchain[patch]: Mask API key for ForeFrontAI LLM (#14013 ) - Description: Mask API key for ForeFrontAI LLM and associated unit tests - Issue: https://github.com/langchain-ai/langchain/issues/12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: `__mmahmad__` I made the API key non-optional since linting required adding validation for None, but the key is required per documentation: https://python.langchain.com/docs/integrations/llms/forefrontai	2023-11-29 13:12:19 -08:00
yoch	a0e859df51	langchain[patch]: fix cohere reranker init #12899 (#14029 ) - Description: use post field validation for `CohereRerank` - Issue: #12899 and #13058 - Dependencies: - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 12:57:06 -08:00
123-fake-st	9bd6e9df36	update pdf document loaders' metadata source to url for online pdf (#13274 ) - Description: Update 5 pdf document loaders in `langchain.document_loaders.pdf`, to store a url in the metadata (instead of a temporary, local file path) if the user provides a web path to a pdf: `PyPDFium2Loader`, `PDFMinerLoader`, `PDFMinerPDFasHTMLLoader`, `PyMuPDFLoader`, and `PDFPlumberLoader` were updated. - The updates follow the approach used to update `PyPDFLoader` for the same behavior in #12092 - The `PyMuPDFLoader` changes required additional work in updating `langchain.document_loaders.parsers.pdf.PyMuPDFParser` to be able to process either an `io.BufferedReader` (from local pdf) or `io.BytesIO` (from online pdf) - The `PDFMinerPDFasHTMLLoader` change used a simpler approach since the metadata is assigned by the loader and not the parser - Issue: Fixes #7034 - Dependencies: None ```python # PyPDFium2Loader example: # old behavior >>> from langchain.document_loaders import PyPDFium2Loader >>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': '/var/folders/7z/d5dt407n673drh1f5cm8spj40000gn/T/tmpm5oqa92f/tmp.pdf', 'page': 0} # new behavior >>> from langchain.document_loaders import PyPDFium2Loader >>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0} ```	2023-11-29 15:07:46 -05:00
Toshish Jawale	6f64cb5078	Remove deprecated param and flexibility for prompt (#13310 ) - Description: Updated to remove deprecated parameter penalty_alpha, and use string variation of prompt rather than json object for better flexibility. - Issue: the issue # it fixes (if applicable), - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: @symbldotai --------- Co-authored-by: toshishjawale <toshish@symbl.ai> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 14:48:25 -05:00
Tomaz Bratanic	3eb391561b	langchain[minor]: Reduce the number of tokens required to describe a Cypher/Neo4j schema (#13851 ) Instead of using JSON-like syntax to describe node and relationship properties we changed to a shorter and more concise schema description Old: ``` Node properties are the following: [{'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Movie'}, {'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Actor'}] Relationship properties are the following: [] The relationships are the following: ['(:Actor)-[:ACTED_IN]->(:Movie)'] ``` New: ``` Node properties are the following: Movie {name: STRING},Actor {name: STRING} Relationship properties are the following: The relationships are the following: (:Actor)-[:ACTED_IN]->(:Movie) ```	2023-11-29 11:13:12 -08:00
Sauhaard	7ec4dbeb80	langchain[minor]: Add StackExchange API integration (#14002 ) Implements [#12115](https://github.com/langchain-ai/langchain/issues/12115) Who can review? @baskaryan , @eyurtsev , @hwchase17 Integrated Stack Exchange API into Langchain, enabling access to diverse communities within the platform. This addition enhances Langchain's capabilities by allowing users to query Stack Exchange for specialized information and engage in discussions. The integration provides seamless interaction with Stack Exchange content, offering content from varied knowledge repositories. A notebook example and test cases were included to demonstrate the functionality and reliability of this integration. - Add StackExchange as a tool. - Add unit test for the StackExchange wrapper and tool. - Add documentation for the StackExchange wrapper and tool. If you have time, could you please review the code and provide any feedback as necessary! My team is welcome to any suggestions. --------- Co-authored-by: Yuval Kamani <yuvalkamani@gmail.com> Co-authored-by: Aryan Thakur <aryanthakur@Aryans-MacBook-Pro.local> Co-authored-by: Manas1818 <79381912+manas1818@users.noreply.github.com> Co-authored-by: aryan-thakur <61063777+aryan-thakur@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 10:32:07 -08:00
Bagatur	d4405bc94e	langchain[patch]: Release 0.0.343 (#14037 )	2023-11-29 10:31:03 -08:00
Yves Zumbühl	9c0ad0cebb	langchain[patch]: Improve HyDe with custom prompts and ability to supply the run_manager (#14016 ) - Description: The class allows to only select between a few predefined prompts from the paper. That is not ideal, since other use cases might need a custom prompt. The changes made allow for this. To be able to monitor those, I also added functionality to supply a custom run_manager. - Issue: no issue, but a new feature, - Dependencies: none, - Tag maintainer: @hwchase17, - Twitter handle: @yvesloy --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 09:40:53 -08:00
Chad Norvell	1c4bfb8c5f	langchain[patch]: Mathpix PDF loader supports arbitrary extra params (#13950 ) - Description: Support providing whatever extra parameters you want to the Mathpix PDF loader API request. - Issue: #12773 - Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-29 02:12:32 -08:00
Unai Garay Maestre	9e2ae866c4	langchain[patch]: Adds progress bar to GooglePalmEmbeddings (#13812 ) - Description: Adds a tqdm progress bar to GooglePalmEmbeddings when embedding a list. - Issue: #13637 - Dependencies: TQDM as a main dependency (instead of extra) Signed-off-by: ugm2 <unaigaraymaestre@gmail.com> --------- Signed-off-by: ugm2 <unaigaraymaestre@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-29 01:58:53 -08:00
David Norman	a578076aea	Mask api key for Together LLM (#13981 ) - Description: Add unit tests and mask api key for Together LLM - Issue: the issue https://github.com/langchain-ai/langchain/issues/12165 , - Dependencies: N/A - Tag maintainer: ?, - Twitter handle: N/A --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-28 22:57:40 -05:00
Johnny	6463d2d0bd	small fix matching engine AttributeError - object has no attribute (#13763 ) This PR is fixing an attributeError: object endpoint has no attribute "_public_match_client" when using gcp matching engine with private VPC network. @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-28 22:42:29 -05:00
Amyh102	750485eaa8	Add object parsing functionality (#13864 ) * Description: Parses huggingface dataset Sequence objects into strings for Document loading. * Issue: Fixes #10674 * Tag maintainter: @baskaryan @eyurtsev --------- Co-authored-by: Amy Han <amyhan@Amys-Air.lan> Co-authored-by: Amy Han <amyhan@Amys-MacBook-Air.local>	2023-11-28 22:33:16 -05:00
ggeutzzang	981f78f920	Fix: (issue #13825 ) Getting an error with DallEAPIWrapper (#13874 ) - Description: As of OpenAI's Python package 1.0, the existing DallEAPIWrapper does not work correctly, so the example in the LangChain Documentation link below does not work either. https://python.langchain.com/docs/integrations/tools/dalle_image_generator Also, since OpenAI only supports DALL-E version 2 or version 3, I modified the DallEAPIWrapper to support it. - Issue: #13825 - Twitter handle: ggeutzzang	2023-11-28 22:31:25 -05:00
Kunal	74045bf5c0	max length attribute for spacy splitter for large docs (#13875 ) For large size documents spacy splitter doesn't work it throws an error as shown in below screenshot. Reason its default max_length is 1000000 and there is no option to increase it. So i added it in this PR. ![image](https://github.com/langchain-ai/langchain/assets/73680423/613625c3-0e21-4834-9aad-2a73cf56eecc) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-28 22:30:26 -05:00
Wang Wei	fe9341a29c	feat: Add ERNIE-Bot-8K model support for ErnieBotChat. (#13716 ) - Description: According to the document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/6lp69is2a, add ERNIE-Bot-8K model support for ErnieBotChat. - Dependencies: Before using the ERNIE-Bot-8K, you should have the model's access authority.	2023-11-28 22:22:23 -05:00
Burak Ömür	0e462b72ef	Update openai/create_llm_result function to consider kwargs (#13815 ) Replace this entire comment with: - Description: updates `create_llm_result` function within `openai.py` to consider latest `params`, - Issue: #8928 - Dependencies: -, - Tag maintainer: - - Twitter handle: [burkomr](https://twitter.com/burkomr) <!-- If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Burak Ömür <burakomur@retorio.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-28 22:02:38 -05:00
chyroc	f97ab84c6b	Merge pull request #13907 * feat: mask api_key for jina	2023-11-28 21:24:50 -05:00
nhywieza	9b86fb3fcb	secretStr for baichuan chat model api key (#13946 ) Merge pull request #13946 * secretStr for baichuan chat model api key	2023-11-28 21:20:23 -05:00
卢靖轩	aff1dba252	Merge pull request #13945 * feat: mask api key for nlpcloud	2023-11-28 21:16:36 -05:00
Leonid Kuligin	85bb3a418c	Switched VertexAI models from preview (#13657 ) Replace this entire comment with: - Description: VertexAI models are now GA, moved away from using preview ones from the SDK - Issue: #13606 --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-28 20:38:04 -05:00
Erick Friis	5eca1bd93f	Library Licenses (#13300 ) Same change as #8403 but in other libs also updates (c) LangChain Inc. instead of @hwchase17	2023-11-28 17:34:27 -08:00
Bagatur	14799b139a	infra[patch]: add base deps and fix docs lint (#13998 )	2023-11-28 17:27:37 -08:00
Théo LEBRUN	926d4cfda7	Set default region from boto3 session for Bedrock (#13694 ) - Description: Set default region from boto3 session for Bedrock - Issue: #13683	2023-11-28 20:26:54 -05:00
Snow	1a33e5b500	Repair Wikipedia document loader `load_max_docs` and improve test coverage. (#13769 ) Description: Repair Wikipedia document loader `load_max_docs` and improve test coverage. Issue: The Wikipedia document loader was not respecting the `load_max_docs` paramater (not reported) and would always return a maximum of 10 documents. This is because the API wrapper (in `utilities/wikipedia.py`) wasn't passing `top_k_results` to the underlying [Wikipedia library](https://wikipedia.readthedocs.io/en/latest/code.html#module-wikipedia). By default this library returns 10 results. The default number of results for the document loader has been reduced from 100 to 25. This is because loading 100 results takes a very long time and is an inconvenient default. It should possibly be 10. In addition, the documentation for the loader reported that there was a hard limit (300) on the number of documents returned. In actuality 300 is the maximum Wikipedia query character length set by the API wrapper. Tests have been added for the document loader (previously missing) and to test the correct numbers of documents are being returned by each class, both by default, and when overridden. Also repaired is the `assert_docs` test which has been updated to correctly test for the default metadata (which includes `source` in recent releases). Dependencies: nil Tag maintainer: @leo-gan Twitter handle: @queenvictoria	2023-11-28 20:26:40 -05:00
Bob Lin	04c4878306	Remove `python_repl` from _BASE_TOOLS (#13962 ) ### Description: Previously `python_repl` was a built-in tool, but now it has been moved to `langchain_experimental`. When I use `load_tools` I get an error: ```python In [1]: from langchain.agents import load_tools In [2]: load_tools(["python_repl"]) --------------------------------------------------------------------------- ImportError Traceback (most recent call last) Cell In[2], line 1 ----> 1 load_tools(["python_repl"]) File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:530, in load_tools(tool_names, llm, callbacks, kwargs) 528 tool_names.extend(requests_method_tools) 529 elif name in _BASE_TOOLS: --> 530 tools.append(_BASE_TOOLS[name]()) 531 elif name in _LLM_TOOLS: 532 if llm is None: File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:84, in _get_python_repl() 83 def _get_python_repl() -> BaseTool: ---> 84 raise ImportError( 85 "This tool has been moved to langchain experiment. " 86 "This tool has access to a python REPL. " 87 "For best practices make sure to sandbox this tool. " 88 "Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md " 89 "To keep using this code as is, install langchain experimental and " 90 "update relevant imports replacing 'langchain' with 'langchain_experimental'" 91 ) ImportError: This tool has been moved to langchain experiment. This tool has access to a python REPL. For best practices make sure to sandbox this tool. Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md To keep using this code as is, install langchain experimental and update relevant imports replacing 'langchain' with 'langchain_experimental' ``` In this case, it will be very confusing. I think it is no longer a built-in tool now, so it can be removed from `_BASE_TOOLS` ### Issue: https://github.com/langchain-ai/langchain/issues/13858, https://github.com/langchain-ai/langchain/issues/13859, https://github.com/langchain-ai/langchain/issues/13856 ### Twitter handle:** [lin_bob57617](https://twitter.com/lin_bob57617)	2023-11-28 20:13:54 -05:00
Leonid Ganeline	52eee458bb	renamed `google_vertex_ai_vector_search` notebook (#13484 ) The `integrations/vectorstores/matchingengine.ipynb` example has the "Google Vertex AI Vector Search" title. This place this Title in the wrong order in the ToC (it is sorted by the file name). - Renamed `integrations/vectorstores/matchingengine.ipynb` into `integrations/vectorstores/google_vertex_ai_vector_search.ipynb`. - Updated a correspondent comment in docstring - Rerouted old URL to a new URL --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-28 16:58:29 -08:00
Leonid Ganeline	bf5787f58b	experimental[patch]: fixed namespace bug (#13585 ) It was : `from langchain.schema.prompts import BasePromptTemplate` but because of the breaking change in the ns, it is now `from langchain.schema.prompt_template import BasePromptTemplate` This bug prevents building the API Reference for the langchain_experimental	2023-11-28 16:40:27 -08:00
Taqi Jaffri	144710ad9a	langchain[minor]: Updated DocugamiLoader, includes breaking changes (#13265 ) There are the following main changes in this PR: 1. Rewrite of the DocugamiLoader to not do any XML parsing of the DGML format internally, and instead use the `dgml-utils` library we are separately working on. This is a very lightweight dependency. 2. Added MMR search type as an option to multi-vector retriever, similar to other retrievers. MMR is especially useful when using Docugami for RAG since we deal with large sets of documents within which a few might be duplicates and straight similarity based search doesn't give great results in many cases. We are @docugami on twitter, and I am @tjaffri --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-11-28 15:56:22 -08:00
Bagatur	a20e8f8bb0	experimental[patch]: release 0.0.43 (#13570 )	2023-11-28 15:38:09 -08:00
Bagatur	d8fe987ef5	langchain[patch]: release 0.0.342 (#13992 )	2023-11-28 14:34:57 -08:00
david qiu	9fb6805be4	langchain[minor]: Add retriever for Knowledge Bases for Amazon Bedrock (#13980 ) - Description: Adds a retriever implementation for [Knowledge Bases for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/), a new service announced at AWS re:Invent, shortly before this PR was opened. This depends on the `bedrock-agent-runtime` service, which will be included in a future version of `boto3` and of `botocore`. We will open a follow-up PR documenting the minimum required versions of `boto3` and `botocore` after that information is available. - Issue: N/A - Dependencies: `boto3>=1.33.2, botocore>=1.33.2` - Tag maintainer: @baskaryan - Twitter handles: `@pjain7` `@dead_letter_q` This PR includes a documentation notebook under `docs/docs/integrations/retrievers`, which I (@dlqqq) have verified independently. EDIT: `bedrock-agent-runtime` service is now included in `boto3>=1.33.2`: `5cf793f493` --------- Co-authored-by: Piyush Jain <piyushjain@duck.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-28 14:10:23 -08:00
Bagatur	1aed2d1f08	core[patch]: release 0.0.7 (#13989 )	2023-11-28 14:05:01 -08:00
David Duong	eb67f07e32	Track RunnableAssign as a separate run trace (#13972 ) Addressing incorrect order being sent to callbacks / tracers, due to the nature of threading --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-28 22:02:31 +00:00
Nuno Campos	0f255bb6c4	In Runnable.stream_log build up final_output from adding output chunks (#12781 ) Add arg to omit streamed_output list, in cases where final_output is enough this saves bandwidth <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-28 21:50:41 +00:00
Nuno Campos	970fe23feb	Fixes for opengpts release (#13960 )	2023-11-28 21:49:43 +00:00
David Duong	947daaf833	Exclude Bedrock client and credentials_profile_name fields from serialisation (#13603 )	2023-11-28 16:34:46 -05:00
Bagatur	48fbc5513d	infra[patch], langchain[patch]: fix test deps and upper bound langchain dep on core(#13984 )	2023-11-28 13:26:15 -08:00
Stefano Lottini	1fd724293b	Astra DB vector store, move constructor docstring to class docstring (#13784 ) This PR rearranges the docstring for the `AstraDB` vector store class so as to have all useful information in the _class_ docstring for ease of reading. (incidentally, due to an oversight, the docstring that was in the constructor ended up buried below some lines of code, thereby disappearing altogether from accessibility. Apologies.)	2023-11-28 16:25:44 -05:00
Johannes Foulds	fc40bd4cdb	AnthropicFunctions function_call compatibility (#13901 ) - Description: Updates to `AnthropicFunctions` to be compatible with the OpenAI `function_call` functionality. - Issue: The functionality to indicate `auto`, `none` and a forced function_call was not completely implemented in the existing code. - Dependencies: None - Tag maintainer: @baskaryan , and any of the other maintainers if needed. - Twitter handle: None I have specifically tested this functionality via AWS Bedrock with the Claude-2 and Claude-Instant models.	2023-11-28 16:22:55 -05:00
mengjincn	05ea4fd37d	fix merge None value and non None value error (#13703 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-28 15:49:56 -05:00
Ali Orozgani	32d794f5a3	iMessage loader: implement message content extraction from attributed… (#13634 ) - Description: We are adding functionality to extract message content from the `attributedBody` field of the database, in case the content is not in the `text` field. - Issue: Closes #13326 and #10680 - Dependencies: None. - Tag maintainer: @eyurtsev, @hwchase17 --------- Co-authored-by: onotate <johnp.pham@mail.utoronto.ca>	2023-11-28 15:45:43 -05:00
William FH	e5256bcb69	[Evals] Add Project Tags (#13982 ) Add them to project extra	2023-11-28 11:38:59 -08:00
Nuno Campos	e0bcc98436	infra[patch]: Use langchain core in-tree as a dev dependency (#13957 ) Using the published version means master is broken for contributors whenever we make changes in one lib that depend on the other.	2023-11-28 09:23:43 -08:00
unifyh	2703a1b061	Fix `MarkdownHeaderTextSplitter` not recognizing tilde-fenced code blocks (#13511 ) - Description: Previously `MarkdownHeaderTextSplitter` did not consider tilde-fenced code blocks (https://spec.commonmark.org/0.30/#fenced-code-blocks). This PR fixes that. ````md # Bug caused by previous implementation: ~~~py foo() # This is a comment that would be considered header bar() ~~~ ```` - Tag maintainer: @baskaryan	2023-11-28 11:52:38 -05:00
Leonid Ganeline	7929b26017	office365 toolkit bug fixes (#13618 ) Several bug fixes: - emails: instead of `bcc` the `cc` is used. - errors in the truncation descriptions - no truncation of the `message_search` Several updates: - generalized UTC format - truncation limit can be changed now in _call()	2023-11-28 11:49:24 -05:00
William FH	60309341bd	Eval Error Key (#13974 )	2023-11-28 08:38:30 -08:00
Erick Friis	f9bef600f1	RELEASE: core 0.0.7 (#13973 )	2023-11-28 10:28:28 -05:00
Nicolas Bondoux	e17edc4d0b	RunnableLambda: create afunc instance from func when not provided (#13408 ) Fixes #13407. This workaround consists in letting the RunnableLambda create its self.afunc from its self.func when self.afunc is not provided; the change has no dependency. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Nuno Campos <nuno@langchain.dev>	2023-11-28 11:18:26 +00:00
Nuno Campos	391f200eaa	Implement stream() and astream() for agents (#12783 ) ``` ---- chunk 1 {'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]} ---- chunk 2 {'messages': [FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search')], 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”")]} ---- chunk 3 {'actions': [AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]} ---- chunk 4 {'messages': [FunctionMessage(content='25 years', name='Search')], 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years')]} ---- chunk 5 {'actions': [AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]} ---- chunk 6 {'messages': [FunctionMessage(content='Answer: 3.991298452658078', name='Calculator')], 'steps': [AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]} ---- chunk 7 {'messages': [AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")], 'output': "Leonardo DiCaprio's current girlfriend is the Italian model " 'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 ' 'power is approximately 3.99.'} ---- final {'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])], 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}}), FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search'), AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}}), FunctionMessage(content='25 years', name='Search'), AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}}), FunctionMessage(content='Answer: 3.991298452658078', name='Calculator'), AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")], 'output': "Leonardo DiCaprio's current girlfriend is the Italian model " 'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 ' 'power is approximately 3.99.', 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”"), AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years'), AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]} ```	2023-11-28 08:11:37 +00:00
Michael Feil	686162670e	langchain[minor]: Adding `infinity` embedding integration. (#13928 ) This adds integation to https://github.com/michaelfeil/infinity. Users requested it in https://github.com/michaelfeil/infinity/issues/36 @saatvikshah Follows my implementation of gradient.ai. Feedback 1: Well done - I love your CI / repo / poetry setup - I adapted a lot in https://github.com/michaelfeil/infinity. Feedback 2: Not so good: The openai integration contains to much reverse engineering - in general projects such as michaelfeil/infinity and huggingface/text-embeddings-inference are compatible to the `pip install openai` package. Reverse engineering like this one is really hindering the use for me: `8e88ba16a8/libs/langchain/langchain/embeddings/openai.py (L347)` `8e88ba16a8/libs/langchain/langchain/embeddings/openai.py (L351)` - it is about preventing 3rd party providers to use the same url + uses interfaces of openai, that are not publically documented.	2023-11-27 16:43:47 -08:00
Bagatur	10a6e7cbb6	langchain[patch], core[patch]: Make common utils public (#13932 ) - rename `langchain_core.chat_models.base._generate_from_stream` -> `generate_from_stream` - rename `langchain_core.chat_models.base._agenerate_from_stream` -> `agenerate_from_stream` - export `langchain_core.utils.utils.build_extra_kwargs` from `langchain_core.utils`	2023-11-27 15:34:46 -08:00
Oleksandr Yaremchuk	c0277d06e8	experimental[patch] Update prompt injection model (#13930 ) - Description: Existing model used for Prompt Injection is quite outdated but we fine-tuned and open-source a new model based on the same model deberta-v3-base from Microsoft - [laiyer/deberta-v3-base-prompt-injection](https://huggingface.co/laiyer/deberta-v3-base-prompt-injection). It supports more up-to-date injections and less prone to false-positives. - Dependencies: No - Tag maintainer: - - Twitter handle: @alex_yaremchuk --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 17:56:53 -05:00
Bob Lin	e6ebde9688	experimental[patch]: Add experimental.agent imports (#13839 ) - Description: The experimental package needs to be compatible with the usage of importing agents For example, if i use `from langchain.agents import create_pandas_dataframe_agent`, running the program will prompt the following information: ``` Traceback (most recent call last): File "/Users/dongwm/test/main.py", line 1, in <module> from langchain.agents import create_pandas_dataframe_agent File "/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain/agents/__init__.py", line 87, in __getattr__ raise ImportError( ImportError: create_pandas_dataframe_agent has been moved to langchain experimental. See https://github.com/langchain-ai/langchain/discussions/11680 for more information. Please update your import statement from: `langchain.agents.create_pandas_dataframe_agent` to `langchain_experimental.agents.create_pandas_dataframe_agent`. ``` But when I changed to `from langchain_experimental.agents import create_pandas_dataframe_agent`, it was actually wrong: ```python Traceback (most recent call last): File "/Users/dongwm/test/main.py", line 2, in <module> from langchain_experimental.agents import create_pandas_dataframe_agent ImportError: cannot import name 'create_pandas_dataframe_agent' from 'langchain_experimental.agents' (/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain_experimental/agents/__init__.py) ``` I should use `from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent`. In order to solve the problem and make it compatible, I added additional import code to the langchain_experimental package. Now it can be like this Used `from langchain_experimental.agents import create_pandas_dataframe_agent` - Twitter handle: [lin_bob57617](https://twitter.com/lin_bob57617)	2023-11-27 14:03:47 -08:00
Tyler Titsworth	afcfa2a5e7	langchain[patch]: Add progress bar option to OllamaEmbeddings (#13882 ) - Description: Adds a tqdm progress bar to OllamaEmbeddings when embedding a list. - Issue: Related to #13637, but extended to Ollama. - Dependencies: `tqdm` made a necessary dependency. Thanks to @ugm2 for helping identify a common problem. Embeddings take a very long time to finish on local machines, and require a progress bar to help identify if one should even attempt the workload. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 13:56:13 -08:00
jeremyb-data	cd77fba562	Improvement: Weaviate multitenant adddocs (#13827 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: Added a line to pass the tenant parameter to add_data_object - Issue: An extra line added from the fix for #9956 - Dependencies: n/a - Tag maintainer: @baskaryan Tested locally, works as expected with the line change. --------- Co-authored-by: Simon Dai <simon6752@gmail.com>	2023-11-27 12:59:57 -08:00
jiangying	3e30cd8261	NIT: comment typo (#13817 )	2023-11-27 12:59:12 -08:00
Assaf Toledo	ba62ff89cc	BUGFIX: Support for elastic indices that don't return 'metadata' in '_source' (#13903 ) Description: Some Elastic indexes do not return a 'metadata' field in '_source'. However, prior to this PR, the code assumed there always is a 'metadata' field. This PR adds support for cases where the field is missing by adding it manually. Issue: #13869	2023-11-27 12:52:57 -08:00
Enric Soler Rastrollo	c156d0281a	BUGFIX: Use embedding key in azure_cosmos_db index creation (#13919 ) Description: Implement embedding key parametrisation Issue: https://github.com/langchain-ai/langchain/issues/13918 Dependencies: None Tag maintainer: @hwchase17 @izzymsft Twitter handle:@MaddogoS	2023-11-27 12:51:08 -08:00
Bagatur	ac67422a3d	IMPROVEMENT: import Document from core (#13905 )	2023-11-27 12:48:43 -08:00
chyroc	886bc2d50a	IMPROVEMENT: fix qianfan validate_environment typo (#13908 )	2023-11-27 11:17:27 -08:00
Chengzu Ou	4b8e053fe8	FEATURE: Add Databricks Vector Search as a new vector store (#13621 ) Description: This PR adds Databricks Vector Search as a new vector store in LangChain. - [x] Add `DatabricksVectorSearch` in `langchain/vectorstores/` - [x] Unit tests - [x] Add [`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/) as a new optional dependency We ran the following checks: - `make format` passed ✅ - `make lint` failed but the failures were caused by other files + Files touched by this PR passed the linter ✅ - `make test` passed ✅ - `make coverage` failed but the failures were caused by other files. Tests added by or related to this PR all passed + langchain/vectorstores/databricks_vector_search.py test coverage 94% ✅ - `make spell_check` passed ✅ The example notebook and updates to the [provider's documentation page](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/providers/databricks.md) will be added later in a separate PR. Dependencies: Optional dependency: [`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 11:07:26 -08:00
Leonid Kuligin	25387db432	BUFIX: add support for various OSS images from Vertex Model Garden (#13917 ) - Description: add support for various OSS images from Model Garden - Issue: #13370	2023-11-27 10:31:53 -08:00
Eugene Yurtsev	e186637921	Document Runnable Binding (#13927 ) Document runnable binding	2023-11-27 13:21:27 -05:00
Bagatur	46b3311190	RELEASE: 0.0.341 (#13926 )	2023-11-27 09:51:12 -08:00
umair mehmood	b3e08f9239	improvement: fix chat prompt loading from config (#13818 ) Add loader for loading chat prompt from config file. fixed: #13667 @efriis @baskaryan	2023-11-27 11:39:50 -05:00
Nuno Campos	8a3e0c9afa	Add option to prefix config keys in configurable_alts (#13714 )	2023-11-27 15:25:17 +00:00
ggeutzzang	3749af79ae	DOCS: fixed error in the docstring of RunnablePassthrough class (#13843 ) This pull request addresses an issue found in the example code within the docstring of `libs/core/langchain_core/runnables/passthrough.py` The original code snippet caused a `NameError` due to the missing import of `RunnableLambda`. The error was as follows: ``` 12 return "completion" 13 ---> 14 chain = RunnableLambda(fake_llm) \| { 15 'original': RunnablePassthrough(), # Original LLM output 16 'parsed': lambda text: text[::-1] # Parsing logic NameError: name 'RunnableLambda' is not defined ``` To resolve this, I have modified the example code to include the necessary import statement for `RunnableLambda`. Additionally, I have adjusted the indentation in the code snippet to ensure consistency and readability. The modified code now successfully defines and utilizes `RunnableLambda`, ensuring that users referencing the docstring will have a functional and clear example to follow. There are no related GitHub issues for this particular change. Modified Code: ```python from langchain_core.runnables import RunnablePassthrough, RunnableParallel from langchain_core.runnables import RunnableLambda runnable = RunnableParallel( origin=RunnablePassthrough(), modified=lambda x: x+1 ) runnable.invoke(1) # {'origin': 1, 'modified': 2} def fake_llm(prompt: str) -> str: # Fake LLM for the example return "completion" chain = RunnableLambda(fake_llm) \| { 'original': RunnablePassthrough(), # Original LLM output 'parsed': lambda text: text[::-1] # Parsing logic } chain.invoke('hello') # {'original': 'completion', 'parsed': 'noitelpmoc'} ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-27 00:06:55 -08:00
Dylan Williams	1983a39894	FEATURE: Add OneNote document loader (#13841 ) - Description: Added OneNote document loader - Issue: #12125 - Dependencies: msal Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-26 23:59:52 -08:00
Tomaz Bratanic	1ad65f7a98	BUGFIX: Fix bugs with Cypher validation (#13849 ) Fixes https://github.com/langchain-ai/langchain/issues/13803. Thanks to @sakusaku-rich	2023-11-26 19:30:11 -08:00
Harrison Chase	6a35831128	BUGFIX: export more types (#13886 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-26 19:15:34 -08:00
Yusuf Khan	935f78c944	FEATURE: Add retriever for Outline (#13889 ) - Description: Added a retriever for the Outline API to ask questions on knowledge base - Issue: resolves #11814 - Dependencies: None - Tag maintainer: @baskaryan	2023-11-26 18:56:12 -08:00
Bagatur	0efa59cbb8	RELEASE: 0.0.339rc3 (#13852 )	2023-11-25 10:37:30 -08:00
Bagatur	7222c42077	RELEASE: core 0.0.6 (#13853 )	2023-11-25 10:21:14 -08:00
raelix	c172605ea6	IMPROVEMENT: Added title metadata to GoogleDriveLoader for optional File Loaders (#13832 ) - Description: Simple change, I just added title metadata to GoogleDriveLoader for optional File Loaders - Dependencies: no dependencies - Tag maintainer: @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-24 18:53:55 -08:00
Stefano Lottini	19c68c7652	FEATURE: Astra DB, LLM cache classes (exact-match and semantic cache) (#13834 ) This PR provides idiomatic implementations for the exact-match and the semantic LLM caches using Astra DB as backend through the database's HTTP JSON API. These caches require the `astrapy` library as dependency. Comes with integration tests and example usage in the `llm_cache.ipynb` in the docs. @baskaryan this is the Astra DB counterpart for the Cassandra classes you merged some time ago, tagging you for your familiarity with the topic. Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-24 18:53:37 -08:00
Stefano Lottini	272df9dcae	Astra DB, chat message history (#13836 ) This PR adds a chat message history component that uses Astra DB for persistence through the JSON API. The `astrapy` package is required for this class to work. I have added tests and a small notebook, and updated the relevant references in the other docs pages. (@rlancemartin this is the counterpart of the Cassandra equivalent class you so helpfully reviewed back at the end of June) Thank you!	2023-11-24 18:12:29 -08:00
Bagatur	58f7e109ac	BUGFIX: Add import types and typevars from core (#13829 )	2023-11-24 17:04:10 -08:00
Bagatur	751226e067	bump 0.0.339rc2 (#13787 )	2023-11-23 12:50:09 -08:00
Bagatur	300ff01824	RELEASE: core 0.0.5 (#13786 )	2023-11-23 12:23:50 -08:00
Bagatur	72c108b003	IMPROVEMENT: filter global warnings properly (#13754 )	2023-11-22 16:26:37 -08:00
William FH	163bf165ed	Add Batch Size kwarg to the llm start callback (#13483 ) So you can more easily use the token counts directly from the API endpoint for batch size of 1	2023-11-22 14:47:57 -08:00
Bagatur	0be515f720	RELEASE: 0.0.339rc1 (#13746 )	2023-11-22 14:29:49 -08:00
Bagatur	2bc5bd67f7	RELEASE: core 0.0.4 (#13745 )	2023-11-22 13:57:28 -08:00
Bagatur	32d087fcb8	REFACTOR: combine core documents files (#13733 )	2023-11-22 10:10:26 -08:00
William FH	5b90fe5b1c	Fix locking (#13725 )	2023-11-22 07:37:25 -08:00
Bagatur	16af282429	BUGFIX: add prompt imports for backwards compat (#13702 )	2023-11-21 23:04:20 -08:00
Bagatur	e327bb4ba4	IMPROVEMENT: Conditionally import core type hints (#13700 )	2023-11-21 21:38:49 -08:00
dandanwei	d47ee1ae79	BUGFIX: redis vector store overwrites falsey metadata (#13652 ) - Description: This commit fixed the problem that Redis vector store will change the value of a metadata from 0 to empty when saving the document, which should be an un-intended behavior. - Issue: N/A - Dependencies: N/A	2023-11-21 20:16:23 -08:00
Bagatur	a21e84faf7	BUGFIX: llm backwards compat imports (#13698 )	2023-11-21 20:12:35 -08:00
Yujie Qian	ace9e64d62	IMPROVEMENT: VoyageEmbeddings embed_general_texts (#13620 ) - Description: add method embed_general_texts in VoyageEmebddings to support input_type - Issue: - Dependencies: - Tag maintainer: - Twitter handle: @Voyage_AI_	2023-11-21 18:33:07 -08:00
tanujtiwari-at	5064890fcf	BUGFIX: handle tool message type when converting to string (#13626 ) Description: Currently, if we pass in a ToolMessage back to the chain, it crashes with error `Got unsupported message type: ` This fixes it. Tested locally --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-21 18:20:58 -08:00
Josep Pon Farreny	143049c90f	Added partial_variables to BaseStringMessagePromptTemplate.from_template(...) (#13645 ) Description: BaseStringMessagePromptTemplate.from_template was passing the value of partial_variables into cls(...) via *kwargs, rather than passing it to PromptTemplate.from_template. Which resulted in those partial_variables being* lost and becoming required input_variables. Co-authored-by: Josep Pon Farreny <josep.pon-farreny@siemens.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-21 17:48:38 -08:00
Erick Friis	c5ae9f832d	INFRA: Lint for imports (#13632 ) - Adds pydantic/import linting to core - Adds a check for `langchain_experimental` imports to langchain	2023-11-21 17:42:56 -08:00
Erick Friis	131db4ba68	BUGFIX: anthropic models on bedrock (#13629 ) Introduced in #13403	2023-11-21 17:40:29 -08:00
David Ruan	04bddbaba4	BUGFIX: Update bedrock.py to fix provider bug (#13646 ) Provider check was incorrectly failing for anything other than "meta"	2023-11-21 17:28:38 -08:00
Bagatur	dc53523837	IMPROVEMENT: bump core dep 0.0.3 (#13690 )	2023-11-21 15:50:19 -08:00
Bagatur	a208abe6b7	add callback import test (#13689 )	2023-11-21 15:28:49 -08:00
Bagatur	083afba697	BUG: Add core utils imports (#13688 )	2023-11-21 15:25:47 -08:00
Bagatur	c61e30632e	BUG: more core fixes (#13665 ) Fix some circular deps: - move PromptValue into top level module bc both PromptTemplates and OutputParsers import - move tracer context vars to `tracers.context` and import them in functions in `callbacks.manager` - add core import tests	2023-11-21 15:15:48 -08:00
William FH	59df16ab92	Update name (#13676 )	2023-11-21 13:39:30 -08:00
Erick Friis	bfb980b968	CLI 0.0.19 (#13677 )	2023-11-21 12:34:38 -08:00
jakerachleff	249c796785	update langserve to v0.0.30 (#13673 ) Upgrade langserve template version to 0.0.30 to include new improvements	2023-11-21 11:17:47 -08:00
jakerachleff	c6937a2eb4	fix templates dockerfile (#13672 ) - Description: We need to update the Dockerfile for templates to also copy your README.md. This is because poetry requires that a readme exists if it is specified in the pyproject.toml	2023-11-21 11:09:55 -08:00
Bagatur	11614700a4	bump 0.0.339rc0 (#13664 )	2023-11-21 08:41:59 -08:00
Bagatur	d32e511826	REFACTOR: Refactor langchain_core (#13627 ) Changes: - remove langchain_core/schema since no clear distinction b/n schema and non-schema modules - make every module that doesn't end in -y plural - where easy have 1-2 classes per file - no more than one level of nesting in directories - only import from top level core modules in langchain	2023-11-21 08:35:29 -08:00
William FH	17c6551c18	Add error rate (#13568 ) To the in-memory outputs. Separate it out from the outputs so it's present in the dataframe.describe() results	2023-11-21 07:51:30 -08:00
Nuno Campos	8329f81072	Use pytest asyncio auto mode (#13643 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-21 15:00:13 +00:00
Bagatur	99b4f46cbe	REFACTOR: Add core as dep (#13623 )	2023-11-20 14:38:10 -08:00
Harrison Chase	d82cbf5e76	Separate out langchain_core package (#13577 ) Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-20 13:09:30 -08:00
Bagatur	e620347a83	RELEASE: bump 339 (#13613 )	2023-11-20 09:56:43 -08:00
Ofer Mendelevitch	52e23e50b1	BUG: Fix search_kwargs in Vectara retriever (#13299 ) - Description: fix a bug that prevented as_retriever() in Vectara to use the desired input arguments - Issue: as_retriever did not pass the arguments properly - Tag maintainer: @baskaryan - Twitter handle: @ofermend	2023-11-20 09:44:43 -08:00
Holt Skinner	1c08dbfb33	IMPROVEMENT: Reduce post-processing time for `DocAIParser` (#13210 ) - Remove `WrappedDocument` introduced in https://github.com/langchain-ai/langchain/pull/11413 - https://github.com/googleapis/python-documentai-toolbox/issues/198 in Document AI Toolbox to improve initialization time for `WrappedDocument` object. @lkuligin @baskaryan @hwchase17	2023-11-20 09:41:44 -08:00
Leonid Kuligin	f3fcdea574	fixed an UnboundLocalError when no documents are found (#12995 ) Replace this entire comment with: - Description: fixed a bug - Issue: the issue # #12780	2023-11-20 09:41:14 -08:00
Stijn Tratsaert	b6f70d776b	VertexAI LLM count_tokens method requires list of prompts (#13451 ) I encountered this during summarization with VertexAI. I was receiving an INVALID_ARGUMENT error, as it was trying to send a list of about 17000 single characters. The [count_tokens method](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/language_models/_language_models.py#L658) made available by Google takes in a list of prompts. It does not fail for small texts, but it does for longer documents because the argument list will be exceeding Googles allowed limit. Enforcing the list type makes it work successfully. This change will cast the input text to count to a list of that single text so that the input format is always correct. [Twitter](https://www.x.com/stijn_tratsaert)	2023-11-20 09:40:48 -08:00
Wang Wei	fe7b40cb2a	feat: add ERNIE-Bot-4 Function Calling (#13320 ) - Description: ERNIE-Bot-Chat-4 Large Language Model adds the ability of `Function Calling` by passing parameters through the `functions` parameter in the request. To simplify function calling for ERNIE-Bot-Chat-4, the `create_ernie_fn_chain()` function has been added. The definition and usage of the `create_ernie_fn_chain()` function is similar to that of the `create_openai_fn_chain()` function. Examples as the follows: ``` import json from langchain.chains.ernie_functions import ( create_ernie_fn_chain, ) from langchain.chat_models import ErnieBotChat from langchain.prompts import ChatPromptTemplate def get_current_news(location: str) -> str: """Get the current news based on the location.' Args: location (str): The location to query. Returs: str: Current news based on the location. """ news_info = { "location": location, "news": [ "I have a Book.", "It's a nice day, today." ] } return json.dumps(news_info) def get_current_weather(location: str, unit: str="celsius") -> str: """Get the current weather in a given location Args: location (str): location of the weather. unit (str): unit of the tempuature. Returns: str: weather in the given location. """ weather_info = { "location": location, "temperature": "27", "unit": unit, "forecast": ["sunny", "windy"], } return json.dumps(weather_info) llm = ErnieBotChat(model_name="ERNIE-Bot-4") prompt = ChatPromptTemplate.from_messages( [ ("human", "{query}"), ] ) chain = create_ernie_fn_chain([get_current_weather, get_current_news], llm, prompt, verbose=True) res = chain.run("北京今天的新闻是什么？") print(res) ``` The running results of the above program are shown below： ``` > Entering new LLMChain chain... Prompt after formatting: Human: 北京今天的新闻是什么？ > Finished chain. {'name': 'get_current_news', 'thoughts': '用户想要知道北京今天的新闻。我可以使用get_current_news工具来获取这些信息。', 'arguments': {'location': '北京'}} ```	2023-11-19 22:36:12 -08:00
Adilkhan Sarsen	10418ab0c1	DeepLake Backwards compatibility fix (#13388 ) - Description: during search with DeepLake some people are facing backwards compatibility issues, this PR fixes it by making search accessible for the older datasets --------- Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>	2023-11-19 21:46:01 -08:00
Tyler Hutcherson	190952fe76	IMPROVEMENT: Minor redis improvements (#13381 ) - Description: - Fixes a `key_prefix` bug where passing it in on `Redis.from_existing(...)` did not work properly. Updates doc strings accordingly. - Updates Redis filter classes logic with best practices on typing, string formatting, and handling "empty" filters. - Fixes a bug that would prevent multiple tag filters from being applied together in some scenarios. - Added a whole new filter unit testing module. Also updated code formatting for a number of modules that were failing the `make` commands. - Issue: N/A - Dependencies: N/A - Tag maintainer: @baskaryan - Twitter handle: @tchutch94	2023-11-19 19:15:45 -08:00
Sergey Kozlov	df03267edf	Fix tool arguments formatting in StructuredChatAgent (#10480 ) In the `FORMAT_INSTRUCTIONS` template, 4 curly braces (escaping) are used to get single curly brace after formatting: ``` "{{{ ... }}}}" -> format_instructions.format() -> "{{ ... }}" -> template.format() -> "{ ... }". ``` Tool's `args_schema` string contains single braces `{ ... }`, and is also transformed to `{{{{ ... }}}}` form. But this is not really correct since there is only one `format()` call: ``` "{{{{ ... }}}}" -> template.format() -> "{{ ... }}". ``` As a result we get double curly braces in the prompt: ```` Respond to the human as helpfully and accurately as possible. You have access to the following tools: foo: Test tool FOO, args: {{'tool_input': {{'type': 'string'}}}} # <--- !!! ... Provide only ONE action per $JSON_BLOB, as shown: ``` { "action": $TOOL_NAME, "action_input": $INPUT } ``` ```` This PR fixes curly braces escaping in the `args_schema` to have single braces in the final prompt: ```` Respond to the human as helpfully and accurately as possible. You have access to the following tools: foo: Test tool FOO, args: {'tool_input': {'type': 'string'}} # <--- !!! ... Provide only ONE action per $JSON_BLOB, as shown: ``` { "action": $TOOL_NAME, "action_input": $INPUT } ``` ```` --------- Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>	2023-11-19 18:45:43 -08:00
Wouter Durnez	ef7802b325	Add llama2-13b-chat-v1 support to `chat_models.BedrockChat` (#13403 ) Hi 👋 We are working with Llama2 on Bedrock, and would like to add it to Langchain. We saw a [pull request](https://github.com/langchain-ai/langchain/pull/13322) to add it to the `llm.Bedrock` class, but since it concerns a chat model, we would like to add it to `BedrockChat` as well. - Description: Add support for Llama2 to `BedrockChat` in `chat_models` - Issue: the issue # it fixes (if applicable) [#13316](https://github.com/langchain-ai/langchain/issues/13316) - Dependencies: any dependencies required for this change `None` - Tag maintainer: / - Twitter handle: `@SimonBockaert @WouterDurnez` --------- Co-authored-by: wouter.durnez <wouter.durnez@showpad.com> Co-authored-by: Simon Bockaert <simon.bockaert@showpad.com>	2023-11-19 18:44:58 -08:00
jwbeck97	a93616e972	FEAT: Add azure cognitive health tool (#13448 ) - Description: This change adds an agent to the Azure Cognitive Services toolkit for identifying healthcare entities - Dependencies: azure-ai-textanalytics (Optional) --------- Co-authored-by: James Beck <James.Beck@sa.gov.au> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 18:44:01 -08:00
Massimiliano Pronesti	6bf9b2cb51	BUG: Limit Azure OpenAI embeddings chunk size (#13425 ) Hi! This short PR aims at: * Fixing `OpenAIEmbeddings`' check on `chunk_size` when used with Azure OpenAI (thus with openai < 1.0). Azure OpenAI embeddings support at most 16 chunks per batch, I believe we are supposed to take the min between the passed value/default value and 16, not the max - which, I suppose, was introduced by accident while refactoring the previous version of this check from this other PR of mine: #10707 * Porting this fix to the newest class (`AzureOpenAIEmbeddings`) for openai >= 1.0 This fixes #13539 (closed but the issue persists). @baskaryan @hwchase17	2023-11-19 18:34:51 -08:00
Zeyang Lin	e53f59f01a	DOCS: doc-string - langchain.vectorstores.dashvector.DashVector (#13502 ) - Description: There are several mistakes in the sample code in the doc-string of `DashVector` class, and this pull request aims to correct them. The correction code has been tested against latest version (at the time of creation of this pull request) of: `langchain==0.0.336` `dashvector==1.0.6` . - Issue: No issue is created for this. - Dependencies: No dependency is required for this change, <!-- - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), --> - Twitter handle: `zeyanglin` <!-- Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-19 18:24:05 -08:00
John Mai	16f7912e1b	BUG: fix hunyuan appid type (#13496 ) - Description: fix hunyuan appid type - Issue: https://github.com/langchain-ai/langchain/pull/12022#issuecomment-1815627855	2023-11-19 18:23:45 -08:00
Nicolò Boschi	8362bd729b	AstraDB: use includeSimilarity option instead of $similarity (#13512 ) - Description: AstraDB is going to deprecate the `$similarity` projection property in favor of the ´includeSimilarity´ option flag. I moved all the queries to the new format. - Tag maintainer: @hemidactylus - Twitter handle: nicoloboschi	2023-11-19 17:54:35 -08:00
shumpei	7100d586ef	Introduce search_kwargs for Custom Parameters in BingSearchAPIWrapper (#13525 ) Added a `search_kwargs` field to BingSearchAPIWrapper in `bing_search.py,` enabling users to include extra keyword arguments in Bing search queries. This update, like specifying language preferences, adds more customization to searches. The `search_kwargs` seamlessly merge with standard parameters in `_bing_search_results` method. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-19 17:51:02 -08:00
Nicolò Boschi	ad0c3b9479	Fix Astra integration tests (#13520 ) - Description: Fix Astra integration tests that are failing. The `delete` always return True as the deletion is successful if no errors are thrown. I aligned the test to verify this behaviour - Tag maintainer: @hemidactylus - Twitter handle: nicoloboschi --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:50:49 -08:00
umair mehmood	69d39e2173	fix: VLLMOpenAI -- create() got an unexpected keyword argument 'api_key' (#13517 ) The issue was accuring because of `openai` update in Completions. its not accepting `api_key` and 'api_base' args. The fix is we check for the openai version and if ats v1 then remove these keys from args before passing them to `Compilation.create(...)` when sending from `VLLMOpenAI` Fixed: #13507 @eyu @efriis @hwchase17 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-19 17:49:55 -08:00
Manuel Alemán Cueto	6bc08266e0	Fix for oracle schema parsing stated on the issue #7928 (#13545 ) - Description: In this pull request, we address an issue related to assigning a schema to the SQLDatabase class when utilizing an Oracle database. The current implementation encounters a bug where, upon attempting to execute a query, the alter session parse is not appropriately defined for Oracle, leading to an error, - Issue: #7928, - Dependencies: No dependencies, - Tag maintainer: @baskaryan, --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:35:27 -08:00
Andrew Teeter	325bdac673	feat: load all namespaces (#13549 ) - Description: This change allows for the `MWDumpLoader` to load all namespaces including custom by default instead of only loading the [default namespaces](https://www.mediawiki.org/wiki/Help:Namespaces#Localisation). - Tag maintainer: @hwchase17	2023-11-19 17:35:17 -08:00
Taranjeet Singh	47451764a7	Add embedchain retriever (#13553 ) Description: This commit adds embedchain retriever along with tests and docs. Embedchain is a RAG framework to create data pipelines. Twitter handle: - [Taranjeet's twitter](https://twitter.com/taranjeetio) and [Embedchain's twitter](https://twitter.com/embedchain) Reviewer @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:35:03 -08:00
rafly lesmana	420a17542d	fix: Make YoutubeLoader support on demand language translation (#13583 ) Description: Enhance the functionality of YoutubeLoader to enable the translation of available transcripts by refining the existing logic. Issue: Encountering a problem with YoutubeLoader (#13523) where the translation feature is not functioning as expected. Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:34:48 -08:00
Bagatur	78a1f4b264	bump 338, exp 42 (#13564 )	2023-11-18 15:12:07 -08:00
Harrison Chase	f4c0e3cc15	move streaming stdout (#13559 )	2023-11-18 12:24:49 -05:00
Leonid Ganeline	43dad6cb91	BUG fixed `openai_assistant` namespace (#13543 ) BUG: langchain.agents.openai_assistant has a reference as `from langchain_experimental.openai_assistant.base import OpenAIAssistantRunnable` should be `from langchain.agents.openai_assistant.base import OpenAIAssistantRunnable` This prevents building of the API Reference docs	2023-11-17 17:15:33 -08:00
Bassem Yacoube	ff382b7b1b	IMPROVEMENT Adds support for new OctoAI endpoints (#13521 ) small fix to add support for new OctoAI LLM endpoints	2023-11-17 17:15:21 -08:00
William FH	cac849ae86	Use random seed (#13544 ) For default eval llm	2023-11-17 16:33:31 -08:00
Martin Krasser	79ed66f870	EXPERIMENTAL Generic LLM wrapper to support chat model interface with configurable chat prompt format (#8295 ) ## Update 2023-09-08 This PR now supports further models in addition to Lllama-2 chat models. See [this comment](#issuecomment-1668988543) for further details. The title of this PR has been updated accordingly. ## Original PR description This PR adds a generic `Llama2Chat` model, a wrapper for LLMs able to serve Llama-2 chat models (like `LlamaCPP`, `HuggingFaceTextGenInference`, ...). It implements `BaseChatModel`, converts a list of chat messages into the [required Llama-2 chat prompt format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) and forwards the formatted prompt as `str` to the wrapped `LLM`. Usage example: ```python # uses a locally hosted Llama2 chat model llm = HuggingFaceTextGenInference( inference_server_url="http://127.0.0.1:8080/", max_new_tokens=512, top_k=50, temperature=0.1, repetition_penalty=1.03, ) # Wrap llm to support Llama2 chat prompt format. # Resulting model is a chat model model = Llama2Chat(llm=llm) messages = [ SystemMessage(content="You are a helpful assistant."), MessagesPlaceholder(variable_name="chat_history"), HumanMessagePromptTemplate.from_template("{text}"), ] prompt = ChatPromptTemplate.from_messages(messages) memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True) chain = LLMChain(llm=model, prompt=prompt, memory=memory) # use chat model in a conversation # ... ``` Also part of this PR are tests and a demo notebook. - Tag maintainer: @hwchase17 - Twitter handle: `@mrt1nz` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-17 16:32:13 -08:00
William FH	c56faa6ef1	Add execution time (#13542 ) And warn instead of raising an error, since the chain API is too inconsistent.	2023-11-17 16:04:16 -08:00
pedro-inf-custodio	0fb5f857f9	IMPROVEMENT WebResearchRetriever error handling in urls with connection error (#13401 ) - Description: Added a method `fetch_valid_documents` to `WebResearchRetriever` class that will test the connection for every url in `new_urls` and remove those that raise a `ConnectionError`. - Issue: [Previous PR](https://github.com/langchain-ai/langchain/pull/13353), - Dependencies: None, - Tag maintainer: @efriis Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17.	2023-11-17 14:02:26 -08:00
Piyush Jain	d2335d0114	IMPROVEMENT Neptune graph updates (#13491 ) ## Description This PR adds an option to allow unsigned requests to the Neptune database when using the `NeptuneGraph` class. ```python graph = NeptuneGraph( host='<my-cluster>', port=8182, sign=False ) ``` Also, added is an option in the `NeptuneOpenCypherQAChain` to provide additional domain instructions to the graph query generation prompt. This will be injected in the prompt as-is, so you should include any provider specific tags, for example `<instructions>` or `<INSTR>`. ```python chain = NeptuneOpenCypherQAChain.from_llm( llm=llm, graph=graph, extra_instructions=""" Follow these instructions to build the query: 1. Countries contain airports, not the other way around 2. Use the airport code for identifying airports """ ) ```	2023-11-17 13:49:31 -08:00
William FH	5a28dc3210	Override Keys Option (#13537 ) Should be able to override the global key if you want to evaluate different outputs in a single run	2023-11-17 13:32:43 -08:00
Bagatur	e584b28c54	bump 337 (#13534 )	2023-11-17 12:50:52 -08:00
Bagatur	2e2114d2d0	FEATURE: Runnable with message history (#13418 ) Add RunnableWithMessageHistory class that can wrap certain runnables and manages chat history for them.	2023-11-17 12:00:01 -08:00
Bagatur	0fc3af8932	IMPROVEMENT: update assistants output and doc (#13480 )	2023-11-17 11:58:54 -08:00
Hugues Chocart	35e04f204b	[LLMonitorCallbackHandler] Various improvements (#13151 ) Small improvements for the llmonitor callback handler, like better support for non-openai models. --------- Co-authored-by: vincelwt <vince@lyser.io>	2023-11-16 23:39:36 -08:00
Noah Stapp	c1b041c188	Add Wrapping Library Metadata to MongoDB vector store (#13084 ) Description MongoDB drivers are used in various flavors and languages. Making sure we exercise our due diligence in identifying the "origin" of the library calls makes it best to understand how our Atlas servers get accessed.	2023-11-16 22:20:04 -08:00
Guy Korland	7f8fd70ac4	Add optional arguments to FalkorDBGraph constructor (#13459 ) Description: Add optional arguments to FalkorDBGraph constructor Tag maintainer: baskaryan Twitter handle: @g_korland	2023-11-16 18:15:40 -08:00
chris stucchio	d7f014cd89	Bug: OpenAIFunctionsAgentOutputParser doesn't handle functions with no args (#13467 ) Description/Issue: When OpenAI calls a function with no args, the args are `""` rather than `"{}"`. Then `json.loads("")` blows up. This PR handles it correctly. Dependencies: None	2023-11-16 16:47:05 -08:00
Yujie Qian	41a433fa33	IMPROVEMENT: add input_type to VoyageEmbeddings (#13488 ) - Description: add input_type to VoyageEmbeddings	2023-11-16 16:35:36 -08:00
David Duong	ea6e017b85	Add serialisation arguments to Bedrock and ChatBedrock (#13465 )	2023-11-17 01:33:24 +01:00
Erick Friis	427331d621	IMPROVEMENT Lock pydantic v1 in app template, cli 0.0.18 (#13485 )	2023-11-16 15:22:11 -08:00
Erick Friis	75363f048f	BUG Fix app_name in cli app new (#13482 )	2023-11-16 14:19:35 -08:00
ifduyue	324ab382ad	Use List instead of list (#13443 ) Unify List usages in libs/langchain/langchain/text_splitter.py, only one place it's `list`, all other ocurrences are `List`	2023-11-16 13:15:58 -08:00
Stefano Lottini	b029d9f4e6	Astra DB: minor improvements to docstrings and demo notebook (#13449 ) This PR brings a few minor improvements to the docs, namely class/method docstrings and the demo notebook. - A note on how to control concurrency levels to tune performance in bulk inserts, both in the class docstring and the demo notebook; - Slightly increased concurrency defaults after careful experimentation (still on the conservative side even for clients running on less-than-typical network/hardware specs) - renamed the DB token variable to the standardized `ASTRA_DB_APPLICATION_TOKEN` name (used elsewhere, e.g. in the Astra DB docs) - added a note and a reference (add_text docstring, demo notebook) on allowed metadata field names. Thank you!	2023-11-16 12:48:32 -08:00
Eugene Yurtsev	1e43fd6afe	Add ahandle_event to _all_ (#13469 ) Add ahandle_event for backwards compatibility as it is used by langserve	2023-11-16 12:46:20 -08:00
Harrison Chase	f90249305a	callback refactor (#13372 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-16 08:25:09 -08:00
Bagatur	a9b2c943e6	bump 336, exp 44 (#13420 )	2023-11-15 14:08:34 -08:00
Bagatur	1372296dc8	FIX: Infer runnable agent single or multi action (#13412 )	2023-11-15 13:58:14 -08:00
Eugene Yurtsev	accadccf8e	Use secretstr for api keys for javelin-ai-gateway (#13417 ) - Make javelin_ai_gateway_api_key a SecretStr --------- Co-authored-by: Hiroshi Tashiro <hiroshitash@gmail.com>	2023-11-15 16:12:05 -05:00
William FH	ba501b27a0	Fix Runnable Lambda Afunc Repr (#13413 ) Otherwise, you get an error when using async functions. h/t to Chris Ruppelt	2023-11-15 16:11:42 -05:00
Sumukh Sridhara	1726d5dcdd	Merge pull request #13232 * PGVector needs to close its connection if its garbage collected	2023-11-15 15:34:37 -05:00
Nuno Campos	85a77d2c27	IMPROVEMENT Passthrough kwargs in runnable lambda (#13405 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-15 11:45:16 -08:00
Bagatur	76c317ed78	DOCS: update rag use case (#13319 )	2023-11-15 10:54:15 -08:00
Clay Elmore	8823e3831f	FEAT Bedrock cohere embedding support (#13366 ) - Description: adding cohere embedding support to bedrock embedding class - Issue: N/A - Dependencies: None - Tag maintainer: @3coins - Twitter handle: celmore25 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-15 10:19:12 -08:00
Nuno Campos	d5aeff706a	Make it easier to subclass RunnableEach (#13346 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-15 13:12:57 +00:00
竹内謙太	3b5e8bacfa	FEAT Add some properties to NotionDBLoader (#13358 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> fix #13356 Add supports following properties for metadata to NotionDBLoader. - `checkbox` - `email` - `number` - `select` There are no relevant tests for this code to be updated.	2023-11-14 20:31:12 -08:00
Fielding Johnston	37eb44c591	BUG Add limit_to_domains to APIChain based tools (#13367 ) - Description: Adds `limit_to_domains` param to the APIChain based tools (open_meteo, TMDB, podcast_docs, and news_api) - Issue: I didn't open an issue, but after upgrading to 0.0.328 using these tools would throw an error. - Dependencies: N/A - Tag maintainer: @baskaryan Note: I included the trailing / simply because the docs here did `fc886cc303/docs/docs/use_cases/apis.ipynb (L246)` , but I checked the code and it is using `urlparse`. SoI followed the docs since it comes down to stylee.	2023-11-14 19:07:16 -08:00
Bagatur	38180ad25f	bump openai support (#13262 )	2023-11-14 16:50:23 -08:00
Erick Friis	7c3066f9ec	more cli interactivity, bugfix (#13360 )	2023-11-14 14:49:43 -08:00
Predrag Gruevski	d63d4994c0	Bump all libraries to the latest `ruff` version. (#13350 ) This version of `ruff` is the one we'll be using to lint the docs and cookbooks (#12677), so I'm making it used everywhere else too.	2023-11-14 16:00:21 -05:00
Massimiliano Pronesti	344cab0739	IMPROVEMENT: support Openai API v1 for Azure OpenAI completions (#13231 ) Hi, this PR adds support for OpenAI API v1 for Azure OpenAI completion API. @baskaryan @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-14 12:10:18 -08:00
dependabot[bot]	fc886cc303	Bump pyarrow from 13.0.0 to 14.0.1 in /libs/langchain (#13363 ) Bumps [pyarrow](https://github.com/apache/arrow) from 13.0.0 to 14.0.1. <details> <summary>Commits</summary> <ul> <li><a href="`ba53748361`"><code>ba53748</code></a> MINOR: [Release] Update versions for 14.0.1</li> <li><a href="`529f3768fa`"><code>529f376</code></a> MINOR: [Release] Update .deb/.rpm changelogs for 14.0.1</li> <li><a href="`b84bbcac64`"><code>b84bbca</code></a> MINOR: [Release] Update CHANGELOG.md for 14.0.1</li> <li><a href="`f141709763`"><code>f141709</code></a> <a href="https://redirect.github.com/apache/arrow/issues/38607">GH-38607</a>: [Python] Disable PyExtensionType autoload (<a href="https://redirect.github.com/apache/arrow/issues/38608">#38608</a>)</li> <li><a href="`5a37e74198`"><code>5a37e74</code></a> <a href="https://redirect.github.com/apache/arrow/issues/38431">GH-38431</a>: [Python][CI] Update fs.type_name checks for s3fs tests (<a href="https://redirect.github.com/apache/arrow/issues/38455">#38455</a>)</li> <li><a href="`2dcee3f82c`"><code>2dcee3f</code></a> MINOR: [Release] Update versions for 14.0.0</li> <li><a href="`297428cbf2`"><code>297428c</code></a> MINOR: [Release] Update .deb/.rpm changelogs for 14.0.0</li> <li><a href="`3e9734f883`"><code>3e9734f</code></a> MINOR: [Release] Update CHANGELOG.md for 14.0.0</li> <li><a href="`9f90995c8c`"><code>9f90995</code></a> <a href="https://redirect.github.com/apache/arrow/issues/38332">GH-38332</a>: [CI][Release] Resolve symlinks in RAT lint (<a href="https://redirect.github.com/apache/arrow/issues/38337">#38337</a>)</li> <li><a href="`bd61239a32`"><code>bd61239</code></a> <a href="https://redirect.github.com/apache/arrow/issues/35531">GH-35531</a>: [Python] C Data Interface PyCapsule Protocol (<a href="https://redirect.github.com/apache/arrow/issues/37797">#37797</a>)</li> <li>Additional commits viewable in <a href="https://github.com/apache/arrow/compare/go/v13.0.0...go/v14.0.1">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pyarrow&package-manager=pip&previous-version=13.0.0&new-version=14.0.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/langchain-ai/langchain/network/alerts). </details> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-11-14 14:23:52 -05:00
Erick Friis	c0e6045c0b	cli 0.0.17 (#13359 )	2023-11-14 09:56:18 -08:00
Erick Friis	927824b7cb	CLI interactivity (#13148 ) Will implement more later	2023-11-14 09:53:29 -08:00
billytrend-cohere	2f6fe6ddf3	Fix latest message index (#13355 ) There is a bug which caused the earliest message rather than the latest message being sent	2023-11-14 09:23:25 -08:00
Harrison Chase	be854225c7	add more reasonable arxiv retriever (#13327 )	2023-11-13 20:54:14 -08:00
Krish Dholakia	5a920e14c0	fix litellm openai imports (#13307 )	2023-11-13 17:55:10 -08:00
Bagatur	1c67db4c18	Move OAI assistants to langchain and add callbacks (#13236 )	2023-11-13 17:42:07 -08:00
Erick Friis	280ecfd8eb	IMPROVEMENT redirect root to docs in langserve app template (#13303 )	2023-11-13 15:51:41 -08:00
mertkayhan	9b4974871d	IMPROVEMENT Increase flexibility of ElasticVectorSearch (#6863 ) Hey @rlancemartin, @eyurtsev , I did some minimal changes to the `ElasticVectorSearch` client so that it plays better with existing ES indices. Main changes are as follows: 1. You can pass the dense vector field name into `_default_script_query` 2. You can pass a custom script query implementation and the respective parameters to `similarity_search_with_score` 3. You can pass functions for building page content and metadata for the resulting `Document` <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 4. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-11-13 14:36:03 -08:00
Erick Friis	50a5c919f0	IMPROVEMENT self-query template (#13305 ) - [ ] https://github.com/langchain-ai/langchain/pull/12694#discussion_r1391334719 -> keep date - [x] https://github.com/langchain-ai/langchain/pull/12694#discussion_r1391336586	2023-11-13 14:03:15 -08:00
Yasin	b46f88d364	IMPROVEMENT add license file to subproject (#8403 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> hi! This is pretty straight-forward: The sdist package does not contain the license file (which is needed by e.g. conda) because the package is built from the subdir and can't see the license. I _copied_ the license but since I'm unfamiliar with the projects direction, I'm not sure that's correct. thanks! --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 11:48:21 -08:00
Rui Ramos	ff19a62afc	Fix Pinecone cosine relevance score (#8920 ) Fixes: #8207 Description: Pinecone returns scores (not distances) with cosine similarity. The values according to the docs are [-1, 1], although I could never reproduce negative values. This PR ensures that the score returned from Pinecone is preserved, rather than inverted, so the most relevant documents can be filtered (eg when using similarity thresholds) I'll leave this as a draft PR as I couldn't run the tests (my pinecone account might not be enough - some errors were being thrown around namespaces) so hopefully someone who _can_ will pick this up. Maintainers: @rlancemartin, @eyurtsev --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 11:47:38 -08:00
Bagatur	2e42ed5de6	Self-query template (#12694 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 11:44:19 -08:00
Konstantin Spieß	1e43025bf5	Fix serialization issue in Matching Engine Vector Store (#13266 ) - Description: Fixed a serialization issue in the add_texts method of the Matching Engine Vector Store caused by a typo, leading to an attempt to serialize the json module itself. - Issue: #12154 - Dependencies: ./. - Tag maintainer:	2023-11-13 11:04:11 -08:00
William FH	9169d77cf6	Update error message in evaluation runner (#13296 )	2023-11-13 11:03:20 -08:00
takatost	f22f273f93	FIX: 'from_texts' method in Weaviate with non-existent kwargs param (#11604 ) Due to the possibility of external inputs including UUIDs, there may be additional values in kwargs, while Weaviate's `__init__` method does not support passing extra kwarg parameters. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 10:32:20 -08:00
Frank995	971d2b2e34	Add missing filter to max_marginal_relevance_search inner call to max_marginal_relevance_search_by_vector (#13260 ) When calling max_marginal_relevance_search from PGVector the filter param is not carried over to max_marginal_relevance_search_by_vector --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-13 10:31:34 -08:00
chevalmuscle	3ad78e48e2	Use endpoint_url if provided with boto3 session for dynamodb (#11622 ) - Description: Uses `endpoint_url` if provided with a boto3 session. When running dynamodb locally, credentials are required even if invalid. With this change, it will be possible to pass a boto3 session with credentials and specify an endpoint_url --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 10:31:16 -08:00
Erick Friis	18acc22f29	Ollama pass kwargs as options instead of top (#13280 ) Noticed params are really in `options` instead while reviewing #12895	2023-11-13 10:28:47 -08:00
刘方瑞	46af56dc4f	Add MyScaleWithoutJSON which allows user to wrap columns into Document's Metadata (#13164 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> Replace this entire comment with: - Description: Add MyScaleWithoutJSON which allows user to wrap columns into Document's Metadata - Tag maintainer: @baskaryan	2023-11-13 10:10:36 -08:00
Michael Landis	2aa13f1e10	chore: bump momento dependency version and refactor search hit usage (#13111 ) Description Bumps the Momento dependency to the latest version and refactors the usage of `SearchHit` in the Momento Vector Index (MVI) vector store integration. This change is a one liner where we use the preferred attribute `score` to read the query-document similarity instead of `distance`. The latest versions of Momento clients will use this attribute going forward. Dependencies Updated the Momento dependency to latest version. Tests 💚 I re-ran the existing MVI integration tests (`tests/integration_tests/vectorstores/test_momento_vector_index.py`) and they pass. Review cc @baskaryan @eyurtsev	2023-11-13 09:12:21 -08:00
kYLe	cc55d2fcee	Add OpenAI API v1 support for ChatAnyscale and fixed a bug with openai_api_key (#13237 ) 1. Add OpenAI API v1 support 2. Fixed a bug to call `get_secret_value` on a str value (values["openai_api_key"])	2023-11-13 09:01:54 -08:00
Govind.S.B	9024593468	added system prompt and template fields to ollama (#13022 ) Description the ollama api now supports passing system prompt and template directly instead of modifying the model file , but the ollama integration in langchain did not have this change updated . The update just adds these two parameters to it ( there are 2 more parameters that are pending to be updated, I was not sure about their utility wrt to langchain ) Refer : `8713ac23a8` Issue : None Applicable Dependencies : None Changed Twitter handle : https://twitter.com/violetto96 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-13 08:45:11 -08:00
langchain-infra	f55f67055f	Add dockerfile template (#13240 )	2023-11-13 10:33:01 -05:00
Guillem Orellana Trullols	0f31cd8b49	Remove `_get_kwarg_value` function (#13184 ) `_get_kwarg_value` function is useless, one can rely on python builtin functionalities to do the exact same thing. - Description: Removed `_get_kwarg_value`. Helps with code readability. - Issue: the issue # it fixes (if applicable), - Twitter handle: @Guillem_96	2023-11-13 00:09:54 -08:00
SuperDa Fu	e1c020dfe1	dalle add model parameter (#13201 ) - Description: dalle_image_generator adding a new model parameter, - Issue: N/A, - Dependencies: - Tag maintainer: @hwchase17 - Twitter handle:** --------- Co-authored-by: dafu <xiangbingze@wenru.wang> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Erick Friis <erickfriis@gmail.com>	2023-11-13 00:09:20 -08:00
Dennis de Greef	64e11592bb	Improve CSV reader which can't call .strip() on NoneType (#13079 ) Improve CSV reader which can't call .strip() on NoneType if there are less cells in the row compared to the header <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: I have a CSV file as followed ``` headerA,headerB,headerC v1A,v1B,v1C, v2A,v2B v3A,v3B,v3C ``` In this case, row 2 is missing a value, which results in reading a None type. The strip() method can not be called on None, hence raising. In this PR I am making the change to only call strip if the value if not None. - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-12 23:51:39 -08:00
glad4enkonm	339973db47	Update ollama.py (#12895 ) duplicate option removed Description: An issue fix, http stop option duplicate removed. Issue: the issue #12892 fix Dependencies: no Tag maintainer: @eyurtsev --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-12 23:43:59 -08:00
Isak Nyberg	8f81703d76	Add new models to openai callback (#13244 ) Description: Adding the new models to the openai callback function, info taken from [model announcement](https://platform.openai.com/docs/models) and [pricing](https://openai.com/pricing) A short description for a short PR :)	2023-11-12 12:01:19 -08:00
Bagatur	ea6dd3a550	bump 335 (#13261 )	2023-11-12 11:30:25 -08:00
William FH	a837b03e55	Update langsmith version 0.63 (#13208 )	2023-11-12 11:29:25 -08:00
Harrison Chase	7f1d26160d	update tools (#13243 )	2023-11-12 10:22:54 -08:00
Nuno Campos	8d6faf5665	Make it easier to subclass runnable binding with custom init args (#13189 )	2023-11-11 09:01:17 +00:00
Peter Vandenabeele	7f1964b264	Fix BeautifulSoupTransformer: no more duplicates and correct order of tags + tests (#12596 )	2023-11-11 08:56:37 +00:00
Erick Friis	9c7afa8adb	Upgrade cohere embedding model to v3 (#13219 ) Just updates API docs, doesn't change default param from 2.0 (could be breaking change)	2023-11-10 16:25:58 -08:00
Erick Friis	8fdf15c023	Fix Document Loader Unit Test - Docusaurus (#13228 )	2023-11-10 14:52:01 -08:00
Lee	72ad448daa	feat: Docusaurus Loader (#9138 ) Added a Docusaurus Loader Issue: #6353 I had to implement this for working with the Ionic documentation, and wanted to open this up as a draft to get some guidance on building this out further. I wasn't sure if having it be a light extension of the SitemapLoader was in the spirit of a proper feature for the library -- but I'm grateful for the opportunities Langchain has given me and I'd love to build this out properly for the sake of the community. Any feedback welcome!	2023-11-10 14:21:55 -08:00
Tomaz Bratanic	0dc4ab0be1	Neo4j chat message history (#13008 )	2023-11-10 11:53:34 -08:00
fyasla	d266b3ea4a	issue #12165 mask API key in chat_models/azureml_endpoint module (#12836 ) - Description: `AzureMLChatOnlineEndpoint` object from langchain/chat_models/azureml_endpoint.py safe to print without having any secrets included in raw format in the string representation. - Issue: #12165, - Tag maintainer: @eyurtsev --------- Co-authored-by: Faysal Bougamale <faysal.bougamale@horiba.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-10 14:05:19 -05:00
Anush	52f34de9b7	feat: FastEmbed embedding provider (#13109 ) ## Description: This PR intends to add [Qdrant/FastEmbed](https://qdrant.github.io/fastembed/) as a local embeddings provider, associated tests and documentation. Documentation preview: https://langchain-git-fork-anush008-master-langchain.vercel.app/docs/integrations/text_embedding/fastembed --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-10 13:51:52 -05:00
Eugene Yurtsev	b0e8cbe0b3	Add RunnableSequence documentation (#13094 ) Add RunnableSequence documentation	2023-11-10 13:44:43 -05:00
Eugene Yurtsev	869df62736	Document RunnableWithFallbacks (#13088 ) Add documentation to RunnableWithFallbacks	2023-11-10 13:16:21 -05:00
Eugene Yurtsev	8313c218da	Add more runnable documentation (#13083 ) - Adding documentation to the runnable. - Documentation is not organized in the best way for the runnable; i.e., in terms of LCEL vs. other standard methods, will follow up with more edits.	2023-11-10 13:14:57 -05:00
Bagatur	24386e0860	bump 334, exp 40 (#13211 )	2023-11-10 09:43:29 -08:00
Lance Martin	d2e50b3108	Add Chroma multimodal cookbook (#12952 ) Pending: * https://github.com/chroma-core/chroma/pull/1294 * https://github.com/chroma-core/chroma/pull/1293 --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-10 09:43:10 -08:00
The1Bill	55912868da	Update toolkit.py to remove single quotes around table names (#12445 ) Description: Removing the single quote wrapper around the table names in the SQL agent toolkit.py file as it misleads the LLM into querying against tables with single quotes around their names. Issue: #7457 Dependencies: None Tag maintainer: @hwchase17 Twitter handle: None	2023-11-10 06:39:15 -08:00
Nuno Campos	362a446999	Changes to root listener (#12174 ) - Implement config_specs to include session_id - Remove Runnable method and update notebook - Add more details to notebook, eg. show input schema and config schema before and after adding message history --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-10 09:53:48 +00:00
Nuno Campos	b2b94424db	Update return type for Runnable.__or__ (#12880 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-10 09:52:38 +00:00
Harrison Chase	0a2b1c7471	improve duck duck go tool (#13165 )	2023-11-09 20:49:39 -08:00
Shinya Maeda	28cc60b347	Fix langchain.llms OpenAI completion doesn't work due to v1 client update (#13099 ) This commit fixes the issue that langchain.llms OpenAI completion stopped working since the V1 openai client update. Replace this entire comment with: - Description: This PR fixes the issue [AttributeError: module 'openai' has no attribute 'Completion'](https://github.com/langchain-ai/langchain/issues/12967) similar to `8e0cb2eb84` and https://github.com/langchain-ai/langchain/pull/12969, - Issue: https://github.com/langchain-ai/langchain/issues/12967, - Dependencies: `openai` v1.x.x client, - Tag maintainer: @baskaryan, - Twitter handle: @dosuken123 Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-09 15:12:19 -08:00
Bagatur	ff43cd6701	OpenAI remove httpx typing (#13154 ) Addresses #13124	2023-11-09 14:32:09 -08:00
Bagatur	8b2a82b5ce	Bagatur/docs smith context (#13139 )	2023-11-09 10:22:49 -08:00
Bagatur	f04cc4b7e1	bump 333 (#13131 )	2023-11-09 07:33:15 -08:00
billytrend-cohere	b346d4a455	Add message to documents (#12552 ) This adds the response message as a document to the rag retriever so users can choose to use this. Also drops document limit. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-09 07:30:48 -08:00
Harrison Chase	5f38770161	Support oai tool call (#13110 ) Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-09 07:29:29 -08:00
Holt Skinner	0fc8fd12bd	feat: Vertex AI Search - Add Snippet Retrieval for Non-Advanced Website Data Stores (#13020 ) https://cloud.google.com/generative-ai-app-builder/docs/snippets#snippets --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-08 21:52:50 -05:00
Jacob Lee	76283e9625	Adds embeddings filter option to return scores in state (#12489 ) CC @baskaryan @assafelovic	2023-11-08 17:50:06 -08:00
jakerachleff	18601bd4c8	Get project from langchain sdk (#13100 ) ## Description We need to centralize the API we use to get the project name for our tracers. This PR makes it so we always get this from a shared function in the langsmith sdk. ## Dependencies Upgraded langsmith from 0.52 to 0.62 to include the new API `get_tracer_project`	2023-11-08 17:10:12 -08:00
Bagatur	72e12f6bcf	update more azure docs (#13093 )	2023-11-08 14:11:16 -08:00
Bagatur	1703f132c6	update azure embedding docs (#13091 )	2023-11-08 13:39:31 -08:00
Bagatur	9fdfac22c2	bump 332 (#13089 )	2023-11-08 13:23:16 -08:00
Bagatur	1f85ec34d5	bump 331rc3 exp 39 (#13086 )	2023-11-08 13:00:13 -08:00
Anton Troynikov	9f077270c8	Don't pass EF to chroma (#13085 ) - Description: Recently Chroma rolled out a breaking change on the way we handle embedding functions, in order to support multi-modal collections. This broke the way LangChain's `Chroma` objects get created, because we were passing the EF down into the Chroma collection: https://docs.trychroma.com/migration#migration-to-0416---november-7-2023 However, internally, we are never actually using embeddings on the chroma collection - LangChain's `Chroma` object calls it instead. Thus we just don't pass an `embedding_function` to Chroma itself, which fixes the issue.	2023-11-08 12:55:35 -08:00
Erick Friis	f15f8e01cf	Azure OpenAI Embeddings (#13039 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-08 12:37:17 -08:00
David Peterson	37561d8986	Add Proper Import Error (#13042 ) - Description: The issue was not listing the proper import error for amazon textract loader. - Issue: Time wasted trying to figure out what to install... (langchain docs don't list the dependency either) - Dependencies: N/A - Tag maintainer: @sbusso - Twitter handle: @h9ste --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-08 10:29:08 -08:00
Eugene Yurtsev	06c503f672	Add RunnableRetry Documentation (#13074 )	2023-11-08 18:20:18 +00:00
Bagatur	55aeff6777	oai assistant multiple actions (#13068 )	2023-11-08 08:25:37 -08:00
Erick Friis	a9b70baef9	cli updates, 0.0.16 (#13034 ) - confirm flags, serve detection - 0.0.16 - always gen code - pip bool	2023-11-08 07:47:30 -08:00
Erick Friis	506f81563f	Update Deps in Experimental (#13029 )	2023-11-07 15:15:09 -08:00
Stefano Lottini	4f4b020582	Add "Astra DB" vector store integration (#12966 ) # Astra DB Vector store integration - Description: This PR adds a `VectorStore` implementation for DataStax Astra DB using its HTTP API - Issue: (no related issue) - Dependencies: A new required dependency is `astrapy` (`>=0.5.3`) which was added to pyptoject.toml, optional, as per guidelines - Tag maintainer: I recently mentioned to @baskaryan this integration was coming - Twitter handle: `@rsprrs` if you want to mention me This PR introduces the `AstraDB` vector store class, extensive integration test coverage, a reworking of the documentation which conflates Cassandra and Astra DB on a single "provider" page and a new, completely reworked vector-store example notebook (common to the Cassandra store, since parts of the flow is shared by the two APIs). I also took care in ensuring docs (and redirects therein) are behaving correctly. All style, linting, typechecks and tests pass as far as the `AstraDB` integration is concerned. I could build the documentation and check it all right (but ran into trouble with the `api_docs_build` makefile target which I could not verify: `Error: Unable to import module 'plan_and_execute.agent_executor' with error: No module named 'langchain_experimental'` was the first of many similar errors) Thank you for a review! Stefano --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-07 14:45:33 -08:00
Yang, Bo	600caff03c	Add `Memorize` tool (#11722 ) - Description: Add `Memorize` tool - Tag maintainer: @hwchase17 This PR added a new tool `Memorize` so that an agent can use it to fine-tune itself. This tool requires `TrainableLLM` introduced in #11721 DEMO: `6a9003d5db` ![image](https://github.com/langchain-ai/langchain/assets/601530/d6f0cb45-54df-4dcf-b143-f8aefb1e76e3)	2023-11-07 12:42:10 -08:00
Bagatur	cf481c9418	bump exp 38 (#13016 )	2023-11-07 11:49:23 -08:00
Bagatur	57e19989f6	Bagatur/oai assistant (#13010 )	2023-11-07 11:44:53 -08:00
Erick Friis	74134dd7e1	cli pyproject updating (#12945 ) `langchain app add` and `langchain app remove` will now keep the dependencies list updated. --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-07 11:06:08 -08:00
Bagatur	6175dc30aa	bump 331rc2 (#13006 )	2023-11-07 08:52:17 -08:00
Erick Friis	0c81cd923e	oai v1 embeddings (#12969 ) Initial PR to get OpenAIEmbeddings working with the new sdk fyi @rlancemartin Fixes #12943 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-06 18:52:33 -08:00
Bagatur	fdbb45d79e	bump 331rc1 (#12965 )	2023-11-06 15:36:43 -08:00
Bagatur	3bb8030a6e	fix max_tokens (#12964 )	2023-11-06 15:36:05 -08:00
Bagatur	a9002a82b8	bump 331rc0 (#12963 )	2023-11-06 15:19:33 -08:00
Harrison Chase	c27400efeb	Support multimodal messages (#11320 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-06 15:14:18 -08:00
Bagatur	4f7dff9d66	Record system fingerprint chat openai (#12960 )	2023-11-06 14:25:53 -08:00
Bagatur	8e0cb2eb84	ChatOpenAI and AzureChatOpenAI openai>=1 compatible (#12948 )	2023-11-06 13:24:18 -08:00
Kacper Łukawski	52d0055a91	Add support of Cohere Embed v3 (#12940 ) Cohere released the new embedding API (Embed v3: https://txt.cohere.com/introducing-embed-v3/) that treats document and query embeddings differently. This PR updated the `CohereEmbeddings` to use them appropriately. It also works with the old models.	2023-11-06 15:06:58 -05:00
Praveen Venkateswaran	8e0dcb37d2	Add SecretStr for Symbl.ai Nebula API (#12896 ) Description: This PR masks API key secrets for the Nebula model from Symbl.ai Issue: #12165 Maintainer: @eyurtsev --------- Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>	2023-11-06 14:13:59 -05:00
Vinzenz Klass	59d0bd2150	feat: acquire advisory lock before creating extension in pgvector (#12935 ) - Description: Acquire advisory lock before attempting to create extension on postgres server, preventing errors in concurrent executions. - Issue: #12933 - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-06 14:00:39 -05:00
Eugene Yurtsev	b376854b26	Fix for anyscale chat model api key (#12938 ) * ChatAnyscale was missing coercion to SecretStr for anyscale api key * The model inherits from ChatOpenAI so it should not force the openai api key to be secret str until openai model has the same changes https://github.com/langchain-ai/langchain/issues/12841	2023-11-06 13:28:02 -05:00
hmasdev	622bf12c2e	fix regex pattern of structured output parser (#12929 ) - Description: fix the regex pattern of [StructuredChatOutputParser](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/structured_chat/output_parser.py#L18) and add unit tests for the code change. - Issue: #12158 #12922 - Dependencies: None - Tag maintainer: - Twitter handle: @hmdev3 - NOTE: This PR conflicts #7495 . After #7495 is merged, I am going to update PR.	2023-11-06 07:53:14 -08:00
wemysschen	8d7144e6a6	fix baiducloud directory loader import file loader (#12924 ) Issue: fix baiducloud BOS directory loader imports its file loader --------- Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>	2023-11-06 07:52:31 -08:00
Kacper Łukawski	621419f71e	Fix normalizing the cosine distance in Qdrant (#12934 ) Qdrant was incorrectly calculating the cosine similarity and returning `0.0` for the best match, instead of `1.0`. Internally Qdrant returns a cosine score from `-1.0` (worst match) to `1.0` (best match), and the current formula reflects it.	2023-11-06 07:36:59 -08:00
Hech	8fe6bcc662	Fix return metadata when searching for DingoDB (#12937 )	2023-11-06 07:35:36 -08:00
Jakub Novák	ada3d2cbd1	Add possibility to pass on_artifacts for a specific conversation (#12687 ) Possibility to pass on_artifacts to a conversation. It can be then achieved by adding this way: ```python result = agent.run( input=message.text, metadata={ "on_artifact": CALLBACK_FUNCTION }, ) ```	2023-11-06 07:29:47 -08:00
Bagatur	53f453f01a	bump 331 (#12932 )	2023-11-06 05:58:12 -08:00
Erick Friis	5000c7308e	cli template gitignores (#12914 ) - ap gitignore - package	2023-11-05 22:34:45 -08:00
Harrison Chase	aba407f774	use keys not items (#12918 )	2023-11-05 22:08:29 -08:00
wemysschen	e14aa37d59	fix bes vector store search (#12828 ) Issue: fix search body in baidu cloud vectorsearch --------- Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>	2023-11-03 15:39:19 -07:00
Lance Martin	ea1ab391d4	Open Clip multimodal embeddings (#12754 )	2023-11-03 13:33:36 -07:00
Bagatur	ebee616822	bump 330 (#12853 )	2023-11-03 13:26:41 -07:00
Erick Friis	6c237716c4	Update readmes with new cli install (#12847 ) Old command still works. Just simplifying. Merge after releasing CLI 0.0.15	2023-11-03 12:10:32 -07:00
Erick Friis	7db49d3842	Confirm sys.path includes current dir for app serve (#12851 ) - Make sure sys.path is set properly for langchain app serve - bump	2023-11-03 11:37:20 -07:00
Erick Friis	1bc35f61cb	CLI 0.0.14, Uvicorn update and no more [serve] (#12845 ) Calls uvicorn directly from cli: Reload works if you define app by import string instead of object. (was doing subprocess in order to get reloading) Version bump to 0.0.14 Remove the need for [serve] for simplicity. Readmes are updated in #12847 to avoid cluttering this PR	2023-11-03 11:05:52 -07:00
William FH	18005c6384	Disable trace_on_chain_group auto-tracing (#12807 ) Previously we treated trace_on_chain_group as a command to always start tracing. This is unintuitive (makes the function do 2 things), and makes it harder to toggle tracing	2023-11-03 10:05:09 -07:00
Erick Friis	0da75b9ebd	Autopopulate module name in cli init (#12814 )	2023-11-02 23:45:38 -07:00
William FH	98aff29fbd	Add Dataset Page to printout (#12816 )	2023-11-02 20:36:56 -07:00
Manuel Rech	2e2b9c76d9	Keep also original query - multi_query.py (#12696 ) When you use a MultiQuery it might be useful to use the original query as well as the newly generated ones to maximise the changes to retriever the correct document. I haven't created an issue, it seems a very small and easy thing. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 18:15:02 -07:00
Bagatur	658a3a8607	FEAT: Merge TileDB vecstore (#12811 )	2023-11-02 17:40:32 -07:00
Akio Nishimura	c04647bb4e	Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` (#12713 ) - Description: Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` in case `max_concurrency` is not None. - Issue: #12643 - Twitter handle: @akionux --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:28:48 -07:00
James Braza	88b506b321	Adds missing `urllib.parse` for IDE warning of `PubMedAPIWrapper` (#12808 ) Resolves an IDE (PyCharm 2023.2.3 PE) warning around `urllib.parse.quote`, also enabling CTRL-click	2023-11-02 17:27:25 -07:00
Bagatur	a2bb0dd445	TileDB update import unit tests	2023-11-02 17:24:22 -07:00
Nikos Papailiou	2fdaa1e5fd	Add TileDB vectorstore implementation (#12624 ) - Description: Add [TileDB](https://tiledb.com) vectorstore implementation. TileDB offers ANN search capabilities using the [TileDB-Vector-Search](https://github.com/TileDB-Inc/TileDB-Vector-Search) module. It provides serverless execution of ANN queries and storage of vector indexes both on local disk and cloud object stores (i.e. AWS S3). More details in: - [Why TileDB as a Vector Database](https://tiledb.com/blog/why-tiledb-as-a-vector-database) - [TileDB 101: Vector Search](https://tiledb.com/blog/tiledb-101-vector-search) - Twitter handle: @tiledb	2023-11-02 17:21:03 -07:00
盐粒 Yanli	1b233798a0	feat: Supprt pgvecto.rs as a VectorStore (#12718 ) Supprt [pgvecto.rs](https://github.com/tensorchord/pgvecto.rs) as a new VectorStore type. This introduces a new dependency [pgvecto_rs](https://pypi.org/project/pgvecto_rs/) and upgrade SQLAlchemy to ^2. Relate to https://github.com/tensorchord/pgvecto.rs/issues/11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:16:04 -07:00
Daniel Chalef	0cbdba6a9b	zep: VectorStore: Use Native MMR (#12690 ) - refactor to use Zep's native MMR; update example - @baskaryan @eyurtsev	2023-11-02 16:45:42 -07:00
Daniel Chalef	cc3d3920e3	Zep: Summary Search and Example (#12686 ) Zep now has the ability to search over chat history summaries. This PR adds support for doing so. More here: https://blog.getzep.com/zep-v0-17/ @baskaryan @eyurtsev	2023-11-02 16:31:11 -07:00
Bagatur	526313002c	add import tests to all modules (#12806 )	2023-11-02 15:32:55 -07:00
Harrison Chase	6609a6033f	fix vectorstore imports (#12804 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 15:32:31 -07:00
Nuno Campos	f66a9d2adf	Automatically add configurable key to config_schema if config_specs i… (#12798 ) …s present <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-02 21:46:15 +00:00
Praveen Venkateswaran	21eeba075c	enable the device_map parameter in huggingface pipeline (#12731 ) ### Enabling `device_map` in HuggingFacePipeline For multi-gpu settings with large models, the [accelerate](https://huggingface.co/docs/accelerate/usage_guides/big_modeling#using--accelerate) library provides the `device_map` parameter to automatically distribute the model across GPUs / disk. The [Transformers pipeline](`3520e37e86/src/transformers/pipelines/__init__.py (L543)`) enables users to specify `device` (or) `device_map`, and handles cases (with warnings) when both are specified. However, Langchain's HuggingFacePipeline only supports specifying `device` when calling transformers which limits large models and multi-gpu use-cases. Additionally, the [default value](`8bd3ce59cd/libs/langchain/langchain/llms/huggingface_pipeline.py (L72)`) of `device` is initialized to `-1` , which is incompatible with the transformers pipeline when `device_map` is specified. This PR addresses the addition of `device_map` as a parameter , and solves the incompatibility of `device = -1` when `device_map` is also specified. An additional test has been added for this feature. Additionally, some existing tests no longer work since 1. `max_new_tokens` has to be specified under `pipeline_kwargs` and not `model_kwargs` 2. The GPT2 tokenizer raises a `ValueError: Pipeline with tokenizer without pad_token cannot do batching`, since the `tokenizer.pad_token` is `None` ([related issue](https://github.com/huggingface/transformers/issues/19853) on the transformers repo). This PR handles fixing these tests as well. Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>	2023-11-02 14:29:06 -07:00
Mark Bell	3276aa3e17	__getattr__ should rase AttributeError not ImportError on missing attributes (#12801 ) [The python spec](https://docs.python.org/3/reference/datamodel.html#object.__getattr__) requires that `__getattr__` throw `AttributeError` for missing attributes but there are several places throwing `ImportError` in the current code base. This causes a specific problem with `hasattr` since it calls `__getattr__` then looks only for `AttributeError` exceptions. At present, calling `hasattr` on any of these modules will raise an unexpected exception that most code will not handle as `hasattr` throwing exceptions is not expected. In our case this is triggered by an exception tracker (Airbrake) that attempts to collect the version of all installed modules with code that looks like: `if hasattr(mod, "__version__"):`. With `HEAD` this is causing our exception tracker to fail on all exceptions. I only changed instances of unknown attributes raising `ImportError` and left instances of known attributes raising `ImportError`. It feels a little weird but doesn't seem to break anything.	2023-11-02 17:08:54 -04:00
Illia	71d1a48b66	Use data from all Google search results in SerpApi.com wrapper (#12770 ) - Description: Use all Google search results data in SerpApi.com wrapper instead of the first one only - Tag maintainer: @hwchase17 _P.S. `libs/langchain/tests/integration_tests/utilities/test_serpapi.py` are not executed during the `make test`._	2023-11-02 13:31:27 -07:00
Nuno Campos	c4fdf78d03	Fix AddableDict raising exception when used with non-addable values (#12785 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-02 18:56:29 +00:00
Erick Friis	49e283a0cd	CLI 0.0.13, Configurable Template Demo (#12796 )	2023-11-02 11:42:57 -07:00
Nuno Campos	d1c6ad7769	Fix on_llm_new_token(chunk=) for some chat models (#12784 ) It was passing in message instead of generation <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-02 16:33:44 +00:00
Erick Friis	070823f294	CLI 0.0.12 (#12787 )	2023-11-02 08:29:27 -07:00
Bagatur	979501c0ca	bump 329 (#12778 )	2023-11-02 06:02:43 -07:00
Erick Friis	da821320d3	Fixes 'Nonetype' not iterable for ObsidianLoader (#12751 ) Implements #12726 from @Di3mex	2023-11-01 16:07:09 -07:00
Eugene Yurtsev	b1caae62fd	APIChain add restrictions to domains (CVE-2023-32786) (#12747 ) * Restrict the chain to specific domains by default * This is a breaking change, but it will fail loudly upon object instantiation -- so there should be no silent errors for users * Resolves CVE-2023-32786	2023-11-01 18:50:34 -04:00
Erick Friis	4421ba46d7	Demo Server, Fix Timescale (#12746 ) - improve demo server - missing deps	2023-11-01 15:29:34 -07:00
Eugene Yurtsev	0e1aedb9f4	Use jinja2 sandboxing by default (#12733 ) * This is an opt-in feature, so users should be aware of risks if using jinja2. * Regardless we'll add sandboxing by default to jinja2 templates -- this sandboxing is a best effort basis. * Best strategy is still to make sure that jinja2 templates are only loaded from trusted sources.	2023-11-01 14:54:01 -07:00
Erick Friis	14340ee7cd	use http.client instead of urllib3 (#12660 ) dep problems with requests cloudflare debugging not worth it with urllib	2023-11-01 11:15:05 -07:00
Bagatur	eee5181b7a	bump 328, exp 37 (#12722 )	2023-11-01 10:27:39 -07:00
Erick Friis	3405dbbc64	dash not underscore (#12716 ) template names are auto-populating with the wrong convention (with underscores)	2023-11-01 09:48:37 -07:00
123-fake-st	8bd3ce59cd	PyPDFLoader use url in metadata source if file is a web path (#12092 ) Description: Update `langchain.document_loaders.pdf.PyPDFLoader` to store url in metadata (instead of a temporary file path) if user provides a web path to a pdf - Issue: Related to #7034; the reporter on that issue submitted a PR updating `PyMuPDFParser` for this behavior, but it has unresolved merge issues as of 20 Oct 2023 #7077 - In addition to `PyPDFLoader` and `PyMuPDFParser`, these other classes in `langchain.document_loaders.pdf` exhibit similar behavior and could benefit from an update: `PyPDFium2Loader`, `PDFMinerLoader`, `PDFMinerPDFasHTMLLoader`, `PDFPlumberLoader` (I'm happy to contribute to some/all of that, including assisting with `PyMuPDFParser`, if my work is agreeable) - The root cause is that the underlying pdf parser classes, e.g. `langchain.document_loaders.parsers.pdf.PyPDFParser`, never receive information about the url; the parsers receive a `langchain.document_loaders.blob_loaders.blob`, which contains the pdf contents and local file path, but not the url - This update passes the web path directly to the parser since it's minimally invasive and doesn't require further changes to maintain existing behavior for local files... bigger picture, I'd consider extending `blob` so that extra information like this can be communicated, but that has much bigger implications on the codebase which I think warrants maintainer input - Dependencies: None ```python # old behavior >>> from langchain.document_loaders import PyPDFLoader >>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': '/var/folders/w2/zx77z1cs01s1thx5dhshkd58h3jtrv/T/tmpfgrorsi5/tmp.pdf', 'page': 0} # new behavior >>> from langchain.document_loaders import PyPDFLoader >>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0} ```	2023-11-01 11:27:00 -04:00
Dave Kwon	b1954aab13	feat: Add page metadata on PDFMinerLoader (#12277 ) - Description: #12273 's suggestion PR Like other PDFLoader, loading pdf per each page and giving page metadata. - Issue: #12273 - Twitter handle: @blue0_0hope --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-01 11:25:37 -04:00
Duda Nogueira	7148f3e1fe	Weaviate - Fix schema existence check (#12711 ) This will allow you create the schema beforehand. The check was failing and preventing importing into existing classes. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-11-01 08:22:15 -07:00
Aidos Kanapyanov	ae63c186af	Mask API key for Anyscale LLM (#12406 ) Description: Add masking of API Key for Anyscale LLM when printed. Issue: #12165 Dependencies: None Tag maintainer: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-01 10:22:26 -04:00
Predrag Gruevski	5ae51a8a85	Fix typo highlighted by `ruff` autoformatter. (#12691 ) H/t @MichaReiser for spotting it: https://github.com/langchain-ai/langchain/pull/12585/files#r1378253045	2023-10-31 22:16:06 -04:00
Erick Friis	44c8b159b9	properly increment version in cli (#12685 ) Went from 0.0.9 -> 0.0.11 without releasing. Back to 10, then release.	2023-10-31 17:27:43 -07:00
Leonid Ganeline	ddcec005bc	fix for `YahooFinanceNewsTool` (#12665 ) Added YahooFinanceNewsTool to the __init__.py It was missed here.	2023-10-31 14:58:09 -07:00
Predrag Gruevski	01a3c9b94e	Use an in-project virtualenv in the CLI package. (#12678 ) Keeping it in sync with how our other packages are configured.	2023-10-31 14:51:24 -07:00
Jacob Lee	bd668fcea1	Adds version CLI command (#12619 ) Will be automatically bumped with `poetry version patch`. @efriis @hwchase17 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-31 14:50:04 -07:00
Frank	bf5805bb32	Add quip loader (#12259 ) - Description: implement [quip](https://quip.com) loader - Issue: https://github.com/langchain-ai/langchain/issues/10352 - Dependencies: No - pass make format, make lint, make test --------- Co-authored-by: Hao Fan <h_fan@apple.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-31 14:11:24 -07:00
Roman Vasilyev	c9a6940d58	PGVector fix (#12592 ) latest release broken, this fixes it --------- Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-31 17:01:15 -04:00
Predrag Gruevski	e8b99364b3	Use `ruff` for both linting and formatting in `langchain-cli`. (#12672 ) Prior to this PR, `ruff` was used only for linting and not for formatting, despite the names of the commands. This PR makes it be used for both linting code and autoformatting it.	2023-10-31 13:52:25 -07:00
Margaret Qian	acfc485808	Update MosaicML Embedding Input Key (#12657 ) This input key was missed in the last update PR: https://github.com/langchain-ai/langchain/pull/7391 The input/output formats are intended to be like this: ``` {"inputs": [<prompt>]} {"outputs": [<output_text>]} ```	2023-10-31 14:43:30 -04:00
Predrag Gruevski	c871cc5055	Remove `print()` statements which seemed leftover from debugging. (#12648 ) Added in #12159 presumably during debugging. Right now they cause a bit of visual noise.	2023-10-31 13:45:48 -04:00
Noam Gat	14e8c74736	LM Format Enforcer Integration + Sample Notebook (#12625 ) ## Description This PR adds support for [lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) to LangChain. ![image](https://raw.githubusercontent.com/noamgat/lm-format-enforcer/main/docs/Intro.webp) The library is similar to jsonformer / RELLM which are supported in Langchain, but has several advantages such as - Batching and Beam search support - More complete JSON Schema support - LLM has control over whitespace, improving quality - Better runtime performance due to only calling the LLM's generate() function once per generate() call. The integration is loosely based on the jsonformer integration in terms of project structure. ## Dependencies No compile-time dependency was added, but if `lm-format-enforcer` is not installed, a runtime error will occur if it is trying to be used. ## Tests Due to the integration modifying the internal parameters of the underlying huggingface transformer LLM, it is not possible to test without building a real LM, which requires internet access. So, similar to the jsonformer and RELLM integrations, the testing is via the notebook. ## Twitter Handle [@noamgat](https://twitter.com/noamgat) Looking forward to hearing feedback! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-31 09:49:01 -07:00
Erick Friis	7f6e751a3d	template updates (#12646 )	2023-10-31 09:13:58 -07:00
Predrag Gruevski	f94e24dfd7	Install and use `ruff format` instead of black for code formatting. (#12585 ) Best to review one commit at a time, since two of the commits are 100% autogenerated changes from running `ruff format`: - Install and use `ruff format` instead of black for code formatting. - Output of `ruff format .` in the `langchain` package. - Use `ruff format` in experimental package. - Format changes in experimental package by `ruff format`. - Manual formatting fixes to make `ruff .` pass.	2023-10-31 10:53:12 -04:00
William FH	bfd719f9d8	bind_functions convenience method (#12518 ) I always take 20-30 seconds to re-discover where the `convert_to_openai_function` wrapper lives in our codebase. Chat langchain [has no clue](https://smith.langchain.com/public/3989d687-18c7-4108-958e-96e88803da86/r) what to do either. There's the older `create_openai_fn_chain` , but we haven't been recommending it in LCEL. The example we show in the [cookbook](https://python.langchain.com/docs/expression_language/how_to/binding#attaching-openai-functions) is really verbose. General function calling should be as simple as possible to do, so this seems a bit more ergonomic to me (feel free to disagree). Another option would be to directly coerce directly in the class's init (or when calling invoke), if provided. I'm not 100% set against that. That approach may be too easy but not simple. This PR feels like a decent compromise between simple and easy. ``` from enum import Enum from typing import Optional from pydantic import BaseModel, Field class Category(str, Enum): """The category of the issue.""" bug = "bug" nit = "nit" improvement = "improvement" other = "other" class IssueClassification(BaseModel): """Classify an issue.""" category: Category other_description: Optional[str] = Field( description="If classified as 'other', the suggested other category" ) from langchain.chat_models import ChatOpenAI llm = ChatOpenAI().bind_functions([IssueClassification]) llm.invoke("This PR adds a convenience wrapper to the bind argument") # AIMessage(content='', additional_kwargs={'function_call': {'name': 'IssueClassification', 'arguments': '{\n "category": "improvement"\n}'}}) ```	2023-10-31 07:15:37 -07:00
Nuno Campos	3143324984	Improve Runnable type inference for input_schemas (#12630 ) - Prefer lambda type annotations over inferred dict schema - For sequences that start with RunnableAssign infer seq input type as "input type of 2nd item in sequence - output type of runnable assign"	2023-10-31 13:22:54 +00:00
Nuno Campos	2f563cee20	Add Runnable.with_listeners() (#12549 ) - This binds start/end/error listeners to a runnable, which will be called with the Run object	2023-10-31 11:04:51 +00:00
Bagatur	bcc62d63be	bump 327 (#12623 )	2023-10-31 02:18:08 -07:00
Erick Friis	a1fae1fddd	Readme rewrite (#12615 ) Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-31 00:06:02 -07:00
Yujie Qian	1dbb77d7db	VoyageEmbeddings (#12608 ) - Description: Integrate VoyageEmbeddings into LangChain, with tests and docs - Issue: N/A - Dependencies: N/A - Tag maintainer: N/A - Twitter handle: @Voyage_AI_ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:37:43 -07:00
chocolate4	92bf40a921	Add a new vector store hippo for langchain #11763 (#12412 ) #11763 --------- Co-authored-by: TranswarpHippo <hippo.0.assistant@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:35:23 -07:00
Karthik Raja A	342d6c7ab6	Multi on client toolkit (#12392 ) Replace this entire comment with: -Add MultiOn close function and update key value and add async functionality - solved the key value TabId not found.. (updated to use latest key value) @hwchase17	2023-10-30 18:34:56 -07:00
Prabin Nepal	b109cb031b	SecretStr for fireworks api (#12475 ) - Description: This pull request removes secrets present in raw format, - Issue: Fireworks api key was exposed when printing out the langchain object [#12165](https://github.com/langchain-ai/langchain/issues/12165) - Maintainer: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:17:53 -07:00
Harrison Chase	a32c236c64	bump cli to 009 (#12611 )	2023-10-30 18:12:08 -07:00
Martin Schade	0c7f1d8b21	Textract linearizer (#12446 ) Description: Textract PDF Loader generating linearized output, meaning it will replicate the structure of the source document as close as possible based on the features passed into the call (e. g. LAYOUT, FORMS, TABLES). With LAYOUT reading order for multi-column documents or identification of lists and figures is supported and with TABLES it will generate the table structure as well. FORMS will indicate "key: value" with columms. - Issue: the issue fixes #12068 - Dependencies: amazon-textract-textractor is added, which provides the linearization - Tag maintainer: @3coins --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:02:10 -07:00
Erick Friis	f39246bd7e	cli should pull instead of delete+clone (#12607 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-30 16:44:09 -07:00
Harrison Chase	8b5e879171	add a template for the package readme (#12499 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-30 16:39:39 -07:00
Bagatur	9bedda50f2	Bagatur/lakefs loader2 (#12524 ) Co-authored-by: Jonathan Rosenberg <96974219+Jonathan-Rosenberg@users.noreply.github.com>	2023-10-30 16:30:27 -07:00
Ackermann Yuriy	99b69fe607	Fixed missing optional tags. Added default key value for Ollama (#12599 ) Added missing Optional typings. Added default values for Ollama optional keys. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 16:30:10 -07:00
Bagatur	016813d189	factor out to_secret (#12593 )	2023-10-30 15:10:25 -07:00
hsuyuming	630ae24b28	implement get_num_tokens to use google's count_tokens function (#10565 ) can get the correct token count instead of using gpt-2 model Description: Implement get_num_tokens within VertexLLM to use google's count_tokens function. (https://cloud.google.com/vertex-ai/docs/generative-ai/get-token-count). So we don't need to download gpt-2 model from huggingface, also when we do the mapreduce chain we can get correct token count. Tag maintainer: @lkuligin Twitter handle: My twitter: @abehsu1992626 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 15:10:05 -07:00
Pham Vu Thai Minh	33e77a1007	Async support for FAISS (#11333 ) Following this tutoral about using OpenAI Embeddings with FAISS https://python.langchain.com/docs/integrations/vectorstores/faiss ```python from langchain.embeddings.openai import OpenAIEmbeddings from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import FAISS from langchain.document_loaders import TextLoader from langchain.document_loaders import TextLoader loader = TextLoader("../../../extras/modules/state_of_the_union.txt") documents = loader.load() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) docs = text_splitter.split_documents(documents) embeddings = OpenAIEmbeddings() ``` This works fine ```python db = FAISS.from_documents(docs, embeddings) query = "What did the president say about Ketanji Brown Jackson" docs = db.similarity_search(query) ``` But the async version is not ```python db = await FAISS.afrom_documents(docs, embeddings) # NotImplementedError query = "What did the president say about Ketanji Brown Jackson" docs = await db.asimilarity_search(query) # this will use await asyncio.get_event_loop().run_in_executor under the hood and will not call OpenAIEmbeddings.aembed_query but call OpenAIEmbeddings.embed_query ``` So this PR add async/await supports for FAISS --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-30 15:08:53 -07:00
Jeff Zhuo	13b89815a3	Issue: fix the issue #11648 init minimax llm (#12554 ) e https://github.com/langchain-ai/langchain/issues/11648 Minimax llm failed to initialize The idea of this fix is https://github.com/langchain-ai/langchain/issues/10917#issuecomment-1765606725 do not use underscore in python model class --------- Co-authored-by: zhuojianming@cmcm.com <zhuojianming@cmcm.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:30:17 -07:00
Florian Valeye	bfb27324cb	[Matching Engine] Update the Matching Engine to include the distance and filters (#12555 ) Hello 👋, This Pull Request adds more capability to the [MatchingEngine](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.matching_engine.MatchingEngine.html) vectorstore of GCP. It includes the `similarity_search_by_vector_with_relevance_scores` function and also [filters](https://cloud.google.com/vertex-ai/docs/vector-search/filtering) to `filter` the namespaces when retrieving the results. - Description: Add [filter](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndexEndpoint#google_cloud_aiplatform_MatchingEngineIndexEndpoint_find_neighbors) in `similarity_search` and add `similarity_search_by_vector_with_relevance_scores` method - Dependencies: None - Tag maintainer: Unknown Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:12:59 -07:00
Harrison Chase	1d51363e49	change project template (#12493 )	2023-10-30 14:06:30 -07:00
Holt Skinner	e53b9ccd70	feat: Add Google Cloud Text-to-Speech Tool (#12572 ) - Add Tool for [Google Cloud Text-to-Speech](https://cloud.google.com/text-to-speech) - Follows similar structure to [Eleven Labs Text2Speech](https://python.langchain.com/docs/integrations/tools/eleven_labs_tts) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:05:39 -07:00
Adilkhan Sarsen	6e702b9c36	Deep memory support in LangChain (#12268 ) - Description: adding support to Activeloop's DeepMemory feature that boosts recall up to 25%. Added Jupyter notebook showcasing the feature and also made index params explicit. - Twitter handle: will really appreciate if we could announce this on twitter. --------- Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>	2023-10-30 12:16:14 -07:00
billytrend-cohere	b1e3843931	Add client_name="langchain" to Cohere usage (#11328 ) Hey, we're looking to invest more in adding cohere integrations to langchain so would love to get more of an idea for how it's used. Hopefully this pr is acceptable. This week I'm also going to be looking into adding our new [retrieval augmented generation product](https://txt.cohere.com/chat-with-rag/) to langchain. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 11:20:55 -07:00
Bagatur	37aec1e050	bump 326 (#12569 )	2023-10-30 10:11:17 -07:00
Eugene Yurtsev	1b1a2d5740	Image Caption accepts bytes for images (#12561 ) Accept bytes for images in image caption --------- Co-authored-by: webcoderz <19884161+webcoderz@users.noreply.github.com>	2023-10-30 12:29:54 -04:00
Nuno Campos	7897483819	Allow astream_log to be used inside atrace_as_chain_group (#12558 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-30 15:55:16 +00:00
Holt Skinner	e05bb938de	Merge pull request #12433 * feat: Add Google Cloud Translation document transformer * Merge branch 'langchain-ai:master' into google-translate * Add documentation for Google Translate Document Transformer * Fix line length error * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Addressed code review comments * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Removed extra variable * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Removed extra import	2023-10-29 21:22:36 -04:00
Samad Koita	d1fdcd4fcb	Masking of API Key for GooseAI LLM (#12496 ) Description: Add masking of API Key for GooseAI LLM when printed. Issue: https://github.com/langchain-ai/langchain/issues/12165 Dependencies: None Tag maintainer: @eyurtsev --------- Co-authored-by: Samad Koita <>	2023-10-29 21:21:33 -04:00
Andrew Zhou	64c4a698a8	More comprehensive readthedocs document loader (#12382 ) ## Description: When building our own readthedocs.io scraper, we noticed a couple interesting things: 1. Text lines with a lot of nested <span> tags would give unclean text with a bunch of newlines. For example, for [Langchain's documentation](https://api.python.langchain.com/en/latest/document_loaders/langchain.document_loaders.readthedocs.ReadTheDocsLoader.html#langchain.document_loaders.readthedocs.ReadTheDocsLoader), a single line is represented in a complicated nested HTML structure, and the naive `soup.get_text()` call currently being made will create a newline for each nested HTML element. Therefore, the document loader would give a messy, newline-separated blob of text. This would be true in a lot of cases. <img width="945" alt="Screenshot 2023-10-26 at 6 15 39 PM" src="https://github.com/langchain-ai/langchain/assets/44193474/eca85d1f-d2bf-4487-a18a-e1e732fadf19"> <img width="1031" alt="Screenshot 2023-10-26 at 6 16 00 PM" src="https://github.com/langchain-ai/langchain/assets/44193474/035938a0-9892-4f6a-83cd-0d7b409b00a3"> Additionally, content from iframes, code from scripts, css from styles, etc. will be gotten if it's a subclass of the selector (which happens more often than you'd think). For example, [this page](https://pydeck.gl/gallery/contour_layer.html#) will scrape 1.5 million characters of content that looks like this: <img width="1372" alt="Screenshot 2023-10-26 at 6 32 55 PM" src="https://github.com/langchain-ai/langchain/assets/44193474/dbd89e39-9478-4a18-9e84-f0eb91954eac"> Therefore, I wrote a recursive _get_clean_text(soup) class function that 1. skips all irrelevant elements, and 2. only adds newlines when necessary. 2. Index pages (like [this one](https://api.python.langchain.com/en/latest/api_reference.html)) would be loaded, chunked, and eventually embedded. This is really bad not just because the user will be embedding irrelevant information - but because index pages are very likely to show up in retrieved content, making retrieval less effective (in our tests). Therefore, I added a bool parameter `exclude_index_pages` defaulted to False (which is the current behavior — although I'd petition to default this to True) that will skip all pages where links take up 50%+ of the page. Through manual testing, this seems to be the best threshold. ## Other Information: - Issue: n/a - Dependencies: n/a - Tag maintainer: n/a - Twitter handle: @andrewthezhou --------- Co-authored-by: Andrew Zhou <andrew@heykona.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-29 16:26:53 -07:00
Peter Vandenabeele	3468c038ba	Add unit tests for document_transformers/beautiful_soup_transformer.py (#12520 ) - Description: * Add unit tests for document_transformers/beautiful_soup_transformer.py * Basic functionality is tested (extract tags, remove tags, drop lines) * add a FIXME comment about the order of tags that is not preserved (and a passing test, but with the expected tags now out-of-order) - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin - Twitter handle: `peter_v` Please make sure your PR is passing linting and testing before submitting. => OK: I ran `make format`, `make test` (passing after install of beautifulsoup4) and `make lint`.	2023-10-29 16:24:47 -07:00
Anirudh Gautam	b257e6a4e8	Mask API key for AI21 LLM (#12418 ) - Description: Added masking of the API Key for AI21 LLM when printed and improved the docstring for AI21 LLM. - Updated the AI21 LLM to utilize SecretStr from pydantic to securely manage API key. - Made improvements in the docstring of AI21 LLM. It now mentions that the API key can also be passed as a named parameter to the constructor. - Added unit tests. - Issue: #12165 - Tag maintainer: @eyurtsev --------- Co-authored-by: Anirudh Gautam <anirudh@Anirudhs-Mac-mini.local>	2023-10-29 14:53:41 -07:00
silvhua	9dead1034c	`_dalle_image_url` returns list of urls if n>1 (#11800 ) - Description: Updated the `_dalle_image_url` method to return a list of URLs if self.n>1, - Issue: #10691, - Dependencies: unsure, - Tag maintainer: @eyurtsev, - Twitter handle: @silvhua --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-29 14:23:23 -07:00
Bagatur	1815ea2fdb	OpenAI runnable constructor (#12455 )	2023-10-29 13:40:30 -07:00
William FH	a830b809f3	Patch forward ref bug (#12508 ) Currently this gives a bug: ``` from langchain.schema.runnable import RunnableLambda bound = RunnableLambda(lambda x: x).with_config({"callbacks": []}) # ConfigError: field "callbacks" not yet prepared so type is still a ForwardRef, you might need to call RunnableConfig.update_forward_refs(). ``` Rather than deal with cyclic imports and extra load time, etc., I think it makes sense to just have a separate Callbacks definition here that is a relaxed typehint.	2023-10-29 00:53:01 -07:00
William FH	36204c2baf	Evaluation Callback Multi Response (#12505 ) 1. Allow run evaluators to return {"results": [list of evaluation results]} in the evaluator callback. 2. Allows run evaluators to pick the target run ID to provide feedback to (1) means you could do something like a function call that populates a full rubric in one go (not sure how reliable that is in general though) rather than splitting off into separate LLM calls - cheaper and less code to write (2) means you can provide feedback to runs on subsequent calls. Immediate use case is if you wanted to add an evaluator to a chat bot and assign to assign to previous conversation turns have a corresponding one in the SDK	2023-10-28 23:18:29 -07:00
Harrison Chase	9e0ae56287	various templates improvements (#12500 )	2023-10-28 22:13:22 -07:00
0xC9	79cf01366e	Update tool.py (#12472 ) In the GoogleSerperResults class, the name field is defined as 'google_serrper_results_json'. This looks like a typo, and perhaps should be 'google_serper_results_json'. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-28 21:49:01 -07:00
Harrison Chase	eb903e211c	bump to 36 (#12487 )	2023-10-28 08:51:23 -07:00
Tyler Hutcherson	4209457bdc	Redis langserve template (#12443 ) Add Redis langserve template! Eventually will add semantic caching to this too. But I was struggling to get that to work for some reason with the LCEL implementation here. - Description: Introduces the Redis LangServe template. A simple RAG based app built on top of Redis that allows you to chat with company's public financial data (Edgar 10k filings) - Issue: None - Dependencies: The template contains the poetry project requirements to run this template - Tag maintainer: @baskaryan @Spartee - Twitter handle: @tchutch94 Note: this requires the commit here that deletes the `_aget_relevant_documents()` method from the Redis retriever class that wasn't implemented. That was breaking the langserve app. --------- Co-authored-by: Sam Partee <sam.partee@redis.com>	2023-10-28 08:31:12 -07:00
Erick Friis	9adaa78c65	cli improvements (#12465 ) Features - add multiple repos by their branch/repo - generate `pip install` commands and `add_route()` code ![Screenshot 2023-10-27 at 4 49 52 PM](https://github.com/langchain-ai/langchain/assets/9557659/3aec4cbb-3f67-4f04-8370-5b54ea983b2a) Optimizations: - group installs by repo/branch to avoid duplicate cloning	2023-10-28 08:25:31 -07:00
Adam Law	df4960a6d8	add reranking to azuresearch (#12454 ) -Description Adds returning the reranking score when using semantic search -*Issue: #12317 --------- Co-authored-by: Adam Law <adamlaw@microsoft.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 14:14:09 -07:00
Eugene Yurtsev	60d009f75a	Add security note to API chain (#12452 ) Add security note	2023-10-27 17:09:42 -04:00
Matvey Arye	11505f95d3	Improve handling of empty queries for timescale vector (#12393 ) Description: Improve handling of empty queries in timescale-vector. For timescale-vector it is more efficient to get a None embedding when the embedding has no semantic meaning. It allows timescale-vector to perform more optimizations. Thus, when the query is empty, use a None embedding. Also pass down constructor arguments to the timescale vector client. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 13:55:16 -07:00
Erick Friis	38cee5fae0	cli updates 2 (#12447 ) - extras group - readme - another readme --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 13:37:03 -07:00
William FH	5d40e36c75	Trace if run tree set (#12444 ) This code path is hit in the following case: - Start in langchain code and manually provide a tracer - Handoff to the traceable - Hand back to langchain code. Which happens for evaluating `@traceable` functions unfortunately	2023-10-27 12:29:18 -07:00
Bagatur	c2a0a6b6df	make doc utils public (#12394 )	2023-10-27 12:08:08 -07:00
Henter	d6888a90d0	Fix the missing temperature parameter for Baichuan-AI chat_model (#12420 ) Description: the missing `temperature` parameter for Baichuan-AI chat_model Baichuan-AI api doc: https://platform.baichuan-ai.com/docs/api	2023-10-27 12:07:21 -07:00
Erick Friis	6908634428	cli updates oct27 (#12436 )	2023-10-27 12:06:46 -07:00
HwangJohn	d38c8369b3	added rrf argument in ApproxRetrievalStrategy class __init__() (#11987 ) - Description: To handle the hybrid search with RRF(Reciprocal Rank Fusion) in the Elasticsearch, rrf argument was added for adjusting 'rank_constant' and 'window_size' to combine multiple result sets with different relevance indicators into a single result set. (ref: https://www.elastic.co/kr/blog/whats-new-elastic-enterprise-search-8-9-0), - Issue: the issue # it fixes (if applicable), - Dependencies: No dependencies changed, - Tag maintainer: @baskaryan, Nice to meet you, I'm a newbie for contributions and it's my first PR. I only changed the langchain/vectorstores/elasticsearch.py file. I did make format&lint I got this message, ```shell make lint_diff ./scripts/check_pydantic.sh . ./scripts/check_imports.sh poetry run ruff . [ "langchain/vectorstores/elasticsearch.py" = "" ] \|\| poetry run black langchain/vectorstores/elasticsearch.py --check All done! ✨ 🍰 ✨ 1 file would be left unchanged. [ "langchain/vectorstores/elasticsearch.py" = "" ] \|\| poetry run mypy langchain/vectorstores/elasticsearch.py langchain/__init__.py: error: Source file found twice under different module names: "mvp.nlp.langchain.libs.langchain.langchain" and "langchain" Found 1 error in 1 file (errors prevented further checking) make: * [lint_diff] Error 2 ``` Thank you --------- Co-authored-by: 황중원 <jwhwang@amorepacific.com>	2023-10-27 11:53:19 -07:00
Roman Vasilyev	2c58dca5f0	optional reusable connection (#12051 ) My postgres out of connections after continuous PGVector usage, and the reason because it constantly creates new connections, so adding a reusable pre established connection seems like solves an issue --------- Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 11:52:42 -07:00
Ennio Pastore	48fde2004f	Update long_context_reorder.py (#12422 ) The function comment was confusing and inaccurate	2023-10-27 11:52:28 -07:00
Bagatur	a8c68d4ffa	Type LLMChain.llm as runnable (#12385 )	2023-10-27 11:52:01 -07:00
Bagatur	d12b88557a	Bagatur/bump 325 (#12440 )	2023-10-27 11:49:09 -07:00
Eugene Yurtsev	cadfce295f	Deprecate PythonRepl tools and Pandas/Xorbits/Spark DataFrame/Python/CSV agents (#12427 ) See discussion here: https://github.com/langchain-ai/langchain/discussions/11680 The code is available for usage from langchain_experimental. The reason for the deprecation is that the agents are relying on a Python REPL. The code can only be run safely with appropriate sandboxing. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 14:16:42 -04:00
Harrison Chase	0ca539eb85	Clean up deprecated agents and update __init__ in experimental (#12231 ) Update init paths in experimental	2023-10-27 13:52:50 -04:00
Holt Skinner	134f085824	feat: Add Google Speech to Text API Document Loader (#12298 ) - Add Document Loader for Google Speech to Text - Similar Structure to [Assembly AI Document Loader][1] [1]: https://python.langchain.com/docs/integrations/document_loaders/assemblyai	2023-10-27 09:34:26 -07:00
David Duong	52c194ec3a	Fix templates typos (#12428 )	2023-10-27 09:32:57 -07:00
Massimiliano Pronesti	c8195769f2	fix(openai-callback): completion count logic (#12383 ) The changes introduced in #12267 and #12190 broke the cost computation of the `completion` tokens for fine-tuned models because of the early return. This PR aims at fixing this. @baskaryan.	2023-10-27 09:08:54 -07:00
Stefan Langenbach	b22da81af8	Mask API key for Aleph Alpha LLM (#12377 ) - Description: Add masking of API Key for Aleph Alpha LLM when printed. - Issue: #12165 - Dependencies: None - Tag maintainer: @eyurtsev --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 11:32:43 -04:00
William FH	4254028c52	Str Evaluator Mapper (#12401 )	2023-10-26 21:38:47 -07:00
William FH	fcad1d2965	Add space (#12395 )	2023-10-26 20:32:23 -07:00
William FH	922d7910ef	Wfh/json schema evaluation (#12389 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-10-26 20:32:05 -07:00
Christian Kasim Loan	a35445c65f	johnsnowlabs embeddings support (#11271 ) - Description: Introducing the [JohnSnowLabsEmbeddings](https://www.johnsnowlabs.com/) - Dependencies: johnsnowlabs - Tag maintainer: @C-K-Loan - Twitter handle: https://twitter.com/JohnSnowLabs https://twitter.com/ChristianKasimL --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-26 20:22:50 -07:00
SteveLiao	c08b622b2d	Add HTML Title and Page Language into metadata for AsyncHtmlLoader (#11326 ) Description: Revise `libs/langchain/langchain/document_loaders/async_html.py` to store the HTML Title and Page Language in the `metadata` of `AsyncHtmlLoader`.	2023-10-26 20:22:31 -07:00
Shorthills AI	25c98dbba9	Fixed some grammatical and Exception types issues (#12015 ) Fixed some grammatical issues and Exception types. @baskaryan , @eyurtsev --------- Co-authored-by: Sanskar Tanwar <142409040+SanskarTanwarShorthillsAI@users.noreply.github.com> Co-authored-by: UpneetShorthillsAI <144228282+UpneetShorthillsAI@users.noreply.github.com> Co-authored-by: HarshGuptaShorthillsAI <144897987+HarshGuptaShorthillsAI@users.noreply.github.com> Co-authored-by: AdityaKalraShorthillsAI <143726711+AdityaKalraShorthillsAI@users.noreply.github.com> Co-authored-by: SakshiShorthillsAI <144228183+SakshiShorthillsAI@users.noreply.github.com>	2023-10-26 21:12:38 -04:00
William FH	923696b664	Wfh/json edit dist (#12361 ) Compare predicted json to reference. First canonicalize (sort keys, rm whitespace separators), then return normalized string edit distance. Not a silver bullet but maybe an easy way to capture structure differences in a less flakey way --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-10-26 18:10:28 -07:00
Erick Friis	4db8d82c55	CLI CI 2 (#12387 ) Will run all CI because of _test change, but future PRs against CLI will only trigger the new CLI one Has a bunch of file changes related to formatting/linting. No mypy yet - coming soon	2023-10-26 17:01:31 -07:00
Tyler Hutcherson	231d553824	Update broken redis tests (#12371 ) Update broken redis tests -- tiny PR :) - Description: Fixes Redis tests on master (look like it was broken by https://github.com/langchain-ai/langchain/pull/11257) - Issue: None, - Dependencies: No - Tag maintainer: @baskaryan @Spartee - Twitter handle: N/A Co-authored-by: Sam Partee <sam.partee@redis.com>	2023-10-26 16:13:14 -07:00
Erick Friis	03e79e62c2	cli fix (#12380 )	2023-10-26 15:29:49 -07:00
Bagatur	76230d2c08	fireworks scheduled integration tests (#12373 )	2023-10-26 14:24:42 -07:00
Josh Phillips	01c5cd365b	Fix SupbaseVectoreStore write operation timeout (#12318 ) Description This small change will make chunk_size a configurable parameter for loading documents into a Supabase database. Issue https://github.com/langchain-ai/langchain/issues/11422 Dependencies No chanages Twitter @ j1philli Reminder If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Greg Richardson <greg.nmr@gmail.com>	2023-10-26 14:19:17 -07:00
Bagatur	b10cefb160	lint fix: rm init (#12374 )	2023-10-26 14:16:25 -07:00
Harrison Chase	b43996e553	Harrison/improve cli (#12368 )	2023-10-26 13:53:59 -07:00
Harrison Chase	9ce38726a2	fix some stuff (#12292 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-26 13:30:36 -07:00
Cynthia Yang	6ce276e099	Support Fireworks batching (#8 ) (#12052 ) Description * Add _generate and _agenerate to support Fireworks batching. * Add stop words test cases * Opt out retry mechanism Issue - Not applicable Dependencies - None Tag maintainer - @baskaryan	2023-10-26 16:01:08 -04:00
Tyler Hutcherson	2f0c9d8269	Fix redis vectorfield schema defaults (#12223 ) - Description: refactors the redis vector field schema to properly handle default values, includes a new unit test suite. - Issue: N/A - Dependencies: nothing new. - Tag maintainer: @baskaryan @Spartee - Twitter handle: this is a tiny fix/improvement :) This issue was causing some clients/cuatomers issues when building a vector index on Redis on smaller db instances (due to fault default values in index configuration). It would raise an error like: ```redis.exceptions.ResponseError: Vector index initial capacity 20000 exceeded server limit (852 with the given parameters)``` This PR will address this moving forward.	2023-10-26 12:17:58 -07:00
Jakub Novák	9544d64ad8	E2B tool - Improve description wuth uploaded files info (#12355 )	2023-10-26 11:44:24 -07:00
Bagatur	c6a733802b	bump 324 and 35 (#12352 )	2023-10-26 10:10:26 -07:00
Nuno Campos	683e97766d	Fix json key output parser in partial (streaming) mode (#12332 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-26 17:45:04 +01:00
Nikhil Jha	dff24285ea	Comprehend Moderation 0.2 (#11730 ) This PR replaces the previous `Intent` check with the new `Prompt Safety` check. The logic and steps to enable chain moderation via the Amazon Comprehend service, allowing you to detect and redact PII, Toxic, and Prompt Safety information in the LLM prompt or answer remains unchanged. This implementation updates the code and configuration types with respect to `Prompt Safety`. ### Usage sample ```python from langchain_experimental.comprehend_moderation import (BaseModerationConfig, ModerationPromptSafetyConfig, ModerationPiiConfig, ModerationToxicityConfig ) pii_config = ModerationPiiConfig( labels=["SSN"], redact=True, mask_character="X" ) toxicity_config = ModerationToxicityConfig( threshold=0.5 ) prompt_safety_config = ModerationPromptSafetyConfig( threshold=0.5 ) moderation_config = BaseModerationConfig( filters=[pii_config, toxicity_config, prompt_safety_config] ) comp_moderation_with_config = AmazonComprehendModerationChain( moderation_config=moderation_config, #specify the configuration client=comprehend_client, #optionally pass the Boto3 Client verbose=True ) template = """Question: {question} Answer:""" prompt = PromptTemplate(template=template, input_variables=["question"]) responses = [ "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here." ] llm = FakeListLLM(responses=responses) llm_chain = LLMChain(prompt=prompt, llm=llm) chain = ( prompt \| comp_moderation_with_config \| {llm_chain.input_keys[0]: lambda x: x['output'] } \| llm_chain \| { "input": lambda x: x['text'] } \| comp_moderation_with_config ) try: response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"}) except Exception as e: print(str(e)) else: print(response['output']) ``` ### Output ```python > Entering new AmazonComprehendModerationChain chain... Running AmazonComprehendModerationChain... Running pii Validation... Running toxicity Validation... Running prompt safety Validation... > Finished chain. > Entering new AmazonComprehendModerationChain chain... Running AmazonComprehendModerationChain... Running pii Validation... Running toxicity Validation... Running prompt safety Validation... > Finished chain. Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like XXXXXXXXXXXX John Doe's phone number is (999)253-9876. ``` --------- Co-authored-by: Jha <nikjha@amazon.com> Co-authored-by: Anjan Biswas <anjanavb@amazon.com> Co-authored-by: Anjan Biswas <84933469+anjanvb@users.noreply.github.com>	2023-10-26 09:42:18 -07:00
Blake (Yung Cher Ho)	b9410f2b6f	Takeoff pro support (#12070 ) Description: This PR adds support for the [Pro version of Titan Takeoff Server](https://docs.titanml.co/docs/category/pro-features). Users of the Pro version will have to import the TitanTakeoffPro model, which is different from TitanTakeoff. Issue: Also minor fixes to docs for Titan Takeoff (Community version) Dependencies: No additional dependencies Twitter handle: @becoming_blake @baskaryan @hwchase17	2023-10-26 09:39:32 -07:00
Leonid Kuligin	4e47fe1dce	fixed error message and a check for processor name (#12200 ) Replace this entire comment with: - Description: a small fix on error description / a check for processor name - Issue: the issue #11407	2023-10-26 09:38:25 -07:00
Nir Kopler	9298aff783	Finetuned openai azure models cost calculation (#12267 ) Description: Add cost calculation for fine tuned Azure with relevant unit tests. see https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/fine-tuning?tabs=turbo&pivots=programming-language-studio for more information. this PR is the result of this PR: https://github.com/langchain-ai/langchain/pull/12190 Twitter handle: @nirkopler	2023-10-26 09:38:10 -07:00
gnakw	20fe515f20	Fix the exception from langchain.utilities import ArceeWrapper (#12342 ) - Description: Fix the exception from langchain.utilities import ArceeWrapper	2023-10-26 09:19:43 -07:00
Qihui Xie	6720458c7d	add allowed_operators property in QdrantTranslator (#12328 ) - Description: This PR adds `allowd_operators` property to `QdrantTranslator` to fix the `TypeError: can only join an iterable` bug. This property is required in `get_query_constructor_prompt` in `query_constructor\base.py`: ``` allowed_operators=" \| ".join(allowed_operators), ``` - Issue: #12061 --------- Co-authored-by: XIE Qihui <qihui.xie@bopufund.com>	2023-10-26 09:18:29 -07:00
Bagatur	f5a57fc1ef	fix self query constructor (#12349 )	2023-10-26 09:18:15 -07:00
Vasek Mlejnsky	cdd75b687e	e2b tool - fix initialization and improve tool description (#12345 )	2023-10-26 08:47:50 -07:00
Harrison Chase	8ec7aade9f	add docs for templates (#12346 )	2023-10-26 08:28:01 -07:00
Erick Friis	ebf998acb6	Templates (#12294 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Jacob Lee <jacoblee93@gmail.com>	2023-10-25 18:47:42 -07:00
Erick Friis	43257a295c	CLI Git Improvements (#12311 ) - delete repo sources like pip - git dep fixes - error messaging	2023-10-25 18:30:02 -07:00
William FH	1d568e1add	Better wrap traceable (#12303 ) If user function is wrapped as a traceable function, this will help hand off the trace between the two. Also update handling fields to reflect optional values	2023-10-25 16:34:23 -07:00
Eugene Yurtsev	5a71b81609	Relax type annotation for custom input/output types (#12300 ) This is needed to be able to do stuff like: ```python runnable.with_types(input_type=List[str]) ```	2023-10-25 19:00:22 -04:00
William FH	988f6d9912	Rm langchain server (#12305 )	2023-10-25 15:26:46 -07:00
wemysschen	3f16acc538	Add baidu cloud vector search in vectorstore and fix some unit test in vectorstores (#11605 ) Description: Add baidu cloud vector search in vectorstore --------- Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-25 13:44:19 -07:00
mrbean	b7e559c7e1	use snippet search optionally (#12236 ) Add an additional flag which allows for hitting our new endpoint.	2023-10-25 13:37:28 -07:00
felixocker	cce132d146	fix sparql queries for relations in schema description (#9136 ) - Description: Fix for the SPARQL QA chain: fixed SPARQL queries for retrieving information about relations in the graph to create a textual description of the schema for the language model. This should resolve #8907 - Issue: #8907 - Dependencies: None - Tag maintainer: @baskaryan, @hwchase17	2023-10-25 13:36:57 -07:00
Donato Azevedo	d9f1bcf366	Strips leading/trailing whitespace before parsing xml (#12297 ) Description: When llms output leading or trailing whitespace for xml (when using XMLOutputParser) the parser would raise a `ValueError: Could not parse output: ...`. However, leading or trailing whitespace are "ignorable" in the sense of XML standard. Issue: I did not find an issue related. Dependencies: None Tag maintainer: Twitter handle: donatoaz Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. Done, updated unit test and ran `make docker_test`.	2023-10-25 13:34:58 -07:00
Erick Friis	47070b8314	CLI (#12284 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-25 11:06:58 -07:00
Shwu Ku	07c2649753	response parser for ArceeRetriever (#12270 ) - Description: Response parser for arcee retriever, - Issue: follow-up pr on #11578 and [discussion](https://github.com/arcee-ai/arcee-python/issues/15#issuecomment-1759874053), - Dependencies: NA This pr implements a parser for the response from ArceeRetreiver to convert to langchain `Document`. This closes the loop of generation and retrieval for Arcee DALMs in langchain. The reference for the response parser is [api-docs:retrieve](https://api.arcee.ai/docs#/v2/retrieve_model) Attaching screenshot of working implementation: <img width="1984" alt="Screenshot 2023-10-25 at 7 42 34 PM" src="https://github.com/langchain-ai/langchain/assets/65639964/026987b9-34b2-4e4b-b87d-69fcd0c6641a"> \*api key deleted --- Successful tests, lints, etc. ```shell Re-run pytest with --snapshot-update to delete unused snapshots. ==================================================================================================================== slowest 5 durations ===================================================================================================================== 1.56s call tests/unit_tests/schema/runnable/test_runnable.py::test_retrying 0.63s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream 0.33s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_stream_iterator_input 0.30s call tests/unit_tests/schema/runnable/test_runnable.py::test_map_astream_iterator_input 0.20s call tests/unit_tests/indexes/test_indexing.py::test_cleanup_with_different_batchsize ======================================================================================================= 1265 passed, 270 skipped, 32 warnings in 6.55s ======================================================================================================= [ "." = "" ] \|\| poetry run black . All done! ✨ 🍰 ✨ 1871 files left unchanged. [ "." = "" ] \|\| poetry run ruff --select I --fix . ./scripts/check_pydantic.sh . ./scripts/check_imports.sh poetry run ruff . [ "." = "" ] \|\| poetry run black . --check All done! ✨ 🍰 ✨ 1871 files would be left unchanged. [ "." = "" ] \|\| poetry run mypy . Success: no issues found in 1868 source files poetry run codespell --toml pyproject.toml poetry run codespell --toml pyproject.toml -w ``` Co-authored-by: Shubham Kushwaha <shwu@Shubhams-MacBook-Pro.local>	2023-10-25 10:55:13 -07:00
Johanna Appel	c26ec7789f	CohereEmbeddings: Add max_retries and request_timeout (#12275 ) Add max_retries and request_timeout to CohereEmbeddings, akin to how it works in OpenAIEmbeddings. Since the Cohere client already implements these parameters, we can simply pass them down. Uses parameters from these two cohere client objects: https://github.com/cohere-ai/cohere-python/blob/main/cohere/client.py https://github.com/cohere-ai/cohere-python/blob/main/cohere/client_async.py	2023-10-25 10:37:25 -07:00
Nuno Campos	7108084947	Remove CLI (#12283 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-25 10:33:52 -07:00
Nuno Campos	b5b2d07681	Pop max concurrency when recursing (#12281 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-25 18:03:58 +01:00
Bagatur	69f4e402e4	bump 323 (#12278 )	2023-10-25 09:06:12 -07:00
David Duong	c25b174db5	Add serialisation props to Fireworks and ChatFireworks (#12255 )	2023-10-25 11:41:33 +01:00
Richard Adams	fd5f549a9e	demonstrate use of RetrievalQAWithSourcesChain.from_chain (#12235 ) Description: Documents further usage of RetrievalQAWithSourcesChain in an existing test. I'd not found much documented usage of RetrievalQAWithSourcesChain and how to get the sources out. This additional code will hopefully be useful to other potential users of this retriever. Issue: No raised issue Dependencies: No new dependencies needed to run the test (it already needs `open-ai`, `faiss-cpu` and `unstructured`). Note - `make lint` showed 8 linting errors in unrelated files --------- Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>	2023-10-24 21:33:34 -07:00
James Braza	53f35c5f5c	Adding `STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS` missing backticks (#12238 ) This PR fixes the fact that `STRUCTURED_FORMAT_SIMPLE_INSTRUCTIONS` was missing backticks at the end	2023-10-24 21:30:25 -07:00
William FH	276c6ba115	Check for ls project in run tree context (#12242 ) If I go traceable -> runnable when the project is manually specified, the runnable wont be logged. This makes sure the session/project is threaded through appropriately.	2023-10-24 17:18:59 -07:00
Vasek Mlejnsky	1f8094938f	Integrate E2B's data analysis/code interpreter (#12011 ) This PR adds a data [E2B's](https://e2b.dev/) analysis/code interpreter sandbox as a tool --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Jakub Novak <jakub@e2b.dev>	2023-10-24 16:04:02 -07:00
Bagatur	286a29a49e	bump 322 and 34 (#12228 )	2023-10-24 13:52:17 -07:00
Eugene Yurtsev	583dc49477	Add type to Generation and sub-classes, handle root validator (#12220 ) * Add a type literal for the generation and sub-classes for serialization purposes. * Fix the root validator of ChatGeneration to return ValueError instead of KeyError or Attribute error if intialized improperly. * This change is done for langserve to make sure that llm related callbacks can be serialized/deserialized properly.	2023-10-24 16:21:00 -04:00
Eugene Yurtsev	81052ee18e	Fix code block in runnable doc (#12221 ) Fix code block syntax in runnable doc-string	2023-10-24 16:11:58 -04:00
Mikelarg	46e28b9613	Added GigaChat chat model support (#12201 ) - Description: Added integration with [GigaChat](https://developers.sber.ru/portal/products/gigachat) language model. - Twitter handle: @dvoshansky	2023-10-24 12:53:51 -07:00
Anurag Wagh	d5c2ce7c2e	[fix] create redis vector index before adding docs, add prefix to doc… (#11257 ) Fix Description: For Redis Vector integration in add_texts method, there were two issues that lead to this bug. 1. Vector index is not being created leading to no such_index error 2. `doc:index` prefix was also missing for Redis Keys. resolves #11197 Maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-24 10:51:25 -07:00
Eugene Yurtsev	079d1f3b8e	Expose handle_event and ahandle_events as public API (#12181 ) Expose functionality to handle generic events.	2023-10-24 13:42:28 -04:00
William FH	67c4fd0ad0	Update deprecation (#12178 ) in runner_utils	2023-10-24 10:37:28 -07:00
Nir Kopler	d3744175bf	Finetuned OpenAI models cost calculation #11715 (#12190 ) Description: Add cost calculation for fine tuned models (new and legacy), this is required after OpenAI added new models for fine tuning and separated the costs of I/O for fine tuned models. Also I updated the relevant unit tests see https://platform.openai.com/docs/guides/fine-tuning for more information. issue: https://github.com/langchain-ai/langchain/issues/11715 - Issue: 11715 - Twitter handle: @nirkopler	2023-10-24 10:22:05 -07:00
Spyros	a2840a2b42	fix vertexai codey models (#12173 ) Description: This PR fixes issue #12156 by checking for Codey models appropriately before result parsing. Maintainer: @hwchase17 , @agola11	2023-10-24 10:20:05 -07:00
Hech	d76f026d72	Fix flexible dimension and doc for DingoDB (#12187 )	2023-10-24 10:16:19 -07:00
Erick Friis	95ae40ff90	Fix Anthropic Functions ainvoke (#12215 ) Removes custom `NotImplementedError` in experimental anthropic functions, allowing it to fallback on default `ainvoke` implementation.	2023-10-24 10:07:01 -07:00
Iskren Ivov Chernev	d5d7ba582a	Improvements to llm/deepinfra (#10846 ) - replace `requests` package with `langchain.requests` - add `_acall` support - add `_stream` and `_astream` - freshen up the documentation a bit - update vendor doc	2023-10-24 09:54:23 -07:00
sudranga	f09f82541b	Expose configuration options in GraphCypherQAChain (#12159 ) Allows for passing arguments into the LLM chains used by the GraphCypherQAChain. This is to address a request by a user to include memory in the Cypher creating chain. Will keep the prompt variables as-is to be backward compatible. But, would be a good idea to deprecate them and use the **kwargs variables. Added a test case. In general, I think it would be good for any chain to automatically pass in a readonlymemory(of its input) to its subchains whilist allowing for an override. But, this would be a different change.	2023-10-24 09:52:55 -07:00
Leonid Ganeline	11f13aed53	docstrings update (#12093 ) Added missed docstrings. Added missed Args:, Returns: Raises:	2023-10-24 09:34:10 -07:00
Johnny Oshika	ba20c14e28	Fix typo in stuff_prompt's system_template (#12063 ) - Description: Add missing apostrophe in `user's` in stuff_prompt's system_template. The first sentence in the system template went from: > Use the following pieces of context to answer the users question. to > Use the following pieces of context to answer the user's question. - Issue: - Dependencies: none - Tag maintainer: @baskaryan - Twitter handle: ojohnnyo	2023-10-24 09:21:28 -07:00
Holt Skinner	69d9eae5cd	feat: Add Client Info to available Google Cloud Clients (#12168 ) - This is used internally to gather aggregate usage metrics for the LangChain integrations - Note: This cannot be added to some of the Vertex AI integrations at this time because the SDK doesn't allow overriding the [`ClientInfo`](https://googleapis.dev/python/google-api-core/latest/client_info.html#module-google.api_core.client_info) - Added to: - BigQuery - Google Cloud Storage - Document AI - Vertex AI Model Garden - Document AI Warehouse - Vertex AI Search - Vertex AI Matching Engine (Cloud Storage Client) @baskaryan, @eyurtsev, @hwchase17 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-24 08:49:11 -07:00
Lukas Wolf	69f5f82804	Update extraction.py (#12207 ) Description: Pass tags as argument to create_extraction_chain Issue: create_extraction_chain does not pass tags to chain yet @baskaryan	2023-10-24 08:25:14 -07:00
Nuno Campos	34ffb94770	Remove GetLocal, PutLocal (#12133 ) Do you agree?	2023-10-24 10:16:46 +01:00
Eric Hartford	8c150ad7f6	Add COBOL parser and splitter (#11674 ) - Description: Add COBOL parser and splitter - Issue: n/a - Dependencies: n/a - Tag maintainer: @baskaryan - Twitter handle: erhartford --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-23 15:44:31 -04:00
John Mai	ebf749c40c	Baichuan & Hunyuan set default api_base (#12059 ) ### Description Baichuan & Hunyuan set default api_base env	2023-10-23 15:33:35 -04:00
Shilong Dai	99afc1b4f8	Fixed hardcoded "vector" and replaced with vector_query_field variable (#12126 ) - Description: In the max_marginal_relevance_search function of the ElasticsearchStore vector store, the name of the field corresponding to the vector embedding of the document is hard coded in the delete statement that drops the field from the document metadata. This results in an exception if the vector embedding field is customized. This PR changes the hard-coded "vector" into the vector_query_field variable. - Issue: None - Dependencies: None - Tag maintainer: @hwchase17 Co-authored-by: Shilong Dai <sdai@viperfish.net>	2023-10-23 15:08:55 -04:00
Vikram Shitole	0d44746430	10634: Added the capability to inject boto3 client in SagemakerEndpointEmbeddings (#12146 ) Description: Allow to inject boto3 client for Cross account access type of scenarios in using SagemakerEndpointEmbeddings and also updated the documentation for same in the sample notebook Issue:SagemakerEndpointEmbeddings cross account capability #10634 #10184 Dependencies: None Tag maintainer: Twitter handle:lethargicoder Co-authored-by: Vikram(VS) <vssht@amazon.com>	2023-10-23 15:08:26 -04:00
aubin_mzt	66f8cb015d	Add connection args for pgvector vector store (#11930 ) - Description: sqlalchemy create_engine() does not take into account connect_args which are mandatory for managed PGSQL instances on cloud providers (ssl_context for example). Also re-enabled create_vector_extension at post_init for using pgvector class seamlessly - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17. --------- Co-authored-by: Sami Bargaoui <bargaoui.sam@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-23 14:43:44 -04:00
NuODaniel	4d6243fa87	fix: doc string of default params in chat_models, llm qianfan (#12153 ) - Description: a fix of the doc string in Qianfan - Issue: no - Dependencies: no - Tag maintainer: @baskaryan - Twitter handle: no	2023-10-23 14:03:18 -04:00
Predrag Gruevski	f82bdf4613	Update deprecated `langchain` imports with suggested new paths. (#12164 ) Let's help our users find the proper import to use instead of the deprecated top-level ones.	2023-10-23 13:52:08 -04:00
Bagatur	963ff93476	bump 321 (#12161 )	2023-10-23 12:49:38 -04:00
Nuno Campos	d0505c0d47	Update default recursion_limit, update docs (#12134 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-23 16:29:17 +01:00
William FH	4f23aa677a	Fix Pickle Error (#12141 ) If non-pickleable objects (like locks) get passed to the tracing callback, they'll fail in the deepcopy. Fallback to a shallow copy in these instances .	2023-10-23 08:22:47 -07:00
Predrag Gruevski	95a1b598fe	Update to `actions/checkout@v4`. (#11951 ) We don't use any of the new functionality at the moment. Just making sure we don't fall back on versions and fail to benefit from new patches. This is an easy upgrade and it's always harder to upgrade across multiple major versions at once.	2023-10-23 10:01:33 -04:00
William FH	7c4f340cc0	Include Parent Run ID (#12139 ) If you set local callbacks	2023-10-22 17:19:11 -07:00
omahs	f3cc9bba5b	Fix typos (#12128 ) Fix typos	2023-10-22 17:16:03 -07:00
Nuno Campos	1afdb40b48	Add optional config arg to RunnablePassthrough func arg (#12131 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-22 19:57:16 +01:00
Nuno Campos	325fdde8b4	Fix bug where types were lost when calling with_cconfig or bind (#12137 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-22 19:26:13 +01:00
Nuno Campos	02dce74b97	Fix type hint for older py versions (#12132 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-22 18:01:09 +01:00
Nuno Campos	d0ce374731	Allow specifying custom input/output schemas for runnables with .with_types() (#12083 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-22 17:26:48 +01:00
Harrison Chase	ee69116761	move csv agent to langchain experimental (#12113 )	2023-10-21 10:26:02 -07:00
Harrison Chase	03bf6ef473	add missing init files (#12114 )	2023-10-21 10:25:50 -07:00
Bagatur	ef8b180d6d	bump 320 (#12108 )	2023-10-21 11:52:52 -04:00
Rotem Weiss	78d186fb44	Add Tavily Search API as a Tool (#12103 ) Adding Tavily Search API as a tool. I will be the maintainer and assaf_elovic is the twitter handler. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-21 11:23:21 -04:00
Bagatur	85302a9ec1	Add CI check that integration tests compile (#12090 )	2023-10-21 10:52:18 -04:00
verlocks	5dbe456aae	Bug fix tongyi.py to be compatible with DashScope API (#11956 ) Current ChatTongyi is not compatible with DashScope API, which will cause error when passing api key to chat model directly. - Description: Update tongyi.py to be compatible with DashScope API. Specifically, update parameter name "dashscope_api_key" to "api_key". - Issue: None. - Dependencies: Nothing new, Tongyi would require DashScope as before.	2023-10-20 18:46:41 -04:00
Tomaz Bratanic	82f4c0589c	Add neo4j graph environment variables (#12080 )	2023-10-20 14:43:01 -07:00
Mohammad Mohtashim	d5400f6502	Google Scholar Search Tool using serpapi (#11513 ) - Description: Implementing the Google Scholar Tool as requested in PR #11505. The tool will be using the [serpapi python package](https://serpapi.com/integrations/python#search-google-scholar). The main idea of the tool will be to return the results from a Google Scholar search given a query as an input to the tool. - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17	2023-10-20 17:35:55 -04:00
Holt Skinner	f5be2d525a	fix: Add `_serving_config` property to `GoogleVertexAISearchRetriever` (#12084 ) - Fixes error: ``` ValueError: "GoogleVertexAISearchRetriever" object has no field "_serving_config" ``` Introduced in #11736 @baskaryan, @eyurtsev, @hwchase17 if you could review and merge quickly, that would be appreciated :)	2023-10-20 15:16:42 -04:00
Nuno Campos	5fee61a207	Support runnable factories in .configurable_alts() (#12065 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-20 15:22:09 +01:00
Zhitao Xu	a4c3a44712	Fix documentation typo in Clickhouse Class (#12047 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> - Description: The return info in the documentation for similarity_search_by_vector and similarity_search_with_relevance_scores is wrong	2023-10-19 17:00:22 -04:00
William FH	25418b9b4d	Always add run ID (#12046 ) in eval callback handler. Useful if you're using a custom run evaluator and don't want to thread things through.	2023-10-19 12:38:07 -07:00
Eugene Yurtsev	44d7763580	Add zapier deprecation warning (#12045 ) Add zapier deprecation	2023-10-19 15:27:56 -04:00
John Mai	4188f046ec	Add Tencent Hunyuan chat model (#12022 ) ### Description: The Tencent Hunyuan model, developed by Tencent, is a large language model by robust Chinese text generation capabilities, adeptness in logical reasoning within complex contexts, and reliable task execution proficiency.For more information, see [https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)	2023-10-19 15:10:12 -04:00
Eugene Yurtsev	68599d98c2	More security notes (#12040 ) Add more security notes	2023-10-19 14:49:09 -04:00
Bagatur	0006075b08	bump 319 (#12041 )	2023-10-19 11:45:27 -07:00
John Mai	8eb40b5fe2	`baichuan_secret_key` use pydantic.types.SecretStr & Add Baichuan tests (#12031 ) ### Description - `baichuan_secret_key` use pydantic.types.SecretStr - Add Baichuan tests	2023-10-19 14:37:41 -04:00
Nuno Campos	85bac75729	nc/runnable-dynamic-schemas-from-config (#12038 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-19 19:34:35 +01:00
Nuno Campos	85eaa4ccee	Revert "nc/runnable-dynamic-schemas-from-config" (#12037 ) This reverts commit `a46eef64a7`. <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-19 19:27:02 +01:00
Nuno Campos	a46eef64a7	nc/runnable-dynamic-schemas-from-config	2023-10-19 19:17:48 +01:00
Nuno Campos	d392e030be	Add default value (#12032 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-19 18:30:05 +01:00
Kenneth Choe	62efe1ffb9	support add_embeddings for elasticsearch (#11002 ) - Description: Provide a way to use different text for embedding. - For example, if you are ingesting stack-overflow Q&As for RAG, you would want to embed the questions and return the answer(s) for the hits. With this change, the consumer of langchain can implement that easily. - I noticed the similar function is added on faiss.py with #1912 which was for performance reason, but I see the same function can be used to achieve what I thought. So instead of changing Document class to have embedding_content, I mimicked the implementation of faiss.py. - The test should provide some guidance on how to use it. It would be more intuitive if I just pass texts and embedding_texts as separate arguments, but I chose to use `zip`-ed object for the consistency with faiss.py implementation. - I plan to make similar pull request for OpenSearch. - Issue: N/A - Dependencies: None other than the existing ones. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-19 09:43:51 -07:00
Bagatur	76d3afaef0	bump 318 (#12030 )	2023-10-19 09:33:39 -07:00
Dmitry Tyumentsev	5dd2161c4b	add _acall method to YandexGPT (#12029 ) - Description: Add async support for YandexGPT LLM model Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>	2023-10-19 09:15:26 -07:00
Peter Krenesky	8425f33363	Pydantic v2 support for OpenAPI Specs (#11936 ) - Description: Adding Pydantic v2 support for OpenAPI Specs - Issue: - OpenAPI spec support was disabled because `openapi-schema-pydantic` doesn't support Pydantic v2: #9205 - Caused errors in `get_openapi_chain` - This may be the cause of #9520. - Tag maintainer: @eyurtsev - Twitter handle: kreneskyp The root cause was that `openapi-schema-pydantic` hasn't been updated in some time but [openapi-pydantic](https://github.com/mike-oakley/openapi-pydantic) forked and updated the project.	2023-10-19 11:06:11 -04:00
Joe McElroy	c9f1768cb9	Elasticsearch Query Retriever: Use match + fuzziness for LIKE (#12023 ) Updated the elasticsearch self query retriever to use the match clause for LIKE operator instead of the non-analyzed fuzzy search clause. Other small updates include: - fixing the stack inference integration test where the index's default pipeline didn't use the inference pipeline created - adding a user-agent to the old implementation to track usage - improved the documentation for ElasticsearchStore filters	2023-10-19 09:47:21 -04:00
Nuno Campos	7db6aabf65	Update chat model output type (#11833 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-19 00:55:15 -07:00
Simon Dai	ed62984cb2	update Weaviate to support multi tenancy (#11842 ) - Description: update Weaviate to support multi tenancy - Issue: 9956 - Dependencies: - Tag maintainer: hwchase17 - Twitter handle: dsx1986_	2023-10-19 00:49:30 -07:00
hiigao	f818ec49b8	Encapsulate alicloud pai-eas access method for chatmodels and llms (#11852 ) ### Description: To provide an eas llm service access methods in this pull request by impletementing `PaiEasEndpoint` and `PaiEasChatEndpoint` classes in `langchain.llms` and `langchain.chat_models` modules. Base on this pr, langchain users can build up a chain to call remote eas llm service and get the llm inference results. ### About EAS Service EAS is a Alicloud product on Alibaba Cloud Machine Learning Platform for AI which is short for AliCloud PAI. EAS provides model inference deployment services for the users. We build up a llm inference services on EAS with a general llm docker images. Therefore, end users can quickly setup their llm remote instances to load majority of the hugginface llm models, and serve as a backend for most of the llm apps. ### Dependencies This pr does't involve any new dependencies. --------- Co-authored-by: 子洪 <gaoyihong.gyh@alibaba-inc.com>	2023-10-19 00:20:18 -07:00
John Mai	a6b483dcbc	Supported RetryOutputParser & RetryWithErrorOutputParser max_retries (#11903 ) Description: Supported RetryOutputParser & RetryWithErrorOutputParser max_retries - max_retries: Maximum number of retries to parser. Issue: None Dependencies: None Tag maintainer: @baskaryan Twitter handle:	2023-10-18 23:57:16 -07:00
Hugues Chocart	008c7df80d	[LLMonitorCallbackHandler] Refactor + add llmonitor-py dependency (#11948 ) We now require uses to have the pip package `llmonitor` installed. It allows us to have cleaner code and avoid duplicates between our library and our code in Langchain.	2023-10-18 23:54:10 -07:00
Sian Cao	77fc2f7644	fix: impl missing embeddings method (#10823 ) FAISS does not implement embeddings method and use embed_query to embedding texts which is wrong for some embedding models. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-18 23:51:28 -07:00
Holt Skinner	2661dc94f3	feat: Google Vertex AI Search Retriever - Add support for Website Data Stores (#11736 ) - Only works for Data stores with Advanced Website Indexing - https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features - Minor restructuring - Follow up to #10513 - Remove outdated docs (readded in https://github.com/langchain-ai/langchain/pull/11620) - Move legacy class into new py file to clean up the directory - Shouldn't cause backwards compatibility issues as the import works the same way for users	2023-10-18 23:41:48 -07:00
Shorthills AI	4b6fdd7bf0	Update modal.py (#11588 ) feat: Raise KeyError when 'prompt' key is missing in JSON response This commit updates the error handling in the code to raise a KeyError when the 'prompt' key is not found in the JSON response. This change makes the code more explicit about the nature of the error, helping to improve clarity and debugging. @baskaryan, @eyurtsev.	2023-10-18 23:40:37 -07:00
William FH	dfb4baa3f9	Fix Fireworks Callbacks (#12003 ) I may be missing something but it seems like we inappropriately overrode the 'stream()' method, losing callbacks in the process. I don't think (?) it gave us anything in this case to customize it here? See new trace: https://smith.langchain.com/public/fbb82825-3a16-446b-8207-35622358db3b/r and confirmed it streams. Also fixes the stopwords issues from #12000	2023-10-18 23:33:09 -07:00
Wang Wei	e26559f512	Add ERNIE-Bot-4 model support for ErnieBotChat. (#11969 ) - Description: According to the document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/clntwmv7t, add ERNIE-Bot-4 model support for ErnieBotChat. - Dependencies: Before using the ERNIE-Bot-4, you should have the model's access authority.	2023-10-18 14:55:29 -07:00
Eugene Yurtsev	f4bec9686d	Add more security notes (#11990 ) Add more security notes	2023-10-18 15:00:56 -04:00
Eugene Yurtsev	3d81c76160	Add security notes to agent toolkits (#11989 ) Add more security notes to agent toolkits.	2023-10-18 14:36:29 -04:00
Leonid Ganeline	b81a4c1d94	docstrings added (#11988 ) Added docstrings. Some docsctrings formatting.	2023-10-18 13:05:49 -04:00
Bagatur	35c7c1f050	bump 317 (#11986 )	2023-10-18 09:25:18 -07:00
Bagatur	122af2effe	fix chroma from_texts bug (#11984 )	2023-10-18 09:24:04 -07:00
Erick Friis	c149954cc5	Hub Runnable (#11946 ) Adds `langchain.runnables.hub.HubRunnable` for pulling configurable objects from the hub	2023-10-18 09:21:45 -07:00
Owen	9e24626e87	chore: remove duplicated export variables (#11962 ) - Description: remove duplicated `__all__` variables	2023-10-18 12:08:50 -04:00
Nuno Campos	6bd9c1d2b3	Make prompt validation opt-in (#11973 ) By default replace input_variables with the correct value <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-18 16:28:47 +01:00
Nuno Campos	9bc7e1851a	Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save() (#11970 ) .dict() is a Pydantic method that cannot raise exceptions, as it is used eg. in `__eq__` <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-18 16:28:33 +01:00
Nuno Campos	653cf56e0e	Lint	2023-10-18 16:02:00 +01:00
Predrag Gruevski	debcf053eb	Fix `invalid escape sequence` warnings by using raw strings for regexes. (#11943 ) This code also generates warnings when our users' apps hit it, which is annoying and doesn't look great. Let's fix it.	2023-10-18 10:55:17 -04:00
Nuno Campos	e4ae690244	Sort order	2023-10-18 15:42:13 +01:00
Nuno Campos	b753bf3323	Make prompt validation opt-in By default replace input_variables with the correct value	2023-10-18 10:46:22 +01:00
Nuno Campos	202acce0c9	Ensure dict() does not raise not implemented error, which should instead be raised in our custom method save()	2023-10-18 09:44:41 +01:00
Predrag Gruevski	392df7b2e3	Type hints on varargs and kwargs that take anything should be `Any`. (#11950 ) Type hinting `args` as `List[Any]` means that each positional argument should be a list. Type hinting `*kwargs` as `Dict[str, Any]` means that each keyword argument should be a dict of strings. This is almost never what we actually wanted, and doesn't seem to be what we want in any of the cases I'm replacing here.	2023-10-17 21:31:44 -04:00
Eugene Yurtsev	908c7bf33e	Add documentation to tools (#11938 ) Add security notes to tools --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-17 21:27:59 -04:00
Eugene Yurtsev	43dc669332	Update playwright documentation (#11949 ) Add security note to playwright tool	2023-10-17 21:22:26 -04:00
Daniel Chalef	2beb767ae5	zep: Memory Retriever MMR Support & Docs Updates (#11954 ) - Update Zep Memory and Retriever docstrings - Zep Memory Retriever: Add support for native MMR - Add MMR example to existing ZepRetriever Notebook @baskaryan	2023-10-17 16:35:11 -07:00
William FH	a27fa9bf10	Use traceable context (#11896 ) Example ``` from langchain.schema.runnable import RunnableLambda from langsmith import traceable chain = RunnableLambda(lambda x: x) @traceable(run_type = "chain") def my_traceable(a): chain.invoke(a) my_traceable(5) ``` Would have a nested result. This would NOT work for interleaving chains and traceables. E.g., things like thiswould still not work well ``` from langchain.schema.runnable import RunnableLambda from langsmith import traceable @traceable() def other_traceable(a): return a def foo(x): return other_traceable(x) chain = RunnableLambda(foo) @traceable(run_type = "chain") def my_traceable(a): chain.invoke(a) my_traceable(5) ```	2023-10-17 15:10:20 -07:00
Predrag Gruevski	dcd0392423	Upgrade to newer black (23.10) and ruff (first 0.1.x!) versions. (#11944 ) Minor lint dependency version upgrade to pick up latest functionality. Ruff's new v0.1 version comes with lots of nice features, like fix-safety guarantees and a preview mode for not-yet-stable features: https://astral.sh/blog/ruff-v0.1.0	2023-10-17 17:24:51 -04:00
Trayan Azarov	1fd21ed21c	Chroma batching (#11203 ) - Description: Chroma >= 0.4.10 added support for batch sizes validation of add/upsert. This batch size is dependent on the SQLite limits of the target system and varies. In this change, for Chroma>=0.4.10 batch splitting was added as the aforementioned validation is starting to surface in the Chroma community (users using LC) - Issue: N/A - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: t_azarov	2023-10-17 13:59:42 -07:00
Guy Korland	9373b9c004	Add Graph interface (#11012 ) Replace this entire comment with: - Description: Add a Graph interface - Tag maintainer: @baskaryan @hwchase17 - Twitter handle: @g_korland	2023-10-17 13:54:05 -07:00
DanielZzz	b647505280	feat: support ChatModels Qianfan `QianfanChatEndpoint` function_call (#11107 ) - Description: * feature for `QianfanChatEndpoint` function_call ability, add integration_test for it * add `model`, `endpoint` supported in calling params * add raw response in ChatModel Message - Issue: * #10867 * #11105 * #10215 - Dependencies: no - Tag maintainer: @baskaryan - Twitter handle: no	2023-10-17 13:33:55 -07:00
M Bharat lal	67300567d3	GCSFileLoader retrieve blob custom metadata and append to document metadata (#11066 ) - Description: GCSFileLoader retrieve blob's custom metadata and append to document's metadata - Issue: #9975, - Tag maintainer: @baskaryan please review Co-authored-by: b0l00ib <bharat.lal@walmart.com>	2023-10-17 12:17:59 -07:00
billytrend-cohere	f4742dce50	Add Cohere retrieval augmented generation to retrievers (#11483 ) Add Cohere retrieval augmented generation to retrievers --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 11:51:04 -07:00
刘方瑞	0a24ac7388	Revised notebook and add delete to MyScale vector store (#11848 ) - Description: - Add `.delete` to myscale vector store. - Revised vector store notebooks - Tag maintainer: @baskaryan - Twitter handle: @myscaledb @mpsk_liu	2023-10-17 11:42:21 -07:00
John Mai	3fb5e4d185	Add Baichuan chat model (#11923 ) Description: A large language models developed by Baichuan Intelligent Technology，https://www.baichuan-ai.com/home Issue: None Dependencies: None Tag maintainer: Twitter handle:	2023-10-17 11:30:57 -07:00
Eugene Yurtsev	9ecb7240a4	Add security note to recursive url loader (#11934 ) Add security note to recursive loader	2023-10-17 13:41:43 -04:00
maks-operlejn-ds	42dcc502c7	Anonymizer small fixes (#11915 )	2023-10-17 10:27:29 -07:00
Eugene Yurtsev	90e9ec6962	Sitemap specify default filter url (#11925 ) Specify default filter URL in sitemap loader and add a security note --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-17 13:19:27 -04:00
Bagatur	ba0d729961	bump 316 (#11928 )	2023-10-17 09:47:57 -07:00
Eugene Yurtsev	12d7eaa0c2	Add security notices to toolkits (#11900 ) This adds security notices to toolkits init, and to several toolkits. We'll need to continue documenting the rest of the toolkits. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 11:45:09 -04:00
Eugene Yurtsev	5f4a697ce3	Add deprecation warnings (#11899 ) Add deprecation warnings Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-17 10:59:38 -04:00
Nuno Campos	8b79cf9566	Add lock for using global config enum weak map (#11920 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-17 15:50:35 +01:00
Nuno Campos	2a8ded6c8c	Export merge_configs function (#11916 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-17 15:36:11 +01:00
Nuno Campos	778e7c526e	Add comment	2023-10-17 15:29:39 +01:00
Nuno Campos	19319e1746	Allow configs with None values	2023-10-17 15:23:58 +01:00
Nuno Campos	b0d5882fe1	Export merge_configs function	2023-10-17 13:22:07 +01:00
Nuno Campos	12596b9a9b	Add validation for configurable keys passed to .with_config() - Fix some typing issues found while doing that	2023-10-17 08:50:31 +01:00
Nuno Campos	754aca794f	remove print	2023-10-17 08:46:07 +01:00
Nuno Campos	cf448a6314	Ensure that configurable fields with enums support deduplication	2023-10-17 08:25:21 +01:00
Leonid Ganeline	31f264169d	evaluation criteria (#11681 ) the updated value was: ` Criteria.MISOGYNY: "Is the submission misogynistic? If so, respond Y." ` The " If so, respond Y." should not be here. This sub-string is not presented in any other criteria and should not be presented here. I also added a synonym to "misogynistic" as it done in many other criteria.	2023-10-16 21:05:08 -07:00
Dmitry Tyumentsev	e8c1850369	Add YandexGPT LLM and Chat model (#11703 ) Description: Introducing an ability to work with the [YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) language model.	2023-10-16 20:30:07 -07:00
Bagatur	c15701eebf	Revert "Add baichuan model" (#11901 ) cc @cloudscool, apologies your PR wasn't actually passing CI	2023-10-16 20:01:12 -07:00
cloudscool	c1d811c4bc	Add baichuan model	2023-10-16 19:27:35 -07:00
John Mai	0169d45ba8	Supported OutputFixingParser max_retries (#11754 ) Description: Supported OutputFixingParser max_retries - max_retries: Maximum number of retries to parser. Issue: None Dependencies: None Tag maintainer: @baskaryan Twitter handle: @JohnMai95 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-16 19:25:47 -07:00
volodymyr-memsql	ff8e6981ff	SingleStoreDBChatMessageHistory: Add singlestoredb support for ChatMessageHistory (#11705 ) Description - Added the `SingleStoreDBChatMessageHistory` class that inherits `BaseChatMessageHistory` and allows to use of a SingleStoreDB database as a storage for chat message history. - Added integration test to check that everything works (requires `singlestoredb` to be installed) - Added notebook with usage example - Removed custom retriever for SingleStoreDB vector store (as it is useless) --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2023-10-16 21:59:45 -04:00
Mohammad Mohtashim	634ccb8ccd	test_stream_log_retriever Unit Test + Tool names fix (#11808 ) ## Description \| Tool \| Original Tool Name \| \|-----------------------------\|---------------------------\| \| open-meteo-api \| Open Meteo API \| \| news-api \| News API \| \| tmdb-api \| TMDB API \| \| podcast-api \| Podcast API \| \| golden_query \| Golden Query \| \| dall-e-image-generator \| Dall-E Image Generator \| \| twilio \| Text Message \| \| searx_search_results \| Searx Search Results \| \| dataforseo \| DataForSeo Results JSON \| When using these tools through `load_tools`, I encountered the following validation error: ```console openai.error.InvalidRequestError: 'TMDB API' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name' ``` In order to avoid this error, I replaced spaces with hyphens in the tool names: \| Tool \| Corrected Tool Name \| \|-----------------------------\|---------------------------\| \| open-meteo-api \| Open-Meteo-API \| \| news-api \| News-API \| \| tmdb-api \| TMDB-API \| \| podcast-api \| Podcast-API \| \| golden_query \| Golden-Query \| \| dall-e-image-generator \| Dall-E-Image-Generator \| \| twilio \| Text-Message \| \| searx_search_results \| Searx-Search-Results \| \| dataforseo \| DataForSeo-Results-JSON \| This correction resolved the validation error. Additionally, a unit test, `tests/unit_tests/schema/runnable/test_runnable.py::test_stream_log_retriever`, was failing at random. Upon further investigation, I confirmed that the failure was not related to the above-mentioned changes. The `stream_log` variable was generating the order of logs in two ways at random The reason for this behavior is unclear, but in the assertion, I included both possible orders to account for this variability.	2023-10-16 18:46:19 -07:00
Predrag Gruevski	7c0f1bf23f	Upgrade experimental package dependencies and use Poetry 1.6.1. (#11339 ) Part of upgrading our CI to use Poetry 1.6.1.	2023-10-16 21:13:31 -04:00
Eugene Yurtsev	c2c0814a94	Add security notice to file management tool (#11878 ) Add security notice to file management tool --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-16 21:12:13 -04:00
zhaoshengbo	cb7e12f6ba	Adapt to the latest version of Alibaba Cloud OpenSearch vector store API (#11849 ) Hello Folks, Alibaba Cloud OpenSearch has released a new version of the vector storage engine, which has significantly improved performance compared to the previous version. At the same time, the sdk has also undergone changes, requiring adjustments alibaba opensearch vector store code to adapt. This PR includes: Adapt to the latest version of Alibaba Cloud OpenSearch API. More comprehensive unit testing. Improve documentation. I have read your contributing guidelines. And I have passed the tests below - [x] make format - [x] make lint - [x] make coverage - [x] make test --------- Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-10-16 18:07:24 -07:00
Lee	e669f9d731	Fix: Sitemap Document Loader Tests and Documentation (#11866 ) Description: While working on the Docusaurus site loader #9138, I noticed some outdated docs and tests for the Sitemap Loader. Issue: This is tangentially related to #6691 in reference to doc links. I plan on digging in to a few of these issue when I find time next.	2023-10-16 17:42:10 -07:00
Jean-Louis Queguiner	8b697ff0ee	feat(llm): add together.xyz as an LLM provider (#11892 ) - Description: added together.xyz as an LLM provider, - Issues: fix some linting issues - twitter handle @jilijeanlouis --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-16 17:08:04 -07:00
Leonid Kuligin	d269dd2e2f	added a multiturn search based on Vertex AI Search (#11885 ) Replace this entire comment with: - Description: Added a retriever based on multi-turn Vertex AI Search - Twitter handle: lkuligin	2023-10-16 17:05:12 -07:00
Leonid Kuligin	38ed55245f	added Vertex examples as attributes (#11890 ) - Description: added examples to Vertex chat models as optional class attributes, so that a model with examples can be used inside a chain - Twitter handle: lkuligin	2023-10-16 16:55:45 -07:00
eryk-dsai	5019f59724	fix: more robust check whether the HF model is quantized (#11891 ) Removes the check of `model.is_quantized` and adds more robust way of checking for 4bit and 8bit quantization in the `huggingface_pipeline.py` script. I had to make the original change on the outdated version of `transformers`, because the models had this property before. Seems redundant now. Fixes: https://github.com/langchain-ai/langchain/issues/11809 and https://github.com/langchain-ai/langchain/issues/11759	2023-10-16 16:54:20 -07:00
Eugene Yurtsev	210a48cfb5	Add security considerations (#11869 ) Add security considerations to existing graph tools.	2023-10-16 12:23:48 -04:00
Bagatur	25b1d65305	bump 315 (#11850 )	2023-10-16 00:50:54 -07:00
Nuno Campos	4321d192ea	Use a less specific return type for \| on Runnables (#11762 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-15 21:15:06 +01:00
Harrison Chase	a506302772	bearly tool (#11812 )	2023-10-14 16:03:58 -07:00
Harrison Chase	4a2f0c51a1	use get_llm_cache and set_llm_cache (#11741 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-14 09:29:30 -07:00
Harrison Chase	f3ad22e64a	pipe default key (#11788 )	2023-10-14 08:39:23 +01:00
Eugene Yurtsev	0d37b4c27d	Add python,pandas,xorbits,spark agents to experimental (#11774 ) See for contex https://github.com/langchain-ai/langchain/discussions/11680	2023-10-13 17:36:44 -04:00
Michael Feil	233a904f2e	GradientLLM Docs update and model_id renaming. (#10963 ) Related to #10800 - Errors in the Docstring of GradientLLM / Gradient.ai LLM - Renamed the `model_id` to `model` and adapting this in all tests. Reason to so is to be in Sync with `GradientEmbeddings` and other LLM's. - inmproving tests so they check the headers in the sent request. - making the aiosession a private attribute in the docs, as in the future `pip install gradientai` will be replacing aiosession. - adding a example how to fine-tune on the Prompt Template as suggested in #10800	2023-10-13 13:57:58 -07:00
Bagatur	1559ba4bfc	fix upstash test import (#11781 )	2023-10-13 13:31:36 -07:00
Leonid Kuligin	9f0a718198	added candidate_count for Vertex models (#11729 ) - Description: added support for `candidate_count` parameter on Vertex	2023-10-13 13:31:20 -07:00
David	9d200e6cbe	Create ChatEverlyAI (#11357 ) - Description: Adds the ChatEverlyAI class with llama-2 7b on [EverlyAI Hosted Endpoints](https://everlyai.xyz/) - It inherits from ChatOpenAI and requires openai (probably unnecessary but it made for a quick and easy implementation) --------- Co-authored-by: everly-studio <127131037+everly-studio@users.noreply.github.com>	2023-10-13 12:25:11 -07:00
Hristo G	7fb25b4154	Add graceful fallback for ES vectorstore when content field is missing (#11726 ) - Description: - If the Elasticsearch field used for Langchain > Document.page_content is missing because the specific document is somehow malformed fail gracefully. - Tag maintainer: - @joemcelroy	2023-10-13 12:03:32 -07:00
Bagatur	f06fcde0d7	rm duplicate zilliz import (#11777 )	2023-10-13 12:01:22 -07:00
Bagatur	a3330c4258	bump 314 (#11773 )	2023-10-13 11:09:54 -07:00
Erick Friis	1861cc7100	General anthropic functions, steps towards experimental integration tests (#11727 ) To match change in js here https://github.com/langchain-ai/langchainjs/pull/2892 Some integration tests need a bit more work in experimental: ![Screenshot 2023-10-12 at 12 02 49 PM](https://github.com/langchain-ai/langchain/assets/9557659/262d7d22-c405-40e9-afef-669e8d585307) Pretty sure the sqldatabase ones are an actual regression or change in interface because it's returning a placeholder. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-13 09:48:24 -07:00
Nuno Campos	17c69678ab	Revert "New add Baichuan Model" (#11761 ) Reverts langchain-ai/langchain#11714 This has linting and formatting issues, plus it's added to chat models folder but doesn't subclass Chat Model base class	2023-10-13 08:23:15 -07:00
cloudscool	56653c53aa	New add Baichuan Model (#11714 ) Motivation and Context At present, the Baichuan Large Language Model is relatively popular and efficient in performance. Due to widespread market recognition, this model has been added to enhance the scalability of Langchain's ability to access the big language model, so as to facilitate application access and usage for interested users. System Info langchain： 0.0.295 python：3.8.3 IDE：vs code Description Add the following files: 1. Add baichuan_baichuaninc_endpoint.py in the libs/langchain/langchain/chat_models 2. Modify the __init__.py file,which is located in the libs/langchain/langchain/chat_models/__init__.py： a. Add "from langchain.chat_models.baichuan_baichuaninc_endpoint import BaichuanChatEndpoint" b. Add "BaichuanChatEndpoint" In the file's __ All__ method Your contribution I am willing to help implement this feature and submit a PR, but I would appreciate guidance from the maintainers or community to ensure the changes are made correctly and in line with the project's standards and practices.	2023-10-12 23:04:28 -07:00
Yang, Bo	9e1e0f54d2	Add `TrainableLLM` (#11721 ) - Description: Add `TrainableLLM` for those LLM support fine-tuning - Tag maintainer: @hwchase17 This PR add training methods to `GradientLLM` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 17:38:33 -07:00
Burak Yılmaz	63e516c2b0	Upstash redis integration (#10871 ) - Description: Introduced Upstash provider with following wrappers: UpstashRedisCache, UpstashRedisEntityStore, UpstashRedisChatMessageHistory, UpstashRedisStore - Issue: -, - Dependencies: upstash-redis python package is needed, - Tag maintainer: @baskaryan - Twitter handle: @BurakY744 --------- Co-authored-by: Burak Yılmaz <burakyilmaz@Buraks-MacBook-Pro.local> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 17:36:51 -07:00
Bagatur	a9db2b0b92	fix tongyi import (#11745 )	2023-10-12 17:24:06 -07:00
Aaron Pham	6c61315067	fix(openllm): update with newer remote client implementation (#11740 ) cc @baskaryan --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-10-12 17:01:18 -07:00
Richy Wang	11cdfe44af	Implement Alibaba Tongyi chat model apis. (#10922 ) Hi there This PR is aim to implement chat model for Alibaba Tongyi LLM model. It contains work below: 1.Implement ChatTongyi chat model in langchain.chat_models.tongyi. Note this is different with tongyi llm model to another PR https://github.com/langchain-ai/langchain/pull/10878. For detail it implements _generate() and _stream() function in ChatTongyi. 2. Add some examples in chat/tongyi.ipynb. 3. Add integration test in chat_models/test_tongyi.py Note async completion for the Text API is not yet supported. Dependencies: dashscope. It will be installed manually cause it is not need by everyone.	2023-10-12 16:59:37 -07:00
Adam Demjen	008348ce71	Add ElasticsearchChatMessageHistory (#10932 ) Description This PR adds the `ElasticsearchChatMessageHistory` implementation that stores chat message history in the configured [Elasticsearch](https://www.elastic.co/elasticsearch/) deployment. ```python from langchain.memory.chat_message_histories import ElasticsearchChatMessageHistory history = ElasticsearchChatMessageHistory( es_url="https://my-elasticsearch-deployment-url:9200", index="chat-history-index", session_id="123" ) history.add_ai_message("This is me, the AI") history.add_user_message("This is me, the human") ``` Dependencies - [elasticsearch client](https://elasticsearch-py.readthedocs.io/) required Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 16:51:38 -07:00
Jonathan Soma	48cf978391	Allow placeholders in OpenAPI endpoints #2938 (#2940 ) Use regex matches when checking endpoints instead of exact matches. `{varname}` becomes `.*` Fixes #2938 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 16:20:32 -07:00
Predrag Gruevski	9e32120cbb	Deprecate direct access to globals like `debug` and `verbose`. (#11311 ) Instead of accessing `langchain.debug`, `langchain.verbose`, or `langchain.llm_cache`, please use the new getter/setter functions in `langchain.globals`: - `langchain.globals.set_debug()` and `langchain.globals.get_debug()` - `langchain.globals.set_verbose()` and `langchain.globals.get_verbose()` - `langchain.globals.set_llm_cache()` and `langchain.globals.get_llm_cache()` Using the old globals directly will now raise a warning. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-12 15:48:04 -07:00
Richard Adams	35965df20d	Rspace doc loader (#11511 ) Description: Add a document loader for the RSpace Electronic Lab Notebook (www.researchspace.com), so that scientific documents and research notes can be easily pulled into Langchain pipelines. Issue This is an new contribution, rather than an issue fix. Dependencies: There are no new required dependencies. In order to use the loader, clients will need to install rspace_client SDK using `pip install rspace_client` --------- Co-authored-by: richarda23 <richard.c.adams@infinityworks.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 15:05:38 -07:00
Ryan Zotti	9d1867c77f	Update docs to specify Indexing-API-compatible vectorstores (#11581 ) Description: Update Indexing API docs to specify vectorstores that are compatible with the Indexing API. I add a unit test to remind developers to update the documentation whenever they add or change a vectorstore in a way that affects compatibility. For the unit test I repurposed existing code from [here](https://github.com/langchain-ai/langchain/blob/v0.0.311/libs/langchain/langchain/indexes/_api.py#L245-L257). This is my first PR to an open source project. This is a trivially simple PR whose main purpose is to make me more comfortable submitting Langchain PRs. If this PR goes through I plan to submit PRs with more substantive changes in the near future. Issue: Resolves [10482](https://github.com/langchain-ai/langchain/discussions/10482). Dependencies: No new dependencies. Twitter handle: None.	2023-10-12 15:17:44 -04:00
Richard Wang	6402c33299	Let Notion document loader support utf-8 and make it default. (#10613 ) Use utf-8 encoding by default	2023-10-12 15:13:41 -04:00
Bagatur	bd74eba152	add azure openai sched tests (#11723 )	2023-10-12 10:48:45 -07:00
Bagatur	9c0584be74	bump 313 (#11718 )	2023-10-12 09:48:54 -07:00
sudranga	361f8e1bc6	Add MMR functionality to elasticsearch retriever (#11633 ) Allows MMR functionality only for the case where we have access to the embedding function. Also allows for users to request for fields from elasticsearch store. These are added to the document metadata. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 08:42:32 -07:00
Dmitry Tyumentsev	ead9d5b55c	Add yandex stt parser (#11435 ) Description: Introducing an ability to load a transcription document of audio file using [Yandex SpeechKit](https://cloud.yandex.com/en-ru/services/speechkit) Issue: None Dependencies: yandex-speechkit Tag maintainer: @rlancemartin, @eyurtsev	2023-10-12 08:42:03 -07:00
Janos Tolgyesi	15687a28d5	Use correct tokenizer for Bedrock/Anthropic LLMs (#11561 ) Description This PR implements the usage of the correct tokenizer in Bedrock LLMs, if using anthropic models. Issue: #11560 Dependencies: optional dependency on `anthropic` python library. Twitter handle: jtolgyesi --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 08:41:52 -07:00
kYLe	467b082c34	Modify Anyscale integration to work with Anyscale Endpoint (#11569 ) Description: Modify Anyscale integration to work with [Anyscale Endpoint](https://docs.endpoints.anyscale.com/) and it supports invoke, async invoke, stream and async invoke features --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-12 08:41:25 -07:00
plpycoin	51193309ea	Update readthedocs.py (#11110 ) Only parse .html files .svg .png favicon.ico will crash processing phase --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-12 11:32:06 -04:00
Nuno Campos	ca9de26f2b	Add callback function to RunnablePassthrough (#11564 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-12 15:10:16 +01:00
Nuno Campos	7f4734c0dd	Add deploy command to repos generated by cli template (#11711 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-12 15:09:21 +01:00
Nuno Campos	1c0857b53e	Fix default impl of aparse_result (#11702 ) Should delegate to parse_result, not to aparse, as parse_result is a method that some output parsers override <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-12 14:13:59 +01:00
nuric	44da27c07b	Add SemaDB VST wrapper (#11484 ) - Description: Adding vectorstore wrapper for [SemaDB](https://rapidapi.com/semafind-semadb/api/semadb). - Issue: None - Dependencies: None - Twitter handle: semafind Checks performed: - [x] `make format` - [x] `make lint` - [x] `make test` - [x] `make spell_check` - [x] `make docs_build` Documentation added: - SemaDB vectorstore wrapper tutorial	2023-10-11 19:09:38 -07:00
hsuyuming	0b743f005b	Feature/enhance huggingfacepipeline to handle different return type (#11394 ) Description: Avoid huggingfacepipeline to truncate the response if user setup return_full_text as False within huggingface pipeline. Dependencies: : None Tag maintainer: Maybe @sam-h-bean ? --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 19:09:03 -07:00
Leonid Kuligin	2aba9ab47e	Retriever based on GCP DocAI Warehouse (#11400 ) - Description: implements a retriever on top of DocAI Warehouse (to interact with existing enterprise documents) https://cloud.google.com/document-ai-warehouse?hl=en - Issue: new functionality @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 19:08:53 -07:00
Erick Friis	a477ddda45	Langsmith in readme update (#11497 )	2023-10-11 18:43:52 -07:00
Leonid Kuligin	9e81ab47be	Added a better error description if processor name is wrong. (#11488 ) Replace this entire comment with: - Description: added a better error description for this error - Issue: #11407 @baskaryan	2023-10-11 18:43:40 -07:00
Robert Yi	e75766b759	fix: incorrect arguments in clickhouse docstring (#11693 ) fix docstring for clickhouse	2023-10-11 21:41:21 -04:00
Eugene Yurtsev	17b5090c18	Add `type` to Agent actions (#11682 ) Add `type` to agent actions.	2023-10-11 21:33:24 -04:00
April	c14a8df2ee	wrap confluence attachment processing with a try-except block (#11503 ) Prevents document loading from erroring out when an attachment is not found at the url. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 18:13:42 -07:00
eajechiloae	4ba2c8ba75	Fix ClearML callback (#11472 ) Handle different field names in dicts/dataframes, fixing the ClearML callback. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 17:09:02 -07:00
Lawrence Wu	93bb19f69a	Fix chains/loading.py error messages (#11688 ) - Description: make the error messages consistent in chains/loading.py - Dependencies: None	2023-10-11 17:05:42 -07:00
Harrison Chase	18ebce2032	fix tool async (#11689 )	2023-10-11 16:40:23 -07:00
sudranga	9beb03e771	11474 (#11519 ) No relevant documents may be found for a given question. In some use cases, we could directly respond with a fixed message instead of doing an LLM call with an empty context. This PR exposes this as an option: response_if_no_docs_found. --------- Co-authored-by: Sudharsan Rangarajan <sudranga@nile-global.com>	2023-10-11 16:30:15 -07:00
Joaquin Menendez	ef99b06362	feature: add metadata information into the embedding file before uplo… (#11553 ) Replace this entire comment with: - Description: In this modified version of the function, if the metadatas parameter is not None, the function includes the corresponding metadata in the JSON object for each text. This allows the metadata to be stored alongside the text's embedding in the vector store. - - Issue: #10924 - Dependencies: None - Tag maintainer: @hwchase17 @agola11 - Twitter handle: @MelliJoaco --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 16:05:13 -07:00
Marcin Wątroba	51a3a86022	#11655 Add SQLAlchemyMd5Cache implementation (#11660 ) - Description: Add SQLAlchemyMd5Cache implementation, - Issue: the issue # #11655, - Dependencies: no deps, - Tag maintainer: @markowanga --------- Co-authored-by: Marcin Wątroba <marcin.watroba@pwr.edu.pl> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 15:28:09 -07:00
Suresh Kumar Ponnusamy	70f7558db2	langchain-experimental: Add allow_list support in experimental/data_anonymizer (#11597 ) - Description: Add allow_list support in langchain experimental data-anonymizer package - Issue: no - Dependencies: no - Tag maintainer: @hwchase17 - Twitter handle:	2023-10-11 14:50:41 -07:00
wemysschen	2363c02cf3	Bos loader (#11525 ) Description: Add BaiduCloud BOS document loader. --------- Co-authored-by: chenweixu01 <chenweixu01@baidu.com> Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 14:43:48 -07:00
Kwanghoon Choi	fbb82608cd	Fixed a bug in reporting Python code validation (#11522 ) - Description: fixed a bug in pal-chain when it reports Python code validation errors. When node.func does not have any ids, the original code tried to print node.func.id in raising ValueError. - Issue: n/a, - Dependencies: no dependencies, - Tag maintainer: @hazzel-cn, @eyurtsev - Twitter handle: @lazyswamp --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 14:34:28 -07:00
Harrison Chase	9f39c23a13	add input type for convo retrieval chain (#11679 )	2023-10-11 17:13:48 -04:00
zhaozhiming	d5e762d328	fix: Change the docs of JSONAgentOutputParser (#11594 ) I am merely making some minor adjustments to the function documentation. I hope to provide a small assistance to LangChain. - Description: Change the docs of JSONAgentOutputParser. It will be `JSON` better, - Issue: no, - Dependencies: no, - Tag maintainer: @hwchase17, - Twitter handle: Not worth mentioning.	2023-10-11 14:05:53 -07:00
Vinay Kakade	dd0cd98861	Add support for ChatOpenAI models in Infino callback handler (#11608 ) Description: This PR adds support for ChatOpenAI models in the Infino callback handler. In particular, this PR implements `on_chat_model_start` callback, so that ChatOpenAI models are supported. With this change, Infino callback handler can be used to track latency, errors, and prompt tokens for ChatOpenAI models too (in addition to the support for OpenAI and other non-chat models it has today). The existing example notebook is updated to show how to use this integration as well. cc/ @naman-modi @savannahar68 Issue: https://github.com/langchain-ai/langchain/issues/11607 Dependencies: None Tag maintainer: @hwchase17 Twitter handle: [@vkakade](https://twitter.com/vkakade)	2023-10-11 14:00:54 -07:00
Israel Ekpo	d0603c86b6	Add Support for Azure Cosmos DB MongoDB vCore Vector Store #11627 (#11632 ) This PR adds support for the Azure Cosmos DB MongoDB vCore Vector Store https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/ https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search Summary: - Description: added vector store integration for Azure Cosmos DB MongoDB vCore Vector Store, - Issue: the issue # it fixes #11627, - Dependencies: pymongo dependency, - Tag maintainer: @hwchase17, - Twitter handle: @izzyacademy --------- Co-authored-by: Israel Ekpo <israel.ekpo@gmail.com> Co-authored-by: Israel Ekpo <44282278+izzyacademy@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-11 13:56:46 -07:00
Erick Friis	28ee6a7c12	Track ChatFireworks time to first_token (#11672 )	2023-10-11 13:37:03 -07:00
Eugene Yurtsev	539941281d	Fix output types for BaseChatModel (#11670 ) * Should use non chunked messages for Invoke/Batch * After this PR, stream output type is not represented, do we want to use the union? --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-11 16:02:03 -04:00
Eugene Yurtsev	99adcdb1c9	Add dedicated `type` attribute to be used solely for serialization purposes (#11585 ) Adds standard `type` field for all messages that will be serialized/validated by pydantic. * The presence of `type` makes it easier for developers consuming schemas to write client code to serialize/deserialize. * In LangServe `type` will be used for both validation and will appear in the generated openapi specs	2023-10-11 15:06:42 -04:00
eryk-dsai	06d5971be9	Fix issue #10985 - Skip model.to(device) if it is instantiated with bitsandbytes config (#11009 ) Preventing error caused by attempting to move the model that was already loaded on the GPU using the Accelerate module to the same or another device. It is not possible to load model with Accelerate/PEFT to CPU for now Addresses: [#10985](https://github.com/langchain-ai/langchain/issues/10985)	2023-10-11 09:28:27 -07:00
Nuno Campos	64969bc8ae	Add patch_config(configurable=) arg, make with_config(configurable=) merge it with existing (#11662 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-11 14:45:31 +01:00
Harrison Chase	ce0019b646	make utils conditional (#11646 )	2023-10-11 06:11:32 +01:00
Harrison Chase	8f06085b24	make tools conditional (#11647 )	2023-10-11 06:11:05 +01:00
Bassem Yacoube	5451b724fc	Adds support for llama2 and fixes MPT-7b url (#11465 ) - Description: This is an update to OctoAI LLM provider that adds support for llama2 endpoints hosted on OctoAI and updates MPT-7b url with the current one. @baskaryan Thanks! --------- Co-authored-by: ML Wiz <bassemgeorgi@gmail.com>	2023-10-10 20:34:35 -07:00
Todd Kerpelman	0bff399af1	Make metadata from the url_selenium loader match that of the web_base loader (#11617 ) Description: I noticed the metadata returned by the url_selenium loader was missing several values included by the web_base loader. (The former returned `{source: ...}`, the latter returned `{source: ..., title: ..., description: ..., language: ...}`.) This change fixes it so both loaders return all 4 key value pairs. Files have been properly formatted and all tests are passing. Note, however, that I am not much of a python expert, so that whole "Adding the imports inside the code so that tests pass" thing seems weird to me. Please LMK if I did anything wrong.	2023-10-10 20:32:45 -07:00
Tarun Thotakura	c9d4d53545	Fixed the assignment of custom_llm_provider argument (#11628 ) - Description: Assigning the custom_llm_provider to the default params function so that it will be passed to the litellm - Issue: Even though the custom_llm_provider argument is being defined it's not being assigned anywhere in the code and hence its not being passed to litellm, therefore any litellm call which uses the custom_llm_provider as required parameter is being failed. This parameter is mainly used by litellm when we are doing inference via Custom API server. https://docs.litellm.ai/docs/providers/custom_openai_proxy - Dependencies: No dependencies are required @krrishdholakia , @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-10 20:29:24 -07:00
Leonid Ganeline	db67ccb0bb	docstrings cleanup (#11640 ) Added missed docstrings. Some reformatting.	2023-10-10 19:56:47 -07:00
Yang, Bo	3a82bd7bdb	Use raise from statement so that users can find detailed error message (#11461 ) - Description: Use `raise from` statement so that users can find detailed error message - Tag maintainer: @baskaryan, @eyurtsev, @hwchase17	2023-10-10 17:25:23 -07:00
Nuno Campos	9a0ed75a95	Add configurable fields with options (#11601 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-10 22:17:22 +01:00
Bagatur	7232e082de	bump 312 (#11621 )	2023-10-10 12:34:49 -07:00
Eugene Yurtsev	58220cda72	Remove LLM Bash and related bash utilities (#11619 ) Deprecate LLMBash and related bash utilities	2023-10-10 14:54:09 -04:00
Shubham Kushwaha	49de862076	Arcee.ai LLM & Retriever integration (#11579 ) - Description: This PR introduces a new LLM and Retriever API to https://arcee.ai for the python client - Issue: implements the integrations as requested in #11578 , - Dependencies: no dependencies are required, - Tag maintainer: @hwchase17 - Twitter handle: shwooobham ✅ `make format`, `make lint` and `make test` runs locally. ```shell =========== 1245 passed, 277 skipped, 20 warnings in 16.26s =========== ./scripts/check_pydantic.sh . ./scripts/check_imports.sh poetry run ruff . [ "." = "" ] \|\| poetry run black . --check All done! ✨ 🍰 ✨ 1818 files would be left unchanged. [ "." = "" ] \|\| poetry run mypy . Success: no issues found in 1815 source files [ "." = "" ] \|\| poetry run black . All done! ✨ 🍰 ✨ 1818 files left unchanged. [ "." = "" ] \|\| poetry run ruff --select I --fix . poetry run codespell --toml pyproject.toml poetry run codespell --toml pyproject.toml -w ``` Contributions 1. Arcee (langchain/llms), ArceeRetriever (langchain/retrievers), ArceeWrapper (langchain/utilities) 2. docs for Arcee (llms/arcee.py) and ArceeRetriever(retrievers/arcee.py) 3. cc: @jacobsolawetz @ben-epstein --------- Co-authored-by: Shubham <shubham@sORo.local>	2023-10-10 10:20:45 -07:00
Eugene Yurtsev	b56ca0c2a4	Deprecate LLMSymbolicMath from langchain core (#11615 ) Deprecate LLMSymbolicMath from langchain core package.	2023-10-10 12:33:51 -04:00
Eugene Yurtsev	c9bce5bbfb	Add version to langchain_experimental (#11613 ) Add version to langchain experimental	2023-10-10 11:17:41 -04:00
Predrag Gruevski	22abeb9f6c	Disable loading jinja2 `PromptTemplate` from file. (#10252 ) jinja2 templates are not sandboxed and are at risk for arbitrary code execution. To mitigate this risk: - We no longer support loading jinja2-formatted prompt template files. - `PromptTemplate` with jinja2 may still be constructed manually, but the class carries a security warning reminding the user to not pass untrusted input into it. Resolves #4394.	2023-10-10 11:15:42 -04:00
Nuno Campos	c7c03d4709	Fix mutation bugs in callback manager configure (#11603 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-10 14:50:18 +01:00
cccs-eric	e2a9072b80	Fix CohereRerank configuration (#11583 ) Description: CohereRerank is missing `cohere_api_key` as a field and since extras are forbidden, it is not possible to pass-in the key. The only way is to use an env variable named `COHERE_API_KEY`. For example, if trying to create a compressor like this: ```python cohere_api_key = "......Cohere api key......" compressor = CohereRerank(cohere_api_key=cohere_api_key) ``` you will get the following error: ``` File "/langchain/.venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__ raise validation_error pydantic.v1.error_wrappers.ValidationError: 1 validation error for CohereRerank cohere_api_key extra fields not permitted (type=value_error.extra) ```	2023-10-09 23:26:34 -07:00
Anar	55fef4b64b	implemented add files method in LLMRails (#11518 ) This PR provides add files method with LLMRails. Implemented here are: docs/extras/integrations/vectorstores/llm-rails.ipynb --------- Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>	2023-10-09 16:29:43 -07:00
Stephen Hankinson	316dddc7cd	fix wording of query_sql_database_tool_description (#11530 ) - Description: Fixes minor typo for the query_sql_database_tool_description in the db toolkit - Issue: N/A - Dependencies: N/A - Tag maintainer: @nfcampos - Twitter handle: N/A	2023-10-09 15:32:45 -07:00
Ash Vardanian	1acfe86353	Accelerating Math Utils with SimSIMD (#11566 ) LangChain relies on NumPy to compute cosine distances, which becomes a bottleneck with the growing dimensionality and number of embeddings. To avoid this bottleneck, in our libraries at [Unum](https://github.com/unum-cloud), we have created a specialized package - [SimSIMD](https://github.com/ashvardanian/simsimd), that knows how to use newer hardware capabilities. Compared to SciPy and NumPy, it reaches 3x-200x performance for various data types. Since publication, several LangChain users have asked me if I can integrate it into LangChain to accelerate their workflows, so here I am 🤗 ## Benchmarking To conduct benchmarks locally, run this in your Jupyter: ```py import numpy as np import scipy as sp import simsimd as simd import timeit as tt def cosine_similarity_np(X: np.ndarray, Y: np.ndarray) -> np.ndarray: X_norm = np.linalg.norm(X, axis=1) Y_norm = np.linalg.norm(Y, axis=1) with np.errstate(divide="ignore", invalid="ignore"): similarity = np.dot(X, Y.T) / np.outer(X_norm, Y_norm) similarity[np.isnan(similarity) \| np.isinf(similarity)] = 0.0 return similarity def cosine_similarity_sp(X: np.ndarray, Y: np.ndarray) -> np.ndarray: return 1 - sp.spatial.distance.cdist(X, Y, metric='cosine') def cosine_similarity_simd(X: np.ndarray, Y: np.ndarray) -> np.ndarray: return 1 - simd.cdist(X, Y, metric='cosine') X = np.random.randn(1, 1536).astype(np.float32) Y = np.random.randn(1, 1536).astype(np.float32) repeat = 1000 print("NumPy: {:,.0f} ops/s, SciPy: {:,.0f} ops/s, SimSIMD: {:,.0f} ops/s".format( repeat / tt.timeit(lambda: cosine_similarity_np(X, Y), number=repeat), repeat / tt.timeit(lambda: cosine_similarity_sp(X, Y), number=repeat), repeat / tt.timeit(lambda: cosine_similarity_simd(X, Y), number=repeat), )) ``` ## Results I ran this on an M2 Pro Macbook for various data types and different number of rows in `X` and reformatted the results as a table for readability: \| Data Type \| NumPy \| SciPy \| SimSIMD \| \| :--- \| ---: \| ---: \| ---: \| \| `f32, 1` \| 59,114 ops/s \| 80,330 ops/s \| 475,351 ops/s \| \| `f16, 1` \| 32,880 ops/s \| 82,420 ops/s \| 650,177 ops/s \| \| `i8, 1` \| 47,916 ops/s \| 115,084 ops/s \| 866,958 ops/s \| \| `f32, 10` \| 40,135 ops/s \| 24,305 ops/s \| 185,373 ops/s \| \| `f16, 10` \| 7,041 ops/s \| 17,596 ops/s \| 192,058 ops/s \| \| `f16, 10` \| 21,989 ops/s \| 25,064 ops/s \| 619,131 ops/s \| \| `f32, 100` \| 3,536 ops/s \| 3,094 ops/s \| 24,206 ops/s \| \| `f16, 100` \| 900 ops/s \| 2,014 ops/s \| 23,364 ops/s \| \| `i8, 100` \| 5,510 ops/s \| 3,214 ops/s \| 143,922 ops/s \| It's important to note that SimSIMD will underperform if both matrices are huge. That, however, seems to be an uncommon usage pattern for LangChain users. You can find a much more detailed performance report for different hardware models here: - [Apple M2 Pro](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-1-performance-on-apple-m2-pro). - [4th Gen Intel Xeon Platinum](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-2-performance-on-4th-gen-intel-xeon-platinum-8480). - [AWS Graviton 3](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-3-performance-on-aws-graviton-3). ## Additional Notes 1. Previous version used `X = np.array(X)`, to repackage lists of lists. It's an anti-pattern, as it will use double-precision floating-point numbers, which are slow on both CPUs and GPUs. I have replaced it with `X = np.array(X, dtype=np.float32)`, but a more selective approach should be discussed. 2. In numerical computations, it's recommended to explicitly define tolerance levels, which were previously avoided in `np.allclose(expected, actual)` calls. For now, I've set absolute tolerance to distance computation errors as 0.01: `np.allclose(expected, actual, atol=1e-2)`. --- - Dependencies: adds `simsimd` dependency - Tag maintainer: @hwchase17 - Twitter handle: @ashvardanian --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-09 14:56:55 -07:00
benchello	5de64e6d60	Add option to specify metadata columns in CSV loader (#11576 ) #### Description This PR adds the option to specify additional metadata columns in the CSVLoader beyond just `Source`. The current CSV loader includes all columns in `page_content` and if we want to have columns specified for `page_content` and `metadata` we have to do something like the below.: ``` csv = pd.read_csv( "path_to_csv" ).to_dict("records") documents = [ Document( page_content=doc["content"], metadata={ "last_modified_by": doc["last_modified_by"], "point_of_contact": doc["point_of_contact"], } ) for doc in csv ] ``` #### Usage Example Usage: ``` csv_test = CSVLoader( file_path="path_to_csv", metadata_columns=["last_modified_by", "point_of_contact"] ) ``` Example CSV: ``` content, last_modified_by, point_of_contact "hello world", "Person A", "Person B" ``` Example Result: ``` Document { page_content: "hello world" metadata: { row: '0', source: 'path_to_csv', last_modified_by: 'Person A', point_of_contact: 'Person B', } ``` --------- Co-authored-by: Ben Chello <bchello@dropbox.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-09 14:56:45 -07:00
Stephen Hankinson	447a523662	fix comments in output format (#11536 ) - Description: Fixes the comments in the ConvoOutputParser. Because the \\\\ is escaping a single \\, they render something like: `"action_input": string \ The input to the action` in the prompt. Changing this to \\\\\\\\ lets it escape two slashes so that it renders a proper comment: `"action_input": string \\ The input to the action` - Issue: N/A - Dependencies: - Tag maintainer: @hwchase17 - Twitter handle:	2023-10-09 14:55:44 -07:00
Michael Landis	8e45f720a8	feat: add momento vector index as a vector store provider (#11567 ) Description: - Added Momento Vector Index (MVI) as a vector store provider. This includes an implementation with docstrings, integration tests, a notebook, and documentation on the docs pages. - Updated the Momento dependency in pyproject.toml and the lock file to enable access to MVI. - Refactored the Momento cache and chat history session store to prefer using "MOMENTO_API_KEY" over "MOMENTO_AUTH_TOKEN" for consistency with MVI. This change is backwards compatible with the previous "auth_token" variable usage. Updated the code and tests accordingly. Dependencies: - Updated Momento dependency in pyproject.toml. Testing: - Run the integration tests with a Momento API key. Get one at the [Momento Console](https://console.gomomento.com) for free. MVI is available in AWS us-west-2 with a superuser key. - `MOMENTO_API_KEY=<your key> poetry run pytest tests/integration_tests/vectorstores/test_momento_vector_index.py` Tag maintainer: @eyurtsev Twitter handle: Please mention @momentohq for this addition to langchain. With the integration of Momento Vector Index, Momento caching, and session store, Momento provides serverless support for the core langchain data needs. Also mention @mlonml for the integration.	2023-10-09 14:02:59 -07:00
Eugene Yurtsev	ca2eed36b7	LangChain cli fix a few bugs (#11573 ) Code was assuming that `git` and `poetry` exist. In addition, it was not ignoring pycache files that get generated during run time	2023-10-09 13:30:16 -07:00
Hugues Chocart	258ae1ba5f	[LLMonitor Callback Handler]: Add error handling (#11563 ) Wraps every callback handler method in error handlers to avoid breaking users' programs when an error occurs inside the handler. Thanks @valdo99 for the suggestion 🙂	2023-10-09 13:26:35 -07:00
Eugene Yurtsev	2aabfafe1e	Module documentation for langchain runnables (#11550 ) Add in code documentation for langchain runnables module.	2023-10-09 16:02:29 -04:00
Eugene Yurtsev	d8fa94e6fa	RunnablePassthrough: In code documentation (#11552 ) Add in code documentation for a runnable passthrough	2023-10-09 16:02:16 -04:00
Eugene Yurtsev	b42f218cfc	RunnableLambda: Add in code docs (#11521 ) Add in code docs for Runnable Lambda	2023-10-09 14:37:46 -04:00
maks-operlejn-ds	f64522fbaf	Reset deanonymizer mapping (#11559 ) @hwchase17 @baskaryan	2023-10-09 11:11:05 -07:00
maks-operlejn-ds	b14b65d62a	Support all presidio entities (#11558 ) https://microsoft.github.io/presidio/supported_entities/ @baskaryan @hwchase17	2023-10-09 11:10:46 -07:00
maks-operlejn-ds	4d62def9ff	Better deanonymizer matching strategy (#11557 ) @baskaryan, @hwchase17	2023-10-09 11:10:29 -07:00
Ash Vardanian	a992b9670d	Fix: Missing DuckDuckGo package version (#11535 ) [The `duckduckgo-search` v3.9.2 was removed from PyPi](https://pypi.org/project/duckduckgo-search/#history). That breaks the build. - Description: refreshes the Poetry dependency to v3.9.3 - Tag maintainer: @baskaryan - Twitter handle: @ashvardanian	2023-10-09 10:55:46 -07:00
Bagatur	8932ed3f07	bump 311 (#11555 )	2023-10-09 08:17:07 -07:00
Bagatur	e7a0def1bc	QoL improvements to query constructor (#11504 ) updating query constructor and self query retriever to - make it easier to pass in examples - validate attributes used in query - remove invalid parts of query - make it easier to get + edit prompt - make query constructor a runnable - make self query retriever use as runnable	2023-10-09 08:10:52 -07:00
Taikono-Himazin	eec53fa294	Added autodetect_encoding option to csvLoader (#11327 )	2023-10-09 08:06:43 -07:00
Holt Skinner	09c66fe04f	feat: Update Google Document AI Parser (#11413 ) - Description: Code Refactoring, Documentation Improvements for Google Document AI PDF Parser - Adds Online (synchronous) processing option. - Adds default field mask to limit payload size. - Skips Human review by default. - Issue: Fixes #10589 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-09 08:04:25 -07:00
Nuno Campos	628cc4cce8	Rename RunnableMap to RunnableParallel (#11487 ) - keep alias for RunnableMap - update docs to use RunnableParallel and RunnablePassthrough.assign <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-09 11:22:03 +01:00
Eugene Yurtsev	6a10e8ef31	Add documentation to Runnable (#11516 )	2023-10-08 08:09:04 +01:00
William FH	eb572f41a6	Add LangSmith Run Chat Loader (#11458 )	2023-10-06 17:02:18 -07:00
David Duong	484947c492	Fetch up-to-date attributes for env-pulled kwargs during serialisation of OpenAI classes (#11499 )	2023-10-06 22:43:29 +01:00
Bagatur	5470e730d2	raise openapi import error (#11495 )	2023-10-06 12:57:24 -07:00
Erick Friis	29f5f70415	Rename some last hwchase17/langchain links (#11494 )	2023-10-06 12:34:30 -07:00
Fabrice Pont	872836c541	feat: add markdown list parser (#11411 ) Description: add `MarkdownListOutputParser` as a new `ListOutputParser` Issue: #11410	2023-10-06 12:25:45 -07:00
Erick Friis	8f50b616c5	Remove optional from vectara source (#11493 ) fyi @ofermend --------- Co-authored-by: Ofer Mendelevitch <ofer@vectara.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>	2023-10-06 12:12:44 -07:00
Bagatur	53887242a1	bump 310 (#11486 )	2023-10-06 09:49:10 -07:00
Jesús Vélez Santiago	a1c7532298	Add async sql record manager and async indexing API (#10726 ) - Description: Add support for a SQLRecordManager in async environments. It includes the creation of `RecorManagerAsync` abstract class. - Issue: None - Dependencies: Optional `aiosqlite`. - Tag maintainer: @nfcampos - Twitter handle: @jvelezmagic --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-06 09:38:44 -04:00
Qihui Xie	57ade13b2b	fix llm_inputs duplication problem in intermediate_steps in SQLDatabaseChain (#10279 ) Use `.copy()` to fix the bug that the first `llm_inputs` element is overwritten by the second `llm_inputs` element in `intermediate_steps`. *Problem description:* In [line 127]( `c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L127C17-L127C17)`), the `llm_inputs` of the sql generation step is appended as the first element of `intermediate_steps`: ``` intermediate_steps.append(llm_inputs) # input: sql generation ``` However, `llm_inputs` is a mutable dict, it is updated in [line 179](https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py#L179) for the final answer step: ``` llm_inputs["input"] = input_text ``` Then, the updated `llm_inputs` is appended as another element of `intermediate_steps` in [line 180](`c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L180)`): ``` intermediate_steps.append(llm_inputs) # input: final answer ``` As a result, the final `intermediate_steps` returned in [line 189](`c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L189C43-L189C43)`) actually contains two same `llm_inputs` elements, i.e., the `llm_inputs` for the sql generation step overwritten by the one for final answer step by mistake. Users are not able to get the actual `llm_inputs` for the sql generation step from `intermediate_steps` Simply calling `.copy()` when appending `llm_inputs` to `intermediate_steps` can solve this problem.	2023-10-05 21:32:08 -07:00
Florian	d78f418c0d	Extract abstracts from Pubmed articles, even if they have no extra label (#10245 ) ### Description This pull request involves modifications to the extraction method for abstracts/summaries within the PubMed utility. A condition has been added to verify the presence of unlabeled abstracts. Now an abstract will be extracted even if it does not have a subtitle. In addition, the extraction of the abstract was extended to books. ### Issue The PubMed utility occasionally returns an empty result when extracting abstracts from articles, despite the presence of an abstract for the paper on PubMed. This issue arises due to the varying structure of articles; some articles follow a "subtitle/label: text" format, while others do not include subtitles in their abstracts. An example of the latter case can be found at: [https://pubmed.ncbi.nlm.nih.gov/37666905/](url) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:56:46 -07:00
Viktor Zhemchuzhnikov	fd9da60aea	Add async support to SelfQueryRetriever (#10175 ) ### Description SelfQueryRetriever is missing async support, so I am adding it. I also removed deprecated predict_and_parse method usage here, and added some tests. ### Issue N/A ### Tag maintainer Not yet ### Twitter handle N/A	2023-10-05 18:54:21 -07:00
Theron Tau	35297ca0d3	Add feature for extracting images from pdf and recognizing text from images. (#10653 ) Description It is for #10423 that it will be a useful feature if we can extract images from pdf and recognize text on them. I have implemented it with `PyPDFLoader`, `PyPDFium2Loader`, `PyPDFDirectoryLoader`, `PyMuPDFLoader`, `PDFMinerLoader`, and `PDFPlumberLoader`. [RapidOCR](https://github.com/RapidAI/RapidOCR.git) is used to recognize text on extracted images. It is time-consuming for ocr so a boolen parameter `extract_images` is set to control whether to extract and recognize. I have tested the time usage for each parser on my own laptop thinkbook 14+ with AMD R7-6800H by unit test and the result is: \| extract_images \| PyPDFParser \| PDFMinerParser \| PyMuPDFParser \| PyPDFium2Parser \| PDFPlumberParser \| \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| ------------- \| \| False \| 0.27s \| 0.39s \| 0.06s \| 0.08s \| 1.01s \| \| True \| 17.01s \| 20.67s \| 20.32s \| 19,75s \| 20.55s \| Issue #10423 Dependencies rapidocr_onnxruntime in [RapidOCR](https://github.com/RapidAI/RapidOCR/tree/main) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:51:59 -07:00
Bagatur	8e3fbc97ca	Add vowpal_wabbit RL chain (#11462 )	2023-10-05 18:39:45 -07:00
Haris Wang	f1269830a0	Fix bug in MarkdownHeaderTextSplitter for codeblock (#10262 ) - Description: The previous version of the MarkdownHeaderTextSplitter did not take into account the possibility of '#' appearing within code blocks, which caused segmentation anomalies in these situations. This PR has fixed this issue. - Issue: - Dependencies: No - Tag maintainer: - Twitter handle: cc @baskaryan @eyurtsev @rlancemartin --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:34:42 -07:00
Eddie Cohen	656d2303f7	add in, nin for pinecone (#10303 ) Description: Adds the in and nin comparators for pinecone seen [here](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 18:31:09 -07:00
Bagatur	a3a2ce623e	Revise vowpal_wabbit notebook	2023-10-05 18:18:19 -07:00
Bagatur	8fafa1af91	merge	2023-10-05 18:09:35 -07:00
olgavrou	3b07c0cf3d	RL Chain with VowpalWabbit (#10242 ) - Description: This PR adds a new chain `rl_chain.PickBest` for learned prompt variable injection, detailed description and usage can be found in the example notebook added. It essentially adds a [VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer before the llm call in order to learn or personalize prompt variable selections. Most of the code is to make the API simple and provide lots of defaults and data wrangling that is needed to use Vowpal Wabbit, so that the user of the chain doesn't have to worry about it. - Dependencies: [vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/), - sentence-transformers (already a dep) - numpy (already a dep) - tagging @ataymano who contributed to this chain - Tag maintainer: @baskaryan - Twitter handle: @olgavrou Added example notebook and unit tests	2023-10-05 18:07:22 -07:00
Manikanta5112	56048b909f	added ContentFormatter escape special characters for message content (#10319 ) --------- Co-authored-by: Manikanta5112 <42089393+mani5112@users.noreply.github.com>	2023-10-05 18:02:29 -07:00
Leonid Ganeline	d17416ec79	docstrings `callbacks` (#11456 ) Added missed docstrings to the `callbacks/` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-05 17:13:14 -07:00
Ofer Mendelevitch	3c7653bf0f	"source" argument in constructor of Vectara (#11454 ) Replace this entire comment with: - Description: minor update to constructor to allow for specification of "source" - Tag maintainer: @baskaryan - Twitter handle: @ofermend	2023-10-05 17:04:14 -07:00
Eugene Yurtsev	d9018ae5f1	Improve CLI ux (#11452 ) Improve UX for cli	2023-10-05 19:40:00 -04:00
Jaikanth J	9f85f7c543	fix(cache): use dumps for RedisCache (#10408 ) # Description Attempts to fix RedisCache for ChatGenerations using `loads` and `dumps` used in SQLAlchemy cache by @hwchase17 . this is better than pickle dump, because this won't execute any arbitrary code during de-serialisation. # Issues #7722 & #8666 # Dependencies None, but removes the warning introduced in #8041 by @baskaryan Handle: @jaikanthjay46	2023-10-05 16:34:07 -07:00
rodrigo-clickup	5944c1851b	Add ClickUp Toolkit (#10662 ) - Description: Adds a toolkit to interact with the [ClickUp](https://clickup.com/) [Public API](https://clickup.com/api/) - Dependencies: None - Tag maintainer: @rodrigo-georgian, @rodrigo-clickup, @aiswaryasankarwork - Twitter handle: - Aiswarya (https://twitter.com/Aiswarya_Sankar, https://www.linkedin.com/in/sankaraiswarya/) - Rodrigo (https://www.linkedin.com/in/rodrigo-ceballos-lentini/) --------- Co-authored-by: Aiswarya Sankar <aiswaryasankar@Aiswaryas-MacBook-Pro.local> Co-authored-by: aiswaryasankarwork <143119412+aiswaryasankarwork@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 16:33:05 -07:00
John Reynolds	68901e1e40	Update output_parser.py (#10430 ) - Description: Updated output parser for mrkl to remove any hallucination actions after the final answer; this was encountered when using Anthropic claude v2 for planning; reopening PR with updated unit tests - Issue: #10278 - Dependencies: N/A - Twitter handle: @johnreynolds	2023-10-05 15:47:24 -07:00
Joshua Sundance Bailey	790010703b	ArcGISLoader: Limit number of results in query (#10615 ) Description: this PR changes the `ArcGISLoader` to set `return_all_records` to `False` when `result_record_count` is provided as a keyword argument. Previously, `return_all_records` was `True` by default and this made the API ignore `result_record_count`. Issue: `ArcGISLoader` would ignore `result_record_count` unless user also passed `return_all_records=False`.	2023-10-05 15:46:02 -07:00
mrbean	9903a70379	Add youdotcom retriever (#11304 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 13:48:11 -07:00
ashish-dahal	1655ff2ded	Fix PyMuPDFLoader kwargs (#11434 ) - Description: Fix the `PyMuPDFLoader` to accept `loader_kwargs` from the document loader's `loader_kwargs` option. This provides more flexibility in formatting the output from documents. - Issue: The `loader_kwargs` is not passed into the `load` method from the document loader, which limits configuration options. - Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 13:25:19 -07:00
Leonid Kuligin	e4a46747dc	integration test for DocAI parser (#11424 ) - Description: added an integration test - Issue: #11407 @baskaryan	2023-10-05 12:38:29 -07:00
Aashish Saini	2abbdc6ecb	Update bageldb.py (#11421 ) I have restructured the code to ensure uniform handling of ImportError. In place of previously used ValueError, I've adopted the standard practice of raising ImportError with explanatory messages. This modification enhances code readability and clarifies that any problems stem from module importation.	2023-10-05 12:37:56 -07:00
maks-operlejn-ds	2aae1102b0	Instance anonymization (#10501 ) ### Description Add instance anonymization - if `John Doe` will appear twice in the text, it will be treated as the same entity. The difference between `PresidioAnonymizer` and `PresidioReversibleAnonymizer` is that only the second one has a built-in memory, so it will remember anonymization mapping for multiple texts: ``` >>> anonymizer = PresidioAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Brett Russell. Hi Brett Russell!' ``` ``` >>> anonymizer = PresidioReversibleAnonymizer() >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' >>> anonymizer.anonymize("My name is John Doe. Hi John Doe!") 'My name is Noah Rhodes. Hi Noah Rhodes!' ``` ### Twitter handle @deepsense_ai / @MaksOpp ### Tag maintainer @baskaryan @hwchase17 @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:23:02 -07:00
Kyle Pancamo	203258b4d6	Update pdf.py comment for PyPDFLoader (#10495 ) PyPDF does not chunk at the character level to my understanding. Description: PyPDF does not chunk at the character level, but instead breaks up content by page. Fixup comment --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:22:40 -07:00
Juan Daza	4236ae3851	Added Streaming Capability to SageMaker LLMs (#10535 ) This PR adds the ability to declare a Streaming response in the SageMaker LLM by leveraging the `invoke_endpoint_with_response_stream` capability in `boto3`. It is heavily based on the AWS Blog Post announcement linked [here](https://aws.amazon.com/blogs/machine-learning/elevating-the-generative-ai-experience-introducing-streaming-support-in-amazon-sagemaker-hosting/). It does not add any additional dependencies since it uses the existing `boto3` version. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:08:43 -07:00
Laurentiu Piciu	d9670a5945	openai_functions_multi_agent: solved the case when the "arguments" is valid JSON but it does not contain `actions` key (#10543 ) Description: There are cases when the output from the LLM comes fine (i.e. function_call["arguments"] is a valid JSON object), but it does not contain the key "actions". So I split the validation in 2 steps: loading arguments as JSON and then checking for "actions" in it. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 11:08:09 -07:00
Eugene Yurtsev	fcccde406d	Add SymbolicMathChain to experiment in preparation for deprecation (#11129 ) Move symbolic math chain to experimental	2023-10-05 13:54:43 -04:00
Holt Skinner	9f73fec057	fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513 ) - Description: Google Cloud Enterprise Search was renamed to Vertex AI Search - https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available - This PR updates the documentation and Retriever class to use the new terminology. - Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to `GoogleVertexAISearchRetriever` - Updated documentation to specify that `extractive_segments` requires the new [Enterprise edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features) to be enabled. - Fixed spelling errors in documentation. - Change parameter for Retriever from `search_engine_id` to `data_store_id` - When this retriever was originally implemented, there was no distinction between a data store and search engine, but now these have been split. - Fixed an issue blocking some users where the api_endpoint can't be set	2023-10-05 10:47:47 -07:00
Patrick Randell	1d678f805f	Additional Weaviate Filter Comparators (#10522 ) ### Description When using Weaviate Self-Retrievers, certain common filter comparators generated by user queries were unimplemented, resulting in errors. This PR implements some of them. All linting and format commands have been run and tests passed. ### Issue #10474 ### Dependencies timestamp module --------- Co-authored-by: Patrick Randell <prandell@deloitte.com.au>	2023-10-05 10:40:04 -07:00
Nuno Campos	79011f835f	Remove str() from RunnableConfigurableAlternatives (#11446 )	2023-10-05 18:40:00 +01:00
Harrison Chase	31d5bd84d7	make vectorstores optional (#11393 )	2023-10-05 10:14:05 -07:00
Eugene Yurtsev	8aa545901a	Update agent type docs (#11137 ) In code docs for agent types	2023-10-05 12:51:14 -04:00
Eugene Yurtsev	3e31d6e35f	Start deprecation of LLMBashChain (#11300 ) In preparation for migration LLMBashChain and related tools add a derprecation warning to the code.	2023-10-05 12:48:22 -04:00
Bagatur	8b6b8bf68c	bump 309 (#11443 )	2023-10-05 09:29:14 -07:00
billytrend-cohere	2ff91a46c0	Add cohere /chat integration (#11389 ) Add cohere /chat integration and an iPython notebook to demonstrate the addition. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-05 09:20:47 -07:00
adrienohana	ca346011b7	added interactive login for azure cognitive search vector store (#11360 ) Description: Previously if the access to Azure Cognitive Search was not done via an API key, the default credential was called which doesn't allow to use an interactive login. I simply added the option to use "INTERACTIVE" as a key name, and this will launch a login window upon initialization of the AzureSearch object.	2023-10-05 09:20:18 -07:00
Eugene Yurtsev	5a1f614175	Add docker compose to CLI (#11406 ) Add docker compose to cli	2023-10-05 15:58:56 +01:00
Predrag Gruevski	e2d6c41177	Upgrade langchain dependencies. (#11420 ) I was hoping this would pick up numpy 1.26, which is required to support the new Python 3.12 release, but it didn't. It seems that some transitive dependency requirement on numpy is preventing that, and the highest we can currently go is 1.24.x. But to find this out required a 15min `poetry lock`, so I figured we might as well upgrade the dependencies we can and hopefully make the next dependency upgrade a bit smaller.	2023-10-05 15:57:20 +01:00
Jacob Lee	71fd6428c5	Remove overridden async not implemented method on embeddings filters and add default async implementation for document compressors (#11415 ) @nfcampos @eyurtsev @baskaryan --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-10-05 15:56:03 +01:00
Nuno Campos	2f490be09b	Fix .dict() for agent/chain (#11436 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-05 15:51:21 +01:00
Nuno Campos	1e59c44d36	Nc/5oct/runnable release (#11428 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-05 14:27:50 +01:00
Bagatur	58b7a3ba16	Rm bedrock anthropic error (#11403 )	2023-10-04 23:31:51 -04:00
Predrag Gruevski	c9986bc3a9	Tweak type hints to match dependency's behavior. (#11355 ) Needs #11353 to merge first, and a new `langchain` to be published with those changes.	2023-10-04 22:36:58 -04:00
William FH	940b9ae30a	Normalize Option in Scoring Chain (#11412 )	2023-10-04 15:59:28 -07:00
Eugene Yurtsev	70be04a816	CLI: Readme update (#11404 ) Consolidating to a single README for now, will be easier to maintain we can differentiate between poetry and pip later. Does not seem critical. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-04 16:25:37 -04:00
Nuno Campos	fde19c8667	Add CLI command to create a new project (#7837 ) First version of CLI command to create a new langchain project template Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-04 15:43:41 -04:00
mhwang-stripe	9cea796671	Make langchain compatible with SQLAlchemy<1.4.0 (#11390 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> ## Description Currently SQLAlchemy >=1.4.0 is a hard requirement. We are unable to run `from langchain.vectorstores import FAISS` with SQLAlchemy <1.4.0 due to top-level imports, even if we aren't even using parts of the library that use SQLAlchemy. See Testing section for repro. Let's make it so that langchain is still compatible with SQLAlchemy <1.4.0, especially if we aren't using parts of langchain that require it. The main conflict is that SQLAlchemy removed `declarative_base` from `sqlalchemy.ext.declarative` in 1.4.0 and moved it to `sqlalchemy.orm`. We can fix this by try-catching the import. This is the same fix as applied in https://github.com/langchain-ai/langchain/pull/883. (I see that there seems to be some refactoring going on about isolating dependencies, e.g. `c87e9fb2ce`, so if this issue will be eventually fixed by isolating imports in langchain.vectorstores that also works). ## Issue I can't find a matching issue. ## Dependencies No additional dependencies ## Maintainer @hwchase17 since you reviewed https://github.com/langchain-ai/langchain/pull/883 ## Testing I didn't add a test, but I manually tested this. 1. Current failure: ``` langchain==0.0.305 sqlalchemy==1.3.24 ``` ``` python python -i >>> from langchain.vectorstores import FAISS Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/__init__.py", line 58, in <module> from langchain.vectorstores.pgembedding import PGEmbedding File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/pgembedding.py", line 10, in <module> from sqlalchemy.orm import Session, declarative_base, relationship ImportError: cannot import name 'declarative_base' from 'sqlalchemy.orm' (/pay/src/zoolander/vendor3/lib/python3.8/site-packages/sqlalchemy/orm/__init__.py) ``` 2. This fix: ``` langchain==<this PR> sqlalchemy==1.3.24 ``` ``` python python -i >>> from langchain.vectorstores import FAISS <succeeds> ```	2023-10-04 15:41:20 -04:00
Nuno Campos	4d66756d93	Improve output of Runnable.astream_log() (#11391 ) - Make logs a dictionary keyed by run name (and counter for repeats) - Ensure no output shows up in lc_serializable format - Fix up repr for RunLog and RunLogPatch <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-04 20:16:37 +01:00
Lester Solbakken	a30f98f534	Add Vespa vector store (#11329 ) Addition of Vespa vector store integration including notebook showing its use. Maintainer: @lesters Twitter handle: LesterSolbakken	2023-10-04 14:59:11 -04:00
Nuno Campos	58a88f3911	Add optional input_types to prompt template (#11385 ) - default MessagesPlaceholder one to list of messages <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-04 18:54:53 +01:00
Tomaz Bratanic	71290315cf	Add optional Cypher validation tool (#11078 ) LLMs have trouble with consistently getting the relationship direction accurately. That's why I organized a competition how to best and most simple to fix it based on the existing schema as a post-processing step. https://github.com/tomasonjo/cypher-direction-competition I am adding the winner's code in this PR: https://github.com/sakusaku-rich/cypher-direction-competition	2023-10-04 12:54:37 -04:00
Bagatur	dd514c2781	bump 308 (#11383 )	2023-10-04 12:10:09 -04:00
Leonid Kuligin	4f4e0f38fc	a better error description when GCP project is not set (#11377 ) - Description: a little bit better error description - Issue: #10879	2023-10-04 11:57:47 -04:00
Nuno Campos	0d80226c64	Add _type to json functions output parser (#11381 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. -->	2023-10-04 16:56:45 +01:00
Bagatur	106608bc89	add default async (#11141 )	2023-10-04 11:40:35 -04:00
Nuno Campos	b0893c7c6a	Use an enum for configurable_alternatives to make the generated json schema nicer (#11350 )	2023-10-04 11:32:41 -04:00
Bagatur	b499de2926	Anthropic system message fix (#11301 ) Removes human prompt prefix before system message for anthropic models Bedrock anthropic api enforces that Human and Assistant messages must be interleaved (cannot have same type twice in a row). We currently treat System Messages as human messages when converting messages -> string prompt. Our validation when using Bedrock/BedrockChat raises an error when this happens. For ChatAnthropic we don't validate this so no error is raised, but perhaps the behavior is still suboptimal	2023-10-04 11:32:24 -04:00
Massimiliano Angelino	2f83350eac	Feat bedrock cohere support (#11230 ) Description: Added support for Cohere command model via Bedrock. With this change it is now possible to use the `cohere.command-text-v14` model via Bedrock API. About Streaming: Cohere model outputs 2 additional chunks at the end of the text being generated via streaming: a chunk containing the text `<EOS_TOKEN>`, and a chunk indicating the end of the stream. In this implementation I chose to ignore both chunks. An alternative solution could be to replace `<EOS_TOKEN>` with `\n` Tests: manually tested that the new model work with both `llm.generate()` and `llm.stream()`. Tested with `temperature`, `p` and `stop` parameters. Issue: #11181 Dependencies: No new dependencies Tag maintainer: @baskaryan Twitter handle: mangelino --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-04 11:12:19 -04:00
Daniel Butler	939bceccb0	GitHubIssuesLoader Custom API URL Support (#11378 ) - Description: Adds support for custom API URL in the GitHubIssuesLoader. This allows it to be used with Github enterprise instances.	2023-10-04 10:17:46 -04:00
Bagatur	16a80779b9	bump 307 (#11380 )	2023-10-04 10:03:17 -04:00
mziru	9e3c1d4463	add HTMLHeaderTextSplitter (#11039 ) Description: Similar in concept to the `MarkdownHeaderTextSplitter`, the `HTMLHeaderTextSplitter` is a "structure-aware" chunker that splits text at the element level and adds metadata for each header "relevant" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline. Dependency: lxml python package Maintainer: @hwchase17 Twitter handle: @MartinZirulnik --------- Co-authored-by: PresidioVantage <github@presidiovantage.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-04 09:24:25 -04:00
Predrag Gruevski	289de601c8	Use parameterized queries to select SQL schemas. (#11356 )	2023-10-04 05:43:30 +01:00
Nuno Campos	b0097f8908	In ProgressBarCallback update the progress counter also when runs fin… (#11332 )	2023-10-04 05:04:59 +01:00
William FH	06f39be1c2	Wfh/eval max concurrency (#11368 )	2023-10-03 20:18:14 -07:00
Aashish Saini	4adb2b399d	Fixed exception type in py files (#11322 ) I've refactored the code to ensure that ImportError is consistently handled. Instead of using ValueError as before, I've now followed the standard practice of raising ImportError along with clear and informative error messages. This change enhances the code's clarity and explicitly signifies that any problems are associated with module imports.	2023-10-03 21:46:26 -04:00
니콜라스	c6d7124675	Add 'device' to GPT4All (#11216 ) Add device to GPT4All - Description: GPT4All now supports GPU. This commit adds the option to enable it. - Issue: It closes https://github.com/langchain-ai/langchain/issues/10486 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-03 17:37:30 -07:00
Harrison Chase	6e848b879a	add default for async (#11367 )	2023-10-03 17:28:14 -07:00
Fynn Flügge	0a4baca291	chore: add kotlin code splitter (#11364 ) <!-- Thank you for contributing to LangChain! Replace this entire comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out! Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/extras` directory. If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> - Description: Adds Kotlin language to `TextSplitter` --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-03 18:35:36 -04:00
Ofer Mendelevitch	b93a08079e	Updates to Vectara Implementation (#11366 ) Replace this entire comment with: - Description: updates to documentation and API headers - Tag maintainer: @baskarya - Twitter handle: @ofermend	2023-10-03 18:34:39 -04:00
Erick Friis	745e3e29da	add getattr case for llms.type_to_cls_dict (#11362 ) For external libraries that depend on `type_to_cls_dict`, adds a workaround to continue using the old format. Recommend people use `get_type_to_cls_dict()` instead and only resolve the imports when they're used.	2023-10-03 14:34:30 -07:00
Vicente Reyes	f3e13e7e5a	Use term keyword according to the official python doc glossary (#11338 ) - Description: use term keyword according to the official python doc glossary, see https://docs.python.org/3/glossary.html - Issue: not applicable - Dependencies: not applicable - Tag maintainer: @hwchase17 - Twitter handle: vreyespue	2023-10-03 12:56:08 -07:00
Predrag Gruevski	5d6b83d9cf	Make a copy of external data instead of mutating another object's attributes. (#11349 ) Fix for a bug surfaced as part of #11339. `mypy` caught this since the types didn't match up.	2023-10-03 15:27:51 -04:00
Predrag Gruevski	42d979efdd	Improve type hints and interface for SQL execution functionality. (#11353 ) The previous API of the `_execute()` function had a few rough edges that this PR addresses: - The `fetch` argument was type-hinted as being able to take any string, but any string other than `"all"` or `"one"` would `raise ValueError`. The new type hints explicitly declare that only those values are supported. - The return type was type-hinted as `Sequence` but using `fetch = "one"` would actually return a single result item. This was incorrectly suppressed using `# type: ignore`. We now always return a list. - Using `fetch = "one"` would return a single item if data was found, or an empty list if no data was found. This was confusing, and we now always return a list to simplify. - The return type was `Sequence[Any]` which was a bit difficult to use since it wasn't clear what one could do with the returned rows. I'm making the new type `Dict[str, Any]` that corresponds to the column names and their values in the query. I've updated the use of this method elsewhere in the file to match the new behavior.	2023-10-03 15:19:08 -04:00
Mohammad Mohtashim	3bddd708f7	Add memory to sql chain (#8597 ) continuation of PR #8550 @hwchase17 please see and merge. And also close the PR #8550. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-03 12:04:39 -07:00
Harrison Chase	feabf2e0d5	make llm imports optional (#11237 )	2023-10-03 09:14:15 -07:00
Harrison Chase	88bad37ec2	fix get_tool_return (#11346 )	2023-10-03 09:01:05 -07:00
Harrison Chase	bdf865d8e8	better error message on parsing errors (#11342 )	2023-10-03 09:00:17 -07:00

... 13 14 15 16 17 ...

2663 Commits