langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Yusuf Khan	935f78c944	FEATURE: Add retriever for Outline (#13889 ) - Description: Added a retriever for the Outline API to ask questions on knowledge base - Issue: resolves #11814 - Dependencies: None - Tag maintainer: @baskaryan	2023-11-26 18:56:12 -08:00
Bagatur	0efa59cbb8	RELEASE: 0.0.339rc3 (#13852 )	2023-11-25 10:37:30 -08:00
Bagatur	7222c42077	RELEASE: core 0.0.6 (#13853 )	2023-11-25 10:21:14 -08:00
raelix	c172605ea6	IMPROVEMENT: Added title metadata to GoogleDriveLoader for optional File Loaders (#13832 ) - Description: Simple change, I just added title metadata to GoogleDriveLoader for optional File Loaders - Dependencies: no dependencies - Tag maintainer: @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-24 18:53:55 -08:00
Stefano Lottini	19c68c7652	FEATURE: Astra DB, LLM cache classes (exact-match and semantic cache) (#13834 ) This PR provides idiomatic implementations for the exact-match and the semantic LLM caches using Astra DB as backend through the database's HTTP JSON API. These caches require the `astrapy` library as dependency. Comes with integration tests and example usage in the `llm_cache.ipynb` in the docs. @baskaryan this is the Astra DB counterpart for the Cassandra classes you merged some time ago, tagging you for your familiarity with the topic. Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-24 18:53:37 -08:00
Stefano Lottini	272df9dcae	Astra DB, chat message history (#13836 ) This PR adds a chat message history component that uses Astra DB for persistence through the JSON API. The `astrapy` package is required for this class to work. I have added tests and a small notebook, and updated the relevant references in the other docs pages. (@rlancemartin this is the counterpart of the Cassandra equivalent class you so helpfully reviewed back at the end of June) Thank you!	2023-11-24 18:12:29 -08:00
Bagatur	58f7e109ac	BUGFIX: Add import types and typevars from core (#13829 )	2023-11-24 17:04:10 -08:00
Bagatur	751226e067	bump 0.0.339rc2 (#13787 )	2023-11-23 12:50:09 -08:00
Bagatur	300ff01824	RELEASE: core 0.0.5 (#13786 )	2023-11-23 12:23:50 -08:00
Bagatur	72c108b003	IMPROVEMENT: filter global warnings properly (#13754 )	2023-11-22 16:26:37 -08:00
William FH	163bf165ed	Add Batch Size kwarg to the llm start callback (#13483 ) So you can more easily use the token counts directly from the API endpoint for batch size of 1	2023-11-22 14:47:57 -08:00
Bagatur	0be515f720	RELEASE: 0.0.339rc1 (#13746 )	2023-11-22 14:29:49 -08:00
Bagatur	2bc5bd67f7	RELEASE: core 0.0.4 (#13745 )	2023-11-22 13:57:28 -08:00
Bagatur	32d087fcb8	REFACTOR: combine core documents files (#13733 )	2023-11-22 10:10:26 -08:00
William FH	5b90fe5b1c	Fix locking (#13725 )	2023-11-22 07:37:25 -08:00
Bagatur	16af282429	BUGFIX: add prompt imports for backwards compat (#13702 )	2023-11-21 23:04:20 -08:00
Bagatur	e327bb4ba4	IMPROVEMENT: Conditionally import core type hints (#13700 )	2023-11-21 21:38:49 -08:00
dandanwei	d47ee1ae79	BUGFIX: redis vector store overwrites falsey metadata (#13652 ) - Description: This commit fixes a problem where the Redis vector store changed a metadata value from 0 to empty when saving the document, which is unintended behavior. - Issue: N/A - Dependencies: N/A	2023-11-21 20:16:23 -08:00
Bagatur	a21e84faf7	BUGFIX: llm backwards compat imports (#13698 )	2023-11-21 20:12:35 -08:00
Yujie Qian	ace9e64d62	IMPROVEMENT: VoyageEmbeddings embed_general_texts (#13620 ) - Description: add method embed_general_texts in VoyageEmbeddings to support input_type - Twitter handle: @Voyage_AI_	2023-11-21 18:33:07 -08:00
tanujtiwari-at	5064890fcf	BUGFIX: handle tool message type when converting to string (#13626 ) Description: Currently, if we pass in a ToolMessage back to the chain, it crashes with error `Got unsupported message type: ` This fixes it. Tested locally --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-21 18:20:58 -08:00
Josep Pon Farreny	143049c90f	Added partial_variables to BaseStringMessagePromptTemplate.from_template(...) (#13645 ) Description: BaseStringMessagePromptTemplate.from_template was passing the value of partial_variables into cls(...) via **kwargs, rather than passing it to PromptTemplate.from_template, which resulted in those partial_variables being lost and becoming required input_variables. Co-authored-by: Josep Pon Farreny <josep.pon-farreny@siemens.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-21 17:48:38 -08:00
Erick Friis	c5ae9f832d	INFRA: Lint for imports (#13632 ) - Adds pydantic/import linting to core - Adds a check for `langchain_experimental` imports to langchain	2023-11-21 17:42:56 -08:00
Erick Friis	131db4ba68	BUGFIX: anthropic models on bedrock (#13629 ) Introduced in #13403	2023-11-21 17:40:29 -08:00
David Ruan	04bddbaba4	BUGFIX: Update bedrock.py to fix provider bug (#13646 ) Provider check was incorrectly failing for anything other than "meta"	2023-11-21 17:28:38 -08:00
Bagatur	dc53523837	IMPROVEMENT: bump core dep 0.0.3 (#13690 )	2023-11-21 15:50:19 -08:00
Bagatur	a208abe6b7	add callback import test (#13689 )	2023-11-21 15:28:49 -08:00
Bagatur	083afba697	BUG: Add core utils imports (#13688 )	2023-11-21 15:25:47 -08:00
Bagatur	c61e30632e	BUG: more core fixes (#13665 ) Fix some circular deps: - move PromptValue into top level module bc both PromptTemplates and OutputParsers import - move tracer context vars to `tracers.context` and import them in functions in `callbacks.manager` - add core import tests	2023-11-21 15:15:48 -08:00
William FH	59df16ab92	Update name (#13676 )	2023-11-21 13:39:30 -08:00
Erick Friis	bfb980b968	CLI 0.0.19 (#13677 )	2023-11-21 12:34:38 -08:00
jakerachleff	249c796785	update langserve to v0.0.30 (#13673 ) Upgrade langserve template version to 0.0.30 to include new improvements	2023-11-21 11:17:47 -08:00
jakerachleff	c6937a2eb4	fix templates dockerfile (#13672 ) - Description: We need to update the Dockerfile for templates to also copy your README.md. This is because poetry requires that a readme exists if it is specified in the pyproject.toml	2023-11-21 11:09:55 -08:00
Bagatur	11614700a4	bump 0.0.339rc0 (#13664 )	2023-11-21 08:41:59 -08:00
Bagatur	d32e511826	REFACTOR: Refactor langchain_core (#13627 ) Changes: - remove langchain_core/schema since no clear distinction b/n schema and non-schema modules - make every module that doesn't end in -y plural - where easy have 1-2 classes per file - no more than one level of nesting in directories - only import from top level core modules in langchain	2023-11-21 08:35:29 -08:00
William FH	17c6551c18	Add error rate (#13568 ) To the in-memory outputs. Separate it out from the outputs so it's present in the dataframe.describe() results	2023-11-21 07:51:30 -08:00
Nuno Campos	8329f81072	Use pytest asyncio auto mode (#13643 )	2023-11-21 15:00:13 +00:00
Bagatur	99b4f46cbe	REFACTOR: Add core as dep (#13623 )	2023-11-20 14:38:10 -08:00
Harrison Chase	d82cbf5e76	Separate out langchain_core package (#13577 ) Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-20 13:09:30 -08:00
Bagatur	e620347a83	RELEASE: bump 339 (#13613 )	2023-11-20 09:56:43 -08:00
Ofer Mendelevitch	52e23e50b1	BUG: Fix search_kwargs in Vectara retriever (#13299 ) - Description: fix a bug that prevented as_retriever() in Vectara to use the desired input arguments - Issue: as_retriever did not pass the arguments properly - Tag maintainer: @baskaryan - Twitter handle: @ofermend	2023-11-20 09:44:43 -08:00
Holt Skinner	1c08dbfb33	IMPROVEMENT: Reduce post-processing time for `DocAIParser` (#13210 ) - Remove `WrappedDocument` introduced in https://github.com/langchain-ai/langchain/pull/11413 - https://github.com/googleapis/python-documentai-toolbox/issues/198 in Document AI Toolbox to improve initialization time for `WrappedDocument` object. @lkuligin @baskaryan @hwchase17	2023-11-20 09:41:44 -08:00
Leonid Kuligin	f3fcdea574	fixed an UnboundLocalError when no documents are found (#12995 ) - Description: fixed a bug - Issue: #12780	2023-11-20 09:41:14 -08:00
Stijn Tratsaert	b6f70d776b	VertexAI LLM count_tokens method requires list of prompts (#13451 ) I encountered this during summarization with VertexAI. I was receiving an INVALID_ARGUMENT error, as it was trying to send a list of about 17000 single characters. The [count_tokens method](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/language_models/_language_models.py#L658) made available by Google takes in a list of prompts. It does not fail for small texts, but it does for longer documents because the argument list exceeds Google's allowed limit. Enforcing the list type makes it work. This change wraps the input text in a single-element list so that the input format is always correct. [Twitter](https://www.x.com/stijn_tratsaert)	2023-11-20 09:40:48 -08:00
Wang Wei	fe7b40cb2a	feat: add ERNIE-Bot-4 Function Calling (#13320 ) - Description: ERNIE-Bot-Chat-4 Large Language Model adds the ability of `Function Calling` by passing parameters through the `functions` parameter in the request. To simplify function calling for ERNIE-Bot-Chat-4, the `create_ernie_fn_chain()` function has been added. The definition and usage of the `create_ernie_fn_chain()` function are similar to those of the `create_openai_fn_chain()` function. An example follows: ``` import json from langchain.chains.ernie_functions import ( create_ernie_fn_chain, ) from langchain.chat_models import ErnieBotChat from langchain.prompts import ChatPromptTemplate def get_current_news(location: str) -> str: """Get the current news based on the location. Args: location (str): The location to query. Returns: str: Current news based on the location. """ news_info = { "location": location, "news": [ "I have a Book.", "It's a nice day, today." ] } return json.dumps(news_info) def get_current_weather(location: str, unit: str="celsius") -> str: """Get the current weather in a given location Args: location (str): location of the weather. unit (str): unit of the temperature. Returns: str: weather in the given location. """ weather_info = { "location": location, "temperature": "27", "unit": unit, "forecast": ["sunny", "windy"], } return json.dumps(weather_info) llm = ErnieBotChat(model_name="ERNIE-Bot-4") prompt = ChatPromptTemplate.from_messages( [ ("human", "{query}"), ] ) chain = create_ernie_fn_chain([get_current_weather, get_current_news], llm, prompt, verbose=True) res = chain.run("北京今天的新闻是什么？") print(res) ``` The running results of the above program are shown below: ``` > Entering new LLMChain chain... Prompt after formatting: Human: 北京今天的新闻是什么？ > Finished chain. {'name': 'get_current_news', 'thoughts': '用户想要知道北京今天的新闻。我可以使用get_current_news工具来获取这些信息。', 'arguments': {'location': '北京'}} ```	2023-11-19 22:36:12 -08:00
Adilkhan Sarsen	10418ab0c1	DeepLake Backwards compatibility fix (#13388 ) - Description: during search with DeepLake some people are facing backwards compatibility issues, this PR fixes it by making search accessible for the older datasets --------- Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>	2023-11-19 21:46:01 -08:00
Tyler Hutcherson	190952fe76	IMPROVEMENT: Minor redis improvements (#13381 ) - Description: - Fixes a `key_prefix` bug where passing it in on `Redis.from_existing(...)` did not work properly. Updates doc strings accordingly. - Updates Redis filter classes logic with best practices on typing, string formatting, and handling "empty" filters. - Fixes a bug that would prevent multiple tag filters from being applied together in some scenarios. - Added a whole new filter unit testing module. Also updated code formatting for a number of modules that were failing the `make` commands. - Issue: N/A - Dependencies: N/A - Tag maintainer: @baskaryan - Twitter handle: @tchutch94	2023-11-19 19:15:45 -08:00
Sergey Kozlov	df03267edf	Fix tool arguments formatting in StructuredChatAgent (#10480 ) In the `FORMAT_INSTRUCTIONS` template, 4 curly braces (escaping) are used to get a single curly brace after formatting: ``` "{{{{ ... }}}}" -> format_instructions.format() -> "{{ ... }}" -> template.format() -> "{ ... }". ``` The tool's `args_schema` string contains single braces `{ ... }`, and is also transformed to the `{{{{ ... }}}}` form. But this is not correct, since there is only one `format()` call: ``` "{{{{ ... }}}}" -> template.format() -> "{{ ... }}". ``` As a result we get double curly braces in the prompt: ```` Respond to the human as helpfully and accurately as possible. You have access to the following tools: foo: Test tool FOO, args: {{'tool_input': {{'type': 'string'}}}} # <--- !!! ... Provide only ONE action per $JSON_BLOB, as shown: ``` { "action": $TOOL_NAME, "action_input": $INPUT } ``` ```` This PR fixes curly braces escaping in the `args_schema` to have single braces in the final prompt: ```` Respond to the human as helpfully and accurately as possible. You have access to the following tools: foo: Test tool FOO, args: {'tool_input': {'type': 'string'}} # <--- !!! ... Provide only ONE action per $JSON_BLOB, as shown: ``` { "action": $TOOL_NAME, "action_input": $INPUT } ``` ```` --------- Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>	2023-11-19 18:45:43 -08:00
Wouter Durnez	ef7802b325	Add llama2-13b-chat-v1 support to `chat_models.BedrockChat` (#13403 ) Hi 👋 We are working with Llama2 on Bedrock, and would like to add it to Langchain. We saw a [pull request](https://github.com/langchain-ai/langchain/pull/13322) to add it to the `llm.Bedrock` class, but since it concerns a chat model, we would like to add it to `BedrockChat` as well. - Description: Add support for Llama2 to `BedrockChat` in `chat_models` - Issue: [#13316](https://github.com/langchain-ai/langchain/issues/13316) - Dependencies: `None` - Tag maintainer: / - Twitter handle: `@SimonBockaert @WouterDurnez` --------- Co-authored-by: wouter.durnez <wouter.durnez@showpad.com> Co-authored-by: Simon Bockaert <simon.bockaert@showpad.com>	2023-11-19 18:44:58 -08:00
jwbeck97	a93616e972	FEAT: Add azure cognitive health tool (#13448 ) - Description: This change adds an agent to the Azure Cognitive Services toolkit for identifying healthcare entities - Dependencies: azure-ai-textanalytics (Optional) --------- Co-authored-by: James Beck <James.Beck@sa.gov.au> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 18:44:01 -08:00
Massimiliano Pronesti	6bf9b2cb51	BUG: Limit Azure OpenAI embeddings chunk size (#13425 ) Hi! This short PR aims at: * Fixing `OpenAIEmbeddings`' check on `chunk_size` when used with Azure OpenAI (thus with openai < 1.0). Azure OpenAI embeddings support at most 16 chunks per batch, I believe we are supposed to take the min between the passed value/default value and 16, not the max - which, I suppose, was introduced by accident while refactoring the previous version of this check from this other PR of mine: #10707 * Porting this fix to the newest class (`AzureOpenAIEmbeddings`) for openai >= 1.0 This fixes #13539 (closed but the issue persists). @baskaryan @hwchase17	2023-11-19 18:34:51 -08:00
Zeyang Lin	e53f59f01a	DOCS: doc-string - langchain.vectorstores.dashvector.DashVector (#13502 ) - Description: There are several mistakes in the sample code in the doc-string of `DashVector` class, and this pull request aims to correct them. The correction code has been tested against latest version (at the time of creation of this pull request) of: `langchain==0.0.336` `dashvector==1.0.6` . - Issue: No issue is created for this. - Dependencies: No dependency is required for this change. - Twitter handle: `zeyanglin`	2023-11-19 18:24:05 -08:00
John Mai	16f7912e1b	BUG: fix hunyuan appid type (#13496 ) - Description: fix hunyuan appid type - Issue: https://github.com/langchain-ai/langchain/pull/12022#issuecomment-1815627855	2023-11-19 18:23:45 -08:00
Nicolò Boschi	8362bd729b	AstraDB: use includeSimilarity option instead of $similarity (#13512 ) - Description: AstraDB is going to deprecate the `$similarity` projection property in favor of the `includeSimilarity` option flag. I moved all the queries to the new format. - Tag maintainer: @hemidactylus - Twitter handle: nicoloboschi	2023-11-19 17:54:35 -08:00
shumpei	7100d586ef	Introduce search_kwargs for Custom Parameters in BingSearchAPIWrapper (#13525 ) Added a `search_kwargs` field to BingSearchAPIWrapper in `bing_search.py`, enabling users to include extra keyword arguments in Bing search queries. This allows further customization of searches, such as specifying language preferences. The `search_kwargs` are merged with the standard parameters in the `_bing_search_results` method. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-19 17:51:02 -08:00
Nicolò Boschi	ad0c3b9479	Fix Astra integration tests (#13520 ) - Description: Fix Astra integration tests that are failing. The `delete` always returns True, as the deletion is successful if no errors are thrown. I aligned the test to verify this behaviour - Tag maintainer: @hemidactylus - Twitter handle: nicoloboschi --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:50:49 -08:00
umair mehmood	69d39e2173	fix: VLLMOpenAI -- create() got an unexpected keyword argument 'api_key' (#13517 ) The issue occurred because of the `openai` update to Completions, which no longer accepts the `api_key` and `api_base` args. The fix checks the openai version and, if it is v1, removes these keys from the args before passing them to `Completion.create(...)` when sending from `VLLMOpenAI`. Fixed: #13507 @eyu @efriis @hwchase17 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-19 17:49:55 -08:00
Manuel Alemán Cueto	6bc08266e0	Fix for oracle schema parsing stated on the issue #7928 (#13545 ) - Description: In this pull request, we address an issue related to assigning a schema to the SQLDatabase class when utilizing an Oracle database. The current implementation encounters a bug where, upon attempting to execute a query, the alter session parse is not appropriately defined for Oracle, leading to an error. - Issue: #7928 - Dependencies: No dependencies - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:35:27 -08:00
Andrew Teeter	325bdac673	feat: load all namespaces (#13549 ) - Description: This change allows for the `MWDumpLoader` to load all namespaces including custom by default instead of only loading the [default namespaces](https://www.mediawiki.org/wiki/Help:Namespaces#Localisation). - Tag maintainer: @hwchase17	2023-11-19 17:35:17 -08:00
Taranjeet Singh	47451764a7	Add embedchain retriever (#13553 ) Description: This commit adds embedchain retriever along with tests and docs. Embedchain is a RAG framework to create data pipelines. Twitter handle: - [Taranjeet's twitter](https://twitter.com/taranjeetio) and [Embedchain's twitter](https://twitter.com/embedchain) Reviewer @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:35:03 -08:00
rafly lesmana	420a17542d	fix: Make YoutubeLoader support on demand language translation (#13583 ) Description: Enhance the functionality of YoutubeLoader to enable the translation of available transcripts by refining the existing logic. Issue: Encountering a problem with YoutubeLoader (#13523) where the translation feature is not functioning as expected. Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-19 17:34:48 -08:00
Bagatur	78a1f4b264	bump 338, exp 42 (#13564 )	2023-11-18 15:12:07 -08:00
Harrison Chase	f4c0e3cc15	move streaming stdout (#13559 )	2023-11-18 12:24:49 -05:00
Leonid Ganeline	43dad6cb91	BUG fixed `openai_assistant` namespace (#13543 ) BUG: langchain.agents.openai_assistant has a reference as `from langchain_experimental.openai_assistant.base import OpenAIAssistantRunnable` should be `from langchain.agents.openai_assistant.base import OpenAIAssistantRunnable` This prevents building of the API Reference docs	2023-11-17 17:15:33 -08:00
Bassem Yacoube	ff382b7b1b	IMPROVEMENT Adds support for new OctoAI endpoints (#13521 ) small fix to add support for new OctoAI LLM endpoints	2023-11-17 17:15:21 -08:00
William FH	cac849ae86	Use random seed (#13544 ) For default eval llm	2023-11-17 16:33:31 -08:00
Martin Krasser	79ed66f870	EXPERIMENTAL Generic LLM wrapper to support chat model interface with configurable chat prompt format (#8295 ) ## Update 2023-09-08 This PR now supports further models in addition to Llama-2 chat models. See [this comment](#issuecomment-1668988543) for further details. The title of this PR has been updated accordingly. ## Original PR description This PR adds a generic `Llama2Chat` model, a wrapper for LLMs able to serve Llama-2 chat models (like `LlamaCPP`, `HuggingFaceTextGenInference`, ...). It implements `BaseChatModel`, converts a list of chat messages into the [required Llama-2 chat prompt format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) and forwards the formatted prompt as `str` to the wrapped `LLM`. Usage example: ```python # uses a locally hosted Llama2 chat model llm = HuggingFaceTextGenInference( inference_server_url="http://127.0.0.1:8080/", max_new_tokens=512, top_k=50, temperature=0.1, repetition_penalty=1.03, ) # Wrap llm to support Llama2 chat prompt format. # Resulting model is a chat model model = Llama2Chat(llm=llm) messages = [ SystemMessage(content="You are a helpful assistant."), MessagesPlaceholder(variable_name="chat_history"), HumanMessagePromptTemplate.from_template("{text}"), ] prompt = ChatPromptTemplate.from_messages(messages) memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True) chain = LLMChain(llm=model, prompt=prompt, memory=memory) # use chat model in a conversation # ... ``` Also part of this PR are tests and a demo notebook. - Tag maintainer: @hwchase17 - Twitter handle: `@mrt1nz` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-17 16:32:13 -08:00
William FH	c56faa6ef1	Add execution time (#13542 ) And warn instead of raising an error, since the chain API is too inconsistent.	2023-11-17 16:04:16 -08:00
pedro-inf-custodio	0fb5f857f9	IMPROVEMENT WebResearchRetriever error handling in urls with connection error (#13401 ) - Description: Added a method `fetch_valid_documents` to `WebResearchRetriever` class that will test the connection for every url in `new_urls` and remove those that raise a `ConnectionError`. - Issue: [Previous PR](https://github.com/langchain-ai/langchain/pull/13353) - Dependencies: None - Tag maintainer: @efriis	2023-11-17 14:02:26 -08:00
Piyush Jain	d2335d0114	IMPROVEMENT Neptune graph updates (#13491 ) ## Description This PR adds an option to allow unsigned requests to the Neptune database when using the `NeptuneGraph` class. ```python graph = NeptuneGraph( host='<my-cluster>', port=8182, sign=False ) ``` Also, added is an option in the `NeptuneOpenCypherQAChain` to provide additional domain instructions to the graph query generation prompt. This will be injected in the prompt as-is, so you should include any provider specific tags, for example `<instructions>` or `<INSTR>`. ```python chain = NeptuneOpenCypherQAChain.from_llm( llm=llm, graph=graph, extra_instructions=""" Follow these instructions to build the query: 1. Countries contain airports, not the other way around 2. Use the airport code for identifying airports """ ) ```	2023-11-17 13:49:31 -08:00
William FH	5a28dc3210	Override Keys Option (#13537 ) Should be able to override the global key if you want to evaluate different outputs in a single run	2023-11-17 13:32:43 -08:00
Bagatur	e584b28c54	bump 337 (#13534 )	2023-11-17 12:50:52 -08:00
Bagatur	2e2114d2d0	FEATURE: Runnable with message history (#13418 ) Add RunnableWithMessageHistory class that can wrap certain runnables and manages chat history for them.	2023-11-17 12:00:01 -08:00
Bagatur	0fc3af8932	IMPROVEMENT: update assistants output and doc (#13480 )	2023-11-17 11:58:54 -08:00
Hugues Chocart	35e04f204b	[LLMonitorCallbackHandler] Various improvements (#13151 ) Small improvements for the llmonitor callback handler, like better support for non-openai models. --------- Co-authored-by: vincelwt <vince@lyser.io>	2023-11-16 23:39:36 -08:00
Noah Stapp	c1b041c188	Add Wrapping Library Metadata to MongoDB vector store (#13084 ) Description MongoDB drivers are used in various flavors and languages. Making sure we exercise our due diligence in identifying the "origin" of the library calls makes it best to understand how our Atlas servers get accessed.	2023-11-16 22:20:04 -08:00
Guy Korland	7f8fd70ac4	Add optional arguments to FalkorDBGraph constructor (#13459 ) Description: Add optional arguments to FalkorDBGraph constructor Tag maintainer: baskaryan Twitter handle: @g_korland	2023-11-16 18:15:40 -08:00
chris stucchio	d7f014cd89	Bug: OpenAIFunctionsAgentOutputParser doesn't handle functions with no args (#13467 ) Description/Issue: When OpenAI calls a function with no args, the args are `""` rather than `"{}"`. Then `json.loads("")` blows up. This PR handles it correctly. Dependencies: None	2023-11-16 16:47:05 -08:00
Yujie Qian	41a433fa33	IMPROVEMENT: add input_type to VoyageEmbeddings (#13488 ) - Description: add input_type to VoyageEmbeddings	2023-11-16 16:35:36 -08:00
David Duong	ea6e017b85	Add serialisation arguments to Bedrock and ChatBedrock (#13465 )	2023-11-17 01:33:24 +01:00
Erick Friis	427331d621	IMPROVEMENT Lock pydantic v1 in app template, cli 0.0.18 (#13485 )	2023-11-16 15:22:11 -08:00
Erick Friis	75363f048f	BUG Fix app_name in cli app new (#13482 )	2023-11-16 14:19:35 -08:00
ifduyue	324ab382ad	Use List instead of list (#13443 ) Unify List usages in libs/langchain/langchain/text_splitter.py, only one place it's `list`, all other occurrences are `List`	2023-11-16 13:15:58 -08:00
Stefano Lottini	b029d9f4e6	Astra DB: minor improvements to docstrings and demo notebook (#13449 ) This PR brings a few minor improvements to the docs, namely class/method docstrings and the demo notebook. - A note on how to control concurrency levels to tune performance in bulk inserts, both in the class docstring and the demo notebook; - Slightly increased concurrency defaults after careful experimentation (still on the conservative side even for clients running on less-than-typical network/hardware specs) - renamed the DB token variable to the standardized `ASTRA_DB_APPLICATION_TOKEN` name (used elsewhere, e.g. in the Astra DB docs) - added a note and a reference (add_text docstring, demo notebook) on allowed metadata field names. Thank you!	2023-11-16 12:48:32 -08:00
Eugene Yurtsev	1e43fd6afe	Add ahandle_event to _all_ (#13469 ) Add ahandle_event for backwards compatibility as it is used by langserve	2023-11-16 12:46:20 -08:00
Harrison Chase	f90249305a	callback refactor (#13372 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-16 08:25:09 -08:00
Bagatur	a9b2c943e6	bump 336, exp 44 (#13420 )	2023-11-15 14:08:34 -08:00
Bagatur	1372296dc8	FIX: Infer runnable agent single or multi action (#13412 )	2023-11-15 13:58:14 -08:00
Eugene Yurtsev	accadccf8e	Use secretstr for api keys for javelin-ai-gateway (#13417 ) - Make javelin_ai_gateway_api_key a SecretStr --------- Co-authored-by: Hiroshi Tashiro <hiroshitash@gmail.com>	2023-11-15 16:12:05 -05:00
William FH	ba501b27a0	Fix Runnable Lambda Afunc Repr (#13413 ) Otherwise, you get an error when using async functions. h/t to Chris Ruppelt	2023-11-15 16:11:42 -05:00
Sumukh Sridhara	1726d5dcdd	Merge pull request #13232 * PGVector needs to close its connection if it's garbage collected	2023-11-15 15:34:37 -05:00
Nuno Campos	85a77d2c27	IMPROVEMENT Passthrough kwargs in runnable lambda (#13405 )	2023-11-15 11:45:16 -08:00
Bagatur	76c317ed78	DOCS: update rag use case (#13319 )	2023-11-15 10:54:15 -08:00
Clay Elmore	8823e3831f	FEAT Bedrock cohere embedding support (#13366 ) - Description: adding cohere embedding support to bedrock embedding class - Issue: N/A - Dependencies: None - Tag maintainer: @3coins - Twitter handle: celmore25 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-15 10:19:12 -08:00
Nuno Campos	d5aeff706a	Make it easier to subclass RunnableEach (#13346 )	2023-11-15 13:12:57 +00:00
竹内謙太	3b5e8bacfa	FEAT Add some properties to NotionDBLoader (#13358 ) fix #13356 Adds support for the following metadata properties to NotionDBLoader. - `checkbox` - `email` - `number` - `select` There are no relevant tests for this code to be updated.	2023-11-14 20:31:12 -08:00
Fielding Johnston	37eb44c591	BUG Add limit_to_domains to APIChain based tools (#13367 ) - Description: Adds `limit_to_domains` param to the APIChain based tools (open_meteo, TMDB, podcast_docs, and news_api) - Issue: I didn't open an issue, but after upgrading to 0.0.328 using these tools would throw an error. - Dependencies: N/A - Tag maintainer: @baskaryan Note: I included the trailing / simply because the docs here did `fc886cc303/docs/docs/use_cases/apis.ipynb (L246)` , but I checked the code and it is using `urlparse`. So I followed the docs since it comes down to style.	2023-11-14 19:07:16 -08:00
Bagatur	38180ad25f	bump openai support (#13262 )	2023-11-14 16:50:23 -08:00
Erick Friis	7c3066f9ec	more cli interactivity, bugfix (#13360 )	2023-11-14 14:49:43 -08:00
Predrag Gruevski	d63d4994c0	Bump all libraries to the latest `ruff` version. (#13350 ) This version of `ruff` is the one we'll be using to lint the docs and cookbooks (#12677), so I'm making it used everywhere else too.	2023-11-14 16:00:21 -05:00
Massimiliano Pronesti	344cab0739	IMPROVEMENT: support Openai API v1 for Azure OpenAI completions (#13231 ) Hi, this PR adds support for OpenAI API v1 for Azure OpenAI completion API. @baskaryan @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-14 12:10:18 -08:00
dependabot[bot]	fc886cc303	Bump pyarrow from 13.0.0 to 14.0.1 in /libs/langchain (#13363 ) Bumps [pyarrow](https://github.com/apache/arrow) from 13.0.0 to 14.0.1. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-11-14 14:23:52 -05:00
Erick Friis	c0e6045c0b	cli 0.0.17 (#13359 )	2023-11-14 09:56:18 -08:00
Erick Friis	927824b7cb	CLI interactivity (#13148 ) Will implement more later	2023-11-14 09:53:29 -08:00
billytrend-cohere	2f6fe6ddf3	Fix latest message index (#13355 ) There is a bug which caused the earliest message rather than the latest message to be sent	2023-11-14 09:23:25 -08:00
Harrison Chase	be854225c7	add more reasonable arxiv retriever (#13327 )	2023-11-13 20:54:14 -08:00
Krish Dholakia	5a920e14c0	fix litellm openai imports (#13307 )	2023-11-13 17:55:10 -08:00
Bagatur	1c67db4c18	Move OAI assistants to langchain and add callbacks (#13236 )	2023-11-13 17:42:07 -08:00
Erick Friis	280ecfd8eb	IMPROVEMENT redirect root to docs in langserve app template (#13303 )	2023-11-13 15:51:41 -08:00
mertkayhan	9b4974871d	IMPROVEMENT Increase flexibility of ElasticVectorSearch (#6863 ) Hey @rlancemartin, @eyurtsev , I did some minimal changes to the `ElasticVectorSearch` client so that it plays better with existing ES indices. Main changes are as follows: 1. You can pass the dense vector field name into `_default_script_query` 2. You can pass a custom script query implementation and the respective parameters to `similarity_search_with_score` 3. You can pass functions for building page content and metadata for the resulting `Document`	2023-11-13 14:36:03 -08:00
Erick Friis	50a5c919f0	IMPROVEMENT self-query template (#13305 ) - [ ] https://github.com/langchain-ai/langchain/pull/12694#discussion_r1391334719 -> keep date - [x] https://github.com/langchain-ai/langchain/pull/12694#discussion_r1391336586	2023-11-13 14:03:15 -08:00
Yasin	b46f88d364	IMPROVEMENT add license file to subproject (#8403 ) hi! This is pretty straight-forward: The sdist package does not contain the license file (which is needed by e.g. conda) because the package is built from the subdir and can't see the license. I _copied_ the license but since I'm unfamiliar with the project's direction, I'm not sure that's correct. thanks! --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 11:48:21 -08:00
Rui Ramos	ff19a62afc	Fix Pinecone cosine relevance score (#8920 ) Fixes: #8207 Description: Pinecone returns scores (not distances) with cosine similarity. The values according to the docs are [-1, 1], although I could never reproduce negative values. This PR ensures that the score returned from Pinecone is preserved, rather than inverted, so the most relevant documents can be filtered (eg when using similarity thresholds) I'll leave this as a draft PR as I couldn't run the tests (my pinecone account might not be enough - some errors were being thrown around namespaces) so hopefully someone who _can_ will pick this up. Maintainers: @rlancemartin, @eyurtsev --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 11:47:38 -08:00
Bagatur	2e42ed5de6	Self-query template (#12694 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 11:44:19 -08:00
Konstantin Spieß	1e43025bf5	Fix serialization issue in Matching Engine Vector Store (#13266 ) - Description: Fixed a serialization issue in the add_texts method of the Matching Engine Vector Store caused by a typo, leading to an attempt to serialize the json module itself. - Issue: #12154 - Dependencies: ./. - Tag maintainer:	2023-11-13 11:04:11 -08:00
William FH	9169d77cf6	Update error message in evaluation runner (#13296 )	2023-11-13 11:03:20 -08:00
takatost	f22f273f93	FIX: 'from_texts' method in Weaviate with non-existent kwargs param (#11604 ) Due to the possibility of external inputs including UUIDs, there may be additional values in kwargs, while Weaviate's `__init__` method does not support passing extra kwarg parameters. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 10:32:20 -08:00
Frank995	971d2b2e34	Add missing filter to max_marginal_relevance_search inner call to max_marginal_relevance_search_by_vector (#13260 ) When calling max_marginal_relevance_search from PGVector the filter param is not carried over to max_marginal_relevance_search_by_vector --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-13 10:31:34 -08:00
chevalmuscle	3ad78e48e2	Use endpoint_url if provided with boto3 session for dynamodb (#11622 ) - Description: Uses `endpoint_url` if provided with a boto3 session. When running dynamodb locally, credentials are required even if invalid. With this change, it will be possible to pass a boto3 session with credentials and specify an endpoint_url --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-13 10:31:16 -08:00
Erick Friis	18acc22f29	Ollama pass kwargs as options instead of top (#13280 ) Noticed params are really in `options` instead while reviewing #12895	2023-11-13 10:28:47 -08:00
刘方瑞	46af56dc4f	Add MyScaleWithoutJSON which allows user to wrap columns into Document's Metadata (#13164 ) - Description: Add MyScaleWithoutJSON which allows user to wrap columns into Document's Metadata - Tag maintainer: @baskaryan	2023-11-13 10:10:36 -08:00
Michael Landis	2aa13f1e10	chore: bump momento dependency version and refactor search hit usage (#13111 ) Description Bumps the Momento dependency to the latest version and refactors the usage of `SearchHit` in the Momento Vector Index (MVI) vector store integration. This change is a one liner where we use the preferred attribute `score` to read the query-document similarity instead of `distance`. The latest versions of Momento clients will use this attribute going forward. Dependencies Updated the Momento dependency to latest version. Tests 💚 I re-ran the existing MVI integration tests (`tests/integration_tests/vectorstores/test_momento_vector_index.py`) and they pass. Review cc @baskaryan @eyurtsev	2023-11-13 09:12:21 -08:00
kYLe	cc55d2fcee	Add OpenAI API v1 support for ChatAnyscale and fixed a bug with openai_api_key (#13237 ) 1. Add OpenAI API v1 support 2. Fixed a bug to call `get_secret_value` on a str value (values["openai_api_key"])	2023-11-13 09:01:54 -08:00
Govind.S.B	9024593468	added system prompt and template fields to ollama (#13022 ) Description: the Ollama API now supports passing a system prompt and template directly instead of modifying the model file, but the Ollama integration in LangChain had not been updated to match. This update adds these two parameters (two more parameters are still pending; I was not sure about their utility with respect to LangChain). Refer: `8713ac23a8` Issue: None applicable. Dependencies: None changed. Twitter handle: https://twitter.com/violetto96 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-13 08:45:11 -08:00
langchain-infra	f55f67055f	Add dockerfile template (#13240 )	2023-11-13 10:33:01 -05:00
Guillem Orellana Trullols	0f31cd8b49	Remove `_get_kwarg_value` function (#13184 ) The `_get_kwarg_value` function is unnecessary; Python's built-in functionality does the same thing. - Description: Removed `_get_kwarg_value`. Helps with code readability. - Twitter handle: @Guillem_96	2023-11-13 00:09:54 -08:00
SuperDa Fu	e1c020dfe1	dalle add model parameter (#13201 ) - Description: dalle_image_generator adding a new model parameter, - Issue: N/A, - Dependencies: - Tag maintainer: @hwchase17 - Twitter handle:** --------- Co-authored-by: dafu <xiangbingze@wenru.wang> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Erick Friis <erickfriis@gmail.com>	2023-11-13 00:09:20 -08:00
Dennis de Greef	64e11592bb	Improve CSV reader which can't call .strip() on NoneType (#13079 ) Improve CSV reader which can't call .strip() on NoneType if there are fewer cells in the row compared to the header - Description: I have a CSV file as follows ``` headerA,headerB,headerC v1A,v1B,v1C, v2A,v2B v3A,v3B,v3C ``` In this case, row 2 is missing a value, which results in reading a None type. The strip() method cannot be called on None, hence the error. In this PR I am making the change to only call strip if the value is not None.	2023-11-12 23:51:39 -08:00
glad4enkonm	339973db47	Update ollama.py (#12895 ) duplicate option removed Description: An issue fix, http stop option duplicate removed. Issue: the issue #12892 fix Dependencies: no Tag maintainer: @eyurtsev --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-12 23:43:59 -08:00
Isak Nyberg	8f81703d76	Add new models to openai callback (#13244 ) Description: Adding the new models to the openai callback function, info taken from [model announcement](https://platform.openai.com/docs/models) and [pricing](https://openai.com/pricing) A short description for a short PR :)	2023-11-12 12:01:19 -08:00
Bagatur	ea6dd3a550	bump 335 (#13261 )	2023-11-12 11:30:25 -08:00
William FH	a837b03e55	Update langsmith version 0.63 (#13208 )	2023-11-12 11:29:25 -08:00
Harrison Chase	7f1d26160d	update tools (#13243 )	2023-11-12 10:22:54 -08:00
Nuno Campos	8d6faf5665	Make it easier to subclass runnable binding with custom init args (#13189 )	2023-11-11 09:01:17 +00:00
Peter Vandenabeele	7f1964b264	Fix BeautifulSoupTransformer: no more duplicates and correct order of tags + tests (#12596 )	2023-11-11 08:56:37 +00:00
Erick Friis	9c7afa8adb	Upgrade cohere embedding model to v3 (#13219 ) Just updates API docs, doesn't change default param from 2.0 (could be breaking change)	2023-11-10 16:25:58 -08:00
Erick Friis	8fdf15c023	Fix Document Loader Unit Test - Docusaurus (#13228 )	2023-11-10 14:52:01 -08:00
Lee	72ad448daa	feat: Docusaurus Loader (#9138 ) Added a Docusaurus Loader Issue: #6353 I had to implement this for working with the Ionic documentation, and wanted to open this up as a draft to get some guidance on building this out further. I wasn't sure if having it be a light extension of the SitemapLoader was in the spirit of a proper feature for the library -- but I'm grateful for the opportunities Langchain has given me and I'd love to build this out properly for the sake of the community. Any feedback welcome!	2023-11-10 14:21:55 -08:00
Tomaz Bratanic	0dc4ab0be1	Neo4j chat message history (#13008 )	2023-11-10 11:53:34 -08:00
fyasla	d266b3ea4a	issue #12165 mask API key in chat_models/azureml_endpoint module (#12836 ) - Description: `AzureMLChatOnlineEndpoint` object from langchain/chat_models/azureml_endpoint.py safe to print without having any secrets included in raw format in the string representation. - Issue: #12165, - Tag maintainer: @eyurtsev --------- Co-authored-by: Faysal Bougamale <faysal.bougamale@horiba.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-10 14:05:19 -05:00
Anush	52f34de9b7	feat: FastEmbed embedding provider (#13109 ) ## Description: This PR intends to add [Qdrant/FastEmbed](https://qdrant.github.io/fastembed/) as a local embeddings provider, associated tests and documentation. Documentation preview: https://langchain-git-fork-anush008-master-langchain.vercel.app/docs/integrations/text_embedding/fastembed --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-10 13:51:52 -05:00
Eugene Yurtsev	b0e8cbe0b3	Add RunnableSequence documentation (#13094 ) Add RunnableSequence documentation	2023-11-10 13:44:43 -05:00
Eugene Yurtsev	869df62736	Document RunnableWithFallbacks (#13088 ) Add documentation to RunnableWithFallbacks	2023-11-10 13:16:21 -05:00
Eugene Yurtsev	8313c218da	Add more runnable documentation (#13083 ) - Adding documentation to the runnable. - Documentation is not organized in the best way for the runnable; i.e., in terms of LCEL vs. other standard methods, will follow up with more edits.	2023-11-10 13:14:57 -05:00
Bagatur	24386e0860	bump 334, exp 40 (#13211 )	2023-11-10 09:43:29 -08:00
Lance Martin	d2e50b3108	Add Chroma multimodal cookbook (#12952 ) Pending: * https://github.com/chroma-core/chroma/pull/1294 * https://github.com/chroma-core/chroma/pull/1293 --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-10 09:43:10 -08:00
The1Bill	55912868da	Update toolkit.py to remove single quotes around table names (#12445 ) Description: Removing the single quote wrapper around the table names in the SQL agent toolkit.py file as it misleads the LLM into querying against tables with single quotes around their names. Issue: #7457 Dependencies: None Tag maintainer: @hwchase17 Twitter handle: None	2023-11-10 06:39:15 -08:00
Nuno Campos	362a446999	Changes to root listener (#12174 ) - Implement config_specs to include session_id - Remove Runnable method and update notebook - Add more details to notebook, eg. show input schema and config schema before and after adding message history --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-10 09:53:48 +00:00
Nuno Campos	b2b94424db	Update return type for Runnable.__or__ (#12880 )	2023-11-10 09:52:38 +00:00
Harrison Chase	0a2b1c7471	improve duck duck go tool (#13165 )	2023-11-09 20:49:39 -08:00
Shinya Maeda	28cc60b347	Fix langchain.llms OpenAI completion doesn't work due to v1 client update (#13099 ) This commit fixes the issue that langchain.llms OpenAI completion stopped working since the V1 openai client update. - Description: This PR fixes the issue [AttributeError: module 'openai' has no attribute 'Completion'](https://github.com/langchain-ai/langchain/issues/12967) similar to `8e0cb2eb84` and https://github.com/langchain-ai/langchain/pull/12969, - Issue: https://github.com/langchain-ai/langchain/issues/12967, - Dependencies: `openai` v1.x.x client, - Tag maintainer: @baskaryan, - Twitter handle: @dosuken123 --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-09 15:12:19 -08:00
Bagatur	ff43cd6701	OpenAI remove httpx typing (#13154 ) Addresses #13124	2023-11-09 14:32:09 -08:00
Bagatur	8b2a82b5ce	Bagatur/docs smith context (#13139 )	2023-11-09 10:22:49 -08:00
Bagatur	f04cc4b7e1	bump 333 (#13131 )	2023-11-09 07:33:15 -08:00
billytrend-cohere	b346d4a455	Add message to documents (#12552 ) This adds the response message as a document to the rag retriever so users can choose to use this. Also drops document limit. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-09 07:30:48 -08:00
Harrison Chase	5f38770161	Support oai tool call (#13110 ) Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-09 07:29:29 -08:00
Holt Skinner	0fc8fd12bd	feat: Vertex AI Search - Add Snippet Retrieval for Non-Advanced Website Data Stores (#13020 ) https://cloud.google.com/generative-ai-app-builder/docs/snippets#snippets --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-08 21:52:50 -05:00
Jacob Lee	76283e9625	Adds embeddings filter option to return scores in state (#12489 ) CC @baskaryan @assafelovic	2023-11-08 17:50:06 -08:00
jakerachleff	18601bd4c8	Get project from langchain sdk (#13100 ) ## Description We need to centralize the API we use to get the project name for our tracers. This PR makes it so we always get this from a shared function in the langsmith sdk. ## Dependencies Upgraded langsmith from 0.52 to 0.62 to include the new API `get_tracer_project`	2023-11-08 17:10:12 -08:00
Bagatur	72e12f6bcf	update more azure docs (#13093 )	2023-11-08 14:11:16 -08:00
Bagatur	1703f132c6	update azure embedding docs (#13091 )	2023-11-08 13:39:31 -08:00
Bagatur	9fdfac22c2	bump 332 (#13089 )	2023-11-08 13:23:16 -08:00
Bagatur	1f85ec34d5	bump 331rc3 exp 39 (#13086 )	2023-11-08 13:00:13 -08:00
Anton Troynikov	9f077270c8	Don't pass EF to chroma (#13085 ) - Description: Recently Chroma rolled out a breaking change on the way we handle embedding functions, in order to support multi-modal collections. This broke the way LangChain's `Chroma` objects get created, because we were passing the EF down into the Chroma collection: https://docs.trychroma.com/migration#migration-to-0416---november-7-2023 However, internally, we are never actually using embeddings on the chroma collection - LangChain's `Chroma` object calls it instead. Thus we just don't pass an `embedding_function` to Chroma itself, which fixes the issue.	2023-11-08 12:55:35 -08:00
Erick Friis	f15f8e01cf	Azure OpenAI Embeddings (#13039 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-08 12:37:17 -08:00
David Peterson	37561d8986	Add Proper Import Error (#13042 ) - Description: The issue was not listing the proper import error for amazon textract loader. - Issue: Time wasted trying to figure out what to install... (langchain docs don't list the dependency either) - Dependencies: N/A - Tag maintainer: @sbusso - Twitter handle: @h9ste --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-11-08 10:29:08 -08:00
Eugene Yurtsev	06c503f672	Add RunnableRetry Documentation (#13074 )	2023-11-08 18:20:18 +00:00
Bagatur	55aeff6777	oai assistant multiple actions (#13068 )	2023-11-08 08:25:37 -08:00
Erick Friis	a9b70baef9	cli updates, 0.0.16 (#13034 ) - confirm flags, serve detection - 0.0.16 - always gen code - pip bool	2023-11-08 07:47:30 -08:00
Erick Friis	506f81563f	Update Deps in Experimental (#13029 )	2023-11-07 15:15:09 -08:00
Stefano Lottini	4f4b020582	Add "Astra DB" vector store integration (#12966 ) # Astra DB Vector store integration - Description: This PR adds a `VectorStore` implementation for DataStax Astra DB using its HTTP API - Issue: (no related issue) - Dependencies: A new required dependency is `astrapy` (`>=0.5.3`) which was added to pyproject.toml, optional, as per guidelines - Tag maintainer: I recently mentioned to @baskaryan this integration was coming - Twitter handle: `@rsprrs` if you want to mention me This PR introduces the `AstraDB` vector store class, extensive integration test coverage, a reworking of the documentation which conflates Cassandra and Astra DB on a single "provider" page and a new, completely reworked vector-store example notebook (common to the Cassandra store, since parts of the flow are shared by the two APIs). I also took care in ensuring docs (and redirects therein) are behaving correctly. All style, linting, typechecks and tests pass as far as the `AstraDB` integration is concerned. I could build the documentation and check it all right (but ran into trouble with the `api_docs_build` makefile target which I could not verify: `Error: Unable to import module 'plan_and_execute.agent_executor' with error: No module named 'langchain_experimental'` was the first of many similar errors) Thank you for a review! Stefano --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-07 14:45:33 -08:00
Yang, Bo	600caff03c	Add `Memorize` tool (#11722 ) - Description: Add `Memorize` tool - Tag maintainer: @hwchase17 This PR added a new tool `Memorize` so that an agent can use it to fine-tune itself. This tool requires `TrainableLLM` introduced in #11721 DEMO: `6a9003d5db` ![image](https://github.com/langchain-ai/langchain/assets/601530/d6f0cb45-54df-4dcf-b143-f8aefb1e76e3)	2023-11-07 12:42:10 -08:00
Bagatur	cf481c9418	bump exp 38 (#13016 )	2023-11-07 11:49:23 -08:00
Bagatur	57e19989f6	Bagatur/oai assistant (#13010 )	2023-11-07 11:44:53 -08:00
Erick Friis	74134dd7e1	cli pyproject updating (#12945 ) `langchain app add` and `langchain app remove` will now keep the dependencies list updated. --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-11-07 11:06:08 -08:00
Bagatur	6175dc30aa	bump 331rc2 (#13006 )	2023-11-07 08:52:17 -08:00
Erick Friis	0c81cd923e	oai v1 embeddings (#12969 ) Initial PR to get OpenAIEmbeddings working with the new sdk fyi @rlancemartin Fixes #12943 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-06 18:52:33 -08:00
Bagatur	fdbb45d79e	bump 331rc1 (#12965 )	2023-11-06 15:36:43 -08:00
Bagatur	3bb8030a6e	fix max_tokens (#12964 )	2023-11-06 15:36:05 -08:00
Bagatur	a9002a82b8	bump 331rc0 (#12963 )	2023-11-06 15:19:33 -08:00
Harrison Chase	c27400efeb	Support multimodal messages (#11320 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-06 15:14:18 -08:00
Bagatur	4f7dff9d66	Record system fingerprint chat openai (#12960 )	2023-11-06 14:25:53 -08:00
Bagatur	8e0cb2eb84	ChatOpenAI and AzureChatOpenAI openai>=1 compatible (#12948 )	2023-11-06 13:24:18 -08:00
Kacper Łukawski	52d0055a91	Add support of Cohere Embed v3 (#12940 ) Cohere released the new embedding API (Embed v3: https://txt.cohere.com/introducing-embed-v3/) that treats document and query embeddings differently. This PR updated the `CohereEmbeddings` to use them appropriately. It also works with the old models.	2023-11-06 15:06:58 -05:00
Praveen Venkateswaran	8e0dcb37d2	Add SecretStr for Symbl.ai Nebula API (#12896 ) Description: This PR masks API key secrets for the Nebula model from Symbl.ai Issue: #12165 Maintainer: @eyurtsev --------- Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>	2023-11-06 14:13:59 -05:00
Vinzenz Klass	59d0bd2150	feat: acquire advisory lock before creating extension in pgvector (#12935 ) - Description: Acquire advisory lock before attempting to create extension on postgres server, preventing errors in concurrent executions. - Issue: #12933 - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-06 14:00:39 -05:00
Eugene Yurtsev	b376854b26	Fix for anyscale chat model api key (#12938 ) * ChatAnyscale was missing coercion to SecretStr for anyscale api key * The model inherits from ChatOpenAI so it should not force the openai api key to be secret str until openai model has the same changes https://github.com/langchain-ai/langchain/issues/12841	2023-11-06 13:28:02 -05:00
hmasdev	622bf12c2e	fix regex pattern of structured output parser (#12929 ) - Description: fix the regex pattern of [StructuredChatOutputParser](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/agents/structured_chat/output_parser.py#L18) and add unit tests for the code change. - Issue: #12158 #12922 - Dependencies: None - Tag maintainer: - Twitter handle: @hmdev3 - NOTE: This PR conflicts #7495 . After #7495 is merged, I am going to update PR.	2023-11-06 07:53:14 -08:00
wemysschen	8d7144e6a6	fix baiducloud directory loader import file loader (#12924 ) Issue: fix baiducloud BOS directory loader imports its file loader --------- Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>	2023-11-06 07:52:31 -08:00
Kacper Łukawski	621419f71e	Fix normalizing the cosine distance in Qdrant (#12934 ) Qdrant was incorrectly calculating the cosine similarity and returning `0.0` for the best match, instead of `1.0`. Internally Qdrant returns a cosine score from `-1.0` (worst match) to `1.0` (best match), and the current formula reflects it.	2023-11-06 07:36:59 -08:00
Hech	8fe6bcc662	Fix return metadata when searching for DingoDB (#12937 )	2023-11-06 07:35:36 -08:00
Jakub Novák	ada3d2cbd1	Add possibility to pass on_artifacts for a specific conversation (#12687 ) Adds the possibility to pass on_artifacts to a conversation. This can be done as follows: ```python result = agent.run( input=message.text, metadata={ "on_artifact": CALLBACK_FUNCTION }, ) ```	2023-11-06 07:29:47 -08:00
Bagatur	53f453f01a	bump 331 (#12932 )	2023-11-06 05:58:12 -08:00
Erick Friis	5000c7308e	cli template gitignores (#12914 ) - ap gitignore - package	2023-11-05 22:34:45 -08:00
Harrison Chase	aba407f774	use keys not items (#12918 )	2023-11-05 22:08:29 -08:00
wemysschen	e14aa37d59	fix bes vector store search (#12828 ) Issue: fix search body in baidu cloud vectorsearch --------- Co-authored-by: wemysschen <root@icoding-cwx.bcc-szzj.baidu.com>	2023-11-03 15:39:19 -07:00
Lance Martin	ea1ab391d4	Open Clip multimodal embeddings (#12754 )	2023-11-03 13:33:36 -07:00
Bagatur	ebee616822	bump 330 (#12853 )	2023-11-03 13:26:41 -07:00
Erick Friis	6c237716c4	Update readmes with new cli install (#12847 ) Old command still works. Just simplifying. Merge after releasing CLI 0.0.15	2023-11-03 12:10:32 -07:00
Erick Friis	7db49d3842	Confirm sys.path includes current dir for app serve (#12851 ) - Make sure sys.path is set properly for langchain app serve - bump	2023-11-03 11:37:20 -07:00
Erick Friis	1bc35f61cb	CLI 0.0.14, Uvicorn update and no more [serve] (#12845 ) Calls uvicorn directly from cli: Reload works if you define app by import string instead of object. (was doing subprocess in order to get reloading) Version bump to 0.0.14 Remove the need for [serve] for simplicity. Readmes are updated in #12847 to avoid cluttering this PR	2023-11-03 11:05:52 -07:00
William FH	18005c6384	Disable trace_on_chain_group auto-tracing (#12807 ) Previously we treated trace_on_chain_group as a command to always start tracing. This is unintuitive (makes the function do 2 things), and makes it harder to toggle tracing	2023-11-03 10:05:09 -07:00
Erick Friis	0da75b9ebd	Autopopulate module name in cli init (#12814 )	2023-11-02 23:45:38 -07:00
William FH	98aff29fbd	Add Dataset Page to printout (#12816 )	2023-11-02 20:36:56 -07:00
Manuel Rech	2e2b9c76d9	Keep also original query - multi_query.py (#12696 ) When you use a MultiQuery it might be useful to use the original query as well as the newly generated ones to maximise the chances of retrieving the correct document. I haven't created an issue, it seems a very small and easy thing. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 18:15:02 -07:00
Bagatur	658a3a8607	FEAT: Merge TileDB vecstore (#12811 )	2023-11-02 17:40:32 -07:00
Akio Nishimura	c04647bb4e	Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` (#12713 ) - Description: Correct number of elements in config list in `batch()` and `abatch()` of `BaseLLM` in case `max_concurrency` is not None. - Issue: #12643 - Twitter handle: @akionux --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:28:48 -07:00
James Braza	88b506b321	Adds missing `urllib.parse` for IDE warning of `PubMedAPIWrapper` (#12808 ) Resolves an IDE (PyCharm 2023.2.3 PE) warning around `urllib.parse.quote`, also enabling CTRL-click	2023-11-02 17:27:25 -07:00
Bagatur	a2bb0dd445	TileDB update import unit tests	2023-11-02 17:24:22 -07:00
Nikos Papailiou	2fdaa1e5fd	Add TileDB vectorstore implementation (#12624 ) - Description: Add [TileDB](https://tiledb.com) vectorstore implementation. TileDB offers ANN search capabilities using the [TileDB-Vector-Search](https://github.com/TileDB-Inc/TileDB-Vector-Search) module. It provides serverless execution of ANN queries and storage of vector indexes both on local disk and cloud object stores (i.e. AWS S3). More details in: - [Why TileDB as a Vector Database](https://tiledb.com/blog/why-tiledb-as-a-vector-database) - [TileDB 101: Vector Search](https://tiledb.com/blog/tiledb-101-vector-search) - Twitter handle: @tiledb	2023-11-02 17:21:03 -07:00
盐粒 Yanli	1b233798a0	feat: Support pgvecto.rs as a VectorStore (#12718 ) Support [pgvecto.rs](https://github.com/tensorchord/pgvecto.rs) as a new VectorStore type. This introduces a new dependency [pgvecto_rs](https://pypi.org/project/pgvecto_rs/) and upgrades SQLAlchemy to ^2. Related to https://github.com/tensorchord/pgvecto.rs/issues/11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 17:16:04 -07:00
Daniel Chalef	0cbdba6a9b	zep: VectorStore: Use Native MMR (#12690 ) - refactor to use Zep's native MMR; update example - @baskaryan @eyurtsev	2023-11-02 16:45:42 -07:00
Daniel Chalef	cc3d3920e3	Zep: Summary Search and Example (#12686 ) Zep now has the ability to search over chat history summaries. This PR adds support for doing so. More here: https://blog.getzep.com/zep-v0-17/ @baskaryan @eyurtsev	2023-11-02 16:31:11 -07:00
Bagatur	526313002c	add import tests to all modules (#12806 )	2023-11-02 15:32:55 -07:00
Harrison Chase	6609a6033f	fix vectorstore imports (#12804 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-11-02 15:32:31 -07:00
Nuno Campos	f66a9d2adf	Automatically add configurable key to config_schema if config_specs i… (#12798 ) …s present	2023-11-02 21:46:15 +00:00
Praveen Venkateswaran	21eeba075c	enable the device_map parameter in huggingface pipeline (#12731 ) ### Enabling `device_map` in HuggingFacePipeline For multi-gpu settings with large models, the [accelerate](https://huggingface.co/docs/accelerate/usage_guides/big_modeling#using--accelerate) library provides the `device_map` parameter to automatically distribute the model across GPUs / disk. The [Transformers pipeline](`3520e37e86/src/transformers/pipelines/__init__.py (L543)`) enables users to specify `device` (or) `device_map`, and handles cases (with warnings) when both are specified. However, Langchain's HuggingFacePipeline only supports specifying `device` when calling transformers which limits large models and multi-gpu use-cases. Additionally, the [default value](`8bd3ce59cd/libs/langchain/langchain/llms/huggingface_pipeline.py (L72)`) of `device` is initialized to `-1` , which is incompatible with the transformers pipeline when `device_map` is specified. This PR addresses the addition of `device_map` as a parameter , and solves the incompatibility of `device = -1` when `device_map` is also specified. An additional test has been added for this feature. Additionally, some existing tests no longer work since 1. `max_new_tokens` has to be specified under `pipeline_kwargs` and not `model_kwargs` 2. The GPT2 tokenizer raises a `ValueError: Pipeline with tokenizer without pad_token cannot do batching`, since the `tokenizer.pad_token` is `None` ([related issue](https://github.com/huggingface/transformers/issues/19853) on the transformers repo). This PR handles fixing these tests as well. Co-authored-by: Praveen Venkateswaran <praveen.venkateswaran@ibm.com>	2023-11-02 14:29:06 -07:00
Mark Bell	3276aa3e17	__getattr__ should raise AttributeError not ImportError on missing attributes (#12801 ) [The python spec](https://docs.python.org/3/reference/datamodel.html#object.__getattr__) requires that `__getattr__` throw `AttributeError` for missing attributes but there are several places throwing `ImportError` in the current code base. This causes a specific problem with `hasattr` since it calls `__getattr__` then looks only for `AttributeError` exceptions. At present, calling `hasattr` on any of these modules will raise an unexpected exception that most code will not handle as `hasattr` throwing exceptions is not expected. In our case this is triggered by an exception tracker (Airbrake) that attempts to collect the version of all installed modules with code that looks like: `if hasattr(mod, "__version__"):`. With `HEAD` this is causing our exception tracker to fail on all exceptions. I only changed instances of unknown attributes raising `ImportError` and left instances of known attributes raising `ImportError`. It feels a little weird but doesn't seem to break anything.	2023-11-02 17:08:54 -04:00
Illia	71d1a48b66	Use data from all Google search results in SerpApi.com wrapper (#12770 ) - Description: Use all Google search results data in SerpApi.com wrapper instead of the first one only - Tag maintainer: @hwchase17 _P.S. `libs/langchain/tests/integration_tests/utilities/test_serpapi.py` are not executed during the `make test`._	2023-11-02 13:31:27 -07:00
Nuno Campos	c4fdf78d03	Fix AddableDict raising exception when used with non-addable values (#12785 )	2023-11-02 18:56:29 +00:00
Erick Friis	49e283a0cd	CLI 0.0.13, Configurable Template Demo (#12796 )	2023-11-02 11:42:57 -07:00
Nuno Campos	d1c6ad7769	Fix on_llm_new_token(chunk=) for some chat models (#12784 ) It was passing in message instead of generation	2023-11-02 16:33:44 +00:00
Erick Friis	070823f294	CLI 0.0.12 (#12787 )	2023-11-02 08:29:27 -07:00
Bagatur	979501c0ca	bump 329 (#12778 )	2023-11-02 06:02:43 -07:00
Erick Friis	da821320d3	Fixes 'Nonetype' not iterable for ObsidianLoader (#12751 ) Implements #12726 from @Di3mex	2023-11-01 16:07:09 -07:00
Eugene Yurtsev	b1caae62fd	APIChain add restrictions to domains (CVE-2023-32786) (#12747 ) * Restrict the chain to specific domains by default * This is a breaking change, but it will fail loudly upon object instantiation -- so there should be no silent errors for users * Resolves CVE-2023-32786	2023-11-01 18:50:34 -04:00
Erick Friis	4421ba46d7	Demo Server, Fix Timescale (#12746 ) - improve demo server - missing deps	2023-11-01 15:29:34 -07:00
Eugene Yurtsev	0e1aedb9f4	Use jinja2 sandboxing by default (#12733 ) * This is an opt-in feature, so users should be aware of risks if using jinja2. * Regardless we'll add sandboxing by default to jinja2 templates -- this sandboxing is a best effort basis. * Best strategy is still to make sure that jinja2 templates are only loaded from trusted sources.	2023-11-01 14:54:01 -07:00
Erick Friis	14340ee7cd	use http.client instead of urllib3 (#12660 ) dep problems with requests cloudflare debugging not worth it with urllib	2023-11-01 11:15:05 -07:00
Bagatur	eee5181b7a	bump 328, exp 37 (#12722 )	2023-11-01 10:27:39 -07:00
Erick Friis	3405dbbc64	dash not underscore (#12716 ) template names are auto-populating with the wrong convention (with underscores)	2023-11-01 09:48:37 -07:00
123-fake-st	8bd3ce59cd	PyPDFLoader use url in metadata source if file is a web path (#12092 ) Description: Update `langchain.document_loaders.pdf.PyPDFLoader` to store url in metadata (instead of a temporary file path) if user provides a web path to a pdf - Issue: Related to #7034; the reporter on that issue submitted a PR updating `PyMuPDFParser` for this behavior, but it has unresolved merge issues as of 20 Oct 2023 #7077 - In addition to `PyPDFLoader` and `PyMuPDFParser`, these other classes in `langchain.document_loaders.pdf` exhibit similar behavior and could benefit from an update: `PyPDFium2Loader`, `PDFMinerLoader`, `PDFMinerPDFasHTMLLoader`, `PDFPlumberLoader` (I'm happy to contribute to some/all of that, including assisting with `PyMuPDFParser`, if my work is agreeable) - The root cause is that the underlying pdf parser classes, e.g. `langchain.document_loaders.parsers.pdf.PyPDFParser`, never receive information about the url; the parsers receive a `langchain.document_loaders.blob_loaders.blob`, which contains the pdf contents and local file path, but not the url - This update passes the web path directly to the parser since it's minimally invasive and doesn't require further changes to maintain existing behavior for local files... bigger picture, I'd consider extending `blob` so that extra information like this can be communicated, but that has much bigger implications on the codebase which I think warrants maintainer input - Dependencies: None ```python # old behavior >>> from langchain.document_loaders import PyPDFLoader >>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': '/var/folders/w2/zx77z1cs01s1thx5dhshkd58h3jtrv/T/tmpfgrorsi5/tmp.pdf', 'page': 0} # new behavior >>> from langchain.document_loaders import PyPDFLoader >>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf') >>> docs = loader.load() >>> docs[0].metadata {'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0} ```	2023-11-01 11:27:00 -04:00
Dave Kwon	b1954aab13	feat: Add page metadata on PDFMinerLoader (#12277 ) - Description: PR for the suggestion in #12273. Like the other PDF loaders, load the PDF page by page and attach page metadata. - Issue: #12273 - Twitter handle: @blue0_0hope --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-01 11:25:37 -04:00
Duda Nogueira	7148f3e1fe	Weaviate - Fix schema existence check (#12711 ) This will allow you to create the schema beforehand. The check was failing and preventing importing into existing classes.	2023-11-01 08:22:15 -07:00
Aidos Kanapyanov	ae63c186af	Mask API key for Anyscale LLM (#12406 ) Description: Add masking of API Key for Anyscale LLM when printed. Issue: #12165 Dependencies: None Tag maintainer: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-11-01 10:22:26 -04:00
Predrag Gruevski	5ae51a8a85	Fix typo highlighted by `ruff` autoformatter. (#12691 ) H/t @MichaReiser for spotting it: https://github.com/langchain-ai/langchain/pull/12585/files#r1378253045	2023-10-31 22:16:06 -04:00
Erick Friis	44c8b159b9	properly increment version in cli (#12685 ) Went from 0.0.9 -> 0.0.11 without releasing. Back to 10, then release.	2023-10-31 17:27:43 -07:00
Leonid Ganeline	ddcec005bc	fix for `YahooFinanceNewsTool` (#12665 ) Added YahooFinanceNewsTool to the __init__.py It was missed here.	2023-10-31 14:58:09 -07:00
Predrag Gruevski	01a3c9b94e	Use an in-project virtualenv in the CLI package. (#12678 ) Keeping it in sync with how our other packages are configured.	2023-10-31 14:51:24 -07:00
Jacob Lee	bd668fcea1	Adds version CLI command (#12619 ) Will be automatically bumped with `poetry version patch`. @efriis @hwchase17 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-31 14:50:04 -07:00
Frank	bf5805bb32	Add quip loader (#12259 ) - Description: implement [quip](https://quip.com) loader - Issue: https://github.com/langchain-ai/langchain/issues/10352 - Dependencies: No - pass make format, make lint, make test --------- Co-authored-by: Hao Fan <h_fan@apple.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-31 14:11:24 -07:00
Roman Vasilyev	c9a6940d58	PGVector fix (#12592 ) latest release broken, this fixes it --------- Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-31 17:01:15 -04:00
Predrag Gruevski	e8b99364b3	Use `ruff` for both linting and formatting in `langchain-cli`. (#12672 ) Prior to this PR, `ruff` was used only for linting and not for formatting, despite the names of the commands. This PR makes it be used for both linting code and autoformatting it.	2023-10-31 13:52:25 -07:00
Margaret Qian	acfc485808	Update MosaicML Embedding Input Key (#12657 ) This input key was missed in the last update PR: https://github.com/langchain-ai/langchain/pull/7391 The input/output formats are intended to be like this: ``` {"inputs": [<prompt>]} {"outputs": [<output_text>]} ```	2023-10-31 14:43:30 -04:00
Predrag Gruevski	c871cc5055	Remove `print()` statements which seemed leftover from debugging. (#12648 ) Added in #12159 presumably during debugging. Right now they cause a bit of visual noise.	2023-10-31 13:45:48 -04:00
Noam Gat	14e8c74736	LM Format Enforcer Integration + Sample Notebook (#12625 ) ## Description This PR adds support for [lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) to LangChain. ![image](https://raw.githubusercontent.com/noamgat/lm-format-enforcer/main/docs/Intro.webp) The library is similar to jsonformer / RELLM which are supported in Langchain, but has several advantages such as - Batching and Beam search support - More complete JSON Schema support - LLM has control over whitespace, improving quality - Better runtime performance due to only calling the LLM's generate() function once per generate() call. The integration is loosely based on the jsonformer integration in terms of project structure. ## Dependencies No compile-time dependency was added, but if `lm-format-enforcer` is not installed, a runtime error will occur when it is used. ## Tests Due to the integration modifying the internal parameters of the underlying huggingface transformer LLM, it is not possible to test without building a real LM, which requires internet access. So, similar to the jsonformer and RELLM integrations, the testing is via the notebook. ## Twitter Handle [@noamgat](https://twitter.com/noamgat) Looking forward to hearing feedback! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-31 09:49:01 -07:00
Erick Friis	7f6e751a3d	template updates (#12646 )	2023-10-31 09:13:58 -07:00
Predrag Gruevski	f94e24dfd7	Install and use `ruff format` instead of black for code formatting. (#12585 ) Best to review one commit at a time, since two of the commits are 100% autogenerated changes from running `ruff format`: - Install and use `ruff format` instead of black for code formatting. - Output of `ruff format .` in the `langchain` package. - Use `ruff format` in experimental package. - Format changes in experimental package by `ruff format`. - Manual formatting fixes to make `ruff .` pass.	2023-10-31 10:53:12 -04:00
William FH	bfd719f9d8	bind_functions convenience method (#12518 ) I always take 20-30 seconds to re-discover where the `convert_to_openai_function` wrapper lives in our codebase. Chat langchain [has no clue](https://smith.langchain.com/public/3989d687-18c7-4108-958e-96e88803da86/r) what to do either. There's the older `create_openai_fn_chain` , but we haven't been recommending it in LCEL. The example we show in the [cookbook](https://python.langchain.com/docs/expression_language/how_to/binding#attaching-openai-functions) is really verbose. General function calling should be as simple as possible to do, so this seems a bit more ergonomic to me (feel free to disagree). Another option would be to coerce directly in the class's init (or when calling invoke), if provided. I'm not 100% set against that. That approach may be too easy but not simple. This PR feels like a decent compromise between simple and easy. ``` from enum import Enum from typing import Optional from pydantic import BaseModel, Field class Category(str, Enum): """The category of the issue.""" bug = "bug" nit = "nit" improvement = "improvement" other = "other" class IssueClassification(BaseModel): """Classify an issue.""" category: Category other_description: Optional[str] = Field( description="If classified as 'other', the suggested other category" ) from langchain.chat_models import ChatOpenAI llm = ChatOpenAI().bind_functions([IssueClassification]) llm.invoke("This PR adds a convenience wrapper to the bind argument") # AIMessage(content='', additional_kwargs={'function_call': {'name': 'IssueClassification', 'arguments': '{\n "category": "improvement"\n}'}}) ```	2023-10-31 07:15:37 -07:00
Nuno Campos	3143324984	Improve Runnable type inference for input_schemas (#12630 ) - Prefer lambda type annotations over inferred dict schema - For sequences that start with RunnableAssign infer seq input type as "input type of 2nd item in sequence - output type of runnable assign"	2023-10-31 13:22:54 +00:00
Nuno Campos	2f563cee20	Add Runnable.with_listeners() (#12549 ) - This binds start/end/error listeners to a runnable, which will be called with the Run object	2023-10-31 11:04:51 +00:00
Bagatur	bcc62d63be	bump 327 (#12623 )	2023-10-31 02:18:08 -07:00
Erick Friis	a1fae1fddd	Readme rewrite (#12615 ) Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-31 00:06:02 -07:00
Yujie Qian	1dbb77d7db	VoyageEmbeddings (#12608 ) - Description: Integrate VoyageEmbeddings into LangChain, with tests and docs - Issue: N/A - Dependencies: N/A - Tag maintainer: N/A - Twitter handle: @Voyage_AI_ --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:37:43 -07:00
chocolate4	92bf40a921	Add a new vector store hippo for langchain #11763 (#12412 ) #11763 --------- Co-authored-by: TranswarpHippo <hippo.0.assistant@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:35:23 -07:00
Karthik Raja A	342d6c7ab6	Multi on client toolkit (#12392 ) - Add MultiOn close function and update key value and add async functionality - solved the key value TabId not found.. (updated to use latest key value) @hwchase17	2023-10-30 18:34:56 -07:00
Prabin Nepal	b109cb031b	SecretStr for fireworks api (#12475 ) - Description: This pull request removes secrets present in raw format, - Issue: Fireworks api key was exposed when printing out the langchain object [#12165](https://github.com/langchain-ai/langchain/issues/12165) - Maintainer: @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:17:53 -07:00
Harrison Chase	a32c236c64	bump cli to 009 (#12611 )	2023-10-30 18:12:08 -07:00
Martin Schade	0c7f1d8b21	Textract linearizer (#12446 ) Description: Textract PDF Loader generating linearized output, meaning it will replicate the structure of the source document as close as possible based on the features passed into the call (e. g. LAYOUT, FORMS, TABLES). With LAYOUT reading order for multi-column documents or identification of lists and figures is supported and with TABLES it will generate the table structure as well. FORMS will indicate "key: value" with columms. - Issue: the issue fixes #12068 - Dependencies: amazon-textract-textractor is added, which provides the linearization - Tag maintainer: @3coins --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 18:02:10 -07:00
Erick Friis	f39246bd7e	cli should pull instead of delete+clone (#12607 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-10-30 16:44:09 -07:00
Harrison Chase	8b5e879171	add a template for the package readme (#12499 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2023-10-30 16:39:39 -07:00
Bagatur	9bedda50f2	Bagatur/lakefs loader2 (#12524 ) Co-authored-by: Jonathan Rosenberg <96974219+Jonathan-Rosenberg@users.noreply.github.com>	2023-10-30 16:30:27 -07:00
Ackermann Yuriy	99b69fe607	Fixed missing optional tags. Added default key value for Ollama (#12599 ) Added missing Optional typings. Added default values for Ollama optional keys. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 16:30:10 -07:00
Bagatur	016813d189	factor out to_secret (#12593 )	2023-10-30 15:10:25 -07:00
hsuyuming	630ae24b28	implement get_num_tokens to use google's count_tokens function (#10565 ) can get the correct token count instead of using gpt-2 model Description: Implement get_num_tokens within VertexLLM to use google's count_tokens function. (https://cloud.google.com/vertex-ai/docs/generative-ai/get-token-count). So we don't need to download gpt-2 model from huggingface, also when we do the mapreduce chain we can get correct token count. Tag maintainer: @lkuligin Twitter handle: My twitter: @abehsu1992626 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 15:10:05 -07:00
Pham Vu Thai Minh	33e77a1007	Async support for FAISS (#11333 ) Following this tutorial about using OpenAI Embeddings with FAISS https://python.langchain.com/docs/integrations/vectorstores/faiss ```python from langchain.embeddings.openai import OpenAIEmbeddings from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import FAISS from langchain.document_loaders import TextLoader loader = TextLoader("../../../extras/modules/state_of_the_union.txt") documents = loader.load() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) docs = text_splitter.split_documents(documents) embeddings = OpenAIEmbeddings() ``` This works fine ```python db = FAISS.from_documents(docs, embeddings) query = "What did the president say about Ketanji Brown Jackson" docs = db.similarity_search(query) ``` But the async version does not ```python db = await FAISS.afrom_documents(docs, embeddings) # NotImplementedError query = "What did the president say about Ketanji Brown Jackson" docs = await db.asimilarity_search(query) # this will use await asyncio.get_event_loop().run_in_executor under the hood and will not call OpenAIEmbeddings.aembed_query but call OpenAIEmbeddings.embed_query ``` So this PR adds async/await support for FAISS --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-30 15:08:53 -07:00
Jeff Zhuo	13b89815a3	Issue: fix the issue #11648 init minimax llm (#12554 ) e https://github.com/langchain-ai/langchain/issues/11648 Minimax llm failed to initialize The idea of this fix is https://github.com/langchain-ai/langchain/issues/10917#issuecomment-1765606725 do not use underscore in python model class --------- Co-authored-by: zhuojianming@cmcm.com <zhuojianming@cmcm.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:30:17 -07:00
Florian Valeye	bfb27324cb	[Matching Engine] Update the Matching Engine to include the distance and filters (#12555 ) Hello 👋, This Pull Request adds more capability to the [MatchingEngine](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.matching_engine.MatchingEngine.html) vectorstore of GCP. It includes the `similarity_search_by_vector_with_relevance_scores` function and also [filters](https://cloud.google.com/vertex-ai/docs/vector-search/filtering) to `filter` the namespaces when retrieving the results. - Description: Add [filter](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.MatchingEngineIndexEndpoint#google_cloud_aiplatform_MatchingEngineIndexEndpoint_find_neighbors) in `similarity_search` and add `similarity_search_by_vector_with_relevance_scores` method - Dependencies: None - Tag maintainer: Unknown Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:12:59 -07:00
Harrison Chase	1d51363e49	change project template (#12493 )	2023-10-30 14:06:30 -07:00
Holt Skinner	e53b9ccd70	feat: Add Google Cloud Text-to-Speech Tool (#12572 ) - Add Tool for [Google Cloud Text-to-Speech](https://cloud.google.com/text-to-speech) - Follows similar structure to [Eleven Labs Text2Speech](https://python.langchain.com/docs/integrations/tools/eleven_labs_tts) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 14:05:39 -07:00
Adilkhan Sarsen	6e702b9c36	Deep memory support in LangChain (#12268 ) - Description: adding support to Activeloop's DeepMemory feature that boosts recall up to 25%. Added Jupyter notebook showcasing the feature and also made index params explicit. - Twitter handle: will really appreciate if we could announce this on twitter. --------- Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>	2023-10-30 12:16:14 -07:00
billytrend-cohere	b1e3843931	Add client_name="langchain" to Cohere usage (#11328 ) Hey, we're looking to invest more in adding cohere integrations to langchain so would love to get more of an idea for how it's used. Hopefully this pr is acceptable. This week I'm also going to be looking into adding our new [retrieval augmented generation product](https://txt.cohere.com/chat-with-rag/) to langchain. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-30 11:20:55 -07:00
Bagatur	37aec1e050	bump 326 (#12569 )	2023-10-30 10:11:17 -07:00
Eugene Yurtsev	1b1a2d5740	Image Caption accepts bytes for images (#12561 ) Accept bytes for images in image caption --------- Co-authored-by: webcoderz <19884161+webcoderz@users.noreply.github.com>	2023-10-30 12:29:54 -04:00
Nuno Campos	7897483819	Allow astream_log to be used inside atrace_as_chain_group (#12558 )	2023-10-30 15:55:16 +00:00
Holt Skinner	e05bb938de	Merge pull request #12433 * feat: Add Google Cloud Translation document transformer * Merge branch 'langchain-ai:master' into google-translate * Add documentation for Google Translate Document Transformer * Fix line length error * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Addressed code review comments * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Removed extra variable * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Merge branch 'master' into google-translate * Merge branch 'google-translate' of https://github.com/holtskinner/lan… * Removed extra import	2023-10-29 21:22:36 -04:00
Samad Koita	d1fdcd4fcb	Masking of API Key for GooseAI LLM (#12496 ) Description: Add masking of API Key for GooseAI LLM when printed. Issue: https://github.com/langchain-ai/langchain/issues/12165 Dependencies: None Tag maintainer: @eyurtsev --------- Co-authored-by: Samad Koita <>	2023-10-29 21:21:33 -04:00
Andrew Zhou	64c4a698a8	More comprehensive readthedocs document loader (#12382 ) ## Description: When building our own readthedocs.io scraper, we noticed a couple of interesting things: 1. Text lines with a lot of nested <span> tags would give unclean text with a bunch of newlines. For example, for [Langchain's documentation](https://api.python.langchain.com/en/latest/document_loaders/langchain.document_loaders.readthedocs.ReadTheDocsLoader.html#langchain.document_loaders.readthedocs.ReadTheDocsLoader), a single line is represented in a complicated nested HTML structure, and the naive `soup.get_text()` call currently being made will create a newline for each nested HTML element. Therefore, the document loader would give a messy, newline-separated blob of text. This would be true in a lot of cases. Additionally, content from iframes, code from scripts, css from styles, etc. will be picked up if it's a subclass of the selector (which happens more often than you'd think). For example, [this page](https://pydeck.gl/gallery/contour_layer.html#) will scrape 1.5 million characters of such content. Therefore, I wrote a recursive _get_clean_text(soup) class function that (a) skips all irrelevant elements, and (b) only adds newlines when necessary. 2. Index pages (like [this one](https://api.python.langchain.com/en/latest/api_reference.html)) would be loaded, chunked, and eventually embedded. This is really bad not just because the user will be embedding irrelevant information - but because index pages are very likely to show up in retrieved content, making retrieval less effective (in our tests). Therefore, I added a bool parameter `exclude_index_pages` defaulted to False (which is the current behavior, although I'd petition to default this to True) that will skip all pages where links take up 50%+ of the page. Through manual testing, this seems to be the best threshold. ## Other Information: - Issue: n/a - Dependencies: n/a - Tag maintainer: n/a - Twitter handle: @andrewthezhou --------- Co-authored-by: Andrew Zhou <andrew@heykona.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-29 16:26:53 -07:00
Peter Vandenabeele	3468c038ba	Add unit tests for document_transformers/beautiful_soup_transformer.py (#12520 ) - Description: * Add unit tests for document_transformers/beautiful_soup_transformer.py * Basic functionality is tested (extract tags, remove tags, drop lines) * add a FIXME comment about the order of tags that is not preserved (and a passing test, but with the expected tags now out-of-order) - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin - Twitter handle: `peter_v` Please make sure your PR is passing linting and testing before submitting. => OK: I ran `make format`, `make test` (passing after install of beautifulsoup4) and `make lint`.	2023-10-29 16:24:47 -07:00
Anirudh Gautam	b257e6a4e8	Mask API key for AI21 LLM (#12418 ) - Description: Added masking of the API Key for AI21 LLM when printed and improved the docstring for AI21 LLM. - Updated the AI21 LLM to utilize SecretStr from pydantic to securely manage API key. - Made improvements in the docstring of AI21 LLM. It now mentions that the API key can also be passed as a named parameter to the constructor. - Added unit tests. - Issue: #12165 - Tag maintainer: @eyurtsev --------- Co-authored-by: Anirudh Gautam <anirudh@Anirudhs-Mac-mini.local>	2023-10-29 14:53:41 -07:00
silvhua	9dead1034c	`_dalle_image_url` returns list of urls if n>1 (#11800 ) - Description: Updated the `_dalle_image_url` method to return a list of URLs if self.n>1, - Issue: #10691, - Dependencies: unsure, - Tag maintainer: @eyurtsev, - Twitter handle: @silvhua --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-29 14:23:23 -07:00
Bagatur	1815ea2fdb	OpenAI runnable constructor (#12455 )	2023-10-29 13:40:30 -07:00
William FH	a830b809f3	Patch forward ref bug (#12508 ) Currently this gives a bug: ``` from langchain.schema.runnable import RunnableLambda bound = RunnableLambda(lambda x: x).with_config({"callbacks": []}) # ConfigError: field "callbacks" not yet prepared so type is still a ForwardRef, you might need to call RunnableConfig.update_forward_refs(). ``` Rather than deal with cyclic imports and extra load time, etc., I think it makes sense to just have a separate Callbacks definition here that is a relaxed typehint.	2023-10-29 00:53:01 -07:00
William FH	36204c2baf	Evaluation Callback Multi Response (#12505 ) 1. Allow run evaluators to return {"results": [list of evaluation results]} in the evaluator callback. 2. Allow run evaluators to pick the target run ID to provide feedback to. (1) means you could do something like a function call that populates a full rubric in one go (not sure how reliable that is in general though) rather than splitting off into separate LLM calls - cheaper and less code to write. (2) means you can provide feedback to runs on subsequent calls. Immediate use case is if you wanted to add an evaluator to a chat bot and assign feedback to previous conversation turns. I have a corresponding one in the SDK	2023-10-28 23:18:29 -07:00
Harrison Chase	9e0ae56287	various templates improvements (#12500 )	2023-10-28 22:13:22 -07:00
0xC9	79cf01366e	Update tool.py (#12472 ) In the GoogleSerperResults class, the name field is defined as 'google_serrper_results_json'. This looks like a typo, and perhaps should be 'google_serper_results_json'.	2023-10-28 21:49:01 -07:00
Harrison Chase	eb903e211c	bump to 36 (#12487 )	2023-10-28 08:51:23 -07:00
Tyler Hutcherson	4209457bdc	Redis langserve template (#12443 ) Add Redis langserve template! Eventually will add semantic caching to this too. But I was struggling to get that to work for some reason with the LCEL implementation here. - Description: Introduces the Redis LangServe template. A simple RAG based app built on top of Redis that allows you to chat with company's public financial data (Edgar 10k filings) - Issue: None - Dependencies: The template contains the poetry project requirements to run this template - Tag maintainer: @baskaryan @Spartee - Twitter handle: @tchutch94 Note: this requires the commit here that deletes the `_aget_relevant_documents()` method from the Redis retriever class that wasn't implemented. That was breaking the langserve app. --------- Co-authored-by: Sam Partee <sam.partee@redis.com>	2023-10-28 08:31:12 -07:00
Erick Friis	9adaa78c65	cli improvements (#12465 ) Features - add multiple repos by their branch/repo - generate `pip install` commands and `add_route()` code ![Screenshot 2023-10-27 at 4 49 52 PM](https://github.com/langchain-ai/langchain/assets/9557659/3aec4cbb-3f67-4f04-8370-5b54ea983b2a) Optimizations: - group installs by repo/branch to avoid duplicate cloning	2023-10-28 08:25:31 -07:00
Adam Law	df4960a6d8	add reranking to azuresearch (#12454 ) - Description: Adds returning the reranking score when using semantic search - Issue: #12317 --------- Co-authored-by: Adam Law <adamlaw@microsoft.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 14:14:09 -07:00
Eugene Yurtsev	60d009f75a	Add security note to API chain (#12452 ) Add security note	2023-10-27 17:09:42 -04:00
Matvey Arye	11505f95d3	Improve handling of empty queries for timescale vector (#12393 ) Description: Improve handling of empty queries in timescale-vector. For timescale-vector it is more efficient to get a None embedding when the embedding has no semantic meaning. It allows timescale-vector to perform more optimizations. Thus, when the query is empty, use a None embedding. Also pass down constructor arguments to the timescale vector client. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-10-27 13:55:16 -07:00
Erick Friis	38cee5fae0	cli updates 2 (#12447 ) - extras group - readme - another readme --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 13:37:03 -07:00
William FH	5d40e36c75	Trace if run tree set (#12444 ) This code path is hit in the following case: - Start in langchain code and manually provide a tracer - Handoff to the traceable - Hand back to langchain code. Which happens for evaluating `@traceable` functions unfortunately	2023-10-27 12:29:18 -07:00
Bagatur	c2a0a6b6df	make doc utils public (#12394 )	2023-10-27 12:08:08 -07:00
Henter	d6888a90d0	Fix the missing temperature parameter for Baichuan-AI chat_model (#12420 ) Description: add the missing `temperature` parameter for the Baichuan-AI chat_model. Baichuan-AI API doc: https://platform.baichuan-ai.com/docs/api	2023-10-27 12:07:21 -07:00
Erick Friis	6908634428	cli updates oct27 (#12436 )	2023-10-27 12:06:46 -07:00
HwangJohn	d38c8369b3	added rrf argument in ApproxRetrievalStrategy class __init__() (#11987 ) - Description: To handle hybrid search with RRF (Reciprocal Rank Fusion) in Elasticsearch, an rrf argument was added for adjusting 'rank_constant' and 'window_size' to combine multiple result sets with different relevance indicators into a single result set. (ref: https://www.elastic.co/kr/blog/whats-new-elastic-enterprise-search-8-9-0), - Dependencies: No dependencies changed, - Tag maintainer: @baskaryan, Nice to meet you, I'm new to contributing and this is my first PR. I only changed the langchain/vectorstores/elasticsearch.py file. I ran make format and lint and got this message, ```shell make lint_diff ./scripts/check_pydantic.sh . ./scripts/check_imports.sh poetry run ruff . [ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run black langchain/vectorstores/elasticsearch.py --check All done! ✨ 🍰 ✨ 1 file would be left unchanged. [ "langchain/vectorstores/elasticsearch.py" = "" ] || poetry run mypy langchain/vectorstores/elasticsearch.py langchain/__init__.py: error: Source file found twice under different module names: "mvp.nlp.langchain.libs.langchain.langchain" and "langchain" Found 1 error in 1 file (errors prevented further checking) make: * [lint_diff] Error 2 ``` Thank you --------- Co-authored-by: 황중원 <jwhwang@amorepacific.com>	2023-10-27 11:53:19 -07:00
Roman Vasilyev	2c58dca5f0	optional reusable connection (#12051 ) My Postgres ran out of connections after continuous PGVector usage because it constantly creates new connections, so adding a reusable pre-established connection seems to solve the issue --------- Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-10-27 11:52:42 -07:00
Ennio Pastore	48fde2004f	Update long_context_reorder.py (#12422 ) The function comment was confusing and inaccurate	2023-10-27 11:52:28 -07:00

2133 Commits