langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-10 01:10:59 +00:00

Author	SHA1	Message	Date
Ivaylo Bratoev	7c5063ef60	infra: fix how Poetry is installed in the dev container (#20521 ) Currently, when a new dev container is created, poetry does not work in it with the error "No module named 'rapidfuzz'". Install Poetry outside the project venv so that poetry and project dependencies do not get mixed. Use pipx to install poetry securely in its own isolated environment. Issue: #12237 Twitter handle: https://twitter.com/ibratoev Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-24 17:33:25 -07:00
GustavoSept	c2d09a5186	experimental[patch]: Makes regex customizable in text_splitter.py (SemanticChunker class) (#20485 ) - Description: Currently, the regex is static (`r"(?<=[.?!])\s+"`), which is only useful for certain use cases. This change moves it to a parameter of split_text(), which adds flexibility without adding complexity (the default regex is unchanged). - Issue: Not applicable (I searched, no one seems to have created this issue yet). - Dependencies: None. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-25 00:32:40 +00:00
William FH	a936f696a6	[Core] Feat: update config CVar in tool.invoke (#20808 )	2024-04-24 17:17:21 -07:00
Lei Zhang	2cd907ad7e	text-splitters[patch]: fix MarkdownHeaderTextSplitter fails to parse headers with non-printable characters (#20645 ) Description: MarkdownHeaderTextSplitter Fails to Parse Headers with non-printable characters. more #20643 The following is the official test case. Just replacing `# Foo\n\n` with `\ufeff# Foo\n\n` will cause the test case to fail. chunk metadata is empty ```python def test_md_header_text_splitter_1() -> None: """Test markdown splitter by header: Case 1.""" markdown_document = ( "\ufeff# Foo\n\n" " ## Bar\n\n" "Hi this is Jim\n\n" "Hi this is Joe\n\n" " ## Baz\n\n" " Hi this is Molly" ) headers_to_split_on = [ ("#", "Header 1"), ("##", "Header 2"), ] markdown_splitter = MarkdownHeaderTextSplitter( headers_to_split_on=headers_to_split_on, ) output = markdown_splitter.split_text(markdown_document) expected_output = [ Document( page_content="Hi this is Jim \nHi this is Joe", metadata={"Header 1": "Foo", "Header 2": "Bar"}, ), Document( page_content="Hi this is Molly", metadata={"Header 1": "Foo", "Header 2": "Baz"}, ), ] assert output == expected_output ``` twitter: @coolbeevip Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-25 00:07:42 +00:00
jtanios	2968f20970	docs: git dependency name correction (#20662 ) This PR corrects the name of the `git` python package to `GitPython`. Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-24 23:43:44 +00:00
ccurme	481d3855dc	patch: remove usage of llm, chat model __call__ (#20788 ) - `llm(prompt)` -> `llm.invoke(prompt)` - `llm(prompt=prompt` -> `llm.invoke(prompt)` (same with `messages=`) - `llm(prompt, callbacks=callbacks)` -> `llm.invoke(prompt, config={"callbacks": callbacks})` - `llm(prompt, kwargs)` -> `llm.invoke(prompt, kwargs)`	2024-04-24 19:39:23 -04:00
Raghav Dixit	9b7fb381a4	community[patch]: LanceDB integration patch update (#20686 ) Description: - added functionality: delete, index creation, using an existing connection object, etc. - updated usage - added LanceDB cloud OSS support. `make lint_diff` and `make test` checks done.	2024-04-24 16:27:43 -07:00
Nikita Pokidyshev	9e983c9500	langchain[patch]: fix agent_token_buffer_memory not working with openai tools (#20708 ) - Description: fix a bug in the agent_token_buffer_memory - Issue: agent_token_buffer_memory was not working with openai tools - Dependencies: None - Twitter handle: @pokidyshef	2024-04-24 15:51:58 -07:00
Salika Dave	6353991498	docs: [Retrieval > .. > PDF] update package installation instructions for Unstructured and PDFMiner (#20723 ) Description: Adds the command to install packages required before using _Unstructured_ and _PDFMiner_ from `langchain.community` Documentation Page Being Updated: [LangChain > Retrieval > Document loaders > PDF > Using Unstructured](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/#using-unstructured) Issue: #20719 Dependencies: no dependencies Twitter handle: SalikaDave --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-24 22:24:11 +00:00
dpdjvhxm	a9e2e98708	docs: Update apache_age.ipynb (#20722 ) typo	2024-04-24 22:18:59 +00:00
Erick Friis	1aef8116de	upstage: release 0.1.1 (#20864 )	2024-04-24 15:18:30 -07:00
junkeon	c8fd51e8c8	upstage: Add Upstage partner package LA and GC (#20651 ) --------- Co-authored-by: Sean <chosh0615@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Sean Cho <sean@upstage.ai>	2024-04-24 15:17:20 -07:00
hsmtkk	5ecebf168c	docs: imported List is not used (#20720 ) # Description Minor sample code fix # Issue Imported `List` is not used. # Dependencies N/A # Twitter handle N/A	2024-04-24 15:17:07 -07:00
Alex Lee	243ba71b28	langchain[patch]: add `aprep_output` method to `langchain/chains/base.py` (#20748 ) ## Description Add `aprep_output` method to `langchain/chains/base.py`. Some downstream `ChatMessageHistory` objects that use async connections require an async way to append to the context. It turned out that `ainvoke()` was calling `prep_output` which is synchronous. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-24 22:16:25 +00:00
Harrison Chase	43c041cda5	support messages in messages out (#20862 )	2024-04-24 14:58:58 -07:00
back2nix	a1614b88ac	groq[patch]: groq proxy support (#20758 ) # Proxy Fix for Groq Class 🐛 🚀 ## Description This PR fixes a bug related to proxy settings in the `Groq` class, allowing users to connect to LangChain services via a proxy. ## Changes Made - ✅ FIX support for specifying proxy settings in the `Groq` class. - ✅ Resolved the bug causing issues with proxy settings. - ❌ Did not include unit tests and documentation updates. - ❌ Did not run make format, make lint, and make test to ensure code quality and functionality because I couldn't get it to run, so I don't program in Python and couldn't run `ruff`. - ❔ Ensured that the changes are backwards compatible. - ✅ No additional dependencies were added to `pyproject.toml`. ### Error Before Fix ```python Traceback (most recent call last): File "/home/bg/Documents/code/github.com/back2nix/test/groq/main.py", line 9, in <module> chat = ChatGroq( ^^^^^^^^^ File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/langchain_core/load/serializable.py", line 120, in __init__ super().__init__(**kwargs) File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__ raise validation_error pydantic.v1.error_wrappers.ValidationError: 1 validation error for ChatGroq __root__ Invalid `http_client` argument; Expected an instance of `httpx.AsyncClient` but got <class 'httpx.Client'> (type=type_error) ``` ### Example usage after fix ```python3 import os import httpx from langchain_core.prompts import ChatPromptTemplate from langchain_groq import ChatGroq chat = ChatGroq( temperature=0, groq_api_key=os.environ.get("GROQ_API_KEY"), model_name="mixtral-8x7b-32768", http_client=httpx.Client( proxies="socks5://127.0.0.1:1080", transport=httpx.HTTPTransport(local_address="0.0.0.0"), ), http_async_client=httpx.AsyncClient( proxies="socks5://127.0.0.1:1080", transport=httpx.HTTPTransport(local_address="0.0.0.0"), ), ) 
system = "You are a helpful assistant." human = "{text}" prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)]) chain = prompt | chat out = chain.invoke({"text": "Explain the importance of low latency LLMs"}) print(out) ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-24 21:58:03 +00:00
volodymyr-memsql	493afe4d8d	community[patch]: add hybrid search to singlestoredb vectorstore (#20793 ) Implemented the ability to enable full-text search within the SingleStore vector store, offering users a versatile range of search strategies. This enhancement allows users to seamlessly combine full-text search with vector search, enabling the following search strategies: * Search solely by vector similarity. * Conduct searches exclusively based on text similarity, utilizing Lucene internally. * Filter search results by text similarity score, with the option to specify a threshold, followed by a search based on vector similarity. * Filter results by vector similarity score before conducting a search based on text similarity. * Perform searches using a weighted sum of vector and text similarity scores. Additionally, integration tests have been added to comprehensively cover all scenarios. Updated notebook with examples. CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-24 21:34:50 +00:00
Tomaz Bratanic	9efab3ed66	community[patch]: Add driver config param for neo4j graph (#20772 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-04-24 21:14:41 +00:00
Leonid Ganeline	13751c3297	community: `tigergraph` fixes (#20034 ) - added guard on the `pyTigerGraph` import - added a missed example page in the `docs/integrations/graphs/` - formatted the `docs/integrations/providers/` page to the consistent format. Added links.	2024-04-24 16:49:21 -04:00
Martin Kolb	0186e4e633	community[patch]: Advanced filtering for HANA Cloud Vector Engine (#20821 ) - Description: This PR adds support for advanced filtering to the integration of HANA Vector Engine. The newly supported filtering operators are: $eq, $ne, $gt, $gte, $lt, $lte, $between, $in, $nin, $like, $and, $or - Issue: N/A - Dependencies: no new dependencies added Added integration tests to: `libs/community/tests/integration_tests/vectorstores/test_hanavector.py` Description of the new capabilities in notebook: `docs/docs/integrations/vectorstores/hanavector.ipynb`	2024-04-24 13:47:27 -07:00
Alex Sherstinsky	12e5ec6de3	community: Support both Predibase SDK-v1 and SDK-v2 in Predibase-LangChain integration (#20859 )	2024-04-24 13:31:01 -07:00
Erick Friis	8c95ac3145	docs, multiple: de-beta with_structured_output (#20850 )	2024-04-24 19:34:57 +00:00
Nuno Campos	477eb1745c	Better support for subgraphs in graph viz (#20840 )	2024-04-24 12:32:52 -07:00
aditya thomas	a9c7d47c03	docs: update openai llm documentation (#20827 ) Description: Bring OpenAI LLM page to the LCEL era Issue: See discussion #20810 Dependencies: None	2024-04-24 12:26:57 -07:00
JeffKatzy	5ab3f9a995	community[patch]: standardize chat init args (#20844 ) community:perplexity[patch]: standardize init args. Updated `pplx_api_key` and `request_timeout` so that they are aliased to `api_key` and `timeout`, respectively. Added a test that both continue to set the same underlying attributes. Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-04-24 12:26:05 -07:00
Pavlo Paliychuk	70ae59bcfe	docs: Update Zep Messaging, add links to Zep Cloud Docs (#20848 ) Description: This PR updates Zep messaging in the docs and adds links to LangChain Zep Cloud examples in our documentation. Twitter handle: @paulpaliychuk51	2024-04-24 19:14:54 +00:00
Massimiliano Pronesti	8d1167b32f	community[patch]: add support for similarity_score_threshold search in… (#20852 ) See https://github.com/langchain-ai/langchain/issues/20600#issuecomment-2075569338 for details. @chrislrobert	2024-04-24 19:14:33 +00:00
Bagatur	87d31a3ec0	docs: contributing note (#20843 )	2024-04-24 10:41:19 -07:00
Eugene Yurtsev	d8aa72f51d	core[minor],langchain[patch]: Move base indexing interface and logic to core (#20667 ) This PR moves the interface and the logic to core. The following changes to namespaces: `indexes` -> `indexing` `indexes._api` -> `indexing.api` Testing code is intentionally duplicated for now since it's testing different implementations of the record manager (in-memory vs. SQL). Common logic will need to be pulled out into the test client. A follow up PR will move the SQL based implementation outside of LangChain.	2024-04-24 13:18:42 -04:00
ccurme	3bcfbcc871	groq: handle null queue_time (#20839 )	2024-04-24 09:50:09 -07:00
Eugene Yurtsev	30e48c9878	core[patch],community[patch]: Move file chat history back to community (#20834 ) Marking as patch since we haven't had releases in between. This just reverting part of a PR from yesterday.	2024-04-24 12:47:25 -04:00
ccurme	6debadaa70	groq: bump core (#20838 )	2024-04-24 11:51:46 -04:00
Erick Friis	7984206c95	groq: release 0.1.3 (#20836 ) Fixes #20811	2024-04-24 08:06:06 -07:00
Nestor Qin	9111d3a636	community[patch]: Fix message formatting for Anthropic models on Amazon Bedrock (#20801 ) Description: This PR fixes an issue in the message formatting function for Anthropic models on Amazon Bedrock. Currently, the LangChain BedrockChat model crashes if it uses an Anthropic model and the model returns a message of the following type: - `AIMessageChunk` Moreover, when BedrockChat is used to build an agent, the following message types trigger the same issue: - `HumanMessageChunk` - `FunctionMessage` Issue: https://github.com/langchain-ai/langchain/issues/18831 Dependencies: No. Testing: Manually tested. The following code was failing before the patch and works after. ``` @tool def square_root(x: str): "Useful when you need to calculate the square root of a number" return math.sqrt(int(x)) llm = ChatBedrock( model_id="anthropic.claude-3-sonnet-20240229-v1:0", model_kwargs={ "temperature": 0.0 }, ) prompt = ChatPromptTemplate.from_messages( [ ("system", FUNCTION_CALL_PROMPT), ("human", "Question: {user_input}"), MessagesPlaceholder(variable_name="agent_scratchpad"), ] ) tools = [square_root] tools_string = format_tool_to_anthropic_function(square_root) agent = ( RunnablePassthrough.assign( user_input=lambda x: x['user_input'], agent_scratchpad=lambda x: format_to_openai_function_messages( x["intermediate_steps"] ) ) | prompt | llm | AnthropicFunctionsAgentOutputParser() ) agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True) output = agent_executor.invoke({ "user_input": "What is the square root of 2?", "tools_string": tools_string, }) ``` List of messages returned from Bedrock: ``` <SystemMessage> content='You are a helpful assistant.' <HumanMessage> content='Question: What is the square root of 2?' 
<AIMessageChunk> content="Okay, let's calculate the square root of 2.<scratchpad>\nTo calculate the square root of a number, I can use the square_root tool:\n\n<function_calls>\n <invoke>\n <tool_name>square_root</tool_name>\n <parameters>\n <__arg1>2</__arg1>\n </parameters>\n </invoke>\n</function_calls>\n</scratchpad>\n\n<function_results>\n<search_result>\nThe square root of 2 is approximately 1.414213562373095\n</search_result>\n</function_results>\n\n<answer>\nThe square root of 2 is approximately 1.414213562373095\n</answer>" id='run-92363df7-eff6-4849-bbba-fa16a1b2988c'" <FunctionMessage> content='1.4142135623730951' name='square_root' ```	2024-04-23 22:40:39 +00:00
ccurme	06b04b80b8	groq: fix warning filter for integration test (#20806 )	2024-04-23 18:11:41 -04:00
ccurme	5a3c65a756	standard tests: add xfails (#20659 )	2024-04-23 17:14:16 -04:00
Erick Friis	ddc2274aea	standard-tests: split tool calling test (#20803 ) just making it a bit easier to grok	2024-04-23 20:59:45 +00:00
ccurme	6622829c67	mistral: catch GatedRepoError, release 0.1.3 (#20802 ) https://github.com/langchain-ai/langchain/issues/20618 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-04-23 20:56:42 +00:00
Eugene Yurtsev	a7c347ab35	langchain[patch]: Update evaluation logic that instantiates a default LLM (#20760 ) Favor langchain_openai over langchain_community for evaluation logic. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-04-23 16:09:32 -04:00
Eugene Yurtsev	72f720fa38	langchain[major]: Remove default instantiations of LLMs from VectorstoreToolkit (#20794 ) Remove default instantiation from vectorstore toolkit.	2024-04-23 16:09:14 -04:00
ccurme	42de5168b1	langchain: deprecate LLMChain, RetrievalQA, and ConversationalRetrievalChain (#20751 )	2024-04-23 15:55:34 -04:00
Erick Friis	30c7951505	core: use qualname in beta message (#20361 )	2024-04-23 11:20:13 -07:00
Aliaksandr Kuzmik	5560cc448c	community[patch]: fix CometTracer bug (#20796 ) Hi! My name is Alex, I'm an SDK engineer from [Comet](https://www.comet.com/site/) This PR updates the `CometTracer` class. Fixed an issue when `CometTracer` failed while logging the data to Comet because this data is not JSON-encodable. The problem was in some of the `Run` attributes that could contain non-default types inside, now these attributes are taken not from the run instance, but from the `run.dict()` return value.	2024-04-23 13:24:41 -04:00
Eugene Yurtsev	1c89e45c14	langchain[major]: breaks some chains to remove hidden defaults (#20759 ) Breaks some chains in langchain to remove hidden chat model / llm instantiation.	2024-04-23 11:11:40 -04:00
Eugene Yurtsev	ad6b5f84e5	community[patch],core[minor]: Move in memory cache implementation to core (#20753 ) This PR moves the InMemoryCache implementation from community to core.	2024-04-23 11:10:11 -04:00
Stefano Ottolenghi	4f67ce485a	docs: Fix typo to render list (#20774 ) This _should_ fix the currently broken list in the [Neo4jVector page](https://python.langchain.com/docs/integrations/vectorstores/neo4jvector/). ![Screenshot from 2024-04-23 08-40-37](https://github.com/langchain-ai/langchain/assets/114478074/ab5ad622-879e-4764-93db-5f502eae479b)	2024-04-23 14:46:58 +00:00
Eugene Yurtsev	a2cc9b55ba	core[patch]: Remove autoupgrade to addable dict in Runnable/RunnableLambda/RunnablePassthrough transform (#20677 ) Causes an issue for this code ```python from langchain.chat_models.openai import ChatOpenAI from langchain.output_parsers.openai_tools import JsonOutputToolsParser from langchain.schema import SystemMessage prompt = SystemMessage(content="You are a nice assistant.") + "{question}" llm = ChatOpenAI( model_kwargs={ "tools": [ { "type": "function", "function": { "name": "web_search", "description": "Searches the web for the answer to the question.", "parameters": { "type": "object", "properties": { "query": { "type": "string", "description": "The question to search for.", }, }, }, }, } ], }, streaming=True, ) parser = JsonOutputToolsParser(first_tool_only=True) llm_chain = prompt | llm | parser | (lambda x: x) for chunk in llm_chain.stream({"question": "tell me more about turtles"}): print(chunk) # message = llm_chain.invoke({"question": "tell me more about turtles"}) # print(message) ``` Instead by definition, we'll assume that RunnableLambdas consume the entire stream and that if the stream isn't addable then it's the last message of the stream that's in the usable format. --- If users want to use addable dicts, they can wrap the dict in an AddableDict class. --- Likely, need to follow up with the same change for other places in the code that do the upgrade	2024-04-23 10:35:06 -04:00
Oleksandr Yaremchuk	9428923bab	experimental[minor]: upgrade the prompt injection model (#20783 ) - Description: In January, Laiyer.ai became part of ProtectAI, which means the model became owned by ProtectAI. In addition to that, yesterday, we released a new version of the model addressing issues the Langchain's community and others mentioned to us about false-positives. The new model has a better accuracy compared to the previous version, and we thought the Langchain community would benefit from using the [latest version of the model](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2). - Issue: N/A - Dependencies: N/A - Twitter handle: @alex_yaremchuk	2024-04-23 10:23:39 -04:00
Eugene Yurtsev	645b1e142e	core[minor],langchain[patch],community[patch]: Move InMemory and File implementations of Chat History to core (#20752 ) This PR moves the implementations for chat history to core. So it's easier to determine which dependencies need to be broken / add deprecation warnings	2024-04-23 10:22:11 -04:00
ccurme	7a922f3e48	core, openai: support custom token encoders (#20762 )	2024-04-23 13:57:05 +00:00
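
Commit c2d09a5186 above turns the sentence-splitting regex into a parameter while keeping `r"(?<=[.?!])\s+"` as the default. A minimal standard-library sketch of that idea follows; `split_sentences` and its `sentence_split_regex` parameter are hypothetical names chosen for illustration and are not the `SemanticChunker` API.

```python
import re
from typing import List

# Default pattern quoted in the commit message: split on whitespace
# that follows a period, question mark, or exclamation mark.
DEFAULT_SENTENCE_REGEX = r"(?<=[.?!])\s+"


def split_sentences(
    text: str, sentence_split_regex: str = DEFAULT_SENTENCE_REGEX
) -> List[str]:
    """Split text with a caller-supplied regex (hypothetical helper)."""
    return [piece for piece in re.split(sentence_split_regex, text) if piece]


text = "First point. Second point? Third; fourth"

# Default behaviour: the semicolon does not end a sentence.
default_pieces = split_sentences(text)

# Custom behaviour: also treat ';' as a sentence boundary.
custom_pieces = split_sentences(text, r"(?<=[.?!;])\s+")

print(default_pieces)
print(custom_pieces)
```

Because the default is unchanged, callers that pass no regex should see the same splits as before, which matches the backward-compatibility point made in the commit message.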

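Commit 2cd907ad7e above reports that a header line prefixed with `\ufeff` is not recognized as a header. The sketch below illustrates the failure mode and one possible remedy, dropping non-printable characters before matching. `find_header` is a hypothetical helper, not the `MarkdownHeaderTextSplitter` implementation, and the actual fix in the commit may differ.

```python
from typing import Optional


def find_header(line: str, strip_non_printable: bool) -> Optional[str]:
    """Return the text of a level-1 markdown header, or None (hypothetical helper)."""
    if strip_non_printable:
        # str.isprintable() is False for format characters such as the
        # byte order mark U+FEFF, so they are filtered out here.
        line = "".join(ch for ch in line if ch.isprintable())
    stripped = line.strip()  # strip() removes whitespace only, not U+FEFF
    if stripped.startswith("# "):
        return stripped[2:]
    return None


bom_line = "\ufeff# Foo"

# Without cleaning, the line starts with U+FEFF rather than '#'.
raw_result = find_header(bom_line, strip_non_printable=False)

# With cleaning, the header is detected.
clean_result = find_header(bom_line, strip_non_printable=True)

print(raw_result, clean_result)
```
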

8923 Commits