langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-10 01:10:59 +00:00

Author	SHA1	Message	Date
Leonid Kuligin	1862900078	google-genai[patch]: added parsing of function call / response (#17245 )	2024-02-08 13:34:46 -08:00
Cailin Wang	a210a8bc53	langchain[patch]: Fix create_retriever_tool missing on_retriever_end Document content (#16933 ) - Description: Fix the Document content missing from on_retriever_end in create_retriever_tool, caused by create_retriever_tool's missing callbacks parameter. - Twitter handle: @CailinWang_ --------- Co-authored-by: root <root@Bluedot-AI> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-08 13:18:43 -08:00
Kartheek Yakkala	3a22157d92	docs: Added LCEL for alibabacloud and anyscale (#17252 ) --------- Co-authored-by: KARTHEEK YAKKALA <kartheekyakkala@KARTHEEKs-Air.lan> Co-authored-by: KARTHEEK YAKKALA <kartheekyakkala.se@gmail.com>	2024-02-08 13:18:09 -08:00
Sparsh Jain	a2167614b7	google-genai[patch]: Invoke callback prior to yielding token (#17092 ) - Description: Invoke the callback prior to yielding a token in the stream and astream methods for google-genai. - Issue: #16913 - Twitter handle: Sparsh10649446 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-08 13:13:46 -08:00
Liang Zhang	7306600e2f	community[patch]: Support SerDe transform functions in Databricks LLM (#16752 )

Description: The Databricks LLM does not support SerDe of transform_input_fn and transform_output_fn, so the LLM is broken after saving and loading. This PR serializes these functions into a hex string using pickle and saves the hex string in the yaml file. Using pickle to serialize a function can be flaky, but this is a simple workaround that unblocks many use cases. If more sophisticated SerDe is needed, we can improve it later.

Test: Added a simple unit test. I did a manual test on Databricks and it works well. The saved yaml looks like:

```
llm:
  _type: databricks
  cluster_driver_port: null
  cluster_id: null
  databricks_uri: databricks
  endpoint_name: databricks-mixtral-8x7b-instruct
  extra_params: {}
  host: e2-dogfood.staging.cloud.databricks.com
  max_tokens: null
  model_kwargs: null
  n: 1
  stop: null
  task: null
  temperature: 0.0
  transform_input_fn: 80049520000000000000008c085f5f6d61696e5f5f948c0f7472616e73666f726d5f696e7075749493942e
  transform_output_fn: null
```

@baskaryan

```python
from langchain_community.embeddings import DatabricksEmbeddings
from langchain_community.llms import Databricks
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
import mlflow

embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")


def transform_input(**request):
    request["messages"] = [
        {
            "role": "user",
            "content": request["prompt"]
        }
    ]
    del request["prompt"]
    return request


llm = Databricks(endpoint_name="databricks-mixtral-8x7b-instruct", transform_input_fn=transform_input)

persist_dir = "faiss_databricks_embedding"

# Create the vector db, persist the db to a local fs folder
loader = TextLoader("state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
db = FAISS.from_documents(docs, embeddings)
db.save_local(persist_dir)


def load_retriever(persist_directory):
    embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
    vectorstore = FAISS.load_local(persist_directory, embeddings)
    return vectorstore.as_retriever()


retriever = load_retriever(persist_dir)
retrievalQA = RetrievalQA.from_llm(llm=llm, retriever=retriever)
with mlflow.start_run() as run:
    logged_model = mlflow.langchain.log_model(
        retrievalQA,
        artifact_path="retrieval_qa",
        loader_fn=load_retriever,
        persist_dir=persist_dir,
    )

# Load the retrievalQA chain
loaded_model = mlflow.pyfunc.load_model(logged_model.model_uri)
print(loaded_model.predict([{"query": "What did the president say about Ketanji Brown Jackson"}]))
```
	2024-02-08 13:09:50 -08:00
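The pickle-to-hex approach described in #16752 can be illustrated with the standard library alone. The sketch below is not the Databricks LLM code: `function_to_hex` and `function_from_hex` are hypothetical helper names, and `json.dumps` stands in for a user-defined transform function. It also inspects the `transform_input_fn` hex string from the saved yaml above, which appears to contain only a module name and a function name. That is one reason pickling a function can be flaky: the function must be importable under the same name when the file is loaded.

```python
import json
import pickle


def function_to_hex(fn):
    """Pickle a callable and return the bytes as a hex string (hypothetical helper)."""
    return pickle.dumps(fn).hex()


def function_from_hex(hex_string):
    """Inverse of function_to_hex. Only use on trusted data: unpickling can run code."""
    return pickle.loads(bytes.fromhex(hex_string))


# The transform_input_fn value from the saved yaml in the commit message.
saved = (
    "80049520000000000000008c085f5f6d61696e5f5f948c0f"
    "7472616e73666f726d5f696e7075749493942e"
)
raw = bytes.fromhex(saved)

# The payload stores a reference (module name + function name), not the function body.
has_module_name = b"__main__" in raw
has_function_name = b"transform_input" in raw

# Round trip with an importable function (json.dumps is only a stand-in here).
restored = function_from_hex(function_to_hex(json.dumps))
```

Because the payload is a reference, loading it in a process where `__main__.transform_input` does not exist would fail, so the saved file is only portable together with the code that defines the function.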
cjpark-data	ce22e10c4b	community[patch]: Fix KeyError 'embedding' (MongoDBAtlasVectorSearch) (#17178 ) - Description: The embedding field name was hard-coded as "embedding". This PR changes `res["embedding"]` to `res[self._embedding_key]`. - Issue: #17177, - Twitter handle: [@bagcheoljun17](https://twitter.com/bagcheoljun17)	2024-02-08 12:06:42 -08:00
Neli Hateva	9bb5157a3d	langchain[patch], community[patch]: Fixes in the Ontotext GraphDB Graph and QA Chain (#17239 ) - Description: Fixes in the Ontotext GraphDB Graph and QA Chain related to the error handling in case of invalid SPARQL queries, for which `prepareQuery` doesn't throw an exception, but the server returns 400 and the query is indeed invalid - Issue: N/A - Dependencies: N/A - Twitter handle: @OntotextGraphDB	2024-02-08 12:05:43 -08:00
ByeongUk Choi	b88329e9a5	community[patch]: Implement Unique ID Enforcement in FAISS (#17244 ) Description: Implemented unique ID validation in the FAISS component to ensure all document IDs are distinct. This update resolves issues related to non-unique IDs, such as inconsistent behavior during deletion processes.	2024-02-08 12:03:33 -08:00
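A minimal sketch of the kind of unique-ID validation #17244 describes, assuming the IDs arrive as an optional list of strings. `check_unique_ids` is a hypothetical helper written for illustration, not the function in the FAISS integration.

```python
from collections import Counter
from typing import List, Optional


def check_unique_ids(ids: Optional[List[str]]) -> None:
    """Raise ValueError if the given document IDs are not all distinct."""
    if not ids:
        return  # nothing to validate when no IDs were supplied
    duplicates = sorted(i for i, count in Counter(ids).items() if count > 1)
    if duplicates:
        raise ValueError(f"Duplicate ids found: {duplicates}")


check_unique_ids(["doc-1", "doc-2"])  # passes silently

try:
    check_unique_ids(["doc-1", "doc-2", "doc-1"])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
```

Rejecting duplicates at insertion time keeps a later delete-by-ID unambiguous, which is the inconsistency the commit message mentions.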
Jorge Campo	88609565a3	docs: Fix typo in github.ipynb (#17259 ) 'agiven' -> 'a given'	2024-02-08 12:03:00 -08:00
Bagatur	852973d616	langchain[minor], core[minor]: update json, pydantic parser. add openai-json structured output runnable (#16914 )	2024-02-08 11:59:06 -08:00
hsuyuming	e22c4d4eb0	google-vertexai[patch]: fix _parse_response_candidate issue (#16647 )

Description: enable _parse_response_candidate to support complex structure formats.

Issue: currently, if Gemini responds with a complex args format, people will get a "TypeError: Object of type RepeatedComposite is not JSON serializable" error from _parse_response_candidate.

Response candidate example:

```
content {
  role: "model"
  parts {
    function_call {
      name: "Information"
      args {
        fields {
          key: "people"
          value {
            list_value {
              values {
                string_value: "Joe is 30, his mom is Martha"
              }
            }
          }
        }
      }
    }
  }
}
finish_reason: STOP
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}
```

Error message:

```
Traceback (most recent call last):
  File "/home/jupyter/user/abehsu/gemini_langchain_tools/example2.py", line 36, in <module>
    print(tagging_chain.invoke({"input": "Joe is 30, his mom is Martha"}))
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2053, in invoke
    input = step.invoke(
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3887, in invoke
    return self.bound.invoke(
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 165, in invoke
    self.generate_prompt(
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 543, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 407, in generate
    raise e
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 397, in generate
    self._generate_with_cache(
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 576, in _generate_with_cache
    return self._generate(
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_google_vertexai/chat_models.py", line 406, in _generate
    generations = [
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_google_vertexai/chat_models.py", line 408, in <listcomp>
    message=_parse_response_candidate(c),
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_google_vertexai/chat_models.py", line 280, in _parse_response_candidate
    function_call["arguments"] = json.dumps(
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type RepeatedComposite is not JSON serializable
```

Twitter handle: @abehsu1992626
	2024-02-08 11:48:25 -08:00
Erick Friis	d77bb7b4e9	google-vertexai[patch]: integration test fix, release 0.0.5 (#17258 )	2024-02-08 11:45:33 -08:00
Aditya	98176ac982	langchain_google_vertexai : added logic to override get_num_tokens_from_messages() for ChatVertexAI (#16784 ) - Description: added logic to override get_num_tokens_from_messages() for ChatVertexAI. Previously, ChatVertexAI inherited get_num_tokens_from_messages() from BaseChatModel, which in turn called the GPT-2 tokenizer - Issue: NA - Dependencies: NA - Twitter handle: @aditya_rane @lkuligin for review --------- Co-authored-by: adityarane@google.com <adityarane@google.com> Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>	2024-02-08 11:30:42 -08:00
Bagatur	00a09e1b71	docs: use PromptTemplate.from_template (#17218 )

Ran

```python
import glob
import re


def update_prompt(x):
    return re.sub(
        r"(?P<start>\b)PromptTemplate\(template=(?P<template>.*), input_variables=(?:.*)\)",
        r"\g<start>PromptTemplate.from_template(\g<template>)",
        x
    )


for fn in glob.glob("docs/**/*", recursive=True):
    try:
        content = open(fn).readlines()
    except:
        continue
    content = [update_prompt(l) for l in content]
    with open(fn, "w") as f:
        f.write("".join(content))
```
	2024-02-07 19:52:42 -08:00
sana-google	7f55c95790	docs: add missing link to Quickstart (#17085 ) - Description: Added missing link for Quickstart in Model IO documentation, - Issue: N/A, - Dependencies: N/A, - Twitter handle: N/A Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-07 22:26:10 -05:00
Bassem Yacoube	4e3ed7f043	community[patch]: octoai embeddings bug fix (#17216 ) Fixes a bug in the octoai_embeddings provider	2024-02-07 22:25:52 -05:00
Eugene Yurtsev	780e84ae79	community[minor]: SQLDatabase Add fetch mode `cursor`, query parameters, query by selectable, expose execution options, and documentation (#17191 ) - Description: Improve `SQLDatabase` adapter component to promote code re-use, see [suggestion](https://github.com/langchain-ai/langchain/pull/16246#pullrequestreview-1846590962). - Needed by: GH-16246 - Addressed to: @baskaryan, @cbornet ## Details - Add `cursor` fetch mode - Accept SQL query parameters - Accept both `str` and SQLAlchemy selectables as query expression - Expose `execution_options` - Documentation page (notebook) about `SQLDatabase` [^1] See [About SQLDatabase](https://github.com/langchain-ai/langchain/blob/c1c7b763/docs/docs/integrations/tools/sql_database.ipynb). [^1]: Apparently there hasn't been any yet? --------- Co-authored-by: Andreas Motl <andreas.motl@crate.io>	2024-02-07 22:23:43 -05:00
Tomaz Bratanic	7e4b676d53	community[patch]: Better error propagation for neo4jgraph (#17190 ) There are other errors that could happen when refreshing the schema, so we want to propagate specific errors for more clarity	2024-02-07 22:16:14 -05:00
Leonid Ganeline	d903fa313e	docs: titles fix (#17206 ) Several notebooks have Title != file name. That results in corrupted sorting in Navbar (ToC). - Fixed titles and file names. - Changed text formats to the consistent form - Redirected renamed files in the `Vercel.json`	2024-02-07 22:09:34 -05:00
Luiz Ferreira	34d2daffb3	community[patch]: Fix chat openai unit test (#17124 ) - Description: Actually the test named `test_openai_apredict` isn't testing the apredict method from ChatOpenAI. - Twitter handle: https://twitter.com/OAlmofadas	2024-02-07 22:08:26 -05:00
Dmitry Kankalovich	f92738a6f6	langchain[minor], community[minor], core[minor]: Async Cache support and AsyncRedisCache (#15817 ) * This PR adds async methods to the LLM cache. * Adds an implementation using Redis called AsyncRedisCache. * Adds a Docker Compose file under /docker to help spin up Docker * Updates the Redis tests to use a context manager so flushing always happens by default	2024-02-07 22:06:09 -05:00
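To make "async methods on the LLM cache" concrete, here is a small in-memory sketch using only `asyncio`. It is not AsyncRedisCache: the class `InMemoryAsyncCache`, its method names `alookup`, `aupdate` and `aclear`, and the helper `cached_call` are assumptions chosen for illustration, and the fake model call is a stand-in for a real LLM request.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional, Tuple


class InMemoryAsyncCache:
    """Toy async cache keyed by (prompt, llm_string); all names are hypothetical."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    async def alookup(self, prompt: str, llm_string: str) -> Optional[str]:
        return self._store.get((prompt, llm_string))

    async def aupdate(self, prompt: str, llm_string: str, value: str) -> None:
        self._store[(prompt, llm_string)] = value

    async def aclear(self) -> None:
        self._store.clear()


async def cached_call(
    cache: InMemoryAsyncCache,
    prompt: str,
    llm_string: str,
    generate: Callable[[str], Awaitable[str]],
) -> str:
    """Return the cached answer if present; otherwise generate and store it."""
    hit = await cache.alookup(prompt, llm_string)
    if hit is not None:
        return hit
    result = await generate(prompt)
    await cache.aupdate(prompt, llm_string, result)
    return result


calls = 0


async def fake_generate(prompt: str) -> str:
    """Stand-in for a model call; counts how often it is actually invoked."""
    global calls
    calls += 1
    return prompt.upper()


async def demo() -> Tuple[str, str]:
    cache = InMemoryAsyncCache()
    first = await cached_call(cache, "hello", "fake-llm", fake_generate)
    second = await cached_call(cache, "hello", "fake-llm", fake_generate)
    return first, second


first_answer, second_answer = asyncio.run(demo())
```

The second call is served from the cache, so the stand-in model runs only once. A Redis-backed version would replace the dictionary operations with awaited network calls.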
Harrison Chase	19546081c6	templates: add gemini functions agent (#17141 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 17:27:01 -08:00
Bagatur	aeb6b38901	docs: cleanup fleet integration (#17214 ) Causing search issues	2024-02-07 17:18:48 -08:00
Erick Friis	4153837502	google-genai[patch]: release 0.0.7 (#17193 )	2024-02-07 17:15:09 -08:00
Erick Friis	927ab77d6e	google-genai[patch]: no error for FunctionMessage (#17215 ) Both should eventually match this: https://github.com/langchain-ai/langchain/blob/master/libs/partners/google-vertexai/langchain_google_vertexai/chat_models.py#L179 But seems undocumented / can't find types in genai package	2024-02-07 17:14:50 -08:00
Erick Friis	2ecf318218	google-genai[patch]: match function call interface (#17213 ) should match vertex	2024-02-07 17:07:31 -08:00
Erick Friis	e17173c403	google-vertexai[patch]: function calling integration test (#17209 )	2024-02-07 15:49:56 -08:00
Erick Friis	52be84a603	google-vertexai[patch]: serializable citation metadata, release 0.0.4 (#17145 ) was breaking in langserve before	2024-02-07 15:47:32 -08:00
Nuno Campos	19ff81e74f	Fix stream events/log with some kinds of non addable output (#17205 )	2024-02-07 15:46:13 -08:00
Bagatur	6f1403b9b6	community[patch]: Release 0.0.19 (#17207 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 15:37:01 -08:00
Erick Friis	a13dc47a08	cli[patch]: copyright 2024 default (#17204 )	2024-02-07 14:52:37 -08:00
Bagatur	00757567ba	core[patch]: Release 0.1.21 (#17202 )	2024-02-07 14:20:20 -08:00
Bagatur	af74301ab9	core[patch], community[patch]: link extraction continue on failure (#17200 )	2024-02-07 14:15:30 -08:00
Henry	2281f00198	langchain: Standardize `output_parser.py` across all agent types for custom `FORMAT_INSTRUCTIONS` (#17168 ) - Description: This PR standardizes the `output_parser.py` file across all agent types to ensure a uniform parsing mechanism is implemented. It introduces a cohesive structure and common interface for output parsing, facilitating easier modifications and extensions by users. The standardized approach enhances maintainability and scalability of the codebase by providing a consistent pattern for output parsing, which can be easily understood and utilized across different agent types. This PR builds upon the foundation set by a previously merged PR, which focused exclusively on standardizing the `output_parser.py` for the `conversational_agent` ([PR #16945](https://github.com/langchain-ai/langchain/pull/16945)). With this new update, I extend the standardization efforts to encompass `output_parser.py` files across all agent types. This enhancement not only unifies the parsing mechanism across the board but also introduces the flexibility for users to incorporate custom `FORMAT_INSTRUCTIONS`. - Issue: https://github.com/langchain-ai/langchain/issues/10721 https://github.com/langchain-ai/langchain/issues/4044 - Dependencies: No new dependencies required for this change - Twitter handle: With my github user is enough. Thanks I hope you accept my PR.	2024-02-07 13:46:17 -08:00
Erick Friis	1cf5a5858f	remove pg_essay.txt (#17198 ) Added in #16159	2024-02-07 12:58:01 -08:00
Tomaz Bratanic	ecf8042a10	templates: Add neo4j semantic layer with ollama template (#17192 ) A template with JSON-based agent using Mixtral via Ollama. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 12:50:54 -08:00
Erick Friis	f87acf0340	infra: better conditional (#17197 )	2024-02-07 12:49:02 -08:00
Erick Friis	4ae91733aa	infra: fix core release (#17195 ) core doesn't have any min deps to test	2024-02-07 12:35:27 -08:00
Bagatur	78409634fe	core[patch]: Release 0.1.20 (#17194 )	2024-02-07 12:28:05 -08:00
Nuno Campos	65798289a4	core[minor]: Use batched tracing in sdk (#16305 ) Remove threadpool executor usage in langchain tracer, this is now handled by sdk	2024-02-07 12:10:58 -08:00
chyroc	f87b38a559	google-genai[minor]: support functions call (#15146 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 12:09:30 -08:00
Tomaz Bratanic	302989a2b1	allow optional newline in the action responses of JSON Agent parser (#17186 ) Based on my experiments, the newline isn't always there, so we can make the regex slightly more robust by allowing an optional newline after the backticks	2024-02-07 10:26:14 -08:00
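The idea in #17186 can be sketched with a small regex. The pattern below is an illustrative assumption, not the pattern used in the JSON Agent parser: it accepts a fenced JSON blob whether or not a newline follows the opening backticks. The fence is built with `` "`" * 3 `` and matched with a `{3}` quantifier so the example does not contain a literal triple backtick.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically

# Hypothetical pattern: opening fence, optional "json" tag, OPTIONAL newline,
# then the payload (captured lazily) up to the closing fence.
ACTION_PATTERN = re.compile(r"`{3}(?:json)?\n?(.*?)`{3}", re.DOTALL)


def parse_action(text):
    """Extract and decode the JSON payload from a fenced block in the model output."""
    match = ACTION_PATTERN.search(text)
    if match is None:
        raise ValueError("no fenced action block found")
    return json.loads(match.group(1).strip())


with_newline = f'Action:\n{FENCE}json\n{{"action": "search", "action_input": "neo4j"}}\n{FENCE}'
without_newline = f'Action:\n{FENCE}{{"action": "search", "action_input": "neo4j"}}{FENCE}'

parsed_with_newline = parse_action(with_newline)
parsed_without_newline = parse_action(without_newline)
```

With a mandatory `\n` after the opening fence, the second input would not match at all. Making the newline optional lets both outputs parse to the same action.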
William FH	9fa07076da	Add trace_as_chain_group metadata (#17187 )	2024-02-07 09:42:44 -08:00
Leonid Ganeline	5ceaf784f3	docs `Integrations/Components` menu reordered (#17151 ) This PR is opinionated. - Moved the `Embedding models` item to after `LLMs` and `Chat model`, so all items with models are together. - Renamed `Text embedding models` to `Embedding models`. Now, it is shorter and easier to read. `Text` is obvious from context. The same as the `Text LLMs` vs. `LLMs` (we also have multi-modal LLMs).	2024-02-06 20:33:41 -08:00
Leonid Ganeline	0af0fc5d25	docs `integrations/providers` nav fix (#17148 ) Issue: the `Providers` page is presented both as the index page (on the `Providers` item) and as the `Providers/Providers` item. The latter should not be in the menu. See the picture. ![image](https://github.com/langchain-ai/langchain/assets/2256422/6894023f-f13a-4f0d-8fe2-ed5b0ae2bdd2) This PR fixes this.	2024-02-06 20:33:14 -08:00
Leonid Ganeline	bf55279d39	docs: tutorials update (#17132 ) Added the course and the one-pager links	2024-02-06 20:30:30 -08:00
Erick Friis	f499a222de	infra: release min version debugging 2 (#17152 )	2024-02-06 18:20:19 -08:00
Erick Friis	deb02de051	infra: release min version debugging (#17150 )	2024-02-06 18:10:37 -08:00
Erick Friis	9710346095	infra: poetry run min versions 2 (#17149 )	2024-02-06 17:57:43 -08:00
Erick Friis	181a033226	infra: poetry run min versions (#17146 )	2024-02-06 17:37:36 -08:00

7532 Commits