langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-13 19:10:52 +00:00

Author	SHA1	Message	Date
Prithvi Kannan	0433b114bb	docs: Add databricks-langchain package consolidation notice (#27703 ) Add notice of upcoming package consolidation of `langchain-databricks` into `databricks-langchain`. Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-29 22:00:27 +00:00
Zapiron	447c0dd2f0	docs: Fixed Grammar & Improve reading (#27672 ) Updated the documentation to fix some grammar errors. - Description: Some language errors existed in the documentation; the structure of some sentences was changed.	2024-10-29 20:19:00 +00:00
Soham Das	913ff1b152	docs: fix typo in query analysis documentation (#27721 ) PR Title: `docs: fix typo in query analysis documentation` Description: This PR corrects a typo on line 68 in the query analysis documentation, changing "pharsings" to "phrasings" for clarity and accuracy. Only one instance of the typo was fixed in the last merge, and this PR fixes the second instance. Issue: N/A Dependencies: None Additional Notes: No functional changes were made; this is a documentation fix only.	2024-10-29 16:15:37 -04:00
Erick Friis	8396ca2990	docs: redis in api docs (#27722 )	2024-10-29 20:13:53 +00:00
Mateusz Szewczyk	0606aabfa3	docs: Added WatsonxRerank documentation (#27424 ) Changes: - docs: Added `WatsonxRerank` documentation - docs: Updated `WatsonxEmbeddings` with docs template - docs: Updated `ChatWatsonx` with docs template - docs: Updated `WatsonxLLM` with docs template - docs: Added `ChatWatsonx` to the list of chat model providers. Added [test_chat_models_standard](https://github.com/langchain-ai/langchain-ibm/blob/main/libs/ibm/tests/integration_tests/test_chat_models_standard.py) to the `langchain_ibm` test suite. - docs: Added `IBM` to the list of embedding model providers. Added [test_embeddings_standard](https://github.com/langchain-ai/langchain-ibm/blob/main/libs/ibm/tests/integration_tests/test_embeddings_standard.py) to the `langchain_ibm` test suite. - docs: Updated the recommended `langchain_ibm` versions compatible with `LangChain v0.3` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-29 16:57:47 +00:00
Zapiron	9ccd4a6ffb	DOC: Tutorial Section Updates (#27675 ) Edited various notebooks in the tutorial section to: * Fix grammatical errors * Improve readability by changing the sentence structure or removing repeated words that bear the same meaning * Make a code block follow the PEP 8 standard * Add more information to some sentences to make the concepts clearer and reduce ambiguity --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-10-29 14:51:34 +00:00
Harsimran-19	c1d8c33df6	core: JsonOutputParser UTF characters bug (#27306 ) Description: This PR fixes an issue where non-ASCII characters in Pydantic field descriptions were being escaped to their Unicode representations when using `JsonOutputParser`. The change allows non-ASCII characters to be preserved in the output, which is especially important for multilingual support and when working with non-English languages. Issue: Fixes #27256 Example Code: ```python from pydantic import BaseModel, Field from langchain_core.output_parsers import JsonOutputParser class Article(BaseModel): title: str = Field(description="科学文章的标题") output_data_structure = Article parser = JsonOutputParser(pydantic_object=output_data_structure) print(parser.get_format_instructions()) ``` Previous Output: ```... "title": {"description": "\\u79d1\\u5b66\\u6587\\u7ae0\\u7684\\u6807\\u9898", "title": "Title", "type": "string"}} ...``` Current Output: ```... "title": {"description": "科学文章的标题", "title": "Title", "type": "string"}} ...``` Changes made: - Modified `json.dumps()` call in `langchain_core/output_parsers/json.py` to use `ensure_ascii=False` - Added a unit test to verify Unicode handling Co-authored-by: Harsimran-19 <harsimran1869@gmail.com>	2024-10-29 14:48:53 +00:00
Andrew Effendi	49517cc1e7	partners/huggingface[patch]: fix HuggingFacePipeline model_id parameter (#27514 ) Description: Fixes issue with model parameter not getting initialized correctly when passing transformers pipeline Issue: https://github.com/langchain-ai/langchain/issues/25915	2024-10-29 14:34:46 +00:00
Jeong-Minju	0a465b8032	docs: Fix typo in _action_agent docs section (#27698 ) PR Title: docs: Fix typo in _action_agent function docs section Description: On line 1185, in the docs of the `_action_agent` function, changes ".agent" to "self.agent". Issue: N/A Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-10-29 14:16:42 +00:00
Soham Das	c3021e9322	docs: fix typo in query analysis documentation (#27697 ) PR Title: `docs: fix typo in query analysis documentation` Description: This PR corrects a typo on line 68 in the query analysis documentation, changing "pharsings" to "phrasings" for clarity and accuracy. Issue: N/A Dependencies: None Additional Notes: No functional changes were made; this is a documentation fix only.	2024-10-29 14:07:22 +00:00
Neil Vachharajani	eec35672a4	core[patch]: Improve type checking for the tool decorator (#27460 ) Description: When annotating a function with the @tool decorator, the symbol should have type BaseTool. The previous type annotations did not convey that to type checkers. This patch creates 4 overloads for the tool function for the 4 different use cases. 1. @tool decorator with no arguments 2. @tool decorator with only keyword arguments 3. @tool decorator with a name argument (and possibly keyword arguments) 4. Invoking tool as function with a name and runnable positional arguments The main function is updated to match the overloads. The changes are 100% backwards compatible (all existing calls should continue to work, just with better type annotations). Twitter handle: @nvachhar --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-29 13:59:56 +00:00
Erick Friis	94e5765416	docs: packages in homepage (#27693 )	2024-10-28 20:44:30 +00:00
Erick Friis	583808a7b8	partners/huggingface: release 0.1.1 (#27691 )	2024-10-28 13:39:38 -07:00
Erick Friis	6d524e9566	partners/box: release 0.2.2 (#27690 )	2024-10-28 12:54:20 -07:00
yahya-mouman	6803cb4f34	openai[patch]: add check for none values when summing token usage (#27585 ) Description: Fixes errors caused by adding `None` values when an empty value is passed in while summing token usage.	2024-10-28 12:49:43 -07:00
Bagatur	ede953d617	openai[patch]: fix schema formatting util (#27685 )	2024-10-28 15:46:47 +00:00
Baptiste Pasquier	440c162b8b	community: Fix closed session in Infinity (#26933 ) Description: The `aiohttp.ClientSession` is closed at the end of the with statement, which causes an error during a second call. The implemented fix is to define the session directly within the with block, exactly like in the textembed code: `c6350d636e/libs/community/langchain_community/embeddings/textembed.py (L335-L346)` Issue: Fix #26932 Co-authored-by: ccurme <chester.curme@gmail.com>	2024-10-27 11:37:21 -04:00
Jorge Piedrahita Ortiz	8895d468cb	community: sambastudio llm refactor (#27215 ) Description: - Sambastudio LLM refactor - Sambastudio openai compatible API support added - docs updated	2024-10-27 11:08:15 -04:00
ccurme	fe87e411f2	groq: fix unit test (#27660 )	2024-10-26 14:57:23 -04:00
Erick Friis	cdb4b1980a	docs: reorganize contributing docs (#27649 )	2024-10-25 22:41:54 +00:00
Erick Friis	fbfc6bdade	core: test runner improvements (#27654 ) When running core tests locally, this: - prevents LangSmith tracing from being enabled by env vars - prevents network calls	2024-10-25 15:06:59 -07:00
Gabriel Faundez	ef27ce7a45	docs: add missing import for tools docs (#27650 ) ## Description Added missing import from `pydantic` in the tools docs	2024-10-25 21:14:40 +00:00
Vincent Min	7bc4e320f1	core[patch]: improve performance of InMemoryVectorStore (#27538 ) Description: This PR improves the performance of the InMemoryVectorStore. Issue: Originally, similarity was computed document by document: ``` for doc in self.store.values(): vector = doc["vector"] similarity = float(cosine_similarity([embedding], [vector]).item(0)) ``` This is inefficient and does not make use of numpy vectorization. This PR computes the similarity in one vectorized go: ``` docs = list(self.store.values()) similarity = cosine_similarity([embedding], [doc["vector"] for doc in docs]) ``` Dependencies: None Twitter handle: @b12_consulting, @Vincent_Min --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-25 17:07:04 -04:00
Bagatur	d5306899d3	openai[patch]: Release 0.2.4 (#27652 )	2024-10-25 20:26:21 +00:00
Erick Friis	247d6bb09d	infra: test doc imports 3.12 (#27653 )	2024-10-25 13:23:06 -07:00
Erick Friis	600b7bdd61	all: test 3.13 ci (#27197 ) Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-10-25 12:56:58 -07:00
Bagatur	06df15c9c0	core[patch]: Release 0.3.13 (#27651 )	2024-10-25 19:22:44 +00:00
Erick Friis	2683f814f4	docs: contributing index page (#27647 )	2024-10-25 17:06:55 +00:00
Rashmi Pawar	83eebf549f	docs: Add NVIDIA as provider in v3 integrations (#27254 ) ### Add NVIDIA as provider in langchain v3 integrations cc: @sumitkbh @mattf @dglogo --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-25 16:21:22 +00:00
Steve Moss	24605bcdb6	community[patch]: Fix missing protected_namespaces(). (#27610 ) - Description: Fixes warning messages raised due to missing `protected_namespaces` parameter in `ConfigDict`. - Issue: https://github.com/langchain-ai/langchain/issues/27609 - Dependencies: No dependencies - Twitter handle: @gawbul	2024-10-25 02:16:26 +00:00
Eugene Yurtsev	7667ee126f	core: remove mustache in extended deps (#27629 ) Remove mustache from extended deps -- we vendor the mustache implementation	2024-10-24 22:12:49 -04:00
Erick Friis	265e0a164a	core: add flake8-bandit (S) ruff rules to core (#27368 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-24 22:33:41 +00:00
hippopond	bcff458ae3	DOC: Added notes in ipynb file to advise user to upgrade package langchain_openai. For issue: https://github.com/langchain-ai/langchain/issues/26616 (#27621 ) Added notes from the issue report advising the user to upgrade `langchain_openai`. Issue: https://github.com/langchain-ai/langchain/issues/26616 --------- Co-authored-by: Libby Lin <libbylin@Libbys-MacBook-Pro.local> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 21:54:12 +00:00
Nithish Raghunandanan	0623c74560	couchbase: Add document id to vector search results (#27622 ) Description: Returns the document id along with the Vector Search results Issue: Fixes https://github.com/langchain-ai/langchain/issues/26860 for CouchbaseVectorStore Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 21:47:36 +00:00
ZhangShenao	455ab7d714	Improvement[Community] Improve Document Loaders and Splitters (#27568 ) - Fix word spelling error - Add static method decorator - Fix language splitter Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 21:42:16 +00:00
Ed Branch	7345470669	docs: add aws support to how-to-guides (#27450 ) This PR adds support to the how-to documentation for using AWS Bedrock and SageMaker Endpoints. Because the AWS services above don't presently use API keys to access LLMs, I've amended more of the source code than would normally be expected. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 14:23:32 -07:00
CLOVA Studio 개발	846a75284f	community: Add Naver chat model & embeddings (#25162 ) Reopened as a personal repo outside the organization. ## Description - Naver HyperCLOVA X community package - Add chat model & embeddings - Add unit test & integration test - Add chat model & embeddings docs - This PR changes the partner package (https://github.com/langchain-ai/langchain/pull/24252) to a community package - Could these embeddings (https://github.com/langchain-ai/langchain/pull/21890) be deprecated? We are trying to replace them with the embedding model (`ClovaXEmbeddings`) in this PR. Twitter handle: None. (If needed, contact joonha.jeon@navercorp.com) --- You can check our previous discussion below: > one question on namespaces - would it make sense to have these in .clova namespaces instead of .naver? I would like to keep it as is, unless it is essential to unify the package name. (ClovaX is a branding for the model, and I plan to add other models and components. They need to be managed as separate classes.) > also, could you clarify the difference between ClovaEmbeddings and ClovaXEmbeddings? Three embedding models are currently served, and all are supported in the current PR. In addition, all the functionality of CLOVA Studio that serves actual models, such as distinguishing between test apps and service apps, is supported. The existing PR does not support this because it is hard-coded. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Vadym Barda <vadym@langchain.dev>	2024-10-24 20:54:13 +00:00
Hyejun An	6227396e20	partners/HuggingFacePipeline[stream]: Change to use `pipeline` instead of `pipeline.model.generate` in stream() (#26531 ) ## Description I encountered an error while using the `gemma-2-2b-it` model with the `HuggingFacePipeline` class and have implemented a fix to resolve this issue. ### What is the Problem ```python model_id="google/gemma-2-2b-it" gemma_2_model = AutoModelForCausalLM.from_pretrained(model_id) gemma_2_tokenizer = AutoTokenizer.from_pretrained(model_id) gen = pipeline( task='text-generation', model=gemma_2_model, tokenizer=gemma_2_tokenizer, max_new_tokens=1024, device=0 if torch.cuda.is_available() else -1, temperature=.5, top_p=0.7, repetition_penalty=1.1, do_sample=True, ) llm = HuggingFacePipeline(pipeline=gen) for chunk in llm.stream("Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World."): print(chunk, end="", flush=True) ``` This code outputs the following error message: ``` /usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1258: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation. 
warnings.warn( Exception in thread Thread-19 (generate): Traceback (most recent call last): File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner self.run() File "/usr/lib/python3.10/threading.py", line 953, in run self._target(*self._args, **self._kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context return func(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1874, in generate self._validate_generated_length(generation_config, input_ids_length, has_default_max_length) File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1266, in _validate_generated_length raise ValueError( ValueError: Input length of input_ids is 31, but `max_length` is set to 20. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`. ``` In addition, the following error occurs when the number of tokens is reduced. ```python for chunk in llm.stream("Hello World"): print(chunk, end="", flush=True) ``` ``` /usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1258: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation. warnings.warn( /usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1885: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`. 
warnings.warn( Exception in thread Thread-20 (generate): Traceback (most recent call last): File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner self.run() File "/usr/lib/python3.10/threading.py", line 953, in run self._target(*self._args, **self._kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context return func(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2024, in generate result = self._sample( File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2982, in _sample outputs = self(**model_inputs, return_dict=True) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl return forward_call(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/transformers/models/gemma2/modeling_gemma2.py", line 994, in forward outputs = self.model( File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl return forward_call(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/transformers/models/gemma2/modeling_gemma2.py", line 803, in forward inputs_embeds = self.embed_tokens(input_ids) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl return self._call_impl(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl return forward_call(*args, **kwargs) File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 164, in forward return F.embedding( File 
"/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2267, in embedding return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse) RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select) ``` On the other hand, in the case of invoke, the output is normal: ``` llm.invoke("Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World.") ``` ``` 'Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World.\n\nThis is a simple program that prints the phrase "Hello World" to the console. \n\nHere\'s how it works:*\n\n `print("Hello World")`: This line of code uses the `print()` function, which is a built-in function in most programming languages (like Python). The `print()` function takes whatever you put inside its parentheses and displays it on the screen.\n* `"Hello World"`: The text within the double quotes (`"`) is called a string. It represents the message we want to print.\n\n\nLet me know if you\'d like to explore other programming concepts or see more examples! \n' ``` ### Problem Analysis - Apparently, I put kwargs in while generating pipelines and it applied to `invoke()`, but it's not applied in the `stream()`. - When using the stream, `inputs = self.pipeline.tokenizer (prompt, return_tensors = "pt")` enters cpu. - This can crash when the model is in gpu. ### Solution Just use `self.pipeline` instead of `self.pipeline.model.generate`. 
- Original Code ```python stopping_criteria = StoppingCriteriaList([StopOnTokens()]) inputs = self.pipeline.tokenizer(prompt, return_tensors="pt") streamer = TextIteratorStreamer( self.pipeline.tokenizer, timeout=60.0, skip_prompt=skip_prompt, skip_special_tokens=True, ) generation_kwargs = dict( inputs, streamer=streamer, stopping_criteria=stopping_criteria, **pipeline_kwargs, ) t1 = Thread(target=self.pipeline.model.generate, kwargs=generation_kwargs) t1.start() ``` - Updated Code ```python stopping_criteria = StoppingCriteriaList([StopOnTokens()]) streamer = TextIteratorStreamer( self.pipeline.tokenizer, timeout=60.0, skip_prompt=skip_prompt, skip_special_tokens=True, ) generation_kwargs = dict( text_inputs=prompt, streamer=streamer, stopping_criteria=stopping_criteria, **pipeline_kwargs, ) t1 = Thread(target=self.pipeline, kwargs=generation_kwargs) t1.start() ``` By using the `pipeline` directly, the `kwargs` of the pipeline are applied, and there is no need to consider the `device` of the tensor made with the tokenizer. > In line with the change to use `pipeline`, `text_inputs=prompt` is now passed directly in `generation_kwargs`. ## Issue None ## Dependencies None ## Twitter handle None --------- Co-authored-by: Vadym Barda <vadym@langchain.dev>	2024-10-24 16:49:43 -04:00
Bagatur	655ced84d7	openai[patch]: accept json schema response format directly (#27623 ) fix #25460 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 18:19:15 +00:00
Tibor Reiss	20b56a0233	core[patch]: fix repr and str for Serializable (#26786 ) Fixes #26499 --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-10-24 08:36:35 -07:00
Adarsh Sahu	2d58a8a08d	docs: Update structured_outputs.mdx (#27613 ) `strightforward` => `straightforward` `adavanced` => `advanced` `There a few challenges` => `There are a few challenges` Documentation Correction: * [`docs/docs/concepts/structured_output.mdx`]: Corrected several typos in the sentence directing users to the API reference.	2024-10-24 15:13:28 +00:00
Daniel Vu Dao	da6b526770	docs: Update `Runnable` documentation (#27606 ) Description Adds better code formatting for one of the docs.	2024-10-24 15:05:43 +00:00
QiQi	133c1b4f76	docs: Update passthrough.ipynb -- Grammar correction (#27601 ) Grammar correction needed in passthrough.ipynb The sentence is: "Now you've learned how to pass data through your chains to help to help format the data flowing through your chains." There's a redundant "to help", and it could be more succinctly written as: "Now you've learned how to pass data through your chains to help format the data flowing through your chains."	2024-10-24 15:05:06 +00:00
hippopond	61897aef90	docs: Fix for spelling mistake (#27599 ) Fixes #26009 - Description: Corrected spelling from "trianed" to "trained" - Issue: #26009 - Dependencies: NA - Twitter handle: NA Co-authored-by: Libby Lin <libbylin@Libbys-MacBook-Pro.local>	2024-10-24 15:04:18 +00:00
Eugene Yurtsev	d081a5400a	docs: fix more links (#27598 ) Fix more links	2024-10-23 21:26:38 -04:00
Lei Zhang	f203229b51	community: Fix the failure of ChatSparkLLM after upgrading to Pydantic V2 (#27418 ) Description: The test_sparkllm.py can reproduce this issue. https://github.com/langchain-ai/langchain/blob/master/libs/community/tests/integration_tests/chat_models/test_sparkllm.py#L66 ``` Testing started at 18:27 ... Launching pytest with arguments test_sparkllm.py::test_chat_spark_llm --no-header --no-summary -q in /Users/zhanglei/Work/github/langchain/libs/community/tests/integration_tests/chat_models ============================= test session starts ============================== collecting ... collected 1 item test_sparkllm.py::test_chat_spark_llm ============================== 1 failed in 0.45s =============================== FAILED [100%] tests/integration_tests/chat_models/test_sparkllm.py:65 (test_chat_spark_llm) def test_chat_spark_llm() -> None: > chat = ChatSparkLLM( spark_app_id="your spark_app_id", spark_api_key="your spark_api_key", spark_api_secret="your spark_api_secret", ) # type: ignore[call-arg] test_sparkllm.py:67: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ ../../../../core/langchain_core/load/serializable.py:111: in __init__ super().__init__(*args, **kwargs) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ cls = <class 'langchain_community.chat_models.sparkllm.ChatSparkLLM'> values = {'spark_api_key': 'your spark_api_key', 'spark_api_secret': 'your spark_api_secret', 'spark_api_url': 'wss://spark-api.xf-yun.com/v3.5/chat', 'spark_app_id': 'your spark_app_id', ...} @model_validator(mode="before") @classmethod def validate_environment(cls, values: Dict) -> Any: values["spark_app_id"] = get_from_dict_or_env( values, ["spark_app_id", "app_id"], "IFLYTEK_SPARK_APP_ID", ) values["spark_api_key"] = get_from_dict_or_env( values, ["spark_api_key", "api_key"], "IFLYTEK_SPARK_API_KEY", ) values["spark_api_secret"] = get_from_dict_or_env( values, ["spark_api_secret", "api_secret"], 
"IFLYTEK_SPARK_API_SECRET", ) values["spark_api_url"] = get_from_dict_or_env( values, "spark_api_url", "IFLYTEK_SPARK_API_URL", SPARK_API_URL, ) values["spark_llm_domain"] = get_from_dict_or_env( values, "spark_llm_domain", "IFLYTEK_SPARK_LLM_DOMAIN", SPARK_LLM_DOMAIN, ) # put extra params into model_kwargs default_values = { name: field.default for name, field in get_fields(cls).items() if field.default is not None } > values["model_kwargs"]["temperature"] = default_values.get("temperature") E KeyError: 'model_kwargs' ../../../langchain_community/chat_models/sparkllm.py:368: KeyError ``` I found that when upgrading to Pydantic v2, @root_validator was changed to @model_validator. When a class declares multiple @model_validator(mode="before") validators, the execution order in V1 and V2 is opposite. This is the reason for ChatSparkLLM's failure. The correct execution order is to execute build_extra first. https://github.com/langchain-ai/langchain/blob/langchain%3D%3D0.2.16/libs/community/langchain_community/chat_models/sparkllm.py#L302 And then execute validate_environment. https://github.com/langchain-ai/langchain/blob/langchain%3D%3D0.2.16/libs/community/langchain_community/chat_models/sparkllm.py#L329 The Pydantic community has also discussed this, but there hasn't been a conclusion yet. https://github.com/pydantic/pydantic/discussions/7434 Issue: #27416 Twitter handle: coolbeevip --------- Co-authored-by: vbarda <vadym@langchain.dev>	2024-10-23 21:17:10 -04:00
Andrew Effendi	8f151223ad	Community: Fix DuckDuckGo search tool Output Format (#27479 ) Issue: https://github.com/langchain-ai/langchain/issues/22961 Description: Previously, the documentation for `DuckDuckGoSearchResults` said that it returns a JSON string; however, the code returns a regular string that can't be parsed as is. For example, running ```python from langchain_community.tools import DuckDuckGoSearchResults # Create a DuckDuckGo search instance search = DuckDuckGoSearchResults() # Invoke the search result = search.invoke("Obama") # Print the result print(result) # Print the type of the result print("Result Type:", type(result)) ``` will return ``` snippet: Harris will hold a campaign event with former President Barack Obama in Georgia next Thursday, the first time the pair has campaigned side by side, a senior campaign official said. A week from ..., title: Obamas to hit the campaign trail in first joint appearances with Harris, link: https://www.nbcnews.com/politics/2024-election/obamas-hit-campaign-trail-first-joint-appearances-harris-rcna176034, snippet: Item 1 of 3 Former U.S. first lady Michelle Obama and her husband, former U.S. President Barack Obama, stand on stage during Day 2 of the Democratic National Convention (DNC) in Chicago, Illinois ..., title: Obamas set to hit campaign trail with Kamala Harris for first time, link: https://www.reuters.com/world/us/obamas-set-hit-campaign-trail-with-kamala-harris-first-time-2024-10-18/, snippet: Barack and Michelle Obama will make their first campaign appearances alongside Kamala Harris at rallies in Georgia and Michigan. By Reid J. Epstein Reporting from Ashwaubenon, Wis. Here come the ..., title: Harris Will Join Michelle Obama and Barack Obama on Campaign Trail, link: https://www.nytimes.com/2024/10/18/us/politics/kamala-harris-michelle-obama-barack-obama.html, snippet: Obama's leaving office was "a turning point," Mirsky said. "That was the last time anybody felt normal." 
A few feet over, a 64-year-old physics professor named Eric Swanson who had grown ..., title: Obama's reemergence on the campaign trail for Harris comes as he ..., link: https://www.cnn.com/2024/10/13/politics/obama-campaign-trail-harris-biden/index.html Result Type: <class 'str'> ``` After the change in this PR, `DuckDuckGoSearchResults` takes an additional `output_format = "list" | "json" | "string"` parameter ("string" = current behavior, default). For example, invoking `DuckDuckGoSearchResults(output_format="list")` returns a list of dictionaries in the format ``` [{'snippet': '...', 'title': '...', 'link': '...'}, ...] ``` e.g. ``` [{'snippet': "Obama has in a sense been wrestling with Trump's impact since the real estate magnate broke onto the political stage in 2015. Trump's victory the next year, defeating Obama's secretary of ...", 'title': "Obama's fears about Trump drive his stepped-up campaigning", 'link': 'https://www.washingtonpost.com/politics/2024/10/18/obama-trump-anxiety-harris-campaign/'}, {'snippet': 'Harris will hold a campaign event with former President Barack Obama in Georgia next Thursday, the first time the pair has campaigned side by side, a senior campaign official said. A week from ...', 'title': 'Obamas to hit the campaign trail in first joint appearances with Harris', 'link': 'https://www.nbcnews.com/politics/2024-election/obamas-hit-campaign-trail-first-joint-appearances-harris-rcna176034'}, {'snippet': 'Item 1 of 3 Former U.S. first lady Michelle Obama and her husband, former U.S. President Barack Obama, stand on stage during Day 2 of the Democratic National Convention (DNC) in Chicago, Illinois ...', 'title': 'Obamas set to hit campaign trail with Kamala Harris for first time', 'link': 'https://www.reuters.com/world/us/obamas-set-hit-campaign-trail-with-kamala-harris-first-time-2024-10-18/'}, {'snippet': 'Barack and Michelle Obama will make their first campaign appearances alongside Kamala Harris at rallies in Georgia and Michigan. 
By Reid J. Epstein Reporting from Ashwaubenon, Wis. Here come the ...', 'title': 'Harris Will Join Michelle Obama and Barack Obama on Campaign Trail', 'link': 'https://www.nytimes.com/2024/10/18/us/politics/kamala-harris-michelle-obama-barack-obama.html'}] Result Type: <class 'list'> ``` --------- Co-authored-by: vbarda <vadym@langchain.dev>	2024-10-23 20:18:11 -04:00
Erick Friis	5e5647b5dd	docs: render api ref urls in search (#27594 )	2024-10-23 16:18:21 -07:00
Bagatur	948e2e6322	docs: concept nits (#27586 )	2024-10-23 14:52:44 -07:00
Eugene Yurtsev	562cf416c2	docs: Update messages.mdx (#27592 ) Add missing `.`	2024-10-23 20:18:27 +00:00
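
Note on commit c1d8c33df6 (`JsonOutputParser` UTF characters): the fix described there comes down to passing `ensure_ascii=False` to `json.dumps()`. The following is a minimal standard-library sketch of that behaviour only; the `schema` dictionary is a hypothetical stand-in modelled on the example in the commit message, not the parser's actual internal structure.

```python
import json

# Hypothetical schema fragment, modelled on the example in the commit message.
schema = {"title": {"description": "科学文章的标题", "type": "string"}}

# Default behaviour: non-ASCII characters are escaped as \uXXXX sequences.
escaped = json.dumps(schema)

# With ensure_ascii=False, the characters are written through unchanged.
preserved = json.dumps(schema, ensure_ascii=False)

print(escaped)
print(preserved)
```

Both strings decode to the same object with `json.loads()`, so the flag changes only how readable the serialized text is, which is what matters when it is embedded in a prompt.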


11710 Commits
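
Note on commit eec35672a4 (type checking for the tool decorator): the commit message describes adding `typing.overload` signatures so that type checkers see a decorated function as a tool object. The sketch below illustrates the general technique for only two of the four cases mentioned (bare `@tool` and `@tool(name=...)`). The `Tool` class and this `tool` function are hypothetical stand-ins written for illustration; they are not LangChain's `BaseTool` or its real decorator.

```python
from typing import Callable, Optional, Union, overload


class Tool:
    """Hypothetical stand-in for a tool object; not LangChain's BaseTool."""

    def __init__(self, name: str, func: Callable[..., object]) -> None:
        self.name = name
        self.func = func


Decorator = Callable[[Callable[..., object]], Tool]


@overload
def tool(func: Callable[..., object]) -> Tool: ...  # bare @tool


@overload
def tool(*, name: Optional[str] = None) -> Decorator: ...  # @tool(name=...)


def tool(
    func: Optional[Callable[..., object]] = None,
    *,
    name: Optional[str] = None,
) -> Union[Tool, Decorator]:
    def wrap(f: Callable[..., object]) -> Tool:
        # Fall back to the function's own name when none is given.
        return Tool(name or f.__name__, f)

    # Bare usage passes the function directly; otherwise return the decorator.
    return wrap(func) if func is not None else wrap


@tool
def add(a: int, b: int) -> int:
    return a + b


@tool(name="multiply")
def mul(a: int, b: int) -> int:
    return a * b
```

With the overloads in place, a type checker infers `Tool` for both `add` and `mul` instead of a broad union, while the runtime behaviour of the single implementation is unchanged.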