langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-18 09:25:54 +00:00

Author	SHA1	Message	Date
Ather Fawaz	a5ccf5d33c	community[minor]: Add support for Perplexity chat model(#17024 ) - Description: This PR adds support for [Perplexity AI APIs](https://blog.perplexity.ai/blog/introducing-pplx-api). - Issues: None - Dependencies: None - Twitter handle: [@atherfawaz](https://twitter.com/AtherFawaz) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 12:19:23 -08:00
Rodrigo Nogueira	3438d2cbcc	community[minor]: add maritalk chat (#17675 ) Description: Adds the MariTalk chat that is based on an LLM specially trained for Portuguese. Twitter handle: @MaritacaAI	2024-03-01 12:18:23 -08:00
sarahberenji	08fa38d56d	community[patch]: the syntax error for Redis generated query (#17717 ) To fix the reported error: https://github.com/langchain-ai/langchain/discussions/17397	2024-03-01 12:18:10 -08:00
certified-dodo	43e3244573	community[patch]: Fix MongoDBAtlasVectorSearch max_marginal_relevance_search (#17971 ) Description: * `self._embedding_key` is accessed after deletion, breaking `max_marginal_relevance_search` search * Introduced in: `e135e5257c` * Updated but still persists in: `ce22e10c4b` Issue: https://github.com/langchain-ai/langchain/issues/17963 Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 12:17:42 -08:00
Nikita Titov	9f2ab37162	community[patch]: don't try to parse json in case of errored response (#18317 ) Related issue: #13896. In case Ollama is behind a proxy, proxy error responses cannot be viewed. You aren't even able to check response code. For example, if your Ollama has basic access authentication and it's not passed, `JSONDecodeError` will overwrite the truth response error. <details> <summary><b>Log now:</b></summary> ``` { "name": "JSONDecodeError", "message": "Expecting value: line 1 column 1 (char 0)", "stack": "--------------------------------------------------------------------------- JSONDecodeError Traceback (most recent call last) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:971, in Response.json(self, kwargs) 970 try: --> 971 return complexjson.loads(self.text, kwargs) 972 except JSONDecodeError as e: 973 # Catch JSON-related errors and raise as requests.JSONDecodeError 974 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError File /opt/miniforge3/envs/.gpt/lib/python3.10/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, kw) 343 if (cls is None and object_hook is None and 344 parse_int is None and parse_float is None and 345 parse_constant is None and object_pairs_hook is None and not kw): --> 346 return _default_decoder.decode(s) 347 if cls is None: File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:337, in JSONDecoder.decode(self, s, _w) 333 \"\"\"Return the Python representation of ``s`` (a ``str`` instance 334 containing a JSON document). 
335 336 \"\"\" --> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end()) 338 end = _w(s, end).end() File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx) 354 except StopIteration as err: --> 355 raise JSONDecodeError(\"Expecting value\", s, err.value) from None 356 return obj, end JSONDecodeError: Expecting value: line 1 column 1 (char 0) During handling of the above exception, another exception occurred: JSONDecodeError Traceback (most recent call last) Cell In[3], line 1 ----> 1 print(translate_func().invoke('text')) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config) 2051 try: 2052 for i, step in enumerate(self.steps): -> 2053 input = step.invoke( 2054 input, 2055 # mark each step as a child run 2056 patch_config( 2057 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\") 2058 ), 2059 ) 2060 # finish the root run 2061 except BaseException as e: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, kwargs) 154 def invoke( 155 self, 156 input: LanguageModelInput, (...) 160 kwargs: Any, 161 ) -> BaseMessage: 162 config = ensure_config(config) 163 return cast( 164 ChatGeneration, --> 165 self.generate_prompt( 166 [self._convert_input(input)], 167 stop=stop, 168 callbacks=config.get(\"callbacks\"), 169 tags=config.get(\"tags\"), 170 metadata=config.get(\"metadata\"), 171 run_name=config.get(\"run_name\"), 172 kwargs, 173 ).generations[0][0], 174 ).message File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, kwargs) 535 def generate_prompt( 536 self, 537 prompts: List[PromptValue], (...) 
540 kwargs: Any, 541 ) -> LLMResult: 542 prompt_messages = [p.to_messages() for p in prompts] --> 543 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 405 if run_managers: 406 run_managers[i].on_llm_error(e, response=LLMResult(generations=[])) --> 407 raise e 408 flattened_outputs = [ 409 LLMResult(generations=[res.generations], llm_output=res.llm_output) 410 for res in results 411 ] 412 llm_output = self._combine_llm_outputs([res.llm_output for res in results]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 394 for i, m in enumerate(messages): 395 try: 396 results.append( --> 397 self._generate_with_cache( 398 m, 399 stop=stop, 400 run_manager=run_managers[i] if run_managers else None, 401 kwargs, 402 ) 403 ) 404 except BaseException as e: 405 if run_managers: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, kwargs) 572 raise ValueError( 573 \"Asked to cache, but no cache found at `langchain.cache`.\" 574 ) 575 if new_arg_supported: --> 576 return self._generate( 577 messages, stop=stop, run_manager=run_manager, kwargs 578 ) 579 else: 580 return self._generate(messages, stop=stop, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, kwargs) 226 def _generate( 227 self, 228 messages: List[BaseMessage], (...) 231 kwargs: Any, 232 ) -> ChatResult: 233 \"\"\"Call out to Ollama's generate endpoint. 234 235 Args: (...) 
247 ]) 248 \"\"\" --> 250 final_chunk = self._chat_stream_with_aggregation( 251 messages, 252 stop=stop, 253 run_manager=run_manager, 254 verbose=self.verbose, 255 kwargs, 256 ) 257 chat_generation = ChatGeneration( 258 message=AIMessage(content=final_chunk.text), 259 generation_info=final_chunk.generation_info, 260 ) 261 return ChatResult(generations=[chat_generation]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:183, in ChatOllama._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, kwargs) 174 def _chat_stream_with_aggregation( 175 self, 176 messages: List[BaseMessage], (...) 180 kwargs: Any, 181 ) -> ChatGenerationChunk: 182 final_chunk: Optional[ChatGenerationChunk] = None --> 183 for stream_resp in self._create_chat_stream(messages, stop, kwargs): 184 if stream_resp: 185 chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:156, in ChatOllama._create_chat_stream(self, messages, stop, kwargs) 147 def _create_chat_stream( 148 self, 149 messages: List[BaseMessage], 150 stop: Optional[List[str]] = None, 151 kwargs: Any, 152 ) -> Iterator[str]: 153 payload = { 154 \"messages\": self._convert_messages_to_ollama_messages(messages), 155 } --> 156 yield from self._create_stream( 157 payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat/\", kwargs 158 ) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/llms/ollama.py:234, in _OllamaCommon._create_stream(self, api_url, payload, stop, kwargs) 228 raise OllamaEndpointNotFoundError( 229 \"Ollama call failed with status code 404. 
\" 230 \"Maybe your model is not found \" 231 f\"and you should pull the model with `ollama pull {self.model}`.\" 232 ) 233 else: --> 234 optional_detail = response.json().get(\"error\") 235 raise ValueError( 236 f\"Ollama call failed with status code {response.status_code}.\" 237 f\" Details: {optional_detail}\" 238 ) 239 return response.iter_lines(decode_unicode=True) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:975, in Response.json(self, kwargs) 971 return complexjson.loads(self.text, kwargs) 972 except JSONDecodeError as e: 973 # Catch JSON-related errors and raise as requests.JSONDecodeError 974 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError --> 975 raise RequestsJSONDecodeError(e.msg, e.doc, e.pos) JSONDecodeError: Expecting value: line 1 column 1 (char 0)" } ``` </details> <details> <summary><b>Log after a fix:</b></summary> ``` { "name": "ValueError", "message": "Ollama call failed with status code 401. Details: <html>\r <head><title>401 Authorization Required</title></head>\r <body>\r <center><h1>401 Authorization Required</h1></center>\r <hr><center>nginx/1.18.0 (Ubuntu)</center>\r </body>\r </html>\r ", "stack": "--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[2], line 1 ----> 1 print(translate_func().invoke('text')) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config) 2051 try: 2052 for i, step in enumerate(self.steps): -> 2053 input = step.invoke( 2054 input, 2055 # mark each step as a child run 2056 patch_config( 2057 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\") 2058 ), 2059 ) 2060 # finish the root run 2061 except BaseException as e: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, kwargs) 154 def 
invoke( 155 self, 156 input: LanguageModelInput, (...) 160 kwargs: Any, 161 ) -> BaseMessage: 162 config = ensure_config(config) 163 return cast( 164 ChatGeneration, --> 165 self.generate_prompt( 166 [self._convert_input(input)], 167 stop=stop, 168 callbacks=config.get(\"callbacks\"), 169 tags=config.get(\"tags\"), 170 metadata=config.get(\"metadata\"), 171 run_name=config.get(\"run_name\"), 172 kwargs, 173 ).generations[0][0], 174 ).message File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, kwargs) 535 def generate_prompt( 536 self, 537 prompts: List[PromptValue], (...) 540 kwargs: Any, 541 ) -> LLMResult: 542 prompt_messages = [p.to_messages() for p in prompts] --> 543 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 405 if run_managers: 406 run_managers[i].on_llm_error(e, response=LLMResult(generations=[])) --> 407 raise e 408 flattened_outputs = [ 409 LLMResult(generations=[res.generations], llm_output=res.llm_output) 410 for res in results 411 ] 412 llm_output = self._combine_llm_outputs([res.llm_output for res in results]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 394 for i, m in enumerate(messages): 395 try: 396 results.append( --> 397 self._generate_with_cache( 398 m, 399 stop=stop, 400 run_manager=run_managers[i] if run_managers else None, 401 kwargs, 402 ) 403 ) 404 except BaseException as e: 405 if run_managers: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in 
BaseChatModel._generate_with_cache(self, messages, stop, run_manager, kwargs) 572 raise ValueError( 573 \"Asked to cache, but no cache found at `langchain.cache`.\" 574 ) 575 if new_arg_supported: --> 576 return self._generate( 577 messages, stop=stop, run_manager=run_manager, kwargs 578 ) 579 else: 580 return self._generate(messages, stop=stop, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, kwargs) 226 def _generate( 227 self, 228 messages: List[BaseMessage], (...) 231 kwargs: Any, 232 ) -> ChatResult: 233 \"\"\"Call out to Ollama's generate endpoint. 234 235 Args: (...) 247 ]) 248 \"\"\" --> 250 final_chunk = self._chat_stream_with_aggregation( 251 messages, 252 stop=stop, 253 run_manager=run_manager, 254 verbose=self.verbose, 255 kwargs, 256 ) 257 chat_generation = ChatGeneration( 258 message=AIMessage(content=final_chunk.text), 259 generation_info=final_chunk.generation_info, 260 ) 261 return ChatResult(generations=[chat_generation]) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:328, in ChatOllamaCustom._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, kwargs) 319 def _chat_stream_with_aggregation( 320 self, 321 messages: List[BaseMessage], (...) 
325 kwargs: Any, 326 ) -> ChatGenerationChunk: 327 final_chunk: Optional[ChatGenerationChunk] = None --> 328 for stream_resp in self._create_chat_stream(messages, stop, kwargs): 329 if stream_resp: 330 chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:301, in ChatOllamaCustom._create_chat_stream(self, messages, stop, kwargs) 292 def _create_chat_stream( 293 self, 294 messages: List[BaseMessage], 295 stop: Optional[List[str]] = None, 296 kwargs: Any, 297 ) -> Iterator[str]: 298 payload = { 299 \"messages\": self._convert_messages_to_ollama_messages(messages), 300 } --> 301 yield from self._create_stream( 302 payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat\", kwargs 303 ) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:134, in _OllamaCommonCustom._create_stream(self, api_url, payload, stop, **kwargs) 132 else: 133 optional_detail = response.text --> 134 raise ValueError( 135 f\"Ollama call failed with status code {response.status_code}.\" 136 f\" Details: {optional_detail}\" 137 ) 138 return response.iter_lines(decode_unicode=True) ValueError: Ollama call failed with status code 401. Details: <html>\r <head><title>401 Authorization Required</title></head>\r <body>\r <center><h1>401 Authorization Required</h1></center>\r <hr><center>nginx/1.18.0 (Ubuntu)</center>\r </body>\r </html>\r " } ``` </details> The same is true for timeout errors or when you simply mistyped in `base_url` arg and get response from some other service, for instance. Real Ollama errors are still clearly readable: ``` ValueError: Ollama call failed with status code 400. Details: {"error":"invalid options: unknown_option"} ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 12:17:29 -08:00
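The change described in the Ollama commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `langchain_community` code: `FakeResponse` and `raise_for_ollama_error` are hypothetical names, and the response is assumed to expose `status_code` and `text` the way a `requests.Response` does.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for requests.Response (only the fields used here)."""

    status_code: int
    text: str


def raise_for_ollama_error(response: FakeResponse) -> None:
    """Raise a readable error for non-200 responses without parsing JSON."""
    if response.status_code == 200:
        return
    # Use the raw body: a proxy may answer with HTML, which a JSON parser
    # would reject, hiding the real status code behind a JSONDecodeError.
    raise ValueError(
        f"Ollama call failed with status code {response.status_code}."
        f" Details: {response.text}"
    )
```

With this shape, an HTML 401 page from a proxy and a JSON error body from Ollama itself both end up verbatim in the `ValueError` message.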
Yudhajit Sinha	e2b901c35b	community[patch]: chat message history mypy fix (#18250 ) Description: Fixed `type: ignore` comments for mypy in chat_message_histories (streamlit). Addresses #17048. Planning to add more based on reviews.	2024-03-01 12:17:18 -08:00
Gabriel Altay	b9416dc96a	docs: update pinecone README to use PineconeVectorStore (#18170 )	2024-03-01 12:12:52 -08:00
Hemslo Wang	58a2abf089	community[patch]: fix RecursiveUrlLoader metadata_extractor return type (#18193 ) Description: Fix `metadata_extractor` type for `RecursiveUrlLoader`, the default `_metadata_extractor` returns `dict` instead of `str`. Issue: N/A Dependencies: N/A Twitter handle: N/A Signed-off-by: Hemslo Wang <hemslo.wang@gmail.com>	2024-03-01 12:08:20 -08:00
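As a rough illustration of the corrected return type, here is a sketch of an extractor that returns a `dict`. The function name `title_metadata_extractor` is hypothetical, and the `(raw_html, url)` argument order is an assumption rather than something stated in the commit.

```python
import re
from typing import Callable, Dict


def title_metadata_extractor(raw_html: str, url: str) -> Dict[str, str]:
    """Hypothetical extractor that returns a dict of metadata, not a str."""
    metadata = {"source": url}
    match = re.search(r"<title>(.*?)</title>", raw_html, re.IGNORECASE | re.DOTALL)
    if match:
        metadata["title"] = match.group(1).strip()
    return metadata


# Type alias for a callable of this shape (assumed signature).
MetadataExtractor = Callable[[str, str], dict]
extractor: MetadataExtractor = title_metadata_extractor
```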
Maxime Perrin	98380cff9b	community[patch]: removing "response_mode" parameter in llama_index retriever (#18180 ) - Description: Changing this line ```python response = index.query(query, response_mode="no_text", **self.query_kwargs) ``` to ```python response = index.query(query, **self.query_kwargs) ``` since the llama index query no longer supports response_mode: ``` TypeError: BaseQueryEngine.query() got an unexpected keyword argument 'response_mode' ``` - Twitter handle: @maximeperrin_ --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>	2024-03-01 12:05:09 -08:00
Christophe Bornet	177f51c7bd	community: Use default load() implementation in doc loaders (#18385 ) Following https://github.com/langchain-ai/langchain/pull/18289	2024-03-01 14:46:52 -05:00
William De Vena	42341bc787	infra: fake model invoke callback prior to yielding token (#18286 ) ## PR title core[patch]: Invoke callback prior to yielding ## PR message Description: Invoke on_llm_new_token callback prior to yielding token in _stream and _astream methods. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-03-01 11:46:18 -08:00
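The ordering this commit describes can be sketched with a plain generator. This is an illustrative sketch only: `stream_tokens` and `on_new_token` are hypothetical names, not the LangChain `_stream` implementation or its callback manager.

```python
from typing import Callable, Iterator, List


def stream_tokens(
    tokens: List[str], on_new_token: Callable[[str], None]
) -> Iterator[str]:
    """Yield tokens, notifying the callback before each one is handed out."""
    for token in tokens:
        on_new_token(token)  # the callback fires first
        yield token  # the consumer sees the token only afterwards


# Record the interleaving of callback calls and consumed tokens.
events = []
for tok in stream_tokens(["a", "b"], lambda t: events.append(("callback", t))):
    events.append(("yield", tok))
```

One reason the order matters: code placed after a `yield` does not run if the consumer stops iterating, so a callback invoked after the `yield` could be skipped for the last token consumed.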
mwmajewsk	e192f6b6eb	community[patch]: fix, better error message in deeplake vectoriser (#18397 ) If the document loader receives a pathlib Path instead of a str, it reads the file correctly, but the problem begins when the document is added to Deeplake. This problem arises from casting the path to str in the metadata. ```python deeplake = True fname = Path('./lorem_ipsum.txt') loader = TextLoader(fname, encoding="utf-8") docs = loader.load_and_split() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100) chunks = text_splitter.split_documents(docs) if deeplake: db = DeepLake(dataset_path=ds_path, embedding=embeddings, token=activeloop_token) db.add_documents(chunks) else: db = Chroma.from_documents(docs, embeddings) ``` With this snippet, the error message for Deeplake looks like this: ``` [part of error message omitted] Traceback (most recent call last): File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 53, in <module> db.add_documents(chunks) File "/home/mwm/repositories/sources/langchain/libs/core/langchain_core/vectorstores.py", line 139, in add_documents return self.add_texts(texts, metadatas, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/deeplake.py", line 258, in add_texts return self.vectorstore.add( ^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/deeplake_vectorstore.py", line 226, in add return self.dataset_handler.add( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/dataset_handlers/client_side_dataset_handler.py", line 139, in add dataset_utils.extend_or_ingest_dataset( File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 544, in extend_or_ingest_dataset extend( File
"/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 505, in extend dataset.extend(batched_processed_tensors, progressbar=False) File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/dataset/dataset.py", line 3247, in extend raise SampleExtendError(str(e)) from e.__cause__ deeplake.util.exceptions.SampleExtendError: Failed to append a sample to the tensor 'metadata'. See more details in the traceback. If you wish to skip the samples that cause errors, please specify `ignore_errors=True`. ``` This does not explain the error well enough. The same error for Chroma looks like this: ``` During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 56, in <module> db = Chroma.from_documents(docs, embeddings) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 778, in from_documents return cls.from_texts( ^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 736, in from_texts chroma_collection.add_texts( File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 309, in add_texts raise ValueError(e.args[0] + "\n\n" + msg) ValueError: Expected metadata value to be a str, int, float or bool, got lorem_ipsum.txt which is a <class 'pathlib.PosixPath'> Try filtering complex metadata from the document using langchain_community.vectorstores.utils.filter_complex_metadata. 
``` Which is way more user friendly, so I just added information about possible mismatch of the type in the error message, the same way it is covered in chroma https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/chroma.py#L224	2024-03-01 11:21:21 -08:00
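The kind of type check behind the friendlier message can be sketched as below. This assumes the failure comes from a `pathlib.Path` value left in the document metadata, as the Chroma message quoted above suggests. `check_metadata` and `stringify_paths` are hypothetical helpers, not functions from Deeplake, Chroma, or LangChain.

```python
from pathlib import Path
from typing import Any, Dict

# Value types the quoted Chroma error message lists as acceptable.
ALLOWED_TYPES = (str, int, float, bool)


def check_metadata(metadata: Dict[str, Any]) -> None:
    """Raise a descriptive error for metadata values of unsupported types."""
    for key, value in metadata.items():
        if not isinstance(value, ALLOWED_TYPES):
            raise ValueError(
                f"Expected metadata value for {key!r} to be a str, int, float "
                f"or bool, got {value!r} which is a {type(value)}"
            )


def stringify_paths(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the metadata with Path values converted to str."""
    return {k: str(v) if isinstance(v, Path) else v for k, v in metadata.items()}
```

Calling `check_metadata(stringify_paths(doc_metadata))` before adding documents would pass where the raw `Path` value would be rejected.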
Daniel Chico	7d962278f6	community[patch]: type ignore fixes (#18395 ) Related to #17048	2024-03-01 11:21:02 -08:00
Christophe Bornet	69be82c86d	community[patch]: Implement lazy_load() for CSVLoader (#18391 ) Covered by `test_csv_loader.py`	2024-03-01 11:17:08 -08:00
Bagatur	c54d6eb5da	fireworks[patch]: support "any" tool_choice (#18343 ) per https://readme.fireworks.ai/docs/function-calling	2024-03-01 11:12:28 -08:00
Erick Friis	6afb135baa	astradb: move to langchain-datastax repo (#18354 )	2024-03-01 19:04:43 +00:00
Guangdong Liu	760a16ff32	community[patch]: Fix ChatModel for sparkllm Bug. (#18375 ) - Description: fix sparkllm parameter error - Issue: close #18370 - Dependencies: change `IFLYTEK_SPARK_APP_URL` to `IFLYTEK_SPARK_API_URL` - Twitter handle: No	2024-03-01 10:49:30 -08:00
Yujie Qian	cbb65741a7	community[patch]: Voyage AI updates default model and batch size (#17655 ) - Description: update the default model and batch size in VoyageEmbeddings - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: fodizoltan <zoltan@conway.expert>	2024-03-01 10:22:24 -08:00
Shengsheng Huang	ae471a7dcb	community[minor]: add BigDL-LLM integrations (#17953 ) - Description: [`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for running LLM on Intel XPU (from Laptop to GPU to Cloud) using INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR adds bigdl-llm integrations to langchain. - Issue: NA - Dependencies: `bigdl-llm` library - Contribution maintainer: @shane-huang Examples added: - docs/docs/integrations/llms/bigdl.ipynb	2024-03-01 10:04:53 -08:00
Ethan Yang	f61cb8d407	community[minor]: Add openvino backend support (#11591 ) - Description: add openvino backend support by HuggingFace Optimum Intel, - Dependencies: “optimum[openvino]”, --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 10:04:24 -08:00
Leonid Ganeline	a89f007947	docs: `runnable` module description (#17966 ) Added a module description. Added `batch` description.	2024-03-01 10:01:32 -08:00
RadhikaBansal97	8bafd2df5e	community[patch]: Change github endpoint in GithubLoader (#17622 ) Description: - Changed the GitHub endpoint, as the existing one was not working and returned a 404 Not Found error. - The existing function also failed when file_filter was not passed: the tree API returns all paths, including directories, and when get_file_content iterated over these paths it failed on directories, because for a directory the API returns the list of files inside it. Added a condition to ignore a path if it is a directory. - Fixes this issue: https://github.com/langchain-ai/langchain/issues/17453 Co-authored-by: Radhika Bansal <Radhika.Bansal@veritas.com>	2024-03-01 09:36:31 -08:00
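The directory filter described above can be sketched as follows. It relies on the fact that entries returned by GitHub's Git Trees API carry a `type` field (`"blob"` for files, `"tree"` for directories). The helper name `file_paths` and the sample data are hypothetical, and this is not the loader's actual code.

```python
from typing import Dict, List


def file_paths(tree: List[Dict[str, str]]) -> List[str]:
    """Keep only file entries; directory entries have type "tree"."""
    return [entry["path"] for entry in tree if entry.get("type") == "blob"]


# Hypothetical sample shaped like the "tree" array of a Git Trees response.
sample_tree = [
    {"path": "README.md", "type": "blob"},
    {"path": "docs", "type": "tree"},
    {"path": "docs/index.md", "type": "blob"},
]
paths = file_paths(sample_tree)
```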
Yufei (Benny) Chen	2b93206f02	fireworks[patch]: Fix fireworks async stream (#18372 ) - Description: Fix the async stream issue with Fireworks - Dependencies: fireworks >= 0.13.0 ``` tests/integration_tests/test_chat_models.py .......... [ 45%] tests/integration_tests/test_compile.py . [ 50%] tests/integration_tests/test_embeddings.py .. [ 59%] tests/integration_tests/test_llms.py ......... [100%] ``` ``` tests/unit_tests/test_embeddings.py . [ 16%] tests/unit_tests/test_imports.py . [ 33%] tests/unit_tests/test_llms.py .... [100%] ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 09:20:26 -08:00
William FH	1deb8cadd5	Add dataset version info (#18299 )	2024-02-29 22:00:44 -08:00
Anush	9d663f31fa	community[patch]: FastEmbed to latest (#18040 ) ## Description Updates the `langchain_community.embeddings.fastembed` provider as per the recent updates to [`FastEmbed`](https://github.com/qdrant/fastembed) library.	2024-02-29 21:15:51 -08:00
Erick Friis	3c8a115e21	fireworks[patch]: remove custom async and stream implementations (#18363 )	2024-03-01 03:20:02 +00:00
Bagatur	f220af3dce	docs: text splitters readme (#18359 )	2024-03-01 03:00:42 +00:00
Bagatur	0d7fb5f60a	langchain[patch]: langchain-text-splitters dep (#18357 )	2024-02-29 18:48:55 -08:00
Eugene Yurtsev	51b661cfe8	community[patch]: BaseLoader load method should just delegate to lazy_load (#18289 ) load() should just reference lazy_load()	2024-02-29 21:45:28 -05:00
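A minimal sketch of the delegation pattern this commit names, using plain strings in place of `Document` objects. `SketchLoader` and `LinesLoader` are hypothetical classes for illustration, not the real `BaseLoader`.

```python
from typing import Iterator, List


class SketchLoader:
    """Hypothetical base class: subclasses implement only lazy_load()."""

    def lazy_load(self) -> Iterator[str]:
        raise NotImplementedError

    def load(self) -> List[str]:
        # Eager loading is just the lazy iterator, fully consumed.
        return list(self.lazy_load())


class LinesLoader(SketchLoader):
    """Hypothetical loader that yields one item per line of text."""

    def __init__(self, text: str) -> None:
        self.text = text

    def lazy_load(self) -> Iterator[str]:
        yield from self.text.splitlines()
```

With this layout a subclass gets `load()` for free, so the two methods cannot drift apart.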
Bagatur	5efb5c099f	text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346 )	2024-02-29 18:33:21 -08:00
Nuno Campos	7891934173	Fix missing labels (#18356 )	2024-02-29 18:11:18 -08:00
William FH	fdab931fd3	[Core] Patch: rm dumpd of outputs from runnables/base (#18295 ) It obstructs evaluations when you return a pydantic object.	2024-02-29 18:04:53 -08:00
William FH	f481cbb32d	fireworks[patch]: Fix fireworks bind tools (#18352 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 01:18:15 +00:00
Erick Friis	eefb49680f	multiple[patch]: fix deprecation versions (#18349 )	2024-02-29 16:58:33 -08:00
Erick Friis	11cb42c2c1	core[patch]: deprecation docstring with lib (#18350 )	2024-03-01 00:44:13 +00:00
Erick Friis	7bbff98dc7	mongodb[patch]: core 0.1.5 dep (#18348 )	2024-02-29 15:39:04 -08:00
Jib	72bfc1d3db	mongodb[minor]: MongoDB Partner Package -- Porting MongoDBAtlasVectorSearch (#17652 ) This PR migrates the existing MongoDBAtlasVectorSearch abstraction from the `langchain_community` section to the partners package section of the codebase. - [x] Run the partner package script as advised in the partner-packages documentation. - [x] Add Unit Tests - [x] Migrate Integration Tests - [x] Refactor `MongoDBAtlasVectorStore` (autogenerated) to `MongoDBAtlasVectorSearch` - [x] ~Remove~ deprecate the old `langchain_community` VectorStore references. ## Additional Callouts - Implemented the `delete` method - Included any missing async function implementations - `amax_marginal_relevance_search_by_vector` - `adelete` - Added new Unit Tests that test for functionality of `MongoDBVectorSearch` methods - Removed [`del res[self._embedding_key]`](`e0c81e1cb0/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L218)`) in `_similarity_search_with_score` function as it would make the `maximal_marginal_relevance` function fail otherwise. The `Document` needs to store the embedding key in metadata to work. Checklist: - [x] PR title: Please title your PR "package: description", where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message - [x] Pass lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified to check that you're passing lint and testing. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ - [x] Add tests and docs: If you're adding a new integration, please include 1. Existing tests supplied in docs/docs do not change. Updated docstrings for new functions like `delete` 2. an example notebook showing its use. 
It lives in `docs/docs/integrations` directory. (This already exists) --------- Co-authored-by: Steven Silvester <steven.silvester@ieee.org> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-29 23:09:48 +00:00
William De Vena	412148773c	Updated partners/fireworks README (#18267 ) ## PR title partners: changed the README file for the Fireworks integration in the libs/partners/fireworks folder ## PR message Description: Changed the README file of partners/fireworks following the docs on https://python.langchain.com/docs/integrations/llms/Fireworks The README includes: - Brief description - Installation - Setting-up instructions (API key, model id, ...) - Basic usage Issue: https://github.com/langchain-ai/langchain/issues/17545 Dependencies: None Twitter handle: None	2024-02-29 14:55:03 -08:00
Kai Kugler	df234fb171	community[patch]: Fixing embedchain document mapping (#18255 ) - Description: The current embedchain implementation seems to handle document metadata differently than done in the current implementation of langchain and a KeyError is thrown. I would love for someone else to test this... --------- Co-authored-by: KKUGLER <kai.kugler@mercedes-benz.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Deshraj Yadav <deshraj@gatech.edu>	2024-02-29 14:54:37 -08:00
Erick Friis	040271f33a	community[patch]: remove llmlingua extended tests (#18344 )	2024-02-29 13:51:29 -08:00
William De Vena	87dca8e477	Updated partners/ibm README (#18268 ) ## PR title partners: changed the README file for the IBM Watson AI integration in the libs/partners/ibm folder. ## PR message Description: Changed the README file of partners/ibm following the docs on https://python.langchain.com/docs/integrations/llms/ibm_watsonx The README includes: - Brief description - Installation - Setting-up instructions (API key, project id, ...) - Basic usage: - Loading the model - Direct inference - Chain invoking - Streaming the model output Issue: https://github.com/langchain-ai/langchain/issues/17545 Dependencies: None Twitter handle: None --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2024-02-29 13:29:28 -08:00
Bagatur	9e46535ebc	core[patch]: Release 0.1.28 (#18341 )	2024-02-29 13:03:13 -08:00
Tomaz Bratanic	5999c4a240	Add support for parameters in neo4j retrieval query (#18310 ) Sometimes, you want to use various parameters in the retrieval query of Neo4j Vector to personalize/customize results. Before, when there were only predefined chains, it didn't really make sense. Now that it's all about custom chains and LCEL, it is worth adding since users can inject any params they wish at query time. This isn't prone to SQL injection-type attacks since we use parameters rather than concatenating strings.	2024-02-29 13:00:54 -08:00
Hasan	15d1b73a00	Add optional output_parser param in create_react_agent (#18320 ) Description: Add facility to pass the optional output parser to customize the parsing logic --------- Co-authored-by: hasan <hasan@m2sys.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-29 12:35:43 -08:00
Bagatur	a6f0506aaf	docs: query analysis use case (#17766 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-29 12:33:49 -08:00
kkdamowang	6782dac420	docs: remove duplicate quote in AzureOpenAIEmbeddings doc (#18315 ) - Description: Remove duplicate quote in AzureOpenAIEmbeddings doc, remove trailing spaces. - Issue: No - Dependencies: No	2024-02-29 11:25:50 -08:00
Virat Singh	cd926ac3dd	community: Add PolygonFinancials Tool (#18324 ) Description: In this PR, I am adding a `PolygonFinancials` tool, which can be used to get financials data for a given ticker. The financials data is the fundamental data that is found in income statements, balance sheets, and cash flow statements of public US companies. Twitter: [@virattt](https://twitter.com/virattt)	2024-02-29 10:56:05 -08:00
Bagatur	68ad3414a2	experimental[patch]: Release 0.0.53 (#18330 )	2024-02-29 09:13:21 -08:00
William FH	8af4425abd	[Evaluation] Config Fix (#18231 )	2024-02-29 00:06:46 -08:00
William De Vena	0486404a74	langchain_openai[patch]: Invoke callback prior to yielding token (#18269 ) Description: Invoke callback prior to yielding token in _stream and _astream methods for langchain_openai. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-29 00:00:08 +00:00
William De Vena	5ee76fccd5	langchain_groq[patch]: Invoke callback prior to yielding token (#18272 ) Description: Invoke callback prior to yielding token in _stream and _astream methods for groq. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-28 23:43:16 +00:00
Christophe Bornet	8a81fcd5d3	community: Fix deprecation version of AstraDB VectorStore (#17991 )	2024-02-28 17:15:09 -05:00
Stefano Lottini	6d863bed51	partner[minor]: Astra DB clients identify themselves as coming through LangChain package (#18131 ) Description This PR sets the "caller identity" of the Astra DB clients used by the integration plugins (`AstraDBChatMessageHistory`, `AstraDBStore`, `AstraDBByteStore` and, pending #17767 , `AstraDBVectorStore`). In this way, the requests to the Astra DB Data API coming from within LangChain are identified as such (the purpose is anonymous usage stats to best improve the Astra DB service).	2024-02-28 17:13:22 -05:00
mackong	2c42f3a955	ollama[patch]: delete suffix slash to avoid redirect (#18260 ) - Description: see [ollama](https://github.com/ollama/ollama/blob/main/server/routes.go#L949)'s route definitions - Issue: N/A - Dependencies: N/A	2024-02-28 16:44:48 -05:00
William De Vena	6b58943917	community[patch]: Invoke callback prior to yielding token (#18288 ) Description: Invoke on_llm_new_token callback prior to yielding token in _stream and _astream methods. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-28 21:40:53 +00:00
William De Vena	23722e3653	langchain[patch]: Invoke callback prior to yielding token (#18282 ) Description: Invoke on_llm_new_token callback prior to yielding token in _stream and _astream methods in langchain/tests/fake_chat_model. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-28 16:15:02 -05:00
Eugene Yurtsev	cd52433ba0	community[minor]: Add `SQLDatabaseLoader` document loader (#18281 ) - Description: A generic document loader adapter for SQLAlchemy on top of LangChain's `SQLDatabaseLoader`. - Needed by: https://github.com/crate-workbench/langchain/pull/1 - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev Hi from CrateDB again, in the same spirit like GH-16243 and GH-16244, this patch breaks out another commit from https://github.com/crate-workbench/langchain/pull/1, in order to reduce the size of this patch before submitting it, and to separate concerns. To accompany the SQLAlchemy adapter implementation, the patch includes integration tests for both SQLite and PostgreSQL. Let me know if corresponding utility resources should be added at different spots. With kind regards, Andreas. ### Software Tests ```console docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up ``` ```console cd libs/community pip install psycopg2-binary pytest -vvv tests/integration_tests -k sqldatabase ``` ``` 14 passed ``` ![image](https://github.com/langchain-ai/langchain/assets/453543/42be233c-eb37-4c76-a830-474276e01436) --------- Co-authored-by: Andreas Motl <andreas.motl@crate.io>	2024-02-28 21:02:28 +00:00
William De Vena	a37dc83a9e	langchain_anthropic[patch]: Invoke callback prior to yielding token (#18274 ) - Description: Invoke callback prior to yielding token in _stream and _astream methods for anthropic. - Issue: https://github.com/langchain-ai/langchain/issues/16913 - Dependencies: None - Twitter handle: None	2024-02-28 20:19:22 +00:00
David Ruan	af35e2525a	community[minor]: add hugging_face_model document loader (#17323 ) - Description: add hugging_face_model document loader, - Issue: NA, - Dependencies: NA, --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-28 20:05:35 +00:00
Sanjaypranav V M	b9a495e56e	community[patch]: added latin-1 decoder to gmail search tool (#18116 ) Some emails (for example from Flipkart or Amazon) are encoded in a different plain-text format, so to handle the resulting `UnicodeDecodeError`, added an exception handler and a latin-1 decoder.	2024-02-28 19:28:29 +00:00
Nuno Campos	6da08d0f22	Add PNG drawer for Runnable.get_graph() (#18239 )	2024-02-28 11:25:19 -08:00
Nuno Campos	d9fd1194f5	Remove check preventing passing non-declared config keys (#18276 )	2024-02-28 18:28:53 +00:00
William De Vena	7ac74f291e	langchain_nvidia_ai_endpoints[patch]: Invoke callback prior to yielding token (#18271 ) Description: Invoke callback prior to yielding token in _stream and _astream methods for nvidia_ai_endpoints. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None	2024-02-28 18:10:57 +00:00
Ashley Xu	e3211c2b3d	community[patch]: BigQueryVectorSearch JSON type unsupported for metadatas (#18234 )	2024-02-28 08:19:53 -08:00
Mateusz Szewczyk	db643f6283	ibm[patch]: release 0.1.0 Add possibility to pass ModelInference or Model object to WatsonxLLM class (#18189 ) - Description: Add possibility to pass ModelInference or Model object to WatsonxLLM class - Dependencies: [ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/)	2024-02-28 07:03:15 -08:00
Erick Friis	d7a77054ed	airbyte[patch]: core version 0.1.5 (#18244 )	2024-02-27 19:54:43 -08:00
Erick Friis	be8d2ff5f7	airbyte[patch]: init pkg (#18236 )	2024-02-27 19:37:53 -08:00
Ayo Ayibiowu	ac1d7d9de8	community[feat]: Adds LLMLingua as a document compressor (#17711 ) Description: This PR adds support for using the [LLMLingua project](https://github.com/microsoft/LLMLingua), especially LongLLMLingua (Enhancing Large Language Model Inference via Prompt Compression), as a document compressor / transformer. LLMLingua is an interesting project that can greatly improve RAG systems by compressing prompts and contexts while keeping their semantic relevance. Issue: https://github.com/microsoft/LLMLingua/issues/31 Dependencies: [llmlingua](https://pypi.org/project/llmlingua/) @baskaryan --------- Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-27 19:23:56 -08:00
Nuno Campos	a99eb3abf4	openai[patch]: Assign message id in ChatOpenAI (#17837 )	2024-02-27 17:32:54 -08:00
Isaac Francisco	733367b795	docs: deprecation of OpenAI functions agent, astream_events docstring (#18164 ) Co-authored-by: Hershenson, Isaac (Extern) <isaac.hershenson.extern@bayer04.de> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-27 09:14:53 -08:00
Bagatur	242af4b5a4	openai[patch], mistral[patch], fireworks[patch]: releases 0.0.8, 0.0.5, 0.0.2 (#18186 )	2024-02-27 04:22:24 -08:00
Bagatur	7e66d964c6	core[patch]: Release 0.1.27 (#18159 )	2024-02-26 17:27:38 -08:00
Harrison Chase	d7c607ca00	core[minor]: move document compressor base (#17910 )	2024-02-26 17:20:50 -08:00
Bagatur	b3f4de38ae	mistral[minor]: Function calling and with_structured_output (#18150 ) ![Screenshot 2024-02-26 at 2 07 06 PM](https://github.com/langchain-ai/langchain/assets/22008038/20cacb47-3b24-45b5-871b-dd169f1acd37)	2024-02-26 16:22:30 -08:00
Bagatur	c53aa5cd37	core[patch]: support JS message serial namespaces (#18151 )	2024-02-26 16:19:46 -08:00
Max Jakob	5ab69f907f	partners: add Elasticsearch package (#17467 ) ### Description This PR moves the Elasticsearch classes to a partners package. Note that we will not move (and will later remove) `ElasticKnnSearch`. It was previously deprecated. `ElasticVectorSearch` is going to stay in the community package since it is still used quite a lot. Also note that I left the `ElasticsearchTranslator` for self query untouched because it resides in the main `langchain` package. ### Dependencies There will be another PR that updates the notebooks (potentially pulling them into the partners package) and templates and removes the classes from the community package, see https://github.com/langchain-ai/langchain/pull/17468 #### Open question How to make the transition smooth for users? Do we move the import aliases and require people to install `langchain-elasticsearch`? Or do we remove the import aliases from the `langchain` package altogether? What has worked well for other partner packages? --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-26 23:19:47 +00:00
matt haigh	a4896da2a0	Experimental: Add other threshold types to SemanticChunker (#16807 ) Description Adding different threshold types to the semantic chunker. I’ve had much better and more predictable performance when using standard deviations instead of percentiles. ![image](https://github.com/langchain-ai/langchain/assets/44395485/066e84a8-460e-4da5-9fa1-4ff79a1941c5) For all the documents I’ve tried, the distribution of distances looks similar to the above: a positively skewed normal distribution. All skews I’ve seen are less than 1, which explains why standard deviations perform well, but I’ve included IQR if anyone wants something more robust. Also, using the percentile method backwards, you can declare the number of clusters and use semantic chunking to get an ‘optimal’ splitting. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-26 13:50:48 -08:00
Jaskirat Singh	ce682f5a09	community: vectorstores.kdbai - Added support for when no docs are present (#18103 ) - Description: By default it expects a list, but that's not the case in corner scenarios when no document has been ingested (use case: bootstrapping an application). Hence added a check: if the instance is a pandas DataFrame instead of a list, it returns immediately. - Issue: NA - Dependencies: NA - Twitter handle: jaskiratsingh1 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-26 12:47:06 -08:00
am-kinetica	9b8f6455b1	Langchain vectorstore integration with Kinetica (#18102 ) - Description: New vectorstore integration with the Kinetica database - Issue: - Dependencies: the Kinetica Python API `pip install gpudb==7.2.0.1`, - Tag maintainer: @baskaryan, @hwchase17 - Twitter handle: --------- Co-authored-by: Chad Juliano <cjuliano@kinetica.com>	2024-02-26 12:46:48 -08:00
Bagatur	1e8ab83d7b	langchain[patch], core[patch], openai[patch], fireworks[minor]: ChatFireworks.with_structured_output (#18078 ) <img width="1192" alt="Screenshot 2024-02-24 at 3 39 39 PM" src="https://github.com/langchain-ai/langchain/assets/22008038/1cf74774-a23f-4b06-9b9b-85dfa2f75b63">	2024-02-26 12:46:39 -08:00
GoodBai	3589a135ef	community: make `SET allow_experimental_[engine]_index` configurable in vectorstores.clickhouse (#18107 ) ## Description & Issue While following the official doc to use clickhouse as a vectorstore, I found only the default `annoy` index is properly supported. But I want to try another engine, `usearch`, because `annoy` is not properly supported on ARM platforms. Here are the settings I prefer: ``` python settings = ClickhouseSettings( table="wiki_Ethereum", index_type="usearch", # annoy by default index_param=[], ) ``` The above settings do not work because the command `set allow_experimental_annoy_index=1` is hard-coded. This PR makes sure the experimental feature follows the `index_type`, which is also consistent with Clickhouse's naming conventions.	2024-02-26 12:39:17 -08:00
Dan Stambler	69344a0661	community: Add Laser Embedding Integration (#18111 ) - Description: Added Integration with Meta AI's LASER Language-Agnostic SEntence Representations embedding library, which supports multilingual embedding for any of the languages listed here: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200, including several low resource languages - Dependencies: laser_encoders	2024-02-26 12:16:37 -08:00
Luan Fernandes	e867557936	[docs] Update doc-string for buffer_as_messages method in ConversationBufferWindowMemory (#18136 ) minor fix stated in #18080	2024-02-26 11:46:43 -08:00
Bagatur	767523f364	core[patch], langchain[patch], templates: move openai functions parsers to core (#18060 ) ![Screenshot 2024-02-23 at 7 48 03 PM](https://github.com/langchain-ai/langchain/assets/22008038/e5540c4d-0020-4ece-869f-ae19db2a1f3f)	2024-02-26 11:12:53 -08:00
Nuno Campos	cd3ab3703b	Improve runnable generator error messages (#18142 ) h/t @hinthornw	2024-02-26 18:54:25 +00:00
Nuno Campos	62a30efb12	Fix bug with using configurable_fields after configurable_alternatives (#18139 ) Closes #17915	2024-02-26 10:27:07 -08:00
Nuno Campos	b1d9ce541d	Add BaseMessage.id (#17835 )	2024-02-26 09:27:47 -08:00
Harrison Chase	935aefa8db	add run name for query constructor (#18101 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-26 08:17:05 -08:00
Mohammad Mohtashim	719a1cde75	langchain[patch]: Update doc-string for a method in ConversationBufferWindowMemory (#18090 ) A minor doc fix stated in #18080	2024-02-26 10:15:02 -05:00
Simon Schmidt	2716d58603	langchain: Import from langchain_core in langchain.smith to avoid deprecation warning (#18129 ) Avoids deprecation warning that triggered at import time, e.g. with `python -c 'import langchain.smith'` /opt/venv/lib/python3.12/site-packages/langchain/callbacks/__init__.py:37: LangChainDeprecationWarning: Importing this callback from langchain is deprecated. Importing it from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead: `from langchain_community.callbacks import base`. To install langchain-community run `pip install -U langchain-community`.	2024-02-26 10:14:10 -05:00
Erick Friis	248c5b84ee	google-genai, google-vertexai: move to langchain-google (#17899 ) These packages have moved to https://github.com/langchain-ai/langchain-google Left tombstone readmes in case anyone ends up at the "Source Code" link from old pypi releases. Can keep these around for a few months.	2024-02-25 21:58:05 -08:00
Erick Friis	3b5bdbfee8	anthropic[minor]: package move (#17974 )	2024-02-25 21:57:26 -08:00
Christophe Bornet	a2d5fa7649	community[patch]: Fix GenericRequestsWrapper _aget_resp_content must be async (#18065 ) There are existing tests in `libs/community/tests/unit_tests/tools/requests/test_tool.py`	2024-02-25 19:07:07 -08:00
Neli Hateva	a01e8473f8	community[patch]: Fix GraphSparqlQAChain so that it works with Ontotext GraphDB (#15009 ) - Description: Introduce a new parameter `graph_kwargs` to `RdfGraph` - parameters used to initialize the `rdflib.Graph` if `query_endpoint` is set. Also, do not set `rdflib.graph.DATASET_DEFAULT_GRAPH_ID` as default value for the `rdflib.Graph` `identifier` if `query_endpoint` is set. - Issue: N/A - Dependencies: N/A - Twitter handle: N/A	2024-02-25 19:05:21 -08:00
Christophe Bornet	4d6cd5b46a	astradb[patch]: Use astrapy's upsert_one method in AstraDBStore (#18063 ) As `upsert` is deprecated	2024-02-25 19:04:18 -08:00
Danny McAteer	e42110f720	docs: Additional examples for partners/exa README (#18081 ) Description: Add additional examples for other modules to partners/exa README Issue: #17545 Dependencies: None Twitter handle: @DannyMcAteer8 --------- Co-authored-by: Daniel McAteer <danielmcateer@Daniels-MBP.attlocal.net> Co-authored-by: Daniel McAteer <danielmcateer@Daniels-MacBook-Pro.local>	2024-02-25 18:53:47 -08:00
dokato	5afb242161	langchain[patch]: Make BooleanOutputParser more robust to non-binary responses (#17810 ) - Description: I encountered this error when I tried to use LLMChainFilter. Even if the message slightly differs, like `Not relevant (NO)` this results in an error. It has been reported already here: https://github.com/langchain-ai/langchain/issues/. This change hopefully makes it more robust. - Issue: #11408 - Dependencies: No - Twitter handle: dokatox	2024-02-25 18:48:33 -08:00
kYLe	17ecf6e119	community[patch]: Remove model limitation on Anyscale LLM (#17662 ) Description: Llama Guard is deprecated from the Anyscale public endpoint. Issue: Change the default model and remove the limitation of only using Llama Guard with Anyscale LLMs. Anyscale LLM can also work with all other chat models hosted on Anyscale. Also added `async_client` for Anyscale LLM	2024-02-25 18:21:19 -08:00
Barun Amalkumar Halder	cc69976860	community[minor] : adds callback handler for Fiddler AI (#17708 ) Description: Callback handler to integrate fiddler with langchain. This PR adds the following - 1. `FiddlerCallbackHandler` implementation into langchain/community 2. Example notebook `fiddler.ipynb` for usage documentation [Internal Tracker : FDL-14305] Issue: NA Dependencies: - Installation of langchain-community is unaffected. - Usage of FiddlerCallbackHandler requires installation of latest fiddler-client (2.5+) Twitter handle: @fiddlerlabs @behalder Co-authored-by: Barun Halder <barun@fiddler.ai>	2024-02-25 18:17:03 -08:00
Christophe Bornet	b8b5ce0c8c	astradb: Add AstraDBChatMessageHistory to langchain-astradb package (#17732 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-25 18:14:49 -08:00
Maxime Perrin	c06a8732aa	community[patch]: fix llama index imports and fields access (#17870 ) - Description: Fixing outdated imports after v0.10 llama index update and updating metadata and source text access - Issue: #17860 - Twitter handle: @maximeperrin_ --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>	2024-02-25 18:14:23 -08:00
2jimoo	7fc903464a	community: Add document manager and mongo document manager (#17320 ) - Description: - Add DocumentManager class, which is a nosql record manager. - In order to use index and aindex in libs/langchain/langchain/indexes/_api.py, DocumentManager inherits RecordManager. - Also I added the MongoDB implementation of Document Manager too. - Dependencies: pymongo, motor --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-23 21:32:52 -05:00
Leonid Ganeline	3f6bf852ea	experimental: docstrings update (#18048 ) Added missing docstrings. Formatted docstrings to a consistent format.	2024-02-23 21:24:16 -05:00
kYLe	56b955fc31	community[minor]: Add async_client for Anyscale Chat model (#18050 ) Add `async_client` for Anyscale Chat_model	2024-02-23 21:22:54 -05:00
Eugene Yurtsev	68527b809d	core[patch]: Runnable with message history to use add_messages (#17958 ) This PR updates RunnableWithMessageHistory to use add_messages which will save on round-trips for any chat history abstractions that implement the optimization. If the optimization isn't implemented, add_messages automatically invokes add_message serially.	2024-02-23 21:19:38 -05:00
Bagatur	1c1bb1152e	openai[patch]: refactor with_structured_output (#18052 ) - make schema Optional with default val None, since in json_mode you don't need it if not parsing to pydantic - change return_type -> include_raw - expand docstring examples	2024-02-23 17:02:11 -08:00
Erick Friis	a05fb19f42	openai[patch]: remove numpy dep (#18034 )	2024-02-23 21:12:05 +00:00
Danny McAteer	e8be34f8c7	exa[patch]: update readme (#18047 )	2024-02-23 21:05:42 +00:00
Yufei (Benny) Chen	ee6a773456	fireworks[patch]: Add Fireworks partner packages (#17694 ) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-23 20:45:47 +00:00
Bagatur	22b964f802	community[patch]: Release 0.0.24 (#18038 )	2024-02-23 10:49:29 -08:00
Erick Friis	29e0445490	community[patch]: BaseLLM typing in init (#18029 )	2024-02-23 17:51:27 +00:00
Nicolò Boschi	4c132b4cc6	community: fix openai streaming throws 'AIMessageChunk' object has no attribute 'text' (#18006 ) After upgrading langchain-community to 0.0.22, it's not possible to use openai from the community package with streaming=True ``` File "/home/runner/work/ragstack-ai/ragstack-ai/ragstack-e2e-tests/.tox/langchain/lib/python3.11/site-packages/langchain_community/chat_models/openai.py", line 434, in _generate return generate_from_stream(stream_iter) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/runner/work/ragstack-ai/ragstack-ai/ragstack-e2e-tests/.tox/langchain/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 65, in generate_from_stream for chunk in stream: File "/home/runner/work/ragstack-ai/ragstack-ai/ragstack-e2e-tests/.tox/langchain/lib/python3.11/site-packages/langchain_community/chat_models/openai.py", line 418, in _stream run_manager.on_llm_new_token(chunk.text, chunk=cg_chunk) ^^^^^^^^^^ AttributeError: 'AIMessageChunk' object has no attribute 'text' ``` Fix regression of https://github.com/langchain-ai/langchain/pull/17907 Twitter handle: @nicoloboschi	2024-02-23 12:12:47 -05:00
Bagatur	9b982b2aba	community[patch]: Release 0.0.23 (#18027 )	2024-02-23 08:54:31 -08:00
Guangdong Liu	4197efd67a	community: Fix SparkLLM error (#18015 ) - Description: fix SparkLLM error	2024-02-23 06:40:29 -08:00
Bagatur	d9e6ca2279	langchain[patch]: Release 0.1.9 (#17999 )	2024-02-22 21:45:30 -08:00
Bagatur	b46d6b04e1	community[patch]: Release 0.0.22 (#17994 )	2024-02-22 21:35:04 -08:00
Bagatur	cc0290fdf3	openai[patch]: Release 0.0.7 (#17993 )	2024-02-22 21:33:59 -08:00
Bagatur	e045655657	core[patch]: Release 0.1.26 (#17990 )	2024-02-22 17:12:51 -08:00
Reid Falconer	0534ba5a7d	langchain[patch]: return formatted SPARQL query on demand (#11263 ) - Description: Added the `return_sparql_query` feature to the `GraphSparqlQAChain` class, allowing users to get the formatted SPARQL query along with the chain's result. - Issue: NA - Dependencies: None Note: I've ensured that the PR passes linting and testing by running make format, make lint, and make test locally. I have added a test for the integration (which relies on network access) and I have added an example to the notebook showing its use.	2024-02-22 17:03:26 -08:00
Leo Diegues	b15fccbb99	community[patch]: Skip `OpenAIWhisperParser` extremely small audio chunks to avoid api error (#11450 ) Description This PR addresses a rare issue in `OpenAIWhisperParser` that causes it to crash when processing an audio file with a duration very close to the class's chunk size threshold of 20 minutes. Issue #11449 Dependencies None Tag maintainer @agola11 @eyurtsev Twitter handle leonardodiegues --------- Co-authored-by: Leonardo Diegues <leonardo.diegues@grupofolha.com.br> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-22 17:02:43 -08:00
Jorge Villegas	f6a98032e4	docs: langchain-anthropic README updates (#17684 ) # PR Message - Description: This PR adds a README file for the Anthropic API in the `libs/partners` folder of this repository. The README includes: - A brief description of the Anthropic package - Installation & API instructions - Usage examples - Issue: [17545](https://github.com/langchain-ai/langchain/issues/17545) - Dependencies: None Additional notes: This change only affects the docs package and does not introduce any new dependencies. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-22 16:22:30 -08:00
mackong	9678797625	community[patch]: callback before yield for _stream/_astream (#17907 ) - Description: callback on_llm_new_token before yield chunk for _stream/_astream for some chat models, make all chat models in a consistent behaviour. - Issue: N/A - Dependencies: N/A	2024-02-22 16:15:21 -08:00
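The ordering described in the entry above (notify the callback, then yield the chunk) can be sketched in plain Python. This is a minimal illustration, not the code from the PR: `stream_with_callback` and `record_callback` are hypothetical names, and plain strings stand in for the chat model's chunk objects.

```python
from typing import Callable, Iterable, Iterator, List


def stream_with_callback(
    chunks: Iterable[str], on_new_token: Callable[[str], None]
) -> Iterator[str]:
    """Yield chunks, invoking the callback before each one is handed to the consumer."""
    for chunk in chunks:
        # Callback first, so handlers observe the token even if the
        # consumer stops iterating right after receiving it.
        on_new_token(chunk)
        yield chunk


events: List[str] = []


def record_callback(token: str) -> None:
    # Hypothetical handler that only records the order of events.
    events.append(f"callback:{token}")


for token in stream_with_callback(["a", "b"], record_callback):
    events.append(f"consumed:{token}")
```

With this ordering, each `callback:` entry in `events` precedes the matching `consumed:` entry.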
Chad Juliano	50ba3c68bb	community[minor]: add Kinetica LLM wrapper (#17879 ) Description: Initial pull request for Kinetica LLM wrapper Issue: N/A Dependencies: No new dependencies for unit tests. Integration tests require gpudb, typeguard, and faker Twitter handle: @chad_juliano Note: There is another pull request for Kinetica vectorstore. Ultimately we would like to make a partner package but we are starting with a community contribution.	2024-02-22 16:02:00 -08:00
Erick Friis	ed789be8f4	docs, templates: update schema imports to core (#17885 ) - chat models, messages - documents - agentaction/finish - baseretriever,document - stroutputparser - more messages - basemessage - format_document - baseoutputparser --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-22 15:58:44 -08:00
Leonid Ganeline	971d29e718	docs: robocorpai docstrings (#17968 ) Added missing docstrings --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-02-22 15:55:01 -08:00
Bagatur	b0cfb86c48	langchain[minor]: openai tools structured_output_chain (#17296 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-22 15:42:47 -08:00
Bagatur	b5f8cf9509	core[minor], openai[minor], langchain[patch]: BaseLanguageModel.with_structured_output (#17302 ) ```python class Foo(BaseModel): bar: str structured_llm = ChatOpenAI().with_structured_output(Foo) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-22 15:33:34 -08:00
Bagatur	9b0b0032c2	community[patch]: fix lint (#17984 )	2024-02-22 15:15:27 -08:00
Christophe Bornet	4f88a5130e	langchain[patch]: Support langchain-astradb AstraDBVectorStore in self-query retriever (#17728 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-22 13:19:27 -08:00
David Loving	d068e8ea54	community[patch]: compatibility with SQLAlchemy 1.4.x (#17954 ) Description: Change type hint on `QuerySQLDataBaseTool` to be compatible with SQLAlchemy v1.4.x. Issue: Users locked to `SQLAlchemy < 2.x` are unable to import `QuerySQLDataBaseTool`. closes https://github.com/langchain-ai/langchain/issues/17819 Dependencies: None	2024-02-22 13:17:07 -05:00
Erick Friis	e237dcec91	pinecone[patch]: integration test debug (#17960 )	2024-02-22 09:11:21 -08:00
kartikTAI	9cf6661dc5	community: use NeuralDB object to initialize NeuralDBVectorStore (#17272 ) Description: This PR adds an `__init__` method to the NeuralDBVectorStore class, which takes in a NeuralDB object to instantiate the state of NeuralDBVectorStore. Issue: N/A Dependencies: N/A Twitter handle: N/A	2024-02-22 12:05:01 -05:00
hongbo.mo	a51a257575	langchain_openai[patch]: fix typos in langchain_openai (#17923 ) Just a small typo	2024-02-22 12:03:16 -05:00
Brad Erickson	ecd72d26cf	community: Bugfix - correct Ollama API path to avoid HTTP 307 (#17895 ) Sets the correct /api/generate path, without ending /, to reduce HTTP requests. Reference: https://github.com/ollama/ollama/blob/efe040f8/docs/api.md#generate-request-streaming Before: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate/ HTTP/1.1" 307 0 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None After: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None	2024-02-22 11:59:55 -05:00
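The redirect in the entry above comes from a trailing slash on the request path. One way to avoid that class of problem is to normalize slashes when joining a base URL and a path. The helper below is a sketch under that assumption; `build_endpoint` is a hypothetical name and is not taken from the Ollama integration.

```python
def build_endpoint(base_url: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash and no trailing slash."""
    # Strip slashes on both sides of the join so that neither "//" nor a
    # trailing "/" (which the server may answer with a redirect) can appear.
    return f"{base_url.rstrip('/')}/{path.strip('/')}"


endpoint = build_endpoint("http://localhost:11434/", "/api/generate/")
```

Here `endpoint` is `http://localhost:11434/api/generate`, the form that receives a direct 200 in the log quoted above.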
Erick Friis	a53370a060	pinecone[patch], docs: PineconeVectorStore, release 0.0.3 (#17896 )	2024-02-22 08:24:08 -08:00
Graden Rea	e5e38e89ce	partner: Add groq partner integration and chat model (#17856 ) Description: Add a Groq chat model issue: TODO Dependencies: groq Twitter handle: N/A	2024-02-22 07:36:16 -08:00
Hasan	7248e98b9e	community[patch]: Return PK in similarity search Document (#17561 ) Issue: #17390 Co-authored-by: hasan <hasan@m2sys.com>	2024-02-21 17:03:50 -08:00
Raunak	1ec8199c8e	community[patch]: Added more functions in NetworkxEntityGraph class (#17624 ) - Description: 1. Added add_node(), remove_node(), has_node(), remove_edge(), has_edge() and get_neighbors() functions in NetworkxEntityGraph class. 2. Added the above functions in graph_networkx_qa.ipynb documentation.	2024-02-21 17:02:56 -08:00
Christophe Bornet	0e26b16930	docs: Fix AstraDBVectorStore docstring (#17706 )	2024-02-21 16:53:08 -08:00
Christophe Bornet	3d91be94b1	community[patch]: Add missing async_astra_db_client param to AstraDBChatMessageHistory (#17742 )	2024-02-21 16:46:42 -08:00
Xudong Sun	c524bf31f5	docs: add helpful comments to sparkllm.py (#17774 ) Adding helpful comments to sparkllm.py, help users to use ChatSparkLLM more effectively	2024-02-21 16:42:54 -08:00
Ian	3019a594b7	community[minor]: Add tidb loader support (#17788 ) This pull request supports loading data from a TiDB database with LangChain. A simple usage: ``` from langchain_community.document_loaders import TiDBLoader CONNECTION_STRING = "mysql+pymysql://root@127.0.0.1:4000/test" QUERY = "select id, name, description from items;" loader = TiDBLoader( connection_string=CONNECTION_STRING, query=QUERY, page_content_columns=["name", "description"], metadata_columns=["id"], ) documents = loader.load() print(documents) ```	2024-02-21 16:42:33 -08:00
Christophe Bornet	815ec74298	docs: Add docstring to AstraDBStore (#17793 )	2024-02-21 16:41:47 -08:00
ehude	9e54c227f1	community[patch]: Bug Neo4j VectorStore when having multiple indexes the sort is not working and the store that returned is random (#17396 ) Bug fix: when having multiple indexes the sort is not working and the store that returned is random. The following small fix resolves the issue.	2024-02-21 16:33:33 -08:00
Michael Feil	242981b8f0	community[minor]: infinity embedding local option (#17671 ) drop-in-replacement for sentence-transformers inference. https://github.com/langchain-ai/langchain/discussions/17670 tldr from the discussion above -> around a 4x-22x speedup over using SentenceTransformers / huggingface embeddings. For more info: https://github.com/michaelfeil/infinity (pure-python dependency) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-21 16:33:13 -08:00
Leonid Ganeline	ed0b7c3b72	docs: added `community` modules descriptions (#17827 ) API Reference: Several `community` modules (like [adapter](https://api.python.langchain.com/en/latest/community_api_reference.html#module-langchain_community.adapters) module) are missing descriptions. This happened when langchain was split into the core, langchain and community packages. - Copied module descriptions from other packages - Fixed several descriptions to a consistent format.	2024-02-21 16:18:36 -08:00
Christophe Bornet	5019951a5d	docs: AstraDB VectorStore docstring (#17834 )	2024-02-21 16:16:31 -08:00
Leonid Ganeline	2f2b77602e	docs: modules descriptions (#17844 ) Several `core` modules do not have descriptions, like the [agent](https://api.python.langchain.com/en/latest/core_api_reference.html#module-langchain_core.agents) module. - Added missed module descriptions. The descriptions are mostly copied from the `langchain` or `community` package modules.	2024-02-21 15:58:21 -08:00
Christophe Bornet	f8a3b8e83f	docs: Update langchain-astradb README with AstraDBStore (#17864 )	2024-02-21 15:51:40 -08:00
Rohit Gupta	3acd0c74fc	community[patch]: added SCANN index in default search params (#17889 ) This will enable users to add data in same collection for index type SCANN for milvus	2024-02-21 15:47:47 -08:00
Karim Assi	afc1ba0329	community[patch]: add possibility to search by vector in OpenSearchVectorSearch (#17878 ) - Description: implements the missing `similarity_search_by_vector` function for `OpenSearchVectorSearch` - Issue: N/A - Dependencies: N/A	2024-02-21 15:44:55 -08:00
Nathan Voxland (Activeloop)	9ece134d45	docs: Improved deeplake.py init documentation (#17549 ) Description: Updated documentation for DeepLake init method. Especially the exec_option docs needed improvement, but did a general cleanup while I was looking at it. Issue: n/a Dependencies: None --------- Co-authored-by: Nathan Voxland <nathan@voxland.net>	2024-02-21 15:33:00 -08:00
Zachary Toliver	29ee0496b6	community[patch]: Allow override of 'fetch_schema_from_transport' in the GraphQL tool (#17649 ) - Description: In order to override the bool value of "fetch_schema_from_transport" in the GraphQLAPIWrapper, a "fetch_schema_from_transport" value needed to be added to the "_EXTRA_OPTIONAL_TOOLS" dictionary in load_tools in the "graphql" key. The parameter "fetch_schema_from_transport" must also be passed in to the GraphQLAPIWrapper to allow reading of the value when creating the client. Passing as an optional parameter is probably best to avoid breaking changes. This change is necessary to support GraphQL instances that do not support fetching schema, such as TigerGraph. More info here: [TigerGraph GraphQL Schema Docs](https://docs.tigergraph.com/graphql/current/schema) - Threads handle: @zacharytoliver --------- Co-authored-by: Zachary Toliver <zt10191991@hotmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-21 15:32:43 -08:00
mackong	31891092d8	community[patch]: add missing chunk parameter for _stream/_astream (#17807 ) - Description: Add missing chunk parameter for _stream/_astream for some chat models, make all chat models in a consistent behaviour. - Issue: N/A - Dependencies: N/A	2024-02-21 15:32:28 -08:00
ccurme	1b0802babe	core: fix .bind when used with RunnableLambda async methods (#17739 ) Description: Here is a minimal example to illustrate behavior: ```python from langchain_core.runnables import RunnableLambda def my_function(args, kwargs): return 3 + kwargs.get("n", 0) runnable = RunnableLambda(my_function).bind(n=1) assert 4 == runnable.invoke({}) assert [4] == list(runnable.stream({})) assert 4 == await runnable.ainvoke({}) assert [4] == [item async for item in runnable.astream({})] ``` Here, `runnable.invoke({})` and `runnable.stream({})` work fine, but `runnable.ainvoke({})` raises ``` TypeError: RunnableLambda._ainvoke.<locals>.func() got an unexpected keyword argument 'n' ``` and similarly for `runnable.astream({})`: ``` TypeError: RunnableLambda._atransform.<locals>.func() got an unexpected keyword argument 'n' ``` Here we assume that this behavior is undesired and attempt to fix it. Issue:* https://github.com/langchain-ai/langchain/issues/17241, https://github.com/langchain-ai/langchain/discussions/16446	2024-02-21 15:31:52 -08:00
volodymyr-memsql	0a9a519a39	community[patch]: Added add_images method to SingleStoreDB vector store (#17871 ) In this pull request, we introduce the add_images method to the SingleStoreDB vector store class, expanding its capabilities to handle multi-modal embeddings seamlessly. This method facilitates the incorporation of image data into the vector store by associating each image's URI with corresponding document content, metadata, and either pre-generated embeddings or embeddings computed using the embed_image method of the provided embedding object. The change includes integration tests validating the behavior of add_images. Additionally, we provide a notebook showcasing the usage of this new method. --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2024-02-21 15:16:32 -08:00
Shashank	8381f859b4	community[patch]: Graceful handling of redis errors in RedisCache and AsyncRedisCache (#17171 ) - Description: The existing `RedisCache` implementation lacks proper handling for redis client failures, such as `ConnectionRefusedError`, leading to subsequent failures in pipeline components like LLM calls. This pull request aims to improve error handling for redis client issues, ensuring a more robust and graceful handling of such errors. - Issue: Fixes #16866 - Dependencies: No new dependency - Twitter handle: N/A Co-authored-by: snsten <> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-21 12:15:19 -05:00
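The graceful handling described in the entry above can be pictured as a cache wrapper that treats backend failures as cache misses instead of letting them propagate into the LLM call. The sketch below is an assumption-laden illustration, not the `RedisCache` code: `GracefulCache` and `FailingBackend` are hypothetical classes, and no Redis client is involved.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class FailingBackend:
    """Hypothetical stand-in for a client whose server is unreachable."""

    def get(self, key: str) -> Optional[str]:
        raise ConnectionRefusedError("backend unavailable")

    def set(self, key: str, value: str) -> None:
        raise ConnectionRefusedError("backend unavailable")


class GracefulCache:
    """Cache wrapper that logs backend errors and degrades to a cache miss."""

    def __init__(self, backend) -> None:
        self._backend = backend

    def lookup(self, key: str) -> Optional[str]:
        try:
            return self._backend.get(key)
        except OSError as exc:  # ConnectionRefusedError is a subclass of OSError
            logger.warning("cache lookup failed, treating as miss: %s", exc)
            return None

    def update(self, key: str, value: str) -> None:
        try:
            self._backend.set(key, value)
        except OSError as exc:
            logger.warning("cache update failed, skipping: %s", exc)


cache = GracefulCache(FailingBackend())
result = cache.lookup("prompt")   # a miss rather than an exception
cache.update("prompt", "answer")  # logged and skipped
```

Catching only `OSError` is a design choice in this sketch: programming errors still surface, while connection-level failures are downgraded to warnings.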
Christophe Bornet	e6311d953d	community[patch]: Add AstraDBLoader docstring (#17873 )	2024-02-21 11:41:34 -05:00
nbyrneKX	c1bb5fd498	community[patch]: typo in doc-string for kdbai vectorstore (#17811 )	2024-02-21 10:35:11 -05:00
Christophe Bornet	f59ddcab74	partners/astradb: Use single file instead of module for AstraDBVectorStore (#17644 )	2024-02-20 16:58:56 -08:00
Savvas Mantzouranidis	691ff67096	partners/openai: fix deprecation errors of pydantic's .dict() function (reopen #16629 ) (#17404 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-20 16:57:34 -08:00
Christophe Bornet	bebe401b1a	astradb[patch]: Add AstraDBStore to langchain-astradb package (#17789 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-20 16:54:35 -08:00
Bagatur	4e28888d45	core[patch]: Release 0.1.25 (#17833 )	2024-02-20 16:43:28 -08:00
Erick Friis	f154cd64fe	astradb[patch]: relaxed httpx version constraint (#17826 ) relock to newest sdk	2024-02-20 15:45:25 -08:00
Nuno Campos	223e5eff14	Add JSON representation of runnable graph to serialized representation (#17745 ) Sent to LangSmith	2024-02-20 14:51:09 -08:00
Guangdong Liu	47b1b7092d	community[minor]: Add SparkLLM to community (#17702 )	2024-02-20 11:23:47 -08:00
Guangdong Liu	3ba1cb8650	community[minor]: Add SparkLLM Text Embedding Model and SparkLLM introduction (#17573 )	2024-02-20 11:22:27 -08:00
Virat Singh	92e52e89ca	community: Add PolygonTickerNews Tool (#17808 ) Description: In this PR, I am adding a PolygonTickerNews Tool, which can be used to get the latest news for a given ticker / stock. Twitter handle: [@virattt](https://twitter.com/virattt)	2024-02-20 10:15:29 -08:00
Christophe Bornet	b13e52b6ac	community[patch]: Fix AstraDBCache docstrings (#17802 )	2024-02-20 11:39:30 -05:00
Eugene Yurtsev	865cabff05	Docs: Add custom chat model documentation (#17595 ) This PR adds documentation about how to implement a custom chat model.	2024-02-19 22:03:49 -05:00
Nuno Campos	07ee41d284	Cache calls to create_model for get_input_schema and get_output_schema (#17755 )	2024-02-19 13:26:42 -08:00
Bagatur	5ed16adbde	experimental[patch]: Release 0.0.52 (#17763 )	2024-02-19 13:12:22 -08:00
Bagatur	da7bca2178	langchain[patch]: bump community to 0.0.21 (#17754 )	2024-02-19 12:58:32 -08:00
Bagatur	441448372d	langchain[patch]: Release 0.1.8 (#17751 )	2024-02-19 11:27:37 -08:00
Bagatur	ad285ca15c	community[patch]: Release 0.0.21 (#17750 )	2024-02-19 11:13:33 -08:00
Karim Lalani	ea61302f71	community[patch]: bug fix - add empty metadata when metadata not provided (#17669 ) Code fix to pass an empty metadata dictionary to aadd_texts if metadata is not provided.	2024-02-19 10:54:52 -08:00
CogniJT	919ebcc596	community[minor]: CogniSwitch Agent Toolkit for LangChain (#17312 ) Description: CogniSwitch focusses on making GenAI usage more reliable. It abstracts out the complexity & decision making required for tuning processing, storage & retrieval. Using simple APIs documents / URLs can be processed into a Knowledge Graph that can then be used to answer questions. Dependencies: No dependencies. Just network calls & API key required Tag maintainer: @hwchase17 Twitter handle: https://github.com/CogniSwitch Documentation: Please check `docs/docs/integrations/toolkits/cogniswitch.ipynb` Tests: The usual tool & toolkits tests using `test_imports.py` PR has passed linting and testing before this submission. --------- Co-authored-by: Saicharan Sridhara <145636106+saiCogniswitch@users.noreply.github.com>	2024-02-19 10:54:13 -08:00
Christophe Bornet	6275d8b1bf	docs: Fix AstraDBChatMessageHistory docstrings (#17740 )	2024-02-19 10:47:38 -08:00
Pranav Agarwal	86ae48b781	experimental[minor]: Amazon Personalize support (#17436 ) ## Amazon Personalize support on Langchain This PR is a successor to this PR - https://github.com/langchain-ai/langchain/pull/13216 This PR introduces an integration with [Amazon Personalize](https://aws.amazon.com/personalize/) to help you to retrieve recommendations and use them in your natural language applications. This integration provides two new components: 1. An `AmazonPersonalize` client, that provides a wrapper around the Amazon Personalize API. 2. An `AmazonPersonalizeChain`, that provides a chain to pull in recommendations using the client, and then generating the response in natural language. We have added this to langchain_experimental since there was feedback from the previous PR about having this support in experimental rather than the core or community extensions. Here is some sample code to explain the usage. ```python from langchain_experimental.recommenders import AmazonPersonalize from langchain_experimental.recommenders import AmazonPersonalizeChain from langchain.llms.bedrock import Bedrock recommender_arn = "<insert_arn>" client=AmazonPersonalize( credentials_profile_name="default", region_name="us-west-2", recommender_arn=recommender_arn ) bedrock_llm = Bedrock( model_id="anthropic.claude-v2", region_name="us-west-2" ) chain = AmazonPersonalizeChain.from_llm( llm=bedrock_llm, client=client ) response = chain({'user_id': '1'}) ``` Reviewer: @3coins	2024-02-19 10:36:37 -08:00
Aymeric Roucher	0d294760e7	Community: Fuse HuggingFace Endpoint-related classes into one (#17254 ) ## Description Fuse HuggingFace Endpoint-related classes into one: - [HuggingFaceHub](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_hub.py`) - [HuggingFaceTextGenInference](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_text_gen_inference.py`) - and [HuggingFaceEndpoint](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_endpoint.py`) are fused into - HuggingFaceEndpoint ## Issue The duplication of classes was creating a lack of clarity, and additional effort to develop classes leads to issues like [this hack](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_endpoint.py (L159)`). ## Dependencies None, this removes dependencies. ## Twitter handle If you want to post about this: @AymericRoucher --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-19 10:33:15 -08:00
Bagatur	8009be862e	core[patch]: Release 0.1.24 (#17744 )	2024-02-19 10:27:26 -08:00
Raghav Dixit	6c18f73ca5	community[patch]: LanceDB integration improvements/fixes (#16173 ) Hi, I'm from the LanceDB team. Improves LanceDB integration by making it easier to use - now you aren't required to create tables manually and pass them in the constructor, although that is still backward compatible. Bug fix - pandas was being used even though it's not a dependency for LanceDB or langchain. PS - this issue was raised a few months ago but lost traction. It is a feature improvement for our users; kindly review this. Thanks!	2024-02-19 10:22:02 -08:00
Christophe Bornet	e92e96193f	community[minor]: Add async methods to the AstraDB BaseStore (#16872 ) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-02-19 10:11:49 -08:00
Mohammad Mohtashim	43dc5d3416	community[patch]: OpenLLM Client Fixes + Added Timeout Parameter (#17478 ) - OpenLLM was using outdated method to get the final text output from openllm client invocation which was raising the error. Therefore corrected that. - OpenLLM `_identifying_params` was getting the openllm's client configuration using outdated attributes which was raising error. - Updated the docstring for OpenLLM. - Added timeout parameter to be passed to underlying openllm client.	2024-02-19 10:09:11 -08:00
Leonid Ganeline	1d2aa19aee	docs: Fix bug that caused the word "Beta" to appear twice in doc-strings (#17704 ) The current issue: Several beta descriptions in the API Reference are duplicated. For example: `[Beta] Get a context value.[Beta] Get a context value.` for the [ContextGet class](https://api.python.langchain.com/en/latest/core_api_reference.html#module-langchain_core.beta) description. NOTE: I've tested it only with a new ut! I cannot build API Reference locally :( This PR related to #17615	2024-02-18 21:38:37 -05:00
Guangdong Liu	73edf17b4e	community[minor]: Add Apache Doris as vector store (#17527 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-18 12:05:58 -07:00
Bagatur	a058c8812d	community[patch]: add VoyageEmbeddings truncation (#17638 )	2024-02-18 10:21:21 -07:00
Mohammad Mohtashim	8d4547ae97	[Langchain_community]: Corrected the imports to make them compatible with SQLAlchemy <2.0 (#17653 ) - Small change in imports in sql_database module to make it work with SQLAlchemy <2.0 - This was identified in the following issue: #17616	2024-02-16 16:59:08 -05:00
Christophe Bornet	75465a2a3c	partners/astradb: Add dotenv to langchain-astradb integration tests (#17629 )	2024-02-16 11:48:30 -05:00
Christophe Bornet	19ebc7418e	community: Use _AstraDBCollectionEnvironment in AstraDB VectorStore (community) (#17635 ) Another PR will be done for the langchain-astradb package. Note: for future PRs, devs will be done in the partner package only. This one is just to align with the rest of the components in the community package and it fixes a bunch of issues.	2024-02-16 11:28:16 -05:00
Mateusz Szewczyk	e25b722ea9	watsonx[patch]: Invoke callback prior to yielding token when streaming (#17625 ) Description: Invoke callback prior to yielding token in stream method for watsonx. Issue: https://github.com/langchain-ai/langchain/issues/16913	2024-02-16 09:45:12 -05:00
Nejc Habjan	b4fa847a90	community[minor]: add exclude parameter to DirectoryLoader (#17316 ) - Description: adds an `exclude` parameter to the DirectoryLoader class, based on similar behavior in GenericLoader - Issue: discussed in https://github.com/langchain-ai/langchain/discussions/9059 and I think in some other issues that I cannot find at the moment 🙇 - Dependencies: None - Twitter handle: don't have one sorry! Just https://github/nejch --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-16 09:42:42 -05:00
Bagatur	8f14234afb	infra: ignore flakey lua test (#17618 )	2024-02-16 05:02:58 -07:00
Krista Pratico	bf8e3c6dd1	community[patch]: add fixes for AzureSearch after update to stable azure-search-documents library (#17599 ) - Description: Addresses the bugs described in linked issue where an import was erroneously removed and the rename of a keyword argument was missed when migrating from beta --> stable of the azure-search-documents package - Issue: https://github.com/langchain-ai/langchain/issues/17598 - Dependencies: N/A - Twitter handle: N/A	2024-02-15 22:23:52 -08:00
William FH	64743dea14	core[patch], community[patch], langchain[patch], experimental[patch], robocorp[patch]: bump LangSmith 0.1.* (#17567 )	2024-02-15 23:17:59 -07:00
morgana	9d7ca7df6e	community[patch]: update copy of metadata in rockset vectorstore integration (#17612 ) - Description: This fixes an issue with working with RecordManager. RecordManager was generating new hashes on documents because `add_texts` was modifying the metadata directly. Additionally moved some tests to unit tests since that was a more appropriate home. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_`	2024-02-15 23:13:40 -07:00
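The problem in the entry above is a caller's metadata dictionaries being modified in place, which changes any hash later computed from them. A minimal sketch of the remedy follows, assuming a hypothetical `prepare_records` helper and a hypothetical `"text"` field; it is not the Rockset integration's code. It also falls back to an empty dictionary when no metadata is supplied.

```python
import copy
from typing import Any, Dict, List, Optional


def prepare_records(
    texts: List[str], metadatas: Optional[List[Dict[str, Any]]] = None
) -> List[Dict[str, Any]]:
    """Build storage records without mutating the caller's metadata."""
    records = []
    for i, text in enumerate(texts):
        # Work on a deep copy so that fields added for storage never leak
        # back into the caller's dictionaries; use {} if none were given.
        record = copy.deepcopy(metadatas[i]) if metadatas else {}
        record["text"] = text  # hypothetical storage field
        records.append(record)
    return records


original = [{"source": "a.txt"}]
records = prepare_records(["hello"], original)
```

After the call, `original` still holds only the `source` key, so a hash derived from it stays stable across repeated indexing runs.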
Erick Friis	c8d96f30bd	exa[patch]: fix lint (#17610 )	2024-02-15 20:45:16 -08:00
Erick Friis	8f5c70769d	astradb[patch]: fix core dep 3 (#17617 )	2024-02-15 20:42:30 -08:00
Leonid Ganeline	0835ebad70	docs: Fix bug that caused the word "Deprecated" to appear twice in doc-strings (#17615 ) The current issue: Most of the deprecation descriptions are duplicated. For example: `[Deprecated] Chat Agent.[Deprecated] Chat Agent.` for the [ChatAgent class](https://api.python.langchain.com/en/latest/langchain_api_reference.html#classes) description. NOTE: I've tested it only with new ut! I cannot build API Reference locally :(	2024-02-15 22:52:26 -05:00
Erick Friis	aa31025dd7	astradb[patch]: fix core dep 2 (#17608 )	2024-02-15 16:33:02 -08:00
Erick Friis	cc562e7c58	astradb[patch]: fix core dep (#17606 )	2024-02-15 16:09:38 -08:00
Stefano Lottini	5240ecab99	astradb: bootstrapping Astra DB as Partner Package (#16875 ) Description: This PR introduces a new "Astra DB" Partner Package. So far only the vector store class is _duplicated_ there, all others following once this is validated and established. Along with the move to separate package, incidentally, the class name will change `AstraDB` => `AstraDBVectorStore`. The strategy has been to duplicate the module (with prospected removal from community at LangChain 0.2). Until then, the code will be kept in sync with minimal, known differences (there is a makefile target to automate drift control. Out of convenience with this check, the community package has a class `AstraDBVectorStore` aliased to `AstraDB` at the end of the module). With this PR several bugfixes and improvement come to the vector store, as well as a reshuffling of the doc pages/notebooks (Astra and Cassandra) to align with the move to a separate package. Dependencies: A brand new pyproject.toml in the new package, no changes otherwise. Twitter handle: `@rsprrs` --------- Co-authored-by: Christophe Bornet <cbornet@hotmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-15 15:50:59 -08:00
Erick Friis	6cc6faa00e	ai21: init package (#17592 ) Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: etang <etang@ai21.com> Co-authored-by: asafgardin <147075902+asafgardin@users.noreply.github.com>	2024-02-15 12:25:05 -08:00
Moshe Berchansky	20a56fe0a2	community[minor]: Add QuantizedEmbedders (#17391 ) Description: * adding Quantized embedders using optimum-intel and intel-extension-for-pytorch. * added mdx documentation and example notebooks * added embedding import testing. Dependencies: optimum = {extras = ["neural-compressor"], version = "^1.14.0", optional = true} intel_extension_for_pytorch = {version = "^2.2.0", optional = true} Dependencies have been added to pyproject.toml for the community lib. Twitter handle: @peter_izsak --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-15 11:01:24 -08:00
Amir Karbasi	bccc9241ea	community[patch]: Resolve KuzuQAChain API Changes (#16885 ) - Description: Updates to the Kuzu API had broken this functionality. These updates resolve those issues and add a new test to demonstrate the updates. - Issue: #11874 - Dependencies: No new dependencies - Twitter handle: @amirk08 Test results: ``` tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params PASSED [ 33%] tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params PASSED [ 66%] tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema PASSED [100%] =================================================== slowest 5 durations =================================================== 0.53s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema 0.34s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params 0.28s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params 0.03s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema 0.02s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params ==================================================== 3 passed in 1.27s ==================================================== ```	2024-02-15 10:18:37 -08:00
Rafail Giavrimis	a84a3add25	Community[patch]: Adjusted import to be compatible with SQLAlchemy<2 (#17520 ) - Description: Adjusts an import to directly import `Result` from `sqlalchemy.engine`. - Issue: #17519 - Dependencies: N/A - Twitter handle: @grafail	2024-02-15 11:12:13 -05:00
Zachary Toliver	6746adf363	community[patch]: pass bool value for fetch_schema_from_transport in GraphQLAPIWrapper (#17552 ) - Description: Allow a bool value to be passed to fetch_schema_from_transport since not all GraphQL instances support this feature, such as TigerGraph. - Threads: @zacharytoliver --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-15 09:54:04 -05:00
Christophe Bornet	789cd5198d	community[patch]: Use astrapy built-in pagination prefetch in AstraDBLoader (#17569 )	2024-02-15 09:52:56 -05:00
Christophe Bornet	387cacb881	community[minor]: Add async methods to AstraDBChatMessageHistory (#17572 )	2024-02-15 09:48:42 -05:00
Christophe Bornet	ff1f985a2a	community: Fix some mypy types in cassandra doc loader (#17570 ) Thank you!	2024-02-15 09:45:22 -05:00
Mo Latif	f3e4a0e27f	langchain[patch]: Update Chain prep_inputs docstring (#17575 ) Description: @eyurtsev Following up on #16644 to fix the docstring, because `prep_inputs` is no longer doing any validation.	2024-02-15 09:44:35 -05:00
William FH	fc1617c44f	Update contact link (#17563 )	2024-02-14 22:37:32 -08:00
Christophe Bornet	ca2d4078f3	community: Add async methods to AstraDBCache (#17415 ) Adds async methods to AstraDBCache	2024-02-14 23:10:08 -05:00
Jan Cap	7ae3ce60d2	community[patch]: Fix pwd import that is not available on windows (#17532 ) - Description: Resolves a problem in `langchain_community\document_loaders\pebblo.py` with `import pwd`: `pwd` is not available on Windows, so the import was moved into a try/except block. - Issue: #17514	2024-02-14 13:45:10 -08:00
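The guarded-import pattern described in the commit above can be sketched as follows. This is a minimal illustration, not the code from `pebblo.py`: the helper name `current_username` and the environment-variable fallback are assumptions made for the example.

```python
import os

try:
    import pwd  # POSIX-only module; the import fails on Windows
except ImportError:
    pwd = None


def current_username() -> str:
    """Hypothetical helper: best-effort lookup of the current user name."""
    if pwd is not None:
        try:
            return pwd.getpwuid(os.getuid()).pw_name
        except KeyError:
            pass  # uid has no passwd entry; fall through to the environment
    # Assumed fallback for platforms (or uids) without a passwd entry.
    return os.environ.get("USERNAME") or os.environ.get("USER") or "unknown"
```

Keeping the import inside `try`/`except ImportError` lets the module load on every platform; only the code path that needs `pwd` has to check for it.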
nvpranak	91bcc9c5c9	community[minor]: Nemo embeddings(#16206 ) This PR is adding support for NVIDIA NeMo embeddings issue #16095. --------- Co-authored-by: Praveen Nakshatrala <pnakshatrala@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 13:25:42 -08:00
Mattt394	7c6009b76f	experimental[patch]: Fixed typos in SmartLLMChain ideation and critique prompts (#11507 ) Noticed and fixed a few typos in the SmartLLMChain default ideation and critique prompts --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-14 13:20:10 -08:00
Erick Friis	86d3e42853	core[minor]: add name to basemessage (#17539 ) Adds an optional name param to our base message to support passing names into LLMs. OpenAI supports having a name on anything except tool message now (system, ai, user/human).	2024-02-14 12:21:59 -08:00
Mateusz Szewczyk	916332ef5b	ibm: added partners package `langchain_ibm`, added llm (#16512 ) - Description: Added `langchain_ibm` as a LangChain partners package for the IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider (`WatsonxLLM`) - Dependencies: [ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/), - Tag maintainer: : --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-14 12:12:19 -08:00
Shawn	f6d3a3546f	community[patch]: document_loaders: modified athena key logic to handle s3 uris without a prefix (#17526 ) https://github.com/langchain-ai/langchain/issues/17525 ### Example Code ```python from langchain_community.document_loaders.athena import AthenaLoader database_name = "database" s3_output_path = "s3://bucket-no-prefix" query="""SELECT CAST(extract(hour FROM current_timestamp) AS INTEGER) AS current_hour, CAST(extract(minute FROM current_timestamp) AS INTEGER) AS current_minute, CAST(extract(second FROM current_timestamp) AS INTEGER) AS current_second; """ profile_name = "AdministratorAccess" loader = AthenaLoader( query=query, database=database_name, s3_output_uri=s3_output_path, profile_name=profile_name, ) documents = loader.load() print(documents) ``` ### Error Message and Stack Trace (if applicable) NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist ### Description Athena Loader errors when result s3 bucket uri has no prefix. The Loader instance call results in a "NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist." error. If s3_output_path contains a prefix like: ```python s3_output_path = "s3://bucket-with-prefix/prefix" ``` Execution works without an error. 
## Suggested solution Modify: ```python key = "/".join(tokens[1:]) + "/" + query_execution_id + ".csv" ``` to ```python key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv" ``` `9e8a3fc4ff/libs/community/langchain_community/document_loaders/athena.py (L128)` ### System Info System Information ------------------ > OS: Darwin > OS Version: Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT 2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103 > Python Version: 3.9.9 (main, Jan 9 2023, 11:42:03) [Clang 14.0.0 (clang-1400.0.29.102)] Package Information ------------------- > langchain_core: 0.1.23 > langchain: 0.1.7 > langchain_community: 0.0.20 > langsmith: 0.0.87 > langchain_openai: 0.0.6 > langchainhub: 0.1.14 Packages not installed (Not Necessarily a Problem) -------------------------------------------------- The following packages were not found: > langgraph > langserve --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:48:31 -08:00
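The key construction suggested in the commit above can be checked with a small standalone sketch. The function name `athena_result_key` and the way `tokens` is derived from the URI are assumptions for illustration; only the final `key = ...` expression comes from the commit message.

```python
def athena_result_key(s3_output_uri: str, query_execution_id: str) -> str:
    """Hypothetical helper: build the S3 object key of an Athena result CSV."""
    # Assumption: tokens are the bucket name followed by any prefix segments.
    path = s3_output_uri[len("s3://"):] if s3_output_uri.startswith("s3://") else s3_output_uri
    tokens = path.rstrip("/").split("/")
    # Expression from the commit message: add "/" only when a prefix exists.
    return "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"


no_prefix_key = athena_result_key("s3://bucket-no-prefix", "abc123")
with_prefix_key = athena_result_key("s3://bucket-with-prefix/prefix", "abc123")
```

With the unconditional `+ "/"` of the original expression, the no-prefix case would produce a key starting with `/`, which matches the `NoSuchKey` symptom described in the report.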
wulixuan	c776cfc599	community[minor]: integrate with model Yuan2.0 (#15411 ) 1. integrate with [`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) 2. update `langchain.llms` 3. add a new doc for [Yuan2.0 integration](docs/docs/integrations/llms/yuan2.ipynb) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:46:20 -08:00
Philippe PRADOS	d07db457fc	community[patch]: Fix SQLAlchemyMd5Cache race condition (#16279 ) If the SQLAlchemyMd5Cache is shared among multiple processes, it is possible to encounter a race condition during the cache update. Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-14 11:45:28 -08:00
Alex Peplowski	70c296ae96	community[patch]: Expose Anthropic Retry Logic (#17069 ) Description: Expose Anthropic's retry logic, so that `max_retries` can be configured via langchain. Anthropic's retry logic is implemented in their Python SDK here: https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#retries --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:44:28 -08:00
DanisJiang	de9a6cdf16	experimental[patch]: Enhance protection against arbitrary code execution in PALChain (#17091 ) - Description: Block some ways to trigger arbitrary code execution bug in PALChain. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-14 11:44:07 -08:00
Lyndsey	8562a1e7d4	community[patch]: support query filters for NotionDBLoader (#17217 ) - Description: Support filtering databases in the use case where devs do not want to query ALL entries within a DB, - Issue: N/A, - Dependencies: N/A, - Twitter handle: I don't have Twitter but feel free to tag my Github! --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-14 11:43:41 -08:00
volodymyr-memsql	e36bc379f2	community[patch]: Add vector index support to SingleStoreDB VectorStore (#17308 ) This pull request introduces support for various Approximate Nearest Neighbor (ANN) vector index algorithms in the VectorStore class, starting from version 8.5 of SingleStore DB. Leveraging this enhancement enables users to harness the power of vector indexing, significantly boosting search speed, particularly when handling large sets of vectors. --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:43:12 -08:00
Kate Silverstein	0bc4a9b3fc	community[minor]: Adds Llamafile as an LLM (#17431 ) * Description: Adds a simple LLM implementation for interacting with [llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models. * Dependencies: N/A * Issue: N/A Detail [llamafile](https://github.com/Mozilla-Ocho/llamafile) lets you run LLMs locally from a single file on most computers without installing any dependencies. To use the llamafile LLM implementation, the user needs to: 1. Download a llamafile e.g. https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile?download=true 2. Make the file executable. 3. Run the llamafile in 'server mode'. (All llamafiles come packaged with a lightweight server; by default, the server listens at `http://localhost:8080`.) ```bash wget https://url/of/model.llamafile chmod +x model.llamafile ./model.llamafile --server --nobrowser ``` Now, the user can invoke the LLM via the LangChain client: ```python from langchain_community.llms.llamafile import Llamafile llm = Llamafile() llm.invoke("Tell me a joke.") ```	2024-02-14 11:15:24 -08:00
Rakib Hosen	5ce1827d31	community[patch]: fix import in language parser (#17538 ) - Description: Resolving import error in language_parser.py during `from langchain.langchain.text_splitter import Language` - Issue: #17536 - Dependencies: NO - Twitter handle: @iRakibHosen --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:11:23 -08:00
Raunak	685d62b032	community[patch]: Added functions in NetworkxEntityGraph class (#17535 ) - Description: 1. Added _clear_edges()_ and _get_number_of_nodes()_ functions to the NetworkxEntityGraph class. 2. Documented the two functions above in graph_networkx_qa.ipynb.	2024-02-14 11:02:24 -08:00
Erick Friis	bfaa8c3048	anthropic[patch]: de-beta anthropic messages, release 0.0.2 (#17540 )	2024-02-14 10:31:45 -08:00
Erick Friis	a99c667c22	partners: version constraints (#17492 ) Core should be ^0.1 by default Careful about 0.x.y and 0.0.z packages	2024-02-14 08:57:46 -08:00
Erick Friis	d7418acbe1	nomic[patch]: release 0.0.2, dimensionality (#17534 ) - nomic[patch]: release 0.0.2 - x	2024-02-14 08:38:07 -08:00
shibuiwilliam	c502736841	infra: add test for ensemble retriever to ensure multiple retrievers (#8401 ) Add tests to ensemble retriever to ensure it works with combination of multiple retrievers --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-13 21:22:03 -08:00
Qihui Xie	5738143d4b	add mongodb_store (#13801 ) # Add MongoDB storage - Description: Add MongoDB Storage as an option for large doc store. Example usage: ```Python # Instantiate the MongodbStore with a MongoDB connection from langchain.storage import MongodbStore mongo_conn_str = "mongodb://localhost:27017/" mongodb_store = MongodbStore(mongo_conn_str, db_name="test-db", collection_name="test-collection") # Set values for keys doc1 = Document(page_content='test1') doc2 = Document(page_content='test2') mongodb_store.mset([("key1", doc1), ("key2", doc2)]) # Get values for keys values = mongodb_store.mget(["key1", "key2"]) # [doc1, doc2] # Iterate over keys for key in mongodb_store.yield_keys(): print(key) # Delete keys mongodb_store.mdelete(["key1", "key2"]) ``` - Dependencies: Use `mongomock` for integration test. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-13 22:33:22 -05:00
Mo Latif	50b48a8e6a	langchain[patch]: Invoke chain prep_inputs and prep_outputs inside try block to catch validation errors (#16644 ) - Description: Callback manager can't catch chain input or output validation errors because `prepare_input` and `prepare_output` are not part of the try/raise logic, this PR fixes that logic. - Issue: #15954	2024-02-13 22:23:11 -05:00
Christophe Bornet	a8f530bc4d	Add async methods to CacheBackedEmbeddings (#16873 ) Adds async methods to CacheBackedEmbeddings	2024-02-13 22:16:27 -05:00
Bagatur	50de7a31f0	langchain[patch]: structured output chain nits (#17291 )	2024-02-13 16:45:29 -08:00
Nat Noordanus	8a3b74fe1f	community[patch]: Fix pydantic ForwardRef error in BedrockBase (#17416 ) - Description: Fixes a type annotation issue in the definition of BedrockBase. This issue was that the annotation for the `config` attribute includes a ForwardRef to `botocore.client.Config` which is only imported when `TYPE_CHECKING`. This can cause pydantic to raise an error like `pydantic.errors.ConfigError: field "config" not yet prepared so type is still a ForwardRef, ...`. - Issue: N/A - Dependencies: N/A - Twitter handle: `@__nat_n__`	2024-02-13 16:15:55 -08:00
Ashley Xu	f746a73e26	Add the BQ job usage tracking from LangChain (#17123 ) - Description: Add the BQ job usage tracking from LangChain --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-13 14:47:57 -08:00
JongRok BAEK	8d6cc90fc5	langchain.core : Use shallow copy for schema manipulation in JsonOutputParser.get_format_instructions (#17162 ) - Description : Fix: Use shallow copy for schema manipulation in get_format_instructions Prevents side effects on the original schema object by using a dictionary comprehension for a safer and more controlled manipulation of schema key-value pairs, enhancing code reliability. - Issue: #17161 - Dependencies: None - Twitter handle: None	2024-02-13 13:30:53 -08:00
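The idea in the commit above, building a reduced copy of a schema with a dictionary comprehension instead of mutating the original, can be sketched like this. The function name `reduced_schema` and the choice of dropped keys (`title`, `type`) are assumptions for illustration, not a quotation of `JsonOutputParser`.

```python
from typing import Any, Dict


def reduced_schema(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return a shallow copy without top-level metadata keys."""
    # A comprehension creates a new dict, so the caller's schema is untouched.
    return {k: v for k, v in schema.items() if k not in ("title", "type")}


schema = {
    "title": "Joke",
    "type": "object",
    "properties": {"setup": {"type": "string"}},
}
reduced = reduced_schema(schema)
```

Because the copy is shallow, nested values such as `properties` are shared between the two dicts; that is enough when only top-level keys are dropped.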
Bagatur	b5d3416563	experimental[patch]: Release 0.0.51 (#17484 )	2024-02-13 13:14:38 -08:00
Bagatur	de7c4b277c	langchain[patch]: Release 0.1.7 (#17482 )	2024-02-13 13:13:04 -08:00
Bagatur	39342d98d6	community[patch]: Release 0.0.20 (#17480 )	2024-02-13 13:01:51 -08:00
Bagatur	89b765ec27	core[patch]: Release 0.1.23 (#17479 )	2024-02-13 12:55:45 -08:00
Max Jakob	ab3d944667	community[patch]: ElasticsearchStore: preserve user headers (#16830 ) Users can provide an Elasticsearch connection with custom headers. This PR makes sure these headers are preserved when adding the langchain user agent header.	2024-02-13 12:37:35 -08:00
Erick Friis	9eb1b56e73	pinecone[patch]: release 0.0.2 (#17477 )	2024-02-13 12:01:45 -08:00
Erick Friis	37678471c4	openai[patch]: relax tiktoken constraint, release 0.0.6 (#17472 )	2024-02-13 11:25:55 -08:00
Wendy H. Chun	2df7387c91	langchain[patch]: Fix to avoid infinite loop during collapse chain in map reduce (#16253 ) - Description: Depending on `token_max` used in `load_summarize_chain`, it could cause an infinite loop when documents cannot collapse under `token_max`. This change would not affect the existing feature, but it also gives an option to users to avoid the situation. - Issue: https://github.com/langchain-ai/langchain/issues/16251 - Dependencies: None - Twitter handle: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-13 10:55:32 -08:00
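The failure mode in the commit above is a loop that keeps collapsing documents while they still exceed `token_max`, even when collapsing no longer shrinks them. A minimal sketch of a bounded version follows; every name here (`collapse_until_fits`, `max_retries`, the callback parameters) is hypothetical and not taken from the LangChain code.

```python
from typing import Callable, List, Optional


def collapse_until_fits(
    docs: List[str],
    length_fn: Callable[[List[str]], int],
    collapse_fn: Callable[[List[str]], List[str]],
    token_max: int,
    max_retries: Optional[int] = None,
) -> List[str]:
    """Collapse docs until they fit under token_max, optionally bounded."""
    retries = 0
    while length_fn(docs) > token_max:
        if max_retries is not None and retries >= max_retries:
            raise ValueError(
                f"Could not collapse documents under {token_max} tokens "
                f"after {max_retries} attempts."
            )
        docs = collapse_fn(docs)
        retries += 1
    return docs
```

Leaving `max_retries` as `None` keeps the unbounded behavior, so existing callers are unaffected unless they opt in to the limit.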
wulixuan	5d06797905	community[minor]: integrate chat models with Yuan2.0 (#16575 ) 1. integrate chat models with [`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) 2. add a new doc for [Yuan2.0 integration](docs/docs/integrations/llms/yuan2.ipynb) Yuan2.0 is a new generation Fundamental Large Language Model developed by IEIT System. We have published all three models, Yuan 2.0-102B, Yuan 2.0-51B, and Yuan 2.0-2B. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-13 10:55:14 -08:00
Taha Khabouss	15baffc484	langchain[patch]: Ensure that the Elasticsearch Query Translator functions accurately w… (#17044 ) Description: Addresses a problem where the Date type within an Elasticsearch SelfQueryRetriever would encounter difficulties in generating a valid query. Issue: #17042 --------- Co-authored-by: Max Jakob <max.jakob@elastic.co> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-13 10:54:24 -08:00
Erick Friis	e5c76f9dbd	pinecone[patch]: poetry update (#17471 )	2024-02-13 10:32:29 -08:00
Erick Friis	10bdf2422c	pinecone[patch]: release 0.0.2rc0, remove simsimd dep (#17469 )	2024-02-13 10:02:16 -08:00
Erick Friis	065cde69b1	google-genai[patch]: release 0.0.9, safety settings docs (#17432 )	2024-02-13 10:01:25 -08:00
Sergey Kozlov	db6f266d97	core: improve None value processing in merge_dicts() (#17462 ) - Description: fix `None` and `0` merging in `merge_dicts()`, add tests. ```python from langchain_core.utils._merge import merge_dicts assert merge_dicts({"a": None}, {"a": 0}) == {"a": 0} ``` --------- Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>	2024-02-13 08:48:02 -08:00
Ian Gregory	e5472b5eb8	Framework for supporting more languages in LanguageParser (#13318 ) ## Description I am submitting this for a school project as part of a team of 5. Other team members are @LeilaChr, @maazh10, @Megabear137, @jelalalamy. This PR also has contributions from community members @Harrolee and @Mario928. Initial context is in the issue we opened (#11229). This pull request adds: - Generic framework for expanding the languages that `LanguageParser` can handle, using the [tree-sitter](https://github.com/tree-sitter/py-tree-sitter#py-tree-sitter) parsing library and existing language-specific parsers written for it - Support for the following additional languages in `LanguageParser`: - C - C++ - C# - Go - Java (contributed by @Mario928 https://github.com/ThatsJustCheesy/langchain/pull/2) - Kotlin - Lua - Perl - Ruby - Rust - Scala - TypeScript (contributed by @Harrolee https://github.com/ThatsJustCheesy/langchain/pull/1) Here is the [design document](https://docs.google.com/document/d/17dB14cKCWAaiTeSeBtxHpoVPGKrsPye8W0o_WClz2kk) if curious, but no need to read it. ## Issues - Closes #11229 - Closes #10996 - Closes #8405 ## Dependencies `tree_sitter` and `tree_sitter_languages` on PyPI. We have tried to add these as optional dependencies. ## Documentation We have updated the list of supported languages, and also added a section to `source_code.ipynb` detailing how to add support for additional languages using our framework. ## Maintainer - @hwchase17 (previously reviewed https://github.com/langchain-ai/langchain/pull/6486) Thanks!! ## Git commits We will gladly squash any/all of our commits (esp merge commits) if necessary. Let us know if this is desirable, or if you will be squash-merging anyway. --------- Co-authored-by: Maaz Hashmi <mhashmi373@gmail.com> Co-authored-by: LeilaChr <87657694+LeilaChr@users.noreply.github.com> Co-authored-by: Jeremy La <jeremylai511@gmail.com> Co-authored-by: Megabear137 <zubair.alnoor27@gmail.com> Co-authored-by: Lee Harrold <lhharrold@sep.com> Co-authored-by: Mario928 <88029051+Mario928@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-13 08:45:49 -08:00
Bagatur	3925071dd6	langchain[patch], templates[patch]: fix multi query retriever, web re… (#17434 ) …search retriever Fixes #17352	2024-02-12 22:52:07 -08:00
Bagatur	c0ce93236a	experimental[patch]: fix zero-shot pandas agent (#17442 )	2024-02-12 21:58:35 -08:00
Abhishek Jain	37e1275f9e	community[patch]: Fixed the 'aembed' method of 'CohereEmbeddings'. (#16497 ) Description: - The existing code was trying to find a `.embeddings` property on the `Coroutine` returned by calling `cohere.async_client.embed`. - Instead, the `.embeddings` property is present on the value returned by the `Coroutine`. - Also, it seems that the original cohere client expects a value of `max_retries` to not be `None`. Hence, setting the default value of `max_retries` to `3`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 21:57:27 -08:00
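The bug described in the commit above is a general asyncio pitfall: an attribute is read from the coroutine object instead of from the value it resolves to. The sketch below uses stand-in classes (`FakeAsyncClient`, `FakeEmbedResponse`) that are assumptions for illustration and not the Cohere SDK.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class FakeEmbedResponse:
    """Stand-in for a response object exposing an `embeddings` attribute."""
    embeddings: List[List[float]]


class FakeAsyncClient:
    """Stand-in async client; `embed` returns a coroutine, as async methods do."""

    async def embed(self, texts: List[str]) -> FakeEmbedResponse:
        return FakeEmbedResponse(embeddings=[[float(len(t))] for t in texts])


async def aembed(client: FakeAsyncClient, texts: List[str]) -> List[List[float]]:
    # Await first, then read the attribute from the resolved response.
    # Writing `client.embed(texts).embeddings` would fail, because the
    # coroutine object itself has no `embeddings` attribute.
    response = await client.embed(texts)
    return response.embeddings


vectors = asyncio.run(aembed(FakeAsyncClient(), ["ab", "abc"]))
```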
Sridhar Ramaswamy	9f1cbbc6ed	community[minor]: Add pebblo safe document loader (#16862 ) - Description: Pebblo opensource project enables developers to safely load data to their Gen AI apps. It identifies semantic topics and entities found in the loaded data and summarizes them in a developer-friendly report. - Dependencies: none - Twitter handle: srics @hwchase17	2024-02-12 21:56:12 -08:00
mhavey	1bbb64d956	community[minor], langchian[minor]: Add Neptune Rdf graph and chain (#16650 ) Description: This PR adds a chain for Amazon Neptune graph database RDF format. It complements the existing Neptune Cypher chain. The PR also includes a Neptune RDF graph class to connect to, introspect, and query a Neptune RDF graph database from the chain. A sample notebook is provided under docs that demonstrates the overall effect: invoking the chain to make natural language queries against Neptune using an LLM. Issue: This is a new feature Dependencies: The RDF graph class depends on the AWS boto3 library if using IAM authentication to connect to the Neptune database. --------- Co-authored-by: Piyush Jain <piyushjain@duck.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 21:30:20 -08:00
Michael Feil	e1cfd0f3e7	community[patch]: infinity embeddings update incorrect default url (#16759 ) The default URL has always been incorrect (7797 instead of 7997). This updates it to the correct URL.	2024-02-12 20:05:08 -08:00
Massimiliano Pronesti	df7cbd6fbb	community[minor]: add FlashRank ranker (#16785 ) Description: This PR adds support for [flashrank](https://github.com/PrithivirajDamodaran/FlashRank) for reranking as alternative to Cohere. I'm not sure `libs/langchain` is the right place for this change. At first, I wanted to put it under `libs/community`. All the compressors were under `libs/langchain/retrievers/document_compressors` though. Hope this makes sense!	2024-02-12 20:00:52 -08:00
Andreas Motl	1fdd9bd980	community/SQLDatabase: Generalize and trim software tests (#16659 ) - Description: Improve test cases for `SQLDatabase` adapter component, see [suggestion](https://github.com/langchain-ai/langchain/pull/16655#pullrequestreview-1846749474). - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev _Remark: This PR is stacked upon GH-16655, so that one will need to go in first._ Edit: Thank you for bringing in GH-17191, @eyurtsev. This is a little aftermath, improving/streamlining the corresponding test cases.	2024-02-12 22:58:34 -05:00
Theo / Taeyoon Kang	1987f905ed	core[patch]: Support .yml extension for YAML (#16783 ) - Description: [AS-IS] A YAML file must have the .yaml extension. [TO-BE] On operating systems without extension-length constraints the YAML extension is .yaml, but the .yml extension should still be handled. Rejecting .yml is like raising an error for a .jpg extension in JPEG support. - Issue: - - Dependencies: no dependencies required for this change,	2024-02-12 19:57:20 -08:00
Kapil Sachdeva	cd00a87db7	community[patch] - in FAISS vector store, support passing custom DocStore implementation when using from_xxx methods (#16801 ) - Description: The from_xxx methods of the FAISS class hardcode the InMemoryStore implementation and therefore do not let users pass a custom DocStore implementation, - Issue: no referenced issue, - Dependencies: none, - Twitter handle: ksachdeva	2024-02-12 19:51:55 -08:00
Chris	f9f5626ca4	community[patch]: Fix github search issues and PRs PaginatedList has no len() error (#16806 ) Description: Bugfix: Langchain_community's GitHub Api wrapper throws a TypeError when searching for issues and/or PRs (the `search_issues_and_prs` method). This is because PyGithub's PaginatedList type does not support the len() method. See https://github.com/PyGithub/PyGithub/issues/1476 ![image](https://github.com/langchain-ai/langchain/assets/8849021/57390b11-ed41-4f48-ba50-f3028610789c) Dependencies: None Twitter handle: @ChrisKeoghNZ I haven't registered an issue as it would take me longer to fill the template out than to make the fix, but I'm happy to if that's deemed essential. I've added a simple integration test to cover this as there were no existing unit tests and it was going to be tricky to set them up. Co-authored-by: Chris Keogh <chris.keogh@xero.com>	2024-02-12 19:50:59 -08:00
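The error class in the commit above arises whenever `len()` is called on a lazily paginated result that defines iteration but not `__len__`. The sketch below reproduces that with a hypothetical stand-in class (`LazyPages`) and shows one generic way to take the first few results without needing a length. It is not the fix applied in the PR and does not use PyGithub.

```python
from itertools import islice
from typing import Iterable, Iterator, List


class LazyPages:
    """Hypothetical stand-in for a paginated result: iterable, but no __len__."""

    def __init__(self, items: Iterable[int]) -> None:
        self._items = list(items)

    def __iter__(self) -> Iterator[int]:
        return iter(self._items)


def first_n(results: Iterable[int], n: int) -> List[int]:
    """Take at most n items from any iterable without calling len()."""
    return list(islice(results, n))


try:
    len(LazyPages(range(10)))  # type: ignore[arg-type]
    len_supported = True
except TypeError:
    len_supported = False

top_three = first_n(LazyPages(range(10)), 3)
```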
morgana	722aae4fd1	community: add delete method to rocksetdb vectorstore to support recordmanager (#17030 ) - Description: This adds a delete method so that rocksetdb can be used with `RecordManager`. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_` --------- Co-authored-by: Rockset API Bot <admin@rockset.io>	2024-02-12 19:50:20 -08:00
yin1991	c454dc36fc	community[proxy]: Enhancement/add proxy support playwrighturlloader 16751 (#16822 ) - Description: Enhancement/add proxy support playwrighturlloader 16751 - Issue: [Enhancement: Add Proxy Support to PlaywrightURLLoader Class](https://github.com/langchain-ai/langchain/issues/16751) - Dependencies: - Twitter handle: @ootR77013489 --------- Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 19:48:29 -08:00
Bhupesh Varshney	e3b775e035	infra: make `.gitignore` consistent with standard python gitignore (#16828 ) - The new .gitignore version is inherited from the one maintained by the github community over at https://github.com/github/gitignore/blob/main/Python.gitignore - This should cover all the cases of how a langchain app can be used.	2024-02-12 19:43:41 -08:00
James Braza	64938ae6f2	infra: unit testing `check_package_version` (#16825 ) Wrote a unit test for `check_package_version` in the core package. Note that this is a revival of https://github.com/langchain-ai/langchain/pull/16387 after GitHub incident (see https://github.com/langchain-ai/langchain/discussions/16796).	2024-02-12 19:39:58 -08:00
Lingzhen Chen	30af711c34	community[patch]: update AzureSearch class to work with azure-search-documents=11.4.0 (#15659 ) - Description: Updates `libs/community/langchain_community/vectorstores/azuresearch.py` to support the stable version `azure-search-documents=11.4.0` - Issue: https://github.com/langchain-ai/langchain/issues/14534, https://github.com/langchain-ai/langchain/issues/15039, https://github.com/langchain-ai/langchain/issues/15355 - Dependencies: azure-search-documents>=11.4.0 --------- Co-authored-by: Clément Tamines <Skar0@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 19:23:35 -08:00
Robby	e135dc70c3	community[patch]: Invoke callback prior to yielding token (#17348 ) Description: Invoke callback prior to yielding token in stream method for Ollama. Issue: [Callback for on_llm_new_token should be invoked before the token is yielded by the model #16913](https://github.com/langchain-ai/langchain/issues/16913) Co-authored-by: Robby <h0rv@users.noreply.github.com>	2024-02-12 19:22:55 -08:00
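The ordering described in the commit above (and in the matching watsonx and google-genai commits) can be sketched with a plain generator. The names `stream_tokens` and `on_new_token` are hypothetical; the point is only that the callback runs before `yield` hands the token to the consumer.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def stream_tokens(
    tokens: Iterable[str], on_new_token: Callable[[str], None]
) -> Iterator[str]:
    """Yield tokens, invoking the callback before each token is yielded."""
    for token in tokens:
        on_new_token(token)  # the callback sees the token first
        yield token          # only then does the consumer receive it


events: List[Tuple[str, str]] = []
for tok in stream_tokens(["a", "b"], lambda t: events.append(("callback", t))):
    events.append(("consumer", tok))
```

If the two lines in the loop body were swapped, the callback for a token would fire only when the consumer asks for the next one, so a consumer that stops early would never trigger the callback for the last token it received.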
Christophe Bornet	ab025507bc	community[patch]: Add async methods to VectorStoreQATool (#16949 )	2024-02-12 19:19:50 -08:00
Christophe Bornet	fb7552bfcf	Add async methods to InMemoryCache (#17425 ) Add async methods to InMemoryCache	2024-02-12 22:02:38 -05:00
Eugene Yurtsev	93472ee9e6	core[patch]: Replace memory stream implementation used by LogStreamCallbackHandler (#17185 ) This PR replaces the memory stream implementation used by the LogStreamCallbackHandler. This implementation resolves an issue in which streamed logs and streamed events originating from sync code would arrive only after the entire sync code would finish execution (rather than arriving in real time as they're generated). One example is if trying to stream tokens from an llm within a tool. If the tool was an async tool, but the llm was invoked via stream (sync variant) rather than astream (async variant), then the tokens would fail to stream in real time and would all arrive bunched up after the tool invocation completed.	2024-02-12 21:57:38 -05:00
yin1991	37ef6ac113	community[patch]: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval (#16934 ) - Description: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval - Issue: [#16864](https://github.com/langchain-ai/langchain/issues/16864) --------- Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 18:30:36 -08:00
Bagatur	22638e5927	community[patch]: give reranker default client val (#17289 )	2024-02-12 17:21:53 -08:00
Robby	ece4b43a81	community[patch]: doc loaders mypy fixes (#17368 ) Description: Fixed `type: ignore`'s for mypy for some document_loaders. Issue: [Remove "type: ignore" comments #17048 ](https://github.com/langchain-ai/langchain/issues/17048) --------- Co-authored-by: Robby <h0rv@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-12 16:51:06 -08:00
Robby	0653aa469a	community[patch]: Invoke callback prior to yielding token (#17346 ) Description: Invoke callback prior to yielding token in stream method for watsonx. Issue: [Callback for on_llm_new_token should be invoked before the token is yielded by the model #16913](https://github.com/langchain-ai/langchain/issues/16913) Co-authored-by: Robby <h0rv@users.noreply.github.com>	2024-02-12 16:36:33 -08:00
Bagatur	f7e453971d	community[patch]: remove print (#17435 )	2024-02-12 15:21:38 -08:00
Spencer Kelly	54fa78c887	community[patch]: fixed vector similarity filtering (#16967 ) Description: changed filtering so that failed filter doesn't add document to results. Currently filtering is entirely broken and all documents are returned whether or not they pass the filter. fixes issue introduced in https://github.com/langchain-ai/langchain/pull/16190	2024-02-12 14:52:57 -08:00
Aditya	a23c719c8b	google-genai[minor]: add safety settings (#16836 ) - Description:Expose safety_settings for Gemini integrations on google-generativeai - Issue:NA, - Dependencies:NA - Twitter handle:@aditya_rane @lkuligin for review --------- Co-authored-by: adityarane@google.com <adityarane@google.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-12 13:44:24 -08:00
Abhijeeth Padarthi	584b647b96	community[minor]: AWS Athena Document Loader (#15625 ) - Description: Adds the document loader for [AWS Athena](https://aws.amazon.com/athena/), a serverless and interactive analytics service. - Dependencies: Added boto3 as a dependency	2024-02-12 12:53:40 -08:00
david-tempelmann	93da18b667	community[minor]: Add mmr and similarity_score_threshold retrieval to DatabricksVectorSearch (#16829 ) - Description: This PR adds support for `search_type="mmr"` and `search_type="similarity_score_threshold"` to retrievers using `DatabricksVectorSearch`, - Issue: - Dependencies: - Twitter handle: --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 12:51:37 -08:00
Erick Friis	42648061ad	openai[patch]: code cleaning (#17355 ) h/t @tdene for finding cleanup op in #17047	2024-02-12 12:36:12 -08:00
Massimiliano Pronesti	3894b4d9a5	community: add gpt-4-turbo and gpt-4-0125 costs (#17349 ) Ref: https://openai.com/pricing	2024-02-11 21:24:24 -08:00
Tomaz Bratanic	19a1c9183d	Improve graph cypher qa prompt (#17380 ) Unlike vector results, the LLM has to completely trust the context of a graph database result, even if it doesn't provide whole context. We tried with instructions, but it seems that adding a single example is the way to go to solve this issue.	2024-02-11 21:15:46 -08:00
Sandeep Banerjee	183daa6e6f	google-genai[patch]: on_llm_new_token fix (#16924 ) ### This pull request makes the following changes: * Fixed issue #16913 Fixed the google gen ai chat_models.py code to make sure that the callback is called before the token is yielded --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-09 18:00:24 -08:00
Bagatur	10c10f2dea	cli[patch]: integration template nits (#14691 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-09 17:59:34 -08:00
Erick Friis	99540d3d75	infra: no print in newer partner packages (#17353 )	2024-02-09 16:40:02 -08:00
William FH	7c03cc5ed4	Support serialization when inputs/outputs contain generators (#17338 ) Pydantic's `dict()` function raises an error here if you pass in a generator. We have a more robust serialization function in langsmith that we will use instead.	2024-02-09 16:24:54 -08:00
Erick Friis	3a2eb6e12b	infra: add print rule to ruff (#16221 ) Added noqa for existing prints. Can slowly remove / will prevent more being intro'd	2024-02-09 16:13:30 -08:00
Jael Gu	c07c0da01a	community[patch]: Fix Milvus add texts when ids=None (#17021 ) - Description: Fix Milvus add texts when ids=None (auto_id=True) Signed-off-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-09 18:48:37 -05:00
Quang Hoa	54c1fb3f25	community[patch]: Make some functions work with Milvus (#10695 ) Description Make some functions work with Milvus: 1. get_ids: Get primary keys by field in the metadata 2. delete: Delete one or more entities by ids 3. upsert: Update/Insert one or more entities Issue None Dependencies None Tag maintainer: @hwchase17 Twitter handle: None --------- Co-authored-by: HoaNQ9 <hoanq.1811@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-09 15:21:31 -08:00
kYLe	c9999557bf	community[patch]: Modify LLMs/Anyscale work with OpenAI API v1 (#14206 ) - Description: 1. Modify LLMs/Anyscale to work with OAI v1 2. Get rid of openai_ prefixed variables in Chat_model/ChatAnyscale 3. Modify `anyscale_api_base` to `anyscale_base_url` to follow OAI name convention (reverted) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-09 15:11:18 -08:00
Charlie Marsh	24c0bab57b	infra, multiple: Upgrade configuration for Ruff v0.2.0 (#16905 ) ## Summary This PR upgrades LangChain's Ruff configuration in preparation for Ruff's v0.2.0 release. (The changes are compatible with Ruff v0.1.5, which LangChain uses today.) Specifically, we're now warning when linter-only options are specified under `[tool.ruff]` instead of `[tool.ruff.lint]`. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-09 14:28:02 -08:00
Bagatur	01409add5a	google-vertexai[patch]: rm deps (#17077 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-09 14:12:10 -08:00
Erick Friis	1c2facf88d	nvidia-ai-endpoints[patch]: release 0.0.3 (#17345 )	2024-02-09 13:55:01 -08:00
Vadim Kudlay	5f9ac6986e	nvidia-ai-endpoints[patch]: model arguments (e.g. temperature) on construction bug (#17290 ) - Issue: Issue with model argument support (been there for a while actually): - Non-specially-handled arguments like temperature don't work when passed through constructor. - Such arguments DO work quite well with `bind`, but also do not abide by field requirements. - Since initial push, server-side error messages have gotten better and v0.0.2 raises better exceptions. So maybe it's better to let server-side handle such issues? - Description: - Removed ChatNVIDIA's argument fields in favor of `model_kwargs`/`model_kws` arguments which aggregates constructor kwargs (from constructor pathway) and merges them with call kwargs (bind pathway). - Shuffled a few functions from `_NVIDIAClient` to `ChatNVIDIA` to streamline construction for future integrations. - Minor/Optional: Old services didn't have stop support, so client-side stopping was implemented. Now do both. - Any Breaking Changes: Minor breaking changes if you strongly rely on chat_model.temperature, etc. This is captured by chat_model.model_kwargs. PR passes tests and example notebooks and example testing. Still gonna chat with some people, so leaving as draft for now. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-09 13:46:02 -08:00
Leonid Ganeline	932c52c333	community[patch]: docstrings (#16810 ) - added missing docstrings - formatted docstrings to a consistent form	2024-02-09 12:48:57 -08:00
Leonid Ganeline	ae66bcbc10	core[patch]: docstring update (#16813 ) - added missing docstrings - formatted docstrings to a consistent form	2024-02-09 12:47:41 -08:00
Eugene Yurtsev	e10030e241	core[patch]: Add unit test to cover different streaming format for json parsing (#17063 ) Add unit test to cover this issue: https://github.com/langchain-ai/langchain/issues/16423 which was resolved by this PR: https://github.com/langchain-ai/langchain/pull/16670/files --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-09 11:28:55 -05:00
Kononov Pavel	15bc201967	langchain_community: Fix typo bug (#17324 ) Problem from #17095 This error wasn't in the v1.4.0	2024-02-09 11:27:33 -05:00
Erick Friis	e660a1685b	google-genai[patch]: release 0.0.8 (#17285 )	2024-02-08 19:39:44 -08:00
Erick Friis	febf9540b9	google-genai[patch]: fix tool format, use protos (#17284 )	2024-02-08 19:36:49 -08:00
German Martin	1032faba5f	langchain_google_genai : Add missing _identifying_params property. (#17224 ) Description: Missing _identifying_params create issues when dealing with callbacks to get current run model parameters. All other model partners implementation provide this property and also provide _default_params. I'm not sure about the default values to include or if we can re-use the same as for _VertexAICommon(), this change allows you to access the model parameters correctly. Issue: Not exactly this issue but could be related https://github.com/langchain-ai/langchain/issues/14711 Twitter handle:@musicaoriginal2	2024-02-08 17:40:21 -08:00
Erick Friis	e4da7918f3	google-genai[patch]: fix streaming, function calling (#17268 )	2024-02-08 17:29:53 -08:00
Ruben Hakopian	96b5711a0c	google-vertexai[patch]: Fixed SafetySettings handling in streaming API in VertexAI (#17278 ) The streaming API doesn't separate safety_settings from the generation_config payload. As a result, the following error is observed when using the `stream` API. The functionality is correct with the `invoke` API. The fix separates the `safety_settings` from params and sets it as an argument to the `send_message` method. ``` ERROR: Unknown field for GenerationConfig: safety_settings Traceback (most recent call last): File "/Users/user/Library/Caches/pypoetry/virtualenvs/chatbot-worker-main-Ju-qIM-X-py3.12/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 250, in stream raise e File "/Users/user/Library/Caches/pypoetry/virtualenvs/chatbot-worker-main-Ju-qIM-X-py3.12/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 234, in stream for chunk in self._stream( File "/Users/user/Library/Caches/pypoetry/virtualenvs/chatbot-worker-main-Ju-qIM-X-py3.12/lib/python3.12/site-packages/langchain_google_vertexai/chat_models.py", line 501, in _stream for response in responses: File "/Users/user/Library/Caches/pypoetry/virtualenvs/chatbot-worker-main-Ju-qIM-X-py3.12/lib/python3.12/site-packages/vertexai/generative_models/_generative_models.py", line 921, in _send_message_streaming for chunk in stream: File "/Users/user/Library/Caches/pypoetry/virtualenvs/chatbot-worker-main-Ju-qIM-X-py3.12/lib/python3.12/site-packages/vertexai/generative_models/_generative_models.py", line 514, in _generate_content_streaming request = self._prepare_request( ^^^^^^^^^^^^^^^^^^^^^^ File "/Users/user/Library/Caches/pypoetry/virtualenvs/chatbot-worker-main-Ju-qIM-X-py3.12/lib/python3.12/site-packages/vertexai/generative_models/_generative_models.py", line 256, in _prepare_request gapic_generation_config = gapic_content_types.GenerationConfig( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File 
"/Users/user/Library/Caches/pypoetry/virtualenvs/chatbot-worker-main-Ju-qIM-X-py3.12/lib/python3.12/site-packages/proto/message.py", line 576, in __init__ raise ValueError( ValueError: Unknown field for GenerationConfig: safety_settings ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-08 17:25:28 -08:00
Bagatur	65e97c9b53	infra: mv SQLDatabase tests to community (#17276 )	2024-02-08 17:05:43 -08:00
Bagatur	72c7af0bc0	langchain[patch]: undo redis cache import (#17275 )	2024-02-08 16:39:55 -08:00
Bagatur	8bad4157ad	langchain[patch]: Release 0.1.6 (#17133 )	2024-02-08 16:25:06 -08:00
Bagatur	7fa4dc593f	core[patch]: Release 0.1.22 (#17274 )	2024-02-08 16:13:33 -08:00
Bagatur	02ef9164b5	langchain[patch]: expose cohere rerank score, add parent doc param (#16887 )	2024-02-08 16:07:18 -08:00
Bagatur	35c1bf339d	infra: rm boto3, gcaip from pyproject (#17270 )	2024-02-08 15:28:22 -08:00
Alex	de5e96b5f9	community[patch]: updated openai prices in mapping (#17009 ) - Description: OpenAI updated its prices in January (see the [blog](https://openai.com/blog/new-embedding-models-and-api-updates)); the [pricing](https://openai.com/pricing) page on their website has also been updated - Issue: N/A --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-08 14:43:44 -08:00
Mohammad Mohtashim	e35c7fa3b2	[Langchain_core]: Added Docstring for RunnableConfigurableAlternatives (#17263 ) I noticed that RunnableConfigurableAlternatives, which is an important composition in LCEL, has no docstring, so I added a detailed docstring for it. @baskaryan, @eyurtsev, @hwchase17 please have a look and let me know if the docstring looks good. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-08 17:05:33 -05:00
Armin Stepanyan	641efcf41c	community: add runtime kwargs to HuggingFacePipeline (#17005 ) This PR enables changing the behaviour of huggingface pipeline between different calls. For example, before this PR there's no way of changing maximum generation length between different invocations of the chain. This is desirable in cases, such as when we want to scale the maximum output size depending on a dynamic prompt size. Usage example: ```python from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline model_id = "gpt2" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained(model_id) pipe = pipeline("text-generation", model=model, tokenizer=tokenizer) hf = HuggingFacePipeline(pipeline=pipe) hf("Say foo:", pipeline_kwargs={"max_new_tokens": 42}) ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-08 13:58:31 -08:00
Scott Nath	a32798abd7	community: Add you.com utility, update you retriever integration docs (#17014 ) - Description: changes to you.com files - general cleanup - adds community/utilities/you.py, moving bulk of code from retriever -> utility - removes `snippet` as endpoint - adds `news` as endpoint - adds more tests <s>Description: update community MAKE file - adds `integration_tests` - adds `coverage`</s> - Issue: [For New Contributors: Update Integration Documentation](https://github.com/langchain-ai/langchain/issues/15664#issuecomment-1920099868) - Dependencies: n/a - Twitter handle: @scottnath - Mastodon handle: scottnath@mastodon.social --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-08 13:47:50 -08:00
joelsprunger	3984f6604f	langchain: adds recursive json splitter (#17144 ) - Description: This adds a recursive json splitter class to the existing text_splitters as well as unit tests - Issue: splitting text from structured data can cause issues if you have a large nested json object and you split it as regular text you may end up losing the structure of the json. To mitigate against this you can split the nested json into large chunks and overlap them, but this causes unnecessary text processing and there will still be times where the nested json is so big that the chunks get separated from the parent keys. As an example you wouldn't want the following to be split in half: ```shell {'val0': 'DFWeNdWhapbR', 'val1': {'val10': 'QdJo', 'val11': 'FWSDVFHClW', 'val12': 'bkVnXMMlTiQh', 'val13': 'tdDMKRrOY', 'val14': 'zybPALvL', 'val15': 'JMzGMNH', 'val16': {'val160': 'qLuLKusFw', 'val161': 'DGuotLh', 'val162': 'KztlcSBropT', -----------------------------------------------------------------------split----- 'val163': 'YlHHDrN', 'val164': 'CtzsxlGBZKf', 'val165': 'bXzhcrWLmBFp', 'val166': 'zZAqC', 'val167': 'ZtyWno', 'val168': 'nQQZRsLnaBhb', 'val169': 'gSpMbJwA'}, 'val17': 'JhgiyF', 'val18': 'aJaqjUSFFrI', 'val19': 'glqNSvoyxdg'}} ``` Any llm processing the second chunk of text may not have the context of val1, and val16 reducing accuracy. Embeddings will also lack this context and this makes retrieval less accurate. Instead you want it to be split into chunks that retain the json structure. 
```shell {'val0': 'DFWeNdWhapbR', 'val1': {'val10': 'QdJo', 'val11': 'FWSDVFHClW', 'val12': 'bkVnXMMlTiQh', 'val13': 'tdDMKRrOY', 'val14': 'zybPALvL', 'val15': 'JMzGMNH', 'val16': {'val160': 'qLuLKusFw', 'val161': 'DGuotLh', 'val162': 'KztlcSBropT', 'val163': 'YlHHDrN', 'val164': 'CtzsxlGBZKf'}}} ``` and ```shell {'val1':{'val16':{ 'val165': 'bXzhcrWLmBFp', 'val166': 'zZAqC', 'val167': 'ZtyWno', 'val168': 'nQQZRsLnaBhb', 'val169': 'gSpMbJwA'}, 'val17': 'JhgiyF', 'val18': 'aJaqjUSFFrI', 'val19': 'glqNSvoyxdg'}} ``` This recursive json text splitter does this. Values that contain a list can be converted to dict first by using split(... convert_lists=True) otherwise long lists will not be split and you may end up with chunks larger than the max chunk. In my testing large json objects could be split into small chunks with ✅ Increased question answering accuracy ✅ The ability to split into smaller chunks meant retrieval queries can use fewer tokens - Dependencies: json import added to text_splitter.py, and random added to the unit test - Twitter handle: @joelsprunger --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-08 13:45:34 -08:00
Leonid Kuligin	1862900078	google-genai[patch]: added parsing of function call / response (#17245 )	2024-02-08 13:34:46 -08:00
Cailin Wang	a210a8bc53	langchain[patch]: Fix create_retriever_tool missing on_retriever_end Document content (#16933 ) - Description: In create_retriever_tool create_tool, fix create_retriever_tool's missing Document content for on_retriever_end, caused by create_retriever_tool's missing callbacks parameter, - Twitter handle: @CailinWang_ --------- Co-authored-by: root <root@Bluedot-AI> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-08 13:18:43 -08:00
Sparsh Jain	a2167614b7	google-genai[patch]: Invoke callback prior to yielding token (#17092 ) - Description: Invoke callback prior to yielding token in stream and astream methods for Google-genai, - Issue: the issue # 16913, - Twitter handle: Sparsh10649446 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-08 13:13:46 -08:00
Liang Zhang	7306600e2f	community[patch]: Support SerDe transform functions in Databricks LLM (#16752 ) Description: Databricks LLM does not support SerDe of the transform_input_fn and transform_output_fn. After saving and loading, the LLM will be broken. This PR serializes these functions into a hex string using pickle and saves the hex string in the yaml file. Using pickle to serialize a function can be flaky, but this is a simple workaround that unblocks many use cases. If more sophisticated SerDe is needed, we can improve it later. Test: Added a simple unit test. I did a manual test on Databricks and it works well. The saved yaml looks like: ``` llm: _type: databricks cluster_driver_port: null cluster_id: null databricks_uri: databricks endpoint_name: databricks-mixtral-8x7b-instruct extra_params: {} host: e2-dogfood.staging.cloud.databricks.com max_tokens: null model_kwargs: null n: 1 stop: null task: null temperature: 0.0 transform_input_fn: 80049520000000000000008c085f5f6d61696e5f5f948c0f7472616e73666f726d5f696e7075749493942e transform_output_fn: null ``` @baskaryan ```python from langchain_community.embeddings import DatabricksEmbeddings from langchain_community.llms import Databricks from langchain.chains import RetrievalQA from langchain.document_loaders import TextLoader from langchain.text_splitter import CharacterTextSplitter from langchain.vectorstores import FAISS import mlflow embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en") def transform_input(**request): request["messages"] = [ { "role": "user", "content": request["prompt"] } ] del request["prompt"] return request llm = Databricks(endpoint_name="databricks-mixtral-8x7b-instruct", transform_input_fn=transform_input) persist_dir = "faiss_databricks_embedding" # Create the vector db, persist the db to a local fs folder loader = TextLoader("state_of_the_union.txt") documents = loader.load() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0) docs = 
text_splitter.split_documents(documents) db = FAISS.from_documents(docs, embeddings) db.save_local(persist_dir) def load_retriever(persist_directory): embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en") vectorstore = FAISS.load_local(persist_directory, embeddings) return vectorstore.as_retriever() retriever = load_retriever(persist_dir) retrievalQA = RetrievalQA.from_llm(llm=llm, retriever=retriever) with mlflow.start_run() as run: logged_model = mlflow.langchain.log_model( retrievalQA, artifact_path="retrieval_qa", loader_fn=load_retriever, persist_dir=persist_dir, ) # Load the retrievalQA chain loaded_model = mlflow.pyfunc.load_model(logged_model.model_uri) print(loaded_model.predict([{"query": "What did the president say about Ketanji Brown Jackson"}])) ```	2024-02-08 13:09:50 -08:00
cjpark-data	ce22e10c4b	community[patch]: Fix KeyError 'embedding' (MongoDBAtlasVectorSearch) (#17178 ) - Description: The embedding field name was hard-coded as "embedding", so I suggest changing `res["embedding"]` to `res[self._embedding_key]`. - Issue: #17177, - Twitter handle: [@bagcheoljun17](https://twitter.com/bagcheoljun17)	2024-02-08 12:06:42 -08:00
Neli Hateva	9bb5157a3d	langchain[patch], community[patch]: Fixes in the Ontotext GraphDB Graph and QA Chain (#17239 ) - Description: Fixes in the Ontotext GraphDB Graph and QA Chain related to the error handling in case of invalid SPARQL queries, for which `prepareQuery` doesn't throw an exception, but the server returns 400 and the query is indeed invalid - Issue: N/A - Dependencies: N/A - Twitter handle: @OntotextGraphDB	2024-02-08 12:05:43 -08:00
ByeongUk Choi	b88329e9a5	community[patch]: Implement Unique ID Enforcement in FAISS (#17244 ) Description: Implemented unique ID validation in the FAISS component to ensure all document IDs are distinct. This update resolves issues related to non-unique IDs, such as inconsistent behavior during deletion processes.	2024-02-08 12:03:33 -08:00
Bagatur	852973d616	langchain[minor], core[minor]: update json, pydantic parser. add openai-json structured output runnable (#16914 )	2024-02-08 11:59:06 -08:00
hsuyuming	e22c4d4eb0	google-vertexai[patch]: fix _parse_response_candidate issue (#16647 ) Description: enable _parse_response_candidate to support complex structure format. Issue: currently, if Gemini response complex args format, people will get "TypeError: Object of type RepeatedComposite is not JSON serializable" error from _parse_response_candidate. response candidate example ``` content { role: "model" parts { function_call { name: "Information" args { fields { key: "people" value { list_value { values { string_value: "Joe is 30, his mom is Martha" } } } } } } } } finish_reason: STOP safety_ratings { category: HARM_CATEGORY_HARASSMENT probability: NEGLIGIBLE } safety_ratings { category: HARM_CATEGORY_HATE_SPEECH probability: NEGLIGIBLE } safety_ratings { category: HARM_CATEGORY_SEXUALLY_EXPLICIT probability: NEGLIGIBLE } safety_ratings { category: HARM_CATEGORY_DANGEROUS_CONTENT probability: NEGLIGIBLE } ``` error msg: ``` Traceback (most recent call last): File "/home/jupyter/user/abehsu/gemini_langchain_tools/example2.py", line 36, in <module> print(tagging_chain.invoke({"input": "Joe is 30, his mom is Martha"})) File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2053, in invoke input = step.invoke( File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3887, in invoke return self.bound.invoke( File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 165, in invoke self.generate_prompt( File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 543, in generate_prompt return self.generate(prompt_messages, stop=stop, callbacks=callbacks, kwargs) File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 407, in generate raise e File 
"/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 397, in generate self._generate_with_cache( File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 576, in _generate_with_cache return self._generate( File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_google_vertexai/chat_models.py", line 406, in _generate generations = [ File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_google_vertexai/chat_models.py", line 408, in <listcomp> message=_parse_response_candidate(c), File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/site-packages/langchain_google_vertexai/chat_models.py", line 280, in _parse_response_candidate function_call["arguments"] = json.dumps( File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/__init__.py", line 231, in dumps return _default_encoder.encode(obj) File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/encoder.py", line 199, in encode chunks = self.iterencode(o, _one_shot=True) File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/encoder.py", line 257, in iterencode return _iterencode(o, 0) File "/opt/conda/envs/gemini_langchain_tools/lib/python3.10/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type RepeatedComposite is not JSON serializable ``` Twitter handle:** @abehsu1992626	2024-02-08 11:48:25 -08:00
Erick Friis	d77bb7b4e9	google-vertexai[patch]: integration test fix, release 0.0.5 (#17258 )	2024-02-08 11:45:33 -08:00
Aditya	98176ac982	langchain_google_vertexai : added logic to override get_num_tokens_from_messages() for ChatVertexAI (#16784 ) - Description: added logic to override get_num_tokens_from_messages() for ChatVertexAI. Previously ChatVertexAI inherited get_num_tokens_from_messages() from BaseChatModel, which in turn called the GPT-2 tokenizer - Issue: NA - Dependencies: NA - Twitter handle:@aditya_rane @lkuligin for review --------- Co-authored-by: adityarane@google.com <adityarane@google.com> Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>	2024-02-08 11:30:42 -08:00
Bassem Yacoube	4e3ed7f043	community[patch]: octoai embeddings bug fix (#17216 ) fixes a bug in the octoai_embeddings provider	2024-02-07 22:25:52 -05:00
Eugene Yurtsev	780e84ae79	community[minor]: SQLDatabase Add fetch mode `cursor`, query parameters, query by selectable, expose execution options, and documentation (#17191 ) - Description: Improve `SQLDatabase` adapter component to promote code re-use, see [suggestion](https://github.com/langchain-ai/langchain/pull/16246#pullrequestreview-1846590962). - Needed by: GH-16246 - Addressed to: @baskaryan, @cbornet ## Details - Add `cursor` fetch mode - Accept SQL query parameters - Accept both `str` and SQLAlchemy selectables as query expression - Expose `execution_options` - Documentation page (notebook) about `SQLDatabase` [^1] See [About SQLDatabase](https://github.com/langchain-ai/langchain/blob/c1c7b763/docs/docs/integrations/tools/sql_database.ipynb). [^1]: Apparently there hasn't been any yet? --------- Co-authored-by: Andreas Motl <andreas.motl@crate.io>	2024-02-07 22:23:43 -05:00
Tomaz Bratanic	7e4b676d53	community[patch]: Better error propagation for neo4jgraph (#17190 ) There are other errors that could happen when refreshing the schema, so we want to propagate specific errors for more clarity	2024-02-07 22:16:14 -05:00
Luiz Ferreira	34d2daffb3	community[patch]: Fix chat openai unit test (#17124 ) - Description: Actually the test named `test_openai_apredict` isn't testing the apredict method from ChatOpenAI. - Twitter handle: https://twitter.com/OAlmofadas	2024-02-07 22:08:26 -05:00
Dmitry Kankalovich	f92738a6f6	langchain[minor], community[minor], core[minor]: Async Cache support and AsyncRedisCache (#15817 ) * This PR adds async methods to the LLM cache. * Adds an implementation using Redis called AsyncRedisCache. * Adds a docker compose file at the /docker to help spin up docker * Updates redis tests to use a context manager so flushing always happens by default	2024-02-07 22:06:09 -05:00
Erick Friis	4153837502	google-genai[patch]: release 0.0.7 (#17193 )	2024-02-07 17:15:09 -08:00
Erick Friis	927ab77d6e	google-genai[patch]: no error for FunctionMessage (#17215 ) Both should eventually match this: https://github.com/langchain-ai/langchain/blob/master/libs/partners/google-vertexai/langchain_google_vertexai/chat_models.py#L179 But seems undocumented / can't find types in genai package	2024-02-07 17:14:50 -08:00
Erick Friis	2ecf318218	google-genai[patch]: match function call interface (#17213 ) should match vertex	2024-02-07 17:07:31 -08:00
Erick Friis	e17173c403	google-vertexai[patch]: function calling integration test (#17209 )	2024-02-07 15:49:56 -08:00
Erick Friis	52be84a603	google-vertexai[patch]: serializable citation metadata, release 0.0.4 (#17145 ) was breaking in langserve before	2024-02-07 15:47:32 -08:00
Nuno Campos	19ff81e74f	Fix stream events/log with some kinds of non addable output (#17205 )	2024-02-07 15:46:13 -08:00
Bagatur	6f1403b9b6	community[patch]: Release 0.0.19 (#17207 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 15:37:01 -08:00
Erick Friis	a13dc47a08	cli[patch]: copyright 2024 default (#17204 )	2024-02-07 14:52:37 -08:00
Bagatur	00757567ba	core[patch]: Release 0.1.21 (#17202 )	2024-02-07 14:20:20 -08:00
Bagatur	af74301ab9	core[patch], community[patch]: link extraction continue on failure (#17200 )	2024-02-07 14:15:30 -08:00
Henry	2281f00198	langchain: Standardize `output_parser.py` across all agent types for custom `FORMAT_INSTRUCTIONS` (#17168 ) - Description: This PR standardizes the `output_parser.py` file across all agent types to ensure a uniform parsing mechanism is implemented. It introduces a cohesive structure and common interface for output parsing, facilitating easier modifications and extensions by users. The standardized approach enhances maintainability and scalability of the codebase by providing a consistent pattern for output parsing, which can be easily understood and utilized across different agent types. This PR builds upon the foundation set by a previously merged PR, which focused exclusively on standardizing the `output_parser.py` for the `conversational_agent` ([PR #16945](https://github.com/langchain-ai/langchain/pull/16945)). With this new update, I extend the standardization efforts to encompass `output_parser.py` files across all agent types. This enhancement not only unifies the parsing mechanism across the board but also introduces the flexibility for users to incorporate custom `FORMAT_INSTRUCTIONS`. - Issue: https://github.com/langchain-ai/langchain/issues/10721 https://github.com/langchain-ai/langchain/issues/4044 - Dependencies: No new dependencies required for this change - Twitter handle: My GitHub username is enough. Thanks, I hope you accept my PR.	2024-02-07 13:46:17 -08:00
Bagatur	78409634fe	core[patch]: Release 0.1.20 (#17194 )	2024-02-07 12:28:05 -08:00
Nuno Campos	65798289a4	core[minor]: Use batched tracing in sdk (#16305 ) Remove threadpool executor usage in langchain tracer, this is now handled by sdk	2024-02-07 12:10:58 -08:00
chyroc	f87b38a559	google-genai[minor]: support functions call (#15146 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-07 12:09:30 -08:00
Tomaz Bratanic	302989a2b1	allow optional newline in the action responses of JSON Agent parser (#17186 ) Based on my experiments, the newline isn't always there, so we can make the regex slightly more robust by allowing an optional newline after the backticks	2024-02-07 10:26:14 -08:00
William FH	9fa07076da	Add trace_as_chain_group metadata (#17187 )	2024-02-07 09:42:44 -08:00

3423 Commits