langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Petteri Johansson	6c1989d292	community[minor], langchain[minor], docs: Gremlin Graph Store and QA Chain (#17683 ) - Description: New feature: Gremlin graph-store and QA chain (including docs). Compatible with Azure CosmosDB. - Dependencies: no changes	2024-03-01 12:21:14 -08:00
Ather Fawaz	a5ccf5d33c	community[minor]: Add support for Perplexity chat model(#17024 ) - Description: This PR adds support for [Perplexity AI APIs](https://blog.perplexity.ai/blog/introducing-pplx-api). - Issues: None - Dependencies: None - Twitter handle: [@atherfawaz](https://twitter.com/AtherFawaz) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-03-01 12:19:23 -08:00
Rodrigo Nogueira	3438d2cbcc	community[minor]: add maritalk chat (#17675 ) Description: Adds the MariTalk chat that is based on a LLM specially trained for Portuguese. Twitter handle: @MaritacaAI	2024-03-01 12:18:23 -08:00
sarahberenji	08fa38d56d	community[patch]: the syntax error for Redis generated query (#17717 ) To fix the reported error: https://github.com/langchain-ai/langchain/discussions/17397	2024-03-01 12:18:10 -08:00
certified-dodo	43e3244573	community[patch]: Fix MongoDBAtlasVectorSearch max_marginal_relevance_search (#17971 ) Description: * `self._embedding_key` is accessed after deletion, breaking `max_marginal_relevance_search` search * Introduced in: `e135e5257c` * Updated but still persists in: `ce22e10c4b` Issue: https://github.com/langchain-ai/langchain/issues/17963 Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 12:17:42 -08:00
Nikita Titov	9f2ab37162	community[patch]: don't try to parse json in case of errored response (#18317 ) Related issue: #13896. In case Ollama is behind a proxy, proxy error responses cannot be viewed. You aren't even able to check response code. For example, if your Ollama has basic access authentication and it's not passed, `JSONDecodeError` will overwrite the truth response error. <details> <summary><b>Log now:</b></summary> ``` { "name": "JSONDecodeError", "message": "Expecting value: line 1 column 1 (char 0)", "stack": "--------------------------------------------------------------------------- JSONDecodeError Traceback (most recent call last) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:971, in Response.json(self, kwargs) 970 try: --> 971 return complexjson.loads(self.text, kwargs) 972 except JSONDecodeError as e: 973 # Catch JSON-related errors and raise as requests.JSONDecodeError 974 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError File /opt/miniforge3/envs/.gpt/lib/python3.10/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, kw) 343 if (cls is None and object_hook is None and 344 parse_int is None and parse_float is None and 345 parse_constant is None and object_pairs_hook is None and not kw): --> 346 return _default_decoder.decode(s) 347 if cls is None: File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:337, in JSONDecoder.decode(self, s, _w) 333 \"\"\"Return the Python representation of ``s`` (a ``str`` instance 334 containing a JSON document). 
335 336 \"\"\" --> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end()) 338 end = _w(s, end).end() File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx) 354 except StopIteration as err: --> 355 raise JSONDecodeError(\"Expecting value\", s, err.value) from None 356 return obj, end JSONDecodeError: Expecting value: line 1 column 1 (char 0) During handling of the above exception, another exception occurred: JSONDecodeError Traceback (most recent call last) Cell In[3], line 1 ----> 1 print(translate_func().invoke('text')) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config) 2051 try: 2052 for i, step in enumerate(self.steps): -> 2053 input = step.invoke( 2054 input, 2055 # mark each step as a child run 2056 patch_config( 2057 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\") 2058 ), 2059 ) 2060 # finish the root run 2061 except BaseException as e: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, kwargs) 154 def invoke( 155 self, 156 input: LanguageModelInput, (...) 160 kwargs: Any, 161 ) -> BaseMessage: 162 config = ensure_config(config) 163 return cast( 164 ChatGeneration, --> 165 self.generate_prompt( 166 [self._convert_input(input)], 167 stop=stop, 168 callbacks=config.get(\"callbacks\"), 169 tags=config.get(\"tags\"), 170 metadata=config.get(\"metadata\"), 171 run_name=config.get(\"run_name\"), 172 kwargs, 173 ).generations[0][0], 174 ).message File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, kwargs) 535 def generate_prompt( 536 self, 537 prompts: List[PromptValue], (...) 
540 kwargs: Any, 541 ) -> LLMResult: 542 prompt_messages = [p.to_messages() for p in prompts] --> 543 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 405 if run_managers: 406 run_managers[i].on_llm_error(e, response=LLMResult(generations=[])) --> 407 raise e 408 flattened_outputs = [ 409 LLMResult(generations=[res.generations], llm_output=res.llm_output) 410 for res in results 411 ] 412 llm_output = self._combine_llm_outputs([res.llm_output for res in results]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 394 for i, m in enumerate(messages): 395 try: 396 results.append( --> 397 self._generate_with_cache( 398 m, 399 stop=stop, 400 run_manager=run_managers[i] if run_managers else None, 401 kwargs, 402 ) 403 ) 404 except BaseException as e: 405 if run_managers: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, kwargs) 572 raise ValueError( 573 \"Asked to cache, but no cache found at `langchain.cache`.\" 574 ) 575 if new_arg_supported: --> 576 return self._generate( 577 messages, stop=stop, run_manager=run_manager, kwargs 578 ) 579 else: 580 return self._generate(messages, stop=stop, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, kwargs) 226 def _generate( 227 self, 228 messages: List[BaseMessage], (...) 231 kwargs: Any, 232 ) -> ChatResult: 233 \"\"\"Call out to Ollama's generate endpoint. 234 235 Args: (...) 
247 ]) 248 \"\"\" --> 250 final_chunk = self._chat_stream_with_aggregation( 251 messages, 252 stop=stop, 253 run_manager=run_manager, 254 verbose=self.verbose, 255 kwargs, 256 ) 257 chat_generation = ChatGeneration( 258 message=AIMessage(content=final_chunk.text), 259 generation_info=final_chunk.generation_info, 260 ) 261 return ChatResult(generations=[chat_generation]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:183, in ChatOllama._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, kwargs) 174 def _chat_stream_with_aggregation( 175 self, 176 messages: List[BaseMessage], (...) 180 kwargs: Any, 181 ) -> ChatGenerationChunk: 182 final_chunk: Optional[ChatGenerationChunk] = None --> 183 for stream_resp in self._create_chat_stream(messages, stop, kwargs): 184 if stream_resp: 185 chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:156, in ChatOllama._create_chat_stream(self, messages, stop, kwargs) 147 def _create_chat_stream( 148 self, 149 messages: List[BaseMessage], 150 stop: Optional[List[str]] = None, 151 kwargs: Any, 152 ) -> Iterator[str]: 153 payload = { 154 \"messages\": self._convert_messages_to_ollama_messages(messages), 155 } --> 156 yield from self._create_stream( 157 payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat/\", kwargs 158 ) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/llms/ollama.py:234, in _OllamaCommon._create_stream(self, api_url, payload, stop, kwargs) 228 raise OllamaEndpointNotFoundError( 229 \"Ollama call failed with status code 404. 
\" 230 \"Maybe your model is not found \" 231 f\"and you should pull the model with `ollama pull {self.model}`.\" 232 ) 233 else: --> 234 optional_detail = response.json().get(\"error\") 235 raise ValueError( 236 f\"Ollama call failed with status code {response.status_code}.\" 237 f\" Details: {optional_detail}\" 238 ) 239 return response.iter_lines(decode_unicode=True) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:975, in Response.json(self, kwargs) 971 return complexjson.loads(self.text, kwargs) 972 except JSONDecodeError as e: 973 # Catch JSON-related errors and raise as requests.JSONDecodeError 974 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError --> 975 raise RequestsJSONDecodeError(e.msg, e.doc, e.pos) JSONDecodeError: Expecting value: line 1 column 1 (char 0)" } ``` </details> <details> <summary><b>Log after a fix:</b></summary> ``` { "name": "ValueError", "message": "Ollama call failed with status code 401. Details: <html>\r <head><title>401 Authorization Required</title></head>\r <body>\r <center><h1>401 Authorization Required</h1></center>\r <hr><center>nginx/1.18.0 (Ubuntu)</center>\r </body>\r </html>\r ", "stack": "--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[2], line 1 ----> 1 print(translate_func().invoke('text')) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config) 2051 try: 2052 for i, step in enumerate(self.steps): -> 2053 input = step.invoke( 2054 input, 2055 # mark each step as a child run 2056 patch_config( 2057 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\") 2058 ), 2059 ) 2060 # finish the root run 2061 except BaseException as e: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, kwargs) 154 def 
invoke( 155 self, 156 input: LanguageModelInput, (...) 160 kwargs: Any, 161 ) -> BaseMessage: 162 config = ensure_config(config) 163 return cast( 164 ChatGeneration, --> 165 self.generate_prompt( 166 [self._convert_input(input)], 167 stop=stop, 168 callbacks=config.get(\"callbacks\"), 169 tags=config.get(\"tags\"), 170 metadata=config.get(\"metadata\"), 171 run_name=config.get(\"run_name\"), 172 kwargs, 173 ).generations[0][0], 174 ).message File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, kwargs) 535 def generate_prompt( 536 self, 537 prompts: List[PromptValue], (...) 540 kwargs: Any, 541 ) -> LLMResult: 542 prompt_messages = [p.to_messages() for p in prompts] --> 543 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 405 if run_managers: 406 run_managers[i].on_llm_error(e, response=LLMResult(generations=[])) --> 407 raise e 408 flattened_outputs = [ 409 LLMResult(generations=[res.generations], llm_output=res.llm_output) 410 for res in results 411 ] 412 llm_output = self._combine_llm_outputs([res.llm_output for res in results]) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, kwargs) 394 for i, m in enumerate(messages): 395 try: 396 results.append( --> 397 self._generate_with_cache( 398 m, 399 stop=stop, 400 run_manager=run_managers[i] if run_managers else None, 401 kwargs, 402 ) 403 ) 404 except BaseException as e: 405 if run_managers: File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in 
BaseChatModel._generate_with_cache(self, messages, stop, run_manager, kwargs) 572 raise ValueError( 573 \"Asked to cache, but no cache found at `langchain.cache`.\" 574 ) 575 if new_arg_supported: --> 576 return self._generate( 577 messages, stop=stop, run_manager=run_manager, kwargs 578 ) 579 else: 580 return self._generate(messages, stop=stop, kwargs) File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, kwargs) 226 def _generate( 227 self, 228 messages: List[BaseMessage], (...) 231 kwargs: Any, 232 ) -> ChatResult: 233 \"\"\"Call out to Ollama's generate endpoint. 234 235 Args: (...) 247 ]) 248 \"\"\" --> 250 final_chunk = self._chat_stream_with_aggregation( 251 messages, 252 stop=stop, 253 run_manager=run_manager, 254 verbose=self.verbose, 255 kwargs, 256 ) 257 chat_generation = ChatGeneration( 258 message=AIMessage(content=final_chunk.text), 259 generation_info=final_chunk.generation_info, 260 ) 261 return ChatResult(generations=[chat_generation]) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:328, in ChatOllamaCustom._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, kwargs) 319 def _chat_stream_with_aggregation( 320 self, 321 messages: List[BaseMessage], (...) 
325 kwargs: Any, 326 ) -> ChatGenerationChunk: 327 final_chunk: Optional[ChatGenerationChunk] = None --> 328 for stream_resp in self._create_chat_stream(messages, stop, kwargs): 329 if stream_resp: 330 chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:301, in ChatOllamaCustom._create_chat_stream(self, messages, stop, kwargs) 292 def _create_chat_stream( 293 self, 294 messages: List[BaseMessage], 295 stop: Optional[List[str]] = None, 296 kwargs: Any, 297 ) -> Iterator[str]: 298 payload = { 299 \"messages\": self._convert_messages_to_ollama_messages(messages), 300 } --> 301 yield from self._create_stream( 302 payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat\", kwargs 303 ) File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:134, in _OllamaCommonCustom._create_stream(self, api_url, payload, stop, **kwargs) 132 else: 133 optional_detail = response.text --> 134 raise ValueError( 135 f\"Ollama call failed with status code {response.status_code}.\" 136 f\" Details: {optional_detail}\" 137 ) 138 return response.iter_lines(decode_unicode=True) ValueError: Ollama call failed with status code 401. Details: <html>\r <head><title>401 Authorization Required</title></head>\r <body>\r <center><h1>401 Authorization Required</h1></center>\r <hr><center>nginx/1.18.0 (Ubuntu)</center>\r </body>\r </html>\r " } ``` </details> The same is true for timeout errors or when you simply mistyped in `base_url` arg and get response from some other service, for instance. Real Ollama errors are still clearly readable: ``` ValueError: Ollama call failed with status code 400. Details: {"error":"invalid options: unknown_option"} ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 12:17:29 -08:00
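The change described in #18317 can be sketched roughly as follows. This is a minimal illustration, not the actual `langchain_community` code: `build_error` is a hypothetical helper, and it assumes only that the caller has the HTTP status code and the raw response body as text.

```python
def build_error(status_code: int, body: str) -> ValueError:
    """Build the error for a failed call without parsing the body as JSON.

    A proxy in front of Ollama may answer with HTML, which is not valid JSON,
    so the raw text is reported as-is (hypothetical helper, for illustration).
    """
    return ValueError(
        f"Ollama call failed with status code {status_code}."
        f" Details: {body}"
    )


# A proxy error page and a real Ollama JSON error are both shown verbatim.
proxy_error = build_error(401, "<html><title>401 Authorization Required</title></html>")
ollama_error = build_error(400, '{"error":"invalid options: unknown_option"}')
```

Because the body is never decoded, a `JSONDecodeError` can no longer mask the status code of the failed request.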
Yudhajit Sinha	e2b901c35b	community[patch]: chat message history mypy fix (#18250 ) Description: Fixed `type: ignore` comments for mypy in chat_message_histories (streamlit). Addresses #17048. Planning to add more based on reviews.	2024-03-01 12:17:18 -08:00
Hemslo Wang	58a2abf089	community[patch]: fix RecursiveUrlLoader metadata_extractor return type (#18193 ) Description: Fix `metadata_extractor` type for `RecursiveUrlLoader`, the default `_metadata_extractor` returns `dict` instead of `str`. Issue: N/A Dependencies: N/A Twitter handle: N/A Signed-off-by: Hemslo Wang <hemslo.wang@gmail.com>	2024-03-01 12:08:20 -08:00
Maxime Perrin	98380cff9b	community[patch]: removing "response_mode" parameter in llama_index retriever (#18180 ) - Description: Changing this line ```python response = index.query(query, response_mode="no_text", **self.query_kwargs) ``` to ```python response = index.query(query, **self.query_kwargs) ``` since the llama index query no longer supports `response_mode`: ``` TypeError: BaseQueryEngine.query() got an unexpected keyword argument 'response_mode' ``` - Twitter handle: @maximeperrin_ --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>	2024-03-01 12:05:09 -08:00
Christophe Bornet	177f51c7bd	community: Use default load() implementation in doc loaders (#18385 ) Following https://github.com/langchain-ai/langchain/pull/18289	2024-03-01 14:46:52 -05:00
mwmajewsk	e192f6b6eb	community[patch]: fix, better error message in deeplake vectoriser (#18397 ) If the document loader receives a pathlib path instead of a str, it reads the file correctly, but the problem begins when the document is added to Deeplake. This problem arises from casting the path to str in the metadata. ```python deeplake = True fname = Path('./lorem_ipsum.txt') loader = TextLoader(fname, encoding="utf-8") docs = loader.load_and_split() text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100) chunks = text_splitter.split_documents(docs) if deeplake: db = DeepLake(dataset_path=ds_path, embedding=embeddings, token=activeloop_token) db.add_documents(chunks) else: db = Chroma.from_documents(docs, embeddings) ``` So, using this snippet of code, the error message for Deeplake looks like this: ``` [part of error message omitted] Traceback (most recent call last): File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 53, in <module> db.add_documents(chunks) File "/home/mwm/repositories/sources/langchain/libs/core/langchain_core/vectorstores.py", line 139, in add_documents return self.add_texts(texts, metadatas, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/deeplake.py", line 258, in add_texts return self.vectorstore.add( ^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/deeplake_vectorstore.py", line 226, in add return self.dataset_handler.add( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/dataset_handlers/client_side_dataset_handler.py", line 139, in add dataset_utils.extend_or_ingest_dataset( File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 544, in extend_or_ingest_dataset extend( File 
"/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 505, in extend dataset.extend(batched_processed_tensors, progressbar=False) File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/dataset/dataset.py", line 3247, in extend raise SampleExtendError(str(e)) from e.__cause__ deeplake.util.exceptions.SampleExtendError: Failed to append a sample to the tensor 'metadata'. See more details in the traceback. If you wish to skip the samples that cause errors, please specify `ignore_errors=True`. ``` Which is does not explain the error well enough. The same error for chroma looks like this ``` During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 56, in <module> db = Chroma.from_documents(docs, embeddings) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 778, in from_documents return cls.from_texts( ^^^^^^^^^^^^^^^ File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 736, in from_texts chroma_collection.add_texts( File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 309, in add_texts raise ValueError(e.args[0] + "\n\n" + msg) ValueError: Expected metadata value to be a str, int, float or bool, got lorem_ipsum.txt which is a <class 'pathlib.PosixPath'> Try filtering complex metadata from the document using langchain_community.vectorstores.utils.filter_complex_metadata. 
``` Which is far more user-friendly, so I added information about a possible type mismatch to the error message, the same way it is handled in Chroma: https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/chroma.py#L224	2024-03-01 11:21:21 -08:00
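The commit above only improves the error message. As a caller-side workaround, the metadata can be normalised before documents are added to the store. The sketch below assumes the metadata is a plain dict; `stringify_paths` is a hypothetical helper and is not part of LangChain or Deeplake.

```python
from pathlib import Path
from typing import Any, Dict


def stringify_paths(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the metadata with pathlib paths converted to str.

    Hypothetical helper: values of any other type are left untouched.
    """
    return {
        key: str(value) if isinstance(value, Path) else value
        for key, value in metadata.items()
    }


# Example: the "source" entry becomes a plain string, "page" is unchanged.
cleaned = stringify_paths({"source": Path("lorem_ipsum.txt"), "page": 1})
```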
Daniel Chico	7d962278f6	community[patch]: type ignore fixes (#18395 ) Related to #17048	2024-03-01 11:21:02 -08:00
Christophe Bornet	69be82c86d	community[patch]: Implement lazy_load() for CSVLoader (#18391 ) Covered by `test_csv_loader.py`	2024-03-01 11:17:08 -08:00
Guangdong Liu	760a16ff32	community[patch]: Fix ChatModel for sparkllm Bug. (#18375 ) - Description: fix sparkllm parameter error - Issue: close #18370 - Dependencies: change `IFLYTEK_SPARK_APP_URL` to `IFLYTEK_SPARK_API_URL` - Twitter handle: No	2024-03-01 10:49:30 -08:00
Yujie Qian	cbb65741a7	community[patch]: Voyage AI updates default model and batch size (#17655 ) - Description: update the default model and batch size in VoyageEmbeddings - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Co-authored-by: fodizoltan <zoltan@conway.expert>	2024-03-01 10:22:24 -08:00
Shengsheng Huang	ae471a7dcb	community[minor]: add BigDL-LLM integrations (#17953 ) - Description: [`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for running LLM on Intel XPU (from Laptop to GPU to Cloud) using INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR adds bigdl-llm integrations to langchain. - Issue: NA - Dependencies: `bigdl-llm` library - Contribution maintainer: @shane-huang Examples added: - docs/docs/integrations/llms/bigdl.ipynb	2024-03-01 10:04:53 -08:00
Ethan Yang	f61cb8d407	community[minor]: Add openvino backend support (#11591 ) - Description: add openvino backend support by HuggingFace Optimum Intel, - Dependencies: “optimum[openvino]”, --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-03-01 10:04:24 -08:00
RadhikaBansal97	8bafd2df5e	community[patch]: Change github endpoint in GithubLoader (#17622 ) Description: - Changed the GitHub endpoint, as the existing one was not working and returned a 404 Not Found error. - The existing function also failed if `file_filter` was not passed: the tree API returns all paths, including directories, and when `get_file_content` iterated over these paths it failed on directories, because the API returns a list of the files inside the directory. Added a condition to ignore a path if it is a directory. - Fixes this issue: https://github.com/langchain-ai/langchain/issues/17453 Co-authored-by: Radhika Bansal <Radhika.Bansal@veritas.com>	2024-03-01 09:36:31 -08:00
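The directory check described above might look roughly like this. It is a sketch, not the loader's actual code: `file_paths` is a hypothetical helper, and it assumes each tree entry is a dict with `path` and `type` keys, where `type` is `"blob"` for files and `"tree"` for directories, as in GitHub's Git Trees API responses.

```python
from typing import Dict, List


def file_paths(tree: List[Dict[str, str]]) -> List[str]:
    """Keep only file entries ("blob") and skip directories ("tree")."""
    return [entry["path"] for entry in tree if entry.get("type") == "blob"]


# Illustrative tree listing; the paths are made up for the example.
sample_tree = [
    {"path": "docs", "type": "tree"},
    {"path": "docs/index.md", "type": "blob"},
    {"path": "README.md", "type": "blob"},
]
paths = file_paths(sample_tree)
```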
Anush	9d663f31fa	community[patch]: FastEmbed to latest (#18040 ) ## Description Updates the `langchain_community.embeddings.fastembed` provider as per the recent updates to [`FastEmbed`](https://github.com/qdrant/fastembed) library.	2024-02-29 21:15:51 -08:00
Eugene Yurtsev	51b661cfe8	community[patch]: BaseLoader load method should just delegate to lazy_load (#18289 ) load() should just reference lazy_load()	2024-02-29 21:45:28 -05:00
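The delegation described in #18289 can be illustrated with a small stand-alone class. `SketchLoader` is hypothetical and works on plain strings; it is not LangChain's `BaseLoader`, only a sketch of the same pattern.

```python
from typing import Iterator, List


class SketchLoader:
    """Hypothetical loader that yields one string per input line."""

    def __init__(self, lines: List[str]) -> None:
        self.lines = lines

    def lazy_load(self) -> Iterator[str]:
        """Yield items one at a time so callers can stream them."""
        for line in self.lines:
            yield line

    def load(self) -> List[str]:
        """Eagerly collect everything that lazy_load() yields."""
        return list(self.lazy_load())


loaded = SketchLoader(["first", "second"]).load()
```

With this layout a subclass only has to implement `lazy_load()`; the eager `load()` comes for free.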
Bagatur	5efb5c099f	text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346 )	2024-02-29 18:33:21 -08:00
Erick Friis	eefb49680f	multiple[patch]: fix deprecation versions (#18349 )	2024-02-29 16:58:33 -08:00
Jib	72bfc1d3db	mongodb[minor]: MongoDB Partner Package -- Porting MongoDBAtlasVectorSearch (#17652 ) This PR migrates the existing MongoDBAtlasVectorSearch abstraction from the `langchain_community` section to the partners package section of the codebase. - [x] Run the partner package script as advised in the partner-packages documentation. - [x] Add Unit Tests - [x] Migrate Integration Tests - [x] Refactor `MongoDBAtlasVectorStore` (autogenerated) to `MongoDBAtlasVectorSearch` - [x] ~Remove~ deprecate the old `langchain_community` VectorStore references. ## Additional Callouts - Implemented the `delete` method - Included any missing async function implementations - `amax_marginal_relevance_search_by_vector` - `adelete` - Added new Unit Tests that test for functionality of `MongoDBVectorSearch` methods - Removed [`del res[self._embedding_key]`](`e0c81e1cb0/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L218)`) in `_similarity_search_with_score` function as it would make the `maximal_marginal_relevance` function fail otherwise. The `Document` needs to store the embedding key in metadata to work. Checklist: - [x] PR title: Please title your PR "package: description", where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message - [x] Pass lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified to check that you're passing lint and testing. See contribution guidelines for more information on how to write/run tests, lint, etc: https://python.langchain.com/docs/contributing/ - [x] Add tests and docs: If you're adding a new integration, please include 1. Existing tests supplied in docs/docs do not change. Updated docstrings for new functions like `delete` 2. an example notebook showing its use. 
It lives in `docs/docs/integrations` directory. (This already exists) If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Steven Silvester <steven.silvester@ieee.org> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-29 23:09:48 +00:00
Kai Kugler	df234fb171	community[patch]: Fixing embedchain document mapping (#18255 ) - Description: The current embedchain implementation seems to handle document metadata differently than done in the current implementation of langchain and a KeyError is thrown. I would love for someone else to test this... --------- Co-authored-by: KKUGLER <kai.kugler@mercedes-benz.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Deshraj Yadav <deshraj@gatech.edu>	2024-02-29 14:54:37 -08:00
Erick Friis	040271f33a	community[patch]: remove llmlingua extended tests (#18344 )	2024-02-29 13:51:29 -08:00
Tomaz Bratanic	5999c4a240	Add support for parameters in neo4j retrieval query (#18310 ) Sometimes, you want to use various parameters in the retrieval query of Neo4j Vector to personalize/customize results. Before, when there were only predefined chains, it didn't really make sense. Now that it's all about custom chains and LCEL, it is worth adding since users can inject any params they wish at query time. This isn't prone to SQL injection-type attacks since we use parameters rather than concatenating strings.	2024-02-29 13:00:54 -08:00
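The difference between bound parameters and string concatenation can be shown with an analogy. The sketch below uses Python's built-in `sqlite3` module rather than Neo4j or Cypher, so the table, column names, and helper functions are all assumptions made for illustration only.

```python
import sqlite3
from typing import List


def make_db() -> sqlite3.Connection:
    """Create a small in-memory table to query (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (owner TEXT, text TEXT)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b1")],
    )
    return conn


def texts_for(conn: sqlite3.Connection, owner: str) -> List[str]:
    """Pass the user value as a bound parameter, never via concatenation."""
    rows = conn.execute("SELECT text FROM docs WHERE owner = ?", (owner,))
    return [row[0] for row in rows]


conn = make_db()
normal = texts_for(conn, "alice")
# The injection attempt is treated as a literal owner name and matches nothing.
attack = texts_for(conn, "alice' OR '1'='1")
```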
Virat Singh	cd926ac3dd	community: Add PolygonFinancials Tool (#18324 ) Description: In this PR, I am adding a `PolygonFinancials` tool, which can be used to get financials data for a given ticker. The financials data is the fundamental data that is found in income statements, balance sheets, and cash flow statements of public US companies. Twitter: [@virattt](https://twitter.com/virattt)	2024-02-29 10:56:05 -08:00
Christophe Bornet	8a81fcd5d3	community: Fix deprecation version of AstraDB VectorStore (#17991 )	2024-02-28 17:15:09 -05:00
mackong	2c42f3a955	ollama[patch]: delete suffix slash to avoid redirect (#18260 ) - Description: see [ollama](https://github.com/ollama/ollama/blob/main/server/routes.go#L949)'s route definitions - Issue: N/A - Dependencies: N/A	2024-02-28 16:44:48 -05:00
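A rough sketch of the idea: build the chat endpoint without a trailing slash so the server does not answer with a redirect. `chat_endpoint` is a hypothetical helper; stripping a trailing slash from `base_url` as well is an extra assumption and not something the commit describes.

```python
def chat_endpoint(base_url: str) -> str:
    """Return the chat URL with no trailing slash (hypothetical helper)."""
    # rstrip on base_url is an added safeguard against a double slash.
    return f"{base_url.rstrip('/')}/api/chat"


url = chat_endpoint("http://localhost:11434")
```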
William De Vena	6b58943917	community[patch]: Invoke callback prior to yielding token (#18288 ) Description: Invoke the `on_llm_new_token` callback prior to yielding the token in the `_stream` and `_astream` methods. Issue: https://github.com/langchain-ai/langchain/issues/16913 Dependencies: None Twitter handle: None	2024-02-28 21:40:53 +00:00
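Why the ordering matters can be seen in a plain generator. In this sketch `stream_tokens` and its `on_new_token` argument are hypothetical stand-ins for the `_stream` method and the `on_llm_new_token` callback; the real methods work on chunk objects rather than strings.

```python
from typing import Callable, Iterable, Iterator, List


def stream_tokens(
    tokens: Iterable[str], on_new_token: Callable[[str], None]
) -> Iterator[str]:
    """Notify the callback first, then hand the token to the consumer."""
    for token in tokens:
        on_new_token(token)  # runs before the generator is suspended
        yield token


seen: List[str] = []
stream = stream_tokens(["Hello", " world"], seen.append)
first = next(stream)  # the consumer takes one token and may stop here
seen_after_first = list(seen)
```

If the callback ran after the `yield`, a consumer that stops after the first token would never trigger it for that token, because the generator stays suspended at the `yield`.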
Eugene Yurtsev	cd52433ba0	community[minor]: Add `SQLDatabaseLoader` document loader (#18281 ) - Description: A generic document loader adapter for SQLAlchemy on top of LangChain's `SQLDatabaseLoader`. - Needed by: https://github.com/crate-workbench/langchain/pull/1 - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev Hi from CrateDB again, in the same spirit like GH-16243 and GH-16244, this patch breaks out another commit from https://github.com/crate-workbench/langchain/pull/1, in order to reduce the size of this patch before submitting it, and to separate concerns. To accompany the SQLAlchemy adapter implementation, the patch includes integration tests for both SQLite and PostgreSQL. Let me know if corresponding utility resources should be added at different spots. With kind regards, Andreas. ### Software Tests ```console docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up ``` ```console cd libs/community pip install psycopg2-binary pytest -vvv tests/integration_tests -k sqldatabase ``` ``` 14 passed ``` ![image](https://github.com/langchain-ai/langchain/assets/453543/42be233c-eb37-4c76-a830-474276e01436) --------- Co-authored-by: Andreas Motl <andreas.motl@crate.io>	2024-02-28 21:02:28 +00:00
David Ruan	af35e2525a	community[minor]: add hugging_face_model document loader (#17323 ) - Description: add hugging_face_model document loader, - Issue: NA, - Dependencies: NA, --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-28 20:05:35 +00:00
Sanjaypranav V M	b9a495e56e	community[patch]: added latin-1 decoder to gmail search tool (#18116 ) Some mails from Flipkart and Amazon are encoded in a different plain-text format, so to handle the UnicodeDecodeError, added an exception handler and a latin-1 decoder. @hwchase17	2024-02-28 19:28:29 +00:00
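The fallback might look roughly like this. It is a sketch under the assumption that the tool first tries UTF-8; `decode_body` is a hypothetical helper and not the tool's actual function.

```python
def decode_body(raw: bytes) -> str:
    """Decode a mail body, falling back to latin-1 when UTF-8 fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")


utf8_text = decode_body("caf\u00e9".encode("utf-8"))
latin_text = decode_body("caf\u00e9".encode("latin-1"))
```

Since latin-1 accepts any byte sequence, the fallback never raises, although text in other encodings may come out with wrong characters.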
Ashley Xu	e3211c2b3d	community[patch]: BigQueryVectorSearch JSON type unsupported for metadatas (#18234 )	2024-02-28 08:19:53 -08:00
Ayo Ayibiowu	ac1d7d9de8	community[feat]: Adds LLMLingua as a document compressor (#17711 ) Description: This PR adds support for using the [LLMLingua project ](https://github.com/microsoft/LLMLingua) especially the LongLLMLingua (Enhancing Large Language Model Inference via Prompt Compression) as a document compressor / transformer. The LLMLingua project is an interesting project that can greatly improve RAG system by compressing prompts and contexts while keeping their semantic relevance. Issue: https://github.com/microsoft/LLMLingua/issues/31 Dependencies: [llmlingua](https://pypi.org/project/llmlingua/) @baskaryan --------- Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-27 19:23:56 -08:00
Jaskirat Singh	ce682f5a09	community: vectorstores.kdbai - Added support for when no docs are present (#18103 ) - Description: By default it expects a list, but that's not the case in corner scenarios when no document has been ingested (use case: bootstrapping an application). Hence added a check: if the instance is a pandas DataFrame instead of a list, it returns immediately. - Issue: NA - Dependencies: NA - Twitter handle: jaskiratsingh1 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-26 12:47:06 -08:00
am-kinetica	9b8f6455b1	Langchain vectorstore integration with Kinetica (#18102 ) - Description: New vectorstore integration with the Kinetica database - Issue: - Dependencies: the Kinetica Python API `pip install gpudb==7.2.0.1`, - Tag maintainer: @baskaryan, @hwchase17 - Twitter handle: --------- Co-authored-by: Chad Juliano <cjuliano@kinetica.com>	2024-02-26 12:46:48 -08:00
GoodBai	3589a135ef	community: make `SET allow_experimental_[engine]_index` configurable in vectorstores.clickhouse (#18107 ) ## Description & Issue While following the official doc to use clickhouse as a vectorstore, I found only the default `annoy` index is properly supported. But I want to try another engine, `usearch`, because `annoy` is not properly supported on ARM platforms. Here are the settings I prefer: ``` python settings = ClickhouseSettings( table="wiki_Ethereum", index_type="usearch", # annoy by default index_param=[], ) ``` The above settings do not work because the command `set allow_experimental_annoy_index=1` is hard-coded. This PR makes sure the experimental feature follows the `index_type`, which is also consistent with Clickhouse's naming conventions.	2024-02-26 12:39:17 -08:00
Dan Stambler	69344a0661	community: Add Laser Embedding Integration (#18111 ) - Description: Added Integration with Meta AI's LASER Language-Agnostic SEntence Representations embedding library, which supports multilingual embedding for any of the languages listed here: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200, including several low resource languages - Dependencies: laser_encoders	2024-02-26 12:16:37 -08:00
Christophe Bornet	a2d5fa7649	community[patch]: Fix GenericRequestsWrapper _aget_resp_content must be async (#18065 ) There are existing tests in `libs/community/tests/unit_tests/tools/requests/test_tool.py`	2024-02-25 19:07:07 -08:00
Neli Hateva	a01e8473f8	community[patch]: Fix GraphSparqlQAChain so that it works with Ontotext GraphDB (#15009 ) - Description: Introduce a new parameter `graph_kwargs` to `RdfGraph` - parameters used to initialize the `rdflib.Graph` if `query_endpoint` is set. Also, do not set `rdflib.graph.DATASET_DEFAULT_GRAPH_ID` as default value for the `rdflib.Graph` `identifier` if `query_endpoint` is set. - Issue: N/A - Dependencies: N/A - Twitter handle: N/A	2024-02-25 19:05:21 -08:00
kYLe	17ecf6e119	community[patch]: Remove model limitation on Anyscale LLM (#17662 ) Description: Llama Guard is deprecated from the Anyscale public endpoint. Issue: Change the default model and remove the limitation of only using Llama Guard with Anyscale LLMs. Anyscale LLM can also work with all other chat models hosted on Anyscale. Also added `async_client` for Anyscale LLM	2024-02-25 18:21:19 -08:00
Barun Amalkumar Halder	cc69976860	community[minor] : adds callback handler for Fiddler AI (#17708 ) Description: Callback handler to integrate fiddler with langchain. This PR adds the following - 1. `FiddlerCallbackHandler` implementation into langchain/community 2. Example notebook `fiddler.ipynb` for usage documentation [Internal Tracker : FDL-14305] Issue: NA Dependencies: - Installation of langchain-community is unaffected. - Usage of FiddlerCallbackHandler requires installation of latest fiddler-client (2.5+) Twitter handle: @fiddlerlabs @behalder Co-authored-by: Barun Halder <barun@fiddler.ai>	2024-02-25 18:17:03 -08:00
Christophe Bornet	b8b5ce0c8c	astradb: Add AstraDBChatMessageHistory to langchain-astradb package (#17732 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-25 18:14:49 -08:00
Maxime Perrin	c06a8732aa	community[patch]: fix llama index imports and fields access (#17870 ) - Description: Fixing outdated imports after v0.10 llama index update and updating metadata and source text access - Issue: #17860 - Twitter handle: @maximeperrin_ --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>	2024-02-25 18:14:23 -08:00
2jimoo	7fc903464a	community: Add document manager and mongo document manager (#17320 ) - Description: - Add DocumentManager class, which is a nosql record manager. - In order to use index and aindex in libs/langchain/langchain/indexes/_api.py, DocumentManager inherits RecordManager. - Also I added the MongoDB implementation of Document Manager too. - Dependencies: pymongo, motor --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-23 21:32:52 -05:00
kYLe	56b955fc31	community[minor]: Add async_client for Anyscale Chat model (#18050 ) Add `async_client` for Anyscale Chat_model	2024-02-23 21:22:54 -05:00
Bagatur	22b964f802	community[patch]: Release 0.0.24 (#18038 )	2024-02-23 10:49:29 -08:00
Erick Friis	29e0445490	community[patch]: BaseLLM typing in init (#18029 )	2024-02-23 17:51:27 +00:00
Nicolò Boschi	4c132b4cc6	community: fix openai streaming throws 'AIMessageChunk' object has no attribute 'text' (#18006 ) After upgrading langchain-community to 0.0.22, it's not possible to use openai from the community package with streaming=True ``` File "/home/runner/work/ragstack-ai/ragstack-ai/ragstack-e2e-tests/.tox/langchain/lib/python3.11/site-packages/langchain_community/chat_models/openai.py", line 434, in _generate return generate_from_stream(stream_iter) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/runner/work/ragstack-ai/ragstack-ai/ragstack-e2e-tests/.tox/langchain/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 65, in generate_from_stream for chunk in stream: File "/home/runner/work/ragstack-ai/ragstack-ai/ragstack-e2e-tests/.tox/langchain/lib/python3.11/site-packages/langchain_community/chat_models/openai.py", line 418, in _stream run_manager.on_llm_new_token(chunk.text, chunk=cg_chunk) ^^^^^^^^^^ AttributeError: 'AIMessageChunk' object has no attribute 'text' ``` Fix regression of https://github.com/langchain-ai/langchain/pull/17907 Twitter handle: @nicoloboschi	2024-02-23 12:12:47 -05:00
Bagatur	9b982b2aba	community[patch]: Release 0.0.23 (#18027 )	2024-02-23 08:54:31 -08:00
Guangdong Liu	4197efd67a	community: Fix SparkLLM error (#18015 ) - Description: fix SparkLLM error	2024-02-23 06:40:29 -08:00
Bagatur	b46d6b04e1	community[patch]: Release 0.0.22 (#17994 )	2024-02-22 21:35:04 -08:00
Leo Diegues	b15fccbb99	community[patch]: Skip `OpenAIWhisperParser` extremely small audio chunks to avoid api error (#11450 ) Description This PR addresses a rare issue in `OpenAIWhisperParser` that causes it to crash when processing an audio file with a duration very close to the class's chunk size threshold of 20 minutes. Issue #11449 Dependencies None Tag maintainer @agola11 @eyurtsev Twitter handle leonardodiegues --------- Co-authored-by: Leonardo Diegues <leonardo.diegues@grupofolha.com.br> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-22 17:02:43 -08:00
mackong	9678797625	community[patch]: callback before yield for _stream/_astream (#17907 ) - Description: callback on_llm_new_token before yield chunk for _stream/_astream for some chat models, make all chat models in a consistent behaviour. - Issue: N/A - Dependencies: N/A	2024-02-22 16:15:21 -08:00
Chad Juliano	50ba3c68bb	community[minor]: add Kinetica LLM wrapper (#17879 ) Description: Initial pull request for Kinetica LLM wrapper Issue: N/A Dependencies: No new dependencies for unit tests. Integration tests require gpudb, typeguard, and faker Twitter handle: @chad_juliano Note: There is another pull request for Kinetica vectorstore. Ultimately we would like to make a partner package but we are starting with a community contribution.	2024-02-22 16:02:00 -08:00
Bagatur	9b0b0032c2	community[patch]: fix lint (#17984 )	2024-02-22 15:15:27 -08:00
David Loving	d068e8ea54	community[patch]: compatibility with SQLAlchemy 1.4.x (#17954 ) Description: Change type hint on `QuerySQLDataBaseTool` to be compatible with SQLAlchemy v1.4.x. Issue: Users locked to `SQLAlchemy < 2.x` are unable to import `QuerySQLDataBaseTool`. closes https://github.com/langchain-ai/langchain/issues/17819 Dependencies: None	2024-02-22 13:17:07 -05:00
kartikTAI	9cf6661dc5	community: use NeuralDB object to initialize NeuralDBVectorStore (#17272 ) Description: This PR adds an `__init__` method to the NeuralDBVectorStore class, which takes in a NeuralDB object to instantiate the state of NeuralDBVectorStore. Issue: N/A Dependencies: N/A Twitter handle: N/A	2024-02-22 12:05:01 -05:00
Brad Erickson	ecd72d26cf	community: Bugfix - correct Ollama API path to avoid HTTP 307 (#17895 ) Sets the correct /api/generate path, without ending /, to reduce HTTP requests. Reference: https://github.com/ollama/ollama/blob/efe040f8/docs/api.md#generate-request-streaming Before: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate/ HTTP/1.1" 307 0 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None After: DEBUG: Starting new HTTP connection (1): localhost:11434 DEBUG: http://localhost:11434 "POST /api/generate HTTP/1.1" 200 None	2024-02-22 11:59:55 -05:00
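The trailing-slash issue in the Ollama fix comes down to how the request URL is assembled. The sketch below is an assumption about the shape of such a helper, not the PR's code: `generate_url` is a hypothetical name, and only the `/api/generate` path is taken from the commit message.

```python
def generate_url(base_url: str) -> str:
    """Build the generate endpoint URL without a trailing slash.

    Hypothetical helper: strips any trailing slash from the base URL
    so the final path is exactly /api/generate, which avoids the
    extra HTTP 307 redirect shown in the log above.
    """
    return base_url.rstrip("/") + "/api/generate"


url_plain = generate_url("http://localhost:11434")
url_slash = generate_url("http://localhost:11434/")
```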
Erick Friis	a53370a060	pinecone[patch], docs: PineconeVectorStore, release 0.0.3 (#17896 )	2024-02-22 08:24:08 -08:00
Hasan	7248e98b9e	community[patch]: Return PK in similarity search Document (#17561 ) Issue: #17390 Co-authored-by: hasan <hasan@m2sys.com>	2024-02-21 17:03:50 -08:00
Raunak	1ec8199c8e	community[patch]: Added more functions in NetworkxEntityGraph class (#17624 ) - Description: 1. Added add_node(), remove_node(), has_node(), remove_edge(), has_edge() and get_neighbors() functions in NetworkxEntityGraph class. 2. Added the above functions in graph_networkx_qa.ipynb documentation.	2024-02-21 17:02:56 -08:00
Christophe Bornet	3d91be94b1	community[patch]: Add missing async_astra_db_client param to AstraDBChatMessageHistory (#17742 )	2024-02-21 16:46:42 -08:00
Xudong Sun	c524bf31f5	docs: add helpful comments to sparkllm.py (#17774 ) Adding helpful comments to sparkllm.py, help users to use ChatSparkLLM more effectively	2024-02-21 16:42:54 -08:00
Ian	3019a594b7	community[minor]: Add tidb loader support (#17788 ) This pull request adds support for loading data from a TiDB database with LangChain. A simple usage: ``` from langchain_community.document_loaders import TiDBLoader CONNECTION_STRING = "mysql+pymysql://root@127.0.0.1:4000/test" QUERY = "select id, name, description from items;" loader = TiDBLoader( connection_string=CONNECTION_STRING, query=QUERY, page_content_columns=["name", "description"], metadata_columns=["id"], ) documents = loader.load() print(documents) ```	2024-02-21 16:42:33 -08:00
Christophe Bornet	815ec74298	docs: Add docstring to AstraDBStore (#17793 )	2024-02-21 16:41:47 -08:00
ehude	9e54c227f1	community[patch]: Bug Neo4j VectorStore when having multiple indexes the sort is not working and the store that returned is random (#17396 ) Bug fix: with multiple indexes, the sort does not work and the store that is returned is random. The following small fix resolves the issue.	2024-02-21 16:33:33 -08:00
Michael Feil	242981b8f0	community[minor]: infinity embedding local option (#17671 ) drop-in-replacement for sentence-transformers inference. https://github.com/langchain-ai/langchain/discussions/17670 tldr from the discussion above -> around a 4x-22x speedup over using SentenceTransformers / huggingface embeddings. For more info: https://github.com/michaelfeil/infinity (pure-python dependency) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-21 16:33:13 -08:00
Leonid Ganeline	ed0b7c3b72	docs: added `community` modules descriptions (#17827 ) API Reference: Several `community` modules (like the [adapter](https://api.python.langchain.com/en/latest/community_api_reference.html#module-langchain_community.adapters) module) are missing descriptions. This happened when langchain was split into the core, langchain and community packages. - Copied module descriptions from other packages - Changed several descriptions to a consistent format.	2024-02-21 16:18:36 -08:00
Christophe Bornet	5019951a5d	docs: AstraDB VectorStore docstring (#17834 )	2024-02-21 16:16:31 -08:00
Rohit Gupta	3acd0c74fc	community[patch]: added SCANN index in default search params (#17889 ) This will enable users to add data in same collection for index type SCANN for milvus	2024-02-21 15:47:47 -08:00
Karim Assi	afc1ba0329	community[patch]: add possibility to search by vector in OpenSearchVectorSearch (#17878 ) - Description: implements the missing `similarity_search_by_vector` function for `OpenSearchVectorSearch` - Issue: N/A - Dependencies: N/A	2024-02-21 15:44:55 -08:00
Nathan Voxland (Activeloop)	9ece134d45	docs: Improved deeplake.py init documentation (#17549 ) Description: Updated documentation for DeepLake init method. Especially the exec_option docs needed improvement, but did a general cleanup while I was looking at it. Issue: n/a Dependencies: None --------- Co-authored-by: Nathan Voxland <nathan@voxland.net>	2024-02-21 15:33:00 -08:00
Zachary Toliver	29ee0496b6	community[patch]: Allow override of 'fetch_schema_from_transport' in the GraphQL tool (#17649 ) - Description: In order to override the bool value of "fetch_schema_from_transport" in the GraphQLAPIWrapper, a "fetch_schema_from_transport" value needed to be added to the "_EXTRA_OPTIONAL_TOOLS" dictionary in load_tools in the "graphql" key. The parameter "fetch_schema_from_transport" must also be passed in to the GraphQLAPIWrapper to allow reading of the value when creating the client. Passing as an optional parameter is probably best to avoid breaking changes. This change is necessary to support GraphQL instances that do not support fetching schema, such as TigerGraph. More info here: [TigerGraph GraphQL Schema Docs](https://docs.tigergraph.com/graphql/current/schema) - Threads handle: @zacharytoliver --------- Co-authored-by: Zachary Toliver <zt10191991@hotmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-21 15:32:43 -08:00
mackong	31891092d8	community[patch]: add missing chunk parameter for _stream/_astream (#17807 ) - Description: Add missing chunk parameter for _stream/_astream for some chat models, make all chat models in a consistent behaviour. - Issue: N/A - Dependencies: N/A	2024-02-21 15:32:28 -08:00
volodymyr-memsql	0a9a519a39	community[patch]: Added add_images method to SingleStoreDB vector store (#17871 ) In this pull request, we introduce the add_images method to the SingleStoreDB vector store class, expanding its capabilities to handle multi-modal embeddings seamlessly. This method facilitates the incorporation of image data into the vector store by associating each image's URI with corresponding document content, metadata, and either pre-generated embeddings or embeddings computed using the embed_image method of the provided embedding object. the change includes integration tests, validating the behavior of the add_images. Additionally, we provide a notebook showcasing the usage of this new method. --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2024-02-21 15:16:32 -08:00
Shashank	8381f859b4	community[patch]: Graceful handling of redis errors in RedisCache and AsyncRedisCache (#17171 ) - Description: The existing `RedisCache` implementation lacks proper handling for redis client failures, such as `ConnectionRefusedError`, leading to subsequent failures in pipeline components like LLM calls. This pull request aims to improve error handling for redis client issues, ensuring a more robust and graceful handling of such errors. - Issue: Fixes #16866 - Dependencies: No new dependency - Twitter handle: N/A Co-authored-by: snsten <> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-21 12:15:19 -05:00
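The idea behind the graceful handling in `RedisCache` can be illustrated with a small stand-in. This is a sketch under assumptions: `safe_lookup`, `KeyValueClient` and `DownClient` are hypothetical names, no real Redis client is used, and the actual fix may catch narrower exception types than shown here.

```python
import logging
from typing import Optional, Protocol

logger = logging.getLogger(__name__)


class KeyValueClient(Protocol):
    """Minimal interface the sketch needs from a cache client."""

    def get(self, key: str) -> Optional[str]: ...


def safe_lookup(client: KeyValueClient, key: str) -> Optional[str]:
    """Return the cached value, or None if the backend is unavailable."""
    try:
        return client.get(key)
    except Exception as exc:  # assumption: the real fix may be narrower
        # Treat a backend failure as a cache miss so the caller
        # can still proceed to the LLM call.
        logger.error("Cache lookup failed: %s", exc)
        return None


class DownClient:
    """Stand-in client that simulates an unreachable server."""

    def get(self, key: str) -> Optional[str]:
        raise ConnectionRefusedError("cache backend unavailable")


result = safe_lookup(DownClient(), "prompt-hash")
```

Treating the failure as a miss keeps the pipeline running at the cost of an uncached call.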
Christophe Bornet	e6311d953d	community[patch]: Add AstraDBLoader docstring (#17873 )	2024-02-21 11:41:34 -05:00
nbyrneKX	c1bb5fd498	community[patch]: typo in doc-string for kdbai vectorstore (#17811 )	2024-02-21 10:35:11 -05:00
Christophe Bornet	bebe401b1a	astradb[patch]: Add AstraDBStore to langchain-astradb package (#17789 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-20 16:54:35 -08:00
Guangdong Liu	47b1b7092d	community[minor]: Add SparkLLM to community (#17702 )	2024-02-20 11:23:47 -08:00
Guangdong Liu	3ba1cb8650	community[minor]: Add SparkLLM Text Embedding Model and SparkLLM introduction (#17573 )	2024-02-20 11:22:27 -08:00
Virat Singh	92e52e89ca	community: Add PolygonTickerNews Tool (#17808 ) Description: In this PR, I am adding a PolygonTickerNews Tool, which can be used to get the latest news for a given ticker / stock. Twitter handle: [@virattt](https://twitter.com/virattt)	2024-02-20 10:15:29 -08:00
Christophe Bornet	b13e52b6ac	community[patch]: Fix AstraDBCache docstrings (#17802 )	2024-02-20 11:39:30 -05:00
Bagatur	ad285ca15c	community[patch]: Release 0.0.21 (#17750 )	2024-02-19 11:13:33 -08:00
Karim Lalani	ea61302f71	community[patch]: bug fix - add empty metadata when metadata not provided (#17669 ) Code fix to pass an empty metadata dictionary to aadd_texts if metadata is not provided.	2024-02-19 10:54:52 -08:00
CogniJT	919ebcc596	community[minor]: CogniSwitch Agent Toolkit for LangChain (#17312 ) Description: CogniSwitch focusses on making GenAI usage more reliable. It abstracts out the complexity & decision making required for tuning processing, storage & retrieval. Using simple APIs documents / URLs can be processed into a Knowledge Graph that can then be used to answer questions. Dependencies: No dependencies. Just network calls & API key required Tag maintainer: @hwchase17 Twitter handle: https://github.com/CogniSwitch Documentation: Please check `docs/docs/integrations/toolkits/cogniswitch.ipynb` Tests: The usual tool & toolkits tests using `test_imports.py` PR has passed linting and testing before this submission. --------- Co-authored-by: Saicharan Sridhara <145636106+saiCogniswitch@users.noreply.github.com>	2024-02-19 10:54:13 -08:00
Christophe Bornet	6275d8b1bf	docs: Fix AstraDBChatMessageHistory docstrings (#17740 )	2024-02-19 10:47:38 -08:00
Aymeric Roucher	0d294760e7	Community: Fuse HuggingFace Endpoint-related classes into one (#17254 ) ## Description Fuse HuggingFace Endpoint-related classes into one: - [HuggingFaceHub](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_hub.py`) - [HuggingFaceTextGenInference](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_text_gen_inference.py`) - and [HuggingFaceEndpoint](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_endpoint.py`) Are fused into - HuggingFaceEndpoint ## Issue The duplication of classes was creating a lack of clarity, and additional effort to develop classes leads to issues like [this hack](`5ceaf784f3/libs/community/langchain_community/llms/huggingface_endpoint.py (L159)`). ## Dependencies None, this removes dependencies. ## Twitter handle If you want to post about this: @AymericRoucher --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-19 10:33:15 -08:00
Raghav Dixit	6c18f73ca5	community[patch]: LanceDB integration improvements/fixes (#16173 ) Hi, I'm from the LanceDB team. Improves the LanceDB integration by making it easier to use: you are no longer required to create tables manually and pass them to the constructor, although that is still supported for backward compatibility. Bug fix: pandas was being used even though it's not a dependency of LanceDB or langchain. PS: this issue was raised a few months ago but lost traction. It is a feature improvement for our users; kindly review this. Thanks!	2024-02-19 10:22:02 -08:00
Christophe Bornet	e92e96193f	community[minor]: Add async methods to the AstraDB BaseStore (#16872 ) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-02-19 10:11:49 -08:00
Mohammad Mohtashim	43dc5d3416	community[patch]: OpenLLM Client Fixes + Added Timeout Parameter (#17478 ) - OpenLLM was using outdated method to get the final text output from openllm client invocation which was raising the error. Therefore corrected that. - OpenLLM `_identifying_params` was getting the openllm's client configuration using outdated attributes which was raising error. - Updated the docstring for OpenLLM. - Added timeout parameter to be passed to underlying openllm client.	2024-02-19 10:09:11 -08:00
Guangdong Liu	73edf17b4e	community[minor]: Add Apache Doris as vector store (#17527 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-18 12:05:58 -07:00
Bagatur	a058c8812d	community[patch]: add VoyageEmbeddings truncation (#17638 )	2024-02-18 10:21:21 -07:00
Mohammad Mohtashim	8d4547ae97	[Langchain_community]: Corrected the imports to make them compatible with SQLAlchemy <2.0 (#17653 ) - Small change in imports in the sql_database module to make it work with SQLAlchemy <2.0 - This was identified in the following issue: #17616	2024-02-16 16:59:08 -05:00
Christophe Bornet	19ebc7418e	community: Use _AstraDBCollectionEnvironment in AstraDB VectorStore (community) (#17635 ) Another PR will be done for the langchain-astradb package. Note: for future PRs, devs will be done in the partner package only. This one is just to align with the rest of the components in the community package and it fixes a bunch of issues.	2024-02-16 11:28:16 -05:00
Nejc Habjan	b4fa847a90	community[minor]: add exclude parameter to DirectoryLoader (#17316 ) - Description: adds an `exclude` parameter to the DirectoryLoader class, based on similar behavior in GenericLoader - Issue: discussed in https://github.com/langchain-ai/langchain/discussions/9059 and I think in some other issues that I cannot find at the moment 🙇 - Dependencies: None - Twitter handle: don't have one sorry! Just https://github/nejch --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-16 09:42:42 -05:00
Bagatur	8f14234afb	infra: ignore flakey lua test (#17618 )	2024-02-16 05:02:58 -07:00
Krista Pratico	bf8e3c6dd1	community[patch]: add fixes for AzureSearch after update to stable azure-search-documents library (#17599 ) - Description: Addresses the bugs described in linked issue where an import was erroneously removed and the rename of a keyword argument was missed when migrating from beta --> stable of the azure-search-documents package - Issue: https://github.com/langchain-ai/langchain/issues/17598 - Dependencies: N/A - Twitter handle: N/A	2024-02-15 22:23:52 -08:00
William FH	64743dea14	core[patch], community[patch], langchain[patch], experimental[patch], robocorp[patch]: bump LangSmith 0.1.* (#17567 )	2024-02-15 23:17:59 -07:00
morgana	9d7ca7df6e	community[patch]: update copy of metadata in rockset vectorstore integration (#17612 ) - Description: This fixes an issue with working with RecordManager. RecordManager was generating new hashes on documents because `add_texts` was modifying the metadata directly. Additionally moved some tests to unit tests since that was a more appropriate home. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_`	2024-02-15 23:13:40 -07:00
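Why mutating metadata in place changes document hashes can be shown with a small stand-alone sketch. Everything here is illustrative: `doc_hash` and `add_text` are hypothetical functions, and the assumption that a record manager hashes the text together with the metadata is a simplification, not the actual RecordManager implementation.

```python
import hashlib
import json
from typing import Any, Dict, List


def doc_hash(text: str, metadata: Dict[str, Any]) -> str:
    """Hash a document's text and metadata (simplified assumption)."""
    payload = json.dumps({"text": text, "metadata": metadata}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_text(store: List[Dict[str, Any]], text: str, metadata: Dict[str, Any]) -> None:
    """Store a record built from a copy, leaving the caller's dict untouched."""
    record = dict(metadata)  # shallow copy instead of mutating the argument
    record["text"] = text
    store.append(record)


metadata = {"source": "a.txt"}
before = doc_hash("hello", metadata)
store: List[Dict[str, Any]] = []
add_text(store, "hello", metadata)
after = doc_hash("hello", metadata)
```

With the copy in place, `before` and `after` are identical, so re-indexing the same document would not look like a change.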
Stefano Lottini	5240ecab99	astradb: bootstrapping Astra DB as Partner Package (#16875 ) Description: This PR introduces a new "Astra DB" Partner Package. So far only the vector store class is _duplicated_ there, all others following once this is validated and established. Along with the move to separate package, incidentally, the class name will change `AstraDB` => `AstraDBVectorStore`. The strategy has been to duplicate the module (with prospected removal from community at LangChain 0.2). Until then, the code will be kept in sync with minimal, known differences (there is a makefile target to automate drift control. Out of convenience with this check, the community package has a class `AstraDBVectorStore` aliased to `AstraDB` at the end of the module). With this PR several bugfixes and improvement come to the vector store, as well as a reshuffling of the doc pages/notebooks (Astra and Cassandra) to align with the move to a separate package. Dependencies: A brand new pyproject.toml in the new package, no changes otherwise. Twitter handle: `@rsprrs` --------- Co-authored-by: Christophe Bornet <cbornet@hotmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-15 15:50:59 -08:00
Moshe Berchansky	20a56fe0a2	community[minor]: Add QuantizedEmbedders (#17391 ) Description: * adding Quantized embedders using optimum-intel and intel-extension-for-pytorch. * added mdx documentation and example notebooks * added embedding import testing. Dependencies: optimum = {extras = ["neural-compressor"], version = "^1.14.0", optional = true} intel_extension_for_pytorch = {version = "^2.2.0", optional = true} Dependencies have been added to pyproject.toml for the community lib. Twitter handle: @peter_izsak --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-15 11:01:24 -08:00
Amir Karbasi	bccc9241ea	community[patch]: Resolve KuzuQAChain API Changes (#16885 ) - Description: Updates to the Kuzu API had broken this functionality. These updates resolve those issues and add a new test to demonstrate the updates. - Issue: #11874 - Dependencies: No new dependencies - Twitter handle: @amirk08 Test results: ``` tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params PASSED [ 33%] tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params PASSED [ 66%] tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema PASSED [100%] =================================================== slowest 5 durations =================================================== 0.53s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema 0.34s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params 0.28s call tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params 0.03s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema 0.02s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params ==================================================== 3 passed in 1.27s ==================================================== ```	2024-02-15 10:18:37 -08:00
Rafail Giavrimis	a84a3add25	Community[patch]: Adjusted import to be compatible with SQLAlchemy<2 (#17520 ) - Description: Adjusts an import to directly import `Result` from `sqlalchemy.engine`. - Issue: #17519 - Dependencies: N/A - Twitter handle: @grafail	2024-02-15 11:12:13 -05:00
Zachary Toliver	6746adf363	community[patch]: pass bool value for fetch_schema_from_transport in GraphQLAPIWrapper (#17552 ) - Description: Allow a bool value to be passed to fetch_schema_from_transport since not all GraphQL instances support this feature, such as TigerGraph. - Threads: @zacharytoliver --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-15 09:54:04 -05:00
Christophe Bornet	789cd5198d	community[patch]: Use astrapy built-in pagination prefetch in AstraDBLoader (#17569 )	2024-02-15 09:52:56 -05:00
Christophe Bornet	387cacb881	community[minor]: Add async methods to AstraDBChatMessageHistory (#17572 )	2024-02-15 09:48:42 -05:00
Christophe Bornet	ff1f985a2a	community: Fix some mypy types in cassandra doc loader (#17570 ) Thank you!	2024-02-15 09:45:22 -05:00
Christophe Bornet	ca2d4078f3	community: Add async methods to AstraDBCache (#17415 ) Adds async methods to AstraDBCache	2024-02-14 23:10:08 -05:00
Jan Cap	7ae3ce60d2	community[patch]: Fix pwd import that is not available on windows (#17532 ) - Description: Resolving problem in `langchain_community\document_loaders\pebblo.py` with `import pwd`. `pwd` is not available on windows. import moved to try catch block - Issue: #17514	2024-02-14 13:45:10 -08:00
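The guarded import described for `pebblo.py` follows a common pattern for Unix-only modules. The sketch below is an assumption about how such a guard can look, not the code from the PR; `get_username` is a hypothetical helper.

```python
import os
from typing import Optional

try:
    import pwd  # Unix-only standard library module
except ImportError:
    pwd = None  # e.g. on Windows, where the module does not exist


def get_username() -> Optional[str]:
    """Return the current user's name, or None if it cannot be resolved."""
    if pwd is None:
        return None
    try:
        return pwd.getpwuid(os.getuid()).pw_name
    except KeyError:
        # The uid has no passwd entry (possible in some containers).
        return None


username = get_username()
```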
nvpranak	91bcc9c5c9	community[minor]: Nemo embeddings(#16206 ) This PR is adding support for NVIDIA NeMo embeddings issue #16095. --------- Co-authored-by: Praveen Nakshatrala <pnakshatrala@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 13:25:42 -08:00
Mateusz Szewczyk	916332ef5b	ibm: added partners package `langchain_ibm`, added llm (#16512 ) - Description: Added `langchain_ibm` as an langchain partners package of IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider (`WatsonxLLM`) - Dependencies: [ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/), - Tag maintainer: : --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-14 12:12:19 -08:00
Shawn	f6d3a3546f	community[patch]: document_loaders: modified athena key logic to handle s3 uris without a prefix (#17526 ) https://github.com/langchain-ai/langchain/issues/17525 ### Example Code ```python from langchain_community.document_loaders.athena import AthenaLoader database_name = "database" s3_output_path = "s3://bucket-no-prefix" query="""SELECT CAST(extract(hour FROM current_timestamp) AS INTEGER) AS current_hour, CAST(extract(minute FROM current_timestamp) AS INTEGER) AS current_minute, CAST(extract(second FROM current_timestamp) AS INTEGER) AS current_second; """ profile_name = "AdministratorAccess" loader = AthenaLoader( query=query, database=database_name, s3_output_uri=s3_output_path, profile_name=profile_name, ) documents = loader.load() print(documents) ``` ### Error Message and Stack Trace (if applicable) NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist ### Description Athena Loader errors when result s3 bucket uri has no prefix. The Loader instance call results in a "NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist." error. If s3_output_path contains a prefix like: ```python s3_output_path = "s3://bucket-with-prefix/prefix" ``` Execution works without an error. 
## Suggested solution Modify: ```python key = "/".join(tokens[1:]) + "/" + query_execution_id + ".csv" ``` to ```python key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv" ``` `9e8a3fc4ff/libs/community/langchain_community/document_loaders/athena.py (L128)` ### System Info System Information ------------------ > OS: Darwin > OS Version: Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT 2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103 > Python Version: 3.9.9 (main, Jan 9 2023, 11:42:03) [Clang 14.0.0 (clang-1400.0.29.102)] Package Information ------------------- > langchain_core: 0.1.23 > langchain: 0.1.7 > langchain_community: 0.0.20 > langsmith: 0.0.87 > langchain_openai: 0.0.6 > langchainhub: 0.1.14 Packages not installed (Not Necessarily a Problem) -------------------------------------------------- The following packages were not found: > langgraph > langserve --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:48:31 -08:00
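The suggested key logic can be checked in isolation. In this sketch only the `key = ...` expression comes from the issue text; the function name `athena_result_key` and the way the URI is split into `tokens` are assumptions made for illustration.

```python
from typing import Tuple


def athena_result_key(s3_output_uri: str, query_execution_id: str) -> Tuple[str, str]:
    """Split an S3 output URI into bucket and result-object key.

    Assumption: tokens are the "/"-separated parts after the s3:// scheme.
    """
    scheme = "s3://"
    path = s3_output_uri[len(scheme):] if s3_output_uri.startswith(scheme) else s3_output_uri
    tokens = path.rstrip("/").split("/")
    bucket = tokens[0]
    # Only add the separator when the URI actually contains a prefix.
    key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"
    return bucket, key


no_prefix = athena_result_key("s3://bucket-no-prefix", "abc")
with_prefix = athena_result_key("s3://bucket-with-prefix/prefix", "abc")
```

Without the conditional separator, the no-prefix case would produce a key beginning with `/`, which matches the `NoSuchKey` error reported above.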
wulixuan	c776cfc599	community[minor]: integrate with model Yuan2.0 (#15411 ) 1. integrate with [`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) 2. update `langchain.llms` 3. add a new doc for [Yuan2.0 integration](docs/docs/integrations/llms/yuan2.ipynb) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:46:20 -08:00
Philippe PRADOS	d07db457fc	community[patch]: Fix SQLAlchemyMd5Cache race condition (#16279 ) If the SQLAlchemyMd5Cache is shared among multiple processes, it is possible to encounter a race condition during the cache update. Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-14 11:45:28 -08:00
Alex Peplowski	70c296ae96	community[patch]: Expose Anthropic Retry Logic (#17069 ) Description: Expose Anthropic's retry logic, so that `max_retries` can be configured via langchain. Anthropic's retry logic is implemented in their Python SDK here: https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#retries --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:44:28 -08:00
Lyndsey	8562a1e7d4	community[patch]: support query filters for NotionDBLoader (#17217 ) - Description: Support filtering databases in the use case where devs do not want to query ALL entries within a DB, - Issue: N/A, - Dependencies: N/A, - Twitter handle: I don't have Twitter but feel free to tag my Github! --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-14 11:43:41 -08:00
volodymyr-memsql	e36bc379f2	community[patch]: Add vector index support to SingleStoreDB VectorStore (#17308 ) This pull request introduces support for various Approximate Nearest Neighbor (ANN) vector index algorithms in the VectorStore class, starting from version 8.5 of SingleStore DB. Leveraging this enhancement enables users to harness the power of vector indexing, significantly boosting search speed, particularly when handling large sets of vectors. --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:43:12 -08:00
Kate Silverstein	0bc4a9b3fc	community[minor]: Adds Llamafile as an LLM (#17431 ) * Description: Adds a simple LLM implementation for interacting with [llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models. * Dependencies: N/A * Issue: N/A Detail [llamafile](https://github.com/Mozilla-Ocho/llamafile) lets you run LLMs locally from a single file on most computers without installing any dependencies. To use the llamafile LLM implementation, the user needs to: 1. Download a llamafile e.g. https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile?download=true 2. Make the file executable. 3. Run the llamafile in 'server mode'. (All llamafiles come packaged with a lightweight server; by default, the server listens at `http://localhost:8080`.) ```bash wget https://url/of/model.llamafile chmod +x model.llamafile ./model.llamafile --server --nobrowser ``` Now, the user can invoke the LLM via the LangChain client: ```python from langchain_community.llms.llamafile import Llamafile llm = Llamafile() llm.invoke("Tell me a joke.") ```	2024-02-14 11:15:24 -08:00
Rakib Hosen	5ce1827d31	community[patch]: fix import in language parser (#17538 ) - Description: Resolves an import error in language_parser.py raised by "from langchain.langchain.text_splitter import Language" - Issue: #17536 - Dependencies: none - Twitter handle: @iRakibHosen --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-14 11:11:23 -08:00
Raunak	685d62b032	community[patch]: Added functions in NetworkxEntityGraph class (#17535 ) - Description: 1. Added _clear_edges()_ and _get_number_of_nodes()_ functions to the NetworkxEntityGraph class. 2. Added the above two functions to the graph_networkx_qa.ipynb documentation.	2024-02-14 11:02:24 -08:00
Qihui Xie	5738143d4b	add mongodb_store (#13801 ) # Add MongoDB storage - Description: Add MongoDB Storage as an option for large doc store. Example usage: ```Python # Instantiate the MongodbStore with a MongoDB connection from langchain.storage import MongodbStore mongo_conn_str = "mongodb://localhost:27017/" mongodb_store = MongodbStore(mongo_conn_str, db_name="test-db", collection_name="test-collection") # Set values for keys doc1 = Document(page_content='test1') doc2 = Document(page_content='test2') mongodb_store.mset([("key1", doc1), ("key2", doc2)]) # Get values for keys values = mongodb_store.mget(["key1", "key2"]) # [doc1, doc2] # Iterate over keys for key in mongodb_store.yield_keys(): print(key) # Delete keys mongodb_store.mdelete(["key1", "key2"]) ``` - Dependencies: Use `mongomock` for integration test. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-13 22:33:22 -05:00
Nat Noordanus	8a3b74fe1f	community[patch]: Fix pydantic ForwardRef error in BedrockBase (#17416 ) - Description: Fixes a type annotation issue in the definition of BedrockBase. This issue was that the annotation for the `config` attribute includes a ForwardRef to `botocore.client.Config` which is only imported when `TYPE_CHECKING`. This can cause pydantic to raise an error like `pydantic.errors.ConfigError: field "config" not yet prepared so type is still a ForwardRef, ...`. - Issue: N/A - Dependencies: N/A - Twitter handle: `@__nat_n__`	2024-02-13 16:15:55 -08:00
Ashley Xu	f746a73e26	Add the BQ job usage tracking from LangChain (#17123 ) - Description: Add the BQ job usage tracking from LangChain --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-02-13 14:47:57 -08:00
Bagatur	39342d98d6	community[patch]: Release 0.0.20 (#17480 )	2024-02-13 13:01:51 -08:00
Max Jakob	ab3d944667	community[patch]: ElasticsearchStore: preserve user headers (#16830 ) Users can provide an Elasticsearch connection with custom headers. This PR makes sure these headers are preserved when adding the langchain user agent header.	2024-02-13 12:37:35 -08:00
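The header-preserving behaviour described in #16830 above can be sketched with plain dictionaries. This is an illustration only, not the `ElasticsearchStore` code: `with_user_agent` is a hypothetical helper and the user agent string is a placeholder.

```python
from typing import Dict, Optional


def with_user_agent(user_headers: Optional[Dict[str, str]], user_agent: str) -> Dict[str, str]:
    """Add a user agent header while keeping every header the user supplied."""
    # Copy first so the caller's dictionary is not mutated.
    merged = dict(user_headers or {})
    merged["User-Agent"] = user_agent
    return merged


custom = {"Authorization": "ApiKey placeholder", "X-Team": "search"}
merged_headers = with_user_agent(custom, "example-agent/0.0")
```

The point of the sketch is the copy-then-add order: overwriting the whole header mapping with only the user agent would drop `Authorization` and `X-Team`.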
wulixuan	5d06797905	community[minor]: integrate chat models with Yuan2.0 (#16575 ) 1. integrate chat models with [`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md) 2. add a new doc for [Yuan2.0 integration](docs/docs/integrations/llms/yuan2.ipynb) Yuan2.0 is a new generation Fundamental Large Language Model developed by IEIT System. We have published all three models, Yuan 2.0-102B, Yuan 2.0-51B, and Yuan 2.0-2B. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-13 10:55:14 -08:00
Ian Gregory	e5472b5eb8	Framework for supporting more languages in LanguageParser (#13318 ) ## Description I am submitting this for a school project as part of a team of 5. Other team members are @LeilaChr, @maazh10, @Megabear137, @jelalalamy. This PR also has contributions from community members @Harrolee and @Mario928. Initial context is in the issue we opened (#11229). This pull request adds: - Generic framework for expanding the languages that `LanguageParser` can handle, using the [tree-sitter](https://github.com/tree-sitter/py-tree-sitter#py-tree-sitter) parsing library and existing language-specific parsers written for it - Support for the following additional languages in `LanguageParser`: - C - C++ - C# - Go - Java (contributed by @Mario928 https://github.com/ThatsJustCheesy/langchain/pull/2) - Kotlin - Lua - Perl - Ruby - Rust - Scala - TypeScript (contributed by @Harrolee https://github.com/ThatsJustCheesy/langchain/pull/1) Here is the [design document](https://docs.google.com/document/d/17dB14cKCWAaiTeSeBtxHpoVPGKrsPye8W0o_WClz2kk) if curious, but no need to read it. ## Issues - Closes #11229 - Closes #10996 - Closes #8405 ## Dependencies `tree_sitter` and `tree_sitter_languages` on PyPI. We have tried to add these as optional dependencies. ## Documentation We have updated the list of supported languages, and also added a section to `source_code.ipynb` detailing how to add support for additional languages using our framework. ## Maintainer - @hwchase17 (previously reviewed https://github.com/langchain-ai/langchain/pull/6486) Thanks!! ## Git commits We will gladly squash any/all of our commits (esp merge commits) if necessary. Let us know if this is desirable, or if you will be squash-merging anyway. --------- Co-authored-by: Maaz Hashmi <mhashmi373@gmail.com> Co-authored-by: LeilaChr <87657694+LeilaChr@users.noreply.github.com> Co-authored-by: Jeremy La <jeremylai511@gmail.com> Co-authored-by: Megabear137 <zubair.alnoor27@gmail.com> Co-authored-by: Lee Harrold <lhharrold@sep.com> Co-authored-by: Mario928 <88029051+Mario928@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-02-13 08:45:49 -08:00
Abhishek Jain	37e1275f9e	community[patch]: Fixed the 'aembed' method of 'CohereEmbeddings'. (#16497 ) Description: - The existing code was trying to find a `.embeddings` property on the `Coroutine` returned by calling `cohere.async_client.embed`. - Instead, the `.embeddings` property is present on the value returned by the `Coroutine`. - Also, it seems that the original cohere client expects a value of `max_retries` to not be `None`. Hence, setting the default value of `max_retries` to `3`. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 21:57:27 -08:00
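The bug described in #16497 above is a general asyncio pitfall: an attribute is read from the coroutine object instead of from the awaited result. The sketch below reproduces the pattern with a stand-in client; `FakeAsyncClient` and `aembed` are hypothetical names and do not reflect the Cohere SDK or the `CohereEmbeddings` implementation.

```python
import asyncio
from types import SimpleNamespace
from typing import List


class FakeAsyncClient:
    """Stand-in for an async embedding client (not the Cohere SDK)."""

    async def embed(self, texts: List[str]) -> SimpleNamespace:
        # Return a response object that carries an `embeddings` attribute.
        return SimpleNamespace(embeddings=[[float(len(t))] for t in texts])


async def aembed(client: FakeAsyncClient, texts: List[str]) -> List[List[float]]:
    # Broken form: client.embed(texts).embeddings raises AttributeError,
    # because the un-awaited call returns a coroutine object.
    response = await client.embed(texts)
    return response.embeddings


vectors = asyncio.run(aembed(FakeAsyncClient(), ["hi", "hello"]))
```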
Sridhar Ramaswamy	9f1cbbc6ed	community[minor]: Add pebblo safe document loader (#16862 ) - Description: Pebblo opensource project enables developers to safely load data to their Gen AI apps. It identifies semantic topics and entities found in the loaded data and summarizes them in a developer-friendly report. - Dependencies: none - Twitter handle: srics @hwchase17	2024-02-12 21:56:12 -08:00
mhavey	1bbb64d956	community[minor], langchain[minor]: Add Neptune Rdf graph and chain (#16650 ) Description: This PR adds a chain for Amazon Neptune graph database RDF format. It complements the existing Neptune Cypher chain. The PR also includes a Neptune RDF graph class to connect to, introspect, and query a Neptune RDF graph database from the chain. A sample notebook is provided under docs that demonstrates the overall effect: invoking the chain to make natural language queries against Neptune using an LLM. Issue: This is a new feature Dependencies: The RDF graph class depends on the AWS boto3 library if using IAM authentication to connect to the Neptune database. --------- Co-authored-by: Piyush Jain <piyushjain@duck.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 21:30:20 -08:00
Michael Feil	e1cfd0f3e7	community[patch]: infinity embeddings update incorrect default url (#16759 ) The default URL has always been incorrect (port 7797 instead of 7997). This updates it to the correct URL.	2024-02-12 20:05:08 -08:00
Andreas Motl	1fdd9bd980	community/SQLDatabase: Generalize and trim software tests (#16659 ) - Description: Improve test cases for `SQLDatabase` adapter component, see [suggestion](https://github.com/langchain-ai/langchain/pull/16655#pullrequestreview-1846749474). - Depends on: GH-16655 - Addressed to: @baskaryan, @cbornet, @eyurtsev _Remark: This PR is stacked upon GH-16655, so that one will need to go in first._ Edit: Thank you for bringing in GH-17191, @eyurtsev. This is a little aftermath, improving/streamlining the corresponding test cases.	2024-02-12 22:58:34 -05:00
Theo / Taeyoon Kang	1987f905ed	core[patch]: Support .yml extension for YAML (#16783 ) - Description: [AS-IS] When dealing with a YAML file, the extension must be .yaml. [TO-BE] Operating systems no longer constrain extension length, so .yaml is the usual extension, but .yml should still be accepted. Rejecting it is like rejecting .jpg in JPEG support. - Issue: - - Dependencies: no dependencies required for this change,	2024-02-12 19:57:20 -08:00
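The extension check described in #16783 above amounts to accepting two suffixes instead of one. A minimal sketch, assuming a suffix-based check; `is_yaml_path` and `YAML_SUFFIXES` are hypothetical names, not identifiers from `langchain_core`.

```python
from pathlib import Path
from typing import Union

# Both spellings are treated as YAML (assumed exact, case-sensitive match).
YAML_SUFFIXES = {".yaml", ".yml"}


def is_yaml_path(path: Union[str, Path]) -> bool:
    """Return True when the file suffix is one of the accepted YAML suffixes."""
    return Path(path).suffix in YAML_SUFFIXES


checks = [is_yaml_path(p) for p in ("prompt.yaml", "prompt.yml", "prompt.json")]
```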
Kapil Sachdeva	cd00a87db7	community[patch] - in FAISS vector store, support passing custom DocStore implementation when using from_xxx methods (#16801 ) - Description: The from_xxx methods of the FAISS class hardcode the InMemoryStore implementation and therefore do not let users pass a custom DocStore implementation, - Issue: no referenced issue, - Dependencies: none, - Twitter handle: ksachdeva	2024-02-12 19:51:55 -08:00
Chris	f9f5626ca4	community[patch]: Fix github search issues and PRs PaginatedList has no len() error (#16806 ) Description: Bugfix: langchain_community's GitHub API wrapper throws a TypeError when searching for issues and/or PRs (the `search_issues_and_prs` method). This is because PyGithub's PaginatedList type does not support len(). See https://github.com/PyGithub/PyGithub/issues/1476 ![image](https://github.com/langchain-ai/langchain/assets/8849021/57390b11-ed41-4f48-ba50-f3028610789c) Dependencies: None Twitter handle: @ChrisKeoghNZ I haven't registered an issue as it would take me longer to fill the template out than to make the fix, but I'm happy to if that's deemed essential. I've added a simple integration test to cover this as there were no existing unit tests and it was going to be tricky to set them up. Co-authored-by: Chris Keogh <chris.keogh@xero.com>	2024-02-12 19:50:59 -08:00
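The `len()` failure described in #16806 above applies to any lazily paginated iterable. The sketch below uses a stand-in class rather than PyGithub, and `first_n` is a hypothetical helper; it shows one way to cap the number of results without calling `len()`, which is not necessarily the approach taken in the commit.

```python
from itertools import islice
from typing import Iterable, Iterator, List


class PagedResults:
    """Iterable without __len__, standing in for a lazily paginated result."""

    def __init__(self, items: Iterable[int]) -> None:
        self._items = list(items)

    def __iter__(self) -> Iterator[int]:
        return iter(self._items)


def first_n(results: Iterable[int], n: int = 5) -> List[int]:
    # islice stops after n items and never needs the total length.
    return list(islice(results, n))


top_three = first_n(PagedResults(range(10)), 3)
```

Calling `len(PagedResults([1]))` raises `TypeError`, which mirrors the error in the commit title.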
morgana	722aae4fd1	community: add delete method to rocksetdb vectorstore to support recordmanager (#17030 ) - Description: This adds a delete method so that rocksetdb can be used with `RecordManager`. - Issue: N/A - Dependencies: N/A - Twitter handle: `@_morgan_adams_` --------- Co-authored-by: Rockset API Bot <admin@rockset.io>	2024-02-12 19:50:20 -08:00
yin1991	c454dc36fc	community[proxy]: Enhancement/add proxy support playwrighturlloader 16751 (#16822 ) - Description: Enhancement/add proxy support playwrighturlloader 16751 - Issue: [Enhancement: Add Proxy Support to PlaywrightURLLoader Class](https://github.com/langchain-ai/langchain/issues/16751) - Dependencies: - Twitter handle: @ootR77013489 --------- Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 19:48:29 -08:00
Lingzhen Chen	30af711c34	community[patch]: update AzureSearch class to work with azure-search-documents=11.4.0 (#15659 ) - Description: Updates `libs/community/langchain_community/vectorstores/azuresearch.py` to support the stable version `azure-search-documents=11.4.0` - Issue: https://github.com/langchain-ai/langchain/issues/14534, https://github.com/langchain-ai/langchain/issues/15039, https://github.com/langchain-ai/langchain/issues/15355 - Dependencies: azure-search-documents>=11.4.0 --------- Co-authored-by: Clément Tamines <Skar0@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 19:23:35 -08:00
Robby	e135dc70c3	community[patch]: Invoke callback prior to yielding token (#17348 ) Description: Invoke callback prior to yielding token in stream method for Ollama. Issue: [Callback for on_llm_new_token should be invoked before the token is yielded by the model #16913](https://github.com/langchain-ai/langchain/issues/16913) Co-authored-by: Robby <h0rv@users.noreply.github.com>	2024-02-12 19:22:55 -08:00
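The ordering described in #17348 above (callback first, then yield) can be shown with a plain generator. This is a sketch under simplified assumptions: `stream_tokens` and the `on_new_token` callable are hypothetical and stand in for the streaming method and the `on_llm_new_token` callback.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def stream_tokens(tokens: Iterable[str], on_new_token: Callable[[str], None]) -> Iterator[str]:
    for token in tokens:
        # Notify the callback before handing the token to the consumer.
        on_new_token(token)
        yield token


events: List[Tuple[str, str]] = []
for tok in stream_tokens(["a", "b"], lambda t: events.append(("callback", t))):
    events.append(("consumed", tok))
```

If the two lines inside the loop were swapped, the consumer would see each token before its callback had run, because a generator pauses at `yield` until the next item is requested.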
Christophe Bornet	ab025507bc	community[patch]: Add async methods to VectorStoreQATool (#16949 )	2024-02-12 19:19:50 -08:00
Christophe Bornet	fb7552bfcf	Add async methods to InMemoryCache (#17425 ) Add async methods to InMemoryCache	2024-02-12 22:02:38 -05:00
yin1991	37ef6ac113	community[patch]: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval (#16934 ) - Description: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval - Issue: [the issue # it fixes if applicable,](https://github.com/langchain-ai/langchain/issues/16864) --------- Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-02-12 18:30:36 -08:00
Robby	ece4b43a81	community[patch]: doc loaders mypy fixes (#17368 ) Description: Fixed `type: ignore`'s for mypy for some document_loaders. Issue: [Remove "type: ignore" comments #17048 ](https://github.com/langchain-ai/langchain/issues/17048) --------- Co-authored-by: Robby <h0rv@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-02-12 16:51:06 -08:00
Robby	0653aa469a	community[patch]: Invoke callback prior to yielding token (#17346 ) Description: Invoke callback prior to yielding token in stream method for watsonx. Issue: [Callback for on_llm_new_token should be invoked before the token is yielded by the model #16913](https://github.com/langchain-ai/langchain/issues/16913) Co-authored-by: Robby <h0rv@users.noreply.github.com>	2024-02-12 16:36:33 -08:00
Bagatur	f7e453971d	community[patch]: remove print (#17435 )	2024-02-12 15:21:38 -08:00
Spencer Kelly	54fa78c887	community[patch]: fixed vector similarity filtering (#16967 ) Description: changed filtering so that failed filter doesn't add document to results. Currently filtering is entirely broken and all documents are returned whether or not they pass the filter. fixes issue introduced in https://github.com/langchain-ai/langchain/pull/16190	2024-02-12 14:52:57 -08:00
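The behaviour described in #16967 above, where a document that fails the filter must not reach the results, can be sketched generically. `filter_results` and the dictionary-based documents are hypothetical and are not taken from the affected vector store.

```python
from typing import Any, Dict, List, Tuple

Doc = Dict[str, Any]


def filter_results(scored_docs: List[Tuple[Doc, float]], metadata_filter: Dict[str, Any]) -> List[Tuple[Doc, float]]:
    results = []
    for doc, score in scored_docs:
        # Append only when every filter key matches the document metadata.
        if all(doc["metadata"].get(k) == v for k, v in metadata_filter.items()):
            results.append((doc, score))
    return results


candidates = [
    ({"text": "alpha", "metadata": {"lang": "en"}}, 0.9),
    ({"text": "beta", "metadata": {"lang": "de"}}, 0.8),
]
kept = filter_results(candidates, {"lang": "en"})
```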
Abhijeeth Padarthi	584b647b96	community[minor]: AWS Athena Document Loader (#15625 ) - Description: Adds the document loader for [AWS Athena](https://aws.amazon.com/athena/), a serverless and interactive analytics service. - Dependencies: Added boto3 as a dependency	2024-02-12 12:53:40 -08:00

625 Commits