langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Jiří Moravčík	86646ec555	feat: Add `ApifyWrapper` class (#10067 ) If you look at documentation https://python.langchain.com/docs/integrations/tools/apify (or the actual file https://github.com/langchain-ai/langchain/blob/master/docs/extras/integrations/tools/apify.ipynb ), there's a class `ApifyWrapper` mentioned. It seems it got lost in some refactoring, i.e. it does not exist in the codebase ATM. I just propose to add it back. It would fix issues e.g. https://github.com/langchain-ai/langchain/issues/8307 or https://github.com/langchain-ai/langchain/issues/8201 To add, Apify is a wanted integration, e.g. see https://twitter.com/hwchase17/status/1695490295914545626 or https://twitter.com/hwchase17/status/1695470765343461756 Lastly, I offer taking ownership of the Apify-related parts of the codebase, so you can tag me if anything is needed.	2023-08-31 15:47:44 -07:00
Robert Perrotta	02e51f4217	update_forward_refs for Run (#9969 ) Adds a call to Pydantic's `update_forward_refs` for the `Run` class (in addition to the `ChainRun` and `ToolRun` classes, for which that method is already called). Without it, the self-reference of child classes (type `List[Run]`) is problematic. For example: ```python from langchain.callbacks import StdOutCallbackHandler from langchain.chains import LLMChain from langchain.llms import OpenAI from langchain.prompts import PromptTemplate from wandb.integration.langchain import WandbTracer llm = OpenAI() prompt = PromptTemplate.from_template("1 + {number} = ") chain = LLMChain(llm=llm, prompt=prompt, callbacks=[StdOutCallbackHandler(), WandbTracer()]) print(chain.run(number=2)) ``` results in the following output before the change ``` WARNING:root:Error in on_chain_start callback: field "child_runs" not yet prepared so type is still a ForwardRef, you might need to call Run.update_forward_refs(). > Entering new LLMChain chain... Prompt after formatting: 1 + 2 = WARNING:root:Error in on_chain_end callback: No chain Run found to be traced > Finished chain. 3 ``` but afterwards the callback error messages are gone.	2023-08-31 15:25:59 -07:00
Eugene Yurtsev	74fcfed4e2	lint for pydantic imports (#9937 ) Catch pydantic imports	2023-08-31 15:55:29 -04:00
Zizhong Zhang	641b71e2cd	refactor: rename to OpaquePrompts (#10013 ) Renamed to OpaquePrompts cc @baskaryan Thanks in advance!	2023-08-31 12:21:24 -07:00
Bagatur	8d66b00c73	Data anonymizer notebook nit (#10062 )	2023-08-31 10:58:13 -07:00
Bagatur	19400ba253	bump 278 (#10052 )	2023-08-31 07:35:42 -07:00
Bagatur	29270e0378	fix #3117 (#9957 ) fix #3117	2023-08-31 07:29:49 -07:00
Bagatur	5b913003e0	bump	2023-08-31 07:27:56 -07:00
Bagatur	4b15328767	Add indexing support for postgresql (#9933 ) Add support to postgresql for the SQL Manager Record This code was tested locally. I'm looking at how to add testing with postgres in a separate PR.	2023-08-31 07:27:09 -07:00
Bagatur	e60e1cdf23	fixed openai_functions api_response format args err (#9968 ) root cause: args may not have a key (params) resulting in an error	2023-08-31 00:49:19 -07:00
Bagatur	3efab8d3df	implement vectorstores by tencent vectordb (#9989 ) Hi there! I'm excited to open this PR to add support for using 'Tencent Cloud VectorDB' as a vector store. Tencent Cloud VectorDB is a fully-managed, self-developed, enterprise-level distributed database service designed for storing, retrieving, and analyzing multi-dimensional vector data. The database supports multiple index types and similarity calculation methods, with a single index supporting vector scales up to 1 billion and capable of handling millions of QPS with millisecond-level query latency. Tencent Cloud VectorDB not only provides external knowledge bases for large models to improve their accuracy, but also has wide applications in AI fields such as recommendation systems, NLP services, computer vision, and intelligent customer service. The PR includes: Implementation of Vectorstore. I have read your [contributing guidelines](`72b7d76d79/.github/CONTRIBUTING.md`), and I have passed the tests below: make format, make lint, make coverage, make test	2023-08-31 00:48:25 -07:00
Bagatur	d43a36c32a	Bagatur/dereference tool schema (#10007 ) fix for #9375	2023-08-31 00:48:12 -07:00
Bagatur	6b5a970949	refactor(document_loaders): abstract page evaluation logic in PlaywrightURLLoader (#9995 ) This PR brings structural updates to `PlaywrightURLLoader`, aiming to make the code more readable and extensible by abstracting the page evaluation logic. These changes also align this implementation with a similar structure used in LangChain.js. The key enhancements include: 1. Introduction of 'PlaywrightEvaluator', an abstract base class for all evaluators. 2. Creation of 'UnstructuredHtmlEvaluator', a concrete class implementing 'PlaywrightEvaluator', which uses the `unstructured` library to process the page's HTML content. 3. Extension of the 'PlaywrightURLLoader' constructor to optionally accept an evaluator of type 'PlaywrightEvaluator'. It defaults to 'UnstructuredHtmlEvaluator' if no evaluator is provided. 4. Refactoring of the 'load' and 'aload' methods to use the 'evaluate' and 'evaluate_async' methods of the provided 'PlaywrightEvaluator' for page content handling. This update makes 'PlaywrightURLLoader' more flexible, as it can now use different evaluators for page processing depending on the requirement. The abstraction also improves code maintainability and readability. Twitter: @ywkim	2023-08-31 00:45:33 -07:00
Bagatur	b1644bc9ad	cr	2023-08-31 00:43:34 -07:00
Hunsmore	13fef1e5d3	add bloomz_7b, llama-2-7b, llama-2-13b, llama-2-70b to ErnieBotChat (#10024 ) - Description: Add bloomz_7b, llama-2-7b, llama-2-13b, llama-2-70b to ErnieBotChat, which only supported ERNIE-Bot-turbo and ERNIE-Bot. - Issue: #10022, - Dependencies: no extra dependencies --------- Co-authored-by: hetianfeng <hetianfeng@meituan.com>	2023-08-31 00:38:55 -07:00
Cameron Vetter	e37d51cab6	fix scoring profile example (#10016 ) - Description: A change in the documentation example for Azure Cognitive Vector Search with Scoring Profile so the example works as written - Issue: #10015 - Dependencies: None - Tag maintainer: @baskaryan @ruoccofabrizio - Twitter handle: @poshporcupine	2023-08-31 00:35:06 -07:00
skspark	52a3e8a261	Add integration TCs on bing search (#8068 ) (#10021 ) ## Description Added integration TCs on bing search utility ## Issue #8068 ## Dependencies None	2023-08-31 00:34:06 -07:00
Hyeokjun seo	e2e05ad89e	Fix Typo : `openai_api_key` -> `serpapi_api_key` (#10020 ) Fixed typo in the comments Notebook. (which says `openai_api_key` for SerpAPI)	2023-08-31 00:33:13 -07:00
Tomaz Bratanic	f2e8399cc8	Fix link in Neo4j provider page (#10023 )	2023-08-31 00:32:42 -07:00
William FH	5341b04d68	Update error message (#9970 ) in evals	2023-08-30 17:42:55 -07:00
William FH	b82ad19ed2	Check memory address (#9971 ) Don't want to dup the collector but can have multiple	2023-08-30 15:30:22 -07:00
Bagatur	e805f8e263	add tests	2023-08-30 15:23:02 -07:00
Bagatur	1f5c579ef4	add	2023-08-30 13:37:50 -07:00
Bagatur	240cc289e6	wip	2023-08-30 13:37:39 -07:00
Bagatur	7fa82900cb	guides docs nits (#10005 )	2023-08-30 11:07:42 -07:00
Bagatur	2f03e71e67	rename local llm guide (#10004 )	2023-08-30 10:52:46 -07:00
Bagatur	781f274d19	make privacy guide section (#10003 )	2023-08-30 10:49:20 -07:00
maks-operlejn-ds	a8f804a618	Add data anonymizer (#9863 ) ### Description The feature for anonymizing data has been implemented. In order to protect private data, such as when querying external APIs (OpenAI), it is worth pseudonymizing sensitive data to maintain full privacy. Anonymization consists of two steps: 1. Identification: Identify all data fields that contain personally identifiable information (PII). 2. Replacement: Replace all PIIs with pseudo values or codes that do not reveal any personal information about the individual but can be used for reference. We're not using regular encryption, because the language model won't be able to understand the meaning or context of the encrypted data. We use Microsoft Presidio together with the Faker framework for anonymization purposes because of the wide range of functionalities they provide. The full implementation is available in `PresidioAnonymizer`. ### Future works - deanonymization - add the ability to reverse anonymization. For example, the workflow could look like this: `anonymize -> LLMChain -> deanonymize`. By doing this, we will retain anonymity in requests to, for example, OpenAI, and then be able to restore the original data. - instance anonymization - at this point, each occurrence of PII is treated as a separate entity and separately anonymized. Therefore, two occurrences of the name John Doe in the text will be changed to two different names. It is therefore worth introducing support for full instance detection, so that repeated occurrences are treated as a single object. ### Twitter handle @deepsense_ai / @MaksOpp --------- Co-authored-by: MaksOpp <maks.operlejn@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-30 10:39:44 -07:00
Bagatur	98cce7dcd3	update moderation docs (#10002 )	2023-08-30 10:34:25 -07:00
Bagatur	b3e3a31240	bump 277 (#9997 )	2023-08-30 08:29:51 -07:00
Bagatur	9828701de1	mv base cache to schema (#9953 ) if you remove all other imports from langchain.init it exposes a circular dep	2023-08-30 08:10:51 -07:00
Christophe Bornet	9870bfb9cd	Add bucket and object key to metadata in S3 loader (#9317 ) - Description: this PR adds `s3_object_key` and `s3_bucket` to the doc metadata when loading an S3 file. This is particularly useful when using `S3DirectoryLoader` to remove the files from the dir once they have been processed (getting the object keys from the metadata `source` field seems brittle) - Dependencies: N/A - Tag maintainer: ? - Twitter handle: _cbornet --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-30 11:03:24 -04:00
Eugene Yurtsev	6da158388b	Merge branch 'master' into ywkim/master	2023-08-30 10:46:26 -04:00
Guy Korland	24c0b01c38	Extend the FalkorDB QA demo (#9992 ) - Description: Extend the FalkorDB QA demo - Tag maintainer: @baskaryan	2023-08-30 10:13:18 -04:00
Eugene Yurtsev	588237ef30	Make document serializable, create utility to create a docstore (#9674 ) This PR makes the following changes: 1. Documents become serializable using langchain serialization 2. Make a utility to create a docstore kw store Will help to address issue here: https://github.com/langchain-ai/langchain/issues/9345	2023-08-30 09:45:04 -04:00
Eugene Yurtsev	e8f29be350	x	2023-08-30 09:36:27 -04:00
Buckler89	a28e888b36	fix call _get_keys for custom_evaluator (#9763 ) In the function _load_run_evaluators the function _get_keys was not called if only custom_evaluators parameter is used - Description: In the function _load_run_evaluators the function _get_keys was not called if only custom_evaluators parameter is used, - Issue: no issue created for this yet, - Dependencies: None, - Tag maintainer: @vowelparrot, - Twitter handle: Buckler89 --------- Co-authored-by: ddroghini <d.droghini@mflgroup.com>	2023-08-30 06:35:23 -07:00
Eugene Yurtsev	cafce9ed23	x	2023-08-30 09:35:00 -04:00
wlleiiwang	8c4e29240c	implement vectorstores by tencent vectordb	2023-08-30 16:40:58 +08:00
Bagatur	2d2b097fab	mv chat history (#9725 )	2023-08-29 21:41:32 -07:00
Bagatur	d762a6b51f	rm mutable defaults (#9974 )	2023-08-29 20:36:27 -07:00
Arjun Aravindan	6a51672164	Update SeleniumURLLoader to use webdriver Service in favor of deprecated executable_path parameter (#9814 ) Description: This commit uses the new Service object in Selenium webdriver as executable_path has been [deprecated and removed in selenium version 4.11.2](`9f5801c82f`) Issue: https://github.com/langchain-ai/langchain/issues/9808 Tag Maintainer: @eyurtsev	2023-08-29 19:45:18 -07:00
William FH	c844aaa7a6	Weakref to tracer (#9954 ) Prevent memory/thread leakage	2023-08-29 19:27:22 -07:00
Jurik-001	a05fed9369	Fix add callbacks to spark_sql due to deprecation of callback_manager (#9831 ) Description: Due to the deprecation of callback_manager (see line 109 in [langchain/libs/langchain/langchain/chains/base.py](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chains/base.py)), I replaced several parts. Issue: None Dependencies: Maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-29 19:23:44 -07:00
dafu	c26deb6b38	fixed openai_functions api_response format args err root cause: args may not have a key (params) resulting in an error	2023-08-30 09:58:24 +08:00
axiangcoding	ffa5625134	feat(llms): improve ERNIE-Bot chat model (#9833 ) - Description: improve ERNIE-Bot chat model, add request timeout and more testcases. - Issue: None - Dependencies: None - Tag maintainer: @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-29 18:20:06 -07:00
Bagatur	bdccb1215a	docs: `integrations/tools` consistency (#9965 ) Updated titles, descriptions into consistent format.	2023-08-29 18:04:01 -07:00
Bagatur	d966ba63e2	fixed GoogleCloudEnterpriseSearchRetriever returning an empty array (#9858 ) `GoogleCloudEnterpriseSearchRetriever` returned an empty array of documents earlier, fixed	2023-08-29 17:49:48 -07:00
Bagatur	ec362ecbe2	Fixed regex bug in RetrievalQAWithSources in previous update (#9898 ) - Description: In my previous PR, I had modified the code to catch all kinds of [SOURCES, sources, Source, Sources]. However, this change included checking for a colon or a white space which should actually have been only checking for a colon. - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change,	2023-08-29 17:32:24 -07:00
Nikhil Suresh	56a0165a4e	cleaned up unit test example	2023-08-29 23:37:54 +00:00

4285 Commits