langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-31 15:20:26 +00:00

Author	SHA1	Message	Date
William FH	2e3d77c34e	Fix eval loader when overriding arguments (#7734 ) - Update the negative criterion descriptions to prevent bad predictions - Add support for normalizing the string distance - Fix potential json deserializing into float issues in the example mapper	2023-07-15 08:30:32 -07:00
Bagatur	c871c04270	bump 234 (#7754 )	2023-07-15 10:49:51 -04:00
Gordon Clark	96f3dff050	MediaWiki docloader improvements + unit tests (#5879 ) Starting over from #5654 because I utterly borked the poetry.lock file. Adds new parameters to the MWDumpLoader class: * skip_redirects (bool) Tells the loader to skip articles that redirect to other articles. False by default. * stop_on_error (bool) Tells the parser to skip any page that causes a parse error. True by default. * namespaces (List[int]) Tells the parser which namespaces to parse. Contains namespaces from -2 to 15 by default. Default values are chosen to preserve backwards compatibility. Sample dump XML and full unit test coverage (with extended tests that pass!) also included! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:49:36 -04:00
Xavier	4c8106311f	Add `pip install langsmith` for Quick Install part of README (#7694 ) Issue: When I use conda to install langchain, a dependency error is thrown: "ModuleNotFoundError: No module named 'langsmith'". Update: Run `pip install langsmith` when installing langchain with conda. Co-authored-by: xaver.xu <xavier.xu@batechworks.com>	2023-07-15 10:27:32 -04:00
Mohammad Mohtashim	b8b8a138df	Simple Import fix in Tools Exception Docs (#7740 ) Issue: #7720 @hinthornw	2023-07-15 10:25:34 -04:00
Nicolas	43f900fd38	docs: Mendable Search Improvements (#7744 ) - New pin-to-side (button). This functionality allows you to search the docs while asking the AI for questions - Fixed the search bar in Firefox that won't detect a mouse click - Fixes and improvements overall in the model's performance	2023-07-15 10:19:21 -04:00
rjarun8	b7c409152a	Document loader/debug (#7750 ) Description: Added debugging output in DirectoryLoader to identify the file being processed. Issue: [Need a trace or debug feature in Lanchain DirectoryLoader #7725](https://github.com/hwchase17/langchain/issues/7725) Dependencies: No additional dependencies are required. Tag maintainer: @rlancemartin, @eyurtsev This PR enhances the DirectoryLoader with debugging output to help diagnose issues when loading documents. This new feature does not add any dependencies and has been tested on a local machine.	2023-07-15 10:18:27 -04:00
Lance Martin	b015647e31	Add GPT4All embeddings (#7743 ) Support for [GPT4All embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:04:29 -04:00
Chang Sau Sheong	b6a7f40ad3	added support for Google Images search (#7751 ) - Description: Added Google Image Search support for SerpAPIWrapper - Issue: NA - Dependencies: None - Tag maintainer: @hinthornw - Twitter handle: @sausheong --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:04:18 -04:00
Kacper Łukawski	1ff5b67025	Implement async API for Qdrant vector store (#7704 ) Inspired by #5550, I implemented full async API support in Qdrant. The docs were extended to mention the existence of asynchronous operations in Langchain. I also used that chance to restructure the tests of Qdrant and provided a suite of tests for the async version. Async API requires the GRPC protocol to be enabled. Thus, it doesn't work on local mode yet, but we're considering including the support to be consistent.	2023-07-15 09:33:26 -04:00
Bearnardd	275b926cf7	add missing import (#7730 ) Just a nit documentation fix @baskaryan	2023-07-14 20:03:23 -04:00
Bearnardd	9800c6051c	add support for truncate arg for HuggingFaceTextGenInference class (#7728 ) Fixes https://github.com/hwchase17/langchain/issues/7650 * add support for `truncate` argument of `HuggingFaceTextGenInference` @baskaryan	2023-07-14 16:23:56 -04:00
Lorenzo	77e6bbe6f0	fix typo in deeplake.ipynb (#7718 ) - Fixing typos in deeplake documentation - @baskaryan	2023-07-14 13:38:31 -04:00
Samuel Berthe	2be3515a66	SQLDatabase: adding security disclaimer (#7710 ) It might be obvious to most engineers, but I think everybody should be cautious when using such a chain. ![image](https://github.com/hwchase17/langchain/assets/2951285/a1df6567-9d56-4c12-98ea-767401ae2ac8)	2023-07-14 13:38:16 -04:00
William FH	fcf98dc4c1	Check for Tiktoken (#7705 )	2023-07-14 09:49:01 -07:00
Bagatur	bae93682f6	update docs (#7714 )	2023-07-14 11:49:09 -04:00
Bagatur	b065da6933	Bagatur/docs nit (#7712 )	2023-07-14 11:13:02 -04:00
Bagatur	87d81b6acc	Redirect old text splitter page (#7708 ) related to #7665	2023-07-14 11:12:18 -04:00
Aarav Borthakur	210296a71f	Integrate Rockset as a document loader (#7681 ) Integrate [Rockset](https://rockset.com/docs/) as a document loader. Issue: None Dependencies: Nothing new (rockset's dependency was already added [here](https://github.com/hwchase17/langchain/pull/6216)) Tag maintainer: @rlancemartin I have added a test for the integration and an example notebook showing its use. I ran `make lint` and everything looks good. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 07:58:13 -07:00
Bagatur	ad7d97670b	bump 233 (#7707 )	2023-07-14 10:38:13 -04:00
Samuel Berthe	7d4843fe84	feat(chains): adding ElasticsearchDatabaseChain for interacting with analytics database (#7686 ) This pull request adds a ElasticsearchDatabaseChain chain for interacting with analytics database, in the manner of the SQLDatabaseChain. Maintainer: @samber Twitter handler: samuelberthe --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 10:30:57 -04:00
Daniel	6d88b23ef7	Update pgembedding.ipynb (#7699 ) Update the extension name. It changed from pg_hnsw to pg_embedding. Thank you. I missed this in my previous commit.	2023-07-14 08:39:01 -04:00
Eric Speidel	663b0933e4	Allow passing auth objects in TextRequestsWrapper (#7701 ) - Description: This allows passing auth objects in request wrappers. Currently, we can handle auth by editing headers in the RequestsWrappers, but more complex auth methods, such as Kerberos, could be handled better by using existing functionality within the requests library. There are many authentication options supported both natively and by extensions, such as requests-kerberos or requests-ntlm. - Issue: Fixes #7542 - Dependencies: none Co-authored-by: eric.speidel@de.bosch.com <eric.speidel@de.bosch.com>	2023-07-14 08:38:24 -04:00
Nuno Campos	1e40427755	Enabled nesting chain group (#7697 )	2023-07-14 10:03:16 +01:00
Leonid Kuligin	85e1c9b348	Added support for examples for VertexAI chat models. (#7636 ) #5278 Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-14 02:03:04 -04:00
Richy Wang	45bb414be2	Add LLM for Alibaba's Damo Academy's Tongyi Qwen API (#7477 ) - Add langchain.llms.Tongyi for text completion, with examples for the Tongyi Text API, - Add system tests. Note: async completion for the Text API is not yet supported and will be included in a future PR. Dependencies: dashscope. It must be installed manually because it is not needed by everyone. Happy for feedback on any aspect of this PR @hwchase17 @baskaryan.	2023-07-14 01:58:22 -04:00
Lance Martin	6325a3517c	Make recursive loader yield while crawling (#7568 ) Support actual lazy_load since it can take a while to crawl larger directories.	2023-07-13 21:55:20 -07:00
UmerHA	82f3e32d8d	[Small upgrade] Allow document limit in AzureCognitiveSearchRetriever (#7690 ) Multiple people have asked in #5081 for a way to limit the documents returned from an AzureCognitiveSearchRetriever. This PR adds the `top_n` parameter to allow that. Twitter handle: [@UmerHAdil](twitter.com/umerHAdil)	2023-07-13 23:04:40 -04:00
AI-Chef	af6d333147	Fix same issue #7524 in FileCallbackHandler (#7687 ) Fix for Serializable class to include name, used in FileCallbackHandler, same as issue #7524. Description: Fixes the Serializable class to include the 'name' attribute (class_name) in the dict created. This is used in Callbacks, specifically the StdOutCallbackHandler and FileCallbackHandler. Issue: As described in issue #7524 Dependencies: None Tag maintainer: Since this is related to the callback module, tagging @agola11 @idoru Comments: Glad to see issue #7524 fixed in pull #6124, but the same place in FileCallbackHandler was not changed	2023-07-13 22:39:21 -04:00
Ben Perry	3874bb256e	Weaviate: Batch embed texts (#5903 ) When a custom Embeddings object is set, embed all given texts in a batch instead of passing them through individually. Any code calling add_texts can then appropriately size the chunks of texts that are passed through to take full advantage of the hardware it's running on.	2023-07-13 20:57:58 -04:00
Charles P	574698a5fb	Make so explicit class constructor is called in ElasticVectorSearch from_texts (#6199 ) Fixes #6198 ElasticKnnSearch.from_texts is actually ElasticVectorSearch.from_texts and throws because it calls ElasticKnnSearch constructor with the wrong arguments. Now ElasticKnnSearch has its own from_texts, which constructs a proper ElasticKnnSearch. --------- Co-authored-by: Charles Parker <charlesparker@FiltaMacbook.local> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 19:55:20 -04:00
Daniel	854f3fe9b1	Update pgembedding.ipynb (#7682 ) Correct links to the pg_embedding repository and the Neon documentation.	2023-07-13 19:54:07 -04:00
William FH	051fac1e66	Improve walkthrough links for sphinx (#7672 ) Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-13 16:08:31 -07:00
Bagatur	5db4dba526	add integrations hub link to docs (#7675 )	2023-07-13 18:44:10 -04:00
Kenton Parton	9124221d31	Fixed handling of absolute URLs in `RecursiveUrlLoader` (#7677 ) ## Description This PR addresses a bug in the RecursiveUrlLoader class where absolute URLs were being treated as relative URLs, causing malformed URLs to be produced. The fix involves using the urljoin function from the urllib.parse module to correctly handle both absolute and relative URLs. @rlancemartin @eyurtsev --------- Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-13 15:34:00 -07:00
EllieRoseS	c087ce74f7	Added matching async load func to PlaywrightURLLoader (#5938 ) The existing PlaywrightURLLoader load() function uses a synchronous browser, which is not compatible with Jupyter. This PR adds a sister function aload() which can be run inside a notebook. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-13 17:51:38 -04:00
William FH	ae7714f1ba	Configure Tracer Workers (#7676 ) Mainline the tracer to avoid calling feedback before run is posted. Chose a bool over `max_workers` arg for configuring since we don't want to support > 1 for now anyway. At some point may want to manage the pool ourselves (ordering only really matters within a run and with parent runs)	2023-07-13 14:00:14 -07:00
Jasper	fbc97a77ed	add browserless loader (#7562 ) # Browserless Added support for Browserless' `/content` endpoint as a document loader. ### About Browserless Browserless is a cloud service that provides access to headless Chrome browsers via a REST API. It allows developers to automate Chromium in a serverless fashion without having to configure and maintain their own Chrome infrastructure. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-13 13:18:28 -07:00
mebstyne-msft	120c52589b	Enabled Azure Active Directory token-based auth access to OpenAI completions (#6313 ) With AzureOpenAI openai_api_type defaulted to "azure" the logic in utils' get_from_dict_or_env() function triggered by the root validator never looks to environment for the user's runtime openai_api_type values. This inhibits folks using token-based auth, or really any auth model other than "azure." By removing the "default" value, this allows environment variables to be pulled at runtime for the openai_api_type and thus enables the other api_types which are expected to work. --------- Co-authored-by: Ebo <mebstyne@microsoft.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-07-13 16:05:47 -04:00
frangin2003	c7b687e944	Simplify GraphQL Tool Initialization documentation by Removing 'llm' Argument (#7651 ) This PR is aimed at enhancing the clarity of the documentation in the langchain project. Description: In the graphql.ipynb file, I have removed the unnecessary 'llm' argument from the initialization process of the GraphQL tool (of type _EXTRA_OPTIONAL_TOOLS). The 'llm' argument is not required for this process. Its presence could potentially confuse users. This modification simplifies the understanding of tool initialization and minimizes potential confusion. Issue: Not applicable, as this is a documentation improvement. Dependencies: None. I kindly request a review from the following maintainer: @hinthornw, who is responsible for Agents / Tools / Toolkits. No new integration is being added in this PR, hence no need for a test or an example notebook. Please see the changes for more detail and let me know if any further modification is necessary.	2023-07-13 14:52:07 -04:00
William FH	aab2a7cd4b	Normalize Trajectory Eval Score (#7668 )	2023-07-13 09:58:28 -07:00
William FH	5f03cc3511	spelling nit (#7667 )	2023-07-13 09:12:57 -07:00
Bagatur	3dd0704e38	bump 232 (#7659 )	2023-07-13 10:32:39 -04:00
Tamas Molnar	24c1654208	Fix SQLAlchemy LLM cache clear (#7653 ) Fixes #7652 Description: This is a fix for clearing the cache for SQL Alchemy based LLM caches. The langchain.llm_cache.clear() did not take effect for SQLite cache. Reason: it didn't commit the deletion database change. See SQLAlchemy documentation for proper usage: https://docs.sqlalchemy.org/en/20/orm/session_basics.html#opening-and-closing-a-session https://docs.sqlalchemy.org/en/20/orm/session_basics.html#deleting @hwchase17 @baskaryan --------- Co-authored-by: Tamas Molnar <tamas.molnar@nagarro.com>	2023-07-13 09:39:04 -04:00
Bagatur	c17a80f11c	fix chroma updated upsert interface (#7643 ) new chroma release seems to not support empty dicts for metadata. related to #7633	2023-07-13 09:27:14 -04:00
William FH	a673a51efa	[Breaking] Update Evaluation Functionality (#7388 ) - Migrate from deprecated langchainplus_sdk to `langsmith` package - Update the `run_on_dataset()` API to use an eval config - Update a number of evaluators, as well as the loading logic - Update docstrings / reference docs - Update tracer to share single HTTP session	2023-07-13 02:13:06 -07:00
Sam Coward	224199083b	Fix missing chain classname in StdOutCallbackHandler.on_chain_start (#6124 ) Retrieves the name of the class from new location as of commit `18af149e91` Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>	2023-07-13 03:05:36 -04:00
lucasiscovici	af3f401015	update base class of ListStepContainer to BaseStepContainer (#6232 ) update base class of ListStepContainer to BaseStepContainer Fixes #6231	2023-07-13 03:03:02 -04:00
Matt Adams	98e1bbfbbd	Add missing dependencies to apify.ipynb (#6331 ) Fixes errors caused by missing dependencies when running the notebook.	2023-07-13 03:02:23 -04:00
Ma Donghao	6f62e5461c	Update the parser regex of map_rerank (#6419 ) Sometimes the score returned by ChatGPT looks like 'Response example\nScore: 90 (fully answers the question, but could provide more detail on the specific error message)'. Because the score then contains more than digits, parsing it raises a ValueError. Updating the RegexParser pattern from `.` to `\d` lets the parser ignore the text after the number. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 03:01:42 -04:00
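
The regex change described in the map_rerank commit (#6419) can be illustrated with a minimal sketch. The two patterns and the `parse_score` helper below are hypothetical stand-ins chosen for illustration, not copied from the LangChain source; they only show why a score group matching `.` fails on trailing commentary while a group matching `\d` does not.

```python
import re

# Hypothetical patterns for illustration only (assumed, not the LangChain source).
# LOOSE captures everything after "Score: "; STRICT captures only the digits.
LOOSE = re.compile(r"(.*?)\nScore: (.*)", re.DOTALL)
STRICT = re.compile(r"(.*?)\nScore: (\d+)", re.DOTALL)


def parse_score(pattern: re.Pattern, text: str) -> int:
    """Extract the score group and convert it to an int (hypothetical helper)."""
    match = pattern.search(text)
    if match is None:
        raise ValueError(f"could not parse output: {text!r}")
    # int() raises ValueError if the captured group holds anything but a number.
    return int(match.group(2))


response = (
    "Response example\n"
    "Score: 90 (fully answers the question, but could provide more detail)"
)

# The digit-only group stops at the end of the number.
strict_score = parse_score(STRICT, response)

# The catch-all group also swallows the parenthetical, so int() fails.
try:
    parse_score(LOOSE, response)
    loose_failed = False
except ValueError:
    loose_failed = True

print(strict_score, loose_failed)
```

Restricting the capture group to digits keeps the parser tolerant of explanations the model appends after the score, at the cost of rejecting non-integer scores.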


3382 Commits