langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
Bearnardd	275b926cf7	add missing import (#7730 ) Just a nit documentation fix @baskaryan	2023-07-14 20:03:23 -04:00
Bearnardd	9800c6051c	add support for truncate arg for HuggingFaceTextGenInference class (#7728 ) Fixes https://github.com/hwchase17/langchain/issues/7650 * add support for `truncate` argument of `HuggingFaceTextGenInference` @baskaryan	2023-07-14 16:23:56 -04:00
Lorenzo	77e6bbe6f0	fix typo in deeplake.ipynb (#7718 ) - Fixing typos in deeplake documentation - @baskaryan	2023-07-14 13:38:31 -04:00
Samuel Berthe	2be3515a66	SQLDatabase: adding security disclaimer (#7710 ) It might be obvious to most engineers, but I think everybody should be cautious when using such a chain.	2023-07-14 13:38:16 -04:00
William FH	fcf98dc4c1	Check for Tiktoken (#7705 )	2023-07-14 09:49:01 -07:00
Bagatur	bae93682f6	update docs (#7714 )	2023-07-14 11:49:09 -04:00
Bagatur	b065da6933	Bagatur/docs nit (#7712 )	2023-07-14 11:13:02 -04:00
Bagatur	87d81b6acc	Redirect old text splitter page (#7708 ) related to #7665	2023-07-14 11:12:18 -04:00
Aarav Borthakur	210296a71f	Integrate Rockset as a document loader (#7681 ) Integrate [Rockset](https://rockset.com/docs/) as a document loader. Issue: None Dependencies: Nothing new (rockset's dependency was already added [here](https://github.com/hwchase17/langchain/pull/6216)) Tag maintainer: @rlancemartin I have added a test for the integration and an example notebook showing its use. I ran `make lint` and everything looks good. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 07:58:13 -07:00
Bagatur	ad7d97670b	bump 233 (#7707 )	2023-07-14 10:38:13 -04:00
Samuel Berthe	7d4843fe84	feat(chains): adding ElasticsearchDatabaseChain for interacting with analytics database (#7686 ) This pull request adds a ElasticsearchDatabaseChain chain for interacting with analytics database, in the manner of the SQLDatabaseChain. Maintainer: @samber Twitter handler: samuelberthe --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 10:30:57 -04:00
Daniel	6d88b23ef7	Update pgembedding.ipynb (#7699 ) Update the extension name. It changed from pg_hnsw to pg_embedding. Thank you. I missed this in my previous commit.	2023-07-14 08:39:01 -04:00
Eric Speidel	663b0933e4	Allow passing auth objects in TextRequestsWrapper (#7701 ) - Description: This allows passing auth objects in request wrappers. Currently, we can handle auth by editing headers in the RequestsWrappers, but more complex auth methods, such as Kerberos, could be handled better by using existing functionality within the requests library. There are many authentication options supported both natively and by extensions, such as requests-kerberos or requests-ntlm. - Issue: Fixes #7542 - Dependencies: none Co-authored-by: eric.speidel@de.bosch.com <eric.speidel@de.bosch.com>	2023-07-14 08:38:24 -04:00
Nuno Campos	1e40427755	Enabled nesting chain group (#7697 )	2023-07-14 10:03:16 +01:00
Leonid Kuligin	85e1c9b348	Added support for examples for VertexAI chat models. (#7636 ) #5278 Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-14 02:03:04 -04:00
Richy Wang	45bb414be2	Add LLM for Alibaba's Damo Academy's Tongyi Qwen API (#7477 ) - Add langchain.llms.Tongyi for text completion, with examples for the Tongyi Text API, - Add system tests. Note async completion for the Text API is not yet supported and will be included in a future PR. Dependencies: dashscope. It must be installed manually because it is not needed by everyone. Happy for feedback on any aspect of this PR @hwchase17 @baskaryan.	2023-07-14 01:58:22 -04:00
Lance Martin	6325a3517c	Make recursive loader yield while crawling (#7568 ) Support actual lazy_load since it can take a while to crawl larger directories.	2023-07-13 21:55:20 -07:00
UmerHA	82f3e32d8d	[Small upgrade] Allow document limit in AzureCognitiveSearchRetriever (#7690 ) Multiple people have asked in #5081 for a way to limit the documents returned from an AzureCognitiveSearchRetriever. This PR adds the `top_n` parameter to allow that. Twitter handle: [@UmerHAdil](twitter.com/umerHAdil)	2023-07-13 23:04:40 -04:00
AI-Chef	af6d333147	Fix same issue #7524 in FileCallbackHandler (#7687 ) Fix for Serializable class to include name, used in FileCallbackHandler as same issue #7524 Description: Fixes the Serializable class to include the 'name' attribute (class_name) in the dict it creates. This is used in callbacks, specifically StdOutCallbackHandler and FileCallbackHandler. Issue: As described in issue #7524 Dependencies: None Tag maintainer: Since this is related to the callback module, tagging @agola11 @idoru Comments: Glad to see issue #7524 fixed in pull #6124, but the same change was missed in FileCallbackHandler	2023-07-13 22:39:21 -04:00
Ben Perry	3874bb256e	Weaviate: Batch embed texts (#5903 ) When a custom Embeddings object is set, embed all given texts in a batch instead of passing them through individually. Any code calling add_texts can then appropriately size the chunks of texts that are passed through to take full advantage of the hardware it's running on.	2023-07-13 20:57:58 -04:00
Charles P	574698a5fb	Make so explicit class constructor is called in ElasticVectorSearch from_texts (#6199 ) Fixes #6198 ElasticKnnSearch.from_texts is actually ElasticVectorSearch.from_texts and throws because it calls ElasticKnnSearch constructor with the wrong arguments. Now ElasticKnnSearch has its own from_texts, which constructs a proper ElasticKnnSearch. --------- Co-authored-by: Charles Parker <charlesparker@FiltaMacbook.local> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 19:55:20 -04:00
Daniel	854f3fe9b1	Update pgembedding.ipynb (#7682 ) Correct links to the pg_embedding repository and the Neon documentation.	2023-07-13 19:54:07 -04:00
William FH	051fac1e66	Improve walkthrough links for sphinx (#7672 ) Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-13 16:08:31 -07:00
Bagatur	5db4dba526	add integrations hub link to docs (#7675 )	2023-07-13 18:44:10 -04:00
Kenton Parton	9124221d31	Fixed handling of absolute URLs in `RecursiveUrlLoader` (#7677 ) ## Description This PR addresses a bug in the RecursiveUrlLoader class where absolute URLs were being treated as relative URLs, causing malformed URLs to be produced. The fix involves using the urljoin function from the urllib.parse module to correctly handle both absolute and relative URLs. @rlancemartin @eyurtsev --------- Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-13 15:34:00 -07:00
EllieRoseS	c087ce74f7	Added matching async load func to PlaywrightURLLoader (#5938 ) The existing PlaywrightURLLoader load() function uses a synchronous browser, which is not compatible with Jupyter. This PR adds a sister function aload() which can be run inside a notebook. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-13 17:51:38 -04:00
William FH	ae7714f1ba	Configure Tracer Workers (#7676 ) Mainline the tracer to avoid calling feedback before run is posted. Chose a bool over `max_workers` arg for configuring since we don't want to support > 1 for now anyway. At some point may want to manage the pool ourselves (ordering only really matters within a run and with parent runs)	2023-07-13 14:00:14 -07:00
Jasper	fbc97a77ed	add browserless loader (#7562 ) # Browserless Added support for Browserless' `/content` endpoint as a document loader. ### About Browserless Browserless is a cloud service that provides access to headless Chrome browsers via a REST API. It allows developers to automate Chromium in a serverless fashion without having to configure and maintain their own Chrome infrastructure. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-13 13:18:28 -07:00
mebstyne-msft	120c52589b	Enabled Azure Active Directory token-based auth access to OpenAI completions (#6313 ) With AzureOpenAI openai_api_type defaulted to "azure" the logic in utils' get_from_dict_or_env() function triggered by the root validator never looks to environment for the user's runtime openai_api_type values. This inhibits folks using token-based auth, or really any auth model other than "azure." By removing the "default" value, this allows environment variables to be pulled at runtime for the openai_api_type and thus enables the other api_types which are expected to work. --------- Co-authored-by: Ebo <mebstyne@microsoft.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-07-13 16:05:47 -04:00
frangin2003	c7b687e944	Simplify GraphQL Tool Initialization documentation by Removing 'llm' Argument (#7651 ) This PR is aimed at enhancing the clarity of the documentation in the langchain project. Description: In the graphql.ipynb file, I have removed the unnecessary 'llm' argument from the initialization process of the GraphQL tool (of type _EXTRA_OPTIONAL_TOOLS). The 'llm' argument is not required for this process. Its presence could potentially confuse users. This modification simplifies the understanding of tool initialization and minimizes potential confusion. Issue: Not applicable, as this is a documentation improvement. Dependencies: None. I kindly request a review from the following maintainer: @hinthornw, who is responsible for Agents / Tools / Toolkits. No new integration is being added in this PR, hence no need for a test or an example notebook. Please see the changes for more detail and let me know if any further modification is necessary.	2023-07-13 14:52:07 -04:00
William FH	aab2a7cd4b	Normalize Trajectory Eval Score (#7668 )	2023-07-13 09:58:28 -07:00
William FH	5f03cc3511	spelling nit (#7667 )	2023-07-13 09:12:57 -07:00
Bagatur	3dd0704e38	bump 232 (#7659 )	2023-07-13 10:32:39 -04:00
Tamas Molnar	24c1654208	Fix SQLAlchemy LLM cache clear (#7653 ) Fixes #7652 Description: This is a fix for clearing the cache for SQL Alchemy based LLM caches. The langchain.llm_cache.clear() did not take effect for SQLite cache. Reason: it didn't commit the deletion database change. See SQLAlchemy documentation for proper usage: https://docs.sqlalchemy.org/en/20/orm/session_basics.html#opening-and-closing-a-session https://docs.sqlalchemy.org/en/20/orm/session_basics.html#deleting @hwchase17 @baskaryan --------- Co-authored-by: Tamas Molnar <tamas.molnar@nagarro.com>	2023-07-13 09:39:04 -04:00
Bagatur	c17a80f11c	fix chroma updated upsert interface (#7643 ) new chroma release seems to not support empty dicts for metadata. related to #7633	2023-07-13 09:27:14 -04:00
William FH	a673a51efa	[Breaking] Update Evaluation Functionality (#7388 ) - Migrate from deprecated langchainplus_sdk to `langsmith` package - Update the `run_on_dataset()` API to use an eval config - Update a number of evaluators, as well as the loading logic - Update docstrings / reference docs - Update tracer to share single HTTP session	2023-07-13 02:13:06 -07:00
Sam Coward	224199083b	Fix missing chain classname in StdOutCallbackHandler.on_chain_start (#6124 ) Retrieves the name of the class from new location as of commit `18af149e91` Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>	2023-07-13 03:05:36 -04:00
lucasiscovici	af3f401015	update base class of ListStepContainer to BaseStepContainer (#6232 ) update base class of ListStepContainer to BaseStepContainer Fixes #6231	2023-07-13 03:03:02 -04:00
Matt Adams	98e1bbfbbd	Add missing dependencies to apify.ipynb (#6331 ) Fixes errors caused by missing dependencies when running the notebook.	2023-07-13 03:02:23 -04:00
Ma Donghao	6f62e5461c	Update the parser regex of map_rerank (#6419 ) Sometimes the score returned by ChatGPT looks like 'Response example\nScore: 90 (fully answers the question, but could provide more detail on the specific error message)'. Because the score then contains more than just digits, parsing it raises a ValueError. Updating the RegexParser pattern from `.` to `\d` lets us ignore the text after the number. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 03:01:42 -04:00
Bagatur	b08f903755	fix chroma init bug (#7639 )	2023-07-13 03:00:33 -04:00
Nir Gazit	f307ca094b	fix(memory): allow internal chains to use memory (#6769 ) Fixed #6768. This is a workaround only. I think a better longer-term solution is for chains to declare how many input variables they actually need (as opposed to ones that are in the prompt, where some may be satisfied by the memory). Then, a wrapping chain can check the input match against the actual input variables. @hwchase17	2023-07-13 02:47:44 -04:00
Francisco Ingham	488d2d5da9	Entity extraction improvements (#6342 ) Added fix to avoid irrelevant attributes being returned, plus an example of extracting unrelated entities and an example of using an 'extra_info' attribute to extract unstructured data for an entity. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 02:16:05 -04:00
Nir Gazit	a8bbfb2da3	feat(agents): allow trimming of intermediate steps to last N (#6476 ) Added an option to trim intermediate steps to last N steps. This is especially useful for long-running agents. Users can explicitly specify N or provide a function that does custom trimming/manipulation on intermediate steps. I've mimicked the API of the `handle_parsing_errors` parameter.	2023-07-13 02:09:25 -04:00
Zeeland	92ef77da35	fix: remove useless variable k (#6524 ) remove useless variable k --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 01:58:36 -04:00
Bagatur	7f8ff2a317	add tagger nb (#7637 )	2023-07-13 01:48:23 -04:00
Sidchat95	c5e50c40c9	Fix Document Similarity Check with passed Threshold (#6845 ) Converting the Similarity obtained in the similarity_search_with_score_by_vector method whilst comparing to the passed threshold. This is because the passed threshold is a number between 0 to 1 and is already in the relevance_score_fn format. As of now, the function is comparing two different scoring parameters and that wouldn't work. Dependencies None Issue: Different scores being compared in similarity_search_with_score_by_vector method in FAISS. Tag maintainer @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 01:30:47 -04:00
Jacob Ajit	a08baa97c5	Use modern OpenAI endpoints for embeddings (#6573 ) - Description: LangChain passes [engine](https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/openai.py#L256) and not `model` as a field when making OpenAI requests. Within the `openai` Python library, for OpenAI requests, this [makes a call](https://github.com/openai/openai-python/blob/main/openai/api_resources/abstract/engine_api_resource.py#L58) to an endpoint of the form `https://api.openai.com/v1/engines/{engine_id}/embeddings`. These endpoints are [deprecated](https://help.openai.com/en/articles/6283125-what-happened-to-engines) in favor of endpoints of the format `https://api.openai.com/v1/embeddings`, where `model` is passed as a parameter in the request body. While these deprecated endpoints continue to function for now, they may not be supported indefinitely and should be avoided in favor of the newer API format. It appears that `engine` was passed in instead of `model` to make both Azure OpenAI and OpenAI calls work similarly. However, the inclusion of `engine` [causes](https://github.com/openai/openai-python/blob/main/openai/api_resources/abstract/engine_api_resource.py#L58) OpenAI to use the deprecated endpoint, requiring a diverging code path for Azure OpenAI calls where `engine` is passed in additionally (Azure OpenAI requires `engine` to specify a deployment, and can optionally take in `model`). In the long-term, it may be worth considering spinning off Azure OpenAI embeddings into a separate class for ease of use and maintenance, similar to the [implementation for chat models](https://github.com/hwchase17/langchain/blob/master/langchain/chat_models/azure_openai.py).	2023-07-13 01:23:17 -04:00
Jacob Lee	cdb93ab5ca	Adds OpenAI functions powered document metadata tagger (#7521 ) Adds a new document transformer that automatically extracts metadata for a document based on an input schema. I also moved `document_transformers.py` to `document_transformers/__init__.py` to group it with this new transformer - it didn't seem to cause issues in the notebook, but let me know if I've done something wrong there. Also had a linter issue I couldn't figure out: ``` MacBook-Pro:langchain jacoblee$ make lint poetry run mypy . docs/dist/conf.py: error: Duplicate module named "conf" (also at "./docs/api_reference/conf.py") docs/dist/conf.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info docs/dist/conf.py: note: Common resolutions include: a) using `--exclude` to avoid checking one of them, b) adding `__init__.py` somewhere, c) using `--explicit-package-bases` or adjusting MYPYPATH Found 1 error in 1 file (errors prevented further checking) make: *** [lint] Error 2 ``` @rlancemartin @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 01:12:41 -04:00
Jason Fan	8effd90be0	Add new types of document transformers (#7379 ) - Description: Add two new document transformers that translate documents into different languages and convert documents into Q&A format to improve vector search results. Uses OpenAI function calling via the [doctran](https://github.com/psychic-api/doctran/tree/main) library. - Issue: N/A - Dependencies: `doctran = "^0.0.5"` - Tag maintainer: @rlancemartin @eyurtsev @hwchase17 - Twitter handle: @psychicapi or @jfan001 Notes - Adheres to the `DocumentTransformer` abstraction set by @dev2049 in #3182 - refactored `EmbeddingsRedundantFilter` to put it in a file under a new `document_transformers` module - Added basic docs for `DocumentInterrogator`, `DocumentTransformer` as well as the existing `EmbeddingsRedundantFilter` --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 23:53:30 -04:00
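
Note on 9124221d31 (`RecursiveUrlLoader`): the commit message says the fix uses `urljoin` from `urllib.parse` to handle both absolute and relative URLs. The sketch below only illustrates that standard-library behaviour; the base URL, the hrefs, and the `resolve` helper are hypothetical and are not taken from the loader's code.

```python
from urllib.parse import urljoin

# Hypothetical page being crawled (an assumption for illustration only).
base = "https://example.com/docs/"


def resolve(href: str) -> str:
    """Resolve a link found on `base`, whether it is relative or absolute."""
    return urljoin(base, href)


# A relative link is resolved against the base URL.
relative = resolve("guide/intro.html")

# An absolute link is returned unchanged instead of being appended to the base.
absolute = resolve("https://other.example.org/page")

# Plain string concatenation, by contrast, yields a malformed URL.
naive = base + "https://other.example.org/page"
```

Here `relative` is `https://example.com/docs/guide/intro.html` and `absolute` is `https://other.example.org/page`, while `naive` glues the two URLs together.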
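
Note on 6f62e5461c (map_rerank parser regex): the commit message describes switching the score capture from `.` to `\d` so that trailing commentary after the number is ignored. The sketch below shows the effect with plain `re`; both patterns are assumptions written for illustration and may differ from the exact pattern used in the repository.

```python
import re

# Example response in the shape quoted by the commit message.
response = (
    "Response example\n"
    "Score: 90 (fully answers the question, but could provide "
    "more detail on the specific error message)"
)

# Hypothetical "before" pattern: the score group captures any characters.
loose = re.compile(r"(.*?)\nScore: (.*)")

# Hypothetical "after" pattern: the score group captures digits only.
strict = re.compile(r"(.*?)\nScore: (\d*)")

loose_score = loose.search(response).group(2)
strict_score = strict.search(response).group(2)

# The digit-only capture converts cleanly to an integer.
score = int(strict_score)

# The loose capture includes the trailing commentary, so int() rejects it.
try:
    int(loose_score)
    loose_parsed = True
except ValueError:
    loose_parsed = False
```

With the digit-only group, `score` is the integer 90; with the loose group, the conversion fails because the captured text still contains the parenthesised comment.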


3122 Commits