langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-06 03:20:49 +00:00

Author	SHA1	Message	Date
Rubén Barragán	ef6332ead6	Support loading files from Dropbox (#8271 ) ## Description This commit introduces the `DropboxLoader` class, a new document loader that allows loading files from Dropbox into the application. The loader relies on a Dropbox app, which requires creating an app on Dropbox, obtaining the necessary scope permissions, and generating an access token. Additionally, the dropbox Python package is required. The `DropboxLoader` class is designed to be used as a document loader for processing various file types, including text files, PDFs, and Dropbox Paper files. ## Dependencies `pip install dropbox` and `pip install unstructured` for PDF reading. ## Tag maintainer @rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some feedback here 🙏 . ## Social Networks https://github.com/rubenbarragan https://www.linkedin.com/in/rgbarragan/ https://twitter.com/RubenBarraganP --------- Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>	2023-07-27 06:36:08 -07:00
Pranay Chandekar	41bb3a6f9b	fixed the bug #8343 (#8345 ) - Issue: #8343 Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>	2023-07-27 06:33:15 -07:00
Martin Krasser	93260a9922	Fix broken `make` targets `format_diff` and `lint_diff` (#8344 ) Since the refactoring into sub-projects `libs/langchain` and `libs/experimental`, the `make` targets `format_diff` and `lint_diff` do not work anymore when running `make` from these subdirectories. Reason is that ``` PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master \| grep -E '\.py$$\|\.ipynb$$') ``` generates paths from the project's root directory instead of the corresponding subdirectories. This PR fixes this by adding a `--relative` command line option. - Tag maintainer: @baskaryan	2023-07-27 01:56:55 -07:00
Harrison Chase	ae78ef7fe6	bump experimental to 005 (#8339 )	2023-07-26 21:46:28 -07:00
Vadim Gubergrits	e7e5cb9d08	Tree of Thought introducing a new ToTChain. (#5167 ) # [WIP] Tree of Thought introducing a new ToTChain. This PR adds a new chain called ToTChain that implements the ["Large Language Model Guided Tree-of-Thought"](https://arxiv.org/pdf/2305.08291.pdf) paper. There's a notebook example `docs/modules/chains/examples/tot.ipynb` that shows how to use it. Implements #4975 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @vowelparrot --------- Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-26 21:29:39 -07:00
William FH	9eb7e6e27f	Delete Old Evals Examples (#8252 ) Still retain: - Comparison Examples - Data + QA walkthrough - QA (but really minimize it)	2023-07-26 18:46:54 -07:00
Saurabh Misra	db9d5b213a	Optimize the cosine_similarity_top_k function performance (#8151 ) Optimizing important numerical code and making it run faster. Performance went up by 148% (about 2.48x): runtime went down from 138715us to 56020us. Optimization explanation: The `cosine_similarity_top_k` function is where we made the most significant optimizations. Instead of sorting the entire score_array, which requires ordering all elements, `np.argpartition` is used to find the indices of the top_k largest scores. This operation has a time complexity of O(n), which is faster than a full sort. Note that `np.argpartition` doesn't guarantee the order of the values, so we use argsort() to order the top-k values after partitioning, which is much more efficient because it only sorts the top-k elements, not the entire array. Then, to get the row and column indices of the sorted top_k scores in the original score array, we use `np.unravel_index`. This operation is more efficient and cleaner than a list comprehension. The code has been tested for correctness by running the following snippet on both the original function and the optimized function, averaged over 5 runs. ``` def test_cosine_similarity_top_k_large_matrices(): X = np.random.rand(1000, 1000) Y = np.random.rand(1000, 1000) top_k = 100 score_threshold = 0.5 gc.disable() counter = time.perf_counter_ns() return_value = cosine_similarity_top_k(X, Y, top_k, score_threshold) duration = time.perf_counter_ns() - counter gc.enable() ``` @hwaking @hwchase17 @jerwelborn Unit tests pass, I also generated more regression tests which all passed.	2023-07-26 18:03:49 -07:00
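The selection idea in the commit above (pick the top-k scores without sorting the whole array, then map flat positions back to row and column indices) can be sketched with the standard library alone. This is an illustration, not the LangChain implementation: `top_k_pairs` is a hypothetical helper, `heapq.nlargest` stands in for `np.argpartition` followed by a sort of the k winners, and `divmod` stands in for `np.unravel_index` on a 2-D array.

```python
import heapq


def top_k_pairs(scores, top_k, score_threshold=None):
    """Return ((row, col) indices, values) of the top_k largest scores.

    `scores` is a list of equal-length rows. Hypothetical helper: it uses
    heapq.nlargest in place of np.argpartition, so only the winners are
    ordered rather than the whole score array.
    """
    if not scores:
        return [], []
    n_cols = len(scores[0])
    # Flatten to (flat_index, value) pairs, dropping scores below the threshold.
    flat = [
        (r * n_cols + c, value)
        for r, row in enumerate(scores)
        for c, value in enumerate(row)
        if score_threshold is None or value >= score_threshold
    ]
    # Partial selection: returns the k largest pairs, largest first.
    best = heapq.nlargest(top_k, flat, key=lambda pair: pair[1])
    # divmod recovers (row, col) from a flat index for a 2-D layout.
    indices = [divmod(flat_index, n_cols) for flat_index, _ in best]
    values = [value for _, value in best]
    return indices, values


scores = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.7]]
indices, values = top_k_pairs(scores, top_k=2, score_threshold=0.5)
```

With NumPy the same shape of solution would partition first and sort only the selected k entries, which is where the commit reports its speedup.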
Fabrizio Ruocco	ddc353a768	Azure Cognitive Search: Custom index and scoring profile support (#6843 ) Description: Adding support for custom index and scoring profile support in Azure Cognitive Search @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 17:58:01 -07:00
Kacper Łukawski	c5988c1d4b	Implement async support for Cohere (#8237 ) This PR introduces async API support for Cohere, both LLM and embeddings. It requires updating `cohere` package to `^4`. Tagging @hwchase17, @baskaryan, @agola11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 15:51:18 -07:00
Daniel Alexander Brenot	bf1357f584	Added async support to PlanAndExecute Chain (#8239 ) - Description: Adds async support to the PlanAndExecute Chain Maintainer responsibilities: - Async: @agola11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 15:16:07 -07:00
Bastin Florian	a3ac9b23eb	feat(confluence): add markdown format option (#8246 ) # Description: Add the possibility to keep text as Markdown in the ConfluenceLoader. Adds a bool variable that keeps the Markdown format of the Confluence pages. This is useful because it allows using MarkdownHeaderTextSplitter as a DataSplitter. If this variable is set to True in the load() method, the pages are extracted using the markdownify library. # Issue: [4407](https://github.com/langchain-ai/langchain/issues/4407) # Dependencies: Add the markdownify library # Tag maintainer: @rlancemartin, @eyurtsev # Twitter handle: FloBastinHeyI - https://twitter.com/FloBastinHeyI --------- Co-authored-by: Florian Bastin <florian.bastin@octo.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 15:00:27 -07:00
Leonid Ganeline	ee6ff96e28	docstrings cleanup (#8311 ) - added missed docstrings - changed docstrings into consistent format @baskaryan	2023-07-26 14:13:10 -07:00
Rohit Gupta	e5dba8978a	Avoid re-computation of embedding in weaviate similarity search (#8284 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 13:31:55 -07:00
Nuno Campos	a612800ef0	Runnable single protocol (#7800 ) Objects implementing Runnable: BasePromptTemplate, LLM, ChatModel, Chain, Retriever, OutputParser - [x] Implement Runnable in base Retriever - [x] Raise TypeError in operator methods for unsupported things - [x] Implement dict which calls values in parallel and outputs dict with results - [x] Merge in `+` for prompts - [x] Confirm precedence order for operators, ideal would be `+` `\|`, https://docs.python.org/3/reference/expressions.html#operator-precedence - [x] Add support for openai functions, ie. Chat Models must return messages - [x] Implement BaseMessageChunk return type for BaseChatModel, a subclass of BaseMessage which implements __add__ to return BaseMessageChunk, concatenating all str args - [x] Update implementation of stream/astream for llm and chat models to use new `_stream`, `_astream` optional methods, with default implementation in base class `raise NotImplementedError` use https://stackoverflow.com/a/59762827 to see if it is implemented in base class - [x] Delete the IteratorCallbackHandler (leave the async one because people using) - [x] Make BaseLLMOutputParser implement Runnable, accepting either str or BaseMessage --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-07-26 12:16:46 -07:00
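The checklist in the commit above mentions composing objects with `|`, raising `TypeError` for unsupported operands, and a dict form that calls its values in parallel and returns a dict of results. A minimal sketch of that idea follows. All class names here (`Runnable`, `Step`, `Sequence`, `Parallel`) are hypothetical stand-ins written for illustration; they are not the LangChain classes or their signatures.

```python
from concurrent.futures import ThreadPoolExecutor


class Runnable:
    """Hypothetical minimal protocol: anything with invoke() can be piped."""

    def invoke(self, value):
        raise NotImplementedError

    def __or__(self, other):
        if not isinstance(other, Runnable):
            # Unsupported right-hand operands are rejected explicitly.
            raise TypeError(f"cannot pipe into {type(other).__name__}")
        return Sequence(self, other)


class Step(Runnable):
    """Wraps a plain function so it can take part in a pipeline."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)


class Sequence(Runnable):
    """Feeds the output of `first` into `second`."""

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def invoke(self, value):
        return self.second.invoke(self.first.invoke(value))


class Parallel(Runnable):
    """Runs every named step on the same input and returns a dict of results."""

    def __init__(self, steps):
        self.steps = steps

    def invoke(self, value):
        with ThreadPoolExecutor() as pool:
            futures = {k: pool.submit(s.invoke, value) for k, s in self.steps.items()}
            return {k: f.result() for k, f in futures.items()}


chain = (
    Step(str.strip)
    | Step(str.upper)
    | Parallel({"text": Step(lambda s: s), "length": Step(len)})
)
result = chain.invoke("  hello ")
```

Because `|` is left-associative in Python, `a | b | c` builds `Sequence(Sequence(a, b), c)`, so the steps run in the order they are written.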
Bharat	04a4d3e312	Fixes #8310 Fix maximum recursion depth exceeded error (#8313 ) ElasticsearchVectorStore.as_retriever() method is returning `RecursionError: maximum recursion depth exceeded` because of incorrect field reference in `embeddings()` method - Description: Fix RecursionError because of a typo - Issue: the issue #8310 - Dependencies: None, - Tag maintainer: @eyurtsev - Twitter handle: bpatel	2023-07-26 12:15:37 -07:00
Caitlin2694	b9db3dd09b	Fix "missing key op" RDFGraph OWL serialization (#8276 ) - Description: Fix "missing key op" error in RDFGraph OWL Serialization - Issue: #8263 - Dependencies: None - Tag maintainer: @baskaryan	2023-07-26 12:14:56 -07:00
Eugene Yurtsev	862e9aed66	ChatPromptTemplate: Update doc-strings, update from_role_strings behavior (#8308 ) * Update doc-strings in ChatPromptTemplate * Update from_role_strings classmethod to use well known roles	2023-07-26 15:02:36 -04:00
Bagatur	2c2fd9ff13	bump 244 (#8314 )	2023-07-26 11:58:26 -07:00
Lance Martin	77c0582243	Clean queries prior to search (#8309 ) With some search tools, we see no results returned if the query is a numeric list. E.g., if we pass: ``` '1. "LangChain vs LangSmith: How do they differ?"' ``` We see: ``` No good Google Search Result was found ``` Local testing w/ Streamlit: ![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)	2023-07-26 11:48:28 -07:00
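The kind of cleanup described in the commit above can be sketched as follows. `clean_search_query` here is a hypothetical helper written for illustration, and its exact rules (strip a leading list marker, then one pair of surrounding quotes) are assumptions rather than the logic merged in the PR.

```python
import re


def clean_search_query(query: str) -> str:
    """Strip a leading list number and surrounding quotes from a query."""
    query = query.strip()
    # Drop a leading "1. " or "12) " style list marker. Requiring whitespace
    # after the marker leaves queries such as "3.5 turbo" untouched.
    query = re.sub(r"^\d+[.)]\s+", "", query)
    # Drop one pair of matching surrounding quotes, if present.
    if len(query) >= 2 and query[0] == query[-1] and query[0] in "\"'":
        query = query[1:-1]
    return query.strip()


cleaned = clean_search_query('1. "LangChain vs LangSmith: How do they differ?"')
```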
shibuiwilliam	6b88fbd9bb	add test for embedding distance evaluation (#8285 ) Add tests for embedding distance evaluation - Description: Add tests for embedding distance evaluation - Issue: None - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: @MlopsJ	2023-07-26 11:45:50 -07:00
Timon Palm	70604e590f	DuckDuckGoSearch News Tool (#8292 ) Description: I wanted to use the DuckDuckGoSearch tool in an agent to let it get the latest news for a topic. DuckDuckGoSearch already has a function for retrieving news articles, but there wasn't a tool to use it. I simply adapted the SearchResult class with an extra argument "backend". You can set it to "news" to only get news articles. Furthermore, I added an example to the DuckDuckGo Notebook on how to further customize the results by using the DuckDuckGoSearchAPIWrapper. Dependencies: no new dependencies --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 11:30:01 -07:00
Byron Saltysiak	61347bd322	giving path to the copy command for *.toml files (#8294 ) Description: in the .devcontainer, docker-compose build is currently failing due to the src paths in the COPY command. This change adds the full path to the pyproject.toml and poetry.toml to allow the build to run. Issue: You can see the issue if you try to build the dev docker image with: ``` cd .devcontainer docker-compose build ``` Dependencies: none Twitter handle: byronsalty	2023-07-26 10:37:03 -07:00
happyxhw	6384c1ec8f	fix: ElasticVectorSearch.from_documents failed #8293 (#8296 ) - Description: fix ElasticVectorSearch.from_documents with elasticsearch_url param - Issue: ElasticVectorSearch.from_documents failed #8293 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 10:33:52 -07:00
jacobswe	83a53e2126	Bug Fix: AzureChatOpenAI streaming with function calls (#8300 ) - Description: During streaming, the first chunk may only contain the name of an OpenAI function and not any arguments. In this case, the current code presumes there is a streaming response and tries to append to it, but gets a KeyError. This fixes that case by checking if the arguments key exists, and if not, creates a new entry instead of appending. - Issue: Related to #6462 Sample Code: ```python llm = AzureChatOpenAI( deployment_name=deployment_name, model_name=model_name, streaming=True ) tools = [PythonREPLTool()] callbacks = [StreamingStdOutCallbackHandler()] agent = initialize_agent( tools=tools, llm=llm, agent=AgentType.OPENAI_FUNCTIONS, callbacks=callbacks ) agent('Run some python code to test your interpreter') ``` Previous Result: ``` File ...langchain/chat_models/openai.py:344, in ChatOpenAI._generate(self, messages, stop, run_manager, **kwargs) 342 function_call = _function_call 343 else: --> 344 function_call["arguments"] += _function_call["arguments"] 345 if run_manager: 346 run_manager.on_llm_new_token(token) KeyError: 'arguments' ``` New Result: ```python {'input': 'Run some python code to test your interpreter', 'output': "The Python code `print('Hello, World!')` has been executed successfully, and the output `Hello, World!` has been printed."} ``` Co-authored-by: jswe <jswe@polencapital.com>	2023-07-26 10:11:50 -07:00
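The failure mode in the commit above (the first streamed chunk carries only the function name, so appending to `function_call["arguments"]` raises `KeyError`) can be reproduced and avoided in a few lines. This is a sketch under assumptions: `merge_function_call_chunks` is a hypothetical helper, and the chunk dictionaries are simplified stand-ins for the streamed deltas, not the actual response objects.

```python
def merge_function_call_chunks(chunks):
    """Merge streamed function_call deltas into a single dict.

    Each chunk is a dict that may carry "name", "arguments", or both.
    """
    function_call = None
    for delta in chunks:
        if function_call is None:
            # First chunk: copy it; it may hold only the function name.
            function_call = dict(delta)
        elif "arguments" in delta:
            # Create the key on first sight instead of assuming it exists.
            function_call["arguments"] = (
                function_call.get("arguments", "") + delta["arguments"]
            )
    return function_call


chunks = [
    {"name": "python"},               # name only, no "arguments" key yet
    {"arguments": '{"code": '},
    {"arguments": '"print(1)"}'},
]
merged = merge_function_call_chunks(chunks)
```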
German Martin	457a4730b2	Fix the mangling issue on several VectorStores child classes. (#8274 ) - Description: Fix mangling issue affecting a couple of VectorStore classes including Redis. - Issue: https://github.com/langchain-ai/langchain/issues/8185 - @rlancemartin This is a simple issue but I lack some context on the original implementation. My changes are perhaps not the definitive fix but a way to start a quick discussion. @hinthornw Tagging you since one of your changes introduced this here: `c38965fcba`	2023-07-26 09:48:55 -07:00
Alec Flett	4da43f77e5	Add ability to load (deserialize) objects from other namespaces (#7726 ) I have some Prompt subclasses in my project that I'd like to be able to deserialize in callbacks. Right now `loads()`/`load()` will bail when it encounters my object, but I know I can trust the objects because they're in my own projects.	2023-07-26 16:59:28 +01:00
Bagatur	5c6dcb1960	bump 243 (#8289 )	2023-07-26 05:41:56 -07:00
William FH	adf019724f	unpack later (#8278 ) Fix https://github.com/langchain-ai/langchain/issues/8272	2023-07-26 01:53:22 -07:00
Naveen Tatikonda	9cbefcc56c	[ OpenSearch ] : Add AOSS Support to OpenSearch (#8256 ) ### Description This PR includes the following changes: - Adds AOSS (Amazon OpenSearch Service Serverless) support to OpenSearch. Please refer to the documentation on how to use it. - While creating an index, AOSS only supports Approximate Search with `nmslib` and `faiss` engines. During Search, only Approximate Search and Script Scoring (on doc values) are supported. - This PR also adds support to `efficient_filter` which can be used with `faiss` and `lucene` engines. - The `lucene_filter` is deprecated. Instead please use the `efficient_filter` for the lucene engine. Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	2023-07-25 23:59:36 -07:00
Lance Martin	7a00f17033	Web research retriever (#8102 ) Given a user question, this will - * Use LLM to generate a set of queries. * Query for each. * The URLs from search results are stored in self.urls. * A check is performed for any new URLs that haven't been processed yet (not in self.url_database). * Only these new URLs are loaded, transformed, and added to the vectorstore. * The vectorstore is queried for relevant documents based on the questions generated by the LLM. * Only unique documents are returned as the final result. This code will avoid reprocessing of URLs across multiple runs of similar queries, which should improve the performance of the retriever. It also keeps track of all URLs that have been processed, which could be useful for debugging or understanding the retriever's behavior. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-25 19:58:00 -07:00
Rithwik Ediga Lakhamsani	d1d691caa4	Added Databricks support to MLflow Callback (#7906 ) Added a quick check to make integration easier with Databricks; another option would be to make a new class, but this seemed more straightforward. cc: @liangz1 Can this be done in a more straightforward way?	2023-07-25 18:23:54 -07:00
William FH	479cc086ba	Rm Github Import (#8257 ) It's not a required dep but would break people's builds --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-25 18:20:58 -07:00
Erick Friis	c14571ab37	New enterprise support form (#8254 )	2023-07-25 15:43:27 -07:00
Eugene Yurtsev	ec069381fb	Remove operator overloading for BaseMessage (#8245 ) This PR removes operator overloading for base message. Removing the `+` operator from base message will help make sure that: 1) There's no need to re-define `+` for message chunks 2) There's no unexpected behavior in terms of types changing (adding two messages yields a ChatPromptTemplate, which is not a message)	2023-07-25 20:12:19 +01:00
jacobswe	0af48b06d0	Bug Fix #6462 (#8241 ) - Description: Small change to fix broken Azure streaming. More complete migration probably still necessary once the new API behavior is finalized. - Issue: Implements fix by @rock-you in #6462 - Dependencies: N/A There don't seem to be any tests specifically for this, and I was having some trouble adding some. This is just a small temporary fix to allow for the new API changes that OpenAI are releasing without breaking any other code. --------- Co-authored-by: Jacob Swe <jswe@polencapital.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-25 11:30:22 -07:00
Bagatur	c1ea8da9bc	bump 242 (#8238 )	2023-07-25 08:01:37 -07:00
shibuiwilliam	af788b7cf0	Add/faiss test score threshold (#8224 ) # What - This is to add test for faiss vector store with score threshold - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @MlopsJ	2023-07-25 09:56:29 -04:00
shibuiwilliam	bed8eb978e	use logger instead of logging (#8225 ) # What - Use `logger` instead of using logging directly. - Issue: None - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: @MlopsJ	2023-07-25 09:55:30 -04:00
Leonid Ganeline	afc55a4fee	Refactored `requests` (#8203 ) Refactored `requests.py`. The same as https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099 requests.py is in the root code folder. This creates the `langchain.requests: Requests` group on the API Reference navigation ToC, on the same level as Chains and Agents which is incorrect. Refactoring: - copied requests.py content into utils/requests.py - I added the backwards compatibility ref in the original requests.py. - updated imports to requests objects @hwchase17, @baskaryan	2023-07-24 21:23:59 -07:00
Alex Stachowiak	a7efa95775	Update base chain type hints (#7680 ) Addresses #7578. `run()` can return dictionaries, Pydantic objects or strings, so the type hints should reflect that. See the chain from `create_structured_output_chain` for an example of a non-string return type from `run()`. I've updated the BaseLLMChain return type hint from `str` to `Any`. Although, the differences between `run()` and `__call__()` seem less clear now. CC: @baskaryan Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 21:16:41 -07:00
Ani peter benjamin	e58b1d7073	feat: temp fixed Could not parse LLM output on agents folder (#7746 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 19:20:37 -07:00
Dayuan Jiang	125ae6d9de	add Hybrid retriever that not require any external service (#8108 ) - Until now, hybrid search was limited to modules requiring external services, such as Weaviate/Pinecone Hybrid Search. However, I have developed a hybrid retriever that can merge a list of retrievers using the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm. This new approach, similar to Weaviate hybrid search, does not require the initialization of any external service. - Dependencies: No - Twitter handle: dayuanjian21687 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 19:16:10 -07:00
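Reciprocal Rank Fusion, the algorithm named in the commit above, scores each document by summing 1 / (k + rank) over every ranked list it appears in. A minimal sketch follows; `reciprocal_rank_fusion` is a hypothetical standalone function rather than the retriever added by the PR, documents are represented by plain string IDs, and the default `k=60` follows the constant used in the linked paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with ranks starting at 1. Higher fused score means better.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort document IDs by fused score, best first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


keyword_hits = ["a", "b", "c"]   # e.g. results from a keyword retriever
vector_hits = ["c", "a", "d"]    # e.g. results from an embedding retriever
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that appear near the top of several lists ("a" and "c" here) outrank documents that appear in only one, which is why the method needs no score normalization across retrievers.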
earonesty	59a7c5877a	Update supabase.py, add filter to query (matches latest supabase docs & js) (#7721 ) - Description: Update supabase to support optional filter argument (if present, used, if not, doesn't break things) - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 19:13:52 -07:00
Aditya S	00de334f81	Fixed sparql SELECT and UPDATE query function (#7758 ) - Description: Changed the "SELECT" and "UPDATE" intent check from "=" to "in". - Issue: Based on my own testing, most LLMs (StarCoder, NeoGPT3, etc.) don't return a single-word response ("SELECT" / "UPDATE"). With this modification, we can get the same result without curated prompt engineering. - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: @aditya_0290 Thank you for maintaining this library. Keep up the good efforts. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 18:29:30 -07:00
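The difference between an equality check and a containment check on the model's intent output, as described in the commit above, can be shown with a small sketch. `detect_sparql_intent` is a hypothetical helper; the upper-casing step and the SELECT-before-UPDATE precedence are assumptions made for this example, not details taken from the PR.

```python
from typing import Optional


def detect_sparql_intent(llm_output: str) -> Optional[str]:
    """Return "SELECT" or "UPDATE" if the keyword appears anywhere in the output."""
    text = llm_output.upper()
    # A substring test tolerates extra words around the keyword, whereas
    # `text == "SELECT"` only matches a bare single-word reply.
    if "SELECT" in text:
        return "SELECT"
    if "UPDATE" in text:
        return "UPDATE"
    return None


verbose_reply = "The intent of this question is SELECT."
intent = detect_sparql_intent(verbose_reply)
```

One limitation of the containment check is that a reply mentioning both keywords resolves to whichever is tested first.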
William FH	3662aca7d4	Add async support for transform chain (#8205 )	2023-07-24 17:45:17 -07:00
Taqi Jaffri	8f158b72fc	Added stop sequence support to replicate (#8107 ) Stop sequences are useful if you are doing long-running completions and need to early-out rather than running for the full max_length... not only does this save inference cost on Replicate, it is also much faster if you are going to truncate the output later anyway. Other LLMs support stop sequences natively (e.g. OpenAI) but I didn't see this for Replicate so adding this via their prediction cancel method. Housekeeping: I ran `make format` and `make lint`, no issues reported in the files I touched. I did update the replicate integration test and ran `poetry run pytest tests/integration_tests/llms/test_replicate.py` successfully. Finally, I am @tjaffri https://twitter.com/tjaffri for feature announcement tweets... or if you could please tag @docugami https://twitter.com/docugami we would really appreciate that :-) Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-07-24 17:34:13 -07:00
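The early-out behaviour described in the commit above (stop consuming output as soon as a stop sequence appears, and cancel the remote job) can be sketched over a generic token stream. `stream_until_stop` and its `cancel` callback are hypothetical names used for illustration; the real integration would call the provider's own cancellation method instead.

```python
def stream_until_stop(tokens, stop, cancel=None):
    """Accumulate streamed tokens until any stop sequence shows up.

    Returns the text before the earliest stop sequence. `cancel`, if given,
    is called once so the caller can abort the still-running prediction.
    """
    text = ""
    for token in tokens:
        text += token
        # Positions of every stop sequence currently present in the text.
        hits = [text.find(s) for s in stop if s in text]
        if hits:
            if cancel is not None:
                cancel()
            return text[: min(hits)]
    return text


cancelled = []
token_stream = iter(["Hello", " wor", "ld\nObs", "ervation: x", " never read"])
output = stream_until_stop(
    token_stream,
    stop=["\nObservation:"],
    cancel=lambda: cancelled.append(True),
)
```

Checking the accumulated text rather than each token on its own matters because a stop sequence can be split across token boundaries, as it is in this example.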
glaze	f7ad14acfa	Add etherscan document loader (#7943 ) @rlancemartin The modification includes: * etherscanLoader * test_etherscan * document ipynb I have run the test, lint, format, and spell check. I do encounter a linting error on ipynb, I am not sure how to address that. ``` docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:55: error: Name "null" is not defined [name-defined] docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:76: error: Name "null" is not defined [name-defined] Found 2 errors in 1 file (checked 1 source file) ``` - Description: The Etherscan loader uses etherscan api to load transaction histories under specific accounts on Ethereum Mainnet. - No dependency is introduced by this PR. - Twitter handle: glazecl --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 17:09:16 -07:00
Julien Salinas	73d5cba308	Allow user to modify the GPU and language settings when using NLP Cloud (#7985 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 17:08:56 -07:00
Liu Ming	24f889f2bc	Change with_history option to False for ChatGLM by default (#8076 ) By default, the ChatGLM LLM integration accumulates conversation history (with_history=True) and sends it to the ChatGLM backend API, which is not expected in most cases. This PR sets with_history=False by default; users should explicitly set llm.with_history=True to turn this feature on. Related PR: #8048 #7774 --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:46:02 -07:00
Mahip Soni	1f055775f8	Fixing issue with MSSQL connection (#8040 ) My team recently faced an issue while using MSSQL and passing a schema name. We noticed that "SET search_path TO {self.schema}" is being called for us, which is not a valid ms-sql query, and is specific to postgresql dialect. We were able to run it locally after this fix. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:45:40 -07:00
Anthony Mahanna	76102971c0	ArangoDB/AQL support for Graph QA Chain (#7880 ) Description: Serves as an introduction to LangChain's support for [ArangoDB](https://github.com/arangodb/arangodb), similar to https://github.com/hwchase17/langchain/pull/7165 and https://github.com/hwchase17/langchain/pull/4881 Issue: No issue has been created for this feature Dependencies: `python-arango` has been added as an optional dependency via the `CONTRIBUTING.md` guidelines Twitter handle: [at]arangodb - Integration test has been added - Notebook has been added: [graph_arangodb_qa.ipynb](https://github.com/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb) [![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb) ``` docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD= arangodb/arangodb ``` ``` pip install git+https://github.com/amahanna/langchain.git ``` ```python from arango import ArangoClient from langchain.chat_models import ChatOpenAI from langchain.graphs import ArangoGraph from langchain.chains import ArangoGraphQAChain db = ArangoClient(hosts="localhost:8529").db(name="_system", username="root", password="", verify=True) graph = ArangoGraph(db) chain = ArangoGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph) chain.run("Is Ned Stark alive?") ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:16:52 -07:00
Adilkhan Sarsen	3e7d2a1b64	SelfQuery support for deeplake (#7888 ) Added support SelfQuery for Deeplake	2023-07-24 14:22:33 -07:00
Leonid Ganeline	c580c81cca	docstrings `experimental` (#7969 ) - added/changed docstring for `experimental` - added/changed docstrings for different artifacts - @baskaryan	2023-07-24 14:21:48 -07:00
Leonid Ganeline	3eb4112a1f	Refactored `example_generator` (#8099 ) Refactored `example_generator.py`. The same as #7961 `example_generator.py` is in the root code folder. This creates the `langchain.example_generator: Example Generator ` group on the API Reference navigation ToC, on the same level as `Chains` and `Agents` which is not correct. Refactoring: - moved `example_generator.py` content into `chains/example_generator.py` (not in `utils` because the `example_generator` has dependencies on other LangChain classes. It also doesn't work for moving into `utilities/`) - added the backwards compatibility ref in the original `example_generator.py` @hwchase17	2023-07-24 13:36:44 -07:00
Leonid Ganeline	7cbe28ba9b	Refactored `input` (#8202 ) Refactored `input.py`. The same as https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099 input.py is in the root code folder. This creates the `langchain.input: Input` group on the API Reference navigation ToC, on the same level as Chains and Agents which is incorrect. Refactoring: - copied input.py file into utils/input.py - I added the backwards compatibility ref in the original input.py. - changed several imports to a new ref @hwchase17, @baskaryan	2023-07-24 13:10:03 -07:00
Monty Evans	72eb4fa4e8	Change WebBaseLoader metadata parsing to set missing metadata to descriptive string instead of `None` (#8175 ) Solves #8174 & #3542 Co-authored-by: mevans <mevans@palantir.com>	2023-07-24 12:17:49 -07:00
Bagatur	1a7d8667c8	Bagatur/gateway chat (#8198 ) Signed-off-by: dbczumar <corey.zumar@databricks.com> Co-authored-by: dbczumar <corey.zumar@databricks.com>	2023-07-24 12:17:00 -07:00
Ettore Di Giacinto	ae28568e2a	Add embeddings for LocalAI (#8134 ) Description: This PR adds embeddings for LocalAI ( https://github.com/go-skynet/LocalAI ), a self-hosted OpenAI drop-in replacement. As LocalAI can re-use OpenAI clients it is mostly following the lines of the OpenAI embeddings, however when embedding documents, it just uses string instead of sending tokens as sending tokens is best-effort depending on the model being used in LocalAI. Sending tokens is also tricky as token id's can mismatch with the model - so it's safer to just send strings in this case. Partly related to: https://github.com/hwchase17/langchain/issues/5256 Dependencies: No new dependencies Twitter: @mudler_it --------- Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 12:16:49 -07:00
Mike Nitsenko	d983046f90	Extend Cube Semantic Loader functionality (#8186 ) PR Description: This pull request introduces several enhancements and new features to the `CubeSemanticLoader`. The changes include the following: 1. Added imports for the `json` and `time` modules. 2. Added new constructor parameters: `load_dimension_values`, `dimension_values_limit`, `dimension_values_max_retries`, and `dimension_values_retry_delay`. 3. Updated the class documentation with descriptions for the new constructor parameters. 4. Added a new private method `_get_dimension_values()` to retrieve dimension values from Cube's REST API. 5. Modified the `load()` method to load dimension values for string dimensions if `load_dimension_values` is set to `True`. 6. Updated the API endpoint in the `load()` method from the base URL to the metadata endpoint. 7. Refactored the code to retrieve metadata from the response JSON. 8. Added the `column_member_type` field to the metadata dictionary to indicate if a column is a measure or a dimension. 9. Added the `column_values` field to the metadata dictionary to store the dimension values retrieved from Cube's API. 10. Modified the `page_content` construction to include the column title and description instead of the table name, column name, data type, title, and description. These changes improve the functionality and flexibility of the `CubeSemanticLoader` class by allowing the loading of dimension values and providing more detailed metadata for each document. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 12:11:58 -07:00
Bagatur	82b8d8596c	bump lc241 exp3 (#8193 )	2023-07-24 11:52:44 -07:00
Leonid Ganeline	848454d1e7	Refactored `formatting` (#8191 ) Refactored `formatting.py`. The same as https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099. `formatting.py` is in the root code folder. This creates the `langchain.formatting: Formatting` group on the API Reference navigation ToC, on the same level as Chains and Agents, which is incorrect. Refactoring: - moved the `formatting.py` content into `utils/formatting.py` - I did not add the backwards compatibility ref in the original `formatting.py`. It seems unnecessary. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 11:34:15 -07:00
Bagatur	4928f7a9f5	undo bump (#8192 )	2023-07-24 11:32:17 -07:00
Leonid Ganeline	120cdf813d	docstrings `memory` (#8018 ) docstrings `memory`: - added module summary - added missing docstrings - updated docstrings into consistent format - @baskaryan	2023-07-24 10:05:36 -07:00
Bagatur	d5689d58ab	Bagatur/bump 241 (#8182 )	2023-07-24 07:47:40 -07:00
Harrison Chase	3caccf304c	Harrison/hugginggpt (#8162 ) Co-authored-by: Yongliang Shen <withsyl@163.com>	2023-07-24 07:36:24 -07:00
rajib	f3908627ed	changed to mlflow-ai-gateway in llms/__init__.py (#8114 ) - Description: In `llms/__init__.py`, the key name for mlflowaigateway is wrong. It should be `mlflow-ai-gateway`. - Issue: NA - Dependencies: NA - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: na Without this fix, running the code for mlflowaigateway fails with the following error: `ValueError: Loading mlflow-ai-gateway LLM not supported` --------- Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-23 23:30:46 -07:00
Gordon Clark	80b3ec5869	GitHub toolkit improvements (#8121 ) Fixes an issue with the GitHub tool where the API returned special objects but the tool was expecting dictionaries. Also added proper docstrings to the `GitHubAPIWrapper` methods and a (very basic) integration test. Maintainer responsibilities: - Agents / Tools / Toolkits: @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-23 20:17:53 -07:00
shibuiwilliam	8f5000146c	add faiss test for score threshold (#8143 ) # What - Add faiss vector search test for score threshold - Fix failing faiss vector search test; filtering with list value is wrong.	2023-07-23 19:36:38 -07:00
Nolan	7686dabd36	Unbreak devcontainer (#8154 ) Codespaces and devcontainer were broken by the [repo restructure](https://github.com/langchain-ai/langchain/discussions/8043). - Description: Add libs/langchain to container so it can be built without error. - Issue: - - Dependencies: - - Tag maintainer: @hwchase17 @baskaryan - Twitter handle: @finnless The failed build log says: ``` #10 [langchain-dev-dependencies 2/2] RUN poetry install --no-interaction --no-ansi --with dev,test,docs #10 sha256:e850ee99fc966158bfd2d85e82b7c57244f47ecbb1462e75bd83b981a56a1929 2023-07-23 23:30:33.692Z: #10 0.827 #10 0.827 Directory libs/langchain does not exist 2023-07-23 23:30:33.738Z: #10 ERROR: executor failed running [/bin/sh -c poetry install --no-interaction --no-ansi --with dev,test,docs]: exit code: 1 ``` The new pyproject.toml imports from libs/langchain: `77bf75c236/pyproject.toml (L14-L16)` But libs/langchain is never added to the dev.Dockerfile: `77bf75c236/libs/langchain/dev.Dockerfile (L37-L39)`	2023-07-23 19:33:47 -07:00
Harrison Chase	9205919ad2	actually use input key (#8136 )	2023-07-23 18:02:45 -07:00
Leonid Ganeline	670304a8b3	simplified nmspace (#8152 ) Recreated #7894 (it is easier to recreate than to resolve conflicts). A small refactoring to improve the API Reference Agents table. @baskaryan	2023-07-23 18:02:20 -07:00
William FH	c5b50be225	Function calling logging fixup (#8153 ) Fix bad overwriting of the "functions" arg in invocation params. Clean up precedence in the dict. Clean up some inappropriate types (mapping should be dict). Example: https://dev.smith.langchain.com/public/9a7a6817-1679-49d8-8775-c13916975aae/r ![image](https://github.com/langchain-ai/langchain/assets/13333726/94cd0775-b6ef-40c3-9e5a-3ab65e466ab9)	2023-07-23 18:01:33 -07:00
SlapDrone	961a0e200f	Implement AgentExecutorIterator (#6929 ) - Description: Implements a `.iter()` method for the `AgentExecutor` class. This allows hooking into and intercepting intermediate agent steps. - Issue: #6925 - Dependencies: None - Tag maintainer: @vowelparrot @agola11 - Twitter handle: @SlapDron3 @lacicocodes --------- Co-authored-by: Lacico <Lacicocodes@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-23 18:00:22 -07:00
Harrison Chase	77bf75c236	bump experimental to 002 (#8150 )	2023-07-23 09:22:39 -07:00
Harrison Chase	e46126eac6	add llamaapi (#8140 )	2023-07-23 09:16:16 -07:00
Harrison Chase	cbf2fc8af8	prompt ergonomics (#7799 )	2023-07-22 14:19:17 -07:00
Harrison Chase	9f3073d418	bump versions (#8129 )	2023-07-22 08:46:37 -07:00
Harrison Chase	86946a47a8	Harrison/add back in experimental (#8128 )	2023-07-22 08:27:29 -07:00
Karthik Raja A	8b08687fc4	MultiOn client toolkit (#8110 ) Addition of MultiOn Client Agent Toolkit. Dependencies: multion pip package. This PR consists of the following: - MultiOn utility, tools, and integration with agent - sample Jupyter notebook. Request @hwchase17, @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-22 08:19:01 -07:00
Harrison Chase	aa0e69bc98	Harrison/official pre release (#8106 )	2023-07-21 18:44:32 -07:00
Philip Kiely - Baseten	95bcf68802	add kwargs support for Baseten models (#8091 ) This bugfix PR adds kwargs support to Baseten model invocations so that e.g. the following script works properly: ```python chatgpt_chain = LLMChain( llm=Baseten(model="MODEL_ID"), prompt=prompt, verbose=False, memory=ConversationBufferWindowMemory(k=2), llm_kwargs={"max_length": 4096} ) ```	2023-07-21 13:56:27 -07:00
Harrison Chase	8dcabd9205	bump releases rc0 (#8097 )	2023-07-21 13:54:57 -07:00
Harrison Chase	d353d668e4	remove CVEs (#8092 ) This PR aims to move all code with CVEs into `langchain.experimental`. Note that we are NOT yet removing it from the core `langchain` package; we will give people a week to migrate. See MIGRATE.md for how to migrate. Zero changes to functionality. Vulnerabilities this addresses: PALChain: - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5752409 - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759265 SQLDatabaseChain: - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759268 `load_prompt` (Python files only): - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5725807	2023-07-21 13:32:39 -07:00
Bagatur	08c658d3f8	fix api ref (#8083 )	2023-07-21 12:37:21 -07:00
Harrison Chase	da04760de1	Harrison/move experimental (#8084 )	2023-07-21 10:36:28 -07:00
Harrison Chase	f35db9f43e	(WIP) set up experimental (#7959 )	2023-07-21 09:20:24 -07:00

1936 Commits