langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Author	SHA1	Message	Date
yangdihang	ff5024634e	fix: openapi controller prompt, when bot is unable to resolve an api … (#7525 ) …call, it needs retry <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> Co-authored-by: yangdihang <yangdihang@bytedance.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-31 17:56:43 -07:00
Kenny	1e8fca5518	Add ConcurrentLoader (#7512 ) Works just like the GenericLoader but concurrently for those who choose to optimize their workflow. @rlancemartin @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-31 17:56:31 -07:00
Kevin Buckley	8061994c61	AzureSearch Vector Store: Moving the usage of additional_fields into context of it's definition (bug fix from python error) (#8551 ) Description: Using Azure Cognitive Search as a VectorStore. Calling the `add_texts` method throws an error if there is no metadata property specified. The `additional_fields` field is set in an `if` statement and then is used later outside the if statement. This PR just moves the declaration of `additional_fields` below and puts the usage of it in context. Issue: https://github.com/langchain-ai/langchain/issues/8544 Tagging @rlancemartin, @eyurtsev as this is related to Vector stores. `make format`, `make lint`, `make spellcheck`, and `make test` have been run	2023-07-31 17:25:57 -07:00
Pranay Chandekar	7e70cd2a28	Bug Fix - #8415 (#8417 ) - Issue: #8415 Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>	2023-07-31 17:08:46 -07:00
shibuiwilliam	de61ebd9e0	add tests to redis vectorstore (#8116 ) # What - Add function to get similarity with score with threshold in Redis vector store. - Add tests to Redis vector store.	2023-07-31 17:07:09 -07:00
Bruno Bornsztein	5a490a79f4	fix issue #8357 by making json backtick regex greedy (#8528 ) - Description: Markdown code blocks in json response should not break the parser - Issue: #8357 @baskaryan @hinthornw	2023-07-31 16:36:57 -07:00
Gordon Clark	64d0a0fcc0	Updating docstings in utilities (#8411 ) Updating docstrings on utility packages @baskaryan	2023-07-31 16:34:53 -07:00
Mohammad Mohtashim	144b4c0c78	SQL Query Prompt update + added _execute method for SQLDatabase (#8100 ) - Description: This pull request (PR) includes two minor changes: 1. Updated the default prompt for SQL Query Checker: The current prompt does not clearly specify the final response that the LLM (Language Model) should provide when checking for the query if `use_query_checker` is enabled in SQLDatabase Chain. As a result, the LLM adds extra words like "Here is your updated query" to the response. However, this causes a syntax error when executing the SQL command in SQLDatabaseChain, as these additional words are also included in the SQL query. 2. Moved the query's execution part into a separate method for SQLDatabase: The purpose of this change is to provide users with more flexibility when obtaining the result of an SQL query in the original form returned by sqlalchemy. In the previous implementation, the run method returned the results as a string. By creating a distinct method for execution, users can now receive the results in original format, which proves helpful in various scenarios. For example, during the development of a tool, I found it advantageous to obtain results in original format rather than a string, as currently done by the run method. - Tag maintainer: @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-31 16:28:08 -07:00
Matthew DeGuzman	844eca98d5	Add LLaMa Formatter and AzureML Chat Endpoint (#8382 ) ## Description Microsoft and Meta recently [announced their collaboration](https://blogs.microsoft.com/blog/2023/07/18/microsoft-and-meta-expand-their-ai-partnership-with-llama-2-on-azure-and-windows/) on LLaMa2. This PR extends the current LLM wrapper and introduces a new Chat Model wrapper for AzureML to support LLaMa2. ## Dependencies No dependencies added :) ## Twitter Handles [@matthew_d13](https://twitter.com/matthew_d13) [@prakhar_in](https://twitter.com/prakhar_in) maintainers - @hwchase17, @baskaryan	2023-07-31 16:26:25 -07:00
Harrison Chase	15de57b848	fix web loader (#8538 )	2023-07-31 12:47:33 -07:00
Nuno Campos	4780156955	Rely less on positional arg order in subclasses of vector store when calling async methods (#8534 )	2023-07-31 20:13:11 +01:00
Harrison Chase	5e3b968078	router runnable (#8496 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-07-31 11:07:10 -07:00
Anubhav Bindlish	913a156cff	Minor improvements to rockset vectorstore (#8416 ) This PR makes minor improvements to our python notebook, and adds support for `Rockset` workspaces in our vectorstore client. @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-31 09:54:59 -07:00
Bagatur	a8be207ea3	bump 248 (#8518 )	2023-07-31 07:14:45 -07:00
Harrison Chase	6556a8fcfd	add initial anthropic agent (#8468 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-07-30 21:30:49 -07:00
os1ma	a795c3d860	Fix GitLoader to handle repeated load calls (#8412 ) Description: a description of the change In this pull request, GitLoader has been updated to handle multiple load calls, provided the same repository is being cloned. Previously, calling `load` multiple times would raise an error if a clone URL was provided. Additionally, a check has been added to raise a ValueError when attempting to clone a different repository into an existing path. New tests have also been introduced to verify the correct behavior of the GitLoader class when `load` is called multiple times. Lastly, the GitPython package, a dependency for the GitLoader class, has been added to the project dependencies (pyproject.toml and poetry.lock). Issue: the issue # it fixes (if applicable) None Dependencies: any dependencies required for this change GitPython Tag maintainer: for a quicker response, tag the relevant maintainer (see below) - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev	2023-07-30 21:27:20 -07:00
Piyush Jain	b2f8a5bae9	Fixed exports for NeptuneOpenCypherQAChain (#8439 ) ## Description The imports for `NeptuneOpenCypherQAChain` are failing. This PR adds the chain class to the `__init__.py` file to fix this issue. ## Maintainers @dev2049 @krlawrence	2023-07-30 20:36:22 -07:00
Eugene Yurtsev	e98e2b2b81	ChatPromptTemplate: clean up doc-string (#8473 ) Minor doc-string clean up --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-30 20:11:04 -07:00
Eugene Yurtsev	529cb2e30c	Update doc-string in few shot template (#8474 ) Partial update of doc-string, need to update other instances in documentation	2023-07-30 19:39:14 -07:00
Muneeb Ahmad	4923cf029a	Added Proper Documentation for `faiss-gpu` Installation (#8492 ) ### Description In the LangChain Documentation and Comments, I've Noticed that `pip install faiss` was mentioned, instead of `pip install faiss-gpu`, since installing `pip install faiss` results in an error. I've gone ahead and updated the Documentation, and `faiss.ipynb`. This Change will ensure ease of use for the end user, trying to install `faiss-gpu`. ### Issue: Documentation / Comments Related. ### Dependencies: No Dependencies we're changed only updated the files with the wrong reference. ### Tag maintainer: @rlancemartin, @eyurtsev (Thank You for your contributions 😄 )	2023-07-30 13:24:30 -07:00
shibuiwilliam	549720ae51	add test to ensure values in time weighted retriever are updated (#8479 ) # What - add test to ensure values in time weighted retriever are updated <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: add test to ensure values in time weighted retriever are updated - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @MlopsJ Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-30 11:42:25 -07:00
Harrison Chase	18a2452121	prompt cleanup (#8470 )	2023-07-30 10:47:31 -07:00
Harrison Chase	4d526c49ed	bump experimental to 008 (#8490 )	2023-07-30 07:28:18 -07:00
Harrison Chase	8f14ddefdf	add anthropic functions wrapper (#8475 ) a cheeky wrapper around claude that adds in function calling support (kind of, hence it going in experimental)	2023-07-30 07:23:46 -07:00
Nuno Campos	b65a9414bb	runnable.bind().bind() should combine kwargs, instead of nesting wrappers (#8467 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-29 15:48:30 -07:00
Nuno Campos	872abb4198	Implement Runnable for Tools (#8460 ) - Make _arun optional - Pass run_manager to inner chains in tools that have them <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-29 10:01:18 -07:00
William FH	b7c0eb9ecb	Wfh/ref links (#8454 )	2023-07-29 08:44:32 -07:00
Harrison Chase	13b4f465e2	log output parser (#8446 )	2023-07-29 07:53:45 +01:00
William FH	d935573362	Partial formatting for chat messages (#8450 )	2023-07-28 23:08:33 -07:00
William FH	3314f54383	Update supabase docstrings (#8443 )	2023-07-28 23:08:14 -07:00
Harrison Chase	2448043b84	bump and fix (#8441 )	2023-07-28 17:16:51 -07:00
Amélie	8ee56b9a5b	Feature: Add support for meilisearch vectorstore (#7649 ) Description: Add support for Meilisearch vector store. Resolve #7603 - No external dependencies added - A notebook has been added @rlancemartin https://twitter.com/meilisearch Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-28 17:06:54 -07:00
Bearnardd	b7d6e1909c	fix empty ids when metadatas is provided (#8127 ) Fixes https://github.com/hwchase17/langchain/issues/7865 and https://github.com/hwchase17/langchain/issues/8061 - [x] fixes returning empty ids when metadatas argument is provided @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-28 16:17:31 -07:00
lvisdd	abe4c361f9	update get_num_tokens_from_messages model (#8431 ) (#8430) Co-authored-by: Kano Kunihiko <kkano@heroz.co.jp>	2023-07-28 15:07:03 -07:00
Jeffrey Wang	e0de62f6da	Add RoPE Scaling params from llamacpp (#8422 ) Description: Just adding parameters from `llama-python-cpp` that support RoPE scaling. @hwchase17, @baskaryan sources: papers and explanation: https://kaiokendev.github.io/context llamacpp conversation: https://github.com/ggerganov/llama.cpp/discussions/1965 Supports models like: https://huggingface.co/conceptofmind/LLongMA-2-13b	2023-07-28 14:42:41 -07:00
Harrison Chase	fab24457bc	remove code (#8425 )	2023-07-28 13:19:44 -07:00
Harrison Chase	3a78450883	update experimental (#8402 ) some changes were made to experimental, porting them over	2023-07-28 13:01:36 -07:00
Harrison Chase	af7e70d4af	expose function for converting messages to messages (#8426 )	2023-07-28 13:00:54 -07:00
Eugene Yurtsev	06bdbe06fe	PromptTemplate update documentation and expand kwarg (#8423 ) # PromptTemplate * Update documentation to highlight the classmethod for instantiating a prompt template. * Expand kwargs in the classmethod to make parameters easier to discover This PR got reverted here: https://github.com/langchain-ai/langchain/pull/8395/files	2023-07-28 14:11:49 -04:00
Eugene Yurtsev	e62a1686e2	ChatPromptTemplate: minor fix in doc string (#8424 ) Minor fix in doc-string to use `ai` rather than `assistant`	2023-07-28 13:01:13 -04:00
Eugene Yurtsev	760c278fe0	ChatPromptTemplate: Expand support for message formats and documentation (#8244 ) * Expands support for a variety of message formats in the `from_messages` classmethod. Ideally, we could deprecate the other on-ramps to reduce the amount of classmethods users need to know about. * Expand documentation with code examples.	2023-07-28 12:48:08 -04:00
Bagatur	61dd92f821	bump 246 (#8410 )	2023-07-28 01:18:37 -07:00
Harrison Chase	394b67ab92	add kwargs to llm runnables (#8388 )	2023-07-28 09:13:11 +01:00
HeTaoPKU	d5884017a9	Add Minimax llm model to langchain (#7645 ) - Description: Minimax is a great AI startup from China, recently they released their latest model and chat API, and the API is widely-spread in China. As a result, I'd like to add the Minimax llm model to Langchain. - Tag maintainer: @hwchase17, @baskaryan --------- Co-authored-by: the <tao.he@hulu.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-27 22:53:23 -07:00
James Campbell	0ad2d5f27a	[nit] Add default value for ChatOpenAI client (#7939 ) Micro convenience PR to avoid warning regarding missing `client` parameter. It is always set during initialization. @baskaryan Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-27 22:38:32 -07:00
Harrison Chase	82df923f37	Merge branch 'master' of github.com:hwchase17/langchain	2023-07-27 22:01:20 -07:00
Harrison Chase	1b0bfa54cf	cr	2023-07-27 22:00:52 -07:00
Jeff Vestal	c7ff5f19a8	ElasticKnnSearch rewrite - bug fix - return Document (#8180 ) Fixes: https://github.com/hwchase17/langchain/issues/7117 https://github.com/hwchase17/langchain/issues/5760 Adding back `create_index` , `add_texts`, `from_texts` to ElasticKnnSearch `from_texts` matches standard `from_texts` methods as quick start up method `knn_search` and `hybrid_result` return a list of [`Document()`, `score`,] # Test `from_texts` for quick start ``` # create new index using from_text from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch from langchain.embeddings import ElasticsearchEmbeddings model_id = "sentence-transformers__all-distilroberta-v1" dims = 768 es_cloud_id = "" es_user = "" es_password = "" test_index = "knn_test_index_305" embeddings = ElasticsearchEmbeddings.from_credentials( model_id, #input_field=input_field, es_cloud_id=es_cloud_id, es_user=es_user, es_password=es_password, ) # add texts and create class instance texts = ["This is a test document", "This is another test document"] knnvectorsearch = ElasticKnnSearch.from_texts( texts=texts, embedding=embeddings, index_name= test_index, vector_query_field='vector', query_field='text', model_id=model_id, dims=dims, es_cloud_id=es_cloud_id, es_user=es_user, es_password=es_password ) # Test `add_texts` method texts2 = ["Hello, world!", "Machine learning is fun.", "I love Python."] knnvectorsearch.add_texts(texts2) query = "Hello" knn_result = knnvectorsearch.knn_search(query = query, model_id= model_id, k=2) hybrid_result = knnvectorsearch.knn_hybrid_search(query = query, model_id= model_id, k=2) ``` The mapping is as follows: ``` { "knn_test_index_012": { "mappings": { "properties": { "text": { "type": "text" }, "vector": { "type": "dense_vector", "dims": 768, "index": true, "similarity": "dot_product" } } } } } ``` # Check response type ``` >>> hybrid_result [(Document(page_content='Hello, world!', metadata={}), 0.94232327), (Document(page_content='I love Python.', metadata={}), 0.5321523)] >>> hybrid_result[0] (Document(page_content='Hello, world!', metadata={}), 0.94232327) >>> hybrid_result[0][0] Document(page_content='Hello, world!', metadata={}) >>> type(hybrid_result[0][0]) <class 'langchain.schema.document.Document'> ``` # Test with existing Index ``` from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch from langchain.embeddings import ElasticsearchEmbeddings ## Initialize ElasticsearchEmbeddings model_id = "sentence-transformers__all-distilroberta-v1" dims = 768 es_cloud_id = es_user = "" es_password = "" test_index = "knn_test_index_012" embeddings = ElasticsearchEmbeddings.from_credentials( model_id, es_cloud_id=es_cloud_id, es_user=es_user, es_password=es_password, ) ## Initialize ElasticKnnSearch knn_search = ElasticKnnSearch( es_cloud_id=es_cloud_id, es_user=es_user, es_password=es_password, index_name= test_index, embedding= embeddings ) ## Test adding vectors ### Test `add_texts` method when index created texts = ["Hello, world!", "Machine learning is fun.", "I love Python."] knn_search.add_texts(texts) ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-27 22:00:18 -07:00
Harrison Chase	a221a9ced0	Harrison/sql query (#8370 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-07-27 21:55:17 -07:00
Bagatur	a1a650c743	Bagatur/from texts bug fix (#8394 ) --------- Co-authored-by: Davit Buniatyan <davit@loqsh.com> Co-authored-by: Davit Buniatyan <d@activeloop.ai> Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz> Co-authored-by: Ivo Stranic <istranic@gmail.com>	2023-07-27 21:52:38 -07:00
Jiayi Ni	1efb9bae5f	FEAT: Integrate Xinference LLMs and Embeddings (#8171 ) - [Xorbits Inference(Xinference)](https://github.com/xorbitsai/inference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models. Xinference supports a variety of GGML-compatible models including chatglm, whisper, and vicuna, and utilizes heterogeneous hardware and a distributed architecture for seamless cross-device and cross-server model deployment. - This PR integrates Xinference models and Xinference embeddings into LangChain. - Dependencies: To install the depenedencies for this integration, run `pip install "xinference[all]"` - Example Usage: To start a local instance of Xinference, run `xinference`. To deploy Xinference in a distributed cluster, first start an Xinference supervisor using `xinference-supervisor`: `xinference-supervisor -H "${supervisor_host}"` Then, start the Xinference workers using `xinference-worker` on each server you want to run them on. `xinference-worker -e "http://${supervisor_host}:9997"` To use Xinference with LangChain, you also need to launch a model. You can use command line interface (CLI) to do so. Fo example: `xinference launch -n vicuna-v1.3 -f ggmlv3 -q q4_0`. This launches a model named vicuna-v1.3 with `model_format="ggmlv3"` and `quantization="q4_0"`. A model UID is returned for you to use. Now you can use Xinference with LangChain: ```python from langchain.llms import Xinference llm = Xinference( server_url="http://0.0.0.0:9997", # suppose the supervisor_host is "0.0.0.0" model_uid = {model_uid} # model UID returned from launching a model ) llm( prompt="Q: where can we visit in the capital of France? A:", generate_config={"max_tokens": 1024}, ) ``` You can also use RESTful client to launch a model: ```python from xinference.client import RESTfulClient client = RESTfulClient("http://0.0.0.0:9997") model_uid = client.launch_model(model_name="vicuna-v1.3", model_size_in_billions=7, quantization="q4_0") ``` The following code block demonstrates how to use Xinference embeddings with LangChain: ```python from langchain.embeddings import XinferenceEmbeddings xinference = XinferenceEmbeddings( server_url="http://0.0.0.0:9997", model_uid = model_uid ) ``` ```python query_result = xinference.embed_query("This is a test query") ``` ```python doc_result = xinference.embed_documents(["text A", "text B"]) ``` Xinference is still under rapid development. Feel free to [join our Slack community](https://xorbitsio.slack.com/join/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA) to get the latest updates! - Request for review: @hwchase17, @baskaryan - Twitter handle: https://twitter.com/Xorbitsio --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-27 21:23:19 -07:00
Bagatur	877d384bc9	Revert "PromptTemplate update documentation and expand kwargs (#8234 )" (#8395 ) fyi @eyurtsev was failing a unit test	2023-07-27 21:11:10 -07:00
Gordon Clark	e66759cc9d	Github add "Create PR" tool + Docs update (#8235 ) Added a new tool to the Github toolkit called Create Pull Request. Now we can make our own langchain contributor in langchain 😁 In order to have somewhere to pull from, I also added a new env var, "GITHUB_BASE_BRANCH." This will allow the existing env var, "GITHUB_BRANCH," to be a working branch for the bot (so that it doesn't have to always commit on the main/master). For example, if you want the bot to work in a branch called `bot_dev` and your repo base is `main`, you would set up the vars like: ``` GITHUB_BASE_BRANCH = "main" GITHUB_BRANCH = "bot_dev" ``` Maintainer responsibilities: - Agents / Tools / Toolkits: @hinthornw	2023-07-27 19:19:44 -07:00
William FH	ecd4aae818	Few Shot Chat Prompt (#8038 ) Proposal for a few shot chat message example selector --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-07-27 18:46:10 -07:00
Eugene Yurtsev	6dd18eee26	PromptTemplate update documentation and expand kwargs (#8234 ) # PromptTemplate * Update documentation to highlight the classmethod for instantiating a prompt template. * Expand kwargs in the classmethod to make parameters easier to discover	2023-07-27 18:11:39 -07:00
Karan V	a003a0baf6	fix(petals) allows to run models that aren't Bloom (Support for LLama and newer models) (#8356 ) In this PR: - Removed restricted model loading logic for Petals-Bloom - Removed petals imports (DistributedBloomForCausalLM, BloomTokenizerFast) - Instead imported more generalized versions of loader (AutoDistributedModelForCausalLM, AutoTokenizer) - Updated the Petals example notebook to allow for a successful installation of Petals in Apple Silicon Macs - Tag maintainer: @hwchase17, @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-27 18:01:04 -07:00
lars.gersmann	e758e9e7f5	fix(openapi): openapi chain will work without/empty description/summa… (#8351 ) Description: This PR will enable the Open API chain to work with valid Open API specifications missing `description` and `summary` properties for path and operation nodes in open api specs. Since both `description` and `summary` property are declared optional we cannot be sure they are defined. This PR resolves this problem by providing an empty (`''`) description as fallback. The previous behavior of the Open API chain was that the underlying LLM (OpenAI) throw ed an exception since `None` is not of type string: ``` openai.error.InvalidRequestError: None is not of type 'string' - 'functions.0.description' ``` Using this PR the Open API chain will succeed also using Open API specs lacking `description` and `summary` properties for path and operation nodes. Thanks for your amazing work ! Tag maintainer: @baskaryan --------- Co-authored-by: Lars Gersmann <lars.gersmann@cm4all.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-27 17:58:43 -07:00
ljeagle	caa6caeb8a	Upgrade the AwaDB from v0.3.7 to v0.3.9 and change the default embeddings (#8281 ) 1. Upgrade the AwaDB from v0.3.7 to v0.3.9 2. Change the default embedding to AwaEmbedding --------- Co-authored-by: ljeagle <awadb.vincent@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-27 17:20:50 -07:00
Holt Skinner	d7e6770de8	refactor: Code refactoring & simplification for Google Cloud Enterprise Search retriever (#8369 ) Followup to https://github.com/langchain-ai/langchain/pull/7857 - Changes `_convert_search_response()` to use object attributes instead of converting to dictionary - Simplifies logic for readability	2023-07-27 17:13:49 -07:00
Taozhi Wang	594f195e54	Add embeddings for AwaEmbedding (#8353 ) - Description: Adds AwaEmbeddings class for embeddings, which provides users with a convenient way to do fine-tuning, as well as the potential need for multimodality - Tag maintainer: @baskaryan Create `Awa.ipynb`: an example notebook for AwaEmbeddings class Modify `embeddings/__init__.py`: Import the class Create `embeddings/awa.py`: The embedding class Create `embeddings/test_awa.py`: The test file. --------- Co-authored-by: taozhiwang <taozhiwa@gmail.com>	2023-07-27 17:08:00 -07:00
thehunmonkgroup	ba4e82bb47	fix missing _identifying_params() in _VertexAICommon (#8303 ) Full set of params are missing from Vertex* LLMs when `dict()` method is called. ``` >>> from langchain.chat_models.vertexai import ChatVertexAI >>> from langchain.llms.vertexai import VertexAI >>> chat_llm = ChatVertexAI() l>>> llm = VertexAI() >>> chat_llm.dict() {'_type': 'vertexai'} >>> llm.dict() {'_type': 'vertexai'} ``` This PR just uses the same mechanism used elsewhere to expose the full params. Since `_identifying_params()` is on the `_VertexAICommon` class, it should cover the chat and non-chat cases.	2023-07-27 16:59:10 -07:00
Caitlin2694	b2e4b9dca4	Fix exception caused by restrictions in OWL (#8341 ) Description: Fix exception caused by restrictions in OWL Issue: #8331 Dependencies: none Maintainer: @baskaryan	2023-07-27 16:51:32 -07:00
Nikita Pokidyshev	f499e6ea6a	Add FunctionMessage to _message_from_dict (#8374 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-27 16:45:27 -07:00
emarco177	2ab13ab743	added unit tests for mrkl output_parser.py (#8321 ) - Description: added unit tests for mrkl output_parser.py, - Tag maintainer: @hinthornw - Twitter handle: EdenEmarco177	2023-07-27 13:46:06 -07:00
Harrison Chase	f5bf893035	rename to str output parser (#8373 )	2023-07-27 12:57:34 -07:00
William FH	0e9e5b5202	Retry events on any run type (#8375 )	2023-07-27 12:56:46 -07:00
William FH	ff98fad2d9	Add Retry Events (#8053 ) ![image](https://github.com/hwchase17/langchain/assets/13333726/59a5c3b4-4367-47e6-9f58-5b6557576a8a) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-27 12:39:39 -07:00
Nuno Campos	0eca3e7d90	Add Runnable.bind method to attach kwargs to a Runnable that will be passed to all invoke/stream/batch calls when it is run (#8368 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-27 11:16:30 -07:00
Nuno Campos	1bbadde77b	Support using RunnableMap directly (#8317 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-27 17:24:29 +01:00
Bagatur	944321c6ab	bump 245 (#8359 )	2023-07-27 06:53:24 -07:00
Rubén Barragán	ef6332ead6	Support loading files from Dropbox (#8271 ) ## Description This commit introduces the `DropboxLoader` class, a new document loader that allows loading files from Dropbox into the application. The loader relies on a Dropbox app, which requires creating an app on Dropbox, obtaining the necessary scope permissions, and generating an access token. Additionally, the dropbox Python package is required. The `DropboxLoader` class is designed to be used as a document loader for processing various file types, including text files, PDFs, and Dropbox Paper files. ## Dependencies `pip install dropbox` and `pip install unstructured` for PDF reading. ## Tag maintainer @rlancemartin, @eyurtsev (from Data Loaders). I'd appreciate some feedback here 🙏 . ## Social Networks https://github.com/rubenbarragan https://www.linkedin.com/in/rgbarragan/ https://twitter.com/RubenBarraganP --------- Co-authored-by: Ruben Barragan <rbarragan@Rubens-MacBook-Air.local>	2023-07-27 06:36:08 -07:00
Pranay Chandekar	41bb3a6f9b	fixed the bug #8343 (#8345 ) - Issue: #8343 Signed-off-by: Pranay Chandekar <pranayc6@gmail.com>	2023-07-27 06:33:15 -07:00
Martin Krasser	93260a9922	Fix broken `make` targets `format_diff` and `lint_diff` (#8344 ) Since the refactoring into sub-projects `libs/langchain` and `libs/experimental`, the `make` targets `format_diff` and `lint_diff` do not work anymore when running `make` from these subdirectories. Reason is that ``` PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master \| grep -E '\.py$$\|\.ipynb$$') ``` generates paths from the project's root directory instead of the corresponding subdirectories. This PR fixes this by adding a `--relative` command line option. - Tag maintainer: @baskaryan	2023-07-27 01:56:55 -07:00
Harrison Chase	ae78ef7fe6	bump experimental to 005 (#8339 )	2023-07-26 21:46:28 -07:00
Vadim Gubergrits	e7e5cb9d08	Tree of Thought introducing a new ToTChain. (#5167 ) # [WIP] Tree of Thought introducing a new ToTChain. This PR adds a new chain called ToTChain that implements the ["Large Language Model Guided Tree-of-Though"](https://arxiv.org/pdf/2305.08291.pdf) paper. There's a notebook example `docs/modules/chains/examples/tot.ipynb` that shows how to use it. Implements #4975 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @vowelparrot --------- Co-authored-by: Vadim Gubergrits <vgubergrits@outbox.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-26 21:29:39 -07:00
William FH	9eb7e6e27f	Delete Old Evals Examples (#8252 ) Still retain: - Comparison Examples - Data + QA walkthrough - QA (but really minimize it)	2023-07-26 18:46:54 -07:00
Saurabh Misra	db9d5b213a	Optimize the cosine_similarity_top_k function performance (#8151 ) Optimizing important numerical code and making it run faster. Performance went up by 1.48x (148%). Runtime went down from 138715us to 56020us Optimization explanation: The `cosine_similarity_top_k` function is where we made the most significant optimizations. Instead of sorting the entire score_array which needs considering all elements, `np.argpartition` is utilized to find the top_k largest scores indices, this operation has a time complexity of O(n), higher performance than sorting. Remember, `np.argpartition` doesn't guarantee the order of the values. So we need to use argsort() to get the indices that would sort our top-k values after partitioning, which is much more efficient because it only sorts the top-K elements, not the entire array. Then to get the row and column indices of sorted top_k scores in the original score array, we use `np.unravel_index`. This operation is more efficient and cleaner than a list comprehension. The code has been tested for correctness by running the following snippet on both the original function and the optimized function and averaged over 5 times. ``` def test_cosine_similarity_top_k_large_matrices(): X = np.random.rand(1000, 1000) Y = np.random.rand(1000, 1000) top_k = 100 score_threshold = 0.5 gc.disable() counter = time.perf_counter_ns() return_value = cosine_similarity_top_k(X, Y, top_k, score_threshold) duration = time.perf_counter_ns() - counter gc.enable() ``` @hwaking @hwchase17 @jerwelborn Unit tests pass, I also generated more regression tests which all passed.	2023-07-26 18:03:49 -07:00
Fabrizio Ruocco	ddc353a768	Azure Cognitive Search: Custom index and scoring profile support (#6843 ) Description: Adding support for custom index and scoring profile support in Azure Cognitive Search @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 17:58:01 -07:00
Kacper Łukawski	c5988c1d4b	Implement async support for Cohere (#8237 ) This PR introduces async API support for Cohere, both LLM and embeddings. It requires updating `cohere` package to `^4`. Tagging @hwchase17, @baskaryan, @agola11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 15:51:18 -07:00
Daniel Alexander Brenot	bf1357f584	Added async support to PlanAndExecute Chain (#8239 ) - Description: Adds async support to the PlanAndExecute Chain Maintainer responsibilities: - Async: @agola11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 15:16:07 -07:00
Bastin Florian	a3ac9b23eb	feat(confluence): add markdown format option (#8246 ) # Description: Add the possibility to keep text as Markdown in the ConfluenceLoader Add a bool variable that allows to keep the Markdown format of the Confluence pages. It is useful because it allows to use MarkdownHeaderTextSplitter as a DataSplitter. If this variable in set to True in the load() method, the pages are extracted using the markdownify library. # Issue: [4407](https://github.com/langchain-ai/langchain/issues/4407) # Dependencies: Add the markdownify library # Tag maintainer: @rlancemartin, @eyurtsev # Twitter handle: FloBastinHeyI - https://twitter.com/FloBastinHeyI --------- Co-authored-by: Florian Bastin <florian.bastin@octo.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 15:00:27 -07:00
Leonid Ganeline	ee6ff96e28	docstrings cleanup (#8311 ) - added missed docstrings - changed docstrings into consistent format @baskaryan	2023-07-26 14:13:10 -07:00
Rohit Gupta	e5dba8978a	Avoid re-computation of embedding in weaviate similarity search (#8284 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 13:31:55 -07:00
Nuno Campos	a612800ef0	Runnable single protocol (#7800 ) Objects implementing Runnable: BasePromptTemplate, LLM, ChatModel, Chain, Retriever, OutputParser - [x] Implement Runnable in base Retriever - [x] Raise TypeError in operator methods for unsupported things - [x] Implement dict which calls values in parallel and outputs dict with results - [x] Merge in `+` for prompts - [x] Confirm precedence order for operators, ideal would be `+` `\|`, https://docs.python.org/3/reference/expressions.html#operator-precedence - [x] Add support for openai functions, ie. Chat Models must return messages - [x] Implement BaseMessageChunk return type for BaseChatModel, a subclass of BaseMessage which implements __add__ to return BaseMessageChunk, concatenating all str args - [x] Update implementation of stream/astream for llm and chat models to use new `_stream`, `_astream` optional methods, with default implementation in base class `raise NotImplementedError` use https://stackoverflow.com/a/59762827 to see if it is implemented in base class - [x] Delete the IteratorCallbackHandler (leave the async one because people using) - [x] Make BaseLLMOutputParser implement Runnable, accepting either str or BaseMessage --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-07-26 12:16:46 -07:00
Bharat	04a4d3e312	Fixes #8310 Fix maximum recursion depth exceeded error (#8313 ) ElasticsearchVectorStore.as_retriever() method is returning `RecursionError: maximum recursion depth exceeded` because of incorrect field reference in `embeddings()` method - Description: Fix RecursionError because of a typo - Issue: the issue #8310 - Dependencies: None, - Tag maintainer: @eyurtsev - Twitter handle: bpatel	2023-07-26 12:15:37 -07:00
Caitlin2694	b9db3dd09b	Fix "missing key op" RDFGraph OWL serialization (#8276 ) Replace this comment with: - Description: Fix "missing key op" error in RDFGraph OWL Serialization - Issue: #8263 - Dependencies: None - Tag maintainer: @baskaryan	2023-07-26 12:14:56 -07:00
Eugene Yurtsev	862e9aed66	ChatPromptTemplate: Update doc-strings, update from_role_strings behavior (#8308 ) * Update doc-strings in ChatPromptTemplate * Update from_role_strings classmethod to use well known roles	2023-07-26 15:02:36 -04:00
Bagatur	2c2fd9ff13	bump 244 (#8314 )	2023-07-26 11:58:26 -07:00
Lance Martin	77c0582243	Clean queries prior to search (#8309 ) With some search tools, we see no results returned if the query is a numeric list. E.g., if we pass: ``` '1. "LangChain vs LangSmith: How do they differ?"' ``` We see: ``` No good Google Search Result was found ``` Local testing w/ Streamlit: ![image](https://github.com/langchain-ai/langchain/assets/122662504/0a7e3dca-59e8-415e-8df6-bd9e4ea962ee)	2023-07-26 11:48:28 -07:00
shibuiwilliam	6b88fbd9bb	add test for embedding distance evaluation (#8285 ) Add tests for embedding distance evaluation - Description: Add tests for embedding distance evaluation - Issue: None - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: @MlopsJ	2023-07-26 11:45:50 -07:00
Timon Palm	70604e590f	DuckDuckGoSearch News Tool (#8292 ) Description: I wanted to use the DuckDuckGoSearch tool in an agent to let him get the latest news for a topic. DuckDuckGoSearch has already an implemented function for retrieving news articles. But there wasn't a tool to use it. I simply adapted the SearchResult class with an extra argument "backend". You can set it to "news" to only get news articles. Furthermore, I added an example to the DuckDuckGo Notebook on how to further customize the results by using the DuckDuckGoSearchAPIWrapper. Dependencies: no new dependencies --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 11:30:01 -07:00
Byron Saltysiak	61347bd322	giving path to the copy command for *.toml files (#8294 ) Description: in the .devcontainer, docker-compose build is currently failing due to the src paths in the COPY command. This change adds the full path to the pyproject.toml and poetry.toml to allow the build to run. Issue: You can see the issue if you try to build the dev docker image with: ``` cd .devcontainer docker-compose build ``` Dependencies: none Twitter handle: byronsalty	2023-07-26 10:37:03 -07:00
happyxhw	6384c1ec8f	fix: ElasticVectorSearch.from_documents failed #8293 (#8296 ) - Description: fix ElasticVectorSearch.from_documents with elasticsearch_url param, - Issue: ElasticVectorSearch.from_documents failed #8293 # it fixes (if applicable), --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-26 10:33:52 -07:00
jacobswe	83a53e2126	Bug Fix: AzureChatOpenAI streaming with function calls (#8300 ) - Description: During streaming, the first chunk may only contain the name of an OpenAI function and not any arguments. In this case, the current code presumes there is a streaming response and tries to append to it, but gets a KeyError. This fixes that case by checking if the arguments key exists, and if not, creates a new entry instead of appending. - Issue: Related to #6462 Sample Code: ```python llm = AzureChatOpenAI( deployment_name=deployment_name, model_name=model_name, streaming=True ) tools = [PythonREPLTool()] callbacks = [StreamingStdOutCallbackHandler()] agent = initialize_agent( tools=tools, llm=llm, agent=AgentType.OPENAI_FUNCTIONS, callbacks=callbacks ) agent('Run some python code to test your interpreter') ``` Previous Result: ``` File ...langchain/chat_models/openai.py:344, in ChatOpenAI._generate(self, messages, stop, run_manager, **kwargs) 342 function_call = _function_call 343 else: --> 344 function_call["arguments"] += _function_call["arguments"] 345 if run_manager: 346 run_manager.on_llm_new_token(token) KeyError: 'arguments' ``` New Result: ```python {'input': 'Run some python code to test your interpreter', 'output': "The Python code `print('Hello, World!')` has been executed successfully, and the output `Hello, World!` has been printed."} ``` Co-authored-by: jswe <jswe@polencapital.com>	2023-07-26 10:11:50 -07:00
German Martin	457a4730b2	Fix the mangling issue on several VectorStores child classes. (#8274 ) - Description: Fix mangling issue affecting a couple of VectorStore classes including Redis. - Issue: https://github.com/langchain-ai/langchain/issues/8185 - @rlancemartin This is a simple issue but I lack of some context in the original implementation. My changes perhaps are not the definitive fix but to start a quick discussion. @hinthornw Tagging you since one of your changes introduced this [here.](`c38965fcba`)	2023-07-26 09:48:55 -07:00
Alec Flett	4da43f77e5	Add ability to load (deserialize) objects from other namespaces (#7726 ) I have some Prompt subclasses in my project that I'd like to be able to deserialize in callbacks. Right now `loads()`/`load()` will bail when it encounters my object, but I know I can trust the objects because they're in my own projects. <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-26 16:59:28 +01:00
Bagatur	5c6dcb1960	bump 243 (#8289 )	2023-07-26 05:41:56 -07:00
William FH	adf019724f	unpack later (#8278 ) Fix https://github.com/langchain-ai/langchain/issues/8272	2023-07-26 01:53:22 -07:00
Naveen Tatikonda	9cbefcc56c	[ OpenSearch ] : Add AOSS Support to OpenSearch (#8256 ) ### Description This PR includes the following changes: - Adds AOSS (Amazon OpenSearch Service Serverless) support to OpenSearch. Please refer to the documentation on how to use it. - While creating an index, AOSS only supports Approximate Search with `nmslib` and `faiss` engines. During Search, only Approximate Search and Script Scoring (on doc values) are supported. - This PR also adds support to `efficient_filter` which can be used with `faiss` and `lucene` engines. - The `lucene_filter` is deprecated. Instead please use the `efficient_filter` for the lucene engine. Signed-off-by: Naveen Tatikonda <navtat@amazon.com>	2023-07-25 23:59:36 -07:00
Lance Martin	7a00f17033	Web research retriever (#8102 ) Given a user question, this will - * Use LLM to generate a set of queries. * Query for each. * The URLs from search results are stored in self.urls. * A check is performed for any new URLs that haven't been processed yet (not in self.url_database). * Only these new URLs are loaded, transformed, and added to the vectorstore. * The vectorstore is queried for relevant documents based on the questions generated by the LLM. * Only unique documents are returned as the final result. This code will avoid reprocessing of URLs across multiple runs of similar queries, which should improve the performance of the retriever. It also keeps track of all URLs that have been processed, which could be useful for debugging or understanding the retriever's behavior. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-25 19:58:00 -07:00

1 2 3 4 5

206 Commits