community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334)

Thank you for contributing to LangChain!

**Description:** Update to the Vectara / LangChain integration to support new Vectara capabilities:
- Full RAG implemented as a Runnable with as_rag()
- Vectara chat supported with as_chat()
- Both support streaming responses
- Updated documentation and example notebook to reflect all the changes
- Updated Vectara templates

**Twitter handle:** ofermend

**Add tests and docs**: no new tests or docs, but updated both existing
tests and existing docs
pull/22511/head
Ofer Mendelevitch 4 months ago committed by GitHub
parent cb183a9bf1
commit ad502e8d50

@ -1,28 +1,38 @@
# Vectara
>[Vectara](https://vectara.com/) is the trusted GenAI platform for developers. It provides a simple API to build GenAI applications
> for semantic search or RAG (Retreieval augmented generation).
>[Vectara](https://vectara.com/) provides a Trusted Generative AI platform, allowing organizations to rapidly create a ChatGPT-like experience (an AI assistant)
> which is grounded in the data, documents, and knowledge that they have (technically, it is Retrieval-Augmented-Generation-as-a-service).
**Vectara Overview:**
- `Vectara` is developer-first API platform for building trusted GenAI applications.
- To use Vectara - first [sign up](https://vectara.com/integrations/langchain) and create an account. Then create a corpus and an API key for indexing and searching.
- You can use Vectara's [indexing API](https://docs.vectara.com/docs/indexing-apis/indexing) to add documents into Vectara's index
- You can use Vectara's [Search API](https://docs.vectara.com/docs/search-apis/search) to query Vectara's index (which also supports Hybrid search implicitly).
`Vectara` is RAG-as-a-service, providing all the components of RAG behind an easy-to-use API, including:
1. A way to extract text from files (PDF, PPT, DOCX, etc)
2. ML-based chunking that provides state of the art performance.
3. The [Boomerang](https://vectara.com/how-boomerang-takes-retrieval-augmented-generation-to-the-next-level-via-grounded-generation/) embeddings model.
4. Its own internal vector database where text chunks and embedding vectors are stored.
5. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments
(including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) and
[MMR](https://vectara.com/get-diverse-results-and-comprehensive-summaries-with-vectaras-mmr-reranker/))
6. An LLM for creating a [generative summary](https://docs.vectara.com/docs/learn/grounded-generation/grounded-generation-overview), based on the retrieved documents (context), including citations.
For more information:
- [Documentation](https://docs.vectara.com/docs/)
- [API Playground](https://docs.vectara.com/docs/rest-api/)
- [Quickstart](https://docs.vectara.com/docs/quickstart)
## Installation and Setup
To use `Vectara` with LangChain, no special installation steps are required.
To get started, [sign up](https://vectara.com/integrations/langchain) and follow our [quickstart](https://docs.vectara.com/docs/quickstart) guide to create a corpus and an API key.
Once you have these, you can provide them as arguments to the Vectara vectorstore, or you can set them as environment variables.
To get started, [sign up](https://vectara.com/integrations/langchain) for a free Vectara account (if you don't already have one),
and follow the [quickstart](https://docs.vectara.com/docs/quickstart) guide to create a corpus and an API key.
Once you have these, you can provide them as arguments to the Vectara `vectorstore`, or you can set them as environment variables.
- export `VECTARA_CUSTOMER_ID`="your_customer_id"
- export `VECTARA_CORPUS_ID`="your_corpus_id"
- export `VECTARA_API_KEY`="your-vectara-api-key"
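For example, a minimal sketch of setting these from Python (mirroring the example notebooks) before constructing the vectorstore:

```python
import os

# Replace the placeholder values with the credentials from your Vectara console.
os.environ["VECTARA_CUSTOMER_ID"] = "your_customer_id"
os.environ["VECTARA_CORPUS_ID"] = "your_corpus_id"
os.environ["VECTARA_API_KEY"] = "your-vectara-api-key"
```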
## Vectara as a Vector Store
There exists a wrapper around the Vectara platform, allowing you to use it as a vectorstore, whether for semantic search or example selection.
There exists a wrapper around the Vectara platform, allowing you to use it as a `vectorstore` in LangChain:
To import this vectorstore:
```python
@ -37,7 +47,10 @@ vectara = Vectara(
vectara_api_key=api_key
)
```
The customer_id, corpus_id and api_key are optional, and if they are not supplied will be read from the environment variables `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`, respectively.
The `customer_id`, `corpus_id` and `api_key` are optional, and if they are not supplied will be read from
the environment variables `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`, respectively.
### Adding Texts or Files
After you have the vectorstore, you can `add_texts` or `add_documents` as per the standard `VectorStore` interface, for example:
@ -45,8 +58,8 @@ After you have the vectorstore, you can `add_texts` or `add_documents` as per th
vectara.add_texts(["to be or not to be", "that is the question"])
```
Since Vectara supports file-upload, we also added the ability to upload files (PDF, TXT, HTML, PPT, DOC, etc) directly as file. When using this method, the file is uploaded directly to the Vectara backend, processed and chunked optimally there, so you don't have to use the LangChain document loader or chunking mechanism.
Since Vectara supports file-upload in the platform, we also added the ability to upload files (PDF, TXT, HTML, PPT, DOC, etc) directly.
When using this method, each file is uploaded directly to the Vectara backend, processed and chunked optimally there, so you don't have to use the LangChain document loader or chunking mechanism.
As an example:
@ -54,9 +67,13 @@ As an example:
vectara.add_files(["path/to/file1.pdf", "path/to/file2.pdf",...])
```
To query the vectorstore, you can use the `similarity_search` method (or `similarity_search_with_score`), which takes a query string and returns a list of results:
Of course, you do not have to add any data; you can instead connect to an existing Vectara corpus where data is already indexed.
### Querying the VectorStore
To query the Vectara vectorstore, you can use the `similarity_search` method (or `similarity_search_with_score`), which takes a query string and returns a list of results:
```python
results = vectara.similarity_score("what is LangChain?")
results = vectara.similarity_search_with_score("what is LangChain?")
```
The results are returned as a list of relevant documents, each with its relevance score.
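For example, a minimal sketch of iterating over these results, assuming the standard LangChain `(Document, score)` tuple format returned by `similarity_search_with_score`:

```python
for doc, score in results:
    # Each result pairs a Document (matching text plus metadata) with
    # Vectara's relevance score for that text segment.
    print(f"{score:.3f}: {doc.page_content[:80]}")
```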
@ -65,28 +82,101 @@ In this case, we used the default retrieval parameters, but you can also specify
- `lambda_val`: the [lexical matching](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) factor for hybrid search (defaults to 0.025)
- `filter`: a [filter](https://docs.vectara.com/docs/common-use-cases/filtering-by-metadata/filter-overview) to apply to the results (default None)
- `n_sentence_context`: number of sentences to include before/after the actual matching segment when returning results. This defaults to 2.
- `mmr_config`: can be used to specify MMR mode in the query.
- `is_enabled`: True or False
- `mmr_k`: number of results to use for MMR reranking
- `diversity_bias`: 0 = no diversity, 1 = full diversity. This is the lambda parameter in the MMR formula and is in the range 0...1
- `rerank_config`: can be used to specify the reranker for the results
  - `reranker`: mmr, rerank_multilingual_v1 or none. Note that "rerank_multilingual_v1" is a Scale-only feature
  - `rerank_k`: number of results to use for reranking
  - `mmr_diversity_bias`: 0 = no diversity, 1 = full diversity. This is the lambda parameter in the MMR formula and is in the range 0...1
To get results without the relevance score, you can simply use the `similarity_search` method:
```python
results = vectara.similarity_search("what is LangChain?")
```
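As a sketch, a query that overrides some of these defaults might look like the snippet below, assuming the parameters are passed as keyword arguments as in earlier versions of the integration (the filter expression is illustrative and depends on the filter attributes defined in your own corpus):

```python
results = vectara.similarity_search(
    "what is LangChain?",
    k=5,                           # return the top 5 matches
    n_sentence_context=0,          # no extra sentences around each matching segment
    filter="doc.source = 'blog'",  # hypothetical metadata filter attribute
)
```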
## Vectara for Retrieval Augmented Generation (RAG)
Vectara provides a full RAG pipeline, including generative summarization.
To use this pipeline, you can specify the `summary_config` argument in `similarity_search` or `similarity_search_with_score` as follows:
Vectara provides a full RAG pipeline, including generative summarization. To use it as a complete RAG solution, you can use the `as_rag` method.
There are a few additional parameters that can be specified in the `VectaraQueryConfig` object to control retrieval and summarization:
* `k`: number of results to return
* `lambda_val`: the lexical matching factor for hybrid search
* `summary_config` (optional): can be used to request an LLM summary in RAG
  - `is_enabled`: True or False
  - `max_results`: number of results to use for summary generation
  - `response_lang`: language of the response summary, in ISO 639-2 format (e.g. 'eng', 'fra', 'deu', etc)
* `rerank_config` (optional): can be used to specify the Vectara reranker for the results
  - `reranker`: mmr, rerank_multilingual_v1 or none
  - `rerank_k`: number of results to use for reranking
  - `mmr_diversity_bias`: 0 = no diversity, 1 = full diversity.
    This is the lambda parameter in the MMR formula and is in the range 0...1
For example:
```python
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang='eng')
rerank_config = RerankConfig(reranker="mmr", rerank_k=50, mmr_diversity_bias=0.2)
config = VectaraQueryConfig(k=10, lambda_val=0.005, rerank_config=rerank_config, summary_config=summary_config)
```
Then you can use the `as_rag` method to create a RAG pipeline:
```python
query_str = "what did Biden say?"
rag = vectara.as_rag(config)
rag.invoke(query_str)['answer']
```
The `as_rag` method returns a `VectaraRAG` object, which behaves just like any LangChain Runnable, including the `invoke` or `stream` methods.
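Since the returned object is a Runnable, streaming works the same way. Here is a minimal sketch, assuming the same chunk format shown in the chat notebook, where each streamed chunk is a dict and the generated text arrives under the `answer` key:

```python
for chunk in rag.stream(query_str):
    # Print the generated summary incrementally as it arrives.
    if "answer" in chunk:
        print(chunk["answer"], end="", flush=True)
```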
## Vectara Chat
The RAG functionality can be used to create a chatbot. For example, you can create a simple chatbot that responds to user input:
```python
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang='eng')
rerank_config = RerankConfig(reranker="mmr", rerank_k=50, mmr_diversity_bias=0.2)
config = VectaraQueryConfig(k=10, lambda_val=0.005, rerank_config=rerank_config, summary_config=summary_config)
query_str = "what did Biden say?"
bot = vectara.as_chat(config)
bot.invoke(query_str)['answer']
```
The main difference is this: with `as_chat`, Vectara tracks the chat history internally and conditions each response on the full conversation so far.
There is no need to keep that history locally in LangChain; Vectara manages it for you.
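For example (as in the chat notebook), a follow-up question can be sent directly, with no history passed in:

```python
bot.invoke("What did the president say about Ketanji Brown Jackson?")["answer"]

# The follow-up relies on the chat history that Vectara keeps server-side.
bot.invoke("Did he mention who she succeeded?")["answer"]
```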
## Vectara as a LangChain retriever only
If you want to use Vectara as a retriever only, you can use the `as_retriever` method, which returns a `VectaraRetriever` object.
```python
retriever = vectara.as_retriever(config=config)
retriever.invoke(query_str)
```
As with `as_rag`, you provide a `VectaraQueryConfig` object to control the retrieval parameters.
In most cases you would not enable `summary_config`, but it is left as an option for backwards compatibility.
If no summary is requested, the response will be a list of relevant documents, each with a relevance score.
If a summary is requested, the response will be a list of relevant documents as before, plus an additional document that includes the generative summary.
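If a summary is enabled, a small sketch for separating the source documents from the summary document (which carries `summary: True` in its metadata) could look like this:

```python
docs = retriever.invoke(query_str)

sources = docs[:-1]              # the k matching documents
summary = docs[-1].page_content  # the generative summary, appended as the last document
```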
## Hallucination Detection score
Vectara created [HHEM](https://huggingface.co/vectara/hallucination_evaluation_model) - an open source model that can be used to evaluate RAG responses for factual consistency.
As part of Vectara RAG, the "Factual Consistency Score" (or FCS), an improved version of the open-source HHEM, is made available via the API.
It is automatically included in the output of the RAG pipeline:
```python
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang='eng')
rerank_config = RerankConfig(reranker="mmr", rerank_k=50, mmr_diversity_bias=0.2)
config = VectaraQueryConfig(k=10, lambda_val=0.005, rerank_config=rerank_config, summary_config=summary_config)
rag = vectara.as_rag(config)
resp = rag.invoke(query_str)
print(resp['answer'])
print(f"Vectara FCS = {resp['fcs']}")
```
## Example Notebooks
For more detailed examples of using Vectara with LangChain, see the following example notebooks:
* [this notebook](/docs/integrations/vectorstores/vectara) shows how to use Vectara: with full RAG or just as a retriever.
* [this notebook](/docs/integrations/retrievers/self_query/vectara_self_query) shows the self-query capability with Vectara.
* [this notebook](/docs/integrations/providers/vectara/vectara_chat) shows how to build a chatbot with LangChain and Vectara.

@ -5,7 +5,21 @@
"id": "134a0785",
"metadata": {},
"source": [
"# Chat Over Documents with Vectara"
"# Vectara Chat\n",
"\n",
"[Vectara](https://vectara.com/) provides a Trusted Generative AI platform, allowing organizations to rapidly create a ChatGPT-like experience (an AI assistant) which is grounded in the data, documents, and knowledge that they have (technically, it is Retrieval-Augmented-Generation-as-a-service). \n",
"\n",
"Vectara serverless RAG-as-a-service provides all the components of RAG behind an easy-to-use API, including:\n",
"1. A way to extract text from files (PDF, PPT, DOCX, etc)\n",
"2. ML-based chunking that provides state of the art performance.\n",
"3. The [Boomerang](https://vectara.com/how-boomerang-takes-retrieval-augmented-generation-to-the-next-level-via-grounded-generation/) embeddings model.\n",
"4. Its own internal vector database where text chunks and embedding vectors are stored.\n",
"5. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) and [MMR](https://vectara.com/get-diverse-results-and-comprehensive-summaries-with-vectaras-mmr-reranker/))\n",
"7. An LLM to for creating a [generative summary](https://docs.vectara.com/docs/learn/grounded-generation/grounded-generation-overview), based on the retrieved documents (context), including citations.\n",
"\n",
"See the [Vectara API documentation](https://docs.vectara.com/docs/) for more information on how to use the API.\n",
"\n",
"This notebook shows how to use Vectara's [Chat](https://docs.vectara.com/docs/api-reference/chat-apis/chat-apis-overview) functionality."
]
},
{
@ -13,19 +27,19 @@
"id": "56372c5b",
"metadata": {},
"source": [
"# Setup\n",
"# Getting Started\n",
"\n",
"You will need a Vectara account to use Vectara with LangChain. To get started, use the following steps:\n",
"1. [Sign up](https://www.vectara.com/integrations/langchain) for a Vectara account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"To get started, use the following steps:\n",
"1. If you don't already have one, [Sign up](https://www.vectara.com/integrations/langchain) for your free Vectara account. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingest from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Authorization\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Access Control\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query-only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"\n",
"To use LangChain with Vectara, you'll need to have these three values: customer ID, corpus ID and api_key.\n",
"To use LangChain with Vectara, you'll need to have these three values: `customer ID`, `corpus ID` and `api_key`.\n",
"You can provide those to LangChain in two ways:\n",
"\n",
"1. Include in your environment these three variables: `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`.\n",
"\n",
"> For example, you can set these variables using os.environ and getpass as follows:\n",
" For example, you can set these variables using os.environ and getpass as follows:\n",
"\n",
"```python\n",
"import os\n",
@ -36,20 +50,21 @@
"os.environ[\"VECTARA_API_KEY\"] = getpass.getpass(\"Vectara API Key:\")\n",
"```\n",
"\n",
"2. Add them to the Vectara vectorstore constructor:\n",
"2. Add them to the `Vectara` vectorstore constructor:\n",
"\n",
"```python\n",
"vectorstore = Vectara(\n",
"vectara = Vectara(\n",
" vectara_customer_id=vectara_customer_id,\n",
" vectara_corpus_id=vectara_corpus_id,\n",
" vectara_api_key=vectara_api_key\n",
" )\n",
"```"
"```\n",
"In this notebook we assume they are provided in the environment."
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "70c4e529",
"metadata": {
"tags": []
@ -58,9 +73,16 @@
"source": [
"import os\n",
"\n",
"from langchain.chains import ConversationalRetrievalChain\n",
"os.environ[\"VECTARA_API_KEY\"] = \"<YOUR_VECTARA_API_KEY>\"\n",
"os.environ[\"VECTARA_CORPUS_ID\"] = \"<YOUR_VECTARA_CORPUS_ID>\"\n",
"os.environ[\"VECTARA_CUSTOMER_ID\"] = \"<YOUR_VECTARA_CUSTOMER_ID>\"\n",
"\n",
"from langchain_community.vectorstores import Vectara\n",
"from langchain_openai import OpenAI"
"from langchain_community.vectorstores.vectara import (\n",
" RerankConfig,\n",
" SummaryConfig,\n",
" VectaraQueryConfig,\n",
")"
]
},
{
@ -68,62 +90,30 @@
"id": "cdff94be",
"metadata": {},
"source": [
"Load in documents. You can replace this with a loader for whatever type of data you want"
"## Vectara Chat Explained\n",
"\n",
"In most uses of LangChain to create chatbots, one must integrate a special `memory` component that maintains the history of chat sessions and then uses that history to ensure the chatbot is aware of conversation history.\n",
"\n",
"With Vectara Chat - all of that is performed in the backend by Vectara automatically. You can look at the [Chat](https://docs.vectara.com/docs/api-reference/chat-apis/chat-apis-overview) documentation for the details, to learn more about the internals of how this is implemented, but with LangChain all you have to do is turn that feature on in the Vectara vectorstore.\n",
"\n",
"Let's see an example. First we load the SOTU document (remember, text extraction and chunking all occurs automatically on the Vectara platform):"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "01c46e92",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain_community.document_loaders import TextLoader\n",
"from langchain.document_loaders import TextLoader\n",
"\n",
"loader = TextLoader(\"state_of_the_union.txt\")\n",
"documents = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "239475d2",
"metadata": {},
"source": [
"Since we're using Vectara, there's no need to chunk the documents, as that is done automatically in the Vectara platform backend. We just use `from_document()` to upload the text loaded from the file, and directly ingest it into Vectara:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a8930cf7",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"vectara = Vectara.from_documents(documents, embedding=None)"
]
},
{
"cell_type": "markdown",
"id": "898b574b",
"metadata": {},
"source": [
"We can now create a memory object, which is neccessary to track the inputs/outputs and hold a conversation."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "af803fee",
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory import ConversationBufferMemory\n",
"documents = loader.load()\n",
"\n",
"memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True)"
"vectara = Vectara.from_documents(documents, embedding=None)"
]
},
{
@ -131,139 +121,25 @@
"id": "3c96b118",
"metadata": {},
"source": [
"We now initialize the `ConversationalRetrievalChain`:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "7b4110f3",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Document(page_content='Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence. A former top litigator in private practice.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '29486', 'len': '97'}), Document(page_content='Groups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland. In this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.” The Ukrainian Ambassador to the United States is here tonight. Let each of us here tonight in this Chamber send an unmistakable signal to Ukraine and to the world.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '1083', 'len': '117'}), Document(page_content='All told, we created 369,000 new manufacturing jobs in America just last year. Powered by people Ive met like JoJo Burgess, from generations of union steelworkers from Pittsburgh, whos here with us tonight. As Ohio Senator Sherrod Brown says, “Its time to bury the label “Rust Belt.” Its time. \\n\\nBut with all the bright spots in our economy, record job growth and higher wages, too many families are struggling to keep up with the bills. Inflation is robbing them of the gains they might otherwise feel.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '14257', 'len': '77'}), Document(page_content='This is personal to me and Jill, to Kamala, and to so many of you. Cancer is the #2 cause of death in Americasecond only to heart disease. Last month, I announced our plan to supercharge \\nthe Cancer Moonshot that President Obama asked me to lead six years ago. Our goal is to cut the cancer death rate by at least 50% over the next 25 years, turn more cancers from death sentences into treatable diseases. More support for patients and families.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '36196', 'len': '122'}), Document(page_content='Six days ago, Russias Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. He met the Ukrainian people.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '664', 'len': '68'}), Document(page_content='I understand. \\n\\nI remember when my Dad had to leave our home in Scranton, Pennsylvania to find work. I grew up in a family where if the price of food went up, you felt it. Thats why one of the first things I did as President was fight to pass the American Rescue Plan. Because people were hurting. We needed to act, and we did.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '8042', 'len': '97'}), Document(page_content='He rejected repeated efforts at diplomacy. He thought the West and NATO wouldnt respond. And he thought he could divide us at home. We were ready. Here is what we did. We prepared extensively and carefully.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '2100', 'len': '42'}), Document(page_content='He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. 
He met the Ukrainian people. From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. Groups of citizens blocking tanks with their bodies.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '788', 'len': '28'}), Document(page_content='Putins latest attack on Ukraine was premeditated and unprovoked. He rejected repeated efforts at diplomacy. He thought the West and NATO wouldnt respond. And he thought he could divide us at home. We were ready. Here is what we did.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '2053', 'len': '46'}), Document(page_content='A unity agenda for the nation. We can do this. \\n\\nMy fellow Americans—tonight , we have gathered in a sacred space—the citadel of our democracy. In this Capitol, generation after generation, Americans have debated great questions amid great strife, and have done great things. We have fought for freedom, expanded liberty, defeated totalitarianism and terror. And built the strongest, freest, and most prosperous nation the world has ever known.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '36968', 'len': '131'})]\n"
]
}
],
"source": [
"openai_api_key = os.environ[\"OPENAI_API_KEY\"]\n",
"llm = OpenAI(openai_api_key=openai_api_key, temperature=0)\n",
"retriever = vectara.as_retriever()\n",
"d = retriever.invoke(\"What did the president say about Ketanji Brown Jackson\", k=2)\n",
"print(d)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "44ed803e",
"metadata": {},
"outputs": [],
"source": [
"bot = ConversationalRetrievalChain.from_llm(\n",
" llm, retriever, memory=memory, verbose=False\n",
")"
]
},
{
"cell_type": "markdown",
"id": "5b6deb16",
"metadata": {},
"source": [
"And can have a multi-turn conversation with out new bot:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e8ce4fe9",
"metadata": {},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = bot.invoke({\"question\": query})"
"And now we create a Chat Runnable using the `as_chat` method:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "4c79862b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds and a former top litigator in private practice, and that she will continue Justice Breyer's legacy of excellence.\""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "c697d9d1",
"metadata": {},
"outputs": [],
"source": [
"query = \"Did he mention who she suceeded\"\n",
"result = bot.invoke({\"question\": query})"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "ba0678f3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Ketanji Brown Jackson succeeded Justice Breyer on the United States Supreme Court.'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
]
},
{
"cell_type": "markdown",
"id": "b3308b01-5300-4999-8cd3-22f16dae757e",
"metadata": {},
"source": [
"## Pass in chat history\n",
"\n",
"In the above example, we used a Memory object to track chat history. We can also just pass it in explicitly. In order to do this, we need to initialize a chain without any memory object."
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 3,
"id": "1b41a10b-bf68-4689-8f00-9aed7675e2ab",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"bot = ConversationalRetrievalChain.from_llm(\n",
" OpenAI(temperature=0), vectara.as_retriever()\n",
")"
"summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang=\"eng\")\n",
"rerank_config = RerankConfig(reranker=\"mmr\", rerank_k=50, mmr_diversity_bias=0.2)\n",
"config = VectaraQueryConfig(\n",
" k=10, lambda_val=0.005, rerank_config=rerank_config, summary_config=summary_config\n",
")\n",
"\n",
"bot = vectara.as_chat(config)"
]
},
{
@ -276,39 +152,25 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 4,
"id": "bc672290-8a8b-4828-a90c-f1bbdd6b3920",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = bot.invoke({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "6b62d758-c069-4062-88f0-21e7ea4710bf",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds and a former top litigator in private practice, and that she will continue Justice Breyer's legacy of excellence.\""
"'The President expressed gratitude to Justice Breyer and highlighted the significance of nominating Ketanji Brown Jackson to the Supreme Court, praising her legal expertise and commitment to upholding excellence [1]. The President also reassured the public about the situation with gas prices and the conflict in Ukraine, emphasizing unity with allies and the belief that the world will emerge stronger from these challenges [2][4]. Additionally, the President shared personal experiences related to economic struggles and the importance of passing the American Rescue Plan to support those in need [3]. The focus was also on job creation and economic growth, acknowledging the impact of inflation on families [5]. While addressing cancer as a significant issue, the President discussed plans to enhance cancer research and support for patients and families [7].'"
]
},
"execution_count": 14,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
"bot.invoke(\"What did the president say about Ketanji Brown Jackson?\")[\"answer\"]"
]
},
{
@ -321,256 +183,25 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 5,
"id": "9c95460b-7116-4155-a9d2-c0fb027ee592",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = [(query, result[\"answer\"])]\n",
"query = \"Did he mention who she suceeded\"\n",
"result = bot.invoke({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "698ac00c-cadc-407f-9423-226b2d9258d0",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"' Ketanji Brown Jackson succeeded Justice Breyer on the United States Supreme Court.'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
]
},
{
"cell_type": "markdown",
"id": "0eaadf0f",
"metadata": {},
"source": [
"## Return Source Documents\n",
"You can also easily return source documents from the ConversationalRetrievalChain. This is useful for when you want to inspect what documents were returned."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "562769c6",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"bot = ConversationalRetrievalChain.from_llm(\n",
" llm, vectara.as_retriever(), return_source_documents=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "ea478300",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = bot.invoke({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "4cb75b4e",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence. A former top litigator in private practice.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '29486', 'len': '97'})"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"source_documents\"][0]"
]
},
{
"cell_type": "markdown",
"id": "99b96dae",
"metadata": {},
"source": [
"## ConversationalRetrievalChain with `map_reduce`\n",
"LangChain supports different types of ways to combine document chains with the ConversationalRetrievalChain chain."
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "e53a9d66",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chains import LLMChain\n",
"from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT\n",
"from langchain.chains.question_answering import load_qa_chain"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "bf205e35",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
"doc_chain = load_qa_chain(llm, chain_type=\"map_reduce\")\n",
"\n",
"chain = ConversationalRetrievalChain(\n",
" retriever=vectara.as_retriever(),\n",
" question_generator=question_generator,\n",
" combine_docs_chain=doc_chain,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "78155887",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = chain({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "e54b5fa2",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that he nominated Circuit Court of Appeals Judge Ketanji Brown Jackson, who is one of the nation's top legal minds and a former top litigator in private practice.\""
"\"In his remarks, the President specified that Ketanji Brown Jackson is succeeding Justice Breyer on the United States Supreme Court[1]. The President praised Jackson as a top legal mind who will continue Justice Breyer's legacy of excellence. The nomination of Jackson was highlighted as a significant constitutional responsibility of the President[1]. The President emphasized the importance of this nomination and the qualities that Jackson brings to the role. The focus was on the transition from Justice Breyer to Judge Ketanji Brown Jackson on the Supreme Court[1].\""
]
},
"execution_count": 23,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
]
},
{
"cell_type": "markdown",
"id": "a2fe6b14",
"metadata": {},
"source": [
"## ConversationalRetrievalChain with Question Answering with sources\n",
"\n",
"You can also use this chain with the question answering with sources chain."
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "d1058fd2",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chains.qa_with_sources import load_qa_with_sources_chain"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "a6594482",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
"doc_chain = load_qa_with_sources_chain(llm, chain_type=\"map_reduce\")\n",
"\n",
"chain = ConversationalRetrievalChain(\n",
" retriever=vectara.as_retriever(),\n",
" question_generator=question_generator,\n",
" combine_docs_chain=doc_chain,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "e2badd21",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = chain({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "edb31fe5",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds and a former top litigator in private practice.\\nSOURCES: langchain\""
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
"bot.invoke(\"Did he mention who she suceeded?\")[\"answer\"]"
]
},
{
@ -578,74 +209,16 @@
"id": "2324cdc6-98bf-4708-b8cd-02a98b1e5b67",
"metadata": {},
"source": [
"## ConversationalRetrievalChain with streaming to `stdout`\n",
"\n",
"Output from the chain will be streamed to `stdout` token by token in this example."
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "2efacec3-2690-4b05-8de3-a32fd2ac3911",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chains.conversational_retrieval.prompts import (\n",
" CONDENSE_QUESTION_PROMPT,\n",
" QA_PROMPT,\n",
")\n",
"from langchain.chains.llm import LLMChain\n",
"from langchain.chains.question_answering import load_qa_chain\n",
"from langchain_core.callbacks import StreamingStdOutCallbackHandler\n",
"\n",
"# Construct a ConversationalRetrievalChain with a streaming llm for combine docs\n",
"# and a separate, non-streaming llm for question generation\n",
"llm = OpenAI(temperature=0, openai_api_key=openai_api_key)\n",
"streaming_llm = OpenAI(\n",
" streaming=True,\n",
" callbacks=[StreamingStdOutCallbackHandler()],\n",
" temperature=0,\n",
" openai_api_key=openai_api_key,\n",
")\n",
"\n",
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
"doc_chain = load_qa_chain(streaming_llm, chain_type=\"stuff\", prompt=QA_PROMPT)\n",
"## Chat with streaming\n",
"\n",
"bot = ConversationalRetrievalChain(\n",
" retriever=vectara.as_retriever(),\n",
" combine_docs_chain=doc_chain,\n",
" question_generator=question_generator,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "fd6d43f4-7428-44a4-81bc-26fe88a98762",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds and a former top litigator in private practice, and that she will continue Justice Breyer's legacy of excellence."
]
}
],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = bot.invoke({\"question\": query, \"chat_history\": chat_history})"
"Of course the chatbot interface also supports streaming.\n",
"Instead of the `invoke` method you simply use `stream`:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "5ab38978-f3e8-4fa7-808c-c79dec48379a",
"execution_count": 6,
"id": "936dc62f",
"metadata": {
"tags": []
},
@ -654,81 +227,22 @@
"name": "stdout",
"output_type": "stream",
"text": [
" Ketanji Brown Jackson succeeded Justice Breyer on the United States Supreme Court."
"Judge Ketanji Brown Jackson is a nominee for the United States Supreme Court, known for her legal expertise and experience as a former litigator. She is praised for her potential to continue the legacy of excellence on the Court[1]. While the search results provide information on various topics like innovation, economic growth, and healthcare initiatives, they do not directly address Judge Ketanji Brown Jackson's specific accomplishments. Therefore, I do not have enough information to answer this question."
]
}
],
"source": [
"chat_history = [(query, result[\"answer\"])]\n",
"query = \"Did he mention who she suceeded\"\n",
"result = bot.invoke({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "markdown",
"id": "f793d56b",
"metadata": {},
"source": [
"## get_chat_history Function\n",
"You can also specify a `get_chat_history` function, which can be used to format the chat_history string."
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "a7ba9d8c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"def get_chat_history(inputs) -> str:\n",
" res = []\n",
" for human, ai in inputs:\n",
" res.append(f\"Human:{human}\\nAI:{ai}\")\n",
" return \"\\n\".join(res)\n",
"\n",
"\n",
"bot = ConversationalRetrievalChain.from_llm(\n",
" llm, vectara.as_retriever(), get_chat_history=get_chat_history\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "a3e33c0d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = bot.invoke({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "936dc62f",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds and a former top litigator in private practice, and that she will continue Justice Breyer's legacy of excellence.\""
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
"output = {}\n",
"curr_key = None\n",
"for chunk in bot.stream(\"what about her accopmlishments?\"):\n",
" for key in chunk:\n",
" if key not in output:\n",
" output[key] = chunk[key]\n",
" else:\n",
" output[key] += chunk[key]\n",
" if key == \"answer\":\n",
" print(chunk[key], end=\"\", flush=True)\n",
" curr_key = key"
]
}
],
@ -748,7 +262,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
"version": "3.11.8"
}
},
"nbformat": 4,

@ -1,311 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "559f8e0e",
"metadata": {},
"source": [
"# Vectara\n",
"\n",
">[Vectara](https://vectara.com/) is the trusted GenAI platform that provides an easy-to-use API for document indexing and querying. \n",
"\n",
"Vectara provides an end-to-end managed service for Retrieval Augmented Generation or [RAG](https://vectara.com/grounded-generation/), which includes:\n",
"\n",
"1. A way to extract text from document files and chunk them into sentences.\n",
"\n",
"2. The state-of-the-art [Boomerang](https://vectara.com/how-boomerang-takes-retrieval-augmented-generation-to-the-next-level-via-grounded-generation/) embeddings model. Each text chunk is encoded into a vector embedding using Boomerang, and stored in the Vectara internal knowledge (vector+text) store\n",
"\n",
"3. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) and [MMR](https://vectara.com/get-diverse-results-and-comprehensive-summaries-with-vectaras-mmr-reranker/))\n",
"\n",
"4. An option to create [generative summary](https://docs.vectara.com/docs/learn/grounded-generation/grounded-generation-overview), based on the retrieved documents, including citations.\n",
"\n",
"See the [Vectara API documentation](https://docs.vectara.com/docs/) for more information on how to use the API.\n",
"\n",
"This notebook shows how to use functionality related to the `Vectara`'s integration with langchain.\n",
"Specificaly we will demonstrate how to use chaining with [LangChain's Expression Language](/docs/concepts#langchain-expression-language) and using Vectara's integrated summarization capability."
]
},
{
"cell_type": "markdown",
"id": "e97dcf11",
"metadata": {},
"source": [
"# Setup\n",
"\n",
"You will need a Vectara account to use Vectara with LangChain. To get started, use the following steps:\n",
"\n",
"1. [Sign up](https://www.vectara.com/integrations/langchain) for a Vectara account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"\n",
"2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingest from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
"\n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Authorization\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"\n",
"To use LangChain with Vectara, you'll need to have these three values: customer ID, corpus ID and api_key.\n",
"You can provide those to LangChain in two ways:\n",
"\n",
"1. Include in your environment these three variables: `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`.\n",
"\n",
"> For example, you can set these variables using os.environ and getpass as follows:\n",
"\n",
"```python\n",
"import os\n",
"import getpass\n",
"\n",
"os.environ[\"VECTARA_CUSTOMER_ID\"] = getpass.getpass(\"Vectara Customer ID:\")\n",
"os.environ[\"VECTARA_CORPUS_ID\"] = getpass.getpass(\"Vectara Corpus ID:\")\n",
"os.environ[\"VECTARA_API_KEY\"] = getpass.getpass(\"Vectara API Key:\")\n",
"```\n",
"\n",
"2. Add them to the Vectara vectorstore constructor:\n",
"\n",
"```python\n",
"vectorstore = Vectara(\n",
" vectara_customer_id=vectara_customer_id,\n",
" vectara_corpus_id=vectara_corpus_id,\n",
" vectara_api_key=vectara_api_key\n",
" )\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac7a9a6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.embeddings import FakeEmbeddings\n",
"from langchain_community.vectorstores import Vectara\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough"
]
},
{
"cell_type": "markdown",
"id": "875ffb7e",
"metadata": {},
"source": [
"First we load the state-of-the-union text into Vectara. Note that we use the `from_files` interface which does not require any local processing or chunking - Vectara receives the file content and performs all the necessary pre-processing, chunking and embedding of the file into its knowledge store."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "be0a4973",
"metadata": {},
"outputs": [],
"source": [
"vectara = Vectara.from_files([\"state_of_the_union.txt\"])"
]
},
{
"cell_type": "markdown",
"id": "22a6b953",
"metadata": {},
"source": [
"We now create a Vectara retriever and specify that:\n",
"* It should return only the 3 top Document matches\n",
"* For summary, it should use the top 5 results and respond in English"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "19cd2f86",
"metadata": {},
"outputs": [],
"source": [
"summary_config = {\"is_enabled\": True, \"max_results\": 5, \"response_lang\": \"eng\"}\n",
"retriever = vectara.as_retriever(\n",
" search_kwargs={\"k\": 3, \"summary_config\": summary_config}\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c49284ed",
"metadata": {},
"source": [
"When using summarization with Vectara, the retriever responds with a list of `Document` objects:\n",
"1. The first `k` documents are the ones that match the query (as we are used to with a standard vector store)\n",
"2. With summary enabled, an additional `Document` object is apended, which includes the summary text. This Document has the metadata field `summary` set as True.\n",
"\n",
"Let's define two utility functions to split those out:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e5100654",
"metadata": {},
"outputs": [],
"source": [
"def get_sources(documents):\n",
" return documents[:-1]\n",
"\n",
"\n",
"def get_summary(documents):\n",
" return documents[-1].page_content\n",
"\n",
"\n",
"query_str = \"what did Biden say?\""
]
},
{
"cell_type": "markdown",
"id": "f2a74368",
"metadata": {},
"source": [
"Now we can try a summary response for the query:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ee4759c4",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"'The returned results did not contain sufficient information to be summarized into a useful answer for your query. Please try a different search or restate your query differently.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(retriever | get_summary).invoke(query_str)"
]
},
{
"cell_type": "markdown",
"id": "dd7c4593",
"metadata": {},
"source": [
"And if we would like to see the sources retrieved from Vectara that were used in this summary (the citations):"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "0eb66034",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='When they came home, many of the worlds fittest and best trained warriors were never the same. Dizziness. \\n\\nA cancer that would put them in a flag-draped coffin. I know. \\n\\nOne of those soldiers was my son Major Beau Biden. We dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops. But Im committed to finding out everything we can.', metadata={'lang': 'eng', 'section': '1', 'offset': '34652', 'len': '60', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='The U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs. We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains. And tonight I am announcing that we will join our allies in closing off American air space to all Russian flights further isolating Russia and adding an additional squeeze on their economy. The Ruble has lost 30% of its value.', metadata={'lang': 'eng', 'section': '1', 'offset': '3807', 'len': '42', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='He rejected repeated efforts at diplomacy. He thought the West and NATO wouldnt respond. And he thought he could divide us at home. We were ready. Here is what we did. We prepared extensively and carefully.', metadata={'lang': 'eng', 'section': '1', 'offset': '2100', 'len': '42', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'})]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(retriever | get_sources).invoke(query_str)"
]
},
{
"cell_type": "markdown",
"id": "8f16bf8d",
"metadata": {},
"source": [
"Vectara's \"RAG as a service\" does a lot of the heavy lifting in creating question answering or chatbot chains. The integration with LangChain provides the option to use additional capabilities such as query pre-processing like `SelfQueryRetriever` or `MultiQueryRetriever`. Let's look at an example of using the [MultiQueryRetriever](/docs/how_to/MultiQueryRetriever).\n",
"\n",
"Since MQR uses an LLM we have to set that up - here we choose `ChatOpenAI`:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "e14325b9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"President Biden has made several notable quotes and comments. He expressed a commitment to investigate the potential impact of burn pits on soldiers' health, referencing his son's brain cancer [1]. He emphasized the importance of unity among Americans, urging us to see each other as fellow citizens rather than enemies [2]. Biden also highlighted the need for schools to use funds from the American Rescue Plan to hire teachers and address learning loss, while encouraging community involvement in supporting education [3].\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.retrievers.multi_query import MultiQueryRetriever\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(temperature=0)\n",
"mqr = MultiQueryRetriever.from_llm(retriever=retriever, llm=llm)\n",
"\n",
"(mqr | get_summary).invoke(query_str)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "fa14f923",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='When they came home, many of the worlds fittest and best trained warriors were never the same. Dizziness. \\n\\nA cancer that would put them in a flag-draped coffin. I know. \\n\\nOne of those soldiers was my son Major Beau Biden. We dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops. But Im committed to finding out everything we can.', metadata={'lang': 'eng', 'section': '1', 'offset': '34652', 'len': '60', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='The U.S. Department of Justice is assembling a dedicated task force to go after the crimes of Russian oligarchs. We are joining with our European allies to find and seize your yachts your luxury apartments your private jets. We are coming for your ill-begotten gains. And tonight I am announcing that we will join our allies in closing off American air space to all Russian flights further isolating Russia and adding an additional squeeze on their economy. The Ruble has lost 30% of its value.', metadata={'lang': 'eng', 'section': '1', 'offset': '3807', 'len': '42', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='And, if Congress provides the funds we need, well have new stockpiles of tests, masks, and pills ready if needed. I cannot promise a new variant wont come. But I can promise you well do everything within our power to be ready if it does. Third we can end the shutdown of schools and businesses. We have the tools we need.', metadata={'lang': 'eng', 'section': '1', 'offset': '24753', 'len': '82', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='The returned results did not contain sufficient information to be summarized into a useful answer for your query. Please try a different search or restate your query differently.', metadata={'summary': True}),\n",
" Document(page_content='Danielle says Heath was a fighter to the very end. He didnt know how to stop fighting, and neither did she. Through her pain she found purpose to demand we do better. Tonight, Danielle—we are. The VA is pioneering new ways of linking toxic exposures to diseases, already helping more veterans get benefits.', metadata={'lang': 'eng', 'section': '1', 'offset': '35502', 'len': '58', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='Lets stop seeing each other as enemies, and start seeing each other for who we really are: Fellow Americans. We cant change how divided weve been. But we can change how we move forward—on COVID-19 and other issues we must face together. I recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. They were responding to a 9-1-1 call when a man shot and killed them with a stolen gun.', metadata={'lang': 'eng', 'section': '1', 'offset': '26312', 'len': '89', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='The American Rescue Plan gave schools money to hire teachers and help students make up for lost learning. I urge every parent to make sure your school does just that. And we can all play a part—sign up to be a tutor or a mentor. Children were also struggling before the pandemic. Bullying, violence, trauma, and the harms of social media.', metadata={'lang': 'eng', 'section': '1', 'offset': '33227', 'len': '61', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'})]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(mqr | get_sources).invoke(query_str)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16853820",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -5,15 +5,17 @@
"id": "13afcae7",
"metadata": {},
"source": [
"# Vectara \n",
"# Vectara self-querying \n",
"\n",
">[Vectara](https://vectara.com/) is the trusted GenAI platform that provides an easy-to-use API for document indexing and querying. \n",
">\n",
">`Vectara` provides an end-to-end managed service for `Retrieval Augmented Generation` or [RAG](https://vectara.com/grounded-generation/), which includes:\n",
">1. A way to `extract text` from document files and `chunk` them into sentences.\n",
">2. The state-of-the-art [Boomerang](https://vectara.com/how-boomerang-takes-retrieval-augmented-generation-to-the-next-level-via-grounded-generation/) embeddings model. Each text chunk is encoded into a vector embedding using `Boomerang`, and stored in the Vectara internal knowledge (vector+text) store\n",
">3. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) and [MMR](https://vectara.com/get-diverse-results-and-comprehensive-summaries-with-vectaras-mmr-reranker/))\n",
">4. An option to create [generative summary](https://docs.vectara.com/docs/learn/grounded-generation/grounded-generation-overview), based on the retrieved documents, including citations.\n",
"[Vectara](https://vectara.com/) provides a Trusted Generative AI platform, allowing organizations to rapidly create a ChatGPT-like experience (an AI assistant) which is grounded in the data, documents, and knowledge that they have (technically, it is Retrieval-Augmented-Generation-as-a-service). \n",
"\n",
"Vectara serverless RAG-as-a-service provides all the components of RAG behind an easy-to-use API, including:\n",
"1. A way to extract text from files (PDF, PPT, DOCX, etc)\n",
"2. ML-based chunking that provides state of the art performance.\n",
"3. The [Boomerang](https://vectara.com/how-boomerang-takes-retrieval-augmented-generation-to-the-next-level-via-grounded-generation/) embeddings model.\n",
"4. Its own internal vector database where text chunks and embedding vectors are stored.\n",
"5. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) and [MMR](https://vectara.com/get-diverse-results-and-comprehensive-summaries-with-vectaras-mmr-reranker/))\n",
"7. An LLM to for creating a [generative summary](https://docs.vectara.com/docs/learn/grounded-generation/grounded-generation-overview), based on the retrieved documents (context), including citations.\n",
"\n",
"See the [Vectara API documentation](https://docs.vectara.com/docs/) for more information on how to use the API.\n",
"\n",
@ -25,19 +27,19 @@
"id": "68e75fb9",
"metadata": {},
"source": [
"# Setup\n",
"# Getting Started\n",
"\n",
"You will need a `Vectara` account to use `Vectara` with `LangChain`. To get started, use the following steps (see our [quickstart](https://docs.vectara.com/docs/quickstart) guide):\n",
"1. [Sign up](https://console.vectara.com/signup) for a `Vectara` account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingesting from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Authorization\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"To get started, use the following steps:\n",
"1. If you don't already have one, [Sign up](https://www.vectara.com/integrations/langchain) for your free Vectara account. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingest from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Access Control\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query-only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"\n",
"To use LangChain with Vectara, you need three values: customer ID, corpus ID and api_key.\n",
"To use LangChain with Vectara, you'll need to have these three values: `customer ID`, `corpus ID` and `api_key`.\n",
"You can provide those to LangChain in two ways:\n",
"\n",
"1. Include in your environment these three variables: `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`.\n",
"\n",
"> For example, you can set these variables using `os.environ` and `getpass` as follows:\n",
" For example, you can set these variables using os.environ and getpass as follows:\n",
"\n",
"```python\n",
"import os\n",
@ -48,17 +50,18 @@
"os.environ[\"VECTARA_API_KEY\"] = getpass.getpass(\"Vectara API Key:\")\n",
"```\n",
"\n",
"1. Provide them as arguments when creating the `Vectara` vectorstore object:\n",
"2. Add them to the `Vectara` vectorstore constructor:\n",
"\n",
"```python\n",
"vectorstore = Vectara(\n",
"vectara = Vectara(\n",
" vectara_customer_id=vectara_customer_id,\n",
" vectara_corpus_id=vectara_corpus_id,\n",
" vectara_api_key=vectara_api_key\n",
" )\n",
"```\n",
"In this notebook we assume they are provided in the environment.\n",
"\n",
"**Note:** The self-query retriever requires you to have `lark` installed (`pip install lark`). "
"**Notes:** The self-query retriever requires you to have `lark` installed (`pip install lark`). "
]
},
{
@ -68,34 +71,44 @@
"source": [
"## Connecting to Vectara from LangChain\n",
"\n",
"In this example, we assume that you've created an account and a corpus, and added your VECTARA_CUSTOMER_ID, VECTARA_CORPUS_ID and VECTARA_API_KEY (created with permissions for both indexing and query) as environment variables.\n",
"In this example, we assume that you've created an account and a corpus, and added your `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY` (created with permissions for both indexing and query) as environment variables.\n",
"\n",
"The corpus has 4 fields defined as metadata for filtering: year, director, rating, and genre\n"
"We further assume the corpus has 4 fields defined as filterable metadata attributes: `year`, `director`, `rating`, and `genre`"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cb4a5787",
"metadata": {
"tags": []
},
"execution_count": 1,
"id": "9d3aa44f",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import ConversationalRetrievalChain\n",
"import os\n",
"\n",
"os.environ[\"VECTARA_API_KEY\"] = \"<YOUR_VECTARA_API_KEY>\"\n",
"os.environ[\"VECTARA_CORPUS_ID\"] = \"<YOUR_VECTARA_CORPUS_ID>\"\n",
"os.environ[\"VECTARA_CUSTOMER_ID\"] = \"<YOUR_VECTARA_CUSTOMER_ID>\"\n",
"\n",
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_community.document_loaders import TextLoader\n",
"from langchain_community.embeddings import FakeEmbeddings\n",
"from langchain.schema import Document\n",
"from langchain_community.vectorstores import Vectara\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAI\n",
"from langchain_text_splitters import CharacterTextSplitter"
"from langchain_openai.chat_models import ChatOpenAI"
]
},
{
"cell_type": "markdown",
"id": "13a6be33-de3c-4628-acc8-b94102c275b7",
"metadata": {},
"source": [
"## Dataset\n",
"\n",
"We first define an example dataset of movie, and upload those to the corpus, along with the metadata:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "bcbe04d9",
"metadata": {
"tags": []
@ -136,11 +149,7 @@
"\n",
"vectara = Vectara()\n",
"for doc in docs:\n",
" vectara.add_texts(\n",
" [doc.page_content],\n",
" embedding=FakeEmbeddings(size=768),\n",
" doc_metadata=doc.metadata,\n",
" )"
" vectara.add_texts([doc.page_content], doc_metadata=doc.metadata)"
]
},
{
@ -148,23 +157,21 @@
"id": "5ecaab6d",
"metadata": {},
"source": [
"## Creating our self-querying retriever\n",
"Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents."
"## Creating the self-querying retriever\n",
"Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents.\n",
"\n",
"We then provide an llm (in this case OpenAI) and the `vectara` vectorstore as arguments:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "86e34dbf",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",
"metadata_field_info = [\n",
" AttributeInfo(\n",
" name=\"genre\",\n",
@ -186,7 +193,7 @@
" ),\n",
"]\n",
"document_content_description = \"Brief summary of a movie\"\n",
"llm = OpenAI(temperature=0)\n",
"llm = ChatOpenAI(temperature=0, model=\"gpt-4o\", max_tokens=4069)\n",
"retriever = SelfQueryRetriever.from_llm(\n",
" llm, vectara, document_content_description, metadata_field_info, verbose=True\n",
")"
@ -197,13 +204,13 @@
"id": "ea9df8d4",
"metadata": {},
"source": [
"## Testing it out\n",
"## Self-retrieval Queries\n",
"And now we can try actually using our retriever!"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "38a126e9",
"metadata": {},
"outputs": [
@ -211,26 +218,26 @@
"data": {
"text/plain": [
"[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'lang': 'eng', 'offset': '0', 'len': '66', 'year': '1993', 'rating': '7.7', 'genre': 'science fiction', 'source': 'langchain'}),\n",
" Document(page_content='Toys come alive and have a blast doing so', metadata={'lang': 'eng', 'offset': '0', 'len': '41', 'year': '1995', 'genre': 'animated', 'source': 'langchain'}),\n",
" Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'lang': 'eng', 'offset': '0', 'len': '116', 'year': '2006', 'director': 'Satoshi Kon', 'rating': '8.6', 'source': 'langchain'}),\n",
" Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'lang': 'eng', 'offset': '0', 'len': '76', 'year': '2010', 'director': 'Christopher Nolan', 'rating': '8.2', 'source': 'langchain'}),\n",
" Document(page_content='Toys come alive and have a blast doing so', metadata={'lang': 'eng', 'offset': '0', 'len': '41', 'year': '1995', 'genre': 'animated', 'source': 'langchain'}),\n",
" Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'lang': 'eng', 'offset': '0', 'len': '60', 'year': '1979', 'rating': '9.9', 'director': 'Andrei Tarkovsky', 'genre': 'science fiction', 'source': 'langchain'}),\n",
" Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'lang': 'eng', 'offset': '0', 'len': '82', 'year': '2019', 'director': 'Greta Gerwig', 'rating': '8.3', 'source': 'langchain'}),\n",
" Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'lang': 'eng', 'offset': '0', 'len': '60', 'year': '1979', 'rating': '9.9', 'director': 'Andrei Tarkovsky', 'genre': 'science fiction', 'source': 'langchain'})]"
" Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'lang': 'eng', 'offset': '0', 'len': '76', 'year': '2010', 'director': 'Christopher Nolan', 'rating': '8.2', 'source': 'langchain'})]"
]
},
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This example only specifies a relevant query\n",
"retriever.invoke(\"What are some movies about dinosaurs\")"
"retriever.invoke(\"What are movies about scientists\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "fc3f1e6e",
"metadata": {},
"outputs": [
@ -241,7 +248,7 @@
" Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'lang': 'eng', 'offset': '0', 'len': '60', 'year': '1979', 'rating': '9.9', 'director': 'Andrei Tarkovsky', 'genre': 'science fiction', 'source': 'langchain'})]"
]
},
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@ -253,7 +260,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"id": "b19d4da0",
"metadata": {},
"outputs": [
@ -263,7 +270,7 @@
"[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'lang': 'eng', 'offset': '0', 'len': '82', 'year': '2019', 'director': 'Greta Gerwig', 'rating': '8.3', 'source': 'langchain'})]"
]
},
"execution_count": 7,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@ -275,17 +282,18 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"id": "f900e40e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'lang': 'eng', 'offset': '0', 'len': '60', 'year': '1979', 'rating': '9.9', 'director': 'Andrei Tarkovsky', 'genre': 'science fiction', 'source': 'langchain'})]"
"[Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'lang': 'eng', 'offset': '0', 'len': '116', 'year': '2006', 'director': 'Satoshi Kon', 'rating': '8.6', 'source': 'langchain'}),\n",
" Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'lang': 'eng', 'offset': '0', 'len': '60', 'year': '1979', 'rating': '9.9', 'director': 'Andrei Tarkovsky', 'genre': 'science fiction', 'source': 'langchain'})]"
]
},
"execution_count": 8,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@ -297,17 +305,18 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"id": "12a51522",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Toys come alive and have a blast doing so', metadata={'lang': 'eng', 'offset': '0', 'len': '41', 'year': '1995', 'genre': 'animated', 'source': 'langchain'})]"
"[Document(page_content='Toys come alive and have a blast doing so', metadata={'lang': 'eng', 'offset': '0', 'len': '41', 'year': '1995', 'genre': 'animated', 'source': 'langchain'}),\n",
" Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'lang': 'eng', 'offset': '0', 'len': '66', 'year': '1993', 'rating': '7.7', 'genre': 'science fiction', 'source': 'langchain'})]"
]
},
"execution_count": 9,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@ -333,7 +342,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 9,
"id": "bff36b88-b506-4877-9c63-e5a1a8d78e64",
"metadata": {
"tags": []
@ -350,9 +359,17 @@
")"
]
},
{
"cell_type": "markdown",
"id": "00e8baad-a9d7-4498-bd8d-ca41d0691386",
"metadata": {},
"source": [
"This is cool, we can include the number of results we would like to see in the query and the self retriever would correctly understand it. For example, let's look for "
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"id": "2758d229-4f97-499c-819f-888acaf8ee10",
"metadata": {
"tags": []
@ -361,19 +378,27 @@
{
"data": {
"text/plain": [
"[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'lang': 'eng', 'offset': '0', 'len': '66', 'year': '1993', 'rating': '7.7', 'genre': 'science fiction', 'source': 'langchain'}),\n",
" Document(page_content='Toys come alive and have a blast doing so', metadata={'lang': 'eng', 'offset': '0', 'len': '41', 'year': '1995', 'genre': 'animated', 'source': 'langchain'})]"
"[Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'lang': 'eng', 'offset': '0', 'len': '116', 'year': '2006', 'director': 'Satoshi Kon', 'rating': '8.6', 'source': 'langchain'}),\n",
" Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'lang': 'eng', 'offset': '0', 'len': '60', 'year': '1979', 'rating': '9.9', 'director': 'Andrei Tarkovsky', 'genre': 'science fiction', 'source': 'langchain'})]"
]
},
"execution_count": 11,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This example only specifies a relevant query\n",
"retriever.invoke(\"what are two movies about dinosaurs\")"
"retriever.invoke(\"what are two movies with a rating above 8.5\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed4b9dbc-e3cd-442d-b108-705295f51fa1",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@ -392,7 +417,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.11.8"
}
},
"nbformat": 4,

@ -2,22 +2,20 @@
"cells": [
{
"cell_type": "markdown",
"id": "683953b3",
"id": "559f8e0e",
"metadata": {},
"source": [
"# Vectara\n",
"\n",
">[Vectara](https://vectara.com/) is the trusted GenAI platform that provides an easy-to-use API for document indexing and querying. \n",
"[Vectara](https://vectara.com/) provides a Trusted Generative AI platform, allowing organizations to rapidly create a ChatGPT-like experience (an AI assistant) which is grounded in the data, documents, and knowledge that they have (technically, it is Retrieval-Augmented-Generation-as-a-service). \n",
"\n",
"Vectara provides an end-to-end managed service for Retrieval Augmented Generation or [RAG](https://vectara.com/grounded-generation/), which includes:\n",
"\n",
"1. A way to extract text from document files and chunk them into sentences.\n",
"\n",
"2. The state-of-the-art [Boomerang](https://vectara.com/how-boomerang-takes-retrieval-augmented-generation-to-the-next-level-via-grounded-generation/) embeddings model. Each text chunk is encoded into a vector embedding using Boomerang, and stored in the Vectara internal knowledge (vector+text) store\n",
"\n",
"3. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) and [MMR](https://vectara.com/get-diverse-results-and-comprehensive-summaries-with-vectaras-mmr-reranker/))\n",
"\n",
"4. An option to create [generative summary](https://docs.vectara.com/docs/learn/grounded-generation/grounded-generation-overview), based on the retrieved documents, including citations.\n",
"Vectara serverless RAG-as-a-service provides all the components of RAG behind an easy-to-use API, including:\n",
"1. A way to extract text from files (PDF, PPT, DOCX, etc)\n",
"2. ML-based chunking that provides state of the art performance.\n",
"3. The [Boomerang](https://vectara.com/how-boomerang-takes-retrieval-augmented-generation-to-the-next-level-via-grounded-generation/) embeddings model.\n",
"4. Its own internal vector database where text chunks and embedding vectors are stored.\n",
"5. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) and [MMR](https://vectara.com/get-diverse-results-and-comprehensive-summaries-with-vectaras-mmr-reranker/))\n",
"7. An LLM to for creating a [generative summary](https://docs.vectara.com/docs/learn/grounded-generation/grounded-generation-overview), based on the retrieved documents (context), including citations.\n",
"\n",
"See the [Vectara API documentation](https://docs.vectara.com/docs/) for more information on how to use the API.\n",
"\n",
@ -28,25 +26,22 @@
},
{
"cell_type": "markdown",
"id": "dc0f4344",
"id": "e97dcf11",
"metadata": {},
"source": [
"# Setup\n",
"\n",
"You will need a Vectara account to use Vectara with LangChain. To get started, use the following steps:\n",
"\n",
"1. [Sign up](https://www.vectara.com/integrations/langchain) for a Vectara account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"# Getting Started\n",
"\n",
"To get started, use the following steps:\n",
"1. If you don't already have one, [Sign up](https://www.vectara.com/integrations/langchain) for your free Vectara account. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
"2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingest from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Access Control\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query-only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"\n",
"3. Next you'll need to create API keys to access the corpus. Click on the **\"Authorization\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
"\n",
"To use LangChain with Vectara, you'll need to have these three values: customer ID, corpus ID and api_key.\n",
"To use LangChain with Vectara, you'll need to have these three values: `customer ID`, `corpus ID` and `api_key`.\n",
"You can provide those to LangChain in two ways:\n",
"\n",
"1. Include in your environment these three variables: `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`.\n",
"\n",
"> For example, you can set these variables using os.environ and getpass as follows:\n",
" For example, you can set these variables using os.environ and getpass as follows:\n",
"\n",
"```python\n",
"import os\n",
@ -57,480 +52,304 @@
"os.environ[\"VECTARA_API_KEY\"] = getpass.getpass(\"Vectara API Key:\")\n",
"```\n",
"\n",
"2. Add them to the Vectara vectorstore constructor:\n",
"2. Add them to the `Vectara` vectorstore constructor:\n",
"\n",
"```python\n",
"vectorstore = Vectara(\n",
"vectara = Vectara(\n",
" vectara_customer_id=vectara_customer_id,\n",
" vectara_corpus_id=vectara_corpus_id,\n",
" vectara_api_key=vectara_api_key\n",
" )\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "eeead681",
"metadata": {},
"source": [
"## Connecting to Vectara from LangChain\n",
"```\n",
"\n",
"To get started, let's ingest the documents using the from_documents() method.\n",
"We assume here that you've added your VECTARA_CUSTOMER_ID, VECTARA_CORPUS_ID and query+indexing VECTARA_API_KEY as environment variables."
"In this notebook we assume they are provided in the environment."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "04a1f1a0",
"id": "aac7a9a6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders import TextLoader\n",
"from langchain_community.embeddings.fake import FakeEmbeddings\n",
"import os\n",
"\n",
"os.environ[\"VECTARA_API_KEY\"] = \"<YOUR_VECTARA_API_KEY>\"\n",
"os.environ[\"VECTARA_CORPUS_ID\"] = \"<YOUR_VECTARA_CORPUS_ID>\"\n",
"os.environ[\"VECTARA_CUSTOMER_ID\"] = \"<YOUR_VECTARA_CUSTOMER_ID>\"\n",
"\n",
"from langchain_community.vectorstores import Vectara\n",
"from langchain_text_splitters import CharacterTextSplitter"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "be0a4973",
"metadata": {},
"outputs": [],
"source": [
"loader = TextLoader(\"state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8429667e",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:22.525091Z",
"start_time": "2023-04-04T10:51:22.522015Z"
},
"tags": []
},
"outputs": [],
"source": [
"vectara = Vectara.from_documents(\n",
" docs,\n",
" embedding=FakeEmbeddings(size=768),\n",
" doc_metadata={\"speech\": \"state-of-the-union\"},\n",
"from langchain_community.vectorstores.vectara import (\n",
" RerankConfig,\n",
" SummaryConfig,\n",
" VectaraQueryConfig,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "90dbf3e7",
"id": "875ffb7e",
"metadata": {},
"source": [
"Vectara's indexing API provides a file upload API where the file is handled directly by Vectara - pre-processed, chunked optimally and added to the Vectara vector store.\n",
"To use this, we added the add_files() method (as well as from_files()). \n",
"First we load the state-of-the-union text into Vectara. \n",
"\n",
"Let's see this in action. We pick two PDF documents to upload: \n",
"Note that we use the `from_files` interface which does not require any local processing or chunking - Vectara receives the file content and performs all the necessary pre-processing, chunking and embedding of the file into its knowledge store.\n",
"\n",
"1. The \"I have a dream\" speech by Dr. King\n",
"2. Churchill's \"We Shall Fight on the Beaches\" speech"
"In this case it uses a `.txt` file but the same works for many other [file types](https://docs.vectara.com/docs/api-reference/indexing-apis/file-upload/file-upload-filetypes)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "85ef3468",
"execution_count": 2,
"id": "be0a4973",
"metadata": {},
"outputs": [],
"source": [
"import tempfile\n",
"import urllib.request\n",
"\n",
"urls = [\n",
" [\n",
" \"https://www.gilderlehrman.org/sites/default/files/inline-pdfs/king.dreamspeech.excerpts.pdf\",\n",
" \"I-have-a-dream\",\n",
" ],\n",
" [\n",
" \"https://www.parkwayschools.net/cms/lib/MO01931486/Centricity/Domain/1578/Churchill_Beaches_Speech.pdf\",\n",
" \"we shall fight on the beaches\",\n",
" ],\n",
"]\n",
"files_list = []\n",
"for url, _ in urls:\n",
" name = tempfile.NamedTemporaryFile().name\n",
" urllib.request.urlretrieve(url, name)\n",
" files_list.append(name)\n",
"\n",
"docsearch: Vectara = Vectara.from_files(\n",
" files=files_list,\n",
" embedding=FakeEmbeddings(size=768),\n",
" metadatas=[{\"url\": url, \"speech\": title} for url, title in urls],\n",
")"
"vectara = Vectara.from_files([\"state_of_the_union.txt\"])"
]
},
{
"cell_type": "markdown",
"id": "1f9215c8",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T09:27:29.920258Z",
"start_time": "2023-04-04T09:27:29.913714Z"
}
},
"id": "22a6b953",
"metadata": {},
"source": [
"## Similarity search\n",
"## Basic Vectara RAG (retrieval augmented generation)\n",
"\n",
"The simplest scenario for using Vectara is to perform a similarity search. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a8c513ab",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.204469Z",
"start_time": "2023-04-04T10:51:24.855618Z"
},
"tags": []
},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"found_docs = vectara.similarity_search(\n",
" query, n_sentence_context=0, filter=\"doc.speech = 'state-of-the-union'\"\n",
")"
"We now create a `VectaraQueryConfig` object to control the retrieval and summarization options:\n",
"* We enable summarization, specifying we would like the LLM to pick the top 7 matching chunks and respond in English\n",
"* We enable MMR (max marginal relevance) in the retrieval process, with a 0.2 diversity bias factor\n",
"* We want the top-10 results, with hybrid search configured with a value of 0.025\n",
"\n",
"Using this configuration, let's create a LangChain `Runnable` object that encpasulates the full Vectara RAG pipeline, using the `as_rag` method:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "53324492",
"execution_count": 3,
"id": "9ecda054-96a8-4a91-aeae-32006efb1ac8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '596', 'len': '97', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='In this struggle as President Zelenskyy said in his speech to the European Parliament “Light will win over darkness.”', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '141', 'len': '117', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='As Ohio Senator Sherrod Brown says, “Its time to bury the label “Rust Belt.”', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '0', 'len': '77', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='Last month, I announced our plan to supercharge \\nthe Cancer Moonshot that President Obama asked me to lead six years ago.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '0', 'len': '122', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='He thought he could roll into Ukraine and the world would roll over.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '664', 'len': '68', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='Thats why one of the first things I did as President was fight to pass the American Rescue Plan.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '314', 'len': '97', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='And he thought he could divide us at home.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '160', 'len': '42', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='He met the Ukrainian people.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '788', 'len': '28', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='He thought the West and NATO wouldnt respond.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '113', 'len': '46', 'speech': 'state-of-the-union'}),\n",
" Document(page_content='In this Capitol, generation after generation, Americans have debated great questions amid great strife, and have done great things.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '772', 'len': '131', 'speech': 'state-of-the-union'})]"
"\"Biden addressed various topics in his statements. He highlighted the need to confront Putin by building a coalition of nations[1]. He also expressed commitment to investigating the impact of burn pits on soldiers' health, including his son's case[2]. Additionally, Biden outlined a plan to fight inflation by cutting prescription drug costs[3]. He emphasized the importance of continuing to combat COVID-19 and not just accepting living with it[4]. Furthermore, he discussed measures to weaken Russia economically and target Russian oligarchs[6]. Biden also advocated for passing the Equality Act to support LGBTQ+ Americans and condemned state laws targeting transgender individuals[7].\""
]
},
"execution_count": 6,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"found_docs"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "fc516993",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.220984Z",
"start_time": "2023-04-04T10:51:25.213943Z"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson.\n"
]
}
],
"source": [
"print(found_docs[0].page_content)"
"summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang=\"eng\")\n",
"rerank_config = RerankConfig(reranker=\"mmr\", rerank_k=50, mmr_diversity_bias=0.2)\n",
"config = VectaraQueryConfig(\n",
" k=10, lambda_val=0.005, rerank_config=rerank_config, summary_config=summary_config\n",
")\n",
"\n",
"query_str = \"what did Biden say?\"\n",
"\n",
"rag = vectara.as_rag(config)\n",
"rag.invoke(query_str)[\"answer\"]"
]
},
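{
"cell_type": "markdown",
"id": "3f2a9c41",
"metadata": {},
"source": [
"Note that `as_rag` returns a standard LangChain `Runnable`, so it can be composed with other runnables. As a minimal sketch (reusing the `rag` and `query_str` objects defined above), you could pipe it into a small function that keeps only the answer text:\n",
"\n",
"```python\n",
"answer_only = rag | (lambda resp: resp[\"answer\"])  # keep only the generated answer\n",
"answer_only.invoke(query_str)\n",
"```"
]
},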
{
"cell_type": "markdown",
"id": "1bda9bf5",
"id": "cd825d63-93a0-4e45-a455-bfabb01ee1a1",
"metadata": {},
"source": [
"## Similarity search with score\n",
"\n",
"Sometimes we might want to perform the search, but also obtain a relevancy score to know how good is a particular result."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "8804a21d",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.631585Z",
"start_time": "2023-04-04T10:51:25.227384Z"
}
},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"found_docs = vectara.similarity_search_with_score(\n",
" query,\n",
" filter=\"doc.speech = 'state-of-the-union'\",\n",
" score_threshold=0.2,\n",
")"
"We can also use the streaming interface like this:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "756a6887",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.642282Z",
"start_time": "2023-04-04T10:51:25.635947Z"
}
},
"execution_count": 4,
"id": "27f01330-8917-4eff-b603-59ab2571a4d2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence. A former top litigator in private practice.\n",
"\n",
"Score: 0.74179757\n"
"Biden addressed various topics in his statements. He highlighted the importance of building coalitions to confront global challenges [1]. He also expressed commitment to investigating the impact of burn pits on soldiers' health, including his son's case [2, 4]. Additionally, Biden outlined his plan to combat inflation by cutting prescription drug costs and reducing the deficit, with support from Nobel laureates and business leaders [3]. He emphasized the ongoing fight against COVID-19 and the need to continue combating the virus [5]. Furthermore, Biden discussed measures taken to weaken Russia's economic and military strength, targeting Russian oligarchs and corrupt leaders [6]. He also advocated for passing the Equality Act to support LGBTQ+ Americans and address discriminatory state laws [7]."
]
}
],
"source": [
"document, score = found_docs[0]\n",
"print(document.page_content)\n",
"print(f\"\\nScore: {score}\")"
"output = {}\n",
"curr_key = None\n",
"for chunk in rag.stream(query_str):\n",
" for key in chunk:\n",
" if key not in output:\n",
" output[key] = chunk[key]\n",
" else:\n",
" output[key] += chunk[key]\n",
" if key == \"answer\":\n",
" print(chunk[key], end=\"\", flush=True)\n",
" curr_key = key"
]
},
{
"cell_type": "markdown",
"id": "1f9876a8",
"id": "7eaf871d-eba2-46b1-bfa3-b9c82947d2be",
"metadata": {},
"source": [
"Now let's do similar search for content in the files we uploaded"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "47784de5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"With this threshold of 1.2 we have 0 documents\n"
]
}
],
"source": [
"query = \"We must forever conduct our struggle\"\n",
"min_score = 1.2\n",
"found_docs = vectara.similarity_search_with_score(\n",
" query,\n",
" filter=\"doc.speech = 'I-have-a-dream'\",\n",
" score_threshold=min_score,\n",
")\n",
"print(f\"With this threshold of {min_score} we have {len(found_docs)} documents\")"
"## Hallucination detection and Factual Consistency Score\n",
"\n",
"Vectara created [HHEM](https://huggingface.co/vectara/hallucination_evaluation_model) - an open source model that can be used to evaluate RAG responses for factual consistency. \n",
"\n",
"As part of the Vectara RAG, the \"Factual Consistency Score\" (or FCS), which is an improved version of the open source HHEM is made available via the API. This is automatically included in the output of the RAG pipeline"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "29f465e5",
"execution_count": 5,
"id": "b2e0aa2c-7c8e-4d79-8abc-66f5a1f961b3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"With this threshold of 0.2 we have 10 documents\n"
"Biden addressed various topics in his statements. He highlighted the need to confront Putin by building a coalition of nations[1]. He also expressed his commitment to investigating the impact of burn pits on soldiers' health, referencing his son's experience[2]. Additionally, Biden discussed his plan to fight inflation by cutting prescription drug costs and garnering support from Nobel laureates and business leaders[4]. Furthermore, he emphasized the importance of continuing to combat COVID-19 and not merely accepting living with the virus[5]. Biden's remarks encompassed international relations, healthcare challenges faced by soldiers, economic strategies, and the ongoing battle against the pandemic.\n",
"Vectara FCS = 0.41796625\n"
]
}
],
"source": [
"query = \"We must forever conduct our struggle\"\n",
"min_score = 0.2\n",
"found_docs = vectara.similarity_search_with_score(\n",
" query,\n",
" filter=\"doc.speech = 'I-have-a-dream'\",\n",
" score_threshold=min_score,\n",
"summary_config = SummaryConfig(is_enabled=True, max_results=5, response_lang=\"eng\")\n",
"rerank_config = RerankConfig(reranker=\"mmr\", rerank_k=50, mmr_diversity_bias=0.1)\n",
"config = VectaraQueryConfig(\n",
" k=10, lambda_val=0.005, rerank_config=rerank_config, summary_config=summary_config\n",
")\n",
"print(f\"With this threshold of {min_score} we have {len(found_docs)} documents\")"
"\n",
"rag = vectara.as_rag(config)\n",
"resp = rag.invoke(query_str)\n",
"print(resp[\"answer\"])\n",
"print(f\"Vectara FCS = {resp['fcs']}\")"
]
},
{
"cell_type": "markdown",
"id": "471112c0",
"id": "b651396a-5726-4d49-bacf-c9d7a5ddcf7a",
"metadata": {},
"source": [
"MMR is an important retrieval capability for many applications, whereby search results feeding your GenAI application are reranked to improve diversity of results. \n",
"## Vectara as a langchain retreiver\n",
"\n",
"Let's see how that works with Vectara:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "5d597e91",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Economic assistance.\n",
"\n",
"Grow the workforce. Build the economy from the bottom up \n",
"and the middle out, not from the top down.\n",
"\n",
"When we invest in our workers, when we build the economy from the bottom up and the middle out together, we can do something we havent done in a long time: build a better America.\n",
"\n",
"Our economy grew at a rate of 5.7% last year, the strongest growth in nearly 40 years, the first step in bringing fundamental change to an economy that hasnt worked for the working people of this nation for too long.\n",
"\n",
"Economists call it “increasing the productive capacity of our economy.”\n"
]
}
],
"source": [
"query = \"state of the economy\"\n",
"found_docs = vectara.similarity_search(\n",
" query,\n",
" n_sentence_context=0,\n",
" filter=\"doc.speech = 'state-of-the-union'\",\n",
" k=5,\n",
" mmr_config={\"is_enabled\": True, \"mmr_k\": 50, \"diversity_bias\": 0.0},\n",
")\n",
"print(\"\\n\\n\".join([x.page_content for x in found_docs]))"
"The Vectara component can also be used just as a retriever. \n",
"\n",
"In this case, it behaves just like any other LangChain retriever. The main use of this mode is for semantic search, and in this case we disable summarization:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "be2b2326",
"execution_count": 6,
"id": "19cd2f86",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Economic assistance.\n",
"\n",
"The Russian stock market has lost 40% of its value and trading remains suspended.\n",
"\n",
"But that trickle-down theory led to weaker economic growth, lower wages, bigger deficits, and the widest gap between those at the top and everyone else in nearly a century.\n",
"\n",
"In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections.\n",
"\n",
"The federal government spends about $600 Billion a year to keep the country safe and secure.\n"
]
"data": {
"text/plain": [
"[Document(page_content='He thought the West and NATO wouldnt respond. And he thought he could divide us at home. We were ready. Here is what we did. We prepared extensively and carefully. We spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin.', metadata={'lang': 'eng', 'section': '1', 'offset': '2160', 'len': '36', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='When they came home, many of the worlds fittest and best trained warriors were never the same. Dizziness. \\n\\nA cancer that would put them in a flag-draped coffin. I know. \\n\\nOne of those soldiers was my son Major Beau Biden. We dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops. But Im committed to finding out everything we can.', metadata={'lang': 'eng', 'section': '1', 'offset': '34652', 'len': '60', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='But cancer from prolonged exposure to burn pits ravaged Heaths lungs and body. Danielle says Heath was a fighter to the very end. He didnt know how to stop fighting, and neither did she. Through her pain she found purpose to demand we do better. Tonight, Danielle—we are.', metadata={'lang': 'eng', 'section': '1', 'offset': '35442', 'len': '57', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'})]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"state of the economy\"\n",
"found_docs = vectara.similarity_search(\n",
" query,\n",
" n_sentence_context=0,\n",
" filter=\"doc.speech = 'state-of-the-union'\",\n",
" k=5,\n",
" mmr_config={\"is_enabled\": True, \"mmr_k\": 50, \"diversity_bias\": 1.0},\n",
")\n",
"print(\"\\n\\n\".join([x.page_content for x in found_docs]))"
]
},
{
"cell_type": "markdown",
"id": "10c1427e",
"metadata": {},
"source": [
"As you can see, in the first example diversity_bias was set to 0.0 (equivalent to diversity reranking disabled), which resulted in a the top-5 most relevant documents. With diversity_bias=1.0 we maximize diversity and as you can see the resulting top documents are much more diverse in their semantic meanings."
"config.summary_config.is_enabled = False\n",
"config.k = 3\n",
"retriever = vectara.as_retriever(config=config)\n",
"retriever.invoke(query_str)"
]
},
{
"cell_type": "markdown",
"id": "691a82d6",
"id": "c49284ed",
"metadata": {},
"source": [
"## Vectara as a Retriever\n",
"\n",
"Finally let's see how to use Vectara with the `as_retriever()` interface:"
"For backwards compatibility, you can also enable summarization with a retriever, in which case the summary is added as an additional Document object:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "9427195f",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:26.031451Z",
"start_time": "2023-04-04T10:51:26.018763Z"
}
},
"execution_count": 7,
"id": "59268e9a-6089-4bb2-8c61-1ea6b956f83c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"VectorStoreRetriever(tags=['Vectara'], vectorstore=<langchain_community.vectorstores.vectara.Vectara object at 0x109a3c760>)"
"[Document(page_content='He thought the West and NATO wouldnt respond. And he thought he could divide us at home. We were ready. Here is what we did. We prepared extensively and carefully. We spent months building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin.', metadata={'lang': 'eng', 'section': '1', 'offset': '2160', 'len': '36', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='When they came home, many of the worlds fittest and best trained warriors were never the same. Dizziness. \\n\\nA cancer that would put them in a flag-draped coffin. I know. \\n\\nOne of those soldiers was my son Major Beau Biden. We dont know for sure if a burn pit was the cause of his brain cancer, or the diseases of so many of our troops. But Im committed to finding out everything we can.', metadata={'lang': 'eng', 'section': '1', 'offset': '34652', 'len': '60', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content='But cancer from prolonged exposure to burn pits ravaged Heaths lungs and body. Danielle says Heath was a fighter to the very end. He didnt know how to stop fighting, and neither did she. Through her pain she found purpose to demand we do better. Tonight, Danielle—we are.', metadata={'lang': 'eng', 'section': '1', 'offset': '35442', 'len': '57', 'X-TIKA:Parsed-By': 'org.apache.tika.parser.csv.TextAndCSVParser', 'Content-Encoding': 'UTF-8', 'Content-Type': 'text/plain; charset=UTF-8', 'source': 'vectara'}),\n",
" Document(page_content=\"Biden discussed various topics in his statements. He highlighted the importance of unity and preparation to confront challenges, such as building coalitions to address global issues [1]. Additionally, he shared personal stories about the impact of health issues on soldiers, including his son's experience with brain cancer possibly linked to burn pits [2]. Biden also outlined his plans to combat inflation by cutting prescription drug costs and emphasized the ongoing efforts to combat COVID-19, rejecting the idea of merely living with the virus [4, 5]. Overall, Biden's messages revolved around unity, healthcare challenges faced by soldiers, economic plans, and the ongoing fight against COVID-19.\", metadata={'summary': True, 'fcs': 0.54751414})]"
]
},
"execution_count": 14,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever = vectara.as_retriever()\n",
"retriever"
"config.summary_config.is_enabled = True\n",
"config.k = 3\n",
"retriever = vectara.as_retriever(config=config)\n",
"retriever.invoke(query_str)"
]
},
{
"cell_type": "markdown",
"id": "8f16bf8d",
"metadata": {},
"source": [
"## Advanced LangChain query pre-processing with Vectara\n",
"\n",
"Vectara's \"RAG as a service\" does a lot of the heavy lifting in creating question answering or chatbot chains. The integration with LangChain provides the option to use additional capabilities such as query pre-processing like `SelfQueryRetriever` or `MultiQueryRetriever`. Let's look at an example of using the [MultiQueryRetriever](https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever).\n",
"\n",
"Since MQR uses an LLM we have to set that up - here we choose `ChatOpenAI`:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f3c70c31",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:26.495652Z",
"start_time": "2023-04-04T10:51:26.046407Z"
},
"scrolled": false
},
"execution_count": 8,
"id": "e14325b9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence. A former top litigator in private practice.', metadata={'source': 'langchain', 'lang': 'eng', 'offset': '596', 'len': '97', 'speech': 'state-of-the-union'})"
"\"Biden's statement highlighted his efforts to unite freedom-loving nations against Putin's aggression, sharing information in advance to counter Russian lies and hold Putin accountable[1]. Additionally, he emphasized his commitment to military families, like Danielle Robinson, and outlined plans for more affordable housing, Pre-K for 3- and 4-year-olds, and ensuring no additional taxes for those earning less than $400,000 a year[2][3]. The statement also touched on the readiness of the West and NATO to respond to Putin's actions, showcasing extensive preparation and coalition-building efforts[4]. Heath Robinson's story, a combat medic who succumbed to cancer from burn pits, was used to illustrate the resilience and fight for better conditions[5].\""
]
},
"execution_count": 15,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"retriever.invoke(query)[0]"
"from langchain.retrievers.multi_query import MultiQueryRetriever\n",
"from langchain_openai.chat_models import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(temperature=0)\n",
"mqr = MultiQueryRetriever.from_llm(retriever=retriever, llm=llm)\n",
"\n",
"\n",
"def get_summary(documents):\n",
" return documents[-1].page_content\n",
"\n",
"\n",
"(mqr | get_summary).invoke(query_str)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2300e785",
"id": "8060a423-b291-4166-8fd7-ba0e01692b51",
"metadata": {},
"outputs": [],
"source": []
@ -552,7 +371,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.11.8"
}
},
"nbformat": 4,

@ -3,18 +3,25 @@ from __future__ import annotations
import json
import logging
import os
import warnings
from dataclasses import dataclass, field
from hashlib import md5
from typing import Any, Iterable, List, Optional, Tuple, Type
from typing import Any, Iterable, Iterator, List, Optional, Tuple, Type
import requests
from langchain_core.callbacks.manager import (
CallbackManagerForRetrieverRun,
)
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_core.pydantic_v1 import Field
from langchain_core.runnables import Runnable, RunnableConfig
from langchain_core.vectorstores import VectorStore, VectorStoreRetriever
logger = logging.getLogger(__name__)
MMR_RERANKER_ID = 272725718
RERANKER_MULTILINGUAL_V1_ID = 272725719
@dataclass
class SummaryConfig:
@ -31,11 +38,13 @@ class SummaryConfig:
max_results: int = 7
response_lang: str = "eng"
prompt_name: str = "vectara-summary-ext-v1.2.0"
stream: bool = False
@dataclass
class MMRConfig:
"""Configuration for Maximal Marginal Relevance (MMR) search.
This will soon be deprecated in favor of RerankConfig.
is_enabled: True if MMR is enabled, False otherwise
mmr_k: number of results to fetch for MMR, defaults to 50
@ -53,6 +62,26 @@ class MMRConfig:
diversity_bias: float = 0.3
@dataclass
class RerankConfig:
"""Configuration for Reranker.
reranker: "mmr", "rerank_multilingual_v1" or "none"
rerank_k: number of results to fetch before reranking, defaults to 50
mmr_diversity_bias: for MMR only - a number between 0 and 1 that determines
the degree of diversity among the results with 0 corresponding
to minimum diversity and 1 to maximum diversity.
Defaults to 0.3.
Note: mmr_diversity_bias is equivalent to 1 - lambda_mult
where lambda_mult is the value often used in max_marginal_relevance_search()
We chose to use that since we believe it's more intuitive to the user.
"""
reranker: str = "none"
rerank_k: int = 50
mmr_diversity_bias: float = 0.3
@dataclass
class VectaraQueryConfig:
"""Configuration for Vectara query.
@ -66,9 +95,11 @@ class VectaraQueryConfig:
score_threshold: minimal score threshold for the result.
If defined, results with score less than this value will be
filtered out.
n_sentence_context: number of sentences before/after the matching segment
n_sentence_before: number of sentences before the matching segment
to add, defaults to 2
n_sentence_after: number of sentences after the matching segment
to add, defaults to 2
mmr_config: MMRConfig configuration dataclass
rerank_config: RerankConfig configuration dataclass
summary_config: SummaryConfig configuration dataclass
"""
@ -76,10 +107,63 @@ class VectaraQueryConfig:
lambda_val: float = 0.0
filter: str = ""
score_threshold: Optional[float] = None
n_sentence_context: int = 2
mmr_config: MMRConfig = field(default_factory=MMRConfig)
n_sentence_before: int = 2
n_sentence_after: int = 2
rerank_config: RerankConfig = field(default_factory=RerankConfig)
summary_config: SummaryConfig = field(default_factory=SummaryConfig)
def __init__(
self,
k: int = 10,
lambda_val: float = 0.0,
filter: str = "",
score_threshold: Optional[float] = None,
n_sentence_before: int = 2,
n_sentence_after: int = 2,
n_sentence_context: Optional[int] = None,
mmr_config: Optional[MMRConfig] = None,
summary_config: Optional[SummaryConfig] = None,
rerank_config: Optional[RerankConfig] = None,
):
self.k = k
self.lambda_val = lambda_val
self.filter = filter
self.score_threshold = score_threshold
if summary_config:
self.summary_config = summary_config
else:
self.summary_config = SummaryConfig()
# handle n_sentence_context for backward compatibility
if n_sentence_context:
self.n_sentence_before = n_sentence_context
self.n_sentence_after = n_sentence_context
warnings.warn(
"n_sentence_context is deprecated. "
"Please use n_sentence_before and n_sentence_after instead",
DeprecationWarning,
)
else:
self.n_sentence_before = n_sentence_before
self.n_sentence_after = n_sentence_after
# handle mmr_config for backward compatibility
if rerank_config:
self.rerank_config = rerank_config
elif mmr_config:
self.rerank_config = RerankConfig(
reranker="mmr",
rerank_k=mmr_config.mmr_k,
mmr_diversity_bias=mmr_config.diversity_bias,
)
warnings.warn(
"MMRConfig is deprecated. Please use RerankConfig instead.",
DeprecationWarning,
)
else:
self.rerank_config = RerankConfig()
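# Illustrative sketch of the two construction paths handled by the __init__ above
# (all values are arbitrary examples). New code should pass n_sentence_before /
# n_sentence_after and rerank_config; the deprecated n_sentence_context and
# mmr_config arguments are still accepted and translated, with a DeprecationWarning.
new_style_config = VectaraQueryConfig(
    k=10,
    n_sentence_before=2,
    n_sentence_after=2,
    rerank_config=RerankConfig(reranker="mmr", rerank_k=50, mmr_diversity_bias=0.3),
    summary_config=SummaryConfig(is_enabled=True, max_results=5),
)
legacy_config = VectaraQueryConfig(
    k=10,
    n_sentence_context=2,  # mapped to n_sentence_before=2, n_sentence_after=2
    mmr_config=MMRConfig(is_enabled=True, mmr_k=50, diversity_bias=0.3),
)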
class Vectara(VectorStore):
"""`Vectara API` vector store.
@ -150,9 +234,7 @@ class Vectara(VectorStore):
Delete a document from the Vectara corpus.
Args:
url (str): URL of the page to delete.
doc_id (str): ID of the document to delete.
Returns:
bool: True if deletion was successful, False otherwise.
"""
@ -207,6 +289,21 @@ class Vectara(VectorStore):
else:
return "E_SUCCEEDED"
def delete(self, ids: Optional[List[str]] = None, **kwargs: Any) -> Optional[bool]:
"""Delete by vector ID or other criteria.
Args:
ids: List of ids to delete.
Returns:
Optional[bool]: True if deletion is successful,
False otherwise, None if not implemented.
"""
if ids:
success = [self._delete_doc(id) for id in ids]
return all(success)
else:
return True
def add_files(
self,
files_list: Iterable[str],
@ -317,69 +414,104 @@ class Vectara(VectorStore):
)
return [doc_id]
def vectara_query(
def _get_query_body(
self,
query: str,
config: VectaraQueryConfig,
chat: Optional[bool] = False,
chat_conv_id: Optional[str] = None,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Run a Vectara query
) -> dict:
"""Build the body for the API
Args:
query: Text to look up documents similar to.
config: VectaraQueryConfig object
Returns:
A list of k Documents matching the given query
If summary is enabled, last document is the summary text with 'summary'=True
A dictionary with the body of the query
"""
if isinstance(config.mmr_config, dict):
config.mmr_config = MMRConfig(**config.mmr_config)
if isinstance(config.rerank_config, dict):
config.rerank_config = RerankConfig(**config.rerank_config)
if isinstance(config.summary_config, dict):
config.summary_config = SummaryConfig(**config.summary_config)
data = {
body = {
"query": [
{
"query": query,
"start": 0,
"numResults": (
config.mmr_config.mmr_k
if config.mmr_config.is_enabled
config.rerank_config.rerank_k
if (
config.rerank_config.reranker
in ["mmr", "rerank_multilingual_v1"]
)
else config.k
),
"contextConfig": {
"sentencesBefore": config.n_sentence_context,
"sentencesAfter": config.n_sentence_context,
"sentencesBefore": config.n_sentence_before,
"sentencesAfter": config.n_sentence_after,
},
"corpusKey": [
{
"customerId": self._vectara_customer_id,
"corpusId": self._vectara_corpus_id,
"metadataFilter": config.filter,
"lexicalInterpolationConfig": {"lambda": config.lambda_val},
}
],
}
]
}
if config.mmr_config.is_enabled:
data["query"][0]["rerankingConfig"] = {
"rerankerId": 272725718,
"mmrConfig": {"diversityBias": config.mmr_config.diversity_bias},
if config.lambda_val > 0:
body["query"][0]["corpusKey"][0]["lexicalInterpolationConfig"] = { # type: ignore
"lambda": config.lambda_val
}
if config.rerank_config.reranker == "mmr":
body["query"][0]["rerankingConfig"] = {
"rerankerId": MMR_RERANKER_ID,
"mmrConfig": {"diversityBias": config.rerank_config.mmr_diversity_bias},
}
elif config.rerank_config.reranker == "rerank_multilingual_v1":
body["query"][0]["rerankingConfig"] = {
"rerankerId": RERANKER_MULTILINGUAL_V1_ID,
}
if config.summary_config.is_enabled:
data["query"][0]["summary"] = [
body["query"][0]["summary"] = [
{
"maxSummarizedResults": config.summary_config.max_results,
"responseLang": config.summary_config.response_lang,
"summarizerPromptName": config.summary_config.prompt_name,
}
]
if chat:
body["query"][0]["summary"][0]["chat"] = { # type: ignore
"store": True,
"conversationId": chat_conv_id,
}
return body
def vectara_query(
self,
query: str,
config: VectaraQueryConfig,
**kwargs: Any,
) -> List[Tuple[Document, float]]:
"""Run a Vectara query
Args:
query: Text to look up documents similar to.
config: VectaraQueryConfig object
Returns:
A list of k Documents matching the given query
If summary is enabled, last document is the summary text with 'summary'=True
"""
body = self._get_query_body(query, config, **kwargs)
response = self._session.post(
headers=self._get_post_headers(),
url="https://api.vectara.io/v1/query",
data=json.dumps(data),
data=json.dumps(body),
timeout=self.vectara_api_timeout,
)
@ -389,7 +521,7 @@ class Vectara(VectorStore):
f"(code {response.status_code}, reason {response.reason}, details "
f"{response.text})",
)
return [], "" # type: ignore[return-value]
return []
result = response.json()
@ -424,14 +556,19 @@ class Vectara(VectorStore):
for x, md in zip(responses, metadatas)
]
if config.mmr_config.is_enabled:
if config.rerank_config.reranker in ["mmr", "rerank_multilingual_v1"]:
res = res[: config.k]
if config.summary_config.is_enabled:
summary = result["responseSet"][0]["summary"][0]["text"]
fcs = result["responseSet"][0]["summary"][0]["factualConsistency"]["score"]
res.append(
(Document(page_content=summary, metadata={"summary": True}), 0.0)
(
Document(
page_content=summary, metadata={"summary": True, "fcs": fcs}
),
0.0,
)
)
return res
def similarity_search_with_score(
@ -444,12 +581,15 @@ class Vectara(VectorStore):
Args:
query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 10.
any other querying variable in VectaraQueryConfig like:
- lambda_val: lexical match parameter for hybrid search.
- filter: filter string
- score_threshold: minimal score threshold for the result.
- n_sentence_context: number of sentences before/after the matching segment
- mmr_config: optional configuration for MMR (see MMRConfig dataclass)
- n_sentence_before: number of sentences before the matching segment
- n_sentence_after: number of sentences after the matching segment
- rerank_config: optional configuration for Reranking
(see RerankConfig dataclass)
- summary_config: optional configuration for summary
(see SummaryConfig dataclass)
Returns:
@ -503,8 +643,8 @@ class Vectara(VectorStore):
Returns:
List of Documents selected by maximal marginal relevance.
"""
kwargs["mmr_config"] = MMRConfig(
is_enabled=True, mmr_k=fetch_k, diversity_bias=1 - lambda_mult
kwargs["rerank_config"] = RerankConfig(
reranker="mmr", rerank_k=fetch_k, mmr_diversity_bias=1 - lambda_mult
)
return self.similarity_search(query, **kwargs)
@ -567,42 +707,188 @@ class Vectara(VectorStore):
vectara.add_files(files, metadatas)
return vectara
def as_rag(self, config: VectaraQueryConfig) -> VectaraRAG:
"""Return a Vectara RAG runnable."""
return VectaraRAG(self, config)
def as_chat(self, config: VectaraQueryConfig) -> VectaraRAG:
"""Return a Vectara RAG runnable for chat."""
return VectaraRAG(self, config, chat=True)
def as_retriever(self, **kwargs: Any) -> VectaraRetriever:
"""return a retriever object."""
return VectaraRetriever(
vectorstore=self, config=kwargs.get("config", VectaraQueryConfig())
)
class VectaraRetriever(VectorStoreRetriever):
"""Retriever for `Vectara`."""
"""Vectara Retriever class."""
vectorstore: Vectara
"""Vectara vectorstore."""
search_kwargs: dict = Field(
default_factory=lambda: {
"lambda_val": 0.0,
"k": 5,
"filter": "",
"n_sentence_context": "2",
"summary_config": SummaryConfig(),
}
)
"""Search params.
k: Number of Documents to return. Defaults to 5.
lambda_val: lexical match parameter for hybrid search.
filter: Dictionary of argument(s) to filter on metadata. For example a
filter can be "doc.rating > 3.0 and part.lang = 'deu'"} see
https://docs.vectara.com/docs/search-apis/sql/filter-overview
for more details.
n_sentence_context: number of sentences before/after the matching segment to add
"""
"""VectorStore to use for retrieval."""
def add_texts(
config: VectaraQueryConfig
"""Configuration for this retriever."""
class Config:
"""Configuration for this pydantic object."""
arbitrary_types_allowed = True
def _get_relevant_documents(
self, query: str, *, run_manager: CallbackManagerForRetrieverRun
) -> List[Document]:
docs_and_scores = self.vectorstore.vectara_query(query, self.config)
return [doc for doc, _ in docs_and_scores]
def add_documents(self, documents: List[Document], **kwargs: Any) -> List[str]:
"""Add documents to vectorstore."""
return self.vectorstore.add_documents(documents, **kwargs)
class VectaraRAG(Runnable):
def __init__(
self, vectara: Vectara, config: VectaraQueryConfig, chat: bool = False
):
self.vectara = vectara
self.config = config
self.chat = chat
self.conv_id = None
def stream(
self,
texts: List[str],
metadatas: Optional[List[dict]] = None,
doc_metadata: Optional[dict] = None,
) -> None:
"""Add text to the Vectara vectorstore.
input: str,
config: Optional[RunnableConfig] = None,
**kwargs: Any,
) -> Iterator[dict]:
"""get streaming output from Vectara RAG
Args:
texts (List[str]): The text
metadatas (List[dict]): Metadata dicts, must line up with existing store
input: The input query
Returns:
An iterator of dictionaries, each carrying the question, an answer chunk, the retrieved context, or the fcs score
"""
self.vectorstore.add_texts(texts, metadatas, doc_metadata or {})
body = self.vectara._get_query_body(input, self.config, self.chat, self.conv_id)
response = self.vectara._session.post(
headers=self.vectara._get_post_headers(),
url="https://api.vectara.io/v1/stream-query",
data=json.dumps(body),
timeout=self.vectara.vectara_api_timeout,
stream=True,
)
if response.status_code != 200:
logger.error(
"Query failed %s",
f"(code {response.status_code}, reason {response.reason}, details "
f"{response.text})",
)
return
responses = []
documents = []
yield {"question": input} # First chunk is the question
for line in response.iter_lines():
if line: # filter out keep-alive new lines
data = json.loads(line.decode("utf-8"))
result = data["result"]
response_set = result["responseSet"]
if response_set is None:
summary = result.get("summary", None)
if summary is None:
continue
if len(summary.get("status")) > 0:
logger.error(
f"Summary generation failed with status "
f"{summary.get('status')[0].get('statusDetail')}"
)
continue
# Store conversation ID for chat, if applicable
chat = summary.get("chat", None)
if chat and chat.get("status", None):
st_code = chat["status"]
logger.info(f"Chat query failed with code {st_code}")
if st_code == "RESOURCE_EXHAUSTED":
self.conv_id = None
logger.error(
"Sorry, Vectara chat turns exceeds plan limit."
)
continue
conv_id = chat.get("conversationId", None) if chat else None
if conv_id:
self.conv_id = conv_id
# If FCS is provided, pull it from the JSON response
if summary.get("factualConsistency", None):
fcs = summary.get("factualConsistency", {}).get("score", None)
yield {"fcs": fcs}
continue
# Yield the summary chunk
chunk = str(summary["text"])
yield {"answer": chunk}
else:
if self.config.score_threshold:
responses = [
r
for r in response_set["response"]
if r["score"] > self.config.score_threshold
]
else:
responses = response_set["response"]
documents = response_set["document"]
metadatas = []
for x in responses:
md = {m["name"]: m["value"] for m in x["metadata"]}
doc_num = x["documentIndex"]
doc_md = {
m["name"]: m["value"]
for m in documents[doc_num]["metadata"]
}
if "source" not in doc_md:
doc_md["source"] = "vectara"
md.update(doc_md)
metadatas.append(md)
res = [
(
Document(
page_content=x["text"],
metadata=md,
),
x["score"],
)
for x, md in zip(responses, metadatas)
]
if self.config.rerank_config.reranker in [
"mmr",
"rerank_multilingual_v1",
]:
res = res[: self.config.k]
yield {"context": res}
return
def invoke(
self,
input: str,
config: Optional[RunnableConfig] = None,
) -> dict:
res = {"answer": ""}
for chunk in self.stream(input):
if "context" in chunk:
res["context"] = chunk["context"]
elif "question" in chunk:
res["question"] = chunk["question"]
elif "answer" in chunk:
res["answer"] += chunk["answer"]
elif "fcs" in chunk:
res["fcs"] = chunk["fcs"]
else:
logger.error(f"Unknown chunk type: {chunk}")
return res
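# A usage sketch for the runnables defined above (illustrative only; corpus credentials
# are read from the VECTARA_CUSTOMER_ID / VECTARA_CORPUS_ID / VECTARA_API_KEY
# environment variables and the queries are placeholders). as_rag() answers one-off
# questions, as_chat() carries the Vectara conversation id across turns, and stream()
# yields the question, answer chunks, the retrieved context, and the fcs score.
def _example_rag_usage() -> None:
    vectara = Vectara()
    config = VectaraQueryConfig(
        k=10, summary_config=SummaryConfig(is_enabled=True, max_results=5)
    )
    rag = vectara.as_rag(config)
    print(rag.invoke("What is RAG?")["answer"])
    for chunk in rag.stream("What is RAG?"):
        if "answer" in chunk:
            print(chunk["answer"], end="", flush=True)
    chat = vectara.as_chat(config)
    print(chat.invoke("And how does Vectara implement chat?")["answer"])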

@ -4,19 +4,25 @@ import urllib.request
import pytest
from langchain_core.documents import Document
# from langchain_community.vectorstores.vectara import Vectara, SummaryConfig
from langchain_community.vectorstores.vectara import SummaryConfig, Vectara
from tests.integration_tests.vectorstores.fake_embeddings import FakeEmbeddings
from langchain_community.vectorstores import Vectara
from langchain_community.vectorstores.vectara import (
MMRConfig,
RerankConfig,
SummaryConfig,
VectaraQueryConfig,
)
#
# For this test to run properly, please set up as follows:
# 1. Create a Vectara account: sign up at https://console.vectara.com/signup
# 1. Create a Vectara account: sign up at https://www.vectara.com/integrations/langchain
# 2. Create a corpus in your Vectara account, with a filter attribute called "test_num".
# 3. Create an API_KEY for this corpus with permissions for query and indexing
# 4. Setup environment variables:
# VECTARA_API_KEY, VECTARA_CORPUS_ID and VECTARA_CUSTOMER_ID
#
test_prompt_name = "vectara-experimental-summary-ext-2023-12-11-sml"
def get_abbr(s: str) -> str:
words = s.split(" ") # Split the string into words
@ -50,36 +56,34 @@ def vectara1(): # type: ignore[no-untyped-def]
yield vectara1
# Tear down code
for doc_id in doc_ids:
vectara1._delete_doc(doc_id)
vectara1.delete(doc_ids)
def test_vectara_add_documents(vectara1) -> None: # type: ignore[no-untyped-def]
def test_vectara_add_documents(vectara1: Vectara) -> None: # type: ignore[no-untyped-def]
"""Test add_documents."""
# test without filter
output1 = vectara1.similarity_search(
"large language model",
k=2,
n_sentence_context=0,
n_sentence_before=0,
n_sentence_after=0,
)
assert len(output1) == 2
assert output1[0].page_content == "large language model"
assert output1[0].metadata["abbr"] == "llm"
assert output1[1].page_content == "grounded generation"
assert output1[1].metadata["abbr"] == "gg"
# test with metadata filter (doc level)
# since the query does not match test_num=1 directly we get "LLM" as the result
output2 = vectara1.similarity_search(
"large language model",
k=1,
n_sentence_context=0,
n_sentence_before=0,
n_sentence_after=0,
filter="doc.test_num = 1",
)
assert len(output2) == 1
assert output2[0].page_content == "grounded generation"
assert output2[0].metadata["abbr"] == "gg"
assert output2[0].page_content == "retrieval augmented generation"
assert output2[0].metadata["abbr"] == "rag"
# test without filter but with similarity score
# this is similar to the first test, but given the score threshold
@ -87,19 +91,21 @@ def test_vectara_add_documents(vectara1) -> None: # type: ignore[no-untyped-def
output3 = vectara1.similarity_search_with_score(
"large language model",
k=2,
score_threshold=0.8,
n_sentence_context=0,
score_threshold=0.5,
n_sentence_before=0,
n_sentence_after=0,
)
assert len(output3) == 1
assert len(output3) == 2
assert output3[0][0].page_content == "large language model"
assert output3[0][0].metadata["abbr"] == "llm"
def test_vectara_from_files() -> None:
"""Test end to end construction and search."""
@pytest.fixture(scope="function")
def vectara2(): # type: ignore[no-untyped-def]
# download documents to local storage and then upload as files
# attention paper and deep learning book
vectara2: Vectara = Vectara()
urls = [
(
"https://papers.nips.cc/paper_files/paper/2017/"
@ -117,50 +123,102 @@ def test_vectara_from_files() -> None:
urllib.request.urlretrieve(url, name)
files_list.append(name)
docsearch: Vectara = Vectara()
doc_ids = docsearch.add_files(
doc_ids = vectara2.add_files(
files_list=files_list,
embedding=FakeEmbeddings(),
metadatas=[{"url": url, "test_num": "2"} for url in urls],
)
# finally do a similarity search to see if all works okay
output = docsearch.similarity_search(
yield vectara2
# Tear down code
vectara2.delete(doc_ids)
def test_vectara_from_files(vectara2: Vectara) -> None:
"""test uploading data from files"""
output = vectara2.similarity_search(
"By the commonly adopted machine learning tradition",
k=1,
n_sentence_context=0,
n_sentence_before=0,
n_sentence_after=0,
filter="doc.test_num = 2",
)
assert output[0].page_content == (
"By the commonly adopted machine learning tradition "
"(e.g., Chapter 28 in Murphy, 2012; Deng and Li, 2013), it may be natural "
"to just classify deep learning techniques into deep discriminative models "
"(e.g., DNNs) and deep probabilistic generative models (e.g., DBN, Deep "
"Boltzmann Machine (DBM))."
assert (
"By the commonly adopted machine learning tradition" in output[0].page_content
)
# finally do a similarity search to see if all works okay
output = docsearch.similarity_search(
# another similarity search, this time with n_sentence_before/after = 1
output = vectara2.similarity_search(
"By the commonly adopted machine learning tradition",
k=1,
n_sentence_before=1,
n_sentence_after=1,
filter="doc.test_num = 2",
)
assert "Note the use of" in output[0].page_content
# Test the old n_sentence_context to ensure it's backward compatible
output = vectara2.similarity_search(
"By the commonly adopted machine learning tradition",
k=1,
n_sentence_context=1,
filter="doc.test_num = 2",
)
assert output[0].page_content == (
"""\
Note the use of hybrid in 3) above is different from that used sometimes in the literature, \
which for example refers to the hybrid systems for speech recognition feeding the output probabilities of a neural network into an HMM \
(Bengio et al., 1991; Bourlard and Morgan, 1993; Morgan, 2012). \
By the commonly adopted machine learning tradition (e.g., Chapter 28 in Murphy, 2012; Deng and Li, 2013), \
it may be natural to just classify deep learning techniques into deep discriminative models (e.g., DNNs) \
and deep probabilistic generative models (e.g., DBN, Deep Boltzmann Machine (DBM)). \
This classification scheme, however, misses a key insight gained in deep learning research about how generative \
models can greatly improve the training of DNNs and other deep discriminative models via better regularization.\
""" # noqa: E501
assert "Note the use of" in output[0].page_content
def test_vectara_rag_with_reranking(vectara2: Vectara) -> None:
"""Test Vectara reranking."""
query_str = "What is a transformer model?"
# Note: we don't test rerank_multilingual_v1 as it's for Scale only
# Test MMR
summary_config = SummaryConfig(
is_enabled=True,
max_results=7,
response_lang="eng",
prompt_name=test_prompt_name,
)
rerank_config = RerankConfig(reranker="mmr", rerank_k=50, mmr_diversity_bias=0.2)
config = VectaraQueryConfig(
k=10,
lambda_val=0.005,
rerank_config=rerank_config,
summary_config=summary_config,
)
rag1 = vectara2.as_rag(config)
response1 = rag1.invoke(query_str)
assert "transformer model" in response1["answer"].lower()
# Test No reranking
summary_config = SummaryConfig(
is_enabled=True,
max_results=7,
response_lang="eng",
prompt_name=test_prompt_name,
)
rerank_config = RerankConfig(reranker="None")
config = VectaraQueryConfig(
k=10,
lambda_val=0.005,
rerank_config=rerank_config,
summary_config=summary_config,
)
rag2 = vectara2.as_rag(config)
response2 = rag2.invoke(query_str)
for doc_id in doc_ids:
docsearch._delete_doc(doc_id)
assert "transformer model" in response2["answer"].lower()
# assert that the page content is different for the top n_results results
# in each reranking
n_results = 10
response1_content = [x[0].page_content for x in response1["context"][:n_results]]
response2_content = [x[0].page_content for x in response2["context"][:n_results]]
assert response1_content != response2_content
@pytest.fixture(scope="function")
@ -206,21 +264,20 @@ def vectara3(): # type: ignore[no-untyped-def]
yield vectara3
# Tear down code
for doc_id in doc_ids:
vectara3._delete_doc(doc_id)
vectara3.delete(doc_ids)
def test_vectara_mmr(vectara3) -> None: # type: ignore[no-untyped-def]
def test_vectara_with_langchain_mmr(vectara3: Vectara) -> None: # type: ignore[no-untyped-def]
# test max marginal relevance
output1 = vectara3.max_marginal_relevance_search(
"generative AI",
k=2,
fetch_k=6,
lambda_mult=1.0, # no diversity bias
n_sentence_context=0,
n_sentence_before=0,
n_sentence_after=0,
)
assert len(output1) == 2
assert "Generative AI promises to revolutionize how" in output1[0].page_content
assert (
"This is why today we're adding a fundamental capability"
in output1[1].page_content
@ -231,16 +288,64 @@ def test_vectara_mmr(vectara3) -> None: # type: ignore[no-untyped-def]
k=2,
fetch_k=6,
lambda_mult=0.0, # only diversity bias
n_sentence_context=0,
n_sentence_before=0,
n_sentence_after=0,
)
assert len(output2) == 2
assert "Generative AI promises to revolutionize how" in output2[0].page_content
assert (
"Neural LLM systems are excellent at understanding the context"
in output2[1].page_content
)
def test_vectara_mmr(vectara3: Vectara) -> None: # type: ignore[no-untyped-def]
# test MMR directly with rerank_config
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang="eng")
rerank_config = RerankConfig(reranker="mmr", rerank_k=50, mmr_diversity_bias=0.2)
config = VectaraQueryConfig(
k=10,
lambda_val=0.005,
rerank_config=rerank_config,
summary_config=summary_config,
)
rag = vectara3.as_rag(config)
output1 = rag.invoke("what is generative AI?")["answer"]
assert len(output1) > 0
# test MMR directly with old mmr_config
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang="eng")
mmr_config = MMRConfig(is_enabled=True, mmr_k=50, diversity_bias=0.2)
config = VectaraQueryConfig(
k=10, lambda_val=0.005, mmr_config=mmr_config, summary_config=summary_config
)
rag = vectara3.as_rag(config)
output2 = rag.invoke("what is generative AI?")["answer"]
assert len(output2) > 0
# test reranking disabled - RerankConfig
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang="eng")
rerank_config = RerankConfig(reranker="none")
config = VectaraQueryConfig(
k=10,
lambda_val=0.005,
rerank_config=rerank_config,
summary_config=summary_config,
)
rag = vectara3.as_rag(config)
output1 = rag.invoke("what is generative AI?")["answer"]
assert len(output1) > 0
# test with reranking disabled - MMRConfig
summary_config = SummaryConfig(is_enabled=True, max_results=7, response_lang="eng")
mmr_config = MMRConfig(is_enabled=False, mmr_k=50, diversity_bias=0.2)
config = VectaraQueryConfig(
k=10, lambda_val=0.005, mmr_config=mmr_config, summary_config=summary_config
)
rag = vectara3.as_rag(config)
output2 = rag.invoke("what is generative AI?")["answer"]
assert len(output2) > 0
def test_vectara_with_summary(vectara3) -> None: # type: ignore[no-untyped-def]
"""Test vectara summary."""
# test summarization
@ -248,7 +353,12 @@ def test_vectara_with_summary(vectara3) -> None: # type: ignore[no-untyped-def]
output1 = vectara3.similarity_search(
query="what is generative AI?",
k=num_results,
summary_config=SummaryConfig(is_enabled=True, max_results=5),
summary_config=SummaryConfig(
is_enabled=True,
max_results=5,
response_lang="eng",
prompt_name=test_prompt_name,
),
)
assert len(output1) == num_results + 1

@ -93,6 +93,7 @@ def test_compatible_vectorstore_documentation() -> None:
"Vald",
"VDMS",
"Vearch",
"Vectara",
"VespaStore",
"VLite",
"Weaviate",

@ -5,7 +5,7 @@ This template performs multiquery RAG with vectara.
## Environment Setup
Set the `OPENAI_API_KEY` environment variable to access the OpenAI models.
Set the `OPENAI_API_KEY` environment variable to access the OpenAI models for the multi-query processing.
Also, ensure the following environment variables are set:
* `VECTARA_CUSTOMER_ID`

@ -1,6 +1,6 @@
[tool.poetry]
name = "rag-vectara-multiquery"
version = "0.1.0"
version = "0.2.0"
description = "RAG using vectara with multiquery retriever"
authors = [
"Ofer Mendelevitch <ofer@vectara.com>",

@ -1,11 +1,12 @@
import os
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_community.chat_models import ChatOpenAI
from langchain_community.vectorstores import Vectara
from langchain_community.vectorstores.vectara import SummaryConfig, VectaraQueryConfig
from langchain_core.output_parsers import StrOutputParser
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai.chat_models import ChatOpenAI
if os.environ.get("VECTARA_CUSTOMER_ID", None) is None:
raise Exception("Missing `VECTARA_CUSTOMER_ID` environment variable.")
@ -16,30 +17,20 @@ if os.environ.get("VECTARA_API_KEY", None) is None:
# Setup the Vectara retriever with your Corpus ID and API Key
vectara = Vectara()
# note you can customize the retriever behavior by passing additional arguments:
# - k: number of results to return (defaults to 5)
# - lambda_val: the
# [lexical matching](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching)
# factor for hybrid search (defaults to 0.025)
# - filter: a [filter](https://docs.vectara.com/docs/common-use-cases/filtering-by-metadata/filter-overview)
# to apply to the results (default None)
# - n_sentence_context: number of sentences to include before/after the actual matching
# segment when returning results. This defaults to 2.
# - mmr_config: can be used to specify MMR mode in the query.
# - is_enabled: True or False
# - mmr_k: number of results to use for MMR reranking
# - diversity_bias: 0 = no diversity, 1 = full diversity. This is the lambda
# parameter in the MMR formula and is in the range 0...1
vectara_retriever = Vectara().as_retriever()
# Define the query configuration:
summary_config = SummaryConfig(is_enabled=True, max_results=5, response_lang="eng")
config = VectaraQueryConfig(k=10, lambda_val=0.005, summary_config=summary_config)
# Setup the Multi-query retriever
llm = ChatOpenAI(temperature=0)
retriever = MultiQueryRetriever.from_llm(retriever=vectara_retriever, llm=llm)
retriever = MultiQueryRetriever.from_llm(
retriever=vectara.as_retriever(config=config), llm=llm
)
# Setup RAG pipeline with multi-query.
# We extract the summary from the RAG output, which is the last document
# (if summary is enabled)
# We extract the summary from the RAG output, which is the last document in the list.
# Note that if you want to extract the citation information, you can use res[:-1]
chain = (
RunnableParallel({"context": retriever, "question": RunnablePassthrough()})

@ -5,8 +5,6 @@ This template performs RAG with vectara.
## Environment Setup
Set the `OPENAI_API_KEY` environment variable to access the OpenAI models.
Also, ensure the following environment variables are set:
* `VECTARA_CUSTOMER_ID`
* `VECTARA_CORPUS_ID`

@ -1,6 +1,6 @@
[tool.poetry]
name = "rag-vectara"
version = "0.1.0"
version = "0.2.0"
description = "RAG using vectara retriever"
authors = [
"Ofer Mendelevitch <ofer@vectara.com>",

@ -1,9 +1,8 @@
import os
from langchain_community.vectorstores import Vectara
from langchain_core.output_parsers import StrOutputParser
from langchain_community.vectorstores.vectara import SummaryConfig, VectaraQueryConfig
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
if os.environ.get("VECTARA_CUSTOMER_ID", None) is None:
raise Exception("Missing `VECTARA_CUSTOMER_ID` environment variable.")
@ -12,32 +11,14 @@ if os.environ.get("VECTARA_CORPUS_ID", None) is None:
if os.environ.get("VECTARA_API_KEY", None) is None:
raise Exception("Missing `VECTARA_API_KEY` environment variable.")
# Setup the Vectara retriever with your Corpus ID and API Key
# note you can customize the retriever behavior by passing additional arguments:
# - k: number of results to return (defaults to 5)
# - lambda_val: the
# [lexical matching](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching)
# factor for hybrid search (defaults to 0.025)
# - filter: a [filter](https://docs.vectara.com/docs/common-use-cases/filtering-by-metadata/filter-overview)
# to apply to the results (default None)
# - n_sentence_context: number of sentences to include before/after the actual matching
# segment when returning results. This defaults to 2.
# - mmr_config: can be used to specify MMR mode in the query.
# - is_enabled: True or False
# - mmr_k: number of results to use for MMR reranking
# - diversity_bias: 0 = no diversity, 1 = full diversity. This is the lambda
# parameter in the MMR formula and is in the range 0...1
retriever = Vectara().as_retriever()
# RAG pipeline: we extract the summary from the RAG output, which is the last document
# (if summary is enabled)
# Note that if you want to extract the citation information, you can use res[:-1]]
chain = (
RunnableParallel({"context": retriever, "question": RunnablePassthrough()})
| (lambda res: res[-1])
| StrOutputParser()
)
# Setup the Vectara vectorstore with your Corpus ID and API Key
vectara = Vectara()
# Define the query configuration:
summary_config = SummaryConfig(is_enabled=True, max_results=5, response_lang="eng")
config = VectaraQueryConfig(k=10, lambda_val=0.005, summary_config=summary_config)
rag = vectara.as_rag(config)
# Add typing for input
@ -45,4 +26,4 @@ class Question(BaseModel):
__root__: str
chain = chain.with_types(input_type=Question)
chain = rag.with_types(input_type=Question)
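# Illustrative local check of the template chain (the question is a placeholder);
# the invoke() result is a dict whose "answer" key holds the generated summary.
if __name__ == "__main__":
    print(chain.invoke("What topics does my corpus cover?")["answer"])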
