langchain/docs/extras/integrations/vectorstores/vectara.ipynb

{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "683953b3",
   "metadata": {},
   "source": [
    "# Vectara\n",
    "\n",
    ">[Vectara](https://vectara.com/) is a API platform for building GenAI applications. It provides an easy-to-use API for document indexing and query that is managed by Vectara and is optimized for performance and accuracy. \n",
    "See the [Vectara API documentation ](https://docs.vectara.com/docs/) for more information on how to use the API.\n",
    "\n",
    "This notebook shows how to use functionality related to the `Vectara`'s integration with langchain.\n",
    "Note that unlike many other integrations in this category, Vectara provides an end-to-end managed service for [Grounded Generation](https://vectara.com/grounded-generation/) (aka retrieval agumented generation), which includes:\n",
    "1. A way to extract text from document files and chunk them into sentences.\n",
    "2. Its own embeddings model and vector store - each text segment is encoded into a vector embedding and stored in the Vectara internal vector store\n",
    "3. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching))\n",
    "\n",
    "All of these are supported in this LangChain integration."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc0f4344",
   "metadata": {},
   "source": [
    "# Setup\n",
    "\n",
    "You will need a Vectara account to use Vectara with LangChain. To get started, use the following steps:\n",
    "1. [Sign up](https://console.vectara.com/signup) for a Vectara account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
    "2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingest from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
    "3. Next you'll need to create API keys to access the corpus. Click on the **\"Authorization\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
    "\n",
    "To use LangChain with Vectara, you'll need to have these three values: customer ID, corpus ID and api_key.\n",
    "You can provide those to LangChain in two ways:\n",
    "\n",
    "1. Include in your environment these three variables: `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`.\n",
    "\n",
    "> For example, you can set these variables using os.environ and getpass as follows:\n",
    "\n",
    "```python\n",
    "import os\n",
    "import getpass\n",
    "\n",
    "os.environ[\"VECTARA_CUSTOMER_ID\"] = getpass.getpass(\"Vectara Customer ID:\")\n",
    "os.environ[\"VECTARA_CORPUS_ID\"] = getpass.getpass(\"Vectara Corpus ID:\")\n",
    "os.environ[\"VECTARA_API_KEY\"] = getpass.getpass(\"Vectara API Key:\")\n",
    "```\n",
    "\n",
    "2. Add them to the Vectara vectorstore constructor:\n",
    "\n",
    "```python\n",
    "vectorstore = Vectara(\n",
    "                vectara_customer_id=vectara_customer_id,\n",
    "                vectara_corpus_id=vectara_corpus_id,\n",
    "                vectara_api_key=vectara_api_key\n",
    "            )\n",
    "```"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "eeead681",
   "metadata": {},
   "source": [
    "## Connecting to Vectara from LangChain\n",
    "\n",
    "To get started, let's ingest the documents using the from_documents() method.\n",
    "We assume here that you've added your VECTARA_CUSTOMER_ID, VECTARA_CORPUS_ID and query+indexing VECTARA_API_KEY as environment variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "04a1f1a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.embeddings import FakeEmbeddings\n",
    "from langchain.text_splitter import CharacterTextSplitter\n",
    "from langchain.vectorstores import Vectara\n",
    "from langchain.document_loaders import TextLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "be0a4973",
   "metadata": {},
   "outputs": [],
   "source": [
    "loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
    "documents = loader.load()\n",
    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
    "docs = text_splitter.split_documents(documents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "8429667e",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T10:51:22.525091Z",
     "start_time": "2023-04-04T10:51:22.522015Z"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "vectara = Vectara.from_documents(\n",
    "    docs,\n",
    "    embedding=FakeEmbeddings(size=768),\n",
    "    doc_metadata={\"speech\": \"state-of-the-union\"},\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "90dbf3e7",
   "metadata": {},
   "source": [
    "Vectara's indexing API provides a file upload API where the file is handled directly by Vectara - pre-processed, chunked optimally and added to the Vectara vector store.\n",
    "To use this, we added the add_files() method (as well as from_files()). \n",
    "\n",
    "Let's see this in action. We pick two PDF documents to upload: \n",
    "1. The \"I have a dream\" speech by Dr. King\n",
    "2. Churchill's \"We Shall Fight on the Beaches\" speech"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "85ef3468",
   "metadata": {},
   "outputs": [],
   "source": [
    "import tempfile\n",
    "import urllib.request\n",
    "\n",
    "urls = [\n",
    "    [\n",
    "        \"https://www.gilderlehrman.org/sites/default/files/inline-pdfs/king.dreamspeech.excerpts.pdf\",\n",
    "        \"I-have-a-dream\",\n",
    "    ],\n",
    "    [\n",
    "        \"https://www.parkwayschools.net/cms/lib/MO01931486/Centricity/Domain/1578/Churchill_Beaches_Speech.pdf\",\n",
    "        \"we shall fight on the beaches\",\n",
    "    ],\n",
    "]\n",
    "files_list = []\n",
    "for url, _ in urls:\n",
    "    name = tempfile.NamedTemporaryFile().name\n",
    "    urllib.request.urlretrieve(url, name)\n",
    "    files_list.append(name)\n",
    "\n",
    "docsearch: Vectara = Vectara.from_files(\n",
    "    files=files_list,\n",
    "    embedding=FakeEmbeddings(size=768),\n",
    "    metadatas=[{\"url\": url, \"speech\": title} for url, title in urls],\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1f9215c8",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T09:27:29.920258Z",
     "start_time": "2023-04-04T09:27:29.913714Z"
    }
   },
   "source": [
    "## Similarity search\n",
    "\n",
    "The simplest scenario for using Vectara is to perform a similarity search. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a8c513ab",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T10:51:25.204469Z",
     "start_time": "2023-04-04T10:51:24.855618Z"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "found_docs = vectara.similarity_search(\n",
    "    query, n_sentence_context=0, filter=\"doc.speech = 'state-of-the-union'\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "fc516993",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T10:51:25.220984Z",
     "start_time": "2023-04-04T10:51:25.213943Z"
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
      "\n",
      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
      "\n",
      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
      "\n",
      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
     ]
    }
   ],
   "source": [
    "print(found_docs[0].page_content)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1bda9bf5",
   "metadata": {},
   "source": [
    "## Similarity search with score\n",
    "\n",
    "Sometimes we might want to perform the search, but also obtain a relevancy score to know how good is a particular result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "8804a21d",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T10:51:25.631585Z",
     "start_time": "2023-04-04T10:51:25.227384Z"
    }
   },
   "outputs": [],
   "source": [
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "found_docs = vectara.similarity_search_with_score(\n",
    "    query, filter=\"doc.speech = 'state-of-the-union'\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "756a6887",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T10:51:25.642282Z",
     "start_time": "2023-04-04T10:51:25.635947Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
      "\n",
      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
      "\n",
      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
      "\n",
      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n",
      "\n",
      "Score: 0.4917977\n"
     ]
    }
   ],
   "source": [
    "document, score = found_docs[0]\n",
    "print(document.page_content)\n",
    "print(f\"\\nScore: {score}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1f9876a8",
   "metadata": {},
   "source": [
    "Now let's do similar search for content in the files we uploaded"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "47784de5",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(Document(page_content='We must forever conduct our struggle on the high plane of dignity and discipline.', metadata={'section': '1'}), 0.7962591)\n",
      "(Document(page_content='We must not allow our\\ncreative protests to degenerate into physical violence. . . .', metadata={'section': '1'}), 0.25983918)\n"
     ]
    }
   ],
   "source": [
    "query = \"We must forever conduct our struggle\"\n",
    "found_docs = vectara.similarity_search_with_score(\n",
    "    query, filter=\"doc.speech = 'I-have-a-dream'\"\n",
    ")\n",
    "print(found_docs[0])\n",
    "print(found_docs[1])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "691a82d6",
   "metadata": {},
   "source": [
    "## Vectara as a Retriever\n",
    "\n",
    "Vectara, as all the other LangChain vectorstores, is most often used as a LangChain Retriever:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "9427195f",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T10:51:26.031451Z",
     "start_time": "2023-04-04T10:51:26.018763Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "VectaraRetriever(vectorstore=<langchain.vectorstores.vectara.Vectara object at 0x12772caf0>, search_type='similarity', search_kwargs={'lambda_val': 0.025, 'k': 5, 'filter': '', 'n_sentence_context': '0'})"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "retriever = vectara.as_retriever()\n",
    "retriever"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "f3c70c31",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-04-04T10:51:26.495652Z",
     "start_time": "2023-04-04T10:51:26.046407Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'})"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "retriever.get_relevant_documents(query)[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2300e785",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								{
 								 "cells": [
 								  {
-												Scores are explained in vectorestore docs (#5613)

# Scores in Vectorestores' Docs Are Explained

Following vectorestores can return scores with similar documents by
using `similarity_search_with_score`:
- chroma
- docarray_hnsw
- docarray_in_memory
- faiss
- myscale
- qdrant
- supabase
- vectara
- weaviate

However, in documents, these scores were either not explained at all or
explained in a way that could lead to misunderstandings (e.g., FAISS).
For instance in FAISS document: if we consider the score returned by the
function as a similarity score, we understand that a document returning
a higher score is more similar to the source document. However, since
the scores returned by the function are distance scores, we should
understand that smaller scores correspond to more similar documents.

For the libraries other than Vectara, I wrote the scores they use by
investigating from the source libraries. Since I couldn't be certain
about the score metric used by Vectara, I didn't make any changes in its
documentation. The links mentioned in Vectara's documentation became
broken due to updates, so I replaced them with working ones.

VectorStores / Retrievers / Memory
  - @dev2049

my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-06 03:39:49 +00:00
+								   "attachments": {},
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "cell_type": "markdown",
 								   "id": "683953b3",
 								   "metadata": {},
 								   "source": [
 								    "# Vectara\n",
 								    "\n",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    ">[Vectara](https://vectara.com/) is a API platform for building GenAI applications. It provides an easy-to-use API for document indexing and query that is managed by Vectara and is optimized for performance and accuracy. \n",
 								    "See the [Vectara API documentation ](https://docs.vectara.com/docs/) for more information on how to use the API.\n",
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								    "\n",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    "This notebook shows how to use functionality related to the `Vectara`'s integration with langchain.\n",
 								    "Note that unlike many other integrations in this category, Vectara provides an end-to-end managed service for [Grounded Generation](https://vectara.com/grounded-generation/) (aka retrieval agumented generation), which includes:\n",
 								    "1. A way to extract text from document files and chunk them into sentences.\n",
 								    "2. Its own embeddings model and vector store - each text segment is encoded into a vector embedding and stored in the Vectara internal vector store\n",
 								    "3. A query service that automatically encodes the query into embedding, and retrieves the most relevant text segments (including support for [Hybrid Search](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching))\n",
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								    "\n",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    "All of these are supported in this LangChain integration."
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
 								  {
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								   "cell_type": "markdown",
 								   "id": "dc0f4344",
 								   "metadata": {},
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "source": [
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    "# Setup\n",
 								    "\n",
 								    "You will need a Vectara account to use Vectara with LangChain. To get started, use the following steps:\n",
 								    "1. [Sign up](https://console.vectara.com/signup) for a Vectara account if you don't already have one. Once you have completed your sign up you will have a Vectara customer ID. You can find your customer ID by clicking on your name, on the top-right of the Vectara console window.\n",
 								    "2. Within your account you can create one or more corpora. Each corpus represents an area that stores text data upon ingest from input documents. To create a corpus, use the **\"Create Corpus\"** button. You then provide a name to your corpus as well as a description. Optionally you can define filtering attributes and apply some advanced options. If you click on your created corpus, you can see its name and corpus ID right on the top.\n",
 								    "3. Next you'll need to create API keys to access the corpus. Click on the **\"Authorization\"** tab in the corpus view and then the **\"Create API Key\"** button. Give your key a name, and choose whether you want query only or query+index for your key. Click \"Create\" and you now have an active API key. Keep this key confidential. \n",
 								    "\n",
 								    "To use LangChain with Vectara, you'll need to have these three values: customer ID, corpus ID and api_key.\n",
 								    "You can provide those to LangChain in two ways:\n",
 								    "\n",
 								    "1. Include in your environment these three variables: `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`.\n",
 								    "\n",
 								    "> For example, you can set these variables using os.environ and getpass as follows:\n",
 								    "\n",
 								    "```python\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "import os\n",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    "import getpass\n",
 								    "\n",
 								    "os.environ[\"VECTARA_CUSTOMER_ID\"] = getpass.getpass(\"Vectara Customer ID:\")\n",
 								    "os.environ[\"VECTARA_CORPUS_ID\"] = getpass.getpass(\"Vectara Corpus ID:\")\n",
 								    "os.environ[\"VECTARA_API_KEY\"] = getpass.getpass(\"Vectara API Key:\")\n",
 								    "```\n",
 								    "\n",
 								    "2. Add them to the Vectara vectorstore constructor:\n",
 								    "\n",
 								    "```python\n",
 								    "vectorstore = Vectara(\n",
 								    "                vectara_customer_id=vectara_customer_id,\n",
 								    "                vectara_corpus_id=vectara_corpus_id,\n",
 								    "                vectara_api_key=vectara_api_key\n",
 								    "            )\n",
 								    "```"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								  {
 								   "attachments": {},
 								   "cell_type": "markdown",
 								   "id": "eeead681",
 								   "metadata": {},
 								   "source": [
 								    "## Connecting to Vectara from LangChain\n",
 								    "\n",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    "To get started, let's ingest the documents using the from_documents() method.\n",
 								    "We assume here that you've added your VECTARA_CUSTOMER_ID, VECTARA_CORPUS_ID and query+indexing VECTARA_API_KEY as environment variables."
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
 								   "id": "04a1f1a0",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "from langchain.embeddings import FakeEmbeddings\n",
 								    "from langchain.text_splitter import CharacterTextSplitter\n",
 								    "from langchain.vectorstores import Vectara\n",
 								    "from langchain.document_loaders import TextLoader"
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   ]
 								  },
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								  {
 								   "cell_type": "code",
 								   "execution_count": 3,
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "id": "be0a4973",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": [
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "documents = loader.load()\n",
 								    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "docs = text_splitter.split_documents(documents)"
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": 4,
 								   "id": "8429667e",
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "metadata": {
 								    "ExecuteTime": {
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								     "end_time": "2023-04-04T10:51:22.525091Z",
 								     "start_time": "2023-04-04T10:51:22.522015Z"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								    },
 								    "tags": []
 								   },
 								   "outputs": [],
 								   "source": [
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "vectara = Vectara.from_documents(\n",
 								    "    docs,\n",
 								    "    embedding=FakeEmbeddings(size=768),\n",
 								    "    doc_metadata={\"speech\": \"state-of-the-union\"},\n",
 								    ")"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
 								  {
-												Scores are explained in vectorestore docs (#5613)

# Scores in Vectorestores' Docs Are Explained

Following vectorestores can return scores with similar documents by
using `similarity_search_with_score`:
- chroma
- docarray_hnsw
- docarray_in_memory
- faiss
- myscale
- qdrant
- supabase
- vectara
- weaviate

However, in documents, these scores were either not explained at all or
explained in a way that could lead to misunderstandings (e.g., FAISS).
For instance in FAISS document: if we consider the score returned by the
function as a similarity score, we understand that a document returning
a higher score is more similar to the source document. However, since
the scores returned by the function are distance scores, we should
understand that smaller scores correspond to more similar documents.

For the libraries other than Vectara, I wrote the scores they use by
investigating from the source libraries. Since I couldn't be certain
about the score metric used by Vectara, I didn't make any changes in its
documentation. The links mentioned in Vectara's documentation became
broken due to updates, so I replaced them with working ones.

VectorStores / Retrievers / Memory
  - @dev2049

my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-06 03:39:49 +00:00
+								   "attachments": {},
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "cell_type": "markdown",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "id": "90dbf3e7",
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "metadata": {},
 								   "source": [
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "Vectara's indexing API provides a file upload API where the file is handled directly by Vectara - pre-processed, chunked optimally and added to the Vectara vector store.\n",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    "To use this, we added the add_files() method (as well as from_files()). \n",
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								    "\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "Let's see this in action. We pick two PDF documents to upload: \n",
 								    "1. The \"I have a dream\" speech by Dr. King\n",
 								    "2. Churchill's \"We Shall Fight on the Beaches\" speech"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "execution_count": 5,
 								   "id": "85ef3468",
 								   "metadata": {},
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "outputs": [],
 								   "source": [
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "import tempfile\n",
 								    "import urllib.request\n",
 								    "\n",
 								    "urls = [\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "    [\n",
 								    "        \"https://www.gilderlehrman.org/sites/default/files/inline-pdfs/king.dreamspeech.excerpts.pdf\",\n",
 								    "        \"I-have-a-dream\",\n",
 								    "    ],\n",
 								    "    [\n",
 								    "        \"https://www.parkwayschools.net/cms/lib/MO01931486/Centricity/Domain/1578/Churchill_Beaches_Speech.pdf\",\n",
 								    "        \"we shall fight on the beaches\",\n",
 								    "    ],\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "]\n",
 								    "files_list = []\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "for url, _ in urls:\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "    name = tempfile.NamedTemporaryFile().name\n",
 								    "    urllib.request.urlretrieve(url, name)\n",
 								    "    files_list.append(name)\n",
 								    "\n",
 								    "docsearch: Vectara = Vectara.from_files(\n",
 								    "    files=files_list,\n",
 								    "    embedding=FakeEmbeddings(size=768),\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "    metadatas=[{\"url\": url, \"speech\": title} for url, title in urls],\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    ")"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
 								  {
-												Scores are explained in vectorestore docs (#5613)

# Scores in Vectorestores' Docs Are Explained

Following vectorestores can return scores with similar documents by
using `similarity_search_with_score`:
- chroma
- docarray_hnsw
- docarray_in_memory
- faiss
- myscale
- qdrant
- supabase
- vectara
- weaviate

However, in documents, these scores were either not explained at all or
explained in a way that could lead to misunderstandings (e.g., FAISS).
For instance in FAISS document: if we consider the score returned by the
function as a similarity score, we understand that a document returning
a higher score is more similar to the source document. However, since
the scores returned by the function are distance scores, we should
understand that smaller scores correspond to more similar documents.

For the libraries other than Vectara, I wrote the scores they use by
investigating from the source libraries. Since I couldn't be certain
about the score metric used by Vectara, I didn't make any changes in its
documentation. The links mentioned in Vectara's documentation became
broken due to updates, so I replaced them with working ones.

VectorStores / Retrievers / Memory
  - @dev2049

my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-06 03:39:49 +00:00
+								   "attachments": {},
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "cell_type": "markdown",
 								   "id": "1f9215c8",
 								   "metadata": {
 								    "ExecuteTime": {
 								     "end_time": "2023-04-04T09:27:29.920258Z",
 								     "start_time": "2023-04-04T09:27:29.913714Z"
 								    }
 								   },
 								   "source": [
 								    "## Similarity search\n",
 								    "\n",
 								    "The simplest scenario for using Vectara is to perform a similarity search. "
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "execution_count": 6,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "id": "a8c513ab",
 								   "metadata": {
 								    "ExecuteTime": {
 								     "end_time": "2023-04-04T10:51:25.204469Z",
 								     "start_time": "2023-04-04T10:51:24.855618Z"
 								    },
 								    "tags": []
 								   },
 								   "outputs": [],
 								   "source": [
 								    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "found_docs = vectara.similarity_search(\n",
 								    "    query, n_sentence_context=0, filter=\"doc.speech = 'state-of-the-union'\"\n",
 								    ")"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "execution_count": 7,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "id": "fc516993",
 								   "metadata": {
 								    "ExecuteTime": {
 								     "end_time": "2023-04-04T10:51:25.220984Z",
 								     "start_time": "2023-04-04T10:51:25.213943Z"
 								    },
 								    "tags": []
 								   },
 								   "outputs": [
 								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
-												Update to Vectara integration (#5950)

This PR updates the Vectara integration (@hwchase17 ):
* Adds reuse of requests.session to imrpove efficiency and speed.
* Utilizes Vectara's low-level API (instead of standard API) to better
match user's specific chunking with LangChain
* Now add_texts puts all the texts into a single Vectara document so
indexing is much faster.
* updated variables names from alpha to lambda_val (to be consistent
with Vectara docs) and added n_context_sentence so it's available to use
if needed.
* Updates to documentation and tests

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-10 23:27:01 +00:00
+								      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
 								      "\n",
 								      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
 								      "\n",
 								      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
 								      "\n",
 								      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								     ]
 								    }
 								   ],
 								   "source": [
 								    "print(found_docs[0].page_content)"
 								   ]
 								  },
 								  {
-												Scores are explained in vectorestore docs (#5613)

# Scores in Vectorestores' Docs Are Explained

Following vectorestores can return scores with similar documents by
using `similarity_search_with_score`:
- chroma
- docarray_hnsw
- docarray_in_memory
- faiss
- myscale
- qdrant
- supabase
- vectara
- weaviate

However, in documents, these scores were either not explained at all or
explained in a way that could lead to misunderstandings (e.g., FAISS).
For instance in FAISS document: if we consider the score returned by the
function as a similarity score, we understand that a document returning
a higher score is more similar to the source document. However, since
the scores returned by the function are distance scores, we should
understand that smaller scores correspond to more similar documents.

For the libraries other than Vectara, I wrote the scores they use by
investigating from the source libraries. Since I couldn't be certain
about the score metric used by Vectara, I didn't make any changes in its
documentation. The links mentioned in Vectara's documentation became
broken due to updates, so I replaced them with working ones.

VectorStores / Retrievers / Memory
  - @dev2049

my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-06 03:39:49 +00:00
+								   "attachments": {},
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "cell_type": "markdown",
 								   "id": "1bda9bf5",
 								   "metadata": {},
 								   "source": [
 								    "## Similarity search with score\n",
 								    "\n",
 								    "Sometimes we might want to perform the search, but also obtain a relevancy score to know how good is a particular result."
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "execution_count": 8,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "id": "8804a21d",
 								   "metadata": {
 								    "ExecuteTime": {
 								     "end_time": "2023-04-04T10:51:25.631585Z",
 								     "start_time": "2023-04-04T10:51:25.227384Z"
 								    }
 								   },
 								   "outputs": [],
 								   "source": [
 								    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "found_docs = vectara.similarity_search_with_score(\n",
 								    "    query, filter=\"doc.speech = 'state-of-the-union'\"\n",
 								    ")"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "execution_count": 9,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "id": "756a6887",
 								   "metadata": {
 								    "ExecuteTime": {
 								     "end_time": "2023-04-04T10:51:25.642282Z",
 								     "start_time": "2023-04-04T10:51:25.635947Z"
 								    }
 								   },
 								   "outputs": [
 								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
-												Update to Vectara integration (#5950)

This PR updates the Vectara integration (@hwchase17 ):
* Adds reuse of requests.session to imrpove efficiency and speed.
* Utilizes Vectara's low-level API (instead of standard API) to better
match user's specific chunking with LangChain
* Now add_texts puts all the texts into a single Vectara document so
indexing is much faster.
* updated variables names from alpha to lambda_val (to be consistent
with Vectara docs) and added n_context_sentence so it's available to use
if needed.
* Updates to documentation and tests

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-10 23:27:01 +00:00
+								      "Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								      "\n",
-												Update to Vectara integration (#5950)

This PR updates the Vectara integration (@hwchase17 ):
* Adds reuse of requests.session to imrpove efficiency and speed.
* Utilizes Vectara's low-level API (instead of standard API) to better
match user's specific chunking with LangChain
* Now add_texts puts all the texts into a single Vectara document so
indexing is much faster.
* updated variables names from alpha to lambda_val (to be consistent
with Vectara docs) and added n_context_sentence so it's available to use
if needed.
* Updates to documentation and tests

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-10 23:27:01 +00:00
+								      "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
 								      "\n",
 								      "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
 								      "\n",
 								      "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n",
 								      "\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								      "Score: 0.4917977\n"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								     ]
 								    }
 								   ],
 								   "source": [
 								    "document, score = found_docs[0]\n",
 								    "print(document.page_content)\n",
 								    "print(f\"\\nScore: {score}\")"
 								   ]
 								  },
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								  {
 								   "attachments": {},
 								   "cell_type": "markdown",
 								   "id": "1f9876a8",
 								   "metadata": {},
 								   "source": [
 								    "Now let's do similar search for content in the files we uploaded"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": 10,
 								   "id": "47784de5",
 								   "metadata": {},
 								   "outputs": [
 								    {
 								     "name": "stdout",
 								     "output_type": "stream",
 								     "text": [
 								      "(Document(page_content='We must forever conduct our struggle on the high plane of dignity and discipline.', metadata={'section': '1'}), 0.7962591)\n",
 								      "(Document(page_content='We must not allow our\\ncreative protests to degenerate into physical violence. . . .', metadata={'section': '1'}), 0.25983918)\n"
 								     ]
 								    }
 								   ],
 								   "source": [
 								    "query = \"We must forever conduct our struggle\"\n",
-												Fix `make docs_build` and related scripts (#7276)

**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-12 02:05:14 +00:00
+								    "found_docs = vectara.similarity_search_with_score(\n",
 								    "    query, filter=\"doc.speech = 'I-have-a-dream'\"\n",
 								    ")\n",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								    "print(found_docs[0])\n",
 								    "print(found_docs[1])"
 								   ]
 								  },
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								  {
-												Scores are explained in vectorestore docs (#5613)

# Scores in Vectorestores' Docs Are Explained

Following vectorestores can return scores with similar documents by
using `similarity_search_with_score`:
- chroma
- docarray_hnsw
- docarray_in_memory
- faiss
- myscale
- qdrant
- supabase
- vectara
- weaviate

However, in documents, these scores were either not explained at all or
explained in a way that could lead to misunderstandings (e.g., FAISS).
For instance in FAISS document: if we consider the score returned by the
function as a similarity score, we understand that a document returning
a higher score is more similar to the source document. However, since
the scores returned by the function are distance scores, we should
understand that smaller scores correspond to more similar documents.

For the libraries other than Vectara, I wrote the scores they use by
investigating from the source libraries. Since I couldn't be certain
about the score metric used by Vectara, I didn't make any changes in its
documentation. The links mentioned in Vectara's documentation became
broken due to updates, so I replaced them with working ones.

VectorStores / Retrievers / Memory
  - @dev2049

my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-06 03:39:49 +00:00
+								   "attachments": {},
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "cell_type": "markdown",
 								   "id": "691a82d6",
 								   "metadata": {},
 								   "source": [
 								    "## Vectara as a Retriever\n",
 								    "\n",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								    "Vectara, as all the other LangChain vectorstores, is most often used as a LangChain Retriever:"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "execution_count": 11,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "id": "9427195f",
 								   "metadata": {
 								    "ExecuteTime": {
 								     "end_time": "2023-04-04T10:51:26.031451Z",
 								     "start_time": "2023-04-04T10:51:26.018763Z"
 								    }
 								   },
 								   "outputs": [
 								    {
 								     "data": {
 								      "text/plain": [
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								       "VectaraRetriever(vectorstore=<langchain.vectorstores.vectara.Vectara object at 0x12772caf0>, search_type='similarity', search_kwargs={'lambda_val': 0.025, 'k': 5, 'filter': '', 'n_sentence_context': '0'})"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								      ]
 								     },
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								     "execution_count": 11,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								     "metadata": {},
 								     "output_type": "execute_result"
 								    }
 								   ],
 								   "source": [
 								    "retriever = vectara.as_retriever()\n",
 								    "retriever"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								   "execution_count": 12,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								   "id": "f3c70c31",
 								   "metadata": {
 								    "ExecuteTime": {
 								     "end_time": "2023-04-04T10:51:26.495652Z",
 								     "start_time": "2023-04-04T10:51:26.046407Z"
 								    }
 								   },
 								   "outputs": [
 								    {
 								     "data": {
 								      "text/plain": [
-												Update to Vectara integration (#5950)

This PR updates the Vectara integration (@hwchase17 ):
* Adds reuse of requests.session to imrpove efficiency and speed.
* Utilizes Vectara's low-level API (instead of standard API) to better
match user's specific chunking with LangChain
* Now add_texts puts all the texts into a single Vectara document so
indexing is much faster.
* updated variables names from alpha to lambda_val (to be consistent
with Vectara docs) and added n_context_sentence so it's available to use
if needed.
* Updates to documentation and tests

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
											
										
										
											2023-06-10 23:27:01 +00:00
+								       "Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'})"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								      ]
 								     },
-												Vectara upd2 (#6506)

Update to Vectara integration 
- By user request added "add_files" to take advantage of Vectara
capabilities to process files on the backend, without the need for
separate loading of documents and chunking in the chain.
- Updated vectara.ipynb example notebook to be broader and added testing
of add_file()
 
  @hwchase17 - project lead

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
											
										
										
											2023-07-02 19:15:50 +00:00
+								     "execution_count": 12,
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								     "metadata": {},
 								     "output_type": "execute_result"
 								    }
 								   ],
 								   "source": [
 								    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
 								    "retriever.get_relevant_documents(query)[0]"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
 								   "id": "2300e785",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": []
 								  }
 								 ],
 								 "metadata": {
 								  "kernelspec": {
 								   "display_name": "Python 3 (ipykernel)",
 								   "language": "python",
 								   "name": "python3"
 								  },
 								  "language_info": {
 								   "codemirror_mode": {
 								    "name": "ipython",
 								    "version": 3
 								   },
 								   "file_extension": ".py",
 								   "mimetype": "text/x-python",
 								   "name": "python",
 								   "nbconvert_exporter": "python",
 								   "pygments_lexer": "ipython3",
-												Updates to Vectara documentation (#8699)

- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-08-04 03:21:17 +00:00
+								   "version": "3.9.1"
-												Vectara (#5069)

# Vectara Integration

This PR provides integration with Vectara. Implemented here are:
* langchain/vectorstore/vectara.py
* tests/integration_tests/vectorstores/test_vectara.py
* langchain/retrievers/vectara_retriever.py
And two IPYNB notebooks to do more testing:
* docs/modules/chains/index_examples/vectara_text_generation.ipynb
* docs/modules/indexes/vectorstores/examples/vectara.ipynb

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											2023-05-24 08:24:58 +00:00
+								  }
 								 },
 								 "nbformat": 4,
 								 "nbformat_minor": 5
 								}