langchain/docs/extras/ecosystem/integrations/vectara/index.mdx

# Vectara


What is Vectara?

**Vectara Overview:**
- Vectara is developer-first API platform for building GenAI applications
- To use Vectara - first [sign up](https://console.vectara.com/signup) and create an account. Then create a corpus and an API key for indexing and searching.
- You can use Vectara's [indexing API](https://docs.vectara.com/docs/indexing-apis/indexing) to add documents into Vectara's index
- You can use Vectara's [Search API](https://docs.vectara.com/docs/search-apis/search) to query Vectara's index (which also supports Hybrid search implicitly).
- You can use Vectara's integration with LangChain as a Vector store or using the Retriever abstraction.

## Installation and Setup
To use Vectara with LangChain no special installation steps are required. You just have to provide your customer_id, corpus ID, and an API key created within the Vectara console to enable indexing and searching.

Alternatively these can be provided as environment variables
- export `VECTARA_CUSTOMER_ID`="your_customer_id"
- export `VECTARA_CORPUS_ID`="your_corpus_id"
- export `VECTARA_API_KEY`="your-vectara-api-key"

## Usage

### VectorStore

There exists a wrapper around the Vectara platform, allowing you to use it as a vectorstore, whether for semantic search or example selection.

To import this vectorstore:
```python
from langchain.vectorstores import Vectara
```

To create an instance of the Vectara vectorstore:
```python
vectara = Vectara(
    vectara_customer_id=customer_id, 
    vectara_corpus_id=corpus_id, 
    vectara_api_key=api_key
)
```
The customer_id, corpus_id and api_key are optional, and if they are not supplied will be read from the environment variables `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`, respectively.

After you have the vectorstore, you can `add_texts` or `add_documents` as per the standard `VectorStore` interface, for example:

```python
vectara.add_texts(["to be or not to be", "that is the question"])
```


Since Vectara supports file-upload, we also added the ability to upload files (PDF, TXT, HTML, PPT, DOC, etc) directly as file. When using this method, the file is uploaded directly to the Vectara backend, processed and chunked optimally there, so you don't have to use the LangChain document loader or chunking mechanism.

As an example:

```python
vectara.add_files(["path/to/file1.pdf", "path/to/file2.pdf",...])
```

To query the vectorstore, you can use the `similarity_search` method (or `similarity_search_with_score`), which takes a query string and returns a list of results:
```python
results = vectara.similarity_score("what is LangChain?")
```

`similarity_search_with_score` also supports the following additional arguments:
- `k`: number of results to return (defaults to 5)
- `lambda_val`: the [lexical matching](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) factor for hybrid search (defaults to 0.025)
- `filter`: a [filter](https://docs.vectara.com/docs/common-use-cases/filtering-by-metadata/filter-overview) to apply to the results (default None)
- `n_sentence_context`: number of sentences to include before/after the actual matching segment when returning results. This defaults to 0 so as to return the exact text segment that matches, but can be used with other values e.g. 2 or 3 to return adjacent text segments.

The results are returned as a list of relevant documents, and a relevance score of each document.


For a more detailed examples of using the Vectara wrapper, see one of these two sample notebooks:
* [Chat Over Documents with Vectara](./vectara_chat.html)
* [Vectara Text Generation](./vectara_text_generation.html)
Vectara (#5069) # Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> 2023-05-24 08:24:58 +00:00			`# Vectara`


			`What is Vectara?`

			`Vectara Overview:`
Update to Vectara integration (#5950) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to imrpove efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * updated variables names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> 2023-06-10 23:27:01 +00:00			`- Vectara is developer-first API platform for building GenAI applications`
Vectara (#5069) # Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> 2023-05-24 08:24:58 +00:00			`- To use Vectara - first [sign up](https://console.vectara.com/signup) and create an account. Then create a corpus and an API key for indexing and searching.`
			`- You can use Vectara's [indexing API](https://docs.vectara.com/docs/indexing-apis/indexing) to add documents into Vectara's index`
			`- You can use Vectara's [Search API](https://docs.vectara.com/docs/search-apis/search) to query Vectara's index (which also supports Hybrid search implicitly).`
			`- You can use Vectara's integration with LangChain as a Vector store or using the Retriever abstraction.`

			`## Installation and Setup`
			`To use Vectara with LangChain no special installation steps are required. You just have to provide your customer_id, corpus ID, and an API key created within the Vectara console to enable indexing and searching.`

Update to Vectara integration (#5950) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to imrpove efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * updated variables names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> 2023-06-10 23:27:01 +00:00			`Alternatively these can be provided as environment variables`
			- export `VECTARA_CUSTOMER_ID`="your_customer_id"
			- export `VECTARA_CORPUS_ID`="your_corpus_id"
			- export `VECTARA_API_KEY`="your-vectara-api-key"

			`## Usage`

Vectara (#5069) # Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> 2023-05-24 08:24:58 +00:00			`### VectorStore`

			`There exists a wrapper around the Vectara platform, allowing you to use it as a vectorstore, whether for semantic search or example selection.`

			`To import this vectorstore:`
			```python
			`from langchain.vectorstores import Vectara`
			```

			`To create an instance of the Vectara vectorstore:`
			```python
			`vectara = Vectara(`
			`vectara_customer_id=customer_id,`
			`vectara_corpus_id=corpus_id,`
			`vectara_api_key=api_key`
			`)`
			```
			The customer_id, corpus_id and api_key are optional, and if they are not supplied will be read from the environment variables `VECTARA_CUSTOMER_ID`, `VECTARA_CORPUS_ID` and `VECTARA_API_KEY`, respectively.

codespell: workflow, config + some (quite a few) typos fixed (#6785) Probably the most boring PR to review ;) Individual commits might be easier to digest --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> 2023-07-12 20:20:08 +00:00			After you have the vectorstore, you can `add_texts` or `add_documents` as per the standard `VectorStore` interface, for example:
Vectara upd2 (#6506) Update to Vectara integration - By user request added "add_files" to take advantage of Vectara capabilities to process files on the backend, without the need for separate loading of documents and chunking in the chain. - Updated vectara.ipynb example notebook to be broader and added testing of add_file() @hwchase17 - project lead --------- Co-authored-by: rlm <pexpresss31@gmail.com> 2023-07-02 19:15:50 +00:00
			```python
			`vectara.add_texts(["to be or not to be", "that is the question"])`
			```


			`Since Vectara supports file-upload, we also added the ability to upload files (PDF, TXT, HTML, PPT, DOC, etc) directly as file. When using this method, the file is uploaded directly to the Vectara backend, processed and chunked optimally there, so you don't have to use the LangChain document loader or chunking mechanism.`

			`As an example:`

			```python
			`vectara.add_files(["path/to/file1.pdf", "path/to/file2.pdf",...])`
			```

Update to Vectara integration (#5950) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to imrpove efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * updated variables names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> 2023-06-10 23:27:01 +00:00			To query the vectorstore, you can use the `similarity_search` method (or `similarity_search_with_score`), which takes a query string and returns a list of results:
			```python
			`results = vectara.similarity_score("what is LangChain?")`
			```

			`similarity_search_with_score` also supports the following additional arguments:
			- `k`: number of results to return (defaults to 5)
			- `lambda_val`: the [lexical matching](https://docs.vectara.com/docs/api-reference/search-apis/lexical-matching) factor for hybrid search (defaults to 0.025)
			- `filter`: a [filter](https://docs.vectara.com/docs/common-use-cases/filtering-by-metadata/filter-overview) to apply to the results (default None)
			- `n_sentence_context`: number of sentences to include before/after the actual matching segment when returning results. This defaults to 0 so as to return the exact text segment that matches, but can be used with other values e.g. 2 or 3 to return adjacent text segments.

			`The results are returned as a list of relevant documents, and a relevance score of each document.`

Vectara (#5069) # Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> 2023-05-24 08:24:58 +00:00
Update to Vectara integration (#5950) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to imrpove efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * updated variables names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> 2023-06-10 23:27:01 +00:00			`For a more detailed examples of using the Vectara wrapper, see one of these two sample notebooks:`
docs/fix links (#6498) 2023-06-20 21:06:50 +00:00			`* [Chat Over Documents with Vectara](./vectara_chat.html)`
			`* [Vectara Text Generation](./vectara_text_generation.html)`
Vectara (#5069) # Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> 2023-05-24 08:24:58 +00:00