docs: `integrations/providers` (#9631)

Added missing pages for `integrations/providers` from `vectorstores`.
Updated several `vectorstores` notebooks.
pull/9637/head
Leonid Ganeline 1 year ago committed by GitHub
parent b2d9970fc1
commit e1f4f9ac3e

@ -1,27 +1,19 @@
# AtlasDB
# Atlas
This page covers how to use Nomic's Atlas ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Atlas wrappers.
>[Nomic Atlas](https://docs.nomic.ai/index.html) is a platform for interacting with both
> small and internet scale unstructured datasets.
## Installation and Setup
- Install the Python package with `pip install nomic`
- Nomic is also included in LangChain's Poetry extras: `poetry install -E all`
## Wrappers
### VectorStore
There exists a wrapper around the Atlas neural database, allowing you to use it as a vectorstore.
This vectorstore also gives you full access to the underlying AtlasProject object, which will allow you to use the full range of Atlas map interactions, such as bulk tagging and automatic topic modeling.
Please see [the Atlas docs](https://docs.nomic.ai/atlas_api.html) for more detailed information.
## Installation and Setup
- Install the Python package with `pip install nomic`
- `Nomic` is also included in LangChain's Poetry extras: `poetry install -E all`
## VectorStore
See a [usage example](/docs/integrations/vectorstores/atlas).
To import this vectorstore:
```python
from langchain.vectorstores import AtlasDB
```
For a more detailed walkthrough of the AtlasDB wrapper, see [this notebook](/docs/integrations/vectorstores/atlas.html)

@ -0,0 +1,25 @@
# ClickHouse
> [ClickHouse](https://clickhouse.com/) is the fast and resource efficient open-source database for real-time
> apps and analytics with full SQL support and a wide range of functions to assist users in writing analytical queries.
> It has data structures and distance search functions (like `L2Distance`) as well as
> [approximate nearest neighbor search indexes](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes),
> which enable ClickHouse to be used as a high-performance and scalable vector database to store and search vectors with SQL.
## Installation and Setup
We need to install the `clickhouse-connect` Python package.
```bash
pip install clickhouse-connect
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/clickhouse).
```python
from langchain.vectorstores import Clickhouse, ClickhouseSettings
```
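To make the `L2Distance` mention above concrete, here is a minimal pure-Python sketch of exact nearest-neighbor search by Euclidean distance. It does not talk to ClickHouse at all; the helper names `l2_distance` and `nearest` and the sample `rows` are hypothetical and only illustrate the kind of ranking a distance function makes possible.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, rows, k=1):
    """Return the k (distance, row_id) pairs closest to `query`, nearest first."""
    scored = sorted((l2_distance(query, vec), row_id) for row_id, vec in rows.items())
    return scored[:k]


# Hypothetical stored embeddings, keyed by row id.
rows = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 1.0]}
top = nearest([0.0, 0.0], rows, k=2)
print(top)
```

This scan compares the query against every stored vector, which is exact but linear in the number of rows; approximate nearest neighbor indexes exist to avoid that full scan on large tables.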

@ -0,0 +1,30 @@
# DocArray
> [DocArray](https://docarray.jina.ai/) is a library for nested, unstructured, multimodal data in transit,
> including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process,
> embed, search, recommend, store, and transfer multimodal data with a Pythonic API.
## Installation and Setup
We need to install the `docarray` Python package.
```bash
pip install docarray
```
## Vector Store
LangChain provides access to the `In-memory` and `HNSW` vector stores from the `DocArray` library.
See a [usage example](/docs/integrations/vectorstores/docarray_hnsw).
```python
from langchain.vectorstores import DocArrayHnswSearch
```
See a [usage example](/docs/integrations/vectorstores/docarray_in_memory).
```python
from langchain.vectorstores import DocArrayInMemorySearch
```

@ -0,0 +1,32 @@
# Facebook Faiss
>[Facebook AI Similarity Search (Faiss)](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/)
> is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that
> search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting
> code for evaluation and parameter tuning.
[Faiss documentation](https://faiss.ai/).
## Installation and Setup
We need to install the `faiss` Python package.
```bash
pip install faiss-gpu # For CUDA 7.5+ supported GPUs.
```
```
OR
```bash
pip install faiss-cpu # For CPU Installation
```
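Both distributions above are imported under the same module name, `faiss`, so a quick way to confirm that either one is present is to check whether that module can be found. The following is a minimal sketch using only the standard library; `module_available` is a hypothetical helper name, not part of Faiss or LangChain.

```python
import importlib.util


def module_available(module_name):
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


if module_available("faiss"):
    print("faiss is importable")
else:
    print("faiss is missing; install faiss-cpu or faiss-gpu")
```

The check only locates the module and does not import it, so it stays cheap even when the GPU build is installed.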
## Vector Store
See a [usage example](/docs/integrations/vectorstores/faiss).
```python
from langchain.vectorstores import FAISS
```

@ -0,0 +1,25 @@
# Google Vertex AI MatchingEngine
> [Google Vertex AI Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview) provides
> the industry's leading high-scale low latency vector database. These vector databases are commonly
> referred to as vector similarity-matching or an approximate nearest neighbor (ANN) service.
## Installation and Setup
We need to install several Python packages.
```bash
pip install tensorflow \
google-cloud-aiplatform \
tensorflow-hub \
tensorflow-text
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/matchingengine).
```python
from langchain.vectorstores import MatchingEngine
```

@ -0,0 +1,30 @@
# Meilisearch
> [Meilisearch](https://meilisearch.com) is an open-source, lightning-fast, and hyper
> relevant search engine.
> It comes with great defaults to help developers build snappy search experiences.
>
> You can [self-host Meilisearch](https://www.meilisearch.com/docs/learn/getting_started/installation#local-installation)
> or run on [Meilisearch Cloud](https://www.meilisearch.com/pricing).
>
>`Meilisearch v1.3` supports vector search.
## Installation and Setup
See a [usage example](/docs/integrations/vectorstores/meilisearch) for detailed configuration instructions.
We need to install the `meilisearch` Python package.
```bash
pip install meilisearch
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/meilisearch).
```python
from langchain.vectorstores import Meilisearch
```

@ -0,0 +1,24 @@
# MongoDB Atlas
>[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully-managed cloud
> database available in AWS, Azure, and GCP. It now has support for native
> Vector Search on the MongoDB document data.
## Installation and Setup
See [detailed configuration instructions](/docs/integrations/vectorstores/mongodb_atlas).
We need to install the `pymongo` Python package.
```bash
pip install pymongo
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/mongodb_atlas).
```python
from langchain.vectorstores import MongoDBAtlasVectorSearch
```

@ -0,0 +1,24 @@
# Postgres Embedding
> [pg_embedding](https://github.com/neondatabase/pg_embedding) is an open-source package for
> vector similarity search using `Postgres` and the `Hierarchical Navigable Small Worlds`
> algorithm for approximate nearest neighbor search.
## Installation and Setup
We need to install several Python packages.
```bash
pip install openai
pip install psycopg2-binary
pip install tiktoken
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/pgembedding).
```python
from langchain.vectorstores import PGEmbedding
```

@ -0,0 +1,29 @@
# ScaNN
>[Google ScaNN](https://github.com/google-research/google-research/tree/master/scann)
> (Scalable Nearest Neighbors) is a python package.
>
>`ScaNN` is a method for efficient vector similarity search at scale.
>ScaNN includes search space pruning and quantization for Maximum Inner
> Product Search and also supports other distance functions such as
> Euclidean distance. The implementation is optimized for x86 processors
> with AVX2 support. See its [Google Research github](https://github.com/google-research/google-research/tree/master/scann)
> for more details.
## Installation and Setup
We need to install the `scann` Python package.
```bash
pip install scann
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/scann).
```python
from langchain.vectorstores import ScaNN
```
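As a point of reference for the Maximum Inner Product Search mentioned above, here is a minimal brute-force sketch in pure Python. It is not ScaNN's algorithm: ScaNN approximates this ranking with search space pruning and quantization, while this sketch scores every vector exactly. The names `inner_product` and `max_inner_product` and the sample data are hypothetical.

```python
def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def max_inner_product(query, vectors, k=1):
    """Return indices of the k vectors with the largest inner product, best first."""
    order = sorted(
        range(len(vectors)),
        key=lambda i: inner_product(query, vectors[i]),
        reverse=True,
    )
    return order[:k]


# Hypothetical dataset of three 2-dimensional vectors.
vectors = [[1.0, 0.0], [0.0, 2.0], [0.5, 1.0]]
best = max_inner_product([1.0, 1.0], vectors, k=2)
print(best)
```

Unlike Euclidean distance, the inner product rewards vector magnitude as well as direction, which is why the longer vector at index 1 ranks first here.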

@ -0,0 +1,26 @@
# Supabase (Postgres)
>[Supabase](https://supabase.com/docs) is an open source `Firebase` alternative.
> `Supabase` is built on top of `PostgreSQL`, which offers strong `SQL`
> querying capabilities and enables a simple interface with already-existing tools and frameworks.
>[PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL), also known as `Postgres`,
> is a free and open-source relational database management system (RDBMS)
> emphasizing extensibility and `SQL` compliance.
## Installation and Setup
We need to install the `supabase` Python package.
```bash
pip install supabase
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/supabase).
```python
from langchain.vectorstores import SupabaseVectorStore
```

@ -0,0 +1,25 @@
# USearch
>[USearch](https://unum-cloud.github.io/usearch/) is a Smaller & Faster Single-File Vector Search Engine.
>`USearch's` base functionality is identical to `FAISS`, and the interface should look
> familiar if you have ever investigated Approximate Nearest Neighbors search.
> `USearch` and `FAISS` both employ the `HNSW` algorithm, but they differ significantly
> in their design principles. `USearch` is compact and broadly compatible with FAISS without
> sacrificing performance, with a primary focus on user-defined metrics and fewer dependencies.
>
## Installation and Setup
We need to install the `usearch` Python package.
```bash
pip install usearch
```
## Vector Store
See a [usage example](/docs/integrations/vectorstores/usearch).
```python
from langchain.vectorstores import USearch
```

@ -0,0 +1,28 @@
# Xata
> [Xata](https://xata.io) is a serverless data platform, based on `PostgreSQL`.
> It provides a Python SDK for interacting with your database, and a UI
> for managing your data.
> `Xata` has a native vector type, which can be added to any table, and
> supports similarity search. LangChain inserts vectors directly to `Xata`,
> and queries it for the nearest neighbors of a given vector, so that you can
> use all the LangChain Embeddings integrations with `Xata`.
## Installation and Setup
We need to install the `xata` Python package.
```bash
pip install xata==1.0.0a7
```
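Because the command above pins a pre-release version, it can help to confirm which version actually ended up in the environment. The following is a small sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and the `pinned` value simply repeats the version from the install command.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


pinned = "1.0.0a7"  # version taken from the pip command above
found = installed_version("xata")
if found is None:
    print("xata is not installed")
elif found != pinned:
    print(f"xata {found} is installed, but {pinned} is the pinned version")
else:
    print(f"xata {pinned} is installed")
```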
## Vector Store
See a [usage example](/docs/integrations/vectorstores/xata).
```python
from langchain.vectorstores import XataVectorStore
```

@ -5,7 +5,7 @@
"id": "683953b3",
"metadata": {},
"source": [
"# ClickHouse Vector Search\n",
"# ClickHouse\n",
"\n",
"> [ClickHouse](https://clickhouse.com/) is the fastest and most resource efficient open-source database for real-time apps and analytics with full SQL support and a wide range of functions to assist users in writing analytical queries. Lately added data structures and distance search functions (like `L2Distance`) as well as [approximate nearest neighbor search indexes](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes) enable ClickHouse to be used as a high performance and scalable vector database to store and search vectors with SQL.\n",
"\n",
@ -198,8 +198,7 @@
"ExecuteTime": {
"end_time": "2023-06-03T08:28:58.252991Z",
"start_time": "2023-06-03T08:28:58.197560Z"
},
"scrolled": false
}
},
"outputs": [
{
@ -246,9 +245,7 @@
"cell_type": "code",
"execution_count": 8,
"id": "54f4f561",
"metadata": {
"scrolled": false
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@ -395,7 +392,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -1,20 +1,18 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "2ce41f46-5711-4311-b04d-2fe233ac5b1b",
"metadata": {},
"source": [
"# DocArrayHnswSearch\n",
"# DocArray HnswSearch\n",
"\n",
">[DocArrayHnswSearch](https://docs.docarray.org/user_guide/storing/index_hnswlib/) is a lightweight Document Index implementation provided by [Docarray](https://docs.docarray.org/) that runs fully locally and is best suited for small- to medium-sized datasets. It stores vectors on disk in [hnswlib](https://github.com/nmslib/hnswlib), and stores all other data in [SQLite](https://www.sqlite.org/index.html).\n",
">[DocArrayHnswSearch](https://docs.docarray.org/user_guide/storing/index_hnswlib/) is a lightweight Document Index implementation provided by [Docarray](https://github.com/docarray/docarray) that runs fully locally and is best suited for small- to medium-sized datasets. It stores vectors on disk in [hnswlib](https://github.com/nmslib/hnswlib), and stores all other data in [SQLite](https://www.sqlite.org/index.html).\n",
"\n",
"This notebook shows how to use functionality related to the `DocArrayHnswSearch`."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7ee37d28",
"metadata": {},
@ -57,7 +55,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "8dbb6de2",
"metadata": {
@ -103,7 +100,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ed6f905b-4853-4a44-9730-614aa8e22b78",
"metadata": {},
@ -151,7 +147,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3febb987-e903-416f-af26-6897d84c8d61",
"metadata": {},
@ -160,7 +155,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "bb1df11a",
"metadata": {},
@ -236,7 +230,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -1,20 +1,18 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "a3afefb0-7e99-4912-a222-c6b186da11af",
"metadata": {},
"source": [
"# DocArrayInMemorySearch\n",
"# DocArray InMemorySearch\n",
"\n",
">[DocArrayInMemorySearch](https://docs.docarray.org/user_guide/storing/index_in_memory/) is a document index provided by [Docarray](https://docs.docarray.org/) that stores documents in memory. It is a great starting point for small datasets, where you may not want to launch a database server.\n",
">[DocArrayInMemorySearch](https://docs.docarray.org/user_guide/storing/index_in_memory/) is a document index provided by [Docarray](https://github.com/docarray/docarray) that stores documents in memory. It is a great starting point for small datasets, where you may not want to launch a database server.\n",
"\n",
"This notebook shows how to use functionality related to the `DocArrayInMemorySearch`."
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "5031a3ec",
"metadata": {},
@ -56,7 +54,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6e57a389-f637-4b8f-9ab2-759ae7485f78",
"metadata": {},
@ -98,7 +95,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "efbb6684-3846-4332-a624-ddd4d75844c1",
"metadata": {},
@ -146,7 +142,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "43896697-f99e-47b6-9117-47a25e9afa9c",
"metadata": {},
@ -155,7 +150,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "414a9bc9",
"metadata": {},
@ -224,7 +218,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -5,7 +5,7 @@
"id": "683953b3",
"metadata": {},
"source": [
"# FAISS\n",
"# Faiss\n",
"\n",
">[Facebook AI Similarity Search (Faiss)](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.\n",
"\n",
@ -596,7 +596,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -5,9 +5,9 @@
"id": "655b8f55-2089-4733-8b09-35dea9580695",
"metadata": {},
"source": [
"# MatchingEngine\n",
"# Google Vertex AI MatchingEngine\n",
"\n",
"This notebook shows how to use functionality related to the GCP Vertex AI `MatchingEngine` vector database.\n",
"This notebook shows how to use functionality related to the `GCP Vertex AI MatchingEngine` vector database.\n",
"\n",
"> Vertex AI [Matching Engine](https://cloud.google.com/vertex-ai/docs/matching-engine/overview) provides the industry's leading high-scale low latency vector database. These vector databases are commonly referred to as vector similarity-matching or an approximate nearest neighbor (ANN) service.\n",
"\n",
@ -348,7 +348,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -197,7 +197,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -205,7 +204,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -229,7 +227,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -298,9 +295,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

@ -1,14 +1,13 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# MongoDB Atlas\n",
"\n",
">[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully-managed cloud database available in AWS , Azure, and GCP. It now has support for native Vector Search on your MongoDB document data.\n",
">[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully-managed cloud database available in AWS, Azure, and GCP. It now has support for native Vector Search on your MongoDB document data.\n",
"\n",
"This notebook shows how to use `MongoDB Atlas Vector Search` to store your embeddings in MongoDB documents, create a vector search index, and perform KNN search with an approximate nearest neighbor algorithm.\n",
"\n",
@ -44,7 +43,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "457ace44-1d95-4001-9dd5-78811ab208ad",
"metadata": {},
@ -63,7 +61,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1f3ecc42",
"metadata": {},
@ -147,7 +144,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "851a2ec9-9390-49a4-8412-3e132c9f789d",
"metadata": {},
@ -191,7 +187,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -1,18 +1,17 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "1292f057",
"metadata": {},
"source": [
"# pg_embedding\n",
"# Postgres Embedding\n",
"\n",
"> [pg_embedding](https://github.com/neondatabase/pg_embedding) is an open-source vector similarity search for `Postgres` that uses Hierarchical Navigable Small Worlds for approximate nearest neighbor search.\n",
"> [Postgres Embedding](https://github.com/neondatabase/pg_embedding) is an open-source vector similarity search for `Postgres` that uses `Hierarchical Navigable Small Worlds (HNSW)` for approximate nearest neighbor search.\n",
"\n",
"It supports:\n",
"- exact and approximate nearest neighbor search using HNSW\n",
"- L2 distance\n",
">It supports:\n",
">- exact and approximate nearest neighbor search using HNSW\n",
">- L2 distance\n",
"\n",
"This notebook shows how to use the Postgres vector database (`PGEmbedding`).\n",
"\n",
@ -36,7 +35,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b2e49694",
"metadata": {},
@ -158,7 +156,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7ef7b052",
"metadata": {},
@ -167,7 +164,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "939151f7",
"metadata": {},
@ -192,7 +188,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f9510e6b",
"metadata": {},
@ -214,7 +209,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "7adacf29",
"metadata": {},
@ -236,7 +230,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "528893fb",
"metadata": {},
@ -330,7 +323,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -182,7 +182,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
@ -10,7 +9,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "cc80fa84-1f2f-48b4-bd39-3e6412f012f1",
"metadata": {},
@ -87,7 +85,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "69bff365-3039-4ff8-a641-aa190166179d",
"metadata": {},
@ -237,7 +234,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "18152965",
"metadata": {},
@ -246,7 +242,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ea13e80a",
"metadata": {},
@ -287,7 +282,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "794a7552",
"metadata": {},
@ -439,7 +433,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -8,7 +8,7 @@
"# USearch\n",
">[USearch](https://unum-cloud.github.io/usearch/) is a Smaller & Faster Single-File Vector Search Engine\n",
"\n",
"USearch's base functionality is identical to FAISS, and the interface should look familiar if you have ever investigated Approximate Nearest Neigbors search. FAISS is a widely recognized standard for high-performance vector search engines. USearch and FAISS both employ the same HNSW algorithm, but they differ significantly in their design principles. USearch is compact and broadly compatible without sacrificing performance, with a primary focus on user-defined metrics and fewer dependencies."
">USearch's base functionality is identical to FAISS, and the interface should look familiar if you have ever investigated Approximate Nearest Neigbors search. FAISS is a widely recognized standard for high-performance vector search engines. USearch and FAISS both employ the same HNSW algorithm, but they differ significantly in their design principles. USearch is compact and broadly compatible without sacrificing performance, with a primary focus on user-defined metrics and fewer dependencies."
]
},
{
@ -187,7 +187,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -232,7 +232,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.12"
}
},
"nbformat": 4,
