docs: improved `vectorstore` notebooks (#3724)

- Added links to the vectorstore providers
- Added installation code (it was not clear that you have to go to the
  `LangChain Ecosystem` page to get installation instructions).
leo-gan 1 year ago committed by GitHub
parent ad4eae7ef0
commit e510732ad2

@ -185,7 +185,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -555,7 +554,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
"version": "3.10.6"
},
"vscode": {
"interpreter": {

@ -6,15 +6,21 @@
"source": [
"# AnalyticDB\n",
"\n",
"This notebook shows how to use functionality related to the AnalyticDB vector database.\n",
">[AnalyticDB for PostgreSQL](https://www.alibabacloud.com/help/en/analyticdb-for-postgresql/latest/product-introduction-overview) is a massively parallel processing (MPP) data warehousing service that is designed to analyze large volumes of data online.\n",
"\n",
">`AnalyticDB for PostgreSQL` is developed based on the open source `Greenplum Database` project and is enhanced with in-depth extensions by `Alibaba Cloud`. AnalyticDB for PostgreSQL is compatible with the ANSI SQL 2003 syntax and the PostgreSQL and Oracle database ecosystems. AnalyticDB for PostgreSQL also supports row store and column store. AnalyticDB for PostgreSQL processes petabytes of data offline at a high performance level and supports highly concurrent online queries.\n",
"\n",
"This notebook shows how to use functionality related to the `AnalyticDB` vector database.\n",
"To run, you should have an [AnalyticDB](https://www.alibabacloud.com/help/en/analyticdb-for-postgresql/latest/product-introduction-overview) instance up and running:\n",
"- Using [AnalyticDB Cloud Vector Database](https://www.alibabacloud.com/product/hybriddb-postgresql). Click here to fast deploy it."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -24,12 +30,10 @@
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Split documents and get embeddings by call OpenAI API"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
@ -48,6 +52,7 @@
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Connect to AnalyticDB by setting related ENVIRONMENTS.\n",
"```\n",
@ -59,10 +64,7 @@
"```\n",
"\n",
"Then store your embeddings and documents into AnalyticDB"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
@ -90,12 +92,10 @@
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Query and retrieve data"
],
"metadata": {
"collapsed": false
}
]
},
{
"cell_type": "code",
@ -129,13 +129,6 @@
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@ -154,9 +147,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 4
}

@ -1,34 +1,31 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# Annoy\n",
"\n",
"This notebook shows how to use functionality related to the Annoy vector database.\n",
"\n",
"> \"Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.\"\n",
"\n",
"This notebook shows how to use functionality related to the `Annoy` vector database.\n",
"\n",
"via [Annoy](https://github.com/spotify/annoy) \n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "3b450bdc",
"metadata": {},
"source": [
"```{note}\n",
"Annoy is read-only - once the index is built you cannot add any more emebddings!\n",
"If you want to progressively add to your VectorStore then better choose an alternative!\n",
"NOTE: Annoy is read-only - once the index is built you cannot add any more emebddings!\n",
"If you want to progressively add new entries to your VectorStore then better choose an alternative!\n",
"```"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6613d222",
"metadata": {},
@ -123,7 +120,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "4583b231",
"metadata": {},
@ -265,7 +261,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "341390c2",
"metadata": {},
@ -409,7 +404,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "6f570f69",
"metadata": {},
@ -472,7 +466,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "df4beb83",
"metadata": {},
@ -564,7 +557,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -6,7 +6,20 @@
"source": [
"# AtlasDB\n",
"\n",
"This notebook shows you how to use functionality related to the AtlasDB"
"This notebook shows you how to use functionality related to the `AtlasDB`.\n",
"\n",
">[MongoDBs](https://www.mongodb.com/) [Atlas](https://www.mongodb.com/cloud/atlas) is an on-demand fully managed service. `MongoDB Atlas` runs on `AWS`, `Microsoft Azure`, and `Google Cloud Platform`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install spacy"
]
},
{
@ -15,35 +28,50 @@
"metadata": {
"pycharm": {
"is_executing": true
}
},
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"import time\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import SpacyTextSplitter\n",
"from langchain.vectorstores import AtlasDB\n",
"from langchain.document_loaders import TextLoader"
"!python3 -m spacy download en_core_web_sm"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install nomic"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"pycharm": {
"is_executing": true
},
"scrolled": true
"tags": []
},
"outputs": [],
"source": [
"!python -m spacy download en_core_web_sm"
"import time\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import SpacyTextSplitter\n",
"from langchain.vectorstores import AtlasDB\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"ATLAS_TEST_API_KEY = '7xDPkYXSYDc1_ErdTPIcoAR9RNd8YDlkS3nVNXcVoIMZ6'"
@ -51,8 +79,10 @@
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"loader = TextLoader('../../../state_of_the_union.txt')\n",
@ -71,7 +101,8 @@
"metadata": {
"pycharm": {
"is_executing": true
}
},
"tags": []
},
"outputs": [],
"source": [
@ -165,13 +196,6 @@
"source": [
"db.project"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@ -190,9 +214,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 4
}

@ -7,14 +7,68 @@
"source": [
"# Chroma\n",
"\n",
"This notebook shows how to use functionality related to the Chroma vector database."
">[Chroma](https://docs.trychroma.com/getting-started) is a database for building AI applications with embeddings.\n",
"\n",
"This notebook shows how to use functionality related to the `Chroma` vector database."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0825fa4a-d950-4e78-8bba-20cfcc347765",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install chromadb"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "42080f37-8fd1-4cec-acd9-15d2b03b2f4d",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"# get a token: https://platform.openai.com/account/api-keys\n",
"\n",
"from getpass import getpass\n",
"\n",
"OPENAI_API_KEY = getpass()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "c7a94d6c-b4d4-4498-9bdd-eb50c92b85c5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 5,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -25,9 +79,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 8,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@ -41,9 +97,11 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 9,
"id": "5eabdb75",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
@ -94,9 +152,11 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 10,
"id": "72aaa9c8",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"docs = db.similarity_search_with_score(query)"
@ -104,18 +164,20 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 11,
"id": "d88e958e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"(Document(page_content='In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \\n\\nWe cannot let this happen. \\n\\nTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0),\n",
" 0.3913410007953644)"
"(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}),\n",
" 0.3949805498123169)"
]
},
"execution_count": 6,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
@ -170,7 +232,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f568a322",
"metadata": {},
@ -300,7 +361,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -6,24 +6,30 @@
"source": [
"# Deep Lake\n",
"\n",
"This notebook showcases basic functionality related to Deep Lake. While Deep Lake can store embeddings, it is capable of storing any type of data. It is a fully fledged serverless data lake with version control, query engine and streaming dataloader to deep learning frameworks. \n",
">[Deep Lake](https://docs.activeloop.ai/) as a Multi-Modal Vector Store that stores embeddings and their metadata including text, jsons, images, audio, video, and more. It saves the data locally, in your cloud, or on Activeloop storage. It performs hybrid search including embeddings and their attributes.\n",
"\n",
"For more information, please see the Deep Lake [documentation](docs.activeloop.ai) or [api reference](docs.deeplake.ai)"
"This notebook showcases basic functionality related to `Deep Lake`. While `Deep Lake` can store embeddings, it is capable of storing any type of data. It is a fully fledged serverless data lake with version control, query engine and streaming dataloader to deep learning frameworks. \n",
"\n",
"For more information, please see the Deep Lake [documentation](https://docs.activeloop.ai) or [api reference](https://docs.deeplake.ai)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!python3 -m pip install openai deeplake tiktoken"
"!pip install openai deeplake tiktoken"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -33,9 +39,19 @@
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
@ -46,8 +62,10 @@
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@ -61,7 +79,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -70,9 +87,19 @@
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/leo/.local/lib/python3.10/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.3.2) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
@ -80,11 +107,26 @@
"./my_deeplake/ loaded successfully.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Evaluating ingest: 100%|██████████| 1/1 [00:04<00:00\n"
"Evaluating ingest: 100%|██████████████████████████████████████| 1/1 [00:07<00:00\n"
]
},
{
@ -93,17 +135,17 @@
"text": [
"Dataset(path='./my_deeplake/', tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (4, 1536) float32 None \n",
" ids text (4, 1) str None \n",
" metadata json (4, 1) str None \n",
" text text (4, 1) str None \n"
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (42, 1536) float32 None \n",
" ids text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
}
],
"source": [
"db = DeepLake(dataset_path=\"./my_deeplake/\", embedding_function=embeddings, overwrite=True)\n",
"db = DeepLake(dataset_path=\"./my_deeplake/\", embedding_function=embeddings)\n",
"db.add_documents(docs)\n",
"# or shorter\n",
"# db = DeepLake.from_documents(docs, dataset_path=\"./my_deeplake/\", embedding=embeddings, overwrite=True)\n",
@ -113,8 +155,10 @@
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@ -135,7 +179,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -144,8 +187,10 @@
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@ -155,6 +200,11 @@
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": []
},
{
"name": "stderr",
"output_type": "stream",
@ -168,12 +218,12 @@
"text": [
"Dataset(path='./my_deeplake/', read_only=True, tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (4, 1536) float32 None \n",
" ids text (4, 1) str None \n",
" metadata json (4, 1) str None \n",
" text text (4, 1) str None \n"
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (42, 1536) float32 None \n",
" ids text (42, 1) str None \n",
" metadata json (42, 1) str None \n",
" text text (42, 1) str None \n"
]
}
],
@ -183,7 +233,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -191,7 +240,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -200,14 +248,16 @@
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"execution_count": 9,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/media/sdb/davit/Git/experiments/langchain/langchain/llms/openai.py:672: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`\n",
"/home/leo/.local/lib/python3.10/site-packages/langchain/llms/openai.py:624: UserWarning: You are trying to use a chat model. This way of initializing it is no longer supported. Instead, please use: `from langchain.chat_models import ChatOpenAI`\n",
" warnings.warn(\n"
]
}
@ -221,16 +271,18 @@
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"execution_count": 10,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\"The president nominated Ketanji Brown Jackson to serve on the United States Supreme Court, describing her as one of the nation's top legal minds and a consensus builder with a background in private practice and public defense, and noting that she has received broad support from both Democrats and Republicans.\""
"'The president nominated Ketanji Brown Jackson to serve on the United States Supreme Court. He described her as a former top litigator in private practice, a former federal public defender, a consensus builder, and from a family of public school educators and police officers. He also mentioned that she has received broad support from various groups since being nominated.'"
]
},
"execution_count": 53,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@ -241,7 +293,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -325,7 +376,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -357,7 +407,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -389,7 +438,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -406,7 +454,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -429,7 +476,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -533,7 +579,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -594,7 +639,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -638,7 +682,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -815,7 +858,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
"version": "3.10.6"
},
"vscode": {
"interpreter": {
@ -824,5 +867,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
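
The local Deep Lake flow shown above, condensed into one sketch. The dataset path and query are the notebook's examples; `docs` and `embeddings` are prepared as in the earlier cells.

```python
from langchain.vectorstores import DeepLake

# Create (or open) a local dataset and add the embedded documents.
db = DeepLake(dataset_path="./my_deeplake/", embedding_function=embeddings)
db.add_documents(docs)

# Plain similarity search over the stored embeddings.
docs_found = db.similarity_search("What did the president say about Ketanji Brown Jackson")
print(docs_found[0].page_content)
```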

@ -7,14 +7,135 @@
"source": [
"# ElasticSearch\n",
"\n",
"This notebook shows how to use functionality related to the ElasticSearch database."
"[Elasticsearch](https://www.elastic.co/elasticsearch/) is a distributed, RESTful search and analytics engine. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.\n",
"\n",
"This notebook shows how to use functionality related to the `Elasticsearch` database."
]
},
{
"cell_type": "markdown",
"id": "b66c12b2-2a07-4136-ac77-ce1c9fa7a409",
"metadata": {
"tags": []
},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "81f43794-f002-477c-9b68-4975df30e718",
"metadata": {},
"source": [
"Check out [Elasticsearch installation instructions](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html).\n",
"\n",
"To connect to an Elasticsearch instance that does not require\n",
"login credentials, pass the Elasticsearch URL and index name along with the\n",
"embedding object to the constructor.\n",
"\n",
"Example:\n",
"```python\n",
" from langchain import ElasticVectorSearch\n",
" from langchain.embeddings import OpenAIEmbeddings\n",
"\n",
" embedding = OpenAIEmbeddings()\n",
" elastic_vector_search = ElasticVectorSearch(\n",
" elasticsearch_url=\"http://localhost:9200\",\n",
" index_name=\"test_index\",\n",
" embedding=embedding\n",
" )\n",
"```\n",
"\n",
"To connect to an Elasticsearch instance that requires login credentials,\n",
"including Elastic Cloud, use the Elasticsearch URL format\n",
"https://username:password@es_host:9243. For example, to connect to Elastic\n",
"Cloud, create the Elasticsearch URL with the required authentication details and\n",
"pass it to the ElasticVectorSearch constructor as the named parameter\n",
"elasticsearch_url.\n",
"\n",
"You can obtain your Elastic Cloud URL and login credentials by logging in to the\n",
"Elastic Cloud console at https://cloud.elastic.co, selecting your deployment, and\n",
"navigating to the \"Deployments\" page.\n",
"\n",
"To obtain your Elastic Cloud password for the default \"elastic\" user:\n",
"1. Log in to the Elastic Cloud console at https://cloud.elastic.co\n",
"2. Go to \"Security\" > \"Users\"\n",
"3. Locate the \"elastic\" user and click \"Edit\"\n",
"4. Click \"Reset password\"\n",
"5. Follow the prompts to reset the password\n",
"\n",
"Format for Elastic Cloud URLs is\n",
"https://username:password@cluster_id.region_id.gcp.cloud.es.io:9243.\n",
"\n",
"Example:\n",
"```python\n",
" from langchain import ElasticVectorSearch\n",
" from langchain.embeddings import OpenAIEmbeddings\n",
"\n",
" embedding = OpenAIEmbeddings()\n",
"\n",
" elastic_host = \"cluster_id.region_id.gcp.cloud.es.io\"\n",
" elasticsearch_url = f\"https://username:password@{elastic_host}:9243\"\n",
" elastic_vector_search = ElasticVectorSearch(\n",
" elasticsearch_url=elasticsearch_url,\n",
" index_name=\"test_index\",\n",
" embedding=embedding\n",
" )\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "d6197931-cbe5-460c-a5e6-b5eedb83887c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install elasticsearch"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "markdown",
"id": "f6030187-0bd7-4798-8372-a265036af5e0",
"metadata": {
"tags": []
},
"source": [
"## Example"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -25,9 +146,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 5,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@ -43,7 +166,9 @@
"cell_type": "code",
"execution_count": null,
"id": "12eb86d8",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"db = ElasticVectorSearch.from_documents(docs, embeddings, elasticsearch_url=\"http://localhost:9200\")\n",
@ -105,7 +230,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -7,14 +7,65 @@
"source": [
"# FAISS\n",
"\n",
"This notebook shows how to use functionality related to the FAISS vector database."
">[Facebook AI Similarity Search (Faiss)](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.\n",
"\n",
"[Faiss documentation](https://faiss.ai/).\n",
"\n",
"This notebook shows how to use functionality related to the `FAISS` vector database."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac9563e",
"execution_count": null,
"id": "497fcd89-e832-46a7-a74a-c71199666206",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#!pip install faiss\n",
"# OR\n",
"!pip install faiss-cpu"
]
},
{
"cell_type": "markdown",
"id": "38237514-b3fa-44a4-9cff-30cd6bf50073",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key. "
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "47f9b495-88f1-4286-8d5d-1416103931a7",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -25,9 +76,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 7,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@ -41,9 +94,11 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 8,
"id": "5eabdb75",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"db = FAISS.from_documents(docs, embeddings)\n",
@ -54,9 +109,11 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 9,
"id": "4b172de8",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@ -315,7 +372,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -7,24 +7,61 @@
"source": [
"# LanceDB\n",
"\n",
"This notebook shows how to use functionality related to the LanceDB vector database based on the Lance data format."
">[LanceDB](https://lancedb.com/) is an open-source database for vector-search built with persistent storage, which greatly simplifies retrevial, filtering and management of embeddings. Fully open source.\n",
"\n",
"This notebook shows how to use functionality related to the `LanceDB` vector database based on the Lance data format."
]
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": null,
"id": "bfcf346a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#!pip install lancedb"
"!pip install lancedb"
]
},
{
"cell_type": "markdown",
"id": "99134dd1-b91e-486f-8d90-534248e43b9d",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "a0361f5c-e6f4-45f4-b829-11680cf03cec",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
@ -72,9 +109,7 @@
"cell_type": "code",
"execution_count": 14,
"id": "9c608226",
"metadata": {
"scrolled": false
},
"metadata": {},
"outputs": [
{
"name": "stdout",
@ -171,7 +206,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -7,16 +7,63 @@
"source": [
"# Milvus\n",
"\n",
">[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.\n",
"\n",
"This notebook shows how to use functionality related to the Milvus vector database.\n",
"\n",
"To run, you should have a Milvus instance up and running: https://milvus.io/docs/install_standalone-docker.md"
"To run, you should have a [Milvus instance up and running](https://milvus.io/docs/install_standalone-docker.md)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a62cff8a-bcf7-4e33-bbbc-76999c2e3e20",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pymilvus"
]
},
{
"cell_type": "markdown",
"id": "7a0f9e02-8eb0-4aef-b11f-8861360472ee",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8b6ed9cd-81b9-46e5-9c20-5aafca2844d0",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -27,9 +74,11 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 4,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@ -45,7 +94,9 @@
"cell_type": "code",
"execution_count": null,
"id": "dcf88bdf",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"vector_db = Milvus.from_documents(\n",
@ -74,14 +125,6 @@
"source": [
"docs[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a359ed74",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@ -100,7 +143,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -1,37 +1,63 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# MyScale\n",
"\n",
"This notebook shows how to use functionality related to the MyScale vector database."
">[MyScale](https://docs.myscale.com/en/overview/) is a cloud-based database optimized for AI applications and solutions, built on the open-source [ClickHouse](https://github.com/ClickHouse/ClickHouse). \n",
"\n",
"This notebook shows how to use functionality related to the `MyScale` vector database."
]
},
{
"cell_type": "markdown",
"id": "43ead5d5-2c1f-4dce-a69a-cb00e4f9d6f0",
"metadata": {},
"source": [
"## Setting up envrionments"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"execution_count": null,
"id": "7dccc580-8270-4714-ad61-f79783dd6eea",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install clickhouse-connect"
]
},
{
"cell_type": "markdown",
"id": "15a1d477-9cdb-4d82-b019-96951ecb2b72",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "91003ea5-0c8c-436c-a5de-aaeaeef2f458",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import MyScale\n",
"from langchain.document_loaders import TextLoader"
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a9d16fa3",
"metadata": {},
"source": [
"## Setting up envrionments\n",
"\n",
"There are two ways to set up parameters for myscale index.\n",
"\n",
"1. Environment Variables\n",
@ -56,9 +82,26 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import MyScale\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3c3999a",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
@ -126,7 +169,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e3a8b105",
"metadata": {},
@ -145,7 +187,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f59360c0",
"metadata": {},
@ -216,7 +257,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a359ed74",
"metadata": {},
@ -259,7 +299,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -7,7 +7,10 @@
"source": [
"# OpenSearch\n",
"\n",
"This notebook shows how to use functionality related to the OpenSearch database.\n",
"> [OpenSearch](https://opensearch.org/) is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications licensed under Apache 2.0. `OpenSearch` is a distributed search and analytics engine based on `Apache Lucene`.\n",
"\n",
"\n",
"This notebook shows how to use functionality related to the `OpenSearch` database.\n",
"\n",
"To run, you should have the opensearch instance up and running: [here](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/)\n",
"`similarity_search` by default performs the Approximate k-NN Search which uses one of the several algorithms like lucene, nmslib, faiss recommended for\n",
@ -15,6 +18,39 @@
"Check [this](https://opensearch.org/docs/latest/search-plugins/knn/index/) for more details."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6e606066-9386-4427-8a87-1b93f435c57e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install opensearch-py"
]
},
{
"cell_type": "markdown",
"id": "b1fa637e-4fbf-4d5a-9188-2cad826a193e",
"metadata": {},
"source": [
"We want to use OpenAIEmbeddings so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "28e5455e-322d-4010-9e3b-491d522ef5db",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
@ -233,9 +269,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
}
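
A rough sketch of the OpenSearch usage this notebook covers, for a locally running instance. The URL is an assumed local default rather than a value from this diff; see the linked OpenSearch docs for the exact setup.

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import OpenSearchVectorSearch

# Index the documents in a local OpenSearch instance (approximate k-NN search by default).
docsearch = OpenSearchVectorSearch.from_documents(
    docs,
    OpenAIEmbeddings(),
    opensearch_url="http://localhost:9200",  # assumed local default
)

query = "What did the president say about Ketanji Brown Jackson"
print(docsearch.similarity_search(query)[0].page_content)
```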

@ -6,7 +6,38 @@
"source": [
"# PGVector\n",
"\n",
"This notebook shows how to use functionality related to the Postgres vector database (PGVector)."
">[PGVector](https://github.com/pgvector/pgvector) is an open-source vector similarity search for `Postgres`\n",
"\n",
"It supports:\n",
"- exact and approximate nearest neighbor search\n",
"- L2 distance, inner product, and cosine distance\n",
"\n",
"This notebook shows how to use the Postgres vector database (`PGVector`)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See the [installation instruction](https://github.com/pgvector/pgvector)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pgvector"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
@ -14,6 +45,31 @@
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"## Loading Environment Variables\n",
"from typing import List, Tuple\n",
@ -23,8 +79,10 @@
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -182,9 +240,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
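
For context, a sketch of how the pieces above typically fit together. The connection parameters are placeholders for a local Postgres with the `pgvector` extension enabled, not values from this diff.

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.pgvector import PGVector

# Placeholder connection details for a local Postgres with pgvector enabled.
CONNECTION_STRING = PGVector.connection_string_from_db_params(
    driver="psycopg2",
    host="localhost",
    port=5432,
    database="postgres",
    user="postgres",
    password="postgres",
)

db = PGVector.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    collection_name="state_of_the_union",  # illustrative collection name
    connection_string=CONNECTION_STRING,
)

print(db.similarity_search("What did the president say about Ketanji Brown Jackson")[0].page_content)
```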

@ -7,15 +7,75 @@
"source": [
"# Pinecone\n",
"\n",
"This notebook shows how to use functionality related to the Pinecone vector database."
"[Pinecone](https://docs.pinecone.io/docs/overview) is a vector database with broad functionality.\n",
"\n",
"This notebook shows how to use functionality related to the `Pinecone` vector database.\n",
"\n",
"To use Pinecone, you must have an API key. \n",
"Here are the [installation instructions](https://docs.pinecone.io/docs/quickstart)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac9563e",
"execution_count": null,
"id": "b4c41cad-08ef-4f72-a545-2151e4598efe",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pinecone-client"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c1e38361-c1fe-4ac6-86e9-c90ebaf7ae87",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"PINECONE_API_KEY = getpass.getpass('Pinecone API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "02a536e0-d603-4d79-b18b-1ed562977b40",
"metadata": {},
"outputs": [],
"source": [
"PINECONE_ENV = getpass.getpass('Pinecone Environment:')"
]
},
{
"cell_type": "markdown",
"id": "320af802-9271-46ee-948f-d2453933d44b",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ffea66e4-bc23-46a9-9580-b348dfe7b7a7",
"metadata": {},
"outputs": [],
"source": [
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "aac9563e",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
@ -50,8 +110,8 @@
"\n",
"# initialize pinecone\n",
"pinecone.init(\n",
" api_key=\"YOUR_API_KEY\", # find at app.pinecone.io\n",
" environment=\"YOUR_ENV\" # next to api key in console\n",
" api_key=PINECONE_API_KEY, # find at app.pinecone.io\n",
" environment=PINECONE_ENV # next to api key in console\n",
")\n",
"\n",
"index_name = \"langchain-demo\"\n",
@ -100,7 +160,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -7,22 +7,72 @@
"source": [
"# Qdrant\n",
"\n",
"This notebook shows how to use functionality related to the Qdrant vector database. There are various modes of how to run Qdrant, and depending on the chosen one, there will be some subtle differences. The options include:\n",
">[Qdrant](https://qdrant.tech/documentation/) (read: quadrant ) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support. It makes it useful for all sorts of neural network or semantic-based matching, faceted search, and other applications.\n",
"\n",
"\n",
"This notebook shows how to use functionality related to the `Qdrant` vector database. \n",
"\n",
"There are various modes of how to run `Qdrant`, and depending on the chosen one, there will be some subtle differences. The options include:\n",
"- Local mode, no server required\n",
"- On-premise server deployment\n",
"- Qdrant Cloud"
"- Qdrant Cloud\n",
"\n",
"See the [installation instructions](https://qdrant.tech/documentation/install/)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "e03e8460-8f32-4d1f-bb93-4f7636a476fa",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install qdrant-client"
]
},
{
"cell_type": "markdown",
"id": "7b2f111b-357a-4f42-9730-ef0603bdc1b5",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "082e7e8b-ac52-430c-98d6-8f0924457642",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n"
]
}
],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "aac9563e",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:22.282884Z",
"start_time": "2023-04-04T10:51:21.408077Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@ -34,13 +84,14 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 4,
"id": "a3c3999a",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:22.520144Z",
"start_time": "2023-04-04T10:51:22.285826Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@ -70,13 +121,14 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 5,
"id": "8429667e",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:22.525091Z",
"start_time": "2023-04-04T10:51:22.522015Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@ -99,13 +151,14 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 6,
"id": "24b370e2",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:24.827567Z",
"start_time": "2023-04-04T10:51:22.529080Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@ -242,13 +295,14 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 7,
"id": "a8c513ab",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.204469Z",
"start_time": "2023-04-04T10:51:24.855618Z"
}
},
"tags": []
},
"outputs": [],
"source": [
@ -258,13 +312,14 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 8,
"id": "fc516993",
"metadata": {
"ExecuteTime": {
"end_time": "2023-04-04T10:51:25.220984Z",
"start_time": "2023-04-04T10:51:25.213943Z"
}
},
"tags": []
},
"outputs": [
{
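
The "local mode, no server required" option mentioned above, as a minimal sketch; the `location` and `collection_name` values are illustrative assumptions, not part of this diff.

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Qdrant

# Local mode: Qdrant keeps the collection in process memory, no server needed.
qdrant = Qdrant.from_documents(
    docs,
    OpenAIEmbeddings(),
    location=":memory:",             # assumed in-memory local mode
    collection_name="my_documents",  # illustrative collection name
)

print(qdrant.similarity_search("What did the president say about Ketanji Brown Jackson")[0].page_content)
```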

@ -1,20 +1,53 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Redis\n",
"\n",
">[Redis (Remote Dictionary Server)](https://en.wikipedia.org/wiki/Redis) is an in-memory data structure store, used as a distributed, in-memory keyvalue database, cache and message broker, with optional durability.\n",
"\n",
"This notebook shows how to use functionality related to the [Redis vector database](https://redis.com/solutions/use-cases/vector-database/)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install redis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
@ -227,9 +260,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 4
}
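
A sketch of the Redis flow for a locally running server; the `redis_url` and `index_name` are assumed defaults rather than values from this diff.

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.redis import Redis

# Index documents in a local Redis instance with the RediSearch module enabled.
rds = Redis.from_documents(
    docs,
    OpenAIEmbeddings(),
    redis_url="redis://localhost:6379",  # assumed local default
    index_name="link",                   # illustrative index name
)

print(rds.similarity_search("What did the president say about Ketanji Brown Jackson")[0].page_content)
```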

@ -1,17 +1,23 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "683953b3",
"metadata": {},
"source": [
"# SupabaseVectorStore\n",
"# SupabaseVectorStore"
]
},
{
"cell_type": "markdown",
"id": "cc80fa84-1f2f-48b4-bd39-3e6412f012f1",
"metadata": {},
"source": [
">[Supabase](https://supabase.com/docs) is an open source Firebase alternative.\n",
"\n",
"This notebook shows how to use Supabase and `pgvector` as your VectorStore.\n",
"This notebook shows how to use `Supabase` and `pgvector` as your VectorStore.\n",
"\n",
"To run this notebook, please ensure:\n",
"\n",
"- the `pgvector` extension is enabled\n",
"- you have installed the `supabase-py` package\n",
"- that you have created a `match_documents` function in your database\n",
@ -57,23 +63,66 @@
" LIMIT match_count;\n",
" END;\n",
" $$;\n",
"```\n"
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "6bd4498b",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# with pip\n",
"# !pip install supabase\n",
"!pip install supabase\n",
"\n",
"# with conda\n",
"# !conda install -c conda-forge supabase"
]
},
{
"cell_type": "markdown",
"id": "69bff365-3039-4ff8-a641-aa190166179d",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19846a7b-99bc-47a7-8e1c-f13c2497f1ae",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c71c3901-d44b-4d09-92c5-3018628c28fa",
"metadata": {},
"outputs": [],
"source": [
"os.environ['SUPABASE_URL'] = getpass.getpass('Supabase URL:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b91ecfa-f61b-489a-a337-dff1f12f6ab2",
"metadata": {},
"outputs": [],
"source": [
"os.environ['SUPABASE_SERVICE_KEY'] = getpass.getpass('Supabase Service Key:')"
]
},
{
"cell_type": "code",
"execution_count": 2,
@ -391,7 +440,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -7,14 +7,73 @@
"source": [
"# Weaviate\n",
"\n",
"This notebook shows how to use functionality related to the Weaviate vector database."
">[Weaviate](https://weaviate.io/) is an open-source vector database. It allows you to store data objects and vector embeddings from your favorite ML-models, and scale seamlessly into billions of data objects.\n",
"\n",
"This notebook shows how to use functionality related to the `Weaviate`vector database.\n",
"\n",
"See the `Weaviate` [installation instructions](https://weaviate.io/developers/weaviate/installation)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e9ab167c-fffc-4d30-b1c1-37cc1b641698",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install weaviate-client"
]
},
{
"cell_type": "markdown",
"id": "6b34828d-e627-4d85-aabd-eeb15d9f4b00",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37697b9f-fbb2-430e-b95d-28d6eb83486d",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fea2dbae-a609-4458-a05f-f1c8e1f37c6f",
"metadata": {},
"outputs": [],
"source": [
"WEAVIATE_URL = getpass.getpass('WEAVIATE_URL:')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "53b7ce2d-3c09-4d1c-b66b-5769ce6746ae",
"metadata": {},
"outputs": [],
"source": [
"os.environ['WEAVIATE_API_KEY'] = getpass.getpass('WEAVIATE_API_KEY:')"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aac9563e",
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
@ -156,7 +215,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -7,22 +7,44 @@
"source": [
"# Zilliz\n",
"\n",
">[Zilliz Cloud](https://zilliz.com/doc/quick_start) is a fully managed service on cloud for `LF AI Milvus®`,\n",
"\n",
"This notebook shows how to use functionality related to the Zilliz Cloud managed vector database.\n",
"\n",
"To run, you should have a Zilliz Cloud instance up and running: https://zilliz.com/cloud"
"To run, you should have a `Zilliz Cloud` instance up and running. Here are the [installation instructions](https://zilliz.com/cloud)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aac9563e",
"id": "c0c50102-e6ac-4475-a930-49c94ed0bd99",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install pymilvus"
]
},
{
"cell_type": "markdown",
"id": "4b25e246-ffe7-4822-a6bf-85d1a120df00",
"metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d6691489-1ebc-40fa-bc09-b0916903a24d",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Milvus\n",
"from langchain.document_loaders import TextLoader"
"import os\n",
"import getpass\n",
"\n",
"os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API Key:')"
]
},
{
@ -37,6 +59,19 @@
"ZILLIZ_CLOUD_PORT = \"\" #example: \"19532\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aac9563e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Milvus\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
@ -104,7 +139,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.9"
"version": "3.10.6"
}
},
"nbformat": 4,

@ -11,9 +11,9 @@
"source": [
"# Getting Started\n",
"\n",
"This notebook showcases basic functionality related to VectorStores. A key part of working with vectorstores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [embedding notebook](embeddings.ipynb) before diving into this.\n",
"This notebook showcases basic functionality related to VectorStores. A key part of working with vectorstores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [embedding notebook](../../models/text_embedding.htpl) before diving into this.\n",
"\n",
"This covers generic high level functionality related to all vector stores. For guides on specific vectorstores, please see the how-to guides [here](../how_to_guides.rst)"
"This covers generic high level functionality related to all vector stores."
]
},
{
@ -265,7 +265,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.6"
}
},
"nbformat": 4,
