docs: Add Google Firestore Vectorstore doc (#20078)

- **Description:**Add Google Firestore Vector store docs
    - **Issue:** NA
    - **Dependencies:** NA

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
This commit is contained in:
Averi Kitsch 2024-04-15 13:09:32 -07:00 committed by GitHub
parent cc3c343673
commit 30b00090ef
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
2 changed files with 414 additions and 0 deletions

View File

@ -486,6 +486,21 @@ See [usage example](/docs/integrations/vectorstores/google_spanner).
from langchain_google_spanner import SpannerVectorStore
```
### Firestore (Native Mode)
> [Google Cloud Firestore](https://cloud.google.com/firestore/docs/) is a NoSQL document database built for automatic scaling, high performance, and ease of application development.
Install the python package:
```bash
pip install langchain-google-firestore
```
See [usage example](/docs/integrations/vectorstores/google_firestore).
```python
from langchain_google_firestore import FirestoreVectorstore
```
### Cloud SQL for MySQL
> [Google Cloud SQL for MySQL](https://cloud.google.com/sql) is a fully-managed database service that helps you set up, maintain, manage, and administer your MySQL relational databases on Google Cloud.

View File

@ -0,0 +1,399 @@
{
"cells": [
{
"cell_type": "raw",
"id": "1957f5cb",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Firestore\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "ef1f0986",
"metadata": {},
"source": [
"# Google Firestore (Native Mode)\n",
"\n",
"> [Firestore](https://cloud.google.com/firestore) is a serverless document-oriented database that scales to meet any demand. Extend your database application to build AI-powered experiences leveraging Firestore's Langchain integrations.\n",
"\n",
"This notebook goes over how to use [Firestore](https://cloud.google.com/firestore) to to store vectors and query them using the `FirestoreVectorStore` class.\n",
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/googleapis/langchain-google-firestore-python/blob/main/docs/vectorstores.ipynb)"
]
},
{
"cell_type": "markdown",
"id": "36fdc060",
"metadata": {},
"source": [
"## Before You Begin\n",
"\n",
"To run this notebook, you will need to do the following:\n",
"\n",
"* [Create a Google Cloud Project](https://developers.google.com/workspace/guides/create-project)\n",
"* [Enable the Firestore API](https://console.cloud.google.com/flows/enableapi?apiid=firestore.googleapis.com)\n",
"* [Create a Firestore database](https://cloud.google.com/firestore/docs/manage-databases)\n",
"\n",
"After confirmed access to database in the runtime environment of this notebook, filling the following values and run the cell before running example scripts."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "22e53b34",
"metadata": {},
"outputs": [],
"source": [
"# @markdown Please specify a source for demo purpose.\n",
"COLLECTION_NAME = \"test\" # @param {type:\"CollectionReference\"|\"string\"}"
]
},
{
"cell_type": "markdown",
"id": "e5d3d8e4",
"metadata": {},
"source": [
"### 🦜🔗 Library Installation\n",
"\n",
"The integration lives in its own `langchain-google-firestore` package, so we need to install it. For this notebook, we will also install `langchain-google-genai` to use Google Generative AI embeddings."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "75510ef7",
"metadata": {},
"outputs": [],
"source": [
"%pip install -upgrade --quiet langchain-google-firestore langchain-google-vertexai"
]
},
{
"cell_type": "markdown",
"id": "2664ca45",
"metadata": {},
"source": [
"**Colab only**: Uncomment the following cell to restart the kernel or use the button to restart the kernel. For Vertex AI Workbench you can restart the terminal using the button on top."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ddfcd6b7",
"metadata": {},
"outputs": [],
"source": [
"# # Automatically restart kernel after installs so that your environment can access the new packages\n",
"# import IPython\n",
"\n",
"# app = IPython.Application.instance()\n",
"# app.kernel.do_shutdown(True)"
]
},
{
"cell_type": "markdown",
"id": "4ab63daa",
"metadata": {},
"source": [
"### ☁ Set Your Google Cloud Project\n",
"Set your Google Cloud project so that you can leverage Google Cloud resources within this notebook.\n",
"\n",
"If you don't know your project ID, try the following:\n",
"\n",
"* Run `gcloud config list`.\n",
"* Run `gcloud projects list`.\n",
"* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "129f1f8d",
"metadata": {},
"outputs": [],
"source": [
"# @markdown Please fill in the value below with your Google Cloud project ID and then run the cell.\n",
"\n",
"PROJECT_ID = \"extensions-testing\" # @param {type:\"string\"}\n",
"\n",
"# Set the project id\n",
"!gcloud config set project {PROJECT_ID}"
]
},
{
"cell_type": "markdown",
"id": "ccd32ce5",
"metadata": {},
"source": [
"### 🔐 Authentication\n",
"\n",
"Authenticate to Google Cloud as the IAM user logged into this notebook in order to access your Google Cloud Project.\n",
"\n",
"- If you are using Colab to run this notebook, use the cell below and continue.\n",
"- If you are using Vertex AI Workbench, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4b5793e7",
"metadata": {},
"outputs": [],
"source": [
"from google.colab import auth\n",
"\n",
"auth.authenticate_user()"
]
},
{
"cell_type": "markdown",
"id": "2cade39f",
"metadata": {},
"source": [
"# Basic Usage"
]
},
{
"cell_type": "markdown",
"id": "580e6f96",
"metadata": {},
"source": [
"### Initialize FirestoreVectorStore\n",
"\n",
"`FirestoreVectorStore` allows you to store new vectors in a Firestore database. You can use it to store embeddings from any model, including those from Google Generative AI."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc37144c-208d-4ab3-9f3a-0407a69fe052",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain_google_firestore import FirestoreVectorStore\n",
"from langchain_google_vertexai import VertexAIEmbeddings\n",
"\n",
"embedding = VertexAIEmbeddings(\n",
" model_name=\"textembedding-gecko@latest\",\n",
" project=PROJECT_ID,\n",
")\n",
"\n",
"# Sample data\n",
"ids = [\"apple\", \"banana\", \"orange\"]\n",
"fruits_texts = ['{\"name\": \"apple\"}', '{\"name\": \"banana\"}', '{\"name\": \"orange\"}']\n",
"\n",
"# Create a vector store\n",
"vector_store = FirestoreVectorStore(\n",
" collection=\"fruits\",\n",
" embedding=embedding,\n",
")\n",
"\n",
"# Add the fruits to the vector store\n",
"vector_store.add_texts(fruits_texts, ids=ids)"
]
},
{
"cell_type": "markdown",
"id": "f8a4d7f7",
"metadata": {},
"source": [
"As a shorthand, you can initilize and add vectors in a single step using the `from_texts` and `from_documents` method."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0bb6745e",
"metadata": {},
"outputs": [],
"source": [
"vector_store = FirestoreVectorStore.from_texts(\n",
" collection=\"fruits\",\n",
" texts=fruits_texts,\n",
" embedding=embedding,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f86024b9",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.documents import Document\n",
"\n",
"fruits_docs = [Document(page_content=fruit) for fruit in fruits_texts]\n",
"\n",
"vector_store = FirestoreVectorStore.from_documents(\n",
" collection=\"fruits\",\n",
" documents=fruits_docs,\n",
" embedding=embedding,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "942911a8",
"metadata": {},
"source": [
"### Delete Vectors"
]
},
{
"cell_type": "markdown",
"id": "ee1d8090",
"metadata": {},
"source": [
"You can delete documents with vectors from the database using the `delete` method. You'll need to provide the document ID of the vector you want to delete. This will remove the whole document from the database, including any other fields it may have."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "901f2ae7",
"metadata": {},
"outputs": [],
"source": [
"vector_store.delete(ids)"
]
},
{
"cell_type": "markdown",
"id": "bc8e555f",
"metadata": {},
"source": [
"### Update Vectors"
]
},
{
"cell_type": "markdown",
"id": "af734e8f",
"metadata": {},
"source": [
"Updating vectors is similar to adding them. You can use the `add` method to update the vector of a document by providing the document ID and the new vector."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb2aadb7",
"metadata": {},
"outputs": [],
"source": [
"fruit_to_update = ['{\"name\": \"apple\",\"price\": 12}']\n",
"apple_id = \"apple\"\n",
"\n",
"vector_store.add_texts(fruit_to_update, ids=[apple_id])"
]
},
{
"cell_type": "markdown",
"id": "16342b7a",
"metadata": {},
"source": [
"## Similarity Search\n",
"\n",
"You can use the `FirestoreVectorStore` to perform similarity searches on the vectors you have stored. This is useful for finding similar documents or text."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44d1b94e",
"metadata": {},
"outputs": [],
"source": [
"vector_store.similarity_search(\"I like fuji apples\", k=3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "acb2f640",
"metadata": {},
"outputs": [],
"source": [
"vector_store.max_marginal_relevance_search(\"fuji\", 5)"
]
},
{
"cell_type": "markdown",
"id": "4ac1d391",
"metadata": {},
"source": [
"You can add a pre-filter to the search by using the `filters` parameter. This is useful for filtering by a specific field or value."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cd864d4f",
"metadata": {},
"outputs": [],
"source": [
"from google.cloud.firestore_v1.base_query import FieldFilter\n",
"\n",
"vector_store.max_marginal_relevance_search(\n",
" \"fuji\", 5, filters=FieldFilter(\"content\", \"==\", \"apple\")\n",
")"
]
},
{
"cell_type": "markdown",
"id": "9988c71d",
"metadata": {},
"source": [
"### Customize Connection & Authentication"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6b9dfc65",
"metadata": {},
"outputs": [],
"source": [
"from google.api_core.client_options import ClientOptions\n",
"from google.cloud import firestore\n",
"from langchain_google_firestore import FirestoreVectorStore\n",
"\n",
"client_options = ClientOptions()\n",
"client = firestore.Client(client_options=client_options)\n",
"\n",
"# Create a vector store\n",
"vector_store = FirestoreVectorStore(\n",
" collection=\"fruits\",\n",
" embedding=embedding,\n",
" client=client,\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}