From 30b00090ef72ecb891bc3a3ca48b98f05a7ff51d Mon Sep 17 00:00:00 2001 From: Averi Kitsch Date: Mon, 15 Apr 2024 13:09:32 -0700 Subject: [PATCH] docs: Add Google Firestore Vectorstore doc (#20078) - **Description:**Add Google Firestore Vector store docs - **Issue:** NA - **Dependencies:** NA --------- Co-authored-by: Erick Friis --- docs/docs/integrations/platforms/google.mdx | 15 + .../vectorstores/google_firestore.ipynb | 399 ++++++++++++++++++ 2 files changed, 414 insertions(+) create mode 100644 docs/docs/integrations/vectorstores/google_firestore.ipynb diff --git a/docs/docs/integrations/platforms/google.mdx b/docs/docs/integrations/platforms/google.mdx index f7d8a36ae4..1f732ce68a 100644 --- a/docs/docs/integrations/platforms/google.mdx +++ b/docs/docs/integrations/platforms/google.mdx @@ -486,6 +486,21 @@ See [usage example](/docs/integrations/vectorstores/google_spanner). from langchain_google_spanner import SpannerVectorStore ``` +### Firestore (Native Mode) + +> [Google Cloud Firestore](https://cloud.google.com/firestore/docs/) is a NoSQL document database built for automatic scaling, high performance, and ease of application development. +Install the python package: + +```bash +pip install langchain-google-firestore +``` + +See [usage example](/docs/integrations/vectorstores/google_firestore). + +```python +from langchain_google_firestore import FirestoreVectorstore +``` + ### Cloud SQL for MySQL > [Google Cloud SQL for MySQL](https://cloud.google.com/sql) is a fully-managed database service that helps you set up, maintain, manage, and administer your MySQL relational databases on Google Cloud. diff --git a/docs/docs/integrations/vectorstores/google_firestore.ipynb b/docs/docs/integrations/vectorstores/google_firestore.ipynb new file mode 100644 index 0000000000..8d16936f94 --- /dev/null +++ b/docs/docs/integrations/vectorstores/google_firestore.ipynb @@ -0,0 +1,399 @@ +{ + "cells": [ + { + "cell_type": "raw", + "id": "1957f5cb", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Firestore\n", + "---" + ] + }, + { + "cell_type": "markdown", + "id": "ef1f0986", + "metadata": {}, + "source": [ + "# Google Firestore (Native Mode)\n", + "\n", + "> [Firestore](https://cloud.google.com/firestore) is a serverless document-oriented database that scales to meet any demand. Extend your database application to build AI-powered experiences leveraging Firestore's Langchain integrations.\n", + "\n", + "This notebook goes over how to use [Firestore](https://cloud.google.com/firestore) to to store vectors and query them using the `FirestoreVectorStore` class.\n", + "\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/googleapis/langchain-google-firestore-python/blob/main/docs/vectorstores.ipynb)" + ] + }, + { + "cell_type": "markdown", + "id": "36fdc060", + "metadata": {}, + "source": [ + "## Before You Begin\n", + "\n", + "To run this notebook, you will need to do the following:\n", + "\n", + "* [Create a Google Cloud Project](https://developers.google.com/workspace/guides/create-project)\n", + "* [Enable the Firestore API](https://console.cloud.google.com/flows/enableapi?apiid=firestore.googleapis.com)\n", + "* [Create a Firestore database](https://cloud.google.com/firestore/docs/manage-databases)\n", + "\n", + "After confirmed access to database in the runtime environment of this notebook, filling the following values and run the cell before running example scripts." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "22e53b34", + "metadata": {}, + "outputs": [], + "source": [ + "# @markdown Please specify a source for demo purpose.\n", + "COLLECTION_NAME = \"test\" # @param {type:\"CollectionReference\"|\"string\"}" + ] + }, + { + "cell_type": "markdown", + "id": "e5d3d8e4", + "metadata": {}, + "source": [ + "### 🦜🔗 Library Installation\n", + "\n", + "The integration lives in its own `langchain-google-firestore` package, so we need to install it. For this notebook, we will also install `langchain-google-genai` to use Google Generative AI embeddings." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "75510ef7", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -upgrade --quiet langchain-google-firestore langchain-google-vertexai" + ] + }, + { + "cell_type": "markdown", + "id": "2664ca45", + "metadata": {}, + "source": [ + "**Colab only**: Uncomment the following cell to restart the kernel or use the button to restart the kernel. For Vertex AI Workbench you can restart the terminal using the button on top." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ddfcd6b7", + "metadata": {}, + "outputs": [], + "source": [ + "# # Automatically restart kernel after installs so that your environment can access the new packages\n", + "# import IPython\n", + "\n", + "# app = IPython.Application.instance()\n", + "# app.kernel.do_shutdown(True)" + ] + }, + { + "cell_type": "markdown", + "id": "4ab63daa", + "metadata": {}, + "source": [ + "### ☁ Set Your Google Cloud Project\n", + "Set your Google Cloud project so that you can leverage Google Cloud resources within this notebook.\n", + "\n", + "If you don't know your project ID, try the following:\n", + "\n", + "* Run `gcloud config list`.\n", + "* Run `gcloud projects list`.\n", + "* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "129f1f8d", + "metadata": {}, + "outputs": [], + "source": [ + "# @markdown Please fill in the value below with your Google Cloud project ID and then run the cell.\n", + "\n", + "PROJECT_ID = \"extensions-testing\" # @param {type:\"string\"}\n", + "\n", + "# Set the project id\n", + "!gcloud config set project {PROJECT_ID}" + ] + }, + { + "cell_type": "markdown", + "id": "ccd32ce5", + "metadata": {}, + "source": [ + "### 🔐 Authentication\n", + "\n", + "Authenticate to Google Cloud as the IAM user logged into this notebook in order to access your Google Cloud Project.\n", + "\n", + "- If you are using Colab to run this notebook, use the cell below and continue.\n", + "- If you are using Vertex AI Workbench, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4b5793e7", + "metadata": {}, + "outputs": [], + "source": [ + "from google.colab import auth\n", + "\n", + "auth.authenticate_user()" + ] + }, + { + "cell_type": "markdown", + "id": "2cade39f", + "metadata": {}, + "source": [ + "# Basic Usage" + ] + }, + { + "cell_type": "markdown", + "id": "580e6f96", + "metadata": {}, + "source": [ + "### Initialize FirestoreVectorStore\n", + "\n", + "`FirestoreVectorStore` allows you to store new vectors in a Firestore database. You can use it to store embeddings from any model, including those from Google Generative AI." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dc37144c-208d-4ab3-9f3a-0407a69fe052", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "from langchain_google_firestore import FirestoreVectorStore\n", + "from langchain_google_vertexai import VertexAIEmbeddings\n", + "\n", + "embedding = VertexAIEmbeddings(\n", + " model_name=\"textembedding-gecko@latest\",\n", + " project=PROJECT_ID,\n", + ")\n", + "\n", + "# Sample data\n", + "ids = [\"apple\", \"banana\", \"orange\"]\n", + "fruits_texts = ['{\"name\": \"apple\"}', '{\"name\": \"banana\"}', '{\"name\": \"orange\"}']\n", + "\n", + "# Create a vector store\n", + "vector_store = FirestoreVectorStore(\n", + " collection=\"fruits\",\n", + " embedding=embedding,\n", + ")\n", + "\n", + "# Add the fruits to the vector store\n", + "vector_store.add_texts(fruits_texts, ids=ids)" + ] + }, + { + "cell_type": "markdown", + "id": "f8a4d7f7", + "metadata": {}, + "source": [ + "As a shorthand, you can initilize and add vectors in a single step using the `from_texts` and `from_documents` method." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0bb6745e", + "metadata": {}, + "outputs": [], + "source": [ + "vector_store = FirestoreVectorStore.from_texts(\n", + " collection=\"fruits\",\n", + " texts=fruits_texts,\n", + " embedding=embedding,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f86024b9", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.documents import Document\n", + "\n", + "fruits_docs = [Document(page_content=fruit) for fruit in fruits_texts]\n", + "\n", + "vector_store = FirestoreVectorStore.from_documents(\n", + " collection=\"fruits\",\n", + " documents=fruits_docs,\n", + " embedding=embedding,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "942911a8", + "metadata": {}, + "source": [ + "### Delete Vectors" + ] + }, + { + "cell_type": "markdown", + "id": "ee1d8090", + "metadata": {}, + "source": [ + "You can delete documents with vectors from the database using the `delete` method. You'll need to provide the document ID of the vector you want to delete. This will remove the whole document from the database, including any other fields it may have." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "901f2ae7", + "metadata": {}, + "outputs": [], + "source": [ + "vector_store.delete(ids)" + ] + }, + { + "cell_type": "markdown", + "id": "bc8e555f", + "metadata": {}, + "source": [ + "### Update Vectors" + ] + }, + { + "cell_type": "markdown", + "id": "af734e8f", + "metadata": {}, + "source": [ + "Updating vectors is similar to adding them. You can use the `add` method to update the vector of a document by providing the document ID and the new vector." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cb2aadb7", + "metadata": {}, + "outputs": [], + "source": [ + "fruit_to_update = ['{\"name\": \"apple\",\"price\": 12}']\n", + "apple_id = \"apple\"\n", + "\n", + "vector_store.add_texts(fruit_to_update, ids=[apple_id])" + ] + }, + { + "cell_type": "markdown", + "id": "16342b7a", + "metadata": {}, + "source": [ + "## Similarity Search\n", + "\n", + "You can use the `FirestoreVectorStore` to perform similarity searches on the vectors you have stored. This is useful for finding similar documents or text." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "44d1b94e", + "metadata": {}, + "outputs": [], + "source": [ + "vector_store.similarity_search(\"I like fuji apples\", k=3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "acb2f640", + "metadata": {}, + "outputs": [], + "source": [ + "vector_store.max_marginal_relevance_search(\"fuji\", 5)" + ] + }, + { + "cell_type": "markdown", + "id": "4ac1d391", + "metadata": {}, + "source": [ + "You can add a pre-filter to the search by using the `filters` parameter. This is useful for filtering by a specific field or value." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cd864d4f", + "metadata": {}, + "outputs": [], + "source": [ + "from google.cloud.firestore_v1.base_query import FieldFilter\n", + "\n", + "vector_store.max_marginal_relevance_search(\n", + " \"fuji\", 5, filters=FieldFilter(\"content\", \"==\", \"apple\")\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "9988c71d", + "metadata": {}, + "source": [ + "### Customize Connection & Authentication" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6b9dfc65", + "metadata": {}, + "outputs": [], + "source": [ + "from google.api_core.client_options import ClientOptions\n", + "from google.cloud import firestore\n", + "from langchain_google_firestore import FirestoreVectorStore\n", + "\n", + "client_options = ClientOptions()\n", + "client = firestore.Client(client_options=client_options)\n", + "\n", + "# Create a vector store\n", + "vector_store = FirestoreVectorStore(\n", + " collection=\"fruits\",\n", + " embedding=embedding,\n", + " client=client,\n", + ")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.7" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}