langchain/docs/extras/integrations/vectorstores/rockset.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9787b308",
   "metadata": {},
   "source": [
    "# Rockset\n",
    "\n",
    ">[Rockset](https://rockset.com/) is a real-time search and analytics database built for the cloud. Rockset uses a [Converged Index™](https://rockset.com/blog/converged-indexing-the-secret-sauce-behind-rocksets-fast-queries/) with an efficient store for vector embeddings to serve low-latency, high-concurrency search queries at scale. Rockset has full support for metadata filtering and handles real-time ingestion for constantly updating, streaming data.\n",
    "\n",
    "This notebook demonstrates how to use `Rockset` as a vector store in LangChain. Before getting started, make sure you have a `Rockset` account and an API key. [Start your free trial today.](https://rockset.com/create/)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b823d64a",
   "metadata": {},
   "source": [
    "## Setting Up Your Environment\n",
    "\n",
    "1. Use the `Rockset` console to create a [collection](https://rockset.com/docs/collections/) with the Write API as your source. In this walkthrough, we create a collection named `langchain_demo`.\n",
    "    \n",
    "    Configure the following [ingest transformation](https://rockset.com/docs/ingest-transformation/) to mark your embeddings field and take advantage of performance and storage optimizations:\n",
    "\n",
    "    (We used OpenAI `text-embedding-ada-002` for this example, where `#length_of_vector_embedding` = 1536.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aac58387",
   "metadata": {
    "vscode": {
     "languageId": "sql"
    }
   },
   "outputs": [],
   "source": [
    "SELECT _input.* EXCEPT(_meta), \n",
    "VECTOR_ENFORCE(_input.description_embedding, #length_of_vector_embedding, 'float') as description_embedding \n",
    "FROM _input"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df380e1c",
   "metadata": {},
   "source": [
    "2. After creating your collection, use the console to retrieve an [API key](https://rockset.com/docs/iam/#users-api-keys-and-roles). This notebook assumes you are using the `Oregon (us-west-2)` region.\n",
    "\n",
    "3. Install the [rockset-python-client](https://github.com/rockset/rockset-python-client) to enable LangChain to communicate directly with `Rockset`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "00d16b83",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install rockset"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e79550eb",
   "metadata": {},
   "source": [
    "## LangChain Tutorial\n",
    "\n",
    "Follow along in your own Python notebook to generate and store vector embeddings in Rockset.\n",
    "Start using Rockset to search for documents similar to your search queries.\n",
    "\n",
    "### 1. Define Key Variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "29505c1e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import rockset\n",
    "\n",
    "# Read the API key from the ROCKSET_API_KEY environment variable.\n",
    "ROCKSET_API_KEY = os.environ.get(\"ROCKSET_API_KEY\")\n",
    "# Oregon (us-west-2); change this if your account is in another region.\n",
    "ROCKSET_API_SERVER = rockset.Regions.usw2a1\n",
    "rockset_client = rockset.RocksetClient(ROCKSET_API_SERVER, ROCKSET_API_KEY)\n",
    "\n",
    "COLLECTION_NAME = 'langchain_demo'\n",
    "TEXT_KEY = 'description'\n",
    "EMBEDDING_KEY = 'description_embedding'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "07625be2",
   "metadata": {},
   "source": [
    "### 2. Prepare Documents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9740d8c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
    "from langchain.text_splitter import CharacterTextSplitter\n",
    "from langchain.document_loaders import TextLoader\n",
    "from langchain.vectorstores import Rockset\n",
    "\n",
    "loader = TextLoader('../../../state_of_the_union.txt')\n",
    "documents = loader.load()\n",
    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
    "docs = text_splitter.split_documents(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a068be18",
   "metadata": {},
   "source": [
    "### 3. Insert Documents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85b6a6c5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# OpenAIEmbeddings reads the OPENAI_API_KEY environment variable.\n",
    "embeddings = OpenAIEmbeddings()\n",
    "\n",
    "docsearch = Rockset(\n",
    "    client=rockset_client,\n",
    "    embeddings=embeddings,\n",
    "    collection_name=COLLECTION_NAME,\n",
    "    text_key=TEXT_KEY,\n",
    "    embedding_key=EMBEDDING_KEY,\n",
    ")\n",
    "\n",
    "# add_texts returns the IDs of the inserted documents; keep them for deletion later.\n",
    "ids = docsearch.add_texts(\n",
    "    texts=[d.page_content for d in docs],\n",
    "    metadatas=[d.metadata for d in docs],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56eef48d",
   "metadata": {},
   "source": [
    "### 4. Search for Similar Documents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0bbf3df0",
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "# Return the 4 most similar documents, scored by cosine similarity.\n",
    "output = docsearch.similarity_search_with_relevance_scores(\n",
    "    query, 4, Rockset.DistanceFunction.COSINE_SIM\n",
    ")\n",
    "print(\"output length:\", len(output))\n",
    "for d, score in output:\n",
    "    print(score, d.metadata, d.page_content[:20] + '...')\n",
    "\n",
    "# Example output:\n",
    "# output length: 4\n",
    "# 0.764990692109871 {'source': '../../../state_of_the_union.txt'} Madam Speaker, Madam...\n",
    "# 0.7485416901622112 {'source': '../../../state_of_the_union.txt'} And I’m taking robus...\n",
    "# 0.7468678973398306 {'source': '../../../state_of_the_union.txt'} And so many families...\n",
    "# 0.7436231261419488 {'source': '../../../state_of_the_union.txt'} Groups of citizens b..."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7037a22f",
   "metadata": {},
   "source": [
    "### 5. Search for Similar Documents with Filtering"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b64a290f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# where_str is a SQL predicate; here it excludes documents whose text contains 'citizens'.\n",
    "output = docsearch.similarity_search_with_relevance_scores(\n",
    "    query,\n",
    "    4,\n",
    "    Rockset.DistanceFunction.COSINE_SIM,\n",
    "    where_str=\"{} NOT LIKE '%citizens%'\".format(TEXT_KEY),\n",
    ")\n",
    "print(\"output length:\", len(output))\n",
    "for d, score in output:\n",
    "    print(score, d.metadata, d.page_content[:20] + '...')\n",
    "\n",
    "# Example output:\n",
    "# output length: 4\n",
    "# 0.7651359650263554 {'source': '../../../state_of_the_union.txt'} Madam Speaker, Madam...\n",
    "# 0.7486265516824893 {'source': '../../../state_of_the_union.txt'} And I’m taking robus...\n",
    "# 0.7469625542348115 {'source': '../../../state_of_the_union.txt'} And so many families...\n",
    "# 0.7344177777547739 {'source': '../../../state_of_the_union.txt'} We see the unity amo..."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "13a52b38",
   "metadata": {},
   "source": [
    "### 6. [Optional] Delete Inserted Documents\n",
    "\n",
    "To delete documents from your collection, you need the unique ID of each one.\n",
    "You can define IDs yourself when inserting documents with `Rockset.add_texts()`; otherwise, Rockset generates a unique ID for each document. Either way, `Rockset.add_texts()` returns the IDs of the inserted documents.\n",
    "\n",
    "To delete these documents, pass their IDs to `Rockset.delete_texts()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1f755924",
   "metadata": {},
   "outputs": [],
   "source": [
    "docsearch.delete_texts(ids)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d468f431",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "In this tutorial, we created a `Rockset` collection, inserted documents with OpenAI embeddings, and searched for similar documents with and without metadata filters.\n",
    "\n",
    "Keep an eye on https://rockset.com/ for future updates in this space."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
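
The relevance scores printed in steps 4 and 5 of the notebook are produced with `Rockset.DistanceFunction.COSINE_SIM`. As a rough illustration of what a cosine-similarity score measures, here is a minimal pure-Python sketch. It is not Rockset's or LangChain's implementation: the `cosine_similarity` helper and the toy three-dimensional vectors are hypothetical, chosen only to keep the example self-contained.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (hypothetical); the notebook's
# text-embedding-ada-002 vectors have 1536 dimensions.
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "same direction": [2.0, 0.0, 2.0],
    "partly related": [1.0, 1.0, 0.0],
    "unrelated": [0.0, 1.0, 0.0],
}

# Rank documents from most to least similar, as a similarity search would.
ranked = sorted(
    doc_vecs.items(),
    key=lambda item: cosine_similarity(query_vec, item[1]),
    reverse=True,
)
for name, vec in ranked:
    print(f"{cosine_similarity(query_vec, vec):.3f} {name}")
```

Because the score depends only on the angle between vectors, a document vector pointing in the same direction as the query ranks first regardless of its length, which is why higher values in the notebook's output indicate closer matches.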
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								{
 								 "cells": [
 								  {
 								   "cell_type": "markdown",
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								   "id": "9787b308",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   "metadata": {},
 								   "source": [
-												docs: vectorstore upgrades 2 (#6796)

updated vectorstores/ notebooks; added new integrations into
ecosystem/integrations/
@dev2049
@rlancemartin, @eyurtsev
											
										
										
											1 year ago
+								    "# Rockset\n",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								    "\n",
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    ">[Rockset](https://rockset.com/) is a real-time search and analytics database built for the cloud. Rockset uses a [Converged Index™](https://rockset.com/blog/converged-indexing-the-secret-sauce-behind-rocksets-fast-queries/) with an efficient store for vector embeddings to serve low latency, high concurrency search queries at scale. Rockset has full support for metadata filtering and  handles real-time ingestion for constantly updating, streaming data.\n",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								    "\n",
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    "This notebook demonstrates how to use `Rockset` as a vector store in LangChain. Before getting started, make sure you have access to a `Rockset` account and an API key available. [Start your free trial today.](https://rockset.com/create/)\n"
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								   "id": "b823d64a",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   "metadata": {},
 								   "source": [
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    "## Setting Up Your Environment[](https://python.langchain.com/docs/modules/data_connection/vectorstores/integrations/rockset#setting-up-environment)\n",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								    "\n",
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    "1. Leverage the `Rockset` console to create a [collection](https://rockset.com/docs/collections/) with the Write API as your source. In this walkthrough, we create a collection named `langchain_demo`. \n",
 								    "    \n",
-												Fix docs for Rockset (#8807)

* remove error output for notebook
* add comment about vector length for ingest transformation
* change OPENAI_KEY -> OPENAI_API_KEY

cc @baskaryan
											
										
										
											1 year ago
+								    "    Configure the following [ingest transformation](https://rockset.com/docs/ingest-transformation/) to mark your embeddings field and take advantage of performance and storage optimizations:\n",
 								    "\n",
 								    "\n",
 								    "   (We used OpenAI `text-embedding-ada-002` for this examples, where #length_of_vector_embedding = 1536)"
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   ]
 								  },
 								  {
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								   "cell_type": "code",
 								   "execution_count": null,
 								   "id": "aac58387",
 								   "metadata": {
 								    "vscode": {
 								     "languageId": "sql"
 								    }
 								   },
 								   "outputs": [],
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   "source": [
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    "SELECT _input.* EXCEPT(_meta), \n",
 								    "VECTOR_ENFORCE(_input.description_embedding, #length_of_vector_embedding, 'float') as description_embedding \n",
 								    "FROM _input"
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								   "id": "df380e1c",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   "metadata": {},
 								   "source": [
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    "2. After creating your collection, use the console to retrieve an [API key](https://rockset.com/docs/iam/#users-api-keys-and-roles). For the purpose of this notebook, we assume you are using the `Oregon(us-west-2)` region.\n",
 								    "\n",
 								    "3. Install the [rockset-python-client](https://github.com/rockset/rockset-python-client) to enable LangChain to communicate directly with `Rockset`."
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								   "id": "00d16b83",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   "metadata": {},
 								   "outputs": [],
 								   "source": [
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    "pip install rockset"
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								   "id": "e79550eb",
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   "metadata": {},
 								   "source": [
-												Minor improvements to rockset vectorstore (#8416)

This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.

@rlancemartin, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											1 year ago
+								    "## LangChain Tutorial\n",
 								    "\n",
 								    "Follow along in your own Python notebook to generate and store vector embeddings in Rockset.\n",
 								    "Start using Rockset to search for documents similar to your search queries.\n",
 								    "\n",
 								    "### 1. Define Key Variables"
-												Integrate Rockset as Vectorstore (#6216)

This PR adds Rockset as a vectorstore for langchain.
[Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/)
is a real time OLAP database which provides a fast and efficient vector
search functionality. Further since it is entirely schemaless, it can
store metadata in separate columns thereby allowing fast metadata
filters during vector similarity search (as opposed to storing the
entire metadata in a single JSON column). It currently supports three
distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and
`DOT_PRODUCT`.

This PR adds `rockset` client as an optional dependency. 

We would love a twitter shoutout, our handle is
https://twitter.com/RocksetCloud

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
											
										
										
											1 year ago
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
 								   "id": "29505c1e",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "import os\n",
 								    "import rockset\n",
 								    "\n",
 								    "ROCKSET_API_KEY = os.environ.get(\"ROCKSET_API_KEY\") # Verify ROCKSET_API_KEY environment variable\n",
 								    "ROCKSET_API_SERVER = rockset.Regions.usw2a1 # Verify Rockset region\n",
 								    "rockset_client = rockset.RocksetClient(ROCKSET_API_SERVER, ROCKSET_API_KEY)\n",
 								    "\n",
 								    "COLLECTION_NAME='langchain_demo'\n",
 								    "TEXT_KEY='description'\n",
 								    "EMBEDDING_KEY='description_embedding'"
 								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
 								   "id": "07625be2",
 								   "metadata": {},
 								   "source": [
 								    "### 2. Prepare Documents"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
 								   "id": "9740d8c4",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
 								    "from langchain.text_splitter import CharacterTextSplitter\n",
 								    "from langchain.document_loaders import TextLoader\n",
 								    "from langchain.vectorstores import Rockset\n",
 								    "\n",
 								    "loader = TextLoader('../../../state_of_the_union.txt')\n",
 								    "documents = loader.load()\n",
 								    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
 								    "docs = text_splitter.split_documents(documents)"
 								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
 								   "id": "a068be18",
 								   "metadata": {},
 								   "source": [
 								    "### 3. Insert Documents"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
 								   "id": "85b6a6c5",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "embeddings = OpenAIEmbeddings() # Verify OPENAI_API_KEY environment variable\n",
 								    "\n",
 								    "docsearch = Rockset(\n",
 								    "    client=rockset_client,\n",
 								    "    embeddings=embeddings,\n",
 								    "    collection_name=COLLECTION_NAME,\n",
 								    "    text_key=TEXT_KEY,\n",
 								    "    embedding_key=EMBEDDING_KEY,\n",
 								    ")\n",
 								    "\n",
 								    "ids=docsearch.add_texts(\n",
 								    "    texts=[d.page_content for d in docs],\n",
 								    "    metadatas=[d.metadata for d in docs],\n",
 								    ")"
 								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
 								   "id": "56eef48d",
 								   "metadata": {},
 								   "source": [
 								    "### 4. Search for Similar Documents"
 								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
 								   "id": "0bbf3df0",
 								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
 								    "output = docsearch.similarity_search_with_relevance_scores(\n",
 								    "    query, 4, Rockset.DistanceFunction.COSINE_SIM\n",
 								    ")\n",
+								    "print(\"output length:\", len(output))\n",
 								    "for d, dist in output:\n",
+								    "    print(dist, d.metadata, d.page_content[:20] + '...')\n",
+								    "\n",
+								    "##\n",
 								    "# output length: 4\n",
 								    "# 0.764990692109871 {'source': '../../../state_of_the_union.txt'} Madam Speaker, Madam...\n",
 								    "# 0.7485416901622112 {'source': '../../../state_of_the_union.txt'} And I’m taking robus...\n",
 								    "# 0.7468678973398306 {'source': '../../../state_of_the_union.txt'} And so many families...\n",
 								    "# 0.7436231261419488 {'source': '../../../state_of_the_union.txt'} Groups of citizens b..."
 								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
+								   "id": "7037a22f",
+								   "metadata": {},
 								   "source": [
+								    "### 5. Search for Similar Documents with Filtering"
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
+								   "id": "b64a290f",
+								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "output = docsearch.similarity_search_with_relevance_scores(\n",
 								    "    query,\n",
 								    "    4,\n",
+								    "    Rockset.DistanceFunction.COSINE_SIM,\n",
+								    "    where_str=\"{} NOT LIKE '%citizens%'\".format(TEXT_KEY),\n",
+								    ")\n",
 								    "print(\"output length:\", len(output))\n",
 								    "for d, dist in output:\n",
+								    "    print(dist, d.metadata, d.page_content[:20] + '...')\n",
+								    "\n",
+								    "##\n",
 								    "# output length: 4\n",
 								    "# 0.7651359650263554 {'source': '../../../state_of_the_union.txt'} Madam Speaker, Madam...\n",
 								    "# 0.7486265516824893 {'source': '../../../state_of_the_union.txt'} And I’m taking robus...\n",
 								    "# 0.7469625542348115 {'source': '../../../state_of_the_union.txt'} And so many families...\n",
 								    "# 0.7344177777547739 {'source': '../../../state_of_the_union.txt'} We see the unity amo..."
 								   ]
 								  },
 								  {
+								   "attachments": {},
+								   "cell_type": "markdown",
+								   "id": "13a52b38",
+								   "metadata": {},
 								   "source": [
+								    "### 6. [Optional] Delete Inserted Documents\n",
+								    "\n",
+								    "You must have the unique ID associated with each document to delete them from your collection.\n",
 								    "Define IDs when inserting documents with `Rockset.add_texts()`. Rockset will otherwise generate a unique ID for each document. Regardless, `Rockset.add_texts()` returns the IDs of inserted documents.\n",
+								    "\n",
+								    "To delete these docs, simply use the `Rockset.delete_texts()` function."
+								   ]
 								  },
 								  {
 								   "cell_type": "code",
 								   "execution_count": null,
+								   "id": "1f755924",
+								   "metadata": {},
 								   "outputs": [],
 								   "source": [
 								    "docsearch.delete_texts(ids)"
 								   ]
 								  },
 								  {
 								   "cell_type": "markdown",
+								   "id": "d468f431",
+								   "metadata": {},
 								   "source": [
+								    "## Summary\n",
+								    "\n",
+								    "In this tutorial, we created a `Rockset` collection, inserted documents with OpenAI embeddings, and searched for similar documents with and without metadata filters.\n",
+								    "\n",
+								    "Keep an eye on https://rockset.com/ for future updates in this space."
+								   ]
 								  }
 								 ],
 								 "metadata": {
 								  "kernelspec": {
 								   "display_name": "Python 3 (ipykernel)",
 								   "language": "python",
 								   "name": "python3"
 								  },
 								  "language_info": {
 								   "codemirror_mode": {
 								    "name": "ipython",
 								    "version": 3
 								   },
 								   "file_extension": ".py",
 								   "mimetype": "text/x-python",
 								   "name": "python",
 								   "nbconvert_exporter": "python",
 								   "pygments_lexer": "ipython3",
+								   "version": "3.10.12"
+								  }
 								 },
 								 "nbformat": 4,
 								 "nbformat_minor": 5
 								}