{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tair\n",
"\n",
">[Tair](https://www.alibabacloud.com/help/en/tair/latest/what-is-tair) is a cloud-native in-memory database service developed by `Alibaba Cloud`.\n",
"It provides rich data models and enterprise-grade capabilities to support real-time online scenarios while maintaining full compatibility with open-source `Redis`. `Tair` also offers persistent-memory-optimized instances based on non-volatile memory (NVM) storage.\n",
"\n",
"This notebook shows how to use functionality related to the `Tair` vector database.\n",
"\n",
"To run this notebook, you need a running `Tair` instance."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.fake import FakeEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Tair"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"\n",
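"# Load the sample text and split it into chunks of roughly 1000 characters.\n",
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"# FakeEmbeddings returns random vectors; it stands in for a real embedding model here.\n",
"embeddings = FakeEmbeddings(size=128)"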
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Connect to Tair by setting the `TAIR_URL` environment variable\n",
"```\n",
"export TAIR_URL=\"redis://{username}:{password}@{tair_address}:{tair_port}\"\n",
"```\n",
"\n",
"or by passing the `tair_url` keyword argument.\n",
"\n",
"Then store the documents and their embeddings in Tair."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"tair_url = \"redis://localhost:6379\"\n",
"\n",
"# Drop the existing index first, if there is one.\n",
"Tair.drop_index(tair_url=tair_url)\n",
"\n",
"vector_store = Tair.from_documents(docs, embeddings, tair_url=tair_url)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
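"Query for similar documents. Because `FakeEmbeddings` produces random vectors, the returned document is not semantically related to the query; use a real embedding model for meaningful results."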
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Were going after the criminals who stole billions in relief money meant for small businesses and millions of Americans. \\n\\nAnd tonight, Im announcing that the Justice Department will name a chief prosecutor for pandemic fraud. \\n\\nBy the end of this year, the deficit will be down to less than half what it was before I took office. \\n\\nThe only president ever to cut the deficit by more than one trillion dollars in a single year. \\n\\nLowering your costs also means demanding more competition. \\n\\nIm a capitalist, but capitalism without competition isnt capitalism. \\n\\nIts exploitation—and it drives up prices. \\n\\nWhen corporations dont have to compete, their profits go up, your prices go up, and small businesses and family farmers and ranchers go under. \\n\\nWe see it happening with ocean carriers moving goods in and out of America. \\n\\nDuring the pandemic, these foreign-owned companies raised prices by as much as 1,000% and made record profits.', metadata={'source': '../../../state_of_the_union.txt'})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = vector_store.similarity_search(query)\n",
"docs[0]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}