langchain/docs/extras/integrations/vectorstores/awadb.ipynb
2023-07-23 23:23:16 -07:00

{
"cells": [
{
"cell_type": "markdown",
"id": "833c4789",
"metadata": {},
"source": [
"# AwaDB\n",
">[AwaDB](https://github.com/awa-ai/awadb) is an AI Native database for the search and storage of embedding vectors used by LLM Applications.\n",
"\n",
"This notebook shows how to use functionality related to `AwaDB`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "252930ea",
"metadata": {},
"outputs": [],
"source": [
"!pip install awadb"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f2b71a47",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import AwaDB\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49be0bac",
"metadata": {},
"outputs": [],
"source": [
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18714278",
"metadata": {},
"outputs": [],
"source": [
"db = AwaDB.from_documents(docs)\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4b172de8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "87fec6b5",
"metadata": {},
"source": [
"## Similarity search with score"
]
},
{
"cell_type": "markdown",
"id": "17231924",
"metadata": {},
"source": [
"The returned score lies between 0 and 1, where 0 is the least similar and 1 is the most similar."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f40ddae1",
"metadata": {},
"outputs": [],
"source": [
"docs = db.similarity_search_with_score(query)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "93cd0b7a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(Document(page_content='And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}), 0.561813814013747)\n"
]
}
],
"source": [
"print(docs[0])"
]
},
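{
"cell_type": "markdown",
"id": "c7e41d90",
"metadata": {},
"source": [
"The `(Document, score)` tuples returned by `similarity_search_with_score` can be post-processed with ordinary Python. The cell below is a minimal sketch that keeps only results at or above a cutoff. The helper `filter_by_score` and the `0.5` cutoff are hypothetical, chosen for illustration; they are not part of the AwaDB or LangChain API. The cell assumes `db` and `query` from the cells above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d2e8b16",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical helper (not part of AwaDB or LangChain): keep the\n",
"# (document, score) pairs whose score is at least min_score.\n",
"def filter_by_score(results, min_score=0.5):\n",
"    return [(doc, score) for doc, score in results if score >= min_score]\n",
"\n",
"\n",
"# Assumes `db` and `query` were defined in the cells above.\n",
"results = db.similarity_search_with_score(query)\n",
"for doc, score in filter_by_score(results):\n",
"    print(f'{score:.3f}  {doc.page_content[:80]}')"
]
},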
{
"cell_type": "markdown",
"id": "0b49fb59",
"metadata": {},
"source": [
"## Restore a previously created table and its data"
]
},
{
"cell_type": "markdown",
"id": "1bfa6e25",
"metadata": {},
"source": [
"AwaDB automatically persists added document data."
]
},
{
"cell_type": "markdown",
"id": "2a0f3b35",
"metadata": {},
"source": [
"To restore a table that you created and populated earlier, load it by name as shown below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1fd4b5b0",
"metadata": {},
"outputs": [],
"source": [
"import awadb\n",
"\n",
"# Load the previously persisted table by name\n",
"awadb_client = awadb.Client()\n",
"ret = awadb_client.Load(\"langchain_awadb\")\n",
"if ret:\n",
"    print(\"awadb load table success\")\n",
"else:\n",
"    print(\"awadb load table failed\")"
]
},
{
"cell_type": "raw",
"id": "aba255c2",
"metadata": {},
"source": [
"awadb load table success"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}