{
"cells": [
{
"cell_type": "markdown",
"id": "833c4789",
"metadata": {},
"source": [
"# AwaDB\n",
"[AwaDB](https://github.com/awa-ai/awadb) is an AI-native database for the search and storage of embedding vectors used by LLM applications.\n",
"This notebook shows how to use the `AwaDB` vector store."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "252930ea",
"metadata": {},
"outputs": [],
"source": [
"!pip install awadb"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f2b71a47",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import AwaDB\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49be0bac",
"metadata": {},
"outputs": [],
"source": [
"loader = TextLoader('../../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18714278",
"metadata": {},
"outputs": [],
"source": [
"db = AwaDB.from_documents(docs)\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62b7a4c5",
"metadata": {},
"outputs": [],
"source": [
"print(docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "a9b4be48",
"metadata": {},
"source": [
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence."
]
},
{
"cell_type": "markdown",
"id": "87fec6b5",
"metadata": {},
"source": [
"## Similarity search with score"
]
},
{
"cell_type": "markdown",
"id": "17231924",
"metadata": {},
"source": [
"The returned score is between 0 and 1, where 0 is the least similar and 1 is the most similar."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f40ddae1",
"metadata": {},
"outputs": [],
"source": [
"docs = db.similarity_search_with_score(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0045583",
"metadata": {},
"outputs": [],
"source": [
"print(docs[0])"
]
},
{
"cell_type": "markdown",
"id": "8c2da99d",
"metadata": {},
"source": [
"(Document(page_content='And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}), 0.561813814013747)"
]
},
{
"cell_type": "markdown",
"id": "0b49fb59",
"metadata": {},
"source": [
"## Restore a previously created table and its data"
]
},
{
"cell_type": "markdown",
"id": "1bfa6e25",
"metadata": {},
"source": [
"AwaDB automatically persists added document data."
]
},
{
"cell_type": "markdown",
"id": "2a0f3b35",
"metadata": {},
"source": [
"To restore a table that you created and populated earlier, load it by name:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1fd4b5b0",
"metadata": {},
"outputs": [],
"source": [
"import awadb\n",
"\n",
"# Load the persisted table by name; a truthy return value indicates success\n",
"awadb_client = awadb.Client()\n",
"ret = awadb_client.Load('langchain_awadb')\n",
"if ret:\n",
"    print('awadb load table success')\n",
"else:\n",
"    print('awadb load table failed')"
]
},
{
"cell_type": "markdown",
"id": "5ae9a9dd",
"metadata": {},
"source": [
"awadb load table success"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}