{ "cells": [ { "cell_type": "markdown", "id": "833c4789", "metadata": {}, "source": [ "# AwaDB\n", ">[AwaDB](https://github.com/awa-ai/awadb) is an AI Native database for the search and storage of embedding vectors used by LLM Applications.\n", "\n", "This notebook shows how to use functionality related to the `AwaDB`." ] }, { "cell_type": "code", "execution_count": null, "id": "252930ea", "metadata": {}, "outputs": [], "source": [ "!pip install awadb" ] }, { "cell_type": "code", "execution_count": null, "id": "f2b71a47", "metadata": {}, "outputs": [], "source": [ "from langchain.text_splitter import CharacterTextSplitter\n", "from langchain.vectorstores import AwaDB\n", "from langchain.document_loaders import TextLoader" ] }, { "cell_type": "code", "execution_count": null, "id": "49be0bac", "metadata": {}, "outputs": [], "source": [ "loader = TextLoader(\"../../../state_of_the_union.txt\")\n", "documents = loader.load()\n", "text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)\n", "docs = text_splitter.split_documents(documents)" ] }, { "cell_type": "code", "execution_count": null, "id": "18714278", "metadata": {}, "outputs": [], "source": [ "db = AwaDB.from_documents(docs)\n", "query = \"What did the president say about Ketanji Brown Jackson\"\n", "docs = db.similarity_search(query)" ] }, { "cell_type": "code", "execution_count": 4, "id": "4b172de8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n" ] } ], "source": [ "print(docs[0].page_content)" ] }, { "cell_type": "markdown", "id": "87fec6b5", "metadata": {}, "source": [ "## Similarity search with score" ] }, { "cell_type": "markdown", "id": "17231924", "metadata": {}, "source": [ "The returned distance score is between 0-1. 0 is dissimilar, 1 is the most similar" ] }, { "cell_type": "code", "execution_count": null, "id": "f40ddae1", "metadata": {}, "outputs": [], "source": [ "docs = db.similarity_search_with_score(query)" ] }, { "cell_type": "code", "execution_count": 4, "id": "93cd0b7a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(Document(page_content='And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}), 0.561813814013747)\n" ] } ], "source": [ "print(docs[0])" ] }, { "cell_type": "markdown", "id": "0b49fb59", "metadata": {}, "source": [ "## Restore the table created and added data before" ] }, { "cell_type": "code", "execution_count": null, "id": "1bfa6e25", "metadata": {}, "outputs": [], "source": [ "AwaDB automatically persists added document data" ] }, { "cell_type": "markdown", "id": "2a0f3b35", "metadata": {}, "source": [ "If you can restore the table you created and added before, you can just do this as below:" ] }, { "cell_type": "code", "execution_count": null, "id": "1fd4b5b0", "metadata": {}, "outputs": [], "source": [ "awadb_client = awadb.Client()\n", "ret = awadb_client.Load(\"langchain_awadb\")\n", "if ret:\n", " print(\"awadb load table success\")\n", "else:\n", " print(\"awadb load table failed\")" ] }, { "cell_type": "raw", "id": "aba255c2", "metadata": {}, "source": [ "awadb load table success" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 5 }