diff --git a/docs/extras/integrations/llms/gpt4all.ipynb b/docs/extras/integrations/llms/gpt4all.ipynb index 7ebbd4e9e2..a8760ceeab 100644 --- a/docs/extras/integrations/llms/gpt4all.ipynb +++ b/docs/extras/integrations/llms/gpt4all.ipynb @@ -1,6 +1,7 @@ { "cells": [ { + "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -30,6 +31,14 @@ "%pip install gpt4all > /dev/null" ] }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Import GPT4All" + ] + }, { "cell_type": "code", "execution_count": 2, @@ -43,6 +52,14 @@ "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler" ] }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Set Up Question to pass to LLM" + ] + }, { "cell_type": "code", "execution_count": 3, @@ -59,6 +76,7 @@ ] }, { + "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ @@ -66,18 +84,14 @@ "\n", "To run locally, download a compatible ggml-formatted model. \n", " \n", - "**Download option 1**: The [gpt4all page](https://gpt4all.io/index.html) has a useful `Model Explorer` section:\n", + "The [gpt4all page](https://gpt4all.io/index.html) has a useful `Model Explorer` section:\n", "\n", "* Select a model of interest\n", "* Download using the UI and move the `.bin` to the `local_path` (noted below)\n", "\n", "For more info, visit https://github.com/nomic-ai/gpt4all.\n", "\n", - "--- \n", - "\n", - "**Download option 2**: Uncomment the below block to download a model. \n", - "\n", - "* You may want to update `url` to a new version, whih can be browsed using the [gpt4all page](https://gpt4all.io/index.html)." + "---" ] }, { @@ -88,27 +102,7 @@ "source": [ "local_path = (\n", " \"./models/ggml-gpt4all-l13b-snoozy.bin\" # replace with your desired local file path\n", - ")\n", - "\n", - "# import requests\n", - "\n", - "# from pathlib import Path\n", - "# from tqdm import tqdm\n", - "\n", - "# Path(local_path).parent.mkdir(parents=True, exist_ok=True)\n", - "\n", - "# # Example model. Check https://github.com/nomic-ai/gpt4all for the latest models.\n", - "# url = 'http://gpt4all.io/models/ggml-gpt4all-l13b-snoozy.bin'\n", - "\n", - "# # send a GET request to the URL to download the file. Stream since it's large\n", - "# response = requests.get(url, stream=True)\n", - "\n", - "# # open the file in binary mode and write the contents of the response to it in chunks\n", - "# # This is a large file, so be prepared to wait.\n", - "# with open(local_path, 'wb') as f:\n", - "# for chunk in tqdm(response.iter_content(chunk_size=8192)):\n", - "# if chunk:\n", - "# f.write(chunk)" + ")" ] }, { @@ -147,6 +141,14 @@ "\n", "llm_chain.run(question)" ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Justin Bieber was born on March 1, 1994. In 1994, The Cowboys won Super Bowl XXVIII." + ] } ], "metadata": { diff --git a/docs/extras/integrations/text_embedding/gpt4all.ipynb b/docs/extras/integrations/text_embedding/gpt4all.ipynb index d8d02ee969..67ebc9c584 100644 --- a/docs/extras/integrations/text_embedding/gpt4all.ipynb +++ b/docs/extras/integrations/text_embedding/gpt4all.ipynb @@ -1,15 +1,27 @@ { "cells": [ { + "attachments": {}, "cell_type": "markdown", "id": "d63d56c2", "metadata": {}, "source": [ "# GPT4All\n", "\n", + "[GPT4All](https://gpt4all.io/index.html) is a free-to-use, locally running, privacy-aware chatbot. There is no GPU or internet required. It features popular models and its own models such as GPT4All Falcon, Wizard, etc.\n", + "\n", "This notebook explains how to use [GPT4All embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html#gpt4all.gpt4all.Embed4All) with LangChain." ] }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "46b7aa85", + "metadata": {}, + "source": [ + "## Install GPT4All's Python Bindings" + ] + }, { "cell_type": "code", "execution_count": null, @@ -17,7 +29,16 @@ "metadata": {}, "outputs": [], "source": [ - "! pip install gpt4all" + "%pip install gpt4all > /dev/null" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "d80f4b92", + "metadata": {}, + "source": [ + "Note: you may need to restart the kernel to use updated packages." ] }, { @@ -72,6 +93,15 @@ "text = \"This is a test document.\"" ] }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "eef36bde", + "metadata": {}, + "source": [ + "## Embed the Textual Data" + ] + }, { "cell_type": "code", "execution_count": 4, @@ -82,6 +112,15 @@ "query_result = gpt4all_embd.embed_query(text)" ] }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "12b24e69", + "metadata": {}, + "source": [ + "With embed_documents you can embed multiple pieces of text. You can also map these embeddings with [Nomic's Atlas](https://docs.nomic.ai/index.html) to see a visual representation of your data." + ] + }, { "cell_type": "code", "execution_count": 5, diff --git a/docs/extras/integrations/vectorstores/atlas.ipynb b/docs/extras/integrations/vectorstores/atlas.ipynb index fb18aab45f..0f761a8dc5 100644 --- a/docs/extras/integrations/vectorstores/atlas.ipynb +++ b/docs/extras/integrations/vectorstores/atlas.ipynb @@ -1,13 +1,14 @@ { "cells": [ { + "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Atlas\n", "\n", "\n", - ">[Atlas](https://docs.nomic.ai/index.html) is a platform for interacting with both small and internet scale unstructured datasets by `Nomic`. \n", + ">[Atlas](https://docs.nomic.ai/index.html) is a platform by Nomic made for interacting with both small and internet scale unstructured datasets. It enables anyone to visualize, search, and share massive datasets in their browser.\n", "\n", "This notebook shows you how to use functionality related to the `AtlasDB` vectorstore." ] @@ -49,6 +50,14 @@ "!pip install nomic" ] }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Load Packages" + ] + }, { "cell_type": "code", "execution_count": 6, @@ -78,6 +87,14 @@ "ATLAS_TEST_API_KEY = \"7xDPkYXSYDc1_ErdTPIcoAR9RNd8YDlkS3nVNXcVoIMZ6\"" ] }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Prepare the Data" + ] + }, { "cell_type": "code", "execution_count": 8, @@ -96,6 +113,14 @@ "texts = [e.strip() for e in texts]" ] }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Map the Data using Nomic's Atlas" + ] + }, { "cell_type": "code", "execution_count": null, @@ -127,78 +152,21 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "\n", - " test_index_1677255228.136989\n", - "
\n", - " A description for your project 508 datums inserted.\n", - "
\n", - " 1 index built.\n", - "
Projections\n", - "
\n", - "\n", - "

Projection ID: db996d77-8981-48a0-897a-ff2c22bbf541

\n", - "
\n", - "
Hide embedded project
\n", - "
\n", - " Explore on atlas.nomic.ai\n", - "
\n", - "
\n", - " \n", - " \n", - "\n", - " \n", - " \n", - " \n", - " " - ], - "text/plain": [ - "AtlasProject: <{'id': 'ee2354a3-7f9a-4c6b-af43-b0cda09d7198', 'owner': '9c29afbb-a002-4d49-958e-ecf5ae1351ac', 'project_name': 'test_index_1677255228.136989', 'creator': 'auth0|63efc4b5462246f4d9a6ecf2', 'description': 'A description for your project', 'opensearch_index_id': 'f61fb8dd-0abf-4f31-9130-41870e443902', 'is_public': True, 'project_fields': ['atlas_id', 'text'], 'unique_id_field': 'atlas_id', 'modality': 'text', 'total_datums_in_project': 508, 'created_timestamp': '2023-02-24T16:13:50.313363+00:00', 'atlas_indices': [{'id': 'b1b01833-0964-4597-a4bc-a2d60700949d', 'project_id': 'ee2354a3-7f9a-4c6b-af43-b0cda09d7198', 'index_name': 'test_index_1677255228.136989_index', 'indexed_field': 'text', 'created_timestamp': '2023-02-24T16:13:52.957101+00:00', 'updated_timestamp': '2023-02-24T16:14:03.469621+00:00', 'atoms': ['charchunk', 'document'], 'colorable_fields': [], 'embedders': [{'id': '7ec0868a-4eed-4414-a482-25cce9803e1b', 'atlas_index_id': 'b1b01833-0964-4597-a4bc-a2d60700949d', 'ready': True, 'model_name': 'NomicEmbed', 'hyperparameters': {'norm': 'both', 'batch_size': 20, 'polymerize_by': 'charchunk', 'dataset_buffer_size': 1000}}], 'nearest_neighbor_indices': [{'id': '86f8e3ff-e07c-4678-a4d7-144db4b0301d', 'index_name': 'NomicOrganize', 'ready': True, 'hyperparameters': {'dim': 384, 'space': 'l2'}, 'atom_strategies': ['document']}], 'projections': [{'id': 'db996d77-8981-48a0-897a-ff2c22bbf541', 'projection_name': 'NomicProject', 'ready': True, 'hyperparameters': {'spread': 1.0, 'n_epochs': 50, 'n_neighbors': 15}, 'atom_strategies': ['document'], 'created_timestamp': '2023-02-24T16:13:52.979561+00:00', 'updated_timestamp': '2023-02-24T16:14:03.466309+00:00'}]}], 'insert_update_delete_lock': False}>" - ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "db.project" ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here is a map with the result of this code. This map displays the texts of the State of the Union.\n", + "https://atlas.nomic.ai/map/3e4de075-89ff-486a-845c-36c23f30bb67/d8ce2284-8edb-4050-8b9b-9bb543d7f647" + ] } ], "metadata": {