From dd5143a8d6463d103f53faf432713324bcc9b7cd Mon Sep 17 00:00:00 2001
From: liuliu
Date: Fri, 17 Mar 2023 14:00:18 +0800
Subject: [PATCH 01/11] docs: complete the missing symbol * that should be paired in pinecone/Gen_QA.ipynb

---
 examples/vector_databases/pinecone/Gen_QA.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/vector_databases/pinecone/Gen_QA.ipynb b/examples/vector_databases/pinecone/Gen_QA.ipynb
index e03d15c6..c92429d4 100644
--- a/examples/vector_databases/pinecone/Gen_QA.ipynb
+++ b/examples/vector_databases/pinecone/Gen_QA.ipynb
@@ -13,7 +13,7 @@
 "\n",
 "In this notebook we will learn how to query relevant contexts to our queries from Pinecone, and pass these to a generative OpenAI model to generate an answer backed by real data sources.\n",
 "\n",
- "A common problem with using GPT-3 to factually answer questions is that GPT-3 can sometimes make things up. The GPT models have a broad range of general knowledge, but this does not necessarily apply to more specific information. For that we use the Pinecone vector database as our _\"external knowledge base\"_ — like *long-term memory for GPT-3.\n",
+ "A common problem with using GPT-3 to factually answer questions is that GPT-3 can sometimes make things up. The GPT models have a broad range of general knowledge, but this does not necessarily apply to more specific information. For that we use the Pinecone vector database as our _\"external knowledge base\"_ — like *long-term memory* for GPT-3.\n",
 "\n",
 "Required installs for this notebook are:"
 ]

From e3c3e43703f10023fc0e7ce4f6ddb1eec1c32185 Mon Sep 17 00:00:00 2001
From: liuliu
Date: Fri, 17 Mar 2023 14:03:01 +0800
Subject: [PATCH 02/11] doc: highlight text and tool name

1. cannot: highlight to enhance the importance
2. Pinecone
---
 examples/vector_databases/pinecone/Gen_QA.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/vector_databases/pinecone/Gen_QA.ipynb b/examples/vector_databases/pinecone/Gen_QA.ipynb
index c92429d4..bbdeab60 100644
--- a/examples/vector_databases/pinecone/Gen_QA.ipynb
+++ b/examples/vector_databases/pinecone/Gen_QA.ipynb
@@ -212,7 +212,7 @@
 "The best training method to use for fine-tuning a pre-trained model with sentence transformers is the Masked Language Model (MLM) training. MLM training involves randomly masking some of the words in a sentence and then training the model to predict the masked words. This helps the model to learn the context of the sentence and better understand the relationships between words.\n",
 "```\n",
 "\n",
- "This answer seems pretty convincing right? Yet, it's wrong. MLM is typically used in the pretraining step of a transformer model but *cannot* be used to fine-tune a sentence-transformer, and has nothing to do with having _\"pairs of related sentences\"_.\n",
+ "This answer seems pretty convincing right? Yet, it's wrong. MLM is typically used in the pretraining step of a transformer model but *\"cannot\"* be used to fine-tune a sentence-transformer, and has nothing to do with having _\"pairs of related sentences\"_.\n",
 "\n",
 "An alternative answer we receive (and the one we returned above) is about `supervised learning approach` being the most suitable. This is completely true, but it's not specific and doesn't answer the question.\n",
 "\n",
@@ -555,7 +555,7 @@
 "id": "VMyJjt1cnwcH"
 },
 "source": [
- "Now we need a place to store these embeddings and enable a efficient _vector search_ through them all. To do that we use Pinecone, we can get a [free API key](https://app.pinecone.io) and enter it below where we will initialize our connection to Pinecone and create a new index."
+ "Now we need a place to store these embeddings and enable a efficient _vector search_ through them all. To do that we use **`Pinecone`**, we can get a [free API key](https://app.pinecone.io) and enter it below where we will initialize our connection to `Pinecone` and create a new index."
 ]
 },
 {

From d33481caf6b2659bfb8635f80b833956cfefd54e Mon Sep 17 00:00:00 2001
From: liuliu
Date: Fri, 17 Mar 2023 14:05:40 +0800
Subject: [PATCH 03/11] fix: remove unused package import

1. datetime is unused
---
 examples/vector_databases/pinecone/Gen_QA.ipynb | 1 -
 1 file changed, 1 deletion(-)

diff --git a/examples/vector_databases/pinecone/Gen_QA.ipynb b/examples/vector_databases/pinecone/Gen_QA.ipynb
index bbdeab60..7e024c04 100644
--- a/examples/vector_databases/pinecone/Gen_QA.ipynb
+++ b/examples/vector_databases/pinecone/Gen_QA.ipynb
@@ -660,7 +660,6 @@
 ],
 "source": [
 "from tqdm.auto import tqdm\n",
- "import datetime\n",
 "from time import sleep\n",
 "\n",
 "batch_size = 100 # how many embeddings we create and insert at once\n",

From 14256c178d2e719bf0178b2cb355c14fa3ac88b4 Mon Sep 17 00:00:00 2001
From: liuliu
Date: Fri, 17 Mar 2023 14:31:20 +0800
Subject: [PATCH 04/11] docs: highlight tool name and modify spell

1. highlight "Qdrant"
2. modify "REST" to "RESTful"
---
 .../qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb b/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb
index 568b961a..c5144a9f 100644
--- a/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb
+++ b/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb
@@ -6,7 +6,7 @@
 "source": [
 "# Using Qdrant as a vector database for OpenAI embeddings\n",
 "\n",
 "This notebook guides you step by step on using **`Qdrant`** as a vector database for OpenAI embeddings. [Qdrant](https://qdrant.tech) is a high-performant vector search database written in Rust. It offers RESTful and gRPC APIs to manage your embeddings. There is an official Python [qdrant-client](https://github.com/qdrant/qdrant_client) that eases the integration with your apps.\n",
 "\n",
 "This notebook presents an end-to-end process of:\n",
 "1. Using precomputed embeddings created by OpenAI API.\n",
@@ -28,7 +28,7 @@
 "\n",
 "### Integration\n",
 "\n",
 "[Qdrant](https://qdrant.tech) provides both RESTful and gRPC APIs which makes integration easy, no matter the programming language you use.
However, there are some official clients for the most popular languages available, and if you use Python then the [Python Qdrant client library](https://github.com/qdrant/qdrant_client) might be the best choice." ] }, { From 87cc560699a8a23632621f39ce17356910bcf25e Mon Sep 17 00:00:00 2001 From: liuliu Date: Fri, 17 Mar 2023 14:34:08 +0800 Subject: [PATCH 05/11] docs: create a block to show how to export parameter to environment in terminal --- .../Getting_started_with_Qdrant_and_OpenAI.ipynb | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb b/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb index c5144a9f..1dd3e122 100644 --- a/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb +++ b/examples/vector_databases/qdrant/Getting_started_with_Qdrant_and_OpenAI.ipynb @@ -132,7 +132,16 @@ "\n", "If you don't have an OpenAI API key, you can get one from [https://beta.openai.com/account/api-keys](https://beta.openai.com/account/api-keys).\n", "\n", - "Once you get your key, please add it to your environment variables as `OPENAI_API_KEY`." + "Once you get your key, please add it to your environment variables as `OPENAI_API_KEY` by running following command:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "! export OPENAI_API_KEY=\"your API key\"" ] }, { From 8d933edd7505a5cffe4cca4f0ba1ce7a054ab486 Mon Sep 17 00:00:00 2001 From: liuliu Date: Fri, 17 Mar 2023 14:56:01 +0800 Subject: [PATCH 06/11] docs: create a block to show how to export parameter to environment in terminal --- .../qdrant/QA_with_Langchain_Qdrant_and_OpenAI.ipynb | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/examples/vector_databases/qdrant/QA_with_Langchain_Qdrant_and_OpenAI.ipynb b/examples/vector_databases/qdrant/QA_with_Langchain_Qdrant_and_OpenAI.ipynb index 5f2dba84..24d72d7b 100644 --- a/examples/vector_databases/qdrant/QA_with_Langchain_Qdrant_and_OpenAI.ipynb +++ b/examples/vector_databases/qdrant/QA_with_Langchain_Qdrant_and_OpenAI.ipynb @@ -122,7 +122,16 @@ "\n", "If you don't have an OpenAI API key, you can get one from [https://beta.openai.com/account/api-keys](https://beta.openai.com/account/api-keys).\n", "\n", - "Once you get your key, please add it to your environment variables as `OPENAI_API_KEY`." + "Once you get your key, please add it to your environment variables as `OPENAI_API_KEY` by running following command:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "! 
export OPENAI_API_KEY=\"your API key\"" ] }, { From 6bf85caa6ffd3acaaa71cd052f009a140f551744 Mon Sep 17 00:00:00 2001 From: liuliu Date: Fri, 17 Mar 2023 17:29:58 +0800 Subject: [PATCH 07/11] docs: fix minor spelling issue --- .../redis/getting-started-with-redis-and-openai.ipynb | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb index 44a2ce5a..b4781eae 100644 --- a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb +++ b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb @@ -38,7 +38,7 @@ "source": [ "## Prerequisites\n", "\n", - "Before we start this project, we need setup the following:\n", + "Before we start this project, we need to set up the following:\n", "\n", "* start a Redis database with RediSearch (redis-stack)\n", "* install libraries\n", @@ -351,7 +351,7 @@ "source": [ "## Creating a Search Index in Redis\n", "\n", - "The below cells will show how to specify and create a search index in Redis. We will\n", + "The below cells will show how to specify and create a search index in Redis. We will:\n", "\n", "1. Set some constants for defining our index like the distance metric and the index name\n", "2. Define the index schema with RediSearch fields\n", @@ -432,7 +432,7 @@ "source": [ "## Load Documents into the Index\n", "\n", - "Now that we have a search index, we can load documents into it. We will use the same documents we used in the previous examples. In Redis, either the Hash or JSON (if using RedisJSON in addition to RediSearch) data types can be used to store documents. We will use the HASH data type in this example. The below cells will show how to load documents into the index." + "Now that we have a search index, we can load documents into it. We will use the same documents we used in the previous examples. In Redis, either the HASH or JSON (if using RedisJSON in addition to RediSearch) data types can be used to store documents. We will use the HASH data type in this example. The below cells will show how to load documents into the index." ] }, { @@ -682,7 +682,7 @@ "\n", "``HNSW`` will take longer to build and consume more memory for most cases than ``FLAT`` but will be faster to run queries on, especially for large datasets.\n", "\n", - "The following cells will show how to create an ``HNSW`` index and run queries with it using the same data as before." + "The following cells will show how to create a ``HNSW`` index and run queries with it using the same data as before." ] }, { From 5238623cbf5fbf7bb2d65e6d61fa8741e7b65bda Mon Sep 17 00:00:00 2001 From: liuliu Date: Fri, 17 Mar 2023 17:30:19 +0800 Subject: [PATCH 08/11] fix: docker-compose up command 1. 
docker compose -> docker-compose
---
 .../redis/getting-started-with-redis-and-openai.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
index b4781eae..f96e0616 100644
--- a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
+++ b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
@@ -52,7 +52,7 @@
 "To keep this example simple, we will use the Redis Stack docker container which we can start as follows\n",
 "\n",
 "```bash\n",
 "$ docker-compose up -d\n",
 "```\n",
 "\n",
 "This also includes the [RedisInsight](https://redis.com/redis-enterprise/redis-insight/) GUI for managing your Redis database which you can view at [http://localhost:8001](http://localhost:8001) once you start the docker container.\n",

From b50ac7dff8e876f8ce7fa0acd8e3e371159e9506 Mon Sep 17 00:00:00 2001
From: liuliu
Date: Fri, 17 Mar 2023 17:31:12 +0800
Subject: [PATCH 09/11] fix: add space in block of python

1. !pip -> ! pip
---
 .../redis/getting-started-with-redis-and-openai.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
index f96e0616..8162f8f9 100644
--- a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
+++ b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
@@ -78,7 +78,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
 "! pip install redis wget pandas openai"
 ]
 },
 {

From 34367ad85df29aaf4ae6ae6ab2696c75eeff2ede Mon Sep 17 00:00:00 2001
From: liuliu
Date: Fri, 17 Mar 2023 17:32:13 +0800
Subject: [PATCH 10/11] docs: create a block to show how to export parameter to environment in terminal

---
 .../getting-started-with-redis-and-openai.ipynb | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
index 8162f8f9..ccfe21e6 100644
--- a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
+++ b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
@@ -94,7 +94,17 @@
 "\n",
 "If you don't have an OpenAI API key, you can get one from [https://beta.openai.com/account/api-keys](https://beta.openai.com/account/api-keys).\n",
 "\n",
- "Once you get your key, please add it to your environment variables as `OPENAI_API_KEY`."
+ "Once you get your key, please add it to your environment variables as `OPENAI_API_KEY` by using following command:"
 ]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "id": "be28faa6",
 "metadata": {},
 "outputs": [],
 "source": [
 "! export OPENAI_API_KEY=\"your API key\""
 ]
 },
 {

From 524c139bc17255ee4dd3ef9898216aad81962b74 Mon Sep 17 00:00:00 2001
From: liuliu
Date: Wed, 29 Mar 2023 09:53:06 +0800
Subject: [PATCH 11/11] revert: spelling adjustment in examples/
1. a -> an in getting-started-with-redis-and-openai.ipynb
---
 .../redis/getting-started-with-redis-and-openai.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
index ccfe21e6..6520928b 100644
--- a/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
+++ b/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb
@@ -692,7 +692,7 @@
 "\n",
 "``HNSW`` will take longer to build and consume more memory for most cases than ``FLAT`` but will be faster to run queries on, especially for large datasets.\n",
 "\n",
 "The following cells will show how to create an ``HNSW`` index and run queries with it using the same data as before."
 ]
 },
 {