Add Generative Search to Weaviate examples + fix authentication examples (#398)

* add Weaviate authentication to client configuration

* add a cookbook for generative search

* fix name
pull/441/head
Sebastian Witalec 1 year ago committed by GitHub
parent eeb8ddf2f3
commit 6d77a1b93e
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -0,0 +1,275 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "cb1537e6",
"metadata": {},
"source": [
"# Using Weaviate with the Generative OpenAI module for Generative Search\n",
"\n",
"This notebook is prepared for a scenario where:\n",
"* Your data is already in Weaviate\n",
"* You want to use Weaviate with the Generative OpenAI module ([generative-openai](https://weaviate.io/developers/weaviate/modules/reader-generator-modules/generative-openai)).\n",
"\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f1a618c5",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"This cookbook only covers Generative Search examples; it doesn't cover configuration or data imports.\n",
"\n",
"To make the most of this cookbook, please complete the [Getting Started cookbook](./getting-started-with-weaviate-and-openai.ipynb) first, where you will learn the essentials of working with Weaviate and import the demo data.\n",
"\n",
"Checklist:\n",
"* completed [Getting Started cookbook](./getting-started-with-weaviate-and-openai.ipynb),\n",
"* created a `Weaviate` instance,\n",
"* imported data into your `Weaviate` instance,\n",
"* you have an [OpenAI API key](https://beta.openai.com/account/api-keys)"
]
},
{
"cell_type": "markdown",
"id": "36fe86f4",
"metadata": {},
"source": [
"===========================================================\n",
"## Prepare your OpenAI API key\n",
"\n",
"The `OpenAI API key` is used for vectorization of your data at import, and for running queries.\n",
"\n",
"If you don't have an OpenAI API key, you can get one from [https://beta.openai.com/account/api-keys](https://beta.openai.com/account/api-keys).\n",
"\n",
"Once you get your key, please add it to your environment variables as `OPENAI_API_KEY`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43395339",
"metadata": {},
"outputs": [],
"source": [
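"# Set the OpenAI API key for this notebook session\n",
"# Note: `!export` runs in a separate subshell, so the variable would not reach the notebook kernel; set it through os.environ instead.\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"your key\""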
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88be138c",
"metadata": {},
"outputs": [],
"source": [
"# Test that your OpenAI API key is correctly set as an environment variable\n",
"# Note: if you run this notebook locally, you will need to reload your terminal and the notebook for the env variables to be live.\n",
"import os\n",
"\n",
"# Note: alternatively, you can set a temporary env variable like this:\n",
"# os.environ[\"OPENAI_API_KEY\"] = 'your-key-goes-here'\n",
"\n",
"if os.getenv(\"OPENAI_API_KEY\") is not None:\n",
"    print(\"OPENAI_API_KEY is ready\")\n",
"else:\n",
"    print(\"OPENAI_API_KEY environment variable not found\")"
]
},
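{
"cell_type": "markdown",
"id": "gs-sketch-require-env",
"metadata": {},
"source": [
"A stricter variant of the check above stops the notebook early when the key is missing. This is only a minimal sketch and not part of the original cookbook; the helper name `require_env` is hypothetical.\n",
"\n",
"```python\n",
"import os\n",
"\n",
"\n",
"def require_env(name):\n",
"    # Return the value of an environment variable, or fail with a clear message\n",
"    value = os.getenv(name)\n",
"    if not value:\n",
"        raise RuntimeError(f\"{name} environment variable not found\")\n",
"    return value\n",
"\n",
"\n",
"# Example usage (raises RuntimeError if the key is not set):\n",
"# openai_api_key = require_env(\"OPENAI_API_KEY\")\n",
"```"
]
},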
{
"cell_type": "markdown",
"id": "91df4d5b",
"metadata": {},
"source": [
"## Connect to your Weaviate instance\n",
"\n",
"In this section, we will:\n",
"\n",
"1. test the env variable `OPENAI_API_KEY` (**make sure** you completed the step in [Prepare your OpenAI API key](#Prepare-your-OpenAI-API-key))\n",
"2. connect to your Weaviate instance with your `OpenAI API Key`\n",
"3. and test the client connection\n",
"\n",
"### The client \n",
"\n",
"After this step, the `client` object will be used to perform all Weaviate-related operations."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc662c1b",
"metadata": {},
"outputs": [],
"source": [
"import weaviate\n",
"import os\n",
"\n",
"# Connect to your Weaviate instance\n",
"client = weaviate.Client(\n",
" url=\"https://your-wcs-instance-name.weaviate.network/\",\n",
" # url=\"http://localhost:8080/\",\n",
" auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"<YOUR-WEAVIATE-API-KEY>\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n",
" additional_headers={\n",
" \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n",
" }\n",
")\n",
"\n",
"# Check if your instance is live and ready\n",
"# This should return `True`\n",
"client.is_ready()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ceb14da9",
"metadata": {},
"source": [
"## Generative Search\n",
"Weaviate offers a [Generative Search OpenAI](https://weaviate.io/developers/weaviate/modules/reader-generator-modules/generative-openai) module, which generates responses based on the data stored in your Weaviate instance.\n",
"\n",
"The way you construct a generative search query is very similar to a standard semantic search query in Weaviate. \n",
"\n",
"For example:\n",
"* search in \"Article\",\n",
"* return \"title\", \"content\", \"url\"\n",
"* look for objects related to \"football clubs\"\n",
"* limit results to 5 objects\n",
"\n",
"```\n",
"    result = (\n",
"        client.query\n",
"        .get(\"Article\", [\"title\", \"content\", \"url\"])\n",
"        .with_near_text({\"concepts\": [\"football clubs\"]})\n",
"        .with_limit(5)\n",
"        # generative query will go here\n",
"        .do()\n",
"    )\n",
"```\n",
"\n",
"Now you can add the `with_generate()` function to apply a generative transformation. `with_generate` takes either:\n",
"- `single_prompt` - to generate a response for each returned object,\n",
"- `grouped_task` - to generate a single response from all returned objects.\n"
]
},
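{
"cell_type": "markdown",
"id": "gs-sketch-prompt-fill",
"metadata": {},
"source": [
"To make the difference between the two options concrete, the sketch below imitates locally how a `single_prompt` template is filled in once per object, while a `grouped_task` is one instruction for the whole result set. It is only an illustration: the real prompts are assembled by the Generative OpenAI module on the Weaviate side, and the sample objects and the `fill_prompt` helper are made up for this example.\n",
"\n",
"```python\n",
"# Hypothetical sample objects, shaped like the properties requested in a query\n",
"objects = [\n",
"    {\"title\": \"Club A\", \"content\": \"Club A is a football club.\"},\n",
"    {\"title\": \"Club B\", \"content\": \"Club B plays in the first division.\"},\n",
"]\n",
"\n",
"\n",
"def fill_prompt(template, obj):\n",
"    # Replace each {property} placeholder with the object's value\n",
"    prompt = template\n",
"    for key, value in obj.items():\n",
"        prompt = prompt.replace(\"{\" + key + \"}\", str(value))\n",
"    return prompt\n",
"\n",
"\n",
"single_prompt = \"Summarize in a short tweet the following content: {content}\"\n",
"\n",
"# single_prompt: one prompt (and so one generated response) per object\n",
"per_object_prompts = [fill_prompt(single_prompt, obj) for obj in objects]\n",
"\n",
"# grouped_task: one instruction applied to all returned objects together\n",
"grouped_task = \"Explain what these have in common\"\n",
"\n",
"for prompt in per_object_prompts:\n",
"    print(prompt)\n",
"```"
]
},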
{
"cell_type": "code",
"execution_count": null,
"id": "51559251",
"metadata": {},
"outputs": [],
"source": [
"def generative_search_per_item(query, collection_name):\n",
"    prompt = \"Summarize in a short tweet the following content: {content}\"\n",
"\n",
"    result = (\n",
"        client.query\n",
"        .get(collection_name, [\"title\", \"content\", \"url\"])\n",
"        .with_near_text({\"concepts\": [query], \"distance\": 0.7})\n",
"        .with_limit(5)\n",
"        .with_generate(single_prompt=prompt)\n",
"        .do()\n",
"    )\n",
"\n",
"    # Check for errors\n",
"    if \"errors\" in result:\n",
"        print(\"\\033[91mYou have probably run out of OpenAI API calls for the current minute; the limit is set at 60 per minute.\")\n",
"        raise Exception(result[\"errors\"][0][\"message\"])\n",
"\n",
"    return result[\"data\"][\"Get\"][collection_name]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a4604726",
"metadata": {},
"outputs": [],
"source": [
"query_result = generative_search_per_item(\"football clubs\", \"Article\")\n",
"\n",
"for i, article in enumerate(query_result):\n",
"    print(f\"{i+1}. {article['title']}\")\n",
"    print(article['_additional']['generate']['singleResult'])  # print generated response\n",
"    print(\"-----------------------\")"
]
},
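{
"cell_type": "markdown",
"id": "gs-sketch-extract-results",
"metadata": {},
"source": [
"The loop above assumes every returned object carries a generated response. A more defensive way to read the same keys could look like the sketch below. The `extract_single_results` helper and the sample items are hypothetical; they only mirror the keys read in the cell above and run without a Weaviate connection.\n",
"\n",
"```python\n",
"def extract_single_results(articles):\n",
"    # Collect (title, generated text) pairs; generated text is None when absent\n",
"    rows = []\n",
"    for article in articles:\n",
"        generate = (article.get(\"_additional\") or {}).get(\"generate\") or {}\n",
"        rows.append((article.get(\"title\"), generate.get(\"singleResult\")))\n",
"    return rows\n",
"\n",
"\n",
"# Hypothetical response items, mirroring the keys read in the cell above\n",
"sample = [\n",
"    {\"title\": \"Club A\", \"_additional\": {\"generate\": {\"singleResult\": \"A short tweet.\"}}},\n",
"    {\"title\": \"Club B\", \"_additional\": {\"generate\": None}},\n",
"]\n",
"\n",
"for title, text in extract_single_results(sample):\n",
"    print(title, \"->\", text if text is not None else \"(no generated text)\")\n",
"```"
]
},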
{
"cell_type": "code",
"execution_count": 79,
"id": "a45ea160",
"metadata": {},
"outputs": [],
"source": [
"def generative_search_group(query, collection_name):\n",
"    generate_task = \"Explain what these have in common\"\n",
"\n",
"    result = (\n",
"        client.query\n",
"        .get(collection_name, [\"title\", \"content\", \"url\"])\n",
"        .with_near_text({\"concepts\": [query], \"distance\": 0.7})\n",
"        .with_generate(grouped_task=generate_task)\n",
"        .with_limit(5)\n",
"        .do()\n",
"    )\n",
"\n",
"    # Check for errors\n",
"    if \"errors\" in result:\n",
"        print(\"\\033[91mYou have probably run out of OpenAI API calls for the current minute; the limit is set at 60 per minute.\")\n",
"        raise Exception(result[\"errors\"][0][\"message\"])\n",
"\n",
"    return result[\"data\"][\"Get\"][collection_name]"
]
},
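{
"cell_type": "markdown",
"id": "gs-sketch-retry",
"metadata": {},
"source": [
"Since the error handling above points at a per-minute rate limit, one option is to wait and retry instead of failing right away. The sketch below is a generic retry wrapper with exponential backoff; `with_retry`, its parameters, and the delays are hypothetical choices and not part of the Weaviate client.\n",
"\n",
"```python\n",
"import time\n",
"\n",
"\n",
"def with_retry(fn, attempts=3, base_delay=1.0, sleep=time.sleep):\n",
"    # Call fn(); on failure wait base_delay * 2**attempt seconds and try again\n",
"    for attempt in range(attempts):\n",
"        try:\n",
"            return fn()\n",
"        except Exception:\n",
"            if attempt == attempts - 1:\n",
"                raise  # out of attempts: surface the last error\n",
"            sleep(base_delay * (2 ** attempt))\n",
"\n",
"\n",
"# Example usage with the function defined above:\n",
"# query_result = with_retry(lambda: generative_search_group(\"football clubs\", \"Article\"))\n",
"```\n",
"\n",
"The `sleep` parameter is injectable so the wrapper can be exercised without real waiting."
]
},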
{
"cell_type": "code",
"execution_count": null,
"id": "11e0dad2",
"metadata": {},
"outputs": [],
"source": [
"query_result = generative_search_group(\"football clubs\", \"Article\")\n",
"\n",
"print (query_result[0]['_additional']['generate']['groupedResult'])"
]
},
{
"cell_type": "markdown",
"id": "2007be48",
"metadata": {},
"source": [
"Thanks for following along. You're now equipped to set up your own vector databases and use embeddings to do all kinds of cool things. Enjoy! For more complex use cases, please continue to work through the other cookbook examples in this repo."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
},
"vscode": {
"interpreter": {
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -241,6 +241,7 @@
"client = weaviate.Client(\n",
" url=\"https://your-wcs-instance-name.weaviate.network/\",\n",
" # url=\"http://localhost:8080/\",\n",
" auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"<YOUR-WEAVIATE-API-KEY>\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n",
" additional_headers={\n",
" \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n",
" }\n",

@ -241,6 +241,7 @@
"client = weaviate.Client(\n",
" url=\"https://your-wcs-instance-name.weaviate.network/\",\n",
"# url=\"http://localhost:8080/\",\n",
" auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"<YOUR-WEAVIATE-API-KEY>\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n",
" additional_headers={\n",
" \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n",
" }\n",

@ -240,6 +240,7 @@
"client = weaviate.Client(\n",
" url=\"https://your-wcs-instance-name.weaviate.network/\",\n",
"# url=\"http://localhost:8080/\",\n",
" auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"<YOUR-WEAVIATE-API-KEY>\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n",
" additional_headers={\n",
" \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n",
" }\n",
