{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Azure embeddings example\n",
    "\n",
    "> Note: There is a newer version of the openai library available. See https://github.com/openai/openai-python/discussions/742\n",
    "\n",
    "This example will cover embeddings using the Azure OpenAI service."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we install the necessary dependencies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! pip install \"openai>=0.28.1,<1.0.0\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the following sections to work properly, we first have to set up a few things. Let's start with the `api_base` and `api_version`. To find your `api_base`, go to https://portal.azure.com, find your resource and then under \"Resource Management\" -> \"Keys and Endpoints\" look for the \"Endpoint\" value."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import openai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "openai.api_version = '2023-05-15'\n",
    "openai.api_base = '' # Please add your endpoint here"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we have to set up the `api_type` and `api_key`. We can either get the key from the portal, or we can get it through Microsoft Active Directory Authentication. Depending on which you use, the `api_type` is either `azure` or `azure_ad`."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup: Portal\n",
    "Let's first look at getting the key from the portal. Go to https://portal.azure.com, find your resource and then under \"Resource Management\" -> \"Keys and Endpoints\" look for one of the \"Keys\" values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "openai.api_type = 'azure'\n",
    "openai.api_key = os.environ[\"OPENAI_API_KEY\"]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Note: In this example, we configured the library to use the Azure API by setting the variables in code. For development, consider setting the environment variables instead:\n",
    "\n",
    "```\n",
    "OPENAI_API_BASE\n",
    "OPENAI_API_KEY\n",
    "OPENAI_API_TYPE\n",
    "OPENAI_API_VERSION\n",
    "```"
   ]
  },
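  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an optional illustration (not part of the original walkthrough), the commented cell below sketches how the same configuration could be read back from those environment variables. The variable names are the ones listed in the note above; the fallback values are just placeholders."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional sketch: read the configuration from environment variables instead of\n",
    "# hard-coding it. The fallback values are placeholders, not required settings.\n",
    "# openai.api_type = os.getenv(\"OPENAI_API_TYPE\", \"azure\")\n",
    "# openai.api_base = os.getenv(\"OPENAI_API_BASE\", openai.api_base)\n",
    "# openai.api_version = os.getenv(\"OPENAI_API_VERSION\", \"2023-05-15\")\n",
    "# openai.api_key = os.getenv(\"OPENAI_API_KEY\", \"\")"
   ]
  },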
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### (Optional) Setup: Microsoft Active Directory Authentication\n",
    "Let's now see how we can get a key via Microsoft Active Directory Authentication. Uncomment the following code if you want to use Active Directory Authentication instead of keys from the portal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# from azure.identity import DefaultAzureCredential\n",
    "\n",
    "# default_credential = DefaultAzureCredential()\n",
    "# token = default_credential.get_token(\"https://cognitiveservices.azure.com/.default\")\n",
    "\n",
    "# openai.api_type = 'azure_ad'\n",
    "# openai.api_key = token.token"
   ]
  },
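  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick practical note (added here as a sketch, not part of the original example): the `azure.identity` import above comes from the separate `azure-identity` package, which the pip command at the top of this notebook does not install. If you take the Active Directory route, you may need to install it first:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extra dependency for the optional Active Directory path (uncomment to install).\n",
    "# ! pip install azure-identity"
   ]
  },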
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A token is only valid for a limited period of time, after which it expires. To ensure a valid token is sent with every request, you can refresh an expiring token by hooking into requests.auth:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import typing\n",
    "import time\n",
    "import requests\n",
    "\n",
    "if typing.TYPE_CHECKING:\n",
    "    from azure.core.credentials import AccessToken, TokenCredential\n",
    "\n",
    "\n",
    "class TokenRefresh(requests.auth.AuthBase):\n",
    "    \"\"\"requests auth hook that refreshes the Azure AD token before it expires.\"\"\"\n",
    "\n",
    "    def __init__(self, credential: \"TokenCredential\", scopes: typing.List[str]) -> None:\n",
    "        self.credential = credential\n",
    "        self.scopes = scopes\n",
    "        self.cached_token: typing.Optional[\"AccessToken\"] = None\n",
    "\n",
    "    def __call__(self, req):\n",
    "        # Refresh the cached token if it is missing or expires within five minutes.\n",
    "        if not self.cached_token or self.cached_token.expires_on - time.time() < 300:\n",
    "            self.cached_token = self.credential.get_token(*self.scopes)\n",
    "        req.headers[\"Authorization\"] = f\"Bearer {self.cached_token.token}\"\n",
    "        return req\n",
    "\n",
    "\n",
    "# `default_credential` comes from the Active Directory cell above (uncomment it first).\n",
    "session = requests.Session()\n",
    "session.auth = TokenRefresh(default_credential, [\"https://cognitiveservices.azure.com/.default\"])\n",
    "\n",
    "# The openai 0.x library will use this session for all HTTP requests.\n",
    "openai.requestssession = session"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deployments\n",
    "In this section we are going to create a deployment that we can use to create embeddings."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deployments: Create manually\n",
    "Let's create a deployment using the `text-similarity-curie-001` model. Create a new deployment by going to your resource in the portal under \"Resource Management\" -> \"Model deployments\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "deployment_id = '' # Fill in the deployment id from the portal here"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deployments: Listing\n",
    "Because creating a new deployment can take a while, let's instead look in the subscription for an already finished deployment that succeeded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Looking for an already succeeded deployment that supports embeddings.')\n",
    "deployment_id = None\n",
    "result = openai.Deployment.list()\n",
    "for deployment in result.data:\n",
    "    # Skip deployments that have not finished provisioning successfully.\n",
    "    if deployment[\"status\"] != \"succeeded\":\n",
    "        continue\n",
    "\n",
    "    # Skip deployments whose underlying model does not support embeddings.\n",
    "    model = openai.Model.retrieve(deployment[\"model\"])\n",
    "    if not model[\"capabilities\"][\"embeddings\"]:\n",
    "        continue\n",
    "\n",
    "    deployment_id = deployment[\"id\"]\n",
    "    break\n",
    "\n",
    "if not deployment_id:\n",
    "    print('No deployment with status \"succeeded\" found.')\n",
    "else:\n",
    "    print(f'Found a succeeded deployment that supports embeddings with id: {deployment_id}.')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Embeddings\n",
    "Now let's send some sample text to the deployment and create an embedding for it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "embeddings = openai.Embedding.create(deployment_id=deployment_id,\n",
    "                                     input=\"The food was delicious and the waiter...\")\n",
    "\n",
    "print(embeddings)"
   ]
  },
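  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small follow-up sketch (not part of the original example), the cell below pulls the raw embedding vector out of the response, assuming the standard `Embedding.create` response shape (`data[0].embedding`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Follow-up sketch: extract the raw vector from the response above.\n",
    "# Assumes the standard Embedding.create response shape: data[0].embedding.\n",
    "vector = embeddings[\"data\"][0][\"embedding\"]\n",
    "print(f\"Embedding dimension: {len(vector)}\")\n",
    "print(vector[:5])"
   ]
  }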
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  },
  "vscode": {
   "interpreter": {
    "hash": "3a5103089ab7e7c666b279eeded403fcec76de49a40685dbdfe9f9c78ad97c17"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}