{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure audio Whisper (preview) example\n",
"\n",
"> Note: There is a newer version of the openai library available. See https://github.com/openai/openai-python/discussions/742\n",
"\n",
"This example shows how to use the Azure OpenAI Whisper model to transcribe audio files."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"First, we install the necessary dependencies."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install \"openai>=0.28.1,<1.0.0\"\n",
"! pip install python-dotenv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we'll import our libraries and configure the Python OpenAI SDK to work with the Azure OpenAI service."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Note: In this example, we configure the library to use the Azure API by setting its variables in code. For development, consider setting the following environment variables instead:\n",
"\n",
"```\n",
"OPENAI_API_BASE\n",
"OPENAI_API_KEY\n",
"OPENAI_API_TYPE\n",
"OPENAI_API_VERSION\n",
"```"
]
},
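{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, with `python-dotenv` you can keep these in a `.env` file next to the notebook. The values below are placeholders, not real credentials:\n",
"\n",
"```\n",
"OPENAI_API_BASE=https://<your-resource-name>.openai.azure.com/\n",
"OPENAI_API_KEY=<your-api-key>\n",
"OPENAI_API_TYPE=azure\n",
"OPENAI_API_VERSION=2023-09-01-preview\n",
"```"
]
},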
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import os\n",
"import dotenv\n",
"import openai\n",
"\n",
"\n",
"dotenv.load_dotenv()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"To access the Azure OpenAI service, we first need to create the proper resources in the [Azure Portal](https://portal.azure.com) (you can find a detailed guide in the [Microsoft Docs](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal)).\n",
"\n",
"Once the resource is created, the first thing we need is its endpoint. You can find the endpoint in the *\"Keys and Endpoints\"* section under *\"Resource Management\"*. With this information, we can set up the SDK:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"openai.api_base = os.environ[\"OPENAI_API_BASE\"]\n",
"\n",
"# Minimum API version that supports Whisper\n",
"openai.api_version = \"2023-09-01-preview\"\n",
"\n",
"# Enter the deployment_id to use for the Whisper model\n",
"deployment_id = \"<deployment-id-for-your-whisper-model>\""
]
},
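{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"> Note: The endpoint typically has the form `https://<your-resource-name>.openai.azure.com/`, and `deployment_id` is the name you gave the Whisper deployment in your resource, which may differ from the underlying model name."
]
},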
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Authentication\n",
"\n",
"The Azure OpenAI service supports multiple authentication mechanisms, including API keys and Azure Active Directory credentials."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# set to True if using Azure Active Directory authentication\n",
"use_azure_active_directory = False"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Authentication using API key\n",
"\n",
"To set up the OpenAI SDK to use an *Azure API key*, set `api_type` to `azure` and set `api_key` to a key associated with your endpoint (you can find this key in *\"Keys and Endpoints\"* under *\"Resource Management\"* in the [Azure Portal](https://portal.azure.com))."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"if not use_azure_active_directory:\n",
"    openai.api_type = 'azure'\n",
"    openai.api_key = os.environ[\"OPENAI_API_KEY\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Authentication using Azure Active Directory\n",
"\n",
"Let's now see how we can get a token via Azure Active Directory authentication."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.identity import DefaultAzureCredential\n",
"\n",
"if use_azure_active_directory:\n",
"    default_credential = DefaultAzureCredential()\n",
"    token = default_credential.get_token(\"https://cognitiveservices.azure.com/.default\")\n",
"\n",
"    openai.api_type = 'azure_ad'\n",
"    openai.api_key = token.token"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A token is valid for a period of time, after which it will expire. To ensure a valid token is sent with every request, you can refresh an expiring token by hooking into `requests.auth`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import typing\n",
"import time\n",
"import requests\n",
"\n",
"if typing.TYPE_CHECKING:\n",
"    from azure.core.credentials import AccessToken, TokenCredential\n",
"\n",
"class TokenRefresh(requests.auth.AuthBase):\n",
"\n",
"    def __init__(self, credential: \"TokenCredential\", scopes: typing.List[str]) -> None:\n",
"        self.credential = credential\n",
"        self.scopes = scopes\n",
"        self.cached_token: typing.Optional[\"AccessToken\"] = None\n",
"\n",
"    def __call__(self, req):\n",
"        # refresh the token if it is missing or expires within the next 5 minutes\n",
"        if not self.cached_token or self.cached_token.expires_on - time.time() < 300:\n",
"            self.cached_token = self.credential.get_token(*self.scopes)\n",
"        req.headers[\"Authorization\"] = f\"Bearer {self.cached_token.token}\"\n",
"        return req\n",
"\n",
"if use_azure_active_directory:\n",
"    session = requests.Session()\n",
"    session.auth = TokenRefresh(default_credential, [\"https://cognitiveservices.azure.com/.default\"])\n",
"\n",
"    # the openai library (<1.0.0) will use this session for all requests\n",
"    openai.requestssession = session"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Audio transcription\n",
"\n",
"Audio transcription, or speech-to-text, is the process of converting spoken words into text. Use the `openai.Audio.transcribe` method to transcribe an audio file stream to text.\n",
"\n",
"You can get sample audio files from the [Azure AI Speech SDK repository on GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/sampledata/audiofiles)."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"# download sample audio file\n",
"import requests\n",
"\n",
"sample_audio_url = \"https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/sampledata/audiofiles/wikipediaOcelot.wav\"\n",
"audio_file = requests.get(sample_audio_url)\n",
"with open(\"wikipediaOcelot.wav\", \"wb\") as f:\n",
"    f.write(audio_file.content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"transcription = openai.Audio.transcribe(\n",
"    file=open(\"wikipediaOcelot.wav\", \"rb\"),\n",
"    model=\"whisper-1\",\n",
"    deployment_id=deployment_id,\n",
")\n",
"print(transcription.text)"
]
},
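{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The Whisper API also accepts optional parameters such as `language`, `prompt`, and `temperature`, which the library forwards to the service. As a minimal sketch (the values below are illustrative, and support depends on your API version):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# illustrative example of optional transcription parameters;\n",
"# the extra keyword arguments are passed through to the API\n",
"transcription = openai.Audio.transcribe(\n",
"    file=open(\"wikipediaOcelot.wav\", \"rb\"),\n",
"    model=\"whisper-1\",\n",
"    deployment_id=deployment_id,\n",
"    language=\"en\",  # ISO-639-1 code of the spoken language\n",
"    prompt=\"Ocelot, Leopardus pardalis\",  # optional hint with expected terms\n",
"    temperature=0,  # deterministic decoding\n",
")\n",
"print(transcription.text)"
]
}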
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}