mirror of
https://github.com/openai/openai-cookbook
synced 2024-11-17 15:29:46 +00:00
{
"cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Azure audio whisper (preview) example\n",
    "\n",
    "This example shows how to use the Azure OpenAI Whisper model to transcribe audio files.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "First, we install the necessary dependencies and import the libraries we will be using."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! pip install \"openai>=1.0.0,<2.0.0\"\n",
    "! pip install python-dotenv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import openai\n",
    "import dotenv\n",
    "\n",
    "dotenv.load_dotenv()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Authentication\n",
    "\n",
    "The Azure OpenAI service supports multiple authentication mechanisms that include API keys and Azure Active Directory token credentials."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "use_azure_active_directory = False  # Set this flag to True if you are using Azure Active Directory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Authentication using API key\n",
    "\n",
    "To set up the OpenAI SDK to use an *Azure API Key*, we need to set `api_key` to a key associated with your endpoint (you can find this key in *\"Keys and Endpoints\"* under *\"Resource Management\"* in the [Azure Portal](https://portal.azure.com)). You'll also find the endpoint for your resource here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not use_azure_active_directory:\n",
    "    endpoint = os.environ[\"AZURE_OPENAI_ENDPOINT\"]\n",
    "    api_key = os.environ[\"AZURE_OPENAI_API_KEY\"]\n",
    "\n",
    "    client = openai.AzureOpenAI(\n",
    "        azure_endpoint=endpoint,\n",
    "        api_key=api_key,\n",
    "        api_version=\"2023-09-01-preview\"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Authentication using Azure Active Directory\n",
    "\n",
    "Let's now see how we can authenticate via Azure Active Directory. We'll start by installing the `azure-identity` library. This library provides the token credentials we need to authenticate and helps us build a token credential provider through the `get_bearer_token_provider` helper function. It's recommended to use `get_bearer_token_provider` over providing a static token to `AzureOpenAI` because this API will automatically cache and refresh tokens for you.\n",
    "\n",
    "For more information on how to set up Azure Active Directory authentication with Azure OpenAI, see the [documentation](https://learn.microsoft.com/azure/ai-services/openai/how-to/managed-identity)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! pip install \"azure-identity>=1.15.0\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
    "\n",
    "if use_azure_active_directory:\n",
    "    endpoint = os.environ[\"AZURE_OPENAI_ENDPOINT\"]\n",
    "\n",
    "    # No API key is read here: the token provider supplies the credentials\n",
    "    client = openai.AzureOpenAI(\n",
    "        azure_endpoint=endpoint,\n",
    "        azure_ad_token_provider=get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\"),\n",
    "        api_version=\"2023-09-01-preview\"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Note: the `AzureOpenAI` client infers the following arguments from their corresponding environment variables if they are not provided:\n",
    "\n",
    "- `api_key` from `AZURE_OPENAI_API_KEY`\n",
    "- `azure_ad_token` from `AZURE_OPENAI_AD_TOKEN`\n",
    "- `api_version` from `OPENAI_API_VERSION`\n",
    "- `azure_endpoint` from `AZURE_OPENAI_ENDPOINT`\n"
   ]
  },
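  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The environment variables listed above can be checked up front, so that a missing value is reported with a clear message instead of surfacing later as a `KeyError`. The cell below is a minimal sketch and not part of the original example: the helper name `missing_env_vars` is hypothetical, and it assumes that `AZURE_OPENAI_API_KEY` is only needed when authenticating with an API key."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# Hypothetical helper (not part of the OpenAI SDK): list the environment\n",
    "# variables this notebook reads that are unset or empty.\n",
    "def missing_env_vars(use_aad=False, environ=os.environ):\n",
    "    required = [\"AZURE_OPENAI_ENDPOINT\"]\n",
    "    if not use_aad:\n",
    "        # Assumption: the key is only required for API-key authentication\n",
    "        required.append(\"AZURE_OPENAI_API_KEY\")\n",
    "    return [name for name in required if not environ.get(name)]\n",
    "\n",
    "missing = missing_env_vars(use_aad=False)\n",
    "if missing:\n",
    "    print(\"Missing environment variables:\", \", \".join(missing))"
   ]
  },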
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deployments\n",
    "\n",
    "In this section we are going to create a deployment using the `whisper-1` model to transcribe audio files."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deployments: Create in the Azure OpenAI Studio\n",
    "\n",
    "Let's deploy a Whisper model. Go to https://portal.azure.com, find your Azure OpenAI resource, and then navigate to the Azure OpenAI Studio. Click on the \"Deployments\" tab and then create a deployment for the Whisper model you want to use. The deployment name that you give the model will be used in the code below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "deployment = \"whisper-deployment\"  # Fill in the deployment name from the portal here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Audio transcription\n",
    "\n",
    "Audio transcription, or speech-to-text, is the process of converting spoken words into text. Use the `client.audio.transcriptions.create` method to transcribe an audio file stream to text.\n",
    "\n",
    "You can get sample audio files from the [Azure AI Speech SDK repository at GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/sampledata/audiofiles)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Download a sample audio file\n",
    "import requests\n",
    "\n",
    "sample_audio_url = \"https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/sampledata/audiofiles/wikipediaOcelot.wav\"\n",
    "audio_file = requests.get(sample_audio_url)\n",
    "audio_file.raise_for_status()  # Fail early if the download did not succeed\n",
    "with open(\"wikipediaOcelot.wav\", \"wb\") as f:\n",
    "    f.write(audio_file.content)"
   ]
  },
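  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before sending the file for transcription, it can help to confirm that the download produced a readable WAV file of a reasonable size. The cell below is a rough sketch using only the Python standard library, not part of the original example: the helper name `describe_wav` is hypothetical, the 25 MB threshold is an assumption that should be checked against the current service documentation, and the `wave` module only reads uncompressed PCM WAV files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import wave\n",
    "\n",
    "# Assumed upload limit; verify against the service documentation\n",
    "MAX_UPLOAD_BYTES = 25 * 1024 * 1024\n",
    "\n",
    "# Hypothetical helper: return (size in bytes, duration in seconds) of a PCM WAV file\n",
    "def describe_wav(path):\n",
    "    with wave.open(path, \"rb\") as wav:\n",
    "        duration = wav.getnframes() / wav.getframerate()\n",
    "    return os.path.getsize(path), duration\n",
    "\n",
    "size, duration = describe_wav(\"wikipediaOcelot.wav\")\n",
    "print(f\"{size} bytes, {duration:.1f} s\")\n",
    "if size > MAX_UPLOAD_BYTES:\n",
    "    print(\"File may be too large to upload in a single request.\")"
   ]
  },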
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Open the file in a context manager so the handle is closed after the request\n",
    "with open(\"wikipediaOcelot.wav\", \"rb\") as audio:\n",
    "    transcription = client.audio.transcriptions.create(\n",
    "        file=audio,\n",
    "        model=deployment,\n",
    "    )\n",
    "print(transcription.text)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}