mirror of
https://github.com/hwchase17/langchain
synced 2024-10-31 15:20:26 +00:00
539672a7fd
This pull request aims to ensure that the `OpenAICallbackHandler` can properly calculate the total cost for Azure OpenAI chat models. The following changes have resolved this issue: - The `model_name` has been added to the ChatResult llm_output. Without this, the default values of `gpt-35-turbo` were applied. This was causing the total cost for Azure OpenAI's GPT-4 to be significantly inaccurate. - A new parameter `model_version` has been added to `AzureChatOpenAI`. Azure does not include the model version in the response. With the addition of `model_name`, this is not a significant issue for GPT-4 models, but it's an issue for GPT-3.5-Turbo. Version 0301 (default) of GPT-3.5-Turbo on Azure has a flat rate of 0.002 per 1k tokens for both prompt and completion. However, version 0613 introduced a split in pricing for prompt and completion tokens. - The `OpenAICallbackHandler` implementation has been updated with the proper model names, versions, and cost per 1k tokens. Unit tests have been added to ensure the functionality works as expected; the Azure ChatOpenAI notebook has been updated with examples. Maintainers: @hwchase17, @baskaryan Twitter handle: @jjczopek --------- Co-authored-by: Jerzy Czopek <jerzy.czopek@avanade.com> Co-authored-by: Bagatur <baskaryan@gmail.com>
219 lines
5.6 KiB
Plaintext
219 lines
5.6 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "38f26d7a",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Azure\n",
|
|
"\n",
|
|
"This notebook goes over how to connect to an Azure hosted OpenAI endpoint"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "96164b42",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from langchain.chat_models import AzureChatOpenAI\n",
|
|
"from langchain.schema import HumanMessage"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "8161278f",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"BASE_URL = \"https://${TODO}.openai.azure.com\"\n",
|
|
"API_KEY = \"...\"\n",
|
|
"DEPLOYMENT_NAME = \"chat\"\n",
|
|
"model = AzureChatOpenAI(\n",
|
|
" openai_api_base=BASE_URL,\n",
|
|
" openai_api_version=\"2023-05-15\",\n",
|
|
" deployment_name=DEPLOYMENT_NAME,\n",
|
|
" openai_api_key=API_KEY,\n",
|
|
" openai_api_type=\"azure\",\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "99509140",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"AIMessage(content=\"\\n\\nJ'aime programmer.\", additional_kwargs={})"
|
|
]
|
|
},
|
|
"execution_count": 5,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"model(\n",
|
|
" [\n",
|
|
" HumanMessage(\n",
|
|
" content=\"Translate this sentence from English to French. I love programming.\"\n",
|
|
" )\n",
|
|
" ]\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "3b6e9376",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f27fa24d",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Model Version\n",
|
|
"Azure OpenAI responses contain `model` property, which is name of the model used to generate the response. However unlike native OpenAI responses, it does not contain the version of the model, which is set on the deplyoment in Azure. This makes it tricky to know which version of the model was used to generate the response, which as result can lead to e.g. wrong total cost calculation with `OpenAICallbackHandler`.\n",
|
|
"\n",
|
|
"To solve this problem, you can pass `model_version` parameter to `AzureChatOpenAI` class, which will be added to the model name in the llm output. This way you can easily distinguish between different versions of the model."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "0531798a",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from langchain.callbacks import get_openai_callback"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"id": "3fd97dfc",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"BASE_URL = \"https://{endpoint}.openai.azure.com\"\n",
|
|
"API_KEY = \"...\"\n",
|
|
"DEPLOYMENT_NAME = \"gpt-35-turbo\" # in Azure, this deployment has version 0613 - input and output tokens are counted separately"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"id": "aceddb72",
|
|
"metadata": {
|
|
"scrolled": true
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Total Cost (USD): $0.000054\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"model = AzureChatOpenAI(\n",
|
|
" openai_api_base=BASE_URL,\n",
|
|
" openai_api_version=\"2023-05-15\",\n",
|
|
" deployment_name=DEPLOYMENT_NAME,\n",
|
|
" openai_api_key=API_KEY,\n",
|
|
" openai_api_type=\"azure\",\n",
|
|
")\n",
|
|
"with get_openai_callback() as cb:\n",
|
|
" model(\n",
|
|
" [\n",
|
|
" HumanMessage(\n",
|
|
" content=\"Translate this sentence from English to French. I love programming.\"\n",
|
|
" )\n",
|
|
" ]\n",
|
|
" )\n",
|
|
" print(f\"Total Cost (USD): ${format(cb.total_cost, '.6f')}\") # without specifying the model version, flat-rate 0.002 USD per 1k input and output tokens is used\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "2e61eefd",
|
|
"metadata": {},
|
|
"source": [
|
|
"We can provide the model version to `AzureChatOpenAI` constructor. It will get appended to the model name returned by Azure OpenAI and cost will be counted correctly."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"id": "8d5e54e9",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Total Cost (USD): $0.000044\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"model0613 = AzureChatOpenAI(\n",
|
|
" openai_api_base=BASE_URL,\n",
|
|
" openai_api_version=\"2023-05-15\",\n",
|
|
" deployment_name=DEPLOYMENT_NAME,\n",
|
|
" openai_api_key=API_KEY,\n",
|
|
" openai_api_type=\"azure\",\n",
|
|
" model_version=\"0613\"\n",
|
|
")\n",
|
|
"with get_openai_callback() as cb:\n",
|
|
" model0613(\n",
|
|
" [\n",
|
|
" HumanMessage(\n",
|
|
" content=\"Translate this sentence from English to French. I love programming.\"\n",
|
|
" )\n",
|
|
" ]\n",
|
|
" )\n",
|
|
" print(f\"Total Cost (USD): ${format(cb.total_cost, '.6f')}\")\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "99682534",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.8.10"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|