community[minor]: Update Azure Cognitive Services to Azure AI Services (#19488)

This is a follow-up to #18371. These are the changes:
- New **Azure AI Services** toolkit and tools to replace those of
**Azure Cognitive Services**.
- Updated documentation for the Microsoft platform page.
- The image analysis tool has been rewritten on top of the new
`azure-ai-vision-imageanalysis` package, properly replacing the yanked
`azure-ai-vision` package (see the sketch below).
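
For reference, a minimal sketch of the new client setup (mirroring what
`image_analysis.py` in this PR does; the endpoint, key, and image URL here are
placeholders):

```python
import os

from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

# The tool reads these from AZURE_AI_SERVICES_ENDPOINT / AZURE_AI_SERVICES_KEY.
client = ImageAnalysisClient(
    endpoint=os.environ["AZURE_AI_SERVICES_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_SERVICES_KEY"]),
)

# Same visual features the tool requests.
result = client.analyze_from_url(
    image_url="https://example.com/image.png",  # placeholder
    visual_features=[
        VisualFeatures.TAGS,
        VisualFeatures.OBJECTS,
        VisualFeatures.CAPTION,
        VisualFeatures.READ,
    ],
)
if result.caption is not None:
    print("Caption:", result.caption.text)
```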

These changes:
- Update outdated naming from "Azure Cognitive Services" to "Azure AI
Services".
- Update documentation to use non-deprecated methods to create and use
agents.
- Remove the need to depend on the yanked Python package
(`azure-ai-vision`).

There is one new dependency, needed as a replacement for
`azure-ai-vision`:
- `azure-ai-vision-imageanalysis`. This is optional and imported inside a
function, so only users of the image analysis tool need it (see the
sketch below).
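
The optional import is guarded the same way the existing tools guard
theirs; roughly (as in `image_analysis.py` below):

```python
# Lazy, optional import: only users of the image analysis tool need the
# azure-ai-vision-imageanalysis package installed.
try:
    from azure.ai.vision.imageanalysis import ImageAnalysisClient
    from azure.ai.vision.imageanalysis.models import VisualFeatures
except ImportError:
    raise ImportError(
        "azure-ai-vision-imageanalysis is not installed. "
        "Run `pip install azure-ai-vision-imageanalysis` to install."
    )
```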

There is a new `azure_ai_services.ipynb` notebook showing usage; changes
have been linted and formatted.
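
Condensed from the notebook, usage looks roughly like this (assuming the
`AZURE_AI_SERVICES_*` environment variables and an OpenAI API key are set):

```python
from langchain import hub
from langchain.agents import AgentExecutor, create_structured_chat_agent
from langchain_community.agent_toolkits import AzureAiServicesToolkit
from langchain_openai import OpenAI

# Bundle all five Azure AI Services tools and hand them to an agent.
toolkit = AzureAiServicesToolkit()
tools = toolkit.get_tools()

llm = OpenAI(temperature=0)
prompt = hub.pull("hwchase17/structured-chat-agent")
agent = create_structured_chat_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent, tools=tools, verbose=True, handle_parsing_errors=True
)

agent_executor.invoke({"input": "What can I make with these ingredients? <image url>"})
```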

I am leaving the addition of deprecation notices and the eventual removal
of Azure Cognitive Services to the LangChain team, as I am not sure what
the current practice around this is.

---

If this PR makes it, my handle is @galo@mastodon.social

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>

@@ -273,19 +273,20 @@ from langchain.retrievers import AzureCognitiveSearchRetriever
 ## Toolkits
-### Azure Cognitive Services
+### Azure AI Services
 We need to install several python packages.
 ```bash
-pip install azure-ai-formrecognizer azure-cognitiveservices-speech azure-ai-vision
+pip install azure-ai-formrecognizer azure-cognitiveservices-speech azure-ai-vision-imageanalysis
 ```
-See a [usage example](/docs/integrations/toolkits/azure_cognitive_services).
+See a [usage example](/docs/integrations/toolkits/azure_ai_services).
 ```python
-from langchain_community.agent_toolkits import O365Toolkit
+from langchain_community.agent_toolkits import azure_ai_services
 ```
 ### Microsoft Office 365 email and calendar
 We need to install `O365` python package.

@@ -0,0 +1,315 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure AI Services\n",
"\n",
"This toolkit is used to interact with the `Azure AI Services API` to achieve some multimodal capabilities.\n",
"\n",
"Currently There are five tools bundled in this toolkit:\n",
"- **AzureAiServicesImageAnalysisTool**: used to extract caption, objects, tags, and text from images.\n",
"- **AzureAiServicesDocumentIntelligenceTool**: used to extract text, tables, and key-value pairs from documents.\n",
"- **AzureAiServicesSpeechToTextTool**: used to transcribe speech to text.\n",
"- **AzureAiServicesTextToSpeechTool**: used to synthesize text to speech.\n",
"- **AzureAiServicesTextAnalyticsForHealthTool**: used to extract healthcare entities."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, you need to set up an Azure account and create an AI Services resource. You can follow the instructions [here](https://learn.microsoft.com/en-us/azure/ai-services/multi-service-resource) to create a resource. \n",
"\n",
"Then, you need to get the endpoint, key and region of your resource, and set them as environment variables. You can find them in the \"Keys and Endpoint\" page of your resource."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet azure-ai-formrecognizer > /dev/null\n",
"%pip install --upgrade --quiet azure-cognitiveservices-speech > /dev/null\n",
"%pip install --upgrade --quiet azure-ai-textanalytics > /dev/null\n",
"%pip install --upgrade --quiet azure-ai-vision-imageanalysis > /dev/null"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"sk-\"\n",
"os.environ[\"AZURE_AI_SERVICES_KEY\"] = \"\"\n",
"os.environ[\"AZURE_AI_SERVICES_ENDPOINT\"] = \"\"\n",
"os.environ[\"AZURE_AI_SERVICES_REGION\"] = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Toolkit"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.agent_toolkits import AzureAiServicesToolkit\n",
"\n",
"toolkit = AzureAiServicesToolkit()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['azure_ai_services_document_intelligence',\n",
" 'azure_ai_services_image_analysis',\n",
" 'azure_ai_services_speech_to_text',\n",
" 'azure_ai_services_text_to_speech',\n",
" 'azure_ai_services_text_analytics_for_health']"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[tool.name for tool in toolkit.get_tools()]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use within an Agent"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"from langchain.agents import AgentExecutor, create_structured_chat_agent\n",
"from langchain_openai import OpenAI"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"tools = toolkit.get_tools()\n",
"prompt = hub.pull(\"hwchase17/structured-chat-agent\")\n",
"agent = create_structured_chat_agent(llm, tools, prompt)\n",
"\n",
"agent_executor = AgentExecutor(\n",
" agent=agent, tools=tools, verbose=True, handle_parsing_errors=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Thought: I need to use the azure_ai_services_image_analysis tool to analyze the image of the ingredients.\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"azure_ai_services_image_analysis\",\n",
" \"action_input\": \"https://images.openai.com/blob/9ad5a2ab-041f-475f-ad6a-b51899c50182/ingredients.png\"\n",
"}\n",
"```\n",
"\u001b[0m\u001b[33;1m\u001b[1;3mCaption: a group of eggs and flour in bowls\n",
"Objects: Egg, Egg, Food\n",
"Tags: dairy, ingredient, indoor, thickening agent, food, mixing bowl, powder, flour, egg, bowl\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"You can make a cake or other baked goods with these ingredients.\"\n",
"}\n",
"```\n",
"\n",
"\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'input': 'What can I make with these ingredients? https://images.openai.com/blob/9ad5a2ab-041f-475f-ad6a-b51899c50182/ingredients.png',\n",
" 'output': 'You can make a cake or other baked goods with these ingredients.'}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.invoke(\n",
" {\n",
" \"input\": \"What can I make with these ingredients? \"\n",
" + \"https://images.openai.com/blob/9ad5a2ab-041f-475f-ad6a-b51899c50182/ingredients.png\"\n",
" }\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Thought: I can use the Azure AI Services Text to Speech API to convert text to speech.\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"azure_ai_services_text_to_speech\",\n",
" \"action_input\": \"Why don't scientists trust atoms? Because they make up everything.\"\n",
"}\n",
"```\n",
"\u001b[0m\u001b[36;1m\u001b[1;3m/tmp/tmpe48vamz0.wav\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"tts_result = agent_executor.invoke({\"input\": \"Tell me a joke and read it out for me.\"})\n",
"audio_file = tts_result.get(\"output\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython import display\n",
"\n",
"audio = display.Audio(data=audio_file, autoplay=True, rate=22050)\n",
"display.display(audio)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\n",
"Thought: The patient has a history of progressive angina, a strong family history of coronary artery disease, and a previous cardiac catheterization revealing total occlusion of the RCA and 50% left main disease.\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"azure_ai_services_text_analytics_for_health\",\n",
" \"action_input\": \"The patient is a 54-year-old gentleman with a history of progressive angina over the past several months. The patient had a cardiac catheterization in July of this year revealing total occlusion of the RCA and 50% left main disease, with a strong family history of coronary artery disease with a brother dying at the age of 52 from a myocardial infarction and another brother who is status post coronary artery bypass grafting. The patient had a stress echocardiogram done on July, 2001, which showed no wall motion abnormalities, but this was a difficult study due to body habitus. The patient went for six minutes with minimal ST depressions in the anterior lateral leads, thought due to fatigue and wrist pain, his anginal equivalent. Due to the patient's increased symptoms and family history and history left main disease with total occasional of his RCA was referred for revascularization with open heart surgery.\"\n",
"\u001b[0m\u001b[33;1m\u001b[1;3mThe text contains the following healthcare entities: 54-year-old is a healthcare entity of type Age, gentleman is a healthcare entity of type Gender, progressive angina is a healthcare entity of type Diagnosis, past several months is a healthcare entity of type Time, cardiac catheterization is a healthcare entity of type ExaminationName, July of this year is a healthcare entity of type Time, total is a healthcare entity of type ConditionQualifier, occlusion is a healthcare entity of type SymptomOrSign, RCA is a healthcare entity of type BodyStructure, 50 is a healthcare entity of type MeasurementValue, % is a healthcare entity of type MeasurementUnit, left main disease is a healthcare entity of type Diagnosis, family is a healthcare entity of type FamilyRelation, coronary artery disease is a healthcare entity of type Diagnosis, brother is a healthcare entity of type FamilyRelation, dying is a healthcare entity of type Diagnosis, 52 is a healthcare entity of type Age, myocardial infarction is a healthcare entity of type Diagnosis, brother is a healthcare entity of type FamilyRelation, coronary artery bypass grafting is a healthcare entity of type TreatmentName, stress echocardiogram is a healthcare entity of type ExaminationName, July, 2001 is a healthcare entity of type Time, wall motion abnormalities is a healthcare entity of type SymptomOrSign, body habitus is a healthcare entity of type SymptomOrSign, six minutes is a healthcare entity of type Time, minimal is a healthcare entity of type ConditionQualifier, ST depressions in the anterior lateral leads is a healthcare entity of type SymptomOrSign, fatigue is a healthcare entity of type SymptomOrSign, wrist pain is a healthcare entity of type SymptomOrSign, anginal is a healthcare entity of type SymptomOrSign, increased is a healthcare entity of type Course, symptoms is a healthcare entity of type SymptomOrSign, family is a healthcare entity of type FamilyRelation, left main disease is a healthcare entity of type Diagnosis, occasional is a healthcare entity of type Course, RCA is a healthcare entity of type BodyStructure, revascularization is a healthcare entity of type TreatmentName, open heart surgery is a healthcare entity of type TreatmentName\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"The patient's diagnoses include progressive angina, total occlusion of the RCA, 50% left main disease, coronary artery disease, myocardial infarction, and a family history of coronary artery disease.\"\n",
"}\n",
"\n",
"\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'input': \"\\nThe patient is a 54-year-old gentleman with a history of progressive angina over the past several months.\\nThe patient had a cardiac catheterization in July of this year revealing total occlusion of the RCA and 50% left main disease ,\\nwith a strong family history of coronary artery disease with a brother dying at the age of 52 from a myocardial infarction and\\nanother brother who is status post coronary artery bypass grafting. The patient had a stress echocardiogram done on July , 2001 ,\\nwhich showed no wall motion abnormalities , but this was a difficult study due to body habitus. The patient went for six minutes with\\nminimal ST depressions in the anterior lateral leads , thought due to fatigue and wrist pain , his anginal equivalent. Due to the patient's\\nincreased symptoms and family history and history left main disease with total occasional of his RCA was referred for revascularization with open heart surgery.\\n\\nList all the diagnoses.\\n\",\n",
" 'output': \"The patient's diagnoses include progressive angina, total occlusion of the RCA, 50% left main disease, coronary artery disease, myocardial infarction, and a family history of coronary artery disease.\"}"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sample_input = \"\"\"\n",
"The patient is a 54-year-old gentleman with a history of progressive angina over the past several months.\n",
"The patient had a cardiac catheterization in July of this year revealing total occlusion of the RCA and 50% left main disease ,\n",
"with a strong family history of coronary artery disease with a brother dying at the age of 52 from a myocardial infarction and\n",
"another brother who is status post coronary artery bypass grafting. The patient had a stress echocardiogram done on July , 2001 ,\n",
"which showed no wall motion abnormalities , but this was a difficult study due to body habitus. The patient went for six minutes with\n",
"minimal ST depressions in the anterior lateral leads , thought due to fatigue and wrist pain , his anginal equivalent. Due to the patient's\n",
"increased symptoms and family history and history left main disease with total occasional of his RCA was referred for revascularization with open heart surgery.\n",
"\n",
"List all the diagnoses.\n",
"\"\"\"\n",
"\n",
"agent_executor.invoke({\"input\": sample_input})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

@@ -1,12 +1,14 @@
"""**Toolkits** are sets of tools that can be used to interact with
various services and APIs.
"""
import importlib
from typing import Any
_module_lookup = {
"AINetworkToolkit": "langchain_community.agent_toolkits.ainetwork.toolkit",
"AmadeusToolkit": "langchain_community.agent_toolkits.amadeus.toolkit",
"AzureAiServicesToolkit": "langchain_community.agent_toolkits.azure_ai_services",
"AzureCognitiveServicesToolkit": "langchain_community.agent_toolkits.azure_cognitive_services", # noqa: E501
"CogniswitchToolkit": "langchain_community.agent_toolkits.cogniswitch.toolkit",
"ConneryToolkit": "langchain_community.agent_toolkits.connery",

@@ -0,0 +1,31 @@
from __future__ import annotations
from typing import List
from langchain_core.tools import BaseTool
from langchain_community.agent_toolkits.base import BaseToolkit
from langchain_community.tools.azure_ai_services import (
AzureAiServicesDocumentIntelligenceTool,
AzureAiServicesImageAnalysisTool,
AzureAiServicesSpeechToTextTool,
AzureAiServicesTextAnalyticsForHealthTool,
AzureAiServicesTextToSpeechTool,
)
class AzureAiServicesToolkit(BaseToolkit):
"""Toolkit for Azure AI Services."""
def get_tools(self) -> List[BaseTool]:
"""Get the tools in the toolkit."""
tools: List[BaseTool] = [
AzureAiServicesDocumentIntelligenceTool(),
AzureAiServicesImageAnalysisTool(),
AzureAiServicesSpeechToTextTool(),
AzureAiServicesTextToSpeechTool(),
AzureAiServicesTextAnalyticsForHealthTool(),
]
return tools
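# Usage sketch (mirrors the notebook; assumes the AZURE_AI_SERVICES_*
# environment variables are set):
#   toolkit = AzureAiServicesToolkit()
#   tools = toolkit.get_tools()  # all five Azure AI Services tools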

@@ -16,6 +16,7 @@ tool for the job.
CallbackManagerForToolRun, AsyncCallbackManagerForToolRun
"""
import importlib
from typing import Any
@@ -31,6 +32,11 @@ _module_lookup = {
"AIPluginTool": "langchain_community.tools.plugin",
"APIOperation": "langchain_community.tools.openapi.utils.api_models",
"ArxivQueryRun": "langchain_community.tools.arxiv.tool",
"AzureAiServicesDocumentIntelligenceTool": "langchain_community.tools.azure_ai_services", # noqa: E501
"AzureAiServicesImageAnalysisTool": "langchain_community.tools.azure_ai_services",
"AzureAiServicesSpeechToTextTool": "langchain_community.tools.azure_ai_services",
"AzureAiServicesTextToSpeechTool": "langchain_community.tools.azure_ai_services",
"AzureAiServicesTextAnalyticsForHealthTool": "langchain_community.tools.azure_ai_services", # noqa: E501
"AzureCogsFormRecognizerTool": "langchain_community.tools.azure_cognitive_services",
"AzureCogsImageAnalysisTool": "langchain_community.tools.azure_cognitive_services",
"AzureCogsSpeech2TextTool": "langchain_community.tools.azure_cognitive_services",

@@ -0,0 +1,25 @@
"""Azure AI Services Tools."""
from langchain_community.tools.azure_ai_services.document_intelligence import (
AzureAiServicesDocumentIntelligenceTool,
)
from langchain_community.tools.azure_ai_services.image_analysis import (
AzureAiServicesImageAnalysisTool,
)
from langchain_community.tools.azure_ai_services.speech_to_text import (
AzureAiServicesSpeechToTextTool,
)
from langchain_community.tools.azure_ai_services.text_analytics_for_health import (
AzureAiServicesTextAnalyticsForHealthTool,
)
from langchain_community.tools.azure_ai_services.text_to_speech import (
AzureAiServicesTextToSpeechTool,
)
__all__ = [
"AzureAiServicesDocumentIntelligenceTool",
"AzureAiServicesImageAnalysisTool",
"AzureAiServicesSpeechToTextTool",
"AzureAiServicesTextToSpeechTool",
"AzureAiServicesTextAnalyticsForHealthTool",
]

@@ -0,0 +1,145 @@
from __future__ import annotations
import logging
from typing import Any, Dict, List, Optional
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import root_validator
from langchain_core.tools import BaseTool
from langchain_core.utils import get_from_dict_or_env
from langchain_community.tools.azure_ai_services.utils import (
detect_file_src_type,
)
logger = logging.getLogger(__name__)
class AzureAiServicesDocumentIntelligenceTool(BaseTool):
"""Tool that queries the Azure AI Services Document Intelligence API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/quickstarts/get-started-sdks-rest-api?view=doc-intel-4.0.0&pivots=programming-language-python
"""
azure_ai_services_key: str = "" #: :meta private:
azure_ai_services_endpoint: str = "" #: :meta private:
doc_analysis_client: Any #: :meta private:
name: str = "azure_ai_services_document_intelligence"
description: str = (
"A wrapper around Azure AI Services Document Intelligence. "
"Useful for when you need to "
"extract text, tables, and key-value pairs from documents. "
"Input should be a url to a document."
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_ai_services_key = get_from_dict_or_env(
values, "azure_ai_services_key", "AZURE_AI_SERVICES_KEY"
)
azure_ai_services_endpoint = get_from_dict_or_env(
values, "azure_ai_services_endpoint", "AZURE_AI_SERVICES_ENDPOINT"
)
try:
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
values["doc_analysis_client"] = DocumentAnalysisClient(
endpoint=azure_ai_services_endpoint,
credential=AzureKeyCredential(azure_ai_services_key),
)
except ImportError:
raise ImportError(
"azure-ai-formrecognizer is not installed. "
"Run `pip install azure-ai-formrecognizer` to install."
)
return values
def _parse_tables(self, tables: List[Any]) -> List[Any]:
result = []
for table in tables:
rc, cc = table.row_count, table.column_count
_table = [["" for _ in range(cc)] for _ in range(rc)]
for cell in table.cells:
_table[cell.row_index][cell.column_index] = cell.content
result.append(_table)
return result
def _parse_kv_pairs(self, kv_pairs: List[Any]) -> List[Any]:
result = []
for kv_pair in kv_pairs:
key = kv_pair.key.content if kv_pair.key else ""
value = kv_pair.value.content if kv_pair.value else ""
result.append((key, value))
return result
def _document_analysis(self, document_path: str) -> Dict:
document_src_type = detect_file_src_type(document_path)
if document_src_type == "local":
with open(document_path, "rb") as document:
poller = self.doc_analysis_client.begin_analyze_document(
"prebuilt-document", document
)
elif document_src_type == "remote":
poller = self.doc_analysis_client.begin_analyze_document_from_url(
"prebuilt-document", document_path
)
else:
raise ValueError(f"Invalid document path: {document_path}")
result = poller.result()
res_dict = {}
if result.content is not None:
res_dict["content"] = result.content
if result.tables is not None:
res_dict["tables"] = self._parse_tables(result.tables)
if result.key_value_pairs is not None:
res_dict["key_value_pairs"] = self._parse_kv_pairs(result.key_value_pairs)
return res_dict
def _format_document_analysis_result(self, document_analysis_result: Dict) -> str:
formatted_result = []
if "content" in document_analysis_result:
formatted_result.append(
f"Content: {document_analysis_result['content']}".replace("\n", " ")
)
if "tables" in document_analysis_result:
for i, table in enumerate(document_analysis_result["tables"]):
formatted_result.append(f"Table {i}: {table}".replace("\n", " "))
if "key_value_pairs" in document_analysis_result:
for kv_pair in document_analysis_result["key_value_pairs"]:
formatted_result.append(
f"{kv_pair[0]}: {kv_pair[1]}".replace("\n", " ")
)
return "\n".join(formatted_result)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
document_analysis_result = self._document_analysis(query)
if not document_analysis_result:
return "No good document analysis result was found"
return self._format_document_analysis_result(document_analysis_result)
except Exception as e:
raise RuntimeError(
f"Error while running AzureAiServicesDocumentIntelligenceTool: {e}"
)
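# Standalone usage sketch (hypothetical document URL):
#   tool = AzureAiServicesDocumentIntelligenceTool()
#   print(tool.run("https://example.com/invoice.pdf"))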

@@ -0,0 +1,153 @@
from __future__ import annotations
import logging
from typing import Any, Dict, Optional
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import root_validator
from langchain_core.tools import BaseTool
from langchain_core.utils import get_from_dict_or_env
from langchain_community.tools.azure_ai_services.utils import (
detect_file_src_type,
)
logger = logging.getLogger(__name__)
class AzureAiServicesImageAnalysisTool(BaseTool):
"""Tool that queries the Azure AI Services Image Analysis API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/quickstarts-sdk/image-analysis-client-library-40
"""
azure_ai_services_key: str = "" #: :meta private:
azure_ai_services_endpoint: str = "" #: :meta private:
image_analysis_client: Any #: :meta private:
visual_features: Any #: :meta private:
name: str = "azure_ai_services_image_analysis"
description: str = (
"A wrapper around Azure AI Services Image Analysis. "
"Useful for when you need to analyze images. "
"Input should be a url to an image."
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_ai_services_key = get_from_dict_or_env(
values, "azure_ai_services_key", "AZURE_AI_SERVICES_KEY"
)
azure_ai_services_endpoint = get_from_dict_or_env(
values, "azure_ai_services_endpoint", "AZURE_AI_SERVICES_ENDPOINT"
)
"""Validate that azure-ai-vision-imageanalysis is installed."""
try:
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential
except ImportError:
raise ImportError(
"azure-ai-vision-imageanalysis is not installed. "
"Run `pip install azure-ai-vision-imageanalysis` to install. "
)
"""Validate Azure AI Vision Image Analysis client can be initialized."""
try:
values["image_analysis_client"] = ImageAnalysisClient(
endpoint=azure_ai_services_endpoint,
credential=AzureKeyCredential(azure_ai_services_key),
)
except Exception as e:
raise RuntimeError(
f"Initialization of Azure AI Vision Image Analysis client failed: {e}"
)
values["visual_features"] = [
VisualFeatures.TAGS,
VisualFeatures.OBJECTS,
VisualFeatures.CAPTION,
VisualFeatures.READ,
]
return values
def _image_analysis(self, image_path: str) -> Dict:
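# validate_environment has already ensured this package is importable;
# the re-import only brings ImageAnalysisClient into scope for the
# type annotation below.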
try:
from azure.ai.vision.imageanalysis import ImageAnalysisClient
except ImportError:
pass
self.image_analysis_client: ImageAnalysisClient
image_src_type = detect_file_src_type(image_path)
if image_src_type == "local":
with open(image_path, "rb") as image_file:
image_data = image_file.read()
result = self.image_analysis_client.analyze(
image_data=image_data,
visual_features=self.visual_features,
)
elif image_src_type == "remote":
result = self.image_analysis_client.analyze_from_url(
image_url=image_path,
visual_features=self.visual_features,
)
else:
raise ValueError(f"Invalid image path: {image_path}")
res_dict = {}
if result:
if result.caption is not None:
res_dict["caption"] = result.caption.text
if result.objects is not None:
res_dict["objects"] = [obj.tags[0].name for obj in result.objects.list]
if result.tags is not None:
res_dict["tags"] = [tag.name for tag in result.tags.list]
if result.read is not None and len(result.read.blocks) > 0:
res_dict["text"] = [line.text for line in result.read.blocks[0].lines]
return res_dict
def _format_image_analysis_result(self, image_analysis_result: Dict) -> str:
formatted_result = []
if "caption" in image_analysis_result:
formatted_result.append("Caption: " + image_analysis_result["caption"])
if (
"objects" in image_analysis_result
and len(image_analysis_result["objects"]) > 0
):
formatted_result.append(
"Objects: " + ", ".join(image_analysis_result["objects"])
)
if "tags" in image_analysis_result and len(image_analysis_result["tags"]) > 0:
formatted_result.append("Tags: " + ", ".join(image_analysis_result["tags"]))
if "text" in image_analysis_result and len(image_analysis_result["text"]) > 0:
formatted_result.append("Text: " + ", ".join(image_analysis_result["text"]))
return "\n".join(formatted_result)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
image_analysis_result = self._image_analysis(query)
if not image_analysis_result:
return "No good image analysis result was found"
return self._format_image_analysis_result(image_analysis_result)
except Exception as e:
raise RuntimeError(f"Error while running AzureAiImageAnalysisTool: {e}")

@@ -0,0 +1,122 @@
from __future__ import annotations
import logging
import time
from typing import Any, Dict, Optional
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import root_validator
from langchain_core.tools import BaseTool
from langchain_core.utils import get_from_dict_or_env
from langchain_community.tools.azure_ai_services.utils import (
detect_file_src_type,
download_audio_from_url,
)
logger = logging.getLogger(__name__)
class AzureAiServicesSpeechToTextTool(BaseTool):
"""Tool that queries the Azure AI Services Speech to Text API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-speech-to-text?pivots=programming-language-python
"""
azure_ai_services_key: str = "" #: :meta private:
azure_ai_services_region: str = "" #: :meta private:
speech_language: str = "en-US" #: :meta private:
speech_config: Any #: :meta private:
name: str = "azure_ai_services_speech_to_text"
description: str = (
"A wrapper around Azure AI Services Speech to Text. "
"Useful for when you need to transcribe audio to text. "
"Input should be a url to an audio file."
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_ai_services_key = get_from_dict_or_env(
values, "azure_ai_services_key", "AZURE_AI_SERVICES_KEY"
)
azure_ai_services_region = get_from_dict_or_env(
values, "azure_ai_services_region", "AZURE_AI_SERVICES_REGION"
)
try:
import azure.cognitiveservices.speech as speechsdk
values["speech_config"] = speechsdk.SpeechConfig(
subscription=azure_ai_services_key, region=azure_ai_services_region
)
except ImportError:
raise ImportError(
"azure-cognitiveservices-speech is not installed. "
"Run `pip install azure-cognitiveservices-speech` to install."
)
return values
def _continuous_recognize(self, speech_recognizer: Any) -> str:
done = False
text = ""
def stop_cb(evt: Any) -> None:
"""callback that stop continuous recognition"""
speech_recognizer.stop_continuous_recognition_async()
nonlocal done
done = True
def retrieve_cb(evt: Any) -> None:
"""callback that retrieves the intermediate recognition results"""
nonlocal text
text += evt.result.text
# retrieve text on recognized events
speech_recognizer.recognized.connect(retrieve_cb)
# stop continuous recognition on either session stopped or canceled events
speech_recognizer.session_stopped.connect(stop_cb)
speech_recognizer.canceled.connect(stop_cb)
# Start continuous speech recognition
speech_recognizer.start_continuous_recognition_async()
while not done:
time.sleep(0.5)
return text
def _speech_to_text(self, audio_path: str, speech_language: str) -> str:
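# validate_environment has already ensured the Speech SDK is importable;
# the re-import brings speechsdk back into local scope.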
try:
import azure.cognitiveservices.speech as speechsdk
except ImportError:
pass
audio_src_type = detect_file_src_type(audio_path)
if audio_src_type == "local":
audio_config = speechsdk.AudioConfig(filename=audio_path)
elif audio_src_type == "remote":
tmp_audio_path = download_audio_from_url(audio_path)
audio_config = speechsdk.AudioConfig(filename=tmp_audio_path)
else:
raise ValueError(f"Invalid audio path: {audio_path}")
self.speech_config.speech_recognition_language = speech_language
speech_recognizer = speechsdk.SpeechRecognizer(self.speech_config, audio_config)
return self._continuous_recognize(speech_recognizer)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
text = self._speech_to_text(query, self.speech_language)
return text
except Exception as e:
raise RuntimeError(
f"Error while running AzureAiServicesSpeechToTextTool: {e}"
)

@@ -0,0 +1,104 @@
from __future__ import annotations
import logging
from typing import Any, Dict, Optional
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import root_validator
from langchain_core.tools import BaseTool
from langchain_core.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class AzureAiServicesTextAnalyticsForHealthTool(BaseTool):
"""Tool that queries the Azure AI Services Text Analytics for Health API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/ai-services/language-service/text-analytics-for-health/quickstart?pivots=programming-language-python
"""
azure_ai_services_key: str = "" #: :meta private:
azure_ai_services_endpoint: str = "" #: :meta private:
text_analytics_client: Any #: :meta private:
name: str = "azure_ai_services_text_analytics_for_health"
description: str = (
"A wrapper around Azure AI Services Text Analytics for Health. "
"Useful for when you need to identify entities in healthcare data. "
"Input should be text."
)
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_ai_services_key = get_from_dict_or_env(
values, "azure_ai_services_key", "AZURE_AI_SERVICES_KEY"
)
azure_ai_services_endpoint = get_from_dict_or_env(
values, "azure_ai_services_endpoint", "AZURE_AI_SERVICES_ENDPOINT"
)
try:
import azure.ai.textanalytics as sdk
from azure.core.credentials import AzureKeyCredential
values["text_analytics_client"] = sdk.TextAnalyticsClient(
endpoint=azure_ai_services_endpoint,
credential=AzureKeyCredential(azure_ai_services_key),
)
except ImportError:
raise ImportError(
"azure-ai-textanalytics is not installed. "
"Run `pip install azure-ai-textanalytics` to install."
)
return values
def _text_analysis(self, text: str) -> Dict:
poller = self.text_analytics_client.begin_analyze_healthcare_entities(
[{"id": "1", "language": "en", "text": text}]
)
result = poller.result()
res_dict = {}
docs = [doc for doc in result if not doc.is_error]
if docs:
res_dict["entities"] = [
f"{x.text} is a healthcare entity of type {x.category}"
for y in docs
for x in y.entities
]
return res_dict
def _format_text_analysis_result(self, text_analysis_result: Dict) -> str:
formatted_result = []
if "entities" in text_analysis_result:
formatted_result.append(
f"""The text contains the following healthcare entities: {
', '.join(text_analysis_result['entities'])
}""".replace("\n", " ")
)
return "\n".join(formatted_result)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
text_analysis_result = self._text_analysis(query)
return self._format_text_analysis_result(text_analysis_result)
except Exception as e:
raise RuntimeError(
f"Error while running AzureAiServicesTextAnalyticsForHealthTool: {e}"
)

@@ -0,0 +1,105 @@
from __future__ import annotations
import logging
import tempfile
from typing import Any, Dict, Optional
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import root_validator
from langchain_core.tools import BaseTool
from langchain_core.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class AzureAiServicesTextToSpeechTool(BaseTool):
"""Tool that queries the Azure AI Services Text to Speech API.
In order to set this up, follow instructions at:
https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-text-to-speech?pivots=programming-language-python
"""
name: str = "azure_ai_services_text_to_speech"
description: str = (
"A wrapper around Azure AI Services Text to Speech API. "
"Useful for when you need to convert text to speech. "
)
return_direct: bool = True
azure_ai_services_key: str = "" #: :meta private:
azure_ai_services_region: str = "" #: :meta private:
speech_language: str = "en-US" #: :meta private:
speech_config: Any #: :meta private:
@root_validator(pre=True)
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and endpoint exists in environment."""
azure_ai_services_key = get_from_dict_or_env(
values, "azure_ai_services_key", "AZURE_AI_SERVICES_KEY"
)
azure_ai_services_region = get_from_dict_or_env(
values, "azure_ai_services_region", "AZURE_AI_SERVICES_REGION"
)
try:
import azure.cognitiveservices.speech as speechsdk
values["speech_config"] = speechsdk.SpeechConfig(
subscription=azure_ai_services_key, region=azure_ai_services_region
)
except ImportError:
raise ImportError(
"azure-cognitiveservices-speech is not installed. "
"Run `pip install azure-cognitiveservices-speech` to install."
)
return values
def _text_to_speech(self, text: str, speech_language: str) -> str:
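# validate_environment has already ensured the Speech SDK is importable;
# the re-import brings speechsdk back into local scope.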
try:
import azure.cognitiveservices.speech as speechsdk
except ImportError:
pass
self.speech_config.speech_synthesis_language = speech_language
speech_synthesizer = speechsdk.SpeechSynthesizer(
speech_config=self.speech_config, audio_config=None
)
result = speech_synthesizer.speak_text(text)
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
stream = speechsdk.AudioDataStream(result)
with tempfile.NamedTemporaryFile(
mode="wb", suffix=".wav", delete=False
) as f:
stream.save_to_wav_file(f.name)
return f.name
elif result.reason == speechsdk.ResultReason.Canceled:
cancellation_details = result.cancellation_details
logger.debug(f"Speech synthesis canceled: {cancellation_details.reason}")
if cancellation_details.reason == speechsdk.CancellationReason.Error:
raise RuntimeError(
f"Speech synthesis error: {cancellation_details.error_details}"
)
return "Speech synthesis canceled."
else:
return f"Speech synthesis failed: {result.reason}"
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the tool."""
try:
speech_file = self._text_to_speech(query, self.speech_language)
return speech_file
except Exception as e:
raise RuntimeError(
f"Error while running AzureAiServicesTextToSpeechTool: {e}"
)

@@ -0,0 +1,29 @@
import os
import tempfile
from urllib.parse import urlparse
import requests
def detect_file_src_type(file_path: str) -> str:
"""Detect if the file is local or remote."""
if os.path.isfile(file_path):
return "local"
parsed_url = urlparse(file_path)
if parsed_url.scheme and parsed_url.netloc:
return "remote"
return "invalid"
def download_audio_from_url(audio_url: str) -> str:
"""Download audio from url to local."""
ext = audio_url.split(".")[-1]
response = requests.get(audio_url, stream=True)
response.raise_for_status()
with tempfile.NamedTemporaryFile(mode="wb", suffix=f".{ext}", delete=False) as f:
for chunk in response.iter_content(chunk_size=8192):
f.write(chunk)
return f.name

@@ -3,6 +3,7 @@ from langchain_community.agent_toolkits import __all__
EXPECTED_ALL = [
"AINetworkToolkit",
"AmadeusToolkit",
"AzureAiServicesToolkit",
"AzureCognitiveServicesToolkit",
"ConneryToolkit",
"FileManagementToolkit",

@@ -9,6 +9,11 @@ EXPECTED_ALL = [
"AIPluginTool",
"APIOperation",
"ArxivQueryRun",
"AzureAiServicesDocumentIntelligenceTool",
"AzureAiServicesImageAnalysisTool",
"AzureAiServicesSpeechToTextTool",
"AzureAiServicesTextToSpeechTool",
"AzureAiServicesTextAnalyticsForHealthTool",
"AzureCogsFormRecognizerTool",
"AzureCogsImageAnalysisTool",
"AzureCogsSpeech2TextTool",

@@ -10,6 +10,11 @@ _EXPECTED = [
"AIPluginTool",
"APIOperation",
"ArxivQueryRun",
"AzureAiServicesDocumentIntelligenceTool",
"AzureAiServicesImageAnalysisTool",
"AzureAiServicesSpeechToTextTool",
"AzureAiServicesTextToSpeechTool",
"AzureAiServicesTextAnalyticsForHealthTool",
"AzureCogsFormRecognizerTool",
"AzureCogsImageAnalysisTool",
"AzureCogsSpeech2TextTool",
