{ "cells": [ { "cell_type": "markdown", "source": [ "# PowerBI Dataset Agent\n", "\n", "This notebook showcases an agent designed to interact with a Power BI dataset. The agent is designed to answer more general questions about a dataset, as well as recover from errors.\n", "\n", "Note that, as this agent is in active development, not all answers may be correct. It runs against the [executequery endpoint](https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/execute-queries), which does not allow deletes.\n", "\n", "### Some notes\n", "- It relies on authentication with the `azure.identity` package, which can be installed with `pip install azure-identity`. Alternatively, you can create the Power BI dataset with a token passed as a string, without supplying the credentials.\n", "- You can also supply a username to impersonate, for use with datasets that have row-level security (RLS) enabled.\n", "- The toolkit uses an LLM to create the query from the question; the agent uses an LLM for the overall execution.\n", "- Testing was done mostly with a `text-davinci-003` model; Codex models did not seem to perform very well." 
], "metadata": {}, "attachments": {} }, { "cell_type": "markdown", "source": [ "## Initialization" ], "metadata": { "tags": [] } }, { "cell_type": "code", "execution_count": null, "source": [ "from langchain.agents.agent_toolkits import create_pbi_agent\n", "from langchain.agents.agent_toolkits import PowerBIToolkit\n", "from langchain.utilities.powerbi import PowerBIDataset\n", "from langchain.chat_models import ChatOpenAI\n", "from langchain.agents import AgentExecutor\n", "from azure.identity import DefaultAzureCredential" ], "outputs": [], "metadata": { "tags": [] } }, { "cell_type": "code", "execution_count": null, "source": [ "fast_llm = ChatOpenAI(temperature=0.5, max_tokens=1000, model_name=\"gpt-3.5-turbo\", verbose=True)\n", "smart_llm = ChatOpenAI(temperature=0, max_tokens=100, model_name=\"gpt-4\", verbose=True)\n", "\n", "toolkit = PowerBIToolkit(\n", " powerbi=PowerBIDataset(dataset_id=\"\", table_names=['table1', 'table2'], credential=DefaultAzureCredential()), \n", " llm=smart_llm\n", ")\n", "\n", "agent_executor = create_pbi_agent(\n", " llm=fast_llm,\n", " toolkit=toolkit,\n", " verbose=True,\n", ")" ], "outputs": [], "metadata": { "tags": [] } }, { "cell_type": "markdown", "source": [ "## Example: describing a table" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "agent_executor.run(\"Describe table1\")" ], "outputs": [], "metadata": { "tags": [] } }, { "cell_type": "markdown", "source": [ "## Example: simple query on a table\n", "In this example, the agent actually figures out the correct query to get a row count of the table." 
], "metadata": {}, "attachments": {} }, { "cell_type": "code", "execution_count": null, "source": [ "agent_executor.run(\"How many records are in table1?\")" ], "outputs": [], "metadata": { "tags": [] } }, { "cell_type": "markdown", "source": [ "## Example: running queries" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "agent_executor.run(\"How many records are there by dimension1 in table2?\")" ], "outputs": [], "metadata": { "tags": [] } }, { "cell_type": "code", "execution_count": null, "source": [ "agent_executor.run(\"What unique values are there for dimensions2 in table2\")" ], "outputs": [], "metadata": { "tags": [] } }, { "cell_type": "markdown", "source": [ "## Example: add your own few-shot prompts" ], "metadata": {}, "attachments": {} }, { "cell_type": "code", "execution_count": null, "source": [ "#fictional example\n", "few_shots = \"\"\"\n", "Question: How many rows are in the table revenue?\n", "DAX: EVALUATE ROW(\"Number of rows\", COUNTROWS(revenue_details))\n", "----\n", "Question: How many rows are in the table revenue where year is not empty?\n", "DAX: EVALUATE ROW(\"Number of rows\", COUNTROWS(FILTER(revenue_details, revenue_details[year] <> \"\")))\n", "----\n", "Question: What was the average of value in revenue in dollars?\n", "DAX: EVALUATE ROW(\"Average\", AVERAGE(revenue_details[dollar_value]))\n", "----\n", "\"\"\"\n", "toolkit = PowerBIToolkit(\n", " powerbi=PowerBIDataset(dataset_id=\"\", table_names=['table1', 'table2'], credential=DefaultAzureCredential()), \n", " llm=smart_llm,\n", " examples=few_shots,\n", ")\n", "agent_executor = create_pbi_agent(\n", " llm=fast_llm,\n", " toolkit=toolkit,\n", " verbose=True,\n", ")" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "agent_executor.run(\"What was the maximum of value in revenue in dollars in 2022?\")" ], "outputs": [], "metadata": {} } ], "metadata": { "kernelspec": { "name": "python3", "display_name": 
"Python 3.9.16 64-bit" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" }, "interpreter": { "hash": "397704579725e15f5c7cb49fe5f0341eb7531c82d19f2c29d197e8b64ab5776b" } }, "nbformat": 4, "nbformat_minor": 5 }