PowerBI major refinement in working of tool and tweaks in the rest (#5090)

I've gained some experience with more complex datasets, and the earlier implementation needed too many attempts by the agent to create DAX. I refactored the code so that the LLM creates a DAX query based on a question and then immediately runs it against the dataset, with retries and a prompt that includes the error for the retry. This works much better! I also did some other refactoring of the inner workings, making things clearer, more concise and faster.
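The generate, run, and retry flow described above can be sketched independently of LangChain. This is a minimal illustration under stated assumptions, not the code from this commit: `answer_with_retries`, `parse_output`, `fake_llm` and `fake_dataset` are hypothetical names, the two callables stand in for the LLM chain and the dataset, and the retry prompt is rebuilt from the original question on each attempt rather than nested as in the diff.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Mirrors the shape of the retry prompt added in this commit.
RETRY_TEMPLATE = (
    "{question} DAX: {query} Error: {error}. Please supply a new DAX query."
)


def parse_output(response: Dict[str, Any]) -> Tuple[Optional[Any], Optional[str]]:
    """Split a response into (result, error); exactly one of them is set."""
    if "results" in response:
        return response["results"], None
    return None, str(response.get("error", "Unknown error"))


def answer_with_retries(
    question: str,
    generate_query: Callable[[str], str],  # hypothetical stand-in for the LLM
    run_query: Callable[[str], Dict[str, Any]],  # hypothetical stand-in for the dataset
    max_iterations: int = 5,
) -> str:
    """Generate a query, run it, and feed any error back into the next prompt."""
    prompt = question
    error: Optional[str] = None
    # One initial attempt plus up to max_iterations retries.
    for _ in range(max_iterations + 1):
        query = generate_query(prompt)
        result, error = parse_output(run_query(query))
        if error is None:
            return str(result)
        prompt = RETRY_TEMPLATE.format(question=question, query=query, error=error)
    return (
        f"Error on this question, the error was {error}, "
        "you can try to rephrase the question."
    )


def fake_llm(prompt: str) -> str:
    """Return a bad query first, then a good one once an error is in the prompt."""
    if "Error:" in prompt:
        return 'EVALUATE ROW("count", COUNTROWS(table1))'
    return "EVALUATE table9"


def fake_dataset(query: str) -> Dict[str, Any]:
    """Reject queries against the unknown table, answer everything else."""
    if "table9" in query:
        return {"error": "table9 not found"}
    return {"results": [{"count": 3}]}


print(answer_with_retries("How many rows are in table1?", fake_llm, fake_dataset))
```

With the stubs above, the first query fails, the error is folded into the second prompt, and the second query succeeds. Keeping the retry inside the tool means the agent sees only the final answer or a single error message.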
12 months ago · 1cb04f2b26
parent e57ebf3922
commit 1cb04f2b26
7 changed files with 191 additions and 187 deletions
--- a/langchain/agents/agent_toolkits/powerbi/prompt.py
+++ b/langchain/agents/agent_toolkits/powerbi/prompt.py
@@ -2,28 +2,24 @@
 """Prompts for PowerBI agent."""


-POWERBI_PREFIX = """You are an agent designed to interact with a Power BI Dataset.
+POWERBI_PREFIX = """You are an agent designed to help users interact with a PowerBI Dataset.

-Assistant has access to tools that can give context, write queries and execute those queries against PowerBI, Microsofts business intelligence tool. The questions from the users should be interpreted as related to the dataset that is available and not general questions about the world. If the question does not seem related to the dataset, just return "I don't know" as the answer. The query language that PowerBI uses is called DAX and it is quite particular and complex, so make sure to use the right tools to get the answers the user is looking for.
+Agent has access to a tool that can write a query based on the question and then run those against PowerBI, Microsofts business intelligence tool. The questions from the users should be interpreted as related to the dataset that is available and not general questions about the world. If the question does not seem related to the dataset, just return "This does not appear to be part of this dataset." as the answer.

-Given an input question, create a syntactically correct DAX query to run, then look at the results and return the answer. Sometimes the result indicate something is wrong with the query, or there were errors in the json serialization. Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results. You can order the results by a relevant column to return the most interesting examples in the database.
-
-Assistant never just starts querying, assistant should first find out which tables there are, then how each table is defined and then ask the question to query tool to create a query and then ask the query tool to execute it, finally create a complete sentence that answers the question, if multiple rows need are asked find a way to write that in a easily readible format for a human. Assistant has tools that can get more context of the tables which helps it write correct queries.
+Given an input question, ask to run the questions against the dataset, then look at the results and return the answer, the answer should be a complete sentence that answers the question, if multiple rows are asked find a way to write that in a easily readible format for a human, also make sure to represent numbers in readable ways, like 1M instead of 1000000. Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
 """

 POWERBI_SUFFIX = """Begin!

 Question: {input}
-Thought: I should first ask which tables I have, then how each table is defined and then ask the question to query tool to create a query for me and then I should ask the query tool to execute it, finally create a nice sentence that answers the question.
+Thought: I can first ask which tables I have, then how each table is defined and then ask the query tool the question I need, and finally create a nice sentence that answers the question.
 {agent_scratchpad}"""

 POWERBI_CHAT_PREFIX = """Assistant is a large language model built to help users interact with a PowerBI Dataset.

-Assistant has access to tools that can give context, write queries and execute those queries against PowerBI, Microsofts business intelligence tool. The questions from the users should be interpreted as related to the dataset that is available and not general questions about the world. If the question does not seem related to the dataset, just return "I don't know" as the answer. The query language that PowerBI uses is called DAX and it is quite particular and complex, so make sure to use the right tools to get the answers the user is looking for.
-
-Given an input question, create a syntactically correct DAX query to run, then look at the results and return the answer. Sometimes the result indicate something is wrong with the query, or there were errors in the json serialization. Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results. You can order the results by a relevant column to return the most interesting examples in the database.
+Assistant has access to a tool that can write a query based on the question and then run those against PowerBI, Microsofts business intelligence tool. The questions from the users should be interpreted as related to the dataset that is available and not general questions about the world. If the question does not seem related to the dataset, just return "This does not appear to be part of this dataset." as the answer.

-Assistant never just starts querying, assistant should first find out which tables there are, then how each table is defined and then ask the question to query tool to create a query and then ask the query tool to execute it, finally create a complete sentence that answers the question, if multiple rows need are asked find a way to write that in a easily readible format for a human. Assistant has tools that can get more context of the tables which helps it write correct queries.
+Given an input question, ask to run the questions against the dataset, then look at the results and return the answer, the answer should be a complete sentence that answers the question, if multiple rows are asked find a way to write that in a easily readible format for a human, also make sure to represent numbers in readable ways, like 1M instead of 1000000. Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
 """

 POWERBI_CHAT_SUFFIX = """TOOLS
--- a/langchain/agents/agent_toolkits/powerbi/toolkit.py
+++ b/langchain/agents/agent_toolkits/powerbi/toolkit.py
@@ -12,7 +12,6 @@ from langchain.tools import BaseTool
 from langchain.tools.powerbi.prompt import QUESTION_TO_QUERY
 from langchain.tools.powerbi.tool import (
    InfoPowerBITool,
-    InputToQueryTool,
    ListPowerBITool,
    QueryPowerBITool,
 )
@@ -25,6 +24,7 @@ class PowerBIToolkit(BaseToolkit):
    powerbi: PowerBIDataset = Field(exclude=True)
    llm: BaseLanguageModel = Field(exclude=True)
    examples: Optional[str] = None
+    max_iterations: int = 5
    callback_manager: Optional[BaseCallbackManager] = None

    class Config:
@@ -52,12 +52,12 @@ class PowerBIToolkit(BaseToolkit):
                ),
            )
        return [
-            QueryPowerBITool(powerbi=self.powerbi),
-            InfoPowerBITool(powerbi=self.powerbi),
-            ListPowerBITool(powerbi=self.powerbi),
-            InputToQueryTool(
+            QueryPowerBITool(
                llm_chain=chain,
                powerbi=self.powerbi,
                examples=self.examples,
+                max_iterations=self.max_iterations,
            ),
+            InfoPowerBITool(powerbi=self.powerbi),
+            ListPowerBITool(powerbi=self.powerbi),
        ]
--- a/langchain/tools/__init__.py
+++ b/langchain/tools/__init__.py
@@ -36,6 +36,11 @@ from langchain.tools.playwright import (
    NavigateTool,
 )
 from langchain.tools.plugin import AIPluginTool
+from langchain.tools.powerbi.tool import (
+    InfoPowerBITool,
+    ListPowerBITool,
+    QueryPowerBITool,
+)
 from langchain.tools.scenexplain.tool import SceneXplainTool
 from langchain.tools.shell.tool import ShellTool
 from langchain.tools.steamship_image_generation import SteamshipImageGenerationTool
@@ -79,13 +84,16 @@ __all__ = [
    "GoogleSerperRun",
    "HumanInputRun",
    "IFTTTWebhook",
+    "InfoPowerBITool",
    "ListDirectoryTool",
+    "ListPowerBITool",
    "MetaphorSearchResults",
    "MoveFileTool",
    "NavigateBackTool",
    "NavigateTool",
    "OpenAPISpec",
    "OpenWeatherMapQueryRun",
+    "QueryPowerBITool",
    "ReadFileTool",
    "SceneXplainTool",
    "ShellTool",
--- a/langchain/tools/powerbi/prompt.py
+++ b/langchain/tools/powerbi/prompt.py
@@ -1,6 +1,6 @@
 # flake8: noqa
 QUESTION_TO_QUERY = """
-Answer the question below with a DAX query that can be sent to Power BI. DAX queries have a simple syntax comprised of just one required keyword, EVALUATE, and several optional keywords: ORDER BY, START AT, DEFINE, MEASURE, VAR, TABLE, and COLUMN. Each keyword defines a statement used for the duration of the query. Any time < or > are used in the text below it means that those values need to be replaced by table, columns or other things. 
+Answer the question below with a DAX query that can be sent to Power BI. DAX queries have a simple syntax comprised of just one required keyword, EVALUATE, and several optional keywords: ORDER BY, START AT, DEFINE, MEASURE, VAR, TABLE, and COLUMN. Each keyword defines a statement used for the duration of the query. Any time < or > are used in the text below it means that those values need to be replaced by table, columns or other things. If the question is not something you can answer with a DAX query, reply with "I cannot answer this" and the question will be escalated to a human.

 Some DAX functions return a table instead of a scalar, and must be wrapped in a function that evaluates the table and returns a scalar; unless the table is a single column, single row table, then it is treated as a scalar value. Most DAX functions require one or more arguments, which can include tables, columns, expressions, and values. However, some functions, such as PI, do not require any arguments, but always require parentheses to indicate the null argument. For example, you must always type PI(), not PI. You can also nest functions within other functions. 

@@ -31,7 +31,7 @@ DATEDIFF(date1, date2, <interval>) - Returns the difference between two date val
 DATEVALUE(<date_text>) - Returns a date value that represents the specified date.
 YEAR(<date>), QUARTER(<date>), MONTH(<date>), DAY(<date>), HOUR(<date>), MINUTE(<date>), SECOND(<date>) - Returns the part of the date for the specified date.

-Finally, make sure to escape double quotes with a single backslash, and make sure that only table names have single quotes around them, while names of measures or the values of columns that you want to compare against are in escaped double quotes. Newlines are not necessary and can be skipped. The queries are serialized as json and so will have to fit be compliant with json syntax.
+Finally, make sure to escape double quotes with a single backslash, and make sure that only table names have single quotes around them, while names of measures or the values of columns that you want to compare against are in escaped double quotes. Newlines are not necessary and can be skipped. The queries are serialized as json and so will have to fit be compliant with json syntax. Sometimes you will get a question, a DAX query and a error, in that case you need to rewrite the DAX query to get the correct answer.

 The following tables exist: {tables}

@@ -57,9 +57,9 @@ DAX: EVALUATE ROW(\"Average\", AVERAGE(<table>[<column>]))
 ----
 """

-BAD_REQUEST_RESPONSE = (
-    "Bad request. Please ask the question_to_query_powerbi tool to provide the query."
+RETRY_RESPONSE = (
+    "{tool_input} DAX: {query} Error: {error}. Please supply a new DAX query."
 )
-BAD_REQUEST_RESPONSE_ESCALATED = "You already tried this, please try a different query."
+BAD_REQUEST_RESPONSE = "Error on this question, the error was {error}, you can try to rephrase the question."
 SCHEMA_ERROR_RESPONSE = "Bad request, are you sure the table name is correct?"
 UNAUTHORIZED_RESPONSE = "Unauthorized. Try changing your authentication, do not retry."
--- a/langchain/tools/powerbi/tool.py
+++ b/langchain/tools/powerbi/tool.py
@@ -1,5 +1,5 @@
 """Tools for interacting with a Power BI dataset."""
-from typing import Any, Dict, Optional
+from typing import Any, Dict, Optional, Tuple

 from pydantic import Field, validator

@@ -11,9 +11,9 @@ from langchain.chains.llm import LLMChain
 from langchain.tools.base import BaseTool
 from langchain.tools.powerbi.prompt import (
    BAD_REQUEST_RESPONSE,
-    BAD_REQUEST_RESPONSE_ESCALATED,
    DEFAULT_FEWSHOT_EXAMPLES,
    QUESTION_TO_QUERY,
+    RETRY_RESPONSE,
 )
 from langchain.utilities.powerbi import PowerBIDataset, json_to_md

@@ -23,21 +23,39 @@ class QueryPowerBITool(BaseTool):

    name = "query_powerbi"
    description = """
-    Input to this tool is a detailed and correct DAX query, output is a result from the dataset.
-    If the query is not correct, an error message will be returned.
-    If an error is returned with Bad request in it, rewrite the query and try again.
-    If an error is returned with Unauthorized in it, do not try again, but tell the user to change their authentication.
+    Input to this tool is a detailed question about the dataset, output is a result from the dataset. It will try to answer the question using the dataset, and if it cannot, it will ask for clarification.

-    Example Input: "EVALUATE ROW("count", COUNTROWS(table1))"
+    Example Input: "How many rows are in table1?"
    """  # noqa: E501
+    llm_chain: LLMChain
    powerbi: PowerBIDataset = Field(exclude=True)
+    template: Optional[str] = QUESTION_TO_QUERY
+    examples: Optional[str] = DEFAULT_FEWSHOT_EXAMPLES
    session_cache: Dict[str, Any] = Field(default_factory=dict, exclude=True)
+    max_iterations: int = 5

    class Config:
        """Configuration for this pydantic object."""

        arbitrary_types_allowed = True

+    @validator("llm_chain")
+    def validate_llm_chain_input_variables(  # pylint: disable=E0213
+        cls, llm_chain: LLMChain
+    ) -> LLMChain:
+        """Make sure the LLM chain has the correct input variables."""
+        if llm_chain.prompt.input_variables != [
+            "tool_input",
+            "tables",
+            "schemas",
+            "examples",
+        ]:
+            raise ValueError(
+                "LLM chain for QueryPowerBITool must have input variables ['tool_input', 'tables', 'schemas', 'examples'], found %s",  # noqa: C0301 E501 # pylint: disable=C0301
+                llm_chain.prompt.input_variables,
+            )
+        return llm_chain
+
    def _check_cache(self, tool_input: str) -> Optional[str]:
        """Check if the input is present in the cache.

@@ -45,88 +63,106 @@ class QueryPowerBITool(BaseTool):
        if not present return None."""
        if tool_input not in self.session_cache:
            return None
-        if self.session_cache[tool_input] == BAD_REQUEST_RESPONSE:
-            self.session_cache[tool_input] = BAD_REQUEST_RESPONSE_ESCALATED
        return self.session_cache[tool_input]

    def _run(
        self,
        tool_input: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
+        **kwargs: Any,
    ) -> str:
        """Execute the query, return the results or an error message."""
        if cache := self._check_cache(tool_input):
            return cache
+
        try:
-            self.session_cache[tool_input] = self.powerbi.run(command=tool_input)
-        except Exception as exc:  # pylint: disable=broad-except
-            if "bad request" in str(exc).lower():
-                self.session_cache[tool_input] = BAD_REQUEST_RESPONSE
-            elif "unauthorized" in str(exc).lower():
-                self.session_cache[
-                    tool_input
-                ] = "Unauthorized. Try changing your authentication, do not retry."
-            else:
-                self.session_cache[tool_input] = str(exc)
-            return self.session_cache[tool_input]
-        if "results" in self.session_cache[tool_input]:
-            self.session_cache[tool_input] = json_to_md(
-                self.session_cache[tool_input]["results"][0]["tables"][0]["rows"]
+            query = self.llm_chain.predict(
+                tool_input=tool_input,
+                tables=self.powerbi.get_table_names(),
+                schemas=self.powerbi.get_schemas(),
+                examples=self.examples,
            )
+        except Exception as exc:  # pylint: disable=broad-except
+            self.session_cache[tool_input] = f"Error on call to LLM: {exc}"
            return self.session_cache[tool_input]
-        if (
-            "error" in self.session_cache[tool_input]
-            and "pbi.error" in self.session_cache[tool_input]["error"]
-            and "details" in self.session_cache[tool_input]["error"]["pbi.error"]
-        ):
-            self.session_cache[
-                tool_input
-            ] = f'{BAD_REQUEST_RESPONSE} Error was {self.session_cache[tool_input]["error"]["pbi.error"]["details"][0]["detail"]}'  # noqa: E501
+        if query == "I cannot answer this":
+            self.session_cache[tool_input] = query
            return self.session_cache[tool_input]
-        self.session_cache[
-            tool_input
-        ] = f'{BAD_REQUEST_RESPONSE} Error was {self.session_cache[tool_input]["error"]}'  # noqa: E501
+        pbi_result = self.powerbi.run(command=query)
+        result, error = self._parse_output(pbi_result)
+
+        iterations = kwargs.get("iterations", 0)
+        if error and iterations < self.max_iterations:
+            return self._run(
+                tool_input=RETRY_RESPONSE.format(
+                    tool_input=tool_input, query=query, error=error
+                ),
+                run_manager=run_manager,
+                iterations=iterations + 1,
+            )
+
+        self.session_cache[tool_input] = (
+            result if result else BAD_REQUEST_RESPONSE.format(error=error)
+        )
        return self.session_cache[tool_input]

    async def _arun(
        self,
        tool_input: str,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
+        **kwargs: Any,
    ) -> str:
        """Execute the query, return the results or an error message."""
        if cache := self._check_cache(tool_input):
            return cache
        try:
-            self.session_cache[tool_input] = await self.powerbi.arun(command=tool_input)
-        except Exception as exc:  # pylint: disable=broad-except
-            if "bad request" in str(exc).lower():
-                self.session_cache[tool_input] = BAD_REQUEST_RESPONSE
-            elif "unauthorized" in str(exc).lower():
-                self.session_cache[
-                    tool_input
-                ] = "Unauthorized. Try changing your authentication, do not retry."
-            else:
-                self.session_cache[tool_input] = str(exc)
-            return self.session_cache[tool_input]
-        if "results" in self.session_cache[tool_input]:
-            self.session_cache[tool_input] = json_to_md(
-                self.session_cache[tool_input]["results"][0]["tables"][0]["rows"]
+            query = await self.llm_chain.apredict(
+                tool_input=tool_input,
+                tables=self.powerbi.get_table_names(),
+                schemas=self.powerbi.get_schemas(),
+                examples=self.examples,
            )
+        except Exception as exc:  # pylint: disable=broad-except
+            self.session_cache[tool_input] = f"Error on call to LLM: {exc}"
            return self.session_cache[tool_input]
-        if (
-            "error" in self.session_cache[tool_input]
-            and "pbi.error" in self.session_cache[tool_input]["error"]
-            and "details" in self.session_cache[tool_input]["error"]["pbi.error"]
-        ):
-            self.session_cache[
-                tool_input
-            ] = f'{BAD_REQUEST_RESPONSE} Error was {self.session_cache[tool_input]["error"]["pbi.error"]["details"][0]["detail"]}'  # noqa: E501
+
+        if query == "I cannot answer this":
+            self.session_cache[tool_input] = query
            return self.session_cache[tool_input]
-        self.session_cache[
-            tool_input
-        ] = f'{BAD_REQUEST_RESPONSE} Error was {self.session_cache[tool_input]["error"]}'  # noqa: E501
+        pbi_result = await self.powerbi.arun(command=query)
+        result, error = self._parse_output(pbi_result)
+
+        iterations = kwargs.get("iterations", 0)
+        if error and iterations < self.max_iterations:
+            return await self._arun(
+                tool_input=RETRY_RESPONSE.format(
+                    tool_input=tool_input, query=query, error=error
+                ),
+                run_manager=run_manager,
+                iterations=iterations + 1,
+            )
+
+        self.session_cache[tool_input] = (
+            result if result else BAD_REQUEST_RESPONSE.format(error=error)
+        )
        return self.session_cache[tool_input]

+    def _parse_output(
+        self, pbi_result: Dict[str, Any]
+    ) -> Tuple[Optional[str], Optional[str]]:
+        """Parse the output of the query to a markdown table."""
+        if "results" in pbi_result:
+            return json_to_md(pbi_result["results"][0]["tables"][0]["rows"]), None
+
+        if "error" in pbi_result:
+            if (
+                "pbi.error" in pbi_result["error"]
+                and "details" in pbi_result["error"]["pbi.error"]
+            ):
+                return None, pbi_result["error"]["pbi.error"]["details"][0]["detail"]
+            return None, pbi_result["error"]
+        return None, "Unknown error"
+

 class InfoPowerBITool(BaseTool):
    """Tool for getting metadata about a PowerBI Dataset."""
@@ -188,64 +224,3 @@ class ListPowerBITool(BaseTool):
    ) -> str:
        """Get the names of the tables."""
        return ", ".join(self.powerbi.get_table_names())
-
-
-class InputToQueryTool(BaseTool):
-    """Use an LLM to parse the question to a DAX query."""
-
-    name = "question_to_query_powerbi"
-    description = """
-    Use this tool to create the DAX query from a question, the input is a fully formed question related to the powerbi dataset. Always use this tool before executing a query with query_powerbi!
-
-    Example Input: "How many records are in table1?"
-    """  # noqa: E501
-    llm_chain: LLMChain
-    powerbi: PowerBIDataset = Field(exclude=True)
-    template: Optional[str] = QUESTION_TO_QUERY
-    examples: Optional[str] = DEFAULT_FEWSHOT_EXAMPLES
-
-    class Config:
-        """Configuration for this pydantic object."""
-
-        arbitrary_types_allowed = True
-
-    @validator("llm_chain")
-    def validate_llm_chain_input_variables(  # pylint: disable=E0213
-        cls, llm_chain: LLMChain
-    ) -> LLMChain:
-        """Make sure the LLM chain has the correct input variables."""
-        if llm_chain.prompt.input_variables != [
-            "tool_input",
-            "tables",
-            "schemas",
-            "examples",
-        ]:
-            raise ValueError(
-                "LLM chain for InputToQueryTool must have input variables ['tool_input', 'tables', 'schemas', 'examples']"  # noqa: C0301 E501 # pylint: disable=C0301
-            )
-        return llm_chain
-
-    def _run(
-        self,
-        tool_input: str,
-        run_manager: Optional[CallbackManagerForToolRun] = None,
-    ) -> str:
-        """Use the LLM to check the query."""
-        return self.llm_chain.predict(
-            tool_input=tool_input,
-            tables=self.powerbi.get_table_names(),
-            schemas=self.powerbi.get_schemas(),
-            examples=self.examples,
-        )
-
-    async def _arun(
-        self,
-        tool_input: str,
-        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
-    ) -> str:
-        return await self.llm_chain.apredict(
-            tool_input=tool_input,
-            tables=self.powerbi.get_table_names(),
-            schemas=self.powerbi.get_schemas(),
-            examples=self.examples,
-        )
--- a/langchain/utilities/powerbi.py
+++ b/langchain/utilities/powerbi.py
@@ -1,15 +1,15 @@
 """Wrapper around a Power BI endpoint."""
 from __future__ import annotations

+import asyncio
 import logging
 import os
-from copy import deepcopy
 from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Optional, Union

 import aiohttp
 import requests
 from aiohttp import ServerTimeoutError
-from pydantic import BaseModel, Field, root_validator
+from pydantic import BaseModel, Field, root_validator, validator
 from requests.exceptions import Timeout

 _LOGGER = logging.getLogger(__name__)
@@ -36,14 +36,19 @@ class PowerBIDataset(BaseModel):
    token: Optional[str] = None
    impersonated_user_name: Optional[str] = None
    sample_rows_in_table_info: int = Field(default=1, gt=0, le=10)
+    schemas: Dict[str, str] = Field(default_factory=dict)
    aiosession: Optional[aiohttp.ClientSession] = None
-    schemas: Dict[str, str] = Field(default_factory=dict, init=False)

    class Config:
        """Configuration for this pydantic object."""

        arbitrary_types_allowed = True

+    @validator("table_names", allow_reuse=True)
+    def fix_table_names(cls, table_names: List[str]) -> List[str]:
+        """Fix the table names."""
+        return [fix_table_name(table) for table in table_names]
+
    @root_validator(pre=True, allow_reuse=True)
    def token_or_credential_present(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        """Validate that at least one of token and credentials is present."""
@@ -102,30 +107,37 @@ class PowerBIDataset(BaseModel):

    def _get_tables_to_query(
        self, table_names: Optional[Union[List[str], str]] = None
-    ) -> List[str]:
-        """Get the tables names that need to be queried."""
+    ) -> Optional[List[str]]:
+        """Get the tables names that need to be queried, after checking they exist."""
        if table_names is not None:
            if (
                isinstance(table_names, list)
                and len(table_names) > 0
                and table_names[0] != ""
            ):
-                return table_names
+                fixed_tables = [fix_table_name(table) for table in table_names]
+                non_existing_tables = [
+                    table for table in fixed_tables if table not in self.table_names
+                ]
+                if non_existing_tables:
+                    _LOGGER.warning(
+                        "Table(s) %s not found in dataset.",
+                        ", ".join(non_existing_tables),
+                    )
+                tables = [
+                    table for table in fixed_tables if table not in non_existing_tables
+                ]
+                return tables if tables else None
            if isinstance(table_names, str) and table_names != "":
-                return [table_names]
+                if table_names not in self.table_names:
+                    _LOGGER.warning("Table %s not found in dataset.", table_names)
+                    return None
+                return [fix_table_name(table_names)]
        return self.table_names

    def _get_tables_todo(self, tables_todo: List[str]) -> List[str]:
        """Get the tables that still need to be queried."""
-        todo = deepcopy(tables_todo)
-        for table in todo:
-            if table not in self.table_names:
-                _LOGGER.warning("Table %s not found in dataset.", table)
-                todo.remove(table)
-                continue
-            if table in self.schemas:
-                todo.remove(table)
-        return todo
+        return [table for table in tables_todo if table not in self.schemas]

    def _get_schema_for_tables(self, table_names: List[str]) -> str:
        """Create a string of the table schemas for the supplied tables."""
@@ -139,23 +151,11 @@ class PowerBIDataset(BaseModel):
    ) -> str:
        """Get information about specified tables."""
        tables_requested = self._get_tables_to_query(table_names)
+        if tables_requested is None:
+            return "No (valid) tables requested."
        tables_todo = self._get_tables_todo(tables_requested)
        for table in tables_todo:
-            if " " in table and not table.startswith("'") and not table.endswith("'"):
-                table = f"'{table}'"
-            try:
-                result = self.run(
-                    f"EVALUATE TOPN({self.sample_rows_in_table_info}, {table})"
-                )
-            except Timeout:
-                _LOGGER.warning("Timeout while getting table info for %s", table)
-                self.schemas[table] = "unknown"
-                continue
-            except Exception as exc:  # pylint: disable=broad-exception-caught
-                _LOGGER.warning("Error while getting table info for %s: %s", table, exc)
-                self.schemas[table] = "unknown"
-                continue
-            self.schemas[table] = json_to_md(result["results"][0]["tables"][0]["rows"])
+            self._get_schema(table)
        return self._get_schema_for_tables(tables_requested)

    async def aget_table_info(
@@ -163,25 +163,40 @@ class PowerBIDataset(BaseModel):
    ) -> str:
        """Get information about specified tables."""
        tables_requested = self._get_tables_to_query(table_names)
+        if tables_requested is None:
+            return "No (valid) tables requested."
        tables_todo = self._get_tables_todo(tables_requested)
-        for table in tables_todo:
-            if " " in table and not table.startswith("'") and not table.endswith("'"):
-                table = f"'{table}'"
-            try:
-                result = await self.arun(
-                    f"EVALUATE TOPN({self.sample_rows_in_table_info}, {table})"
-                )
-            except ServerTimeoutError:
-                _LOGGER.warning("Timeout while getting table info for %s", table)
-                self.schemas[table] = "unknown"
-                continue
-            except Exception as exc:  # pylint: disable=broad-exception-caught
-                _LOGGER.warning("Error while getting table info for %s: %s", table, exc)
-                self.schemas[table] = "unknown"
-                continue
-            self.schemas[table] = json_to_md(result["results"][0]["tables"][0]["rows"])
+        await asyncio.gather(*[self._aget_schema(table) for table in tables_todo])
        return self._get_schema_for_tables(tables_requested)

+    def _get_schema(self, table: str) -> None:
+        """Get the schema for a table."""
+        try:
+            result = self.run(
+                f"EVALUATE TOPN({self.sample_rows_in_table_info}, {table})"
+            )
+            self.schemas[table] = json_to_md(result["results"][0]["tables"][0]["rows"])
+        except Timeout:
+            _LOGGER.warning("Timeout while getting table info for %s", table)
+            self.schemas[table] = "unknown"
+        except Exception as exc:  # pylint: disable=broad-exception-caught
+            _LOGGER.warning("Error while getting table info for %s: %s", table, exc)
+            self.schemas[table] = "unknown"
+
+    async def _aget_schema(self, table: str) -> None:
+        """Get the schema for a table."""
+        try:
+            result = await self.arun(
+                f"EVALUATE TOPN({self.sample_rows_in_table_info}, {table})"
+            )
+            self.schemas[table] = json_to_md(result["results"][0]["tables"][0]["rows"])
+        except ServerTimeoutError:
+            _LOGGER.warning("Timeout while getting table info for %s", table)
+            self.schemas[table] = "unknown"
+        except Exception as exc:  # pylint: disable=broad-exception-caught
+            _LOGGER.warning("Error while getting table info for %s: %s", table, exc)
+            self.schemas[table] = "unknown"
+
    def _create_json_content(self, command: str) -> dict[str, Any]:
        """Create the json content for the request."""
        return {
@@ -242,3 +257,10 @@ def json_to_md(
            output_md += f"| {value} "
        output_md += "|\n"
    return output_md
+
+
+def fix_table_name(table: str) -> str:
+    """Add single quotes around table names that contain spaces."""
+    if " " in table and not table.startswith("'") and not table.endswith("'"):
+        return f"'{table}'"
+    return table
--- a/tests/unit_tests/tools/test_public_api.py
+++ b/tests/unit_tests/tools/test_public_api.py
@@ -31,13 +31,16 @@ _EXPECTED = [
    "GoogleSerperRun",
    "HumanInputRun",
    "IFTTTWebhook",
+    "InfoPowerBITool",
    "ListDirectoryTool",
+    "ListPowerBITool",
    "MetaphorSearchResults",
    "MoveFileTool",
    "NavigateBackTool",
    "NavigateTool",
    "OpenAPISpec",
    "OpenWeatherMapQueryRun",
+    "QueryPowerBITool",
    "ReadFileTool",
    "SceneXplainTool",
    "ShellTool",
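
The table-name handling added in `langchain/utilities/powerbi.py` can be tried out on its own. The sketch below copies `fix_table_name` from the diff. `select_known_tables` is a hypothetical helper, not part of this commit, that approximates the filtering in `_get_tables_to_query` under the assumption that both the requested and the known names are quoted before they are compared.

```python
from typing import List, Optional


def fix_table_name(table: str) -> str:
    """Add single quotes around table names that contain spaces."""
    if " " in table and not table.startswith("'") and not table.endswith("'"):
        return f"'{table}'"
    return table


def select_known_tables(requested: List[str], known: List[str]) -> Optional[List[str]]:
    """Hypothetical helper: keep only requested tables that exist in the dataset.

    Both sides are normalised with fix_table_name first, so "Sales Data"
    and "'Sales Data'" are treated as the same table.
    """
    known_fixed = {fix_table_name(table) for table in known}
    tables = [
        fixed for fixed in map(fix_table_name, requested) if fixed in known_fixed
    ]
    # Returns None when nothing valid remains, like the list branch in the diff.
    return tables or None


print(fix_table_name("Sales Data"))
print(select_known_tables(["Sales Data", "Missing"], ["Sales Data", "Customers"]))
```

Normalising before the membership check avoids rejecting a name only because it arrived unquoted.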