community: Add stock market tools from financialdatasets.ai (#25025)

**Description:**
In this PR, I am adding three stock market tools from
financialdatasets.ai (my API!):
- get balance sheets
- get cash flow statements
- get income statements

Twitter handle: [@virattt](https://twitter.com/virattt)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
pull/25123/head
Virat Singh 1 month ago committed by GitHub
parent 267855b3c1
commit 264ab96980
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@ -0,0 +1,313 @@
{
"cells": [
{
"cell_type": "raw",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"sidebar_label: financial datasets\n",
"---"
]
},
{
"cell_type": "markdown",
"source": [
"# FinancialDatasetsToolkit\n",
"\n",
"The [financial datasets](https://financialdatasets.ai/) stock market API provides REST endpoints that let you get financial data for 16,000+ tickers spanning 30+ years.\n",
"\n",
"## Setup\n",
"\n",
"To use this toolkit, you need two API keys:\n",
"\n",
"`FINANCIAL_DATASETS_API_KEY`: Get it from [financialdatasets.ai](https://financialdatasets.ai/).\n",
"`OPENAI_API_KEY`: Get it from [OpenAI](https://platform.openai.com/)."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"FINANCIAL_DATASETS_API_KEY\"] = getpass.getpass()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"### Installation\n",
"\n",
"This toolkit lives in the `langchain-community` package."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-community"
]
},
{
"cell_type": "markdown",
"id": "a38cde65-254d-4219-a441-068766c0d4b5",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our toolkit:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.agent_toolkits.financial_datasets.toolkit import (\n",
" FinancialDatasetsToolkit,\n",
")\n",
"from langchain_community.utilities.financial_datasets import FinancialDatasetsAPIWrapper\n",
"\n",
"api_wrapper = FinancialDatasetsAPIWrapper(\n",
" financial_datasets_api_key=os.environ[\"FINANCIAL_DATASETS_API_KEY\"]\n",
")\n",
"toolkit = FinancialDatasetsToolkit(api_wrapper=api_wrapper)"
]
},
{
"cell_type": "markdown",
"id": "5c5f2839-4020-424e-9fc9-07777eede442",
"metadata": {},
"source": [
"## Tools\n",
"\n",
"View available tools:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
"metadata": {},
"outputs": [],
"source": [
"tools = toolkit.get_tools()"
]
},
{
"cell_type": "markdown",
"source": [
"## Use within an agent\n",
"\n",
"Let's equip our agent with the FinancialDatasetsToolkit and ask financial questions."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"system_prompt = \"\"\"\n",
"You are an advanced financial analysis AI assistant equipped with specialized tools\n",
"to access and analyze financial data. Your primary function is to help users with\n",
"financial analysis by retrieving and interpreting income statements, balance sheets,\n",
"and cash flow statements for publicly traded companies.\n",
"\n",
"You have access to the following tools from the FinancialDatasetsToolkit:\n",
"\n",
"1. Balance Sheets: Retrieves balance sheet data for a given ticker symbol.\n",
"2. Income Statements: Fetches income statement data for a specified company.\n",
"3. Cash Flow Statements: Accesses cash flow statement information for a particular ticker.\n",
"\n",
"Your capabilities include:\n",
"\n",
"1. Retrieving financial statements for any publicly traded company using its ticker symbol.\n",
"2. Analyzing financial ratios and metrics based on the data from these statements.\n",
"3. Comparing financial performance across different time periods (e.g., year-over-year or quarter-over-quarter).\n",
"4. Identifying trends in a company's financial health and performance.\n",
"5. Providing insights on a company's liquidity, solvency, profitability, and efficiency.\n",
"6. Explaining complex financial concepts in simple terms.\n",
"\n",
"When responding to queries:\n",
"\n",
"1. Always specify which financial statement(s) you're using for your analysis.\n",
"2. Provide context for the numbers you're referencing (e.g., fiscal year, quarter).\n",
"3. Explain your reasoning and calculations clearly.\n",
"4. If you need more information to provide a complete answer, ask for clarification.\n",
"5. When appropriate, suggest additional analyses that might be helpful.\n",
"\n",
"Remember, your goal is to provide accurate, insightful financial analysis to\n",
"help users make informed decisions. Always maintain a professional and objective tone in your responses.\n",
"\"\"\""
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"Instantiate the LLM."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"id": "310bf18e-6c9a-4072-b86e-47bc1fcca29d",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.tools import tool\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(model=\"gpt-4o\")"
]
},
{
"cell_type": "markdown",
"source": [
"Define a user query."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae",
"metadata": {},
"outputs": [],
"source": [
"query = \"What was AAPL's revenue in 2023? What about it's total debt in Q1 2024?\""
]
},
{
"cell_type": "markdown",
"source": [
"Create the agent."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from langchain.agents import AgentExecutor, create_tool_calling_agent\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system_prompt),\n",
" (\"human\", \"{input}\"),\n",
" # Placeholders fill up a **list** of messages\n",
" (\"placeholder\", \"{agent_scratchpad}\"),\n",
" ]\n",
")\n",
"\n",
"\n",
"agent = create_tool_calling_agent(model, tools, prompt)\n",
"agent_executor = AgentExecutor(agent=agent, tools=tools)"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"Query the agent."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"agent_executor.invoke({\"input\": query})"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all `FinancialDatasetsToolkit` features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/agent_toolkits/langchain_community.agent_toolkits.financial_datasets.toolkit.FinancialDatasetsToolkit.html)."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {
"collapsed": false
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,45 @@
from __future__ import annotations
from typing import List
from langchain_core.pydantic_v1 import Field
from langchain_core.tools import BaseToolkit
from langchain_community.tools import BaseTool
from langchain_community.tools.financial_datasets.balance_sheets import BalanceSheets
from langchain_community.tools.financial_datasets.cash_flow_statements import (
CashFlowStatements,
)
from langchain_community.tools.financial_datasets.income_statements import (
IncomeStatements,
)
from langchain_community.utilities.financial_datasets import FinancialDatasetsAPIWrapper
class FinancialDatasetsToolkit(BaseToolkit):
"""Toolkit for interacting with financialdatasets.ai.
Parameters:
api_wrapper: The FinancialDatasets API Wrapper.
"""
api_wrapper: FinancialDatasetsAPIWrapper = Field(
default_factory=FinancialDatasetsAPIWrapper
)
def __init__(self, api_wrapper: FinancialDatasetsAPIWrapper):
super().__init__()
self.api_wrapper = api_wrapper
class Config:
"""Pydantic config."""
arbitrary_types_allowed = True
def get_tools(self) -> List[BaseTool]:
"""Get the tools in the toolkit."""
return [
BalanceSheets(api_wrapper=self.api_wrapper),
CashFlowStatements(api_wrapper=self.api_wrapper),
IncomeStatements(api_wrapper=self.api_wrapper),
]

@ -120,6 +120,15 @@ if TYPE_CHECKING:
ReadFileTool,
WriteFileTool,
)
from langchain_community.tools.financial_datasets.balance_sheets import (
BalanceSheets,
)
from langchain_community.tools.financial_datasets.cash_flow_statements import (
CashFlowStatements,
)
from langchain_community.tools.financial_datasets.income_statements import (
IncomeStatements,
)
from langchain_community.tools.gmail import (
GmailCreateDraft,
GmailGetMessage,
@ -348,6 +357,7 @@ __all__ = [
"AzureCogsSpeech2TextTool",
"AzureCogsText2SpeechTool",
"AzureCogsTextAnalyticsHealthTool",
"BalanceSheets",
"BaseGraphQLTool",
"BaseRequestsTool",
"BaseSQLDatabaseTool",
@ -357,6 +367,7 @@ __all__ = [
"BingSearchResults",
"BingSearchRun",
"BraveSearch",
"CashFlowStatements",
"ClickTool",
"CogniswitchKnowledgeRequest",
"CogniswitchKnowledgeSourceFile",
@ -396,6 +407,7 @@ __all__ = [
"GoogleSerperRun",
"HumanInputRun",
"IFTTTWebhook",
"IncomeStatements",
"InfoPowerBITool",
"InfoSQLDatabaseTool",
"InfoSparkSQLTool",
@ -498,6 +510,7 @@ _module_lookup = {
"AzureCogsSpeech2TextTool": "langchain_community.tools.azure_cognitive_services",
"AzureCogsText2SpeechTool": "langchain_community.tools.azure_cognitive_services",
"AzureCogsTextAnalyticsHealthTool": "langchain_community.tools.azure_cognitive_services", # noqa: E501
"BalanceSheets": "langchain_community.tools.financial_datasets.balance_sheets",
"BaseGraphQLTool": "langchain_community.tools.graphql.tool",
"BaseRequestsTool": "langchain_community.tools.requests.tool",
"BaseSQLDatabaseTool": "langchain_community.tools.sql_database.tool",
@ -507,6 +520,7 @@ _module_lookup = {
"BingSearchResults": "langchain_community.tools.bing_search.tool",
"BingSearchRun": "langchain_community.tools.bing_search.tool",
"BraveSearch": "langchain_community.tools.brave_search.tool",
"CashFlowStatements": "langchain_community.tools.financial_datasets.cash_flow_statements", # noqa: E501
"ClickTool": "langchain_community.tools.playwright",
"CogniswitchKnowledgeRequest": "langchain_community.tools.cogniswitch.tool",
"CogniswitchKnowledgeSourceFile": "langchain_community.tools.cogniswitch.tool",
@ -547,6 +561,7 @@ _module_lookup = {
"GoogleSerperRun": "langchain_community.tools.google_serper.tool",
"HumanInputRun": "langchain_community.tools.human.tool",
"IFTTTWebhook": "langchain_community.tools.ifttt",
"IncomeStatements": "langchain_community.tools.financial_datasets.income_statements", # noqa: E501
"InfoPowerBITool": "langchain_community.tools.powerbi.tool",
"InfoSQLDatabaseTool": "langchain_community.tools.sql_database.tool",
"InfoSparkSQLTool": "langchain_community.tools.spark_sql.tool",

@ -0,0 +1,17 @@
"""financial datasets tools."""
from langchain_community.tools.financial_datasets.balance_sheets import (
BalanceSheets,
)
from langchain_community.tools.financial_datasets.cash_flow_statements import (
CashFlowStatements,
)
from langchain_community.tools.financial_datasets.income_statements import (
IncomeStatements,
)
__all__ = [
"BalanceSheets",
"CashFlowStatements",
"IncomeStatements",
]

@ -0,0 +1,62 @@
from typing import Optional, Type
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import BaseTool
from langchain_community.utilities.financial_datasets import FinancialDatasetsAPIWrapper
class BalanceSheetsSchema(BaseModel):
"""Input for BalanceSheets."""
ticker: str = Field(
description="The ticker symbol to fetch balance sheets for.",
)
period: str = Field(
description="The period of the balance sheets. "
"Possible values are: "
"annual, quarterly, ttm. "
"Default is 'annual'.",
)
limit: int = Field(
description="The number of balance sheets to return. " "Default is 10.",
)
class BalanceSheets(BaseTool):
"""
Tool that gets balance sheets for a given ticker over a given period.
"""
mode: str = "get_balance_sheets"
name: str = "balance_sheets"
description: str = (
"A wrapper around financial datasets's Balance Sheets API. "
"This tool is useful for fetching balance sheets for a given ticker."
"The tool fetches balance sheets for a given ticker over a given period."
"The period can be annual, quarterly, or trailing twelve months (ttm)."
"The number of balance sheets to return can also be "
"specified using the limit parameter."
)
args_schema: Type[BalanceSheetsSchema] = BalanceSheetsSchema
api_wrapper: FinancialDatasetsAPIWrapper = Field(..., exclude=True)
def __init__(self, api_wrapper: FinancialDatasetsAPIWrapper):
super().__init__(api_wrapper=api_wrapper)
def _run(
self,
ticker: str,
period: str,
limit: Optional[int],
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the Balance Sheets API tool."""
return self.api_wrapper.run(
mode=self.mode,
ticker=ticker,
period=period,
limit=limit,
)

@ -0,0 +1,62 @@
from typing import Optional, Type
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import BaseTool
from langchain_community.utilities.financial_datasets import FinancialDatasetsAPIWrapper
class CashFlowStatementsSchema(BaseModel):
"""Input for CashFlowStatements."""
ticker: str = Field(
description="The ticker symbol to fetch cash flow statements for.",
)
period: str = Field(
description="The period of the cash flow statement. "
"Possible values are: "
"annual, quarterly, ttm. "
"Default is 'annual'.",
)
limit: int = Field(
description="The number of cash flow statements to return. " "Default is 10.",
)
class CashFlowStatements(BaseTool):
"""
Tool that gets cash flow statements for a given ticker over a given period.
"""
mode: str = "get_cash_flow_statements"
name: str = "cash_flow_statements"
description: str = (
"A wrapper around financial datasets's Cash Flow Statements API. "
"This tool is useful for fetching cash flow statements for a given ticker."
"The tool fetches cash flow statements for a given ticker over a given period."
"The period can be annual, quarterly, or trailing twelve months (ttm)."
"The number of cash flow statements to return can also be "
"specified using the limit parameter."
)
args_schema: Type[CashFlowStatementsSchema] = CashFlowStatementsSchema
api_wrapper: FinancialDatasetsAPIWrapper = Field(..., exclude=True)
def __init__(self, api_wrapper: FinancialDatasetsAPIWrapper):
super().__init__(api_wrapper=api_wrapper)
def _run(
self,
ticker: str,
period: str,
limit: Optional[int],
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the Cash Flow Statements API tool."""
return self.api_wrapper.run(
mode=self.mode,
ticker=ticker,
period=period,
limit=limit,
)

@ -0,0 +1,62 @@
from typing import Optional, Type
from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import BaseTool
from langchain_community.utilities.financial_datasets import FinancialDatasetsAPIWrapper
class IncomeStatementsSchema(BaseModel):
"""Input for IncomeStatements."""
ticker: str = Field(
description="The ticker symbol to fetch income statements for.",
)
period: str = Field(
description="The period of the income statement. "
"Possible values are: "
"annual, quarterly, ttm. "
"Default is 'annual'.",
)
limit: int = Field(
description="The number of income statements to return. " "Default is 10.",
)
class IncomeStatements(BaseTool):
"""
Tool that gets income statements for a given ticker over a given period.
"""
mode: str = "get_income_statements"
name: str = "income_statements"
description: str = (
"A wrapper around financial datasets's Income Statements API. "
"This tool is useful for fetching income statements for a given ticker."
"The tool fetches income statements for a given ticker over a given period."
"The period can be annual, quarterly, or trailing twelve months (ttm)."
"The number of income statements to return can also be "
"specified using the limit parameter."
)
args_schema: Type[IncomeStatementsSchema] = IncomeStatementsSchema
api_wrapper: FinancialDatasetsAPIWrapper = Field(..., exclude=True)
def __init__(self, api_wrapper: FinancialDatasetsAPIWrapper):
super().__init__(api_wrapper=api_wrapper)
def _run(
self,
ticker: str,
period: str,
limit: Optional[int],
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> str:
"""Use the Income Statements API tool."""
return self.api_wrapper.run(
mode=self.mode,
ticker=ticker,
period=period,
limit=limit,
)

@ -0,0 +1,135 @@
"""
Util that calls several of financial datasets stock market REST APIs.
Docs: https://docs.financialdatasets.ai/
"""
import json
from typing import Any, List, Optional
import requests
from langchain_core.pydantic_v1 import BaseModel
from langchain_core.utils import get_from_dict_or_env
FINANCIAL_DATASETS_BASE_URL = "https://api.financialdatasets.ai/"
class FinancialDatasetsAPIWrapper(BaseModel):
"""Wrapper for financial datasets API."""
financial_datasets_api_key: Optional[str] = None
def __init__(self, **data: Any):
super().__init__(**data)
self.financial_datasets_api_key = get_from_dict_or_env(
data, "financial_datasets_api_key", "FINANCIAL_DATASETS_API_KEY"
)
def get_income_statements(
self,
ticker: str,
period: str,
limit: Optional[int],
) -> Optional[dict]:
"""
Get the income statements for a stock `ticker` over a `period` of time.
:param ticker: the stock ticker
:param period: the period of time to get the balance sheets for.
Possible values are: annual, quarterly, ttm.
:param limit: the number of results to return, default is 10
:return: a list of income statements
"""
url = (
f"{FINANCIAL_DATASETS_BASE_URL}financials/income-statements/"
f"?ticker={ticker}"
f"&period={period}"
f"&limit={limit if limit else 10}"
)
# Add the api key to the headers
headers = {"X-API-KEY": self.financial_datasets_api_key}
# Execute the request
response = requests.get(url, headers=headers)
data = response.json()
return data.get("income_statements", None)
def get_balance_sheets(
self,
ticker: str,
period: str,
limit: Optional[int],
) -> List[dict]:
"""
Get the balance sheets for a stock `ticker` over a `period` of time.
:param ticker: the stock ticker
:param period: the period of time to get the balance sheets for.
Possible values are: annual, quarterly, ttm.
:param limit: the number of results to return, default is 10
:return: a list of balance sheets
"""
url = (
f"{FINANCIAL_DATASETS_BASE_URL}financials/balance-sheets/"
f"?ticker={ticker}"
f"&period={period}"
f"&limit={limit if limit else 10}"
)
# Add the api key to the headers
headers = {"X-API-KEY": self.financial_datasets_api_key}
# Execute the request
response = requests.get(url, headers=headers)
data = response.json()
return data.get("balance_sheets", None)
def get_cash_flow_statements(
self,
ticker: str,
period: str,
limit: Optional[int],
) -> List[dict]:
"""
Get the cash flow statements for a stock `ticker` over a `period` of time.
:param ticker: the stock ticker
:param period: the period of time to get the balance sheets for.
Possible values are: annual, quarterly, ttm.
:param limit: the number of results to return, default is 10
:return: a list of cash flow statements
"""
url = (
f"{FINANCIAL_DATASETS_BASE_URL}financials/cash-flow-statements/"
f"?ticker={ticker}"
f"&period={period}"
f"&limit={limit if limit else 10}"
)
# Add the api key to the headers
headers = {"X-API-KEY": self.financial_datasets_api_key}
# Execute the request
response = requests.get(url, headers=headers)
data = response.json()
return data.get("cash_flow_statements", None)
def run(self, mode: str, ticker: str, **kwargs: Any) -> str:
if mode == "get_income_statements":
period = kwargs.get("period", "annual")
limit = kwargs.get("limit", 10)
return json.dumps(self.get_income_statements(ticker, period, limit))
elif mode == "get_balance_sheets":
period = kwargs.get("period", "annual")
limit = kwargs.get("limit", 10)
return json.dumps(self.get_balance_sheets(ticker, period, limit))
elif mode == "get_cash_flow_statements":
period = kwargs.get("period", "annual")
limit = kwargs.get("limit", 10)
return json.dumps(self.get_cash_flow_statements(ticker, period, limit))
else:
raise ValueError(f"Invalid mode {mode} for financial datasets API.")

@ -20,6 +20,7 @@ EXPECTED_ALL = [
"AzureCogsSpeech2TextTool",
"AzureCogsText2SpeechTool",
"AzureCogsTextAnalyticsHealthTool",
"BalanceSheets",
"BaseGraphQLTool",
"BaseRequestsTool",
"BaseSQLDatabaseTool",
@ -29,6 +30,7 @@ EXPECTED_ALL = [
"BingSearchResults",
"BingSearchRun",
"BraveSearch",
"CashFlowStatements",
"ClickTool",
"CogniswitchKnowledgeSourceFile",
"CogniswitchKnowledgeSourceURL",
@ -68,6 +70,7 @@ EXPECTED_ALL = [
"GoogleSerperRun",
"HumanInputRun",
"IFTTTWebhook",
"IncomeStatements",
"InfoPowerBITool",
"InfoSQLDatabaseTool",
"InfoSparkSQLTool",

Loading…
Cancel
Save