docs: add response metadata page to v0.2 docs (#21489)

Adding this page from v0.1 docs: https://python.langchain.com/v0.1/docs/modules/model_io/chat/response_metadata/
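
Reviewer note: a minimal sketch of why the page is worth porting, assuming only the dict shapes recorded in the notebook outputs in this diff. Token counts sit under a different key for each provider, so downstream code has to branch on the layout. `extract_token_counts` is a hypothetical helper written for illustration and is not part of LangChain; it handles only the layouts shown on the new page and returns `None` for anything else.

```python
from typing import Any, Dict, Optional, Tuple


def extract_token_counts(
    response_metadata: Dict[str, Any],
) -> Optional[Tuple[int, int]]:
    """Return (input_tokens, output_tokens), or None if the layout is unknown.

    Hypothetical helper, not a LangChain API. It only recognizes the key
    layouts that appear in the sample outputs of the new notebook.
    """
    # Layout seen in the OpenAI, MistralAI, Groq, TogetherAI and FireworksAI samples.
    usage = response_metadata.get("token_usage")
    if usage:
        return usage["prompt_tokens"], usage["completion_tokens"]

    # The Anthropic and Bedrock samples both use "usage", with different inner keys.
    usage = response_metadata.get("usage")
    if usage:
        if "input_tokens" in usage:
            return usage["input_tokens"], usage["output_tokens"]
        return usage["prompt_tokens"], usage["completion_tokens"]

    # Layout seen in the Google VertexAI sample.
    usage = response_metadata.get("usage_metadata")
    if usage:
        return usage["prompt_token_count"], usage["candidates_token_count"]

    return None


# Abridged copies of the outputs recorded in the notebook; no API calls are made.
openai_meta = {
    "token_usage": {"completion_tokens": 164, "prompt_tokens": 17, "total_tokens": 181},
    "model_name": "gpt-4-turbo",
}
anthropic_meta = {
    "model": "claude-3-sonnet-20240229",
    "usage": {"input_tokens": 17, "output_tokens": 296},
}
vertex_meta = {
    "is_blocked": False,
    "usage_metadata": {
        "prompt_token_count": 10,
        "candidates_token_count": 30,
        "total_token_count": 40,
    },
}
bedrock_meta = {
    "model_id": "anthropic.claude-v2",
    "usage": {"prompt_tokens": 19, "completion_tokens": 371, "total_tokens": 390},
}

for name, meta in [
    ("openai", openai_meta),
    ("anthropic", anthropic_meta),
    ("vertexai", vertex_meta),
    ("bedrock", bedrock_meta),
]:
    print(name, extract_token_counts(meta))
```

With a live model, the same helper would be called as `extract_token_counts(msg.response_metadata)`, where `msg` is the `AIMessage` returned by `llm.invoke(...)` as in the notebook cells.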
parent 13b01104c9
commit 81ae184cc9
5 changed files with 358 additions and 3 deletions
--- a/docs/docs/how_to/chat_token_usage_tracking.ipynb
+++ b/docs/docs/how_to/chat_token_usage_tracking.ipynb
@@ -25,7 +25,7 @@
   "source": [
    "## Using AIMessage.response_metadata\n",
    "\n",
-    "A number of model providers return token usage information as part of the chat generation response. When available, this is included in the [`AIMessage.response_metadata`](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.response_metadata) field. Here's an example with OpenAI:"
+    "A number of model providers return token usage information as part of the chat generation response. When available, this is included in the [`AIMessage.response_metadata`](/docs/how_to/response_metadata) field. Here's an example with OpenAI:"
   ]
  },
  {
--- a/docs/docs/how_to/index.mdx
+++ b/docs/docs/how_to/index.mdx
@@ -68,6 +68,7 @@ Chat Models are newer forms of language models that take messages in and output
 - [How to create a custom chat model class](/docs/how_to/custom_chat_model)
 - [How to stream a response back](/docs/how_to/chat_streaming)
 - [How to track token usage](/docs/how_to/chat_token_usage_tracking)
+- [How to track response metadata across providers](/docs/how_to/response_metadata)

 ### LLMs

--- a/docs/docs/how_to/response_metadata.ipynb
+++ b/docs/docs/how_to/response_metadata.ipynb
@@ -0,0 +1,354 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "6bd1219b-f31c-41b0-95e6-3204ad894ac7",
+   "metadata": {},
+   "source": [
+    "# Response metadata\n",
+    "\n",
+    "Many model providers include metadata in their chat generation responses. This metadata is available on the `AIMessage.response_metadata: Dict` attribute. Depending on the model provider and model configuration, it can contain information such as [token counts](/docs/how_to/chat_token_usage_tracking), [logprobs](/docs/how_to/logprobs), and more.\n",
+    "\n",
+    "Here's what the response metadata looks like for a few different providers:\n",
+    "\n",
+    "## OpenAI"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "161f5898-9976-4a75-943d-03eda1a40a60",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'token_usage': {'completion_tokens': 164,\n",
+       "  'prompt_tokens': 17,\n",
+       "  'total_tokens': 181},\n",
+       " 'model_name': 'gpt-4-turbo',\n",
+       " 'system_fingerprint': 'fp_76f018034d',\n",
+       " 'finish_reason': 'stop',\n",
+       " 'logprobs': None}"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_openai import ChatOpenAI\n",
+    "\n",
+    "llm = ChatOpenAI(model=\"gpt-4-turbo\")\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "98eab683-df03-44a1-a034-ebbe7c6851b6",
+   "metadata": {},
+   "source": [
+    "## Anthropic"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "61c43496-83b5-4d71-bd60-3e6d46c62a5e",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'id': 'msg_01CzQyD7BX8nkhDNfT1QqvEp',\n",
+       " 'model': 'claude-3-sonnet-20240229',\n",
+       " 'stop_reason': 'end_turn',\n",
+       " 'stop_sequence': None,\n",
+       " 'usage': {'input_tokens': 17, 'output_tokens': 296}}"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_anthropic import ChatAnthropic\n",
+    "\n",
+    "llm = ChatAnthropic(model=\"claude-3-sonnet-20240229\")\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c1f24f69-18f6-43c1-8b26-3f88ec515259",
+   "metadata": {},
+   "source": [
+    "## Google VertexAI"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "39549336-25f5-4839-9846-f687cd77e59b",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'is_blocked': False,\n",
+       " 'safety_ratings': [{'category': 'HARM_CATEGORY_HATE_SPEECH',\n",
+       "   'probability_label': 'NEGLIGIBLE',\n",
+       "   'blocked': False},\n",
+       "  {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT',\n",
+       "   'probability_label': 'NEGLIGIBLE',\n",
+       "   'blocked': False},\n",
+       "  {'category': 'HARM_CATEGORY_HARASSMENT',\n",
+       "   'probability_label': 'NEGLIGIBLE',\n",
+       "   'blocked': False},\n",
+       "  {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT',\n",
+       "   'probability_label': 'NEGLIGIBLE',\n",
+       "   'blocked': False}],\n",
+       " 'citation_metadata': None,\n",
+       " 'usage_metadata': {'prompt_token_count': 10,\n",
+       "  'candidates_token_count': 30,\n",
+       "  'total_token_count': 40}}"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_google_vertexai import ChatVertexAI\n",
+    "\n",
+    "llm = ChatVertexAI(model=\"gemini-pro\")\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bc4ef8bb-eee3-4266-b530-0af9b3b79fe9",
+   "metadata": {},
+   "source": [
+    "## Bedrock (Anthropic)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "1e4ac668-4c6a-48ad-9a6f-7b291477b45d",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'model_id': 'anthropic.claude-v2',\n",
+       " 'usage': {'prompt_tokens': 19, 'completion_tokens': 371, 'total_tokens': 390}}"
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_aws import ChatBedrock\n",
+    "\n",
+    "llm = ChatBedrock(model_id=\"anthropic.claude-v2\")\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ee040d15-5575-4309-a9e9-aed5a09c78e3",
+   "metadata": {},
+   "source": [
+    "## MistralAI"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "deb41321-52d0-4795-a40c-4a811a13d7b0",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'token_usage': {'prompt_tokens': 19,\n",
+       "  'total_tokens': 141,\n",
+       "  'completion_tokens': 122},\n",
+       " 'model': 'mistral-small',\n",
+       " 'finish_reason': 'stop'}"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_mistralai import ChatMistralAI\n",
+    "\n",
+    "llm = ChatMistralAI()\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "297c7be4-9505-48ac-96c0-4dc2047cfe7f",
+   "metadata": {},
+   "source": [
+    "## Groq"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "744e14ec-ff50-4642-9893-ff7bdf8927ff",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'token_usage': {'completion_time': 0.243,\n",
+       "  'completion_tokens': 132,\n",
+       "  'prompt_time': 0.022,\n",
+       "  'prompt_tokens': 22,\n",
+       "  'queue_time': None,\n",
+       "  'total_time': 0.265,\n",
+       "  'total_tokens': 154},\n",
+       " 'model_name': 'mixtral-8x7b-32768',\n",
+       " 'system_fingerprint': 'fp_7b44c65f25',\n",
+       " 'finish_reason': 'stop',\n",
+       " 'logprobs': None}"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_groq import ChatGroq\n",
+    "\n",
+    "llm = ChatGroq()\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7cdeec00-8a8f-422a-8819-47c646578b65",
+   "metadata": {},
+   "source": [
+    "## TogetherAI"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "a984118e-a731-4864-bcea-7dc6c6b3d139",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'token_usage': {'completion_tokens': 208,\n",
+       "  'prompt_tokens': 20,\n",
+       "  'total_tokens': 228},\n",
+       " 'model_name': 'mistralai/Mixtral-8x7B-Instruct-v0.1',\n",
+       " 'system_fingerprint': None,\n",
+       " 'finish_reason': 'eos',\n",
+       " 'logprobs': None}"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from langchain_openai import ChatOpenAI\n",
+    "\n",
+    "llm = ChatOpenAI(\n",
+    "    base_url=\"https://api.together.xyz/v1\",\n",
+    "    api_key=os.environ[\"TOGETHER_API_KEY\"],\n",
+    "    model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n",
+    ")\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3d5e0614-8dc2-4948-a0b5-dc76c7837a5a",
+   "metadata": {},
+   "source": [
+    "## FireworksAI"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 31,
+   "id": "6ae32a93-26db-41bb-95c2-38ddd5085fbe",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'token_usage': {'prompt_tokens': 19,\n",
+       "  'total_tokens': 219,\n",
+       "  'completion_tokens': 200},\n",
+       " 'model_name': 'accounts/fireworks/models/mixtral-8x7b-instruct',\n",
+       " 'system_fingerprint': '',\n",
+       " 'finish_reason': 'length',\n",
+       " 'logprobs': None}"
+      ]
+     },
+     "execution_count": 31,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_fireworks import ChatFireworks\n",
+    "\n",
+    "llm = ChatFireworks(model=\"accounts/fireworks/models/mixtral-8x7b-instruct\")\n",
+    "msg = llm.invoke([(\"human\", \"What's the oldest known example of cuneiform\")])\n",
+    "msg.response_metadata"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "poetry-venv-2",
+   "language": "python",
+   "name": "poetry-venv-2"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
--- a/docs/docs/integrations/chat/huggingface.ipynb
+++ b/docs/docs/integrations/chat/huggingface.ipynb
@@ -10,7 +10,7 @@
    "\n",
    "In particular, we will:\n",
    "1. Utilize the [HuggingFaceTextGenInference](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_text_gen_inference.py), [HuggingFaceEndpoint](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_endpoint.py), or [HuggingFaceHub](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_hub.py) integrations to instantiate an `LLM`.\n",
-    "2. Utilize the `ChatHuggingFace` class to enable any of these LLMs to interface with LangChain's [Chat Messages](docs/concepts#chat-models) abstraction.\n",
+    "2. Utilize the `ChatHuggingFace` class to enable any of these LLMs to interface with LangChain's [Chat Messages](/docs/concepts#chat-models) abstraction.\n",
    "3. Demonstrate how to use an open-source LLM to power an `ChatAgent` pipeline\n",
    "\n",
    "\n",
--- a/docs/docs/tutorials/graph.ipynb
+++ b/docs/docs/tutorials/graph.ipynb
@@ -300,7 +300,7 @@
    "For more complex query-generation, we may want to create few-shot prompts or add query-checking steps. For advanced techniques like this and more check out:\n",
    "\n",
    "* [Prompting strategies](/docs/how_to/graph_prompting): Advanced prompt engineering techniques.\n",
-    "* [Mapping values](docs/how_to/graph_mapping): Techniques for mapping values from questions to database.\n",
+    "* [Mapping values](/docs/how_to/graph_mapping): Techniques for mapping values from questions to database.\n",
    "* [Semantic layer](/docs/how_to/graph_semantic): Techniques for implementing semantic layers.\n",
    "* [Constructing graphs](/docs/how_to/graph_constructing): Techniques for constructing knowledge graphs."
   ]