{
"cells": [
{
"cell_type": "markdown",
"id": "6db54519-b98e-47c2-8dc2-600f6140a3aa",
"metadata": {},
"source": [
"## Tool Use with Anthropic API for structured outputs\n",
"\n",
"Anthropic API recently added tool use.\n",
"\n",
"This is very useful for structured output."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8990ec23-8ae1-4580-b220-4b00c05637d2",
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain-anthropic"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6b966914-502b-499c-a4cf-e390106dd506",
"metadata": {},
"outputs": [],
"source": [
"# Optional\n",
"import os\n",
"# os.environ['LANGCHAIN_TRACING_V2'] = 'true' # enables tracing\n",
"# os.environ['LANGCHAIN_API_KEY'] = <your-api-key>"
]
},
{
"attachments": {
"83c97bfe-b9b2-48ef-95cf-06faeebaa048.png": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAACAIAAAIqCAYAAAC0FXoTAAAMP2lDQ1BJQ0MgUHJvZmlsZQAASImVVwdYU8kWnluSkEBCCSAgJfQmCEgJICWEFkB6EWyEJEAoMQaCiB1dVHDtYgEbuiqi2AGxI3YWwd4XRRSUdbFgV96kgK77yvfO9829//3nzH/OnDu3DADqp7hicQ6qAUCuKF8SGxLAGJucwiB1AwTggAYIgMDl5YlZ0dERANrg+e/27ib0hnbNQab1z/7/app8QR4PACQa4jR+Hi8X4kMA4JU8sSQfAKKMN5+aL5Zh2IC2BCYI8UIZzlDgShlOU+B9cp/4WDbEzQCoqHG5kgwAaG2QZxTwMqAGrQ9iJxFfKAJAnQGxb27uZD7EqRDbQB8xxDJ9ZtoPOhl/00wb0uRyM4awYi5yUwkU5olzuNP+z3L8b8vNkQ7GsIJNLVMSGiubM6zb7ezJ4TKsBnGvKC0yCmItiD8I+XJ/iFFKpjQ0QeGPGvLy2LBmQBdiJz43MBxiQ4iDRTmREUo+LV0YzIEYrhC0UJjPiYdYD+KFgrygOKXPZsnkWGUstC5dwmYp+QtciTyuLNZDaXYCS6n/OlPAUepjtKLM+CSIKRBbFAgTIyGmQeyYlx0XrvQZXZTJjhz0kUhjZflbQBwrEIUEKPSxgnRJcKzSvzQ3b3C+2OZMISdSiQ/kZ8aHKuqDNfO48vzhXLA2gYiVMKgjyBsbMTgXviAwSDF3rFsgSohT6nwQ5wfEKsbiFHFOtNIfNxPkhMh4M4hd8wrilGPxxHy4IBX6eLo4PzpekSdelMUNi1bkgy8DEYANAgEDSGFLA5NBFhC29tb3witFTzDgAgnIAALgoGQGRyTJe0TwGAeKwJ8QCUDe0LgAea8AFED+6xCrODqAdHlvgXxENngKcS4IBznwWiofJRqKlgieQEb4j+hc2Hgw3xzYZP3/nh9kvzMsyEQoGelgRIb6oCcxiBhIDCUGE21xA9wX98Yj4NEfNheciXsOzuO7P+EpoZ3wmHCD0EG4M0lYLPkpyzGgA+oHK2uR9mMtcCuo6YYH4D5QHSrjurgBcMBdYRwW7gcju0GWrcxbVhXGT9p/m8EPd0PpR3Yio+RhZH+yzc8jaXY0tyEVWa1/rI8i17SherOHen6Oz/6h+nx4Dv/ZE1uIHcTOY6exi9gxrB4wsJNYA9aCHZfhodX1RL66BqPFyvPJhjrCf8QbvLOySuY51Tj1OH1R9OULCmXvaMCeLJ4mEWZk5jNY8IsgYHBEPMcRDBcnF1cAZN8XxevrTYz8u4Hotnzn5v0BgM/JgYGBo9+5sJMA7PeAj/+R75wNE346VAG4cIQnlRQoOFx2IMC3hDp80vSBMTAHNnA+LsAdeAN/EATCQBSIB8lgIsw+E65zCZgKZoC5oASUgWVgNVgPNoGtYCfYAw6AenAMnAbnwGXQBm6Ae3D1dIEXoA+8A58RBCEhVISO6CMmiCVij7ggTMQXCUIikFgkGUlFMhARIkVmIPOQMmQFsh7ZglQj+5EjyGnkItKO3EEeIT3Ia+QTiqFqqDZqhFqhI1EmykLD0Xh0ApqBTkGL0PnoEnQtWoXuRuvQ0+hl9Abagb5A+zGAqWK6mCnmgDExNhaFpWDpmASbhZVi5VgVVos1wvt8DevAerGPOBGn4wzcAa7gUDwB5+FT8Fn4Ynw9vhOvw5vxa/gjvA//RqASDAn2BC8ChzCWkEGYSighlBO2Ew4TzsJnqYvwjkgk6hKtiR7wWUwmZhGnExcTNxD3Ek8R24mdxH4SiaRPsif5kKJIXFI+qYS0jrSbdJJ0ldRF+qCiqmKi4qISrJKiIlIpVilX2aVyQuWqyjOVz2QNsiXZixxF5pOnkZeSt5EbyVfIXeTPFE2KNcWHEk/JosylrKXUUs5S7lPeqKqqmql6qsaoClXnqK5V3ad6QfWR6kc1LTU7NbbaeDWp2hK1HWqn1O6ovaFSqVZUf2oKNZ+6hFpNPUN9SP1Ao9McaRwanzabVkGro12lvVQnq1uqs9Qnqhepl6sfVL+i3qtB1rDSYGtwNWZpVGgc0bil0a9J13TWjNLM1VysuUvzoma3FknLSitIi681X2ur1hmtTjpGN6ez6Tz6PPo2+ll6lzZR21qbo52lXaa9R7tVu09HS8dVJ1GnUKdC57hOhy6ma6XL0c3RXap7QPem7qdhRsNYwwTDFg2rHXZ12Hu94Xr+egK9Ur29ejf0Pukz9IP0s/WX69frPzDADewMYgymGmw0OGvQO1x7uPdw3vDS4QeG3zVEDe0MYw2nG241bDHsNzI2CjESG60zOmPUa6xr7G+cZbzK+IRxjwndxNdEaLLK5KTJc4YOg8XIYaxlNDP6TA1NQ02lpltMW00/m1mbJZgVm+01e2BOMWeap5uvMm8y77MwsRhjMcOixuKuJdmSaZlpucbyvOV7K2urJKsFVvVW3dZ61hzrIusa6/s2VBs/myk2VTbXbYm2TNts2w22bXaonZtdpl2F3RV71N7dXmi/wb59BGGE5wjRiKoRtxzUHFgOBQ41Do8cdR0jHIsd6x1fjrQYmTJy+cjzI785uTnlOG1zuues5RzmXOzc6Pzaxc6F51Lhcn0UdVTwqNmjGka9crV3FbhudL3tRncb47bArcntq7uHu8S91r3Hw8Ij1aPS4xZTmxnNXMy84EnwDPCc7XnM86OXu1e+1wGvv7wdvLO9d3l3j7YeLRi9bXSnj5kP12eLT4cvwzfVd7Nvh5+pH9evyu+xv7k/33+7/zOWLSuLtZv1MsApQBJwOOA924s9k30qEAsMCSwNbA3SCkoIWh/0MNgsOCO4JrgvxC1kesipUEJoeOjy0FscIw6PU83pC/MImxnWHK4WHhe+PvxxhF2EJKJxDDombMzKMfcjLSNFkfVRIIoTtTLqQbR19JToozHEmOiYipinsc6xM2LPx9HjJsXtinsXHxC/NP5egk2CNKEpUT1xfGJ14vukwKQVSR1jR46dOfZyskGyMLkhhZSSmLI9pX9c0LjV47rGu40vGX9zgvWEwgkXJxpMzJl4fJL6JO6kg6mE1KTUXalfuFHcKm5/GietMq2Px+at4b3g+/NX8XsEPoIVgmfpPukr0rszfDJWZvRk+mWWZ/YK2cL1wldZoVmbst5nR2XvyB7IScrZm6uSm5p7RKQlyhY1TzaeXDi5XWwvLhF3TPGasnpKnyRcsj0PyZuQ15CvDX/kW6Q20l+kjwp8CyoKPkxNnHqwULNQVNgyzW7aomnPioKLfpuOT+dNb5phOmPujEczWTO3zEJmpc1qmm0+e/7srjkhc3bOpczNnvt7sVPxiuK385LmNc43mj9nfucvIb/UlNBKJCW3Fngv2LQQXyhc2Lpo1KJ1i76V8ksvlTmVlZd9WcxbfOlX51/X/jqwJH1J61L3pRuXEZeJlt1c7rd85wrNFUUrOleOWVm3irGqdNXb1ZNWXyx3Ld+0hrJGuqZjbcTahnUW65at+7I+c/2NioCKvZWGlYsq32/gb7i60X9j7SajTWWbP
m0Wbr69JWRLXZVVVflW4taCrU+3JW47/xvzt+rtBtvLtn/dIdrRsTN2Z3O1R3X1LsNdS2vQGmlNz+7xu9v2BO5pqHWo3bJXd2/ZPrBPuu/5/tT9Nw+EH2g6yDxYe8jyUOVh+uHSOqRuWl1ffWZ9R0NyQ/uRsCNNjd6Nh486Ht1xzPRYxXGd40tPUE7MPzFwsuhk/ynxqd7TGac7myY13Tsz9sz15pjm1rPhZy+cCz535jzr/MkLPheOXfS6eOQS81L9ZffLdS1uLYd/d/v9cKt7a90VjysNbZ5tje2j209c9bt6+lrgtXPXOdcv34i80X4z4ebtW+Nvddzm3+6+k3Pn1d2Cu5/vzblPuF/6QONB+UPDh1V/2P6xt8O94/ijwEctj+Me3+vkdb54kvfkS9f8p9Sn5c9MnlV3u3Qf6wnuaXs+7nnXC/GLz70lf2r+WfnS5uWhv/z/aukb29f1SvJq4PXiN/pvdrx1fdvUH93/8F3uu8/vSz/of9j5kfnx/KekT88+T/1C+rL2q+3Xxm/h
}
},
"cell_type": "markdown",
"id": "81897f2b-5936-4fa0-9445-3edb3af22da7",
"metadata": {},
"source": [
"`How can we use tools to produce structured output?`\n",
"\n",
"Function call / tool use just generates a payload.\n",
"\n",
"Payload often a JSON string, which can be pass to an API or, in this case, a parser to produce structured output.\n",
"\n",
"LangChain has `llm.with_structured_output(schema)` to make it very easy to produce structured output that matches `schema`.\n",
"\n",
"![Screenshot 2024-04-03 at 10.16.57 PM.png](attachment:83c97bfe-b9b2-48ef-95cf-06faeebaa048.png)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9caa2aaf-1918-4a8a-982d-f8052b92ed44",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"from langchain_anthropic import ChatAnthropic\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"# Data model\n",
"class code(BaseModel):\n",
" \"\"\"Code output\"\"\"\n",
"\n",
" prefix: str = Field(description=\"Description of the problem and approach\")\n",
" imports: str = Field(description=\"Code block import statements\")\n",
" code: str = Field(description=\"Code block not including import statements\")\n",
"\n",
"\n",
"# LLM\n",
"llm = ChatAnthropic(\n",
" model=\"claude-3-opus-20240229\",\n",
" default_headers={\"anthropic-beta\": \"tools-2024-04-04\"},\n",
")\n",
"\n",
"# Structured output, including raw will capture raw output and parser errors\n",
"structured_llm = llm.with_structured_output(code, include_raw=True)\n",
"code_output = structured_llm.invoke(\n",
" \"Write a python program that prints the string 'hello world' and tell me how it works in a sentence\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9025bfdc-6060-4042-9a61-4e361dda7087",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'text': \"<thinking>\\nThe tool 'code' is relevant for writing a Python program to print a string.\\n\\nTo use the 'code' tool, I need values for these required parameters:\\nprefix: A description of the problem and approach. I can provide this based on the request.\\nimports: The import statements needed for the code. For this simple program, no imports are needed, so I can leave this blank.\\ncode: The actual Python code, not including imports. I can write a simple print statement to output the string.\\n\\nI have all the required parameters, so I can proceed with calling the 'code' tool.\\n</thinking>\",\n",
" 'type': 'text'}"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Initial reasoning stage\n",
"code_output[\"raw\"].content[0]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2393d9b6-67a2-41ea-ac01-dc038b4800f5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'text': None,\n",
" 'type': 'tool_use',\n",
" 'id': 'toolu_01UwZVQub6vL36wiBww6CU7a',\n",
" 'name': 'code',\n",
" 'input': {'prefix': \"To print the string 'hello world' in Python:\",\n",
" 'imports': '',\n",
" 'code': \"print('hello world')\"}}"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Tool call\n",
"code_output[\"raw\"].content[1]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f4f390ac-fbda-4173-892a-ffd12844228c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'prefix': \"To print the string 'hello world' in Python:\",\n",
" 'imports': '',\n",
" 'code': \"print('hello world')\"}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# JSON str\n",
"code_output[\"raw\"].content[1][\"input\"]"
]
},
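{
"cell_type": "markdown",
"id": "3c1d9a2e-5b7f-4e88-9c3a-1f2b6d4a8e01",
"metadata": {},
"source": [
"This `input` dict is the entire payload carried by the tool call. As a minimal sketch of the payload-to-schema step (this is the idea behind `with_structured_output`, not its exact implementation), we can validate the payload against the `code` model by hand:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a4e2b90-6c1d-4f3a-8e5b-2d9c0a1b3f45",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch: validate the raw tool-use payload against the `code` schema by hand.\n",
"# `with_structured_output` handles this (plus error capture) for us; this just shows\n",
"# that the structured output is the validated tool-call payload.\n",
"tool_call = code_output[\"raw\"].content[1]  # the 'tool_use' content block shown above\n",
"manual_parse = code(**tool_call[\"input\"])  # pydantic validation of the payload\n",
"manual_parse.code"
]
},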
{
"cell_type": "code",
"execution_count": 5,
"id": "ba77d0f8-f79b-4656-9023-085ffdaf35f5",
"metadata": {},
"outputs": [],
"source": [
"# Error\n",
"error = code_output[\"parsing_error\"]\n",
"error"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "cd854451-68d7-43df-bcae-4f3c3565536a",
"metadata": {},
"outputs": [],
"source": [
"# Result\n",
"parsed_result = code_output[\"parsed\"]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "47b3405f-0aea-460e-8603-f6092019fcd4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"To print the string 'hello world' in Python:\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"parsed_result.prefix"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "85b16b62-1b72-4b6e-81fa-b1d707b728fa",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"''"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"parsed_result.imports"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "23857441-3e67-460c-b6be-b57cf0dd17ad",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"print('hello world')\""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"parsed_result.code"
]
},
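{
"cell_type": "markdown",
"id": "5e8f1c27-9a3b-4d60-b1e4-6c7d2f0a9b34",
"metadata": {},
"source": [
"Because the schema separates the imports from the code body, the two fields can be stitched back together and run. A small sketch (only `exec` model-generated code you trust; here it is just a print statement):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2b7f4a1-3e59-4c08-a6d1-8f0b5c2e7a19",
"metadata": {},
"outputs": [],
"source": [
"# Reassemble the structured fields into one runnable snippet.\n",
"# exec() is used purely for illustration -- only run model-generated code you trust.\n",
"full_script = parsed_result.imports + \"\\n\" + parsed_result.code\n",
"exec(full_script)  # prints: hello world"
]
},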
{
"attachments": {
"bb6c7126-7667-433f-ba50-56107b0341bd.png": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABEMAAAIyCAYAAAAg8gS8AAAMP2lDQ1BJQ0MgUHJvZmlsZQAASImVVwdYU8kWnluSkEBCCSAgJfQmCEgJICWEFkB6EWyEJEAoMQaCiB1dVHDtYgEbuiqi2AGxI3YWwd4XRRSUdbFgV96kgK77yvfO9829//3nzH/OnDu3DADqp7hicQ6qAUCuKF8SGxLAGJucwiB1AwTggAYIgMDl5YlZ0dERANrg+e/27ib0hnbNQab1z/7/app8QR4PACQa4jR+Hi8X4kMA4JU8sSQfAKKMN5+aL5Zh2IC2BCYI8UIZzlDgShlOU+B9cp/4WDbEzQCoqHG5kgwAaG2QZxTwMqAGrQ9iJxFfKAJAnQGxb27uZD7EqRDbQB8xxDJ9ZtoPOhl/00wb0uRyM4awYi5yUwkU5olzuNP+z3L8b8vNkQ7GsIJNLVMSGiubM6zb7ezJ4TKsBnGvKC0yCmItiD8I+XJ/iFFKpjQ0QeGPGvLy2LBmQBdiJz43MBxiQ4iDRTmREUo+LV0YzIEYrhC0UJjPiYdYD+KFgrygOKXPZsnkWGUstC5dwmYp+QtciTyuLNZDaXYCS6n/OlPAUepjtKLM+CSIKRBbFAgTIyGmQeyYlx0XrvQZXZTJjhz0kUhjZflbQBwrEIUEKPSxgnRJcKzSvzQ3b3C+2OZMISdSiQ/kZ8aHKuqDNfO48vzhXLA2gYiVMKgjyBsbMTgXviAwSDF3rFsgSohT6nwQ5wfEKsbiFHFOtNIfNxPkhMh4M4hd8wrilGPxxHy4IBX6eLo4PzpekSdelMUNi1bkgy8DEYANAgEDSGFLA5NBFhC29tb3witFTzDgAgnIAALgoGQGRyTJe0TwGAeKwJ8QCUDe0LgAea8AFED+6xCrODqAdHlvgXxENngKcS4IBznwWiofJRqKlgieQEb4j+hc2Hgw3xzYZP3/nh9kvzMsyEQoGelgRIb6oCcxiBhIDCUGE21xA9wX98Yj4NEfNheciXsOzuO7P+EpoZ3wmHCD0EG4M0lYLPkpyzGgA+oHK2uR9mMtcCuo6YYH4D5QHSrjurgBcMBdYRwW7gcju0GWrcxbVhXGT9p/m8EPd0PpR3Yio+RhZH+yzc8jaXY0tyEVWa1/rI8i17SherOHen6Oz/6h+nx4Dv/ZE1uIHcTOY6exi9gxrB4wsJNYA9aCHZfhodX1RL66BqPFyvPJhjrCf8QbvLOySuY51Tj1OH1R9OULCmXvaMCeLJ4mEWZk5jNY8IsgYHBEPMcRDBcnF1cAZN8XxevrTYz8u4Hotnzn5v0BgM/JgYGBo9+5sJMA7PeAj/+R75wNE346VAG4cIQnlRQoOFx2IMC3hDp80vSBMTAHNnA+LsAdeAN/EATCQBSIB8lgIsw+E65zCZgKZoC5oASUgWVgNVgPNoGtYCfYAw6AenAMnAbnwGXQBm6Ae3D1dIEXoA+8A58RBCEhVISO6CMmiCVij7ggTMQXCUIikFgkGUlFMhARIkVmIPOQMmQFsh7ZglQj+5EjyGnkItKO3EEeIT3Ia+QTiqFqqDZqhFqhI1EmykLD0Xh0ApqBTkGL0PnoEnQtWoXuRuvQ0+hl9Abagb5A+zGAqWK6mCnmgDExNhaFpWDpmASbhZVi5VgVVos1wvt8DevAerGPOBGn4wzcAa7gUDwB5+FT8Fn4Ynw9vhOvw5vxa/gjvA//RqASDAn2BC8ChzCWkEGYSighlBO2Ew4TzsJnqYvwjkgk6hKtiR7wWUwmZhGnExcTNxD3Ek8R24mdxH4SiaRPsif5kKJIXFI+qYS0jrSbdJJ0ldRF+qCiqmKi4qISrJKiIlIpVilX2aVyQuWqyjOVz2QNsiXZixxF5pOnkZeSt5EbyVfIXeTPFE2KNcWHEk/JosylrKXUUs5S7lPeqKqqmql6qsaoClXnqK5V3ad6QfWR6kc1LTU7NbbaeDWp2hK1HWqn1O6ovaFSqVZUf2oKNZ+6hFpNPUN9SP1Ao9McaRwanzabVkGro12lvVQnq1uqs9Qnqhepl6sfVL+i3qtB1rDSYGtwNWZpVGgc0bil0a9J13TWjNLM1VysuUvzoma3FknLSitIi681X2ur1hmtTjpGN6ez6Tz6PPo2+ll6lzZR21qbo52lXaa9R7tVu09HS8dVJ1GnUKdC57hOhy6ma6XL0c3RXap7QPem7qdhRsNYwwTDFg2rHXZ12Hu94Xr+egK9Ur29ejf0Pukz9IP0s/WX69frPzDADewMYgymGmw0OGvQO1x7uPdw3vDS4QeG3zVEDe0MYw2nG241bDHsNzI2CjESG60zOmPUa6xr7G+cZbzK+IRxjwndxNdEaLLK5KTJc4YOg8XIYaxlNDP6TA1NQ02lpltMW00/m1mbJZgVm+01e2BOMWeap5uvMm8y77MwsRhjMcOixuKuJdmSaZlpucbyvOV7K2urJKsFVvVW3dZ61hzrIusa6/s2VBs/myk2VTbXbYm2TNts2w22bXaonZtdpl2F3RV71N7dXmi/wb59BGGE5wjRiKoRtxzUHFgOBQ41Do8cdR0jHIsd6x1fjrQYmTJy+cjzI785uTnlOG1zuues5RzmXOzc6Pzaxc6F51Lhcn0UdVTwqNmjGka9crV3FbhudL3tRncb47bArcntq7uHu8S91r3Hw8Ij1aPS4xZTmxnNXMy84EnwDPCc7XnM86OXu1e+1wGvv7wdvLO9d3l3j7YeLRi9bXSnj5kP12eLT4cvwzfVd7Nvh5+pH9evyu+xv7k/33+7/zOWLSuLtZv1MsApQBJwOOA924s9k30qEAsMCSwNbA3SCkoIWh/0MNgsOCO4JrgvxC1kesipUEJoeOjy0FscIw6PU83pC/MImxnWHK4WHhe+PvxxhF2EJKJxDDombMzKMfcjLSNFkfVRIIoTtTLqQbR19JToozHEmOiYipinsc6xM2LPx9HjJsXtinsXHxC/NP5egk2CNKEpUT1xfGJ14vukwKQVSR1jR46dOfZyskGyMLkhhZSSmLI9pX9c0LjV47rGu40vGX9zgvWEwgkXJxpMzJl4fJL6JO6kg6mE1KTUXalfuFHcKm5/GietMq2Px+at4b3g+/NX8XsEPoIVgmfpPukr0rszfDJWZvRk+mWWZ/YK2cL1wldZoVmbst5nR2XvyB7IScrZm6uSm5p7RKQlyhY1TzaeXDi5XWwvLhF3TPGasnpKnyRcsj0PyZuQ15CvDX/kW6Q20l+kjwp8CyoKPkxNnHqwULNQVNgyzW7aomnPioKLfpuOT+dNb5phOmPujEczWTO3zEJmpc1qmm0+e/7srjkhc3bOpczNnvt7sVPxiuK385LmNc43mj9nfucvIb/UlNBKJCW3Fngv2LQQXyhc2Lpo1KJ1i76V8ksvlTmVlZd9WcxbfOlX51/X/jqwJH1J61L3pRuXEZeJlt1c7rd85wrNFUUrOleOWVm3irGqdNXb1ZNWXyx3Ld+0hrJGuqZjbcTahnUW65at+7I+c/2NioCKvZWGlYsq32/gb7i60X9j7SajTWWbP
m0Wbr69JWRLXZVVVflW4taCrU+3JW47/xvzt+rtBtvLtn/dIdrRsTN2Z3O1R3X1LsNdS2vQGmlNz+7xu9v2BO5pqHWo3bJXd2/ZPrBPuu/5/tT9Nw+EH2g6yDxYe8jyUOVh+uHSOqRuWl1ffWZ9R0NyQ/uRsCNNjd6Nh486Ht1xzPRYxXGd40tPUE7MPzFwsuhk/ynxqd7TGac7myY13Tsz9sz15pjm1rPhZy+cCz535jzr/MkLPheOXfS6eOQS81L9ZffLdS1uLYd/d/v9cKt7a90VjysNbZ5tje2j209c9bt6+lrgtXPXOdcv34i80X4z4ebtW+Nvddzm3+6+k3Pn1d2Cu5/vzblPuF/6QONB+UPDh1V/2P6xt8O94/ijwEctj+Me3+vkdb54kvfkS9f8p9Sn5c9MnlV3u3Qf6wnuaXs+7nnXC/GLz70lf2r+WfnS5uWhv/z/aukb29f1SvJq4PXiN/pvdrx1fdvUH93/8F3uu8/vSz/of9j5kfnx/KekT88+T/1C+rL2q+3Xxm/h
}
},
"cell_type": "markdown",
"id": "74b6c1f0-db28-4b43-ac31-92636dea7b56",
"metadata": {},
"source": [
"## More challenging example\n",
"\n",
"Motivating example for tool use / structured outputs.\n",
"\n",
"![code-gen.png](attachment:bb6c7126-7667-433f-ba50-56107b0341bd.png)"
]
},
{
"cell_type": "markdown",
"id": "8f387528-6535-4bc0-a2a6-8480ccf35394",
"metadata": {},
"source": [
"Here are some docs that we want to answer code questions about."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "97dd1b8c-724a-436a-88b1-b38204fc81f5",
"metadata": {},
"outputs": [],
"source": [
"from bs4 import BeautifulSoup as Soup\n",
"from langchain_community.document_loaders.recursive_url_loader import RecursiveUrlLoader\n",
"\n",
"# LCEL docs\n",
"url = \"https://python.langchain.com/docs/expression_language/\"\n",
"loader = RecursiveUrlLoader(\n",
" url=url, max_depth=20, extractor=lambda x: Soup(x, \"html.parser\").text\n",
")\n",
"docs = loader.load()\n",
"\n",
"# Sort the list based on the URLs and get the text\n",
"d_sorted = sorted(docs, key=lambda x: x.metadata[\"source\"])\n",
"d_reversed = list(reversed(d_sorted))\n",
"concatenated_content = \"\\n\\n\\n --- \\n\\n\\n\".join(\n",
" [doc.page_content for doc in d_reversed]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "5205cd42-8673-4699-9bb4-2cf90bfe098c",
"metadata": {},
"source": [
"Problem:\n",
"\n",
"`What if we want to enforce tool use?`\n",
"\n",
"We can use fallbacks.\n",
"\n",
"Let's select a code gen prompt that -- from some of my testing -- does not correctly invoke the tool.\n",
"\n",
"We can see if we can correct from this."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "94e77be5-dddb-4386-b523-6f1136150bbd",
"metadata": {},
"outputs": [],
"source": [
"# This code gen prompt invokes tool use\n",
"code_gen_prompt_working = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"\"\"<instructions> You are a coding assistant with expertise in LCEL, LangChain expression language. \\n \n",
" Here is the LCEL documentation: \\n ------- \\n {context} \\n ------- \\n Answer the user question based on the \\n \n",
" above provided documentation. Ensure any code you provide can be executed with all required imports and variables \\n\n",
" defined. Structure your answer: 1) a prefix describing the code solution, 2) the imports, 3) the functioning code block. \\n\n",
" Invoke the code tool to structure the output correctly. </instructions> \\n Here is the user question:\"\"\",\n",
" ),\n",
" (\"placeholder\", \"{messages}\"),\n",
" ]\n",
")\n",
"\n",
"# This code gen prompt does not invoke tool use\n",
"code_gen_prompt_bad = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"\"\"You are a coding assistant with expertise in LCEL, LangChain expression language. \\n \n",
" Here is a full set of LCEL documentation: \\n ------- \\n {context} \\n ------- \\n Answer the user \n",
" question based on the above provided documentation. Ensure any code you provide can be executed \\n \n",
" with all required imports and variables defined. Structure your answer with a description of the code solution. \\n\n",
" Then list the imports. And finally list the functioning code block. Here is the user question:\"\"\",\n",
" ),\n",
" (\"placeholder\", \"{messages}\"),\n",
" ]\n",
")\n",
"\n",
"\n",
"# Data model\n",
"class code(BaseModel):\n",
" \"\"\"Code output\"\"\"\n",
"\n",
" prefix: str = Field(description=\"Description of the problem and approach\")\n",
" imports: str = Field(description=\"Code block import statements\")\n",
" code: str = Field(description=\"Code block not including import statements\")\n",
" description = \"Schema for code solutions to questions about LCEL.\"\n",
"\n",
"\n",
"# LLM\n",
"llm = ChatAnthropic(\n",
" model=\"claude-3-opus-20240229\",\n",
" default_headers={\"anthropic-beta\": \"tools-2024-04-04\"},\n",
")\n",
"\n",
"# Structured output\n",
"# Include raw will capture raw output and parser errors\n",
"structured_llm = llm.with_structured_output(code, include_raw=True)\n",
"\n",
"\n",
"# Check for errors\n",
"def check_claude_output(tool_output):\n",
" \"\"\"Check for parse error or failure to call the tool\"\"\"\n",
"\n",
" # Error with parsing\n",
" if tool_output[\"parsing_error\"]:\n",
" # Report back output and parsing errors\n",
" print(\"Parsing error!\")\n",
" raw_output = str(tool_output[\"raw\"].content)\n",
" error = tool_output[\"parsing_error\"]\n",
" raise ValueError(\n",
" f\"Error parsing your output! Be sure to invoke the tool. Output: {raw_output}. \\n Parse error: {error}\"\n",
" )\n",
"\n",
" # Tool was not invoked\n",
" elif not tool_output[\"parsed\"]:\n",
" print(\"Failed to invoke tool!\")\n",
" raise ValueError(\n",
" \"You did not use the provided tool! Be sure to invoke the tool to structure the output.\"\n",
" )\n",
" return tool_output\n",
"\n",
"\n",
"# Chain with output check\n",
"code_chain = code_gen_prompt_bad | structured_llm | check_claude_output"
]
},
{
"cell_type": "markdown",
"id": "1b915baf-8b1d-43e8-b962-3e73b135dade",
"metadata": {},
"source": [
"Let's add a check and re-try."
]
},
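{
"cell_type": "markdown",
"id": "9b0e6f53-2a8d-4c17-bd62-4e1f7a8c5d20",
"metadata": {},
"source": [
"The retry below relies on `with_fallbacks(..., exception_key=\"error\")`: when the chain raises, the exception is added to the fallback's input dict under that key. Here is a toy sketch of just this mechanism (the `flaky` runnable is hypothetical and not part of the code-generation chain):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6a3d8e2-7b45-4910-a2c8-0d5e9b1c4f67",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import RunnableLambda\n",
"\n",
"\n",
"# Toy sketch of the fallback mechanism (hypothetical example, separate from the\n",
"# code-generation chain): the first attempt raises, and the fallback receives the\n",
"# exception in its input under the `exception_key`.\n",
"def flaky(inputs):\n",
"    if inputs.get(\"error\") is None:\n",
"        raise ValueError(\"first attempt failed\")\n",
"    return f\"recovered after: {inputs['error']}\"\n",
"\n",
"\n",
"retrying = RunnableLambda(flaky).with_fallbacks(\n",
"    fallbacks=[RunnableLambda(flaky)], exception_key=\"error\"\n",
")\n",
"retrying.invoke({\"question\": \"hi\"})  # -> 'recovered after: first attempt failed'"
]
},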
{
"cell_type": "code",
"execution_count": 13,
"id": "efae1ff7-4413-4c47-a403-1630dd453219",
"metadata": {},
"outputs": [],
"source": [
"def insert_errors(inputs):\n",
" \"\"\"Insert errors in the messages\"\"\"\n",
"\n",
" # Get errors\n",
" error = inputs[\"error\"]\n",
" messages = inputs[\"messages\"]\n",
" messages += [\n",
" (\n",
" \"user\",\n",
" f\"Retry. You are required to fix the parsing errors: {error} \\n\\n You must invoke the provided tool.\",\n",
" )\n",
" ]\n",
" return {\n",
" \"messages\": messages,\n",
" \"context\": inputs[\"context\"],\n",
" }\n",
"\n",
"\n",
"# This will be run as a fallback chain\n",
"fallback_chain = insert_errors | code_chain\n",
"N = 3 # Max re-tries\n",
"code_chain_re_try = code_chain.with_fallbacks(\n",
" fallbacks=[fallback_chain] * N, exception_key=\"error\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "c7712c49-ee8c-4a61-927e-3c0beb83782b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Failed to invoke tool!\n"
]
}
],
"source": [
"# Test\n",
"messages = [(\"user\", \"How do I build a RAG chain in LCEL?\")]\n",
"code_output_lcel = code_chain_re_try.invoke(\n",
" {\"context\": concatenated_content, \"messages\": messages}\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "c8027a6f-6992-4bb4-9d6e-9d0778b04e28",
"metadata": {},
"outputs": [],
"source": [
"parsed_result_lcel = code_output_lcel[\"parsed\"]"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "209186ac-3121-43a9-8358-86ace7e07f61",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"To build a RAG chain using LCEL, we'll use a vector store to retrieve relevant documents, a prompt template that incorporates the retrieved context, a chat model (like OpenAI) to generate a response based on the prompt, and an output parser to clean up the model output.\""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"parsed_result_lcel.prefix"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "b8d6d189-e5df-49b6-ada8-83f6c0b26886",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'from langchain_community.vectorstores import DocArrayInMemorySearch\\nfrom langchain_core.output_parsers import StrOutputParser\\nfrom langchain_core.prompts import ChatPromptTemplate\\nfrom langchain_core.runnables import RunnablePassthrough\\nfrom langchain_openai import ChatOpenAI, OpenAIEmbeddings'"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"parsed_result_lcel.imports"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "e3822253-d28b-4f7e-9364-79974d04eff1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'vectorstore = DocArrayInMemorySearch.from_texts(\\n [\"harrison worked at kensho\", \"bears like to eat honey\"], \\n embedding=OpenAIEmbeddings(),\\n)\\n\\nretriever = vectorstore.as_retriever()\\n\\ntemplate = \"\"\"Answer the question based only on the following context:\\n{context}\\nQuestion: {question}\"\"\"\\nprompt = ChatPromptTemplate.from_template(template)\\n\\noutput_parser = StrOutputParser()\\n\\nrag_chain = (\\n {\"context\": retriever, \"question\": RunnablePassthrough()} \\n | prompt \\n | ChatOpenAI()\\n | output_parser\\n)\\n\\nprint(rag_chain.invoke(\"where did harrison work?\"))'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"parsed_result_lcel.code"
]
},
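{
"cell_type": "markdown",
"id": "1c7b4e09-8d2f-4a61-9e35-b6f0d3a2c581",
"metadata": {},
"source": [
"The three fields can be reassembled into one copy-pasteable script. This is display only: actually running it would require an OpenAI API key and the packages it imports, which this notebook has not set up."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d9f2b76-0e83-4c5a-b1d7-3a6e8c0f9d12",
"metadata": {},
"outputs": [],
"source": [
"# Reassemble the structured LCEL answer into a single script (display only:\n",
"# running it would need an OpenAI API key plus the packages it imports).\n",
"lcel_script = parsed_result_lcel.imports + \"\\n\\n\" + parsed_result_lcel.code\n",
"print(lcel_script)"
]
},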
{
"cell_type": "markdown",
"id": "80d63a3d-bad8-4385-bd85-40ca95c260c6",
"metadata": {},
"source": [
"Example trace catching an error and correcting:\n",
"\n",
"https://smith.langchain.com/public/f06e62cb-2fac-46ae-80cd-0470b3155eae/r"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f70e45c-eb68-4679-979c-0c04502affd1",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}