langchain/docs/modules/agents/streaming_stdout_final_only.ipynb

{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "23234b50-e6c6-4c87-9f97-259c15f36894",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Only streaming final agent output"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "29dd6333-307c-43df-b848-65001c01733b",
   "metadata": {},
   "source": [
    "If you only want the final output of an agent to be streamed, you can use the callback ``FinalStreamingStdOutCallbackHandler``.\n",
    "For this, the underlying LLM has to support streaming as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e4592215-6604-47e2-89ff-5db3af6d1e40",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.agents import load_tools\n",
    "from langchain.agents import initialize_agent\n",
    "from langchain.agents import AgentType\n",
    "from langchain.callbacks.streaming_stdout_final_only import FinalStreamingStdOutCallbackHandler\n",
    "from langchain.llms import OpenAI"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "19a813f7",
   "metadata": {},
   "source": [
    "Let's create the underlying LLM with ``streaming = True`` and pass a new instance of ``FinalStreamingStdOutCallbackHandler``."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "7fe81ef4",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(streaming=True, callbacks=[FinalStreamingStdOutCallbackHandler()], temperature=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "ff45b85d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " Konrad Adenauer became Chancellor of Germany in 1949, 74 years ago in 2023."
     ]
    },
    {
     "data": {
      "text/plain": [
       "'Konrad Adenauer became Chancellor of Germany in 1949, 74 years ago in 2023.'"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tools = load_tools([\"wikipedia\", \"llm-math\"], llm=llm)\n",
    "agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False)\n",
    "agent.run(\"It's 2023 now. How many years ago did Konrad Adenauer become Chancellor of Germany.\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "53a743b8",
   "metadata": {},
   "source": [
    "### Handling custom answer prefixes"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "23602c62",
   "metadata": {},
   "source": [
    "By default, we assume that the token sequence ``\"Final\", \"Answer\", \":\"`` indicates that the agent has reached an answers. We can, however, also pass a custom sequence to use as answer prefix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "5662a638",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(\n",
    "    streaming=True,\n",
    "    callbacks=[FinalStreamingStdOutCallbackHandler(answer_prefix_tokens=[\"The\", \"answer\", \":\"])],\n",
    "    temperature=0\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b1a96cc0",
   "metadata": {},
   "source": [
    "For convenience, the callback automatically strips whitespaces and new line characters when comparing to `answer_prefix_tokens`. I.e., if `answer_prefix_tokens = [\"The\", \" answer\", \":\"]` then both `[\"\\nThe\", \" answer\", \":\"]` and `[\"The\", \" answer\", \":\"]` would be recognized a the answer prefix."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9278b522",
   "metadata": {},
   "source": [
    "If you don't know the tokenized version of your answer prefix, you can determine it with the following code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2f8f0640",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.callbacks.base import BaseCallbackHandler\n",
    "\n",
    "class MyCallbackHandler(BaseCallbackHandler):\n",
    "    def on_llm_new_token(self, token, **kwargs) -> None:\n",
    "        # print every token on a new line\n",
    "        print(f\"#{token}#\")\n",
    "\n",
    "llm = OpenAI(streaming=True, callbacks=[MyCallbackHandler()])\n",
    "tools = load_tools([\"wikipedia\", \"llm-math\"], llm=llm)\n",
    "agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False)\n",
    "agent.run(\"It's 2023 now. How many years ago did Konrad Adenauer become Chancellor of Germany.\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "61190e58",
   "metadata": {},
   "source": [
    "### Also streaming the answer prefixes"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "1255776f",
   "metadata": {},
   "source": [
    "When the parameter `stream_prefix = True` is set, the answer prefix itself will also be streamed. This can be useful when the answer prefix itself is part of the answer. For example, when your answer is a JSON like\n",
    "\n",
    "`\n",
    "{\n",
    "    \"action\": \"Final answer\",\n",
    "    \"action_input\": \"Konrad Adenauer became Chancellor 74 years ago.\"\n",
    "}\n",
    "`\n",
    "\n",
    "and you don't only want the action_input to be streamed, but the entire JSON."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}