{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "23234b50-e6c6-4c87-9f97-259c15f36894",
"metadata": {
"tags": []
},
"source": [
"# Only streaming final agent output"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "29dd6333-307c-43df-b848-65001c01733b",
"metadata": {},
"source": [
"If you only want the final output of an agent to be streamed, you can use the callback ``FinalStreamingStdOutCallbackHandler``.\n",
"For this, the underlying LLM has to support streaming as well."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e4592215-6604-47e2-89ff-5db3af6d1e40",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.agents import load_tools\n",
"from langchain.agents import initialize_agent\n",
"from langchain.agents import AgentType\n",
"from langchain.callbacks.streaming_stdout_final_only import FinalStreamingStdOutCallbackHandler\n",
"from langchain.llms import OpenAI"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "19a813f7",
"metadata": {},
"source": [
"Let's create the underlying LLM with ``streaming = True`` and pass a new instance of ``FinalStreamingStdOutCallbackHandler``."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "7fe81ef4",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(streaming=True, callbacks=[FinalStreamingStdOutCallbackHandler()], temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ff45b85d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Konrad Adenauer became Chancellor of Germany in 1949, 74 years ago in 2023."
]
},
{
"data": {
"text/plain": [
"'Konrad Adenauer became Chancellor of Germany in 1949, 74 years ago in 2023.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tools = load_tools([\"wikipedia\", \"llm-math\"], llm=llm)\n",
"agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False)\n",
"agent.run(\"It's 2023 now. How many years ago did Konrad Adenauer become Chancellor of Germany.\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "53a743b8",
"metadata": {},
"source": [
"### Handling custom answer prefixes"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "23602c62",
"metadata": {},
"source": [
"By default, we assume that the token sequence ``\"Final\", \"Answer\", \":\"`` indicates that the agent has reached an answer. We can, however, also pass a custom sequence to use as the answer prefix."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "5662a638",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(\n",
"    streaming=True,\n",
"    callbacks=[FinalStreamingStdOutCallbackHandler(answer_prefix_tokens=[\"The\", \"answer\", \":\"])],\n",
"    temperature=0\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "b1a96cc0",
"metadata": {},
"source": [
"For convenience, the callback automatically strips whitespace and newline characters when comparing tokens to `answer_prefix_tokens`. That is, if `answer_prefix_tokens = [\"The\", \" answer\", \":\"]`, then both `[\"\\nThe\", \" answer\", \":\"]` and `[\"The\", \" answer\", \":\"]` would be recognized as the answer prefix."
]
},
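{
"attachments": {},
"cell_type": "markdown",
"id": "c0d7f2b1",
"metadata": {},
"source": [
"As a rough sketch of that idea (a hypothetical helper for illustration, not the library's actual implementation), the stripping-based comparison looks like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8e1a3f5",
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical sketch of the prefix check, NOT langchain's actual code:\n",
"# whitespace and newlines are stripped from each token before comparing,\n",
"# so \"\\nThe\" and \"The\" are treated as the same token.\n",
"def matches_answer_prefix(last_tokens, answer_prefix_tokens):\n",
"    return [t.strip() for t in last_tokens] == [t.strip() for t in answer_prefix_tokens]\n",
"\n",
"print(matches_answer_prefix([\"\\nThe\", \" answer\", \":\"], [\"The\", \" answer\", \":\"]))  # True\n",
"print(matches_answer_prefix([\"The\", \" answer\", \":\"], [\"The\", \" answer\", \":\"]))  # True"
]
},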
{
"attachments": {},
"cell_type": "markdown",
"id": "9278b522",
"metadata": {},
"source": [
"If you don't know the tokenized version of your answer prefix, you can determine it with the following code:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f8f0640",
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks.base import BaseCallbackHandler\n",
"\n",
"class MyCallbackHandler(BaseCallbackHandler):\n",
"    def on_llm_new_token(self, token, **kwargs) -> None:\n",
"        # print every token on its own line, delimited by '#'\n",
"        print(f\"#{token}#\")\n",
"\n",
"llm = OpenAI(streaming=True, callbacks=[MyCallbackHandler()])\n",
"tools = load_tools([\"wikipedia\", \"llm-math\"], llm=llm)\n",
"agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False)\n",
"agent.run(\"It's 2023 now. How many years ago did Konrad Adenauer become Chancellor of Germany.\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "61190e58",
"metadata": {},
"source": [
"### Also streaming the answer prefixes"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "1255776f",
"metadata": {},
"source": [
"When the parameter `stream_prefix=True` is set, the answer prefix itself will also be streamed. This can be useful when the answer prefix itself is part of the answer. For example, when your answer is a JSON object like\n",
"\n",
"```json\n",
"{\n",
"    \"action\": \"Final answer\",\n",
"    \"action_input\": \"Konrad Adenauer became Chancellor 74 years ago.\"\n",
"}\n",
"```\n",
"\n",
"and you want to stream not only the `action_input`, but the entire JSON."
]
},
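{
"attachments": {},
"cell_type": "markdown",
"id": "f3b9c6a2",
"metadata": {},
"source": [
"To stream the prefix as well, pass `stream_prefix=True` to the handler, analogous to the constructions above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7c4e9d0",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(\n",
"    streaming=True,\n",
"    callbacks=[FinalStreamingStdOutCallbackHandler(stream_prefix=True)],\n",
"    temperature=0\n",
")"
]
}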
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}