Add client methods to read / list runs and sessions.
Update walkthrough to:
- Let the user create a dataset from the runs without going to the UI
- Use the new CLI command to start the server
Improve the error message when `docker` isn't found
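The last item — a clearer error when `docker` isn't found — can be sketched roughly as follows. This is a hypothetical helper (`check_docker_installed` is not the actual function name in the codebase), shown only to illustrate the kind of fail-fast check and message involved:

```python
import shutil


def check_docker_installed() -> None:
    """Hypothetical sketch: fail fast with a helpful message when the
    `docker` executable cannot be found on PATH."""
    if shutil.which("docker") is None:
        raise RuntimeError(
            "Docker was not found on your PATH. The LangChain+ server "
            "requires Docker to run; please install Docker and try again."
        )
```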
"LangChain makes it easy to get started with Agents and other LLM applications. However, it can be tricky to get right, especially when you need to deliver a full product. To speed up your application development process, and to help monitor your applications in production, LangChain offers additional tracing tooling.\n",
"\n",
"When might you want to use tracing? Some situations we've found it useful include:\n",
"- Quickly debugging a new chain, agent, or set of tools\n",
"- Evaluating a given chain across different LLMs or Chat Models to compare results or improve prompts\n",
"- Running a given chain multiple time on a dataset to ensure it consistently meets a quality bar.\n",
"\n",
"\n",
"Developing applications with language models can be uniquely challenging. To manage this complexity and ensure reliable performance, LangChain provides tracing and evaluation functionality. This notebook demonstrates how to run Chains, which are language model functions, as well as Chat models, and LLMs on previously captured datasets or traces. Some common use cases for this approach include:\n",
"In this notebook, we'll show how to enable tracing in your LangChain applications and walk you a couple common ways to evaluate your agents.\n",
"We'll focus on using Datasets to benchmark Chain behavior.\n",
"\n",
"- Running an evaluation chain to grade previous runs.\n",
"- Comparing different chains, LLMs, and agents on traced datasets.\n",
"- Executing a stochastic chain multiple times over a dataset to generate metrics before deployment.\n",
"Bear in mind that this notebook is designed under the assumption that you're running LangChain+ server locally in the background, and it's set up to work specifically with the V2 endpoints. This is done using the folowing command in your terminal:\n",
"\n",
"Please note that this notebook assumes you have LangChain+ tracing running in the background. It is also configured to work only with the V2 endpoints. To set it up, follow the [tracing directions here](..\\/..\\/tracing\\/local_installation.md).\n",
"\n",
"We'll start by creating a client to connect to LangChain+."
"```\n",
"pip install --upgrade langchain\n",
"langchain server start\n",
"```\n",
"\n",
"Now, let's get started by creating a client to connect to LangChain+."
"If you have been using LangChainPlus already, you may have datasets available. To view all saved datasets, run:\n",
"## Tracing Runs\n",
"\n",
"```\n",
"datasets = client.list_datasets()\n",
"print(datasets)\n",
"```\n",
"\n",
"Datasets can be created in a number of ways, most often by collecting `Run`'s captured through the LangChain tracing API and converting a set of runs to a dataset.\n",
"\n",
"The V2 tracing API is currently accessible using the `tracing_v2_enabled` context manager. Assuming the server was succesfully started above, running LangChain Agents, Chains, LLMs, and other primitives will then automatically capture traces. We'll start with a simple math example.\n",
"\n",
"**Note** You can also use the `LANGCHAIN_TRACING_V2` environment variable to enable tracing for all runs by default, regardless of whether or not those runs happen within the `tracing_v2_enabled` context manager (i.e. `os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"`)"
"The V2 tracing API can be activated by setting the `LANGCHAIN_TRACING_V2` environment variable to true. Assuming you've successfully initiated the server as described earlier, running LangChain Agents, Chains, LLMs, and other primitives will automatically start capturing traces. Let's begin our exploration with a straightforward math example.\n"
"/Users/wfh/code/lc/lckg/langchain/callbacks/manager.py:78: UserWarning: The experimental tracing v2 is in development. This is not yet stable and may change in the future.\n",
" warnings.warn(\n"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The current population of Canada as of 2023 is 39,566,248.\n",
"39,566,248\n",
"Anwar Hadid is Dua Lipa's boyfriend and his age raised to the 0.43 power is approximately 3.87.\n",
"LLMMathChain._evaluate(\"\n",
"(age)**0.43\n",
"\") raised error: 'age'. Please try again with a valid numerical expression\n",
"The distance between Paris and Boston is approximately 3448 miles.\n",
"unknown format from LLM: Sorry, I cannot answer this question as it requires information from the future.\n",
"The distance between Paris and Boston is 3448 miles.\n",
"unknown format from LLM: Assuming we don't have any information about the actual number of points scored in the 2023 super bowl, we cannot provide a mathematical expression to solve this problem.\n",
"LLMMathChain._evaluate(\"\n",
"(total number of points scored in the 2023 super bowl)**0.23\n",
"\") raised error: invalid syntax. Perhaps you forgot a comma? (<expr>, line 1). Please try again with a valid numerical expression\n",
"Could not parse LLM output: `The final answer is that there were no more points scored in the 2023 Super Bowl than in the 2022 Super Bowl.`\n",
"15 points were scored more in the 2023 Super Bowl than in the 2022 Super Bowl.\n",
"1.9347796717823205\n",
"77\n",
"0.2791714614499425\n"
"LLMMathChain._evaluate(\"\n",
"round(0.2791714614499425, 2)\n",
"\") raised error: 'VariableNode' object is not callable. Please try again with a valid numerical expression\n"
]
}
],
" \"who is kendall jenner's boyfriend? what is his height (in inches) raised to .13 power?\",\n",
"Now that you've captured a session entitled 'Tracing Walkthrough', it's time to create a dataset. We will do so using the `create_dataset` method below."
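To make the shape of a dataset concrete, here is an illustrative, plain-Python sketch (not the client API itself) of the example records a dataset is built from: each captured run contributes an (inputs, outputs) pair, and the actual upload happens via the client's `create_dataset` method. The sample question/answer pairs are drawn from the runs shown later in this notebook:

```python
# Illustrative only: each captured run contributes an (inputs, outputs)
# pair; the real dataset is created server-side via `create_dataset`.
runs = [
    {"inputs": {"input": "How many people live in canada as of 2023?"},
     "outputs": {"output": "approximately 38,625,801"}},
    {"inputs": {"input": "how far is it from paris to boston in miles"},
     "outputs": {"output": "approximately 3,435 mi"}},
]

dataset = {"name": "calculator-example-dataset", "examples": runs}
print(len(dataset["examples"]))  # prints 2
```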
" <td>How many people live in canada as of 2023?</td>\n",
" <td>approximately 38,625,801</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>who is dua lipa's boyfriend? what is his age r...</td>\n",
" <td>her boyfriend is Romain Gravas. his age raised...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>what is dua lipa's boyfriend age raised to the...</td>\n",
" <td>her boyfriend is Romain Gravas. his age raised...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>how far is it from paris to boston in miles</td>\n",
" <td>approximately 3,435 mi</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>what was the total number of points scored in ...</td>\n",
" <td>approximately 2.682651500990882</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" input \\\n",
"0 How many people live in canada as of 2023? \n",
"1 who is dua lipa's boyfriend? what is his age r... \n",
"2 what is dua lipa's boyfriend age raised to the... \n",
"3 how far is it from paris to boston in miles \n",
"4 what was the total number of points scored in ... \n",
"\n",
" output \n",
"0 approximately 38,625,801 \n",
"1 her boyfriend is Romain Gravas. his age raised... \n",
"2 her boyfriend is Romain Gravas. his age raised... \n",
"3 approximately 3,435 mi \n",
"4 approximately 2.682651500990882 "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"# import pandas as pd\n",
"# from langchain.evaluation.loading import load_dataset\n",
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 10,
"id": "52a7ea76-79ca-4765-abf7-231e884040d6",
"metadata": {
"tags": []
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 11,
"id": "c2b59104-b90e-466a-b7ea-c5bd0194263b",
"metadata": {
"tags": []
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 12,
"id": "112d7bdf-7e50-4c1a-9285-5bac8473f2ee",
"metadata": {
"tags": []
},
"outputs": [{
"outputs": [
{
"data": {
"text/plain": [
"\u001b[0;31mSignature:\u001b[0m\n",
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"?client.arun_on_dataset"
]
},
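The `arun_on_dataset` coroutine inspected above runs the chain over each dataset example with bounded concurrency. A toy, self-contained sketch of that pattern (all names here are hypothetical stand-ins, not the real implementation):

```python
import asyncio


async def run_example(example: str) -> dict:
    # Toy stand-in for running a single dataset example through a chain.
    await asyncio.sleep(0)
    return {"input": example, "output": example.upper()}


async def run_all(examples: list[str], concurrency: int = 5) -> list[dict]:
    # Bounded concurrency via a semaphore, in the spirit of evaluating a
    # dataset without firing every request at once.
    sem = asyncio.Semaphore(concurrency)

    async def guarded(ex: str) -> dict:
        async with sem:
            return await run_example(ex)

    return await asyncio.gather(*(guarded(ex) for ex in examples))


results = asyncio.run(run_all(["how far is it from paris to boston", "77"]))
print(results[1]["output"])  # prints "77"
```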
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 13,
"id": "6e10f823",
"metadata": {
"tags": []
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 14,
"id": "a8088b7d-3ab6-4279-94c8-5116fe7cee33",
"metadata": {
"tags": []
},
"outputs": [{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processed examples: 1\r"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/wfh/code/lc/lckg/langchain/callbacks/manager.py:78: UserWarning: The experimental tracing v2 is in development. This is not yet stable and may change in the future.\n",
" warnings.warn(\n",
"Chain failed for example 5523e460-6bb4-4a64-be37-bec0a98699a4. Error: LLMMathChain._evaluate(\"\n",
"(total number of points scored in the 2023 super bowl)**0.23\n",
"\") raised error: invalid syntax. Perhaps you forgot a comma? (<expr>, line 1). Please try again with a valid numerical expression\n"
"Chain failed for example 8d4ff5b4-41fb-4986-80f1-025e6fec96b0. Error: unknown format from LLM: It is impossible to accurately predict the total number of points scored in a future event. Therefore, a mathematical expression cannot be provided.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Chain failed for example f193a3f6-1147-4ce6-a83e-fab1157dc88d. Error: unknown format from LLM: Assuming we don't have any information about the actual number of points scored in the 2023 super bowl, we cannot provide a mathematical expression to solve this problem.\n"
"Chain failed for example 178081fb-a44a-46d5-a23b-74a830da65f3. Error: LLMMathChain._evaluate(\"\n",
"(age)**0.43\n",
"\") raised error: 'age'. Please try again with a valid numerical expression\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Processed examples: 6\r"
"Processed examples: 5\r"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Chain failed for example 6d7bbb45-1dc0-4adc-be21-4f76a208a8d2. Error: LLMMathChain._evaluate(\"\n",
"(age ** 0.43)\n",
"\") raised error: 'age'. Please try again with a valid numerical expression\n"
"Chain failed for example 7de97d34-50e2-4ec5-bc49-c8e6287ae73e. Error: LLMMathChain._evaluate(\"\n",
"(total number of points scored in the 2023 super bowl)**0.23\n",
"\") raised error: invalid syntax. Perhaps you forgot a comma? (<expr>, line 1). Please try again with a valid numerical expression\n"
"# You can navigate to the UI by clicking on the link below\n",
"client"
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 16,
"id": "64490d7c-9a18-49ed-a3ac-36049c522cb4",
"metadata": {
"tags": []
},
"outputs": [{
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0adb751cec11417b88072963325b481d",
"model_id": "44f3c72015944e2ea4c39516350ea15c",
"version_major": 2,
"version_minor": 0
},
"4 [{'data': {'content': 'Here is the topic for a... "
]
},
"execution_count": 24,
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 17,
"id": "348acd86-a927-4d60-8d52-02e64585e4fc",
"metadata": {
"tags": []
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 18,
"id": "a69dd183-ad5e-473d-b631-db90706e837f",
"metadata": {
"tags": []
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 19,
"id": "063da2a9-3692-4b7b-8edb-e474824fe416",
"metadata": {
"tags": []
},
"outputs": [{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/wfh/code/lc/lckg/langchain/callbacks/manager.py:78: UserWarning: The experimental tracing v2 is in development. This is not yet stable and may change in the future.\n",
"4 Please rise if you are able and show that, Yes... "
"3 Please rise if you are able and show that, Yes... \n",
"4 Groups of citizens blocking tanks with their b... "
]
},
"execution_count": 11,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 23,
"id": "c7dcc1b2-7aef-44c0-ba0f-c812279099a5",
"metadata": {
"tags": []
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 24,
"id": "e946138e-bf7c-43d7-861d-9c5740c933fa",
"metadata": {
"tags": []
},
"outputs": [{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/wfh/code/lc/lckg/langchain/callbacks/manager.py:78: UserWarning: The experimental tracing v2 is in development. This is not yet stable and may change in the future.\n",