Improve walkthrough links for sphinx (#7672)

Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
pull/7681/head
William FH 1 year ago committed by GitHub
parent 5db4dba526
commit 051fac1e66
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -0,0 +1,12 @@
# LangSmith
import DocCardList from "@theme/DocCardList";
LangSmith helps you trace and evaluate your language model applications and intelligent agents so you can
move from prototype to production.
Check out the [interactive walkthrough](walkthrough) below to get started.
For more information, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/).
<DocCardList />
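
As a rough sketch of the first step in the walkthrough, tracing is switched on through environment variables. The variable names below are the ones the walkthrough notebook sets; the `configure_tracing` helper and the project name are hypothetical and only illustrate the idea.

```python
import os
from typing import Optional


def configure_tracing(
    project: str,
    endpoint: Optional[str] = None,
    api_key: Optional[str] = None,
) -> None:
    """Set the environment variables used in the walkthrough to log traces.

    Hypothetical helper: only the variable names come from the walkthrough.
    """
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_PROJECT"] = project
    # The walkthrough sets these two only for the hosted version.
    if endpoint is not None:
        os.environ["LANGCHAIN_ENDPOINT"] = endpoint
    if api_key is not None:
        os.environ["LANGCHAIN_API_KEY"] = api_key


# Example project name, chosen for illustration.
configure_tracing("Tracing Walkthrough - demo")
```

Passing `endpoint` and `api_key` is left optional here because the walkthrough only needs them when connecting to the hosted version.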

@ -9,35 +9,38 @@
"source": [
"# LangSmith Walkthrough\n",
"\n",
"LangChain makes it easy to prototype LLM applications and Agents. Even so, delivering a high-quality product to production can be deceptively difficult. You will likely have to heavily customize your prompts, chains, and other components to create a high-quality product.\n",
"LangChain makes it easy to prototype LLM applications and Agents. However, delivering LLM applications to production can be deceptively difficult. You will likely have to heavily customize and iterate on your prompts, chains, and other components to create a high-quality product.\n",
"\n",
"To aid the development process, we've designed tracing and callbacks at the core of LangChain. In this notebook, you will get started prototyping and testing an example LLM agent.\n",
"To aid in this process, we've launched LangSmith, a unified platform for debugging, testing, and monitoring your LLM applications.\n",
"\n",
"When might this come in handy? You may find it useful when you want to:\n",
"\n",
"- Quickly debug a new chain, agent, or set of tools\n",
"- Visualize how components (chains, llms, retrievers, etc.) relate and are used\n",
"- Evaluate different prompts and LLMs for a single component\n",
"- Run a given chain several times over a dataset to ensure it consistently meets a quality bar.\n",
"- Run a given chain several times over a dataset to ensure it consistently meets a quality bar\n",
"- Capture usage traces and using LLMs or analytics pipelines to generate insights"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "138fbb8f-960d-4d26-9dd5-6d6acab3ee55",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"**Run the [local tracing server](https://docs.smith.langchain.com/docs/additional-resources/local_installation) OR [create a hosted LangSmith account](https://smith.langchain.com/) and connect with an API key.**\n",
"**Run LangSmith locally with docker OR [create a LangSmith account](https://smith.langchain.com/) and connect with an API key.**\n",
"\n",
"To run the local server, execute the following comand in your terminal:\n",
"Note that the hosted version of LangSmith is in gated beta; we're in the process of rolling it out to more users.\n",
"\n",
"To run LangSmith locally, execute the following comand in your terminal:\n",
"```\n",
"pip install --upgrade langsmith\n",
"langsmith start\n",
"```\n",
"\n",
"Now, let's get started debugging!"
"Now, let's get started!"
]
},
{
@ -47,7 +50,7 @@
"tags": []
},
"source": [
"## Debug your Chain \n",
"## Log Traces to LangSmith\n",
"\n",
"First, configure your environment variables to tell LangChain to log traces. This is done by setting the `LANGCHAIN_TRACING_V2` environment variable to true.\n",
"You can tell LangChain which project to log to by setting the `LANGCHAIN_PROJECT` environment variable. This will automatically create a debug project for you.\n",
@ -61,7 +64,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 19,
"id": "904db9a5-f387-4a57-914c-c8af8d39e249",
"metadata": {
"tags": []
@ -74,8 +77,12 @@
"unique_id = uuid4().hex[0:8]\n",
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"os.environ[\"LANGCHAIN_PROJECT\"] = f\"Tracing Walkthrough - {unique_id}\"\n",
"# os.environ[\"LANGCHAIN_ENDPOINT\"] = \"https://api.smith.langchain.com\" # Uncomment this line to use the hosted version\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = \"<YOUR-LANGSMITH-API-KEY>\" # Uncomment this line to use the hosted version.\n",
"os.environ[\n",
" \"LANGCHAIN_ENDPOINT\"\n",
"] = \"\" # Update to \"https://api.smith.langchain.com\" to use the hosted version.\n",
"os.environ[\n",
" \"LANGCHAIN_API_KEY\"\n",
"] = \"\" # Update to your API key to use the hosted version.\n",
"\n",
"# Used by the agent in this tutorial\n",
"# os.environ[\"OPENAI_API_KEY\"] = \"<YOUR-OPENAI-API-KEY>\"\n",
@ -94,39 +101,16 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 20,
"id": "510b5ca0",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"You can click the link below to view the UI\n"
]
},
{
"data": {
"text/html": [
"<a href=\"https://dev.smith.langchain.com/\", target=\"_blank\" rel=\"noopener\">LangSmith Client</a>"
],
"text/plain": [
"Client (API URL: https://dev.api.smith.langchain.com)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"from langsmith import Client\n",
"\n",
"client = Client()\n",
"print(\"You can click the link below to view the UI\")\n",
"client"
"client = Client()"
]
},
{
@ -139,7 +123,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 21,
"id": "7c801853-8e96-404d-984c-51ace59cbbef",
"metadata": {
"tags": []
@ -158,7 +142,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 23,
"id": "19537902-b95c-4390-80a4-f6c9a937081e",
"metadata": {
"tags": []
@ -197,7 +181,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 9,
"id": "0405ff30-21fe-413d-85cf-9fa3c649efec",
"metadata": {
"tags": []
@ -214,37 +198,14 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9decb964-be07-4b6c-9802-9825c8be7b64",
"metadata": {},
"source": [
"Assuming you've successfully configured the server earlier, your agent traces should show up in your server's UI. You can check by clicking on the link below:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b7bc3934-bb1a-452c-a723-f9cdb0b416f9",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"<a href=\"https://dev.smith.langchain.com/\", target=\"_blank\" rel=\"noopener\">LangSmith Client</a>"
],
"text/plain": [
"Client (API URL: https://dev.api.smith.langchain.com)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"client"
"Assuming you've successfully configured the server earlier, your agent traces should show up in your web app.\n",
"\n",
"Navigate to the web app to see the results: [local app](http://localhost:80) or [hosted app](https://smith.langchain.com/)"
]
},
{
@ -252,7 +213,7 @@
"id": "6c43c311-4e09-4d57-9ef3-13afb96ff430",
"metadata": {},
"source": [
"## Test\n",
"## Evaluate a New Agent\n",
"\n",
"Once you've debugged a customized your LLM component, you will want to create tests and benchmark evaluations to measure its performance before putting it into a production environment.\n",
"\n",
@ -265,6 +226,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "beab1a29-b79d-4a99-b5b1-0870c2d772b1",
"metadata": {},
@ -273,12 +235,12 @@
"\n",
"Below, use the client to create a dataset from the Agent runs you just logged while debugging above. You will use these later to measure performance.\n",
"\n",
"For more information on datasets, including how to create them from CSVs or other files or how to create them in the web app, please refer to the [LangSmith documentation](https://docs.langchain.plus/docs)."
"For more information on datasets, including how to create them from CSVs or other files or how to create them in the web app, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/)."
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 11,
"id": "17580c4b-bd04-4dde-9d21-9d4edd25b00d",
"metadata": {
"tags": []
@ -309,14 +271,14 @@
"source": [
"### 2. Define the Agent or LLM to Test\n",
"\n",
"You can evaluate any LLM or chain. Since chains can have memory, we will pass in a `chain_factory` (aka a `constructor` ) function to initialize for each call.\n",
"You can evaluate any LLM, chain, or agent. Since chains can have memory, we will pass in a `chain_factory` (aka a `constructor` ) function to initialize for each call.\n",
"\n",
"In this case, you will test an agent that uses OpenAI's function calling endpoints, but it can be any simple chain."
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 12,
"id": "f42d8ecc-d46a-448b-a89c-04b0f6907f75",
"metadata": {
"tags": []
@ -343,6 +305,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9cb9ef53",
"metadata": {},
@ -358,12 +321,12 @@
"- Evaluate 'aspects' of the agent's response in a reference-free manner using custom criteria\n",
"\n",
"For a longer discussion of how to select an appropriate evaluator for your use case and how to create your own\n",
"custom evaluators, please refer to the [LangSmith documentation](https://docs.langchain.plus/docs/).\n"
"custom evaluators, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/).\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 13,
"id": "a25dc281",
"metadata": {
"tags": []
@ -376,19 +339,29 @@
"evaluation_config = RunEvalConfig(\n",
" # Evaluators can either be an evaluator type (e.g., \"qa\", \"criteria\", \"embedding_distance\", etc.) or a configuration for that evaluator\n",
" evaluators=[\n",
" EvaluatorType.QA, # \"Correctness\" against a reference answer\n",
" # Measures whether a QA response is \"Correct\", based on a reference answer\n",
" # You can also select via the raw string \"qa\"\n",
" EvaluatorType.QA,\n",
" # Measure the embedding distance between the output and the reference answer\n",
" # Equivalent to: EvalConfig.EmbeddingDistance(embeddings=OpenAIEmbeddings())\n",
" EvaluatorType.EMBEDDING_DISTANCE,\n",
" RunEvalConfig.Criteria(\"helpfulness\"),\n",
" # Grade whether the output satisfies the stated criteria. You can select a default one such as \"helpfulness\" or provide your own.\n",
" RunEvalConfig.LabeledCriteria(\"helpfulness\"),\n",
" # Both the Criteria and LabeledCriteria evaluators can be configured with a dictionary of custom criteria.\n",
" RunEvalConfig.Criteria(\n",
" {\n",
" \"fifth-grader-score\": \"Do you have to be smarter than a fifth grader to answer this question?\"\n",
" }\n",
" ),\n",
" ]\n",
" ],\n",
" # You can add custom StringEvaluator or RunEvaluator objects here as well, which will automatically be\n",
" # applied to each prediction. Check out the docs for examples.\n",
" custom_evaluators=[],\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "07885b10",
"metadata": {
@ -397,7 +370,7 @@
"source": [
"### 4. Run the Agent and Evaluators\n",
"\n",
"Use the `arun_on_dataset` (or synchronous `run_on_dataset`) function to evaluate your model. This will:\n",
"Use the [arun_on_dataset](https://api.python.langchain.com/en/latest/smith/langchain.smith.evaluation.runner_utils.arun_on_dataset.html#langchain.smith.evaluation.runner_utils.arun_on_dataset) (or synchronous [run_on_dataset](https://api.python.langchain.com/en/latest/smith/langchain.smith.evaluation.runner_utils.run_on_dataset.html#langchain.smith.evaluation.runner_utils.run_on_dataset)) function to evaluate your model. This will:\n",
"1. Fetch example rows from the specified dataset\n",
"2. Run your llm or chain on each example.\n",
"3. Apply evalutors to the resulting run traces and corresponding reference examples to generate automated feedback.\n",
@ -407,7 +380,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 14,
"id": "3733269b-8085-4644-9d5d-baedcff13a2f",
"metadata": {
"tags": []
@ -417,14 +390,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Processed examples: 2\r"
"Processed examples: 1\r"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Chain failed for example 4de88b85-928e-4711-8f11-98886295c8b3. Error: LLMMathChain._evaluate(\"\n",
"Chain failed for example 890fac1b-9788-4545-a952-c8f569f21a13. Error: LLMMathChain._evaluate(\"\n",
"age_of_Dua_Lipa_boyfriend ** 0.43\n",
"\") raised error: 'age_of_Dua_Lipa_boyfriend'. Please try again with a valid numerical expression\n"
]
@ -433,14 +406,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Processed examples: 3\r"
"Processed examples: 6\r"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Chain failed for example 7cacdf54-d1b8-4e6c-944e-c94578a2fe0d. Error: Too many arguments to single-input tool Calculator. Args: ['height ^ 0.13', {'height': 68}]\n"
"Chain failed for example 614a5986-f9de-495e-adcf-a2a4bcfe68b6. Error: Too many arguments to single-input tool Calculator. Args: ['height ^ 0.13', {'height': 68}]\n"
]
},
{
@ -470,74 +443,6 @@
"# These are logged as warnings here and captured as errors in the tracing UI."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "a8088b7d-3ab6-4279-94c8-5116fe7cee33",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\u001b[0;31mSignature:\u001b[0m\n",
"\u001b[0marun_on_dataset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mclient\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Client'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mdataset_name\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mllm_or_chain_factory\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'MODEL_OR_CHAIN_FACTORY'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mevaluation\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[RunEvalConfig]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mconcurrency_level\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'int'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mnum_repetitions\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'int'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mproject_name\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'bool'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0mtags\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[str]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m \u001b[0minput_mapper\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Callable[[Dict], Any]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n",
"\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0;34m'Dict[str, Any]'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mDocstring:\u001b[0m\n",
"Asynchronously run the Chain or language model on a dataset\n",
"and store traces to the specified project name.\n",
"\n",
"Args:\n",
" client: LangSmith client to use to read the dataset, and to\n",
" log feedback and run traces.\n",
" dataset_name: Name of the dataset to run the chain on.\n",
" llm_or_chain_factory: Language model or Chain constructor to run\n",
" over the dataset. The Chain constructor is used to permit\n",
" independent calls on each example without carrying over state.\n",
" concurrency_level: The number of async tasks to run concurrently.\n",
" num_repetitions: Number of times to run the model on each example.\n",
" This is useful when testing success rates or generating confidence\n",
" intervals.\n",
" project_name: Name of the project to store the traces in.\n",
" Defaults to {dataset_name}-{chain class name}-{datetime}.\n",
" verbose: Whether to print progress.\n",
" tags: Tags to add to each run in the project.\n",
" run_evaluators: Evaluators to run on the results of the chain.\n",
" input_mapper: A function to map to the inputs dictionary from an Example\n",
" to the format expected by the model to be evaluated. This is useful if\n",
" your model needs to deserialize more complex schema or if your dataset\n",
" has inputs with keys that differ from what is expected by your chain\n",
" or agent.\n",
"\n",
"Returns:\n",
" A dictionary containing the run's project name and the\n",
" resulting model outputs.\n",
"\u001b[0;31mFile:\u001b[0m ~/code/lc/langchain/langchain/smith/evaluation/runner_utils.py\n",
"\u001b[0;31mType:\u001b[0m function"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# For more information on additional configuration for the evaluation function:\n",
"\n",
"?arun_on_dataset"
]
},
{
"cell_type": "markdown",
"id": "cdacd159-eb4d-49e9-bb2a-c55322c40ed4",
@ -557,7 +462,7 @@
"id": "591c819e-9932-45cf-adab-63727dd49559",
"metadata": {},
"source": [
"## Exporting Runs\n",
"## Exporting Datasets and Runs\n",
"\n",
"LangSmith lets you export data to common formats such as CSV or JSONL directly in the web app. You can also use the client to fetch runs for further analysis, to store in your own database, or to share with others. Let's fetch the run traces from the evaluation run."
]
@ -615,6 +520,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "2646f0fb-81d4-43ce-8a9b-54b8e19841e2",
"metadata": {
@ -627,8 +533,14 @@
"\n",
"This was a quick guide to get started, but there are many more ways to use LangSmith to speed up your developer flow and produce better results.\n",
"\n",
"For more information on how you can get the most out of LangSmith, check out [LangSmith documentation](https://docs.langchain.plus/docs/), and please reach out with questions, feature requests, or feedback at [support@langchain.dev](mailto:support@langchain.dev)."
"For more information on how you can get the most out of LangSmith, check out [LangSmith documentation](https://docs.smith.langchain.com/), and please reach out with questions, feature requests, or feedback at [support@langchain.dev](mailto:support@langchain.dev)."
]
},
{
"cell_type": "markdown",
"id": "57237f12",
"metadata": {},
"source": []
}
],
"metadata": {
@ -647,7 +559,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.9"
}
},
"nbformat": 4,