docs[patch]: Adds evaluation sections (#23050)

Also want to add an index/rollup page to LangSmith docs to enable linking to a how-to category as a group (e.g. https://docs.smith.langchain.com/how_to_guides/evaluation/) CC @agola11 @hinthornw
4 months ago · 3b7b276f6f
parent 6605ae22f6
commit 3b7b276f6f
4 changed files with 36 additions and 7 deletions
--- a/docs/docs/concepts.mdx
+++ b/docs/docs/concepts.mdx
@@ -1045,6 +1045,19 @@ Table columns:
 | Semantic Chunker (Experimental) | [SemanticChunker](/docs/how_to/semantic-chunker/)                                                                                                                             | Sentences                                                                                                       |               | First splits on sentences. Then combines ones next to each other if they are semantically similar enough. Taken from [Greg Kamradt](https://github.com/FullStackRetrieval-com/RetrievalTutorials/blob/main/tutorials/LevelsOfTextSplitting/5_Levels_Of_Text_Splitting.ipynb) |
 | Integration: AI21 Semantic | [AI21SemanticTextSplitter](/docs/integrations/document_transformers/ai21_semantic_text_splitter/)                                                                                                                    |    ✅           | Identifies distinct topics that form coherent pieces of text and splits along those.                                                                                                                                                                                         |

+### Evaluation
+<span data-heading-keywords="evaluation,evaluate"></span>

+Evaluation is the process of assessing the performance and effectiveness of your LLM-powered applications.
+It involves testing the model's responses against a set of predefined criteria or benchmarks to ensure it meets the desired quality standards and fulfills the intended purpose.
+This process is vital for building reliable applications.

+![](/img/langsmith_evaluate.png)

+[LangSmith](https://docs.smith.langchain.com/) helps with this process in a few ways:
+
+- It makes it easier to create and curate datasets via its tracing and annotation features
+- It provides an evaluation framework that helps you define metrics and run your app against your dataset
+- It allows you to track results over time and automatically run your evaluators on a schedule or as part of CI/Code
+
+To learn more, check out [this LangSmith guide](https://docs.smith.langchain.com/concepts/evaluation).
--- a/docs/docs/how_to/index.mdx
+++ b/docs/docs/how_to/index.mdx
@@ -303,7 +303,15 @@ You can peruse [LangGraph how-to guides here](https://langchain-ai.github.io/lan
 ## [LangSmith](https://docs.smith.langchain.com/)

 LangSmith allows you to closely trace, monitor and evaluate your LLM application.
-It seamlessly integrates with LangChain, and you can use it to inspect and debug individual steps of your chains as you build.
+It seamlessly integrates with LangChain and LangGraph, and you can use it to inspect and debug individual steps of your chains and agents as you build.

 LangSmith documentation is hosted on a separate site.
 You can peruse [LangSmith how-to guides here](https://docs.smith.langchain.com/how_to_guides/).
+
+### Evaluation
+<span data-heading-keywords="evaluation,evaluate"></span>
+
+Evaluating performance is a vital part of building LLM-powered applications.
+LangSmith helps with every step of the process from creating a dataset to defining metrics to running evaluators.
+
+To learn more, check out the [LangSmith evaluation how-to guides](https://docs.smith.langchain.com/how_to_guides/evaluation).
--- a/docs/docs/tutorials/index.mdx
+++ b/docs/docs/tutorials/index.mdx
@@ -6,13 +6,13 @@ sidebar_class_name: hidden

 New to LangChain or to LLM app development in general? Read this material to quickly get up and running.

-### Basics
+## Basics
 - [Build a Simple LLM Application with LCEL](/docs/tutorials/llm_chain)
 - [Build a Chatbot](/docs/tutorials/chatbot)
 - [Build vector stores and retrievers](/docs/tutorials/retrievers)
 - [Build an Agent](/docs/tutorials/agents)

-### Working with external knowledge
+## Working with external knowledge
 - [Build a Retrieval Augmented Generation (RAG) Application](/docs/tutorials/rag)
 - [Build a Conversational RAG Application](/docs/tutorials/qa_chat_history)
 - [Build a Question/Answering system over SQL data](/docs/tutorials/sql_qa)
@@ -21,13 +21,13 @@ New to LangChain or to LLM app development in general? Read this material to qui
 - [Build a Question Answering application over a Graph Database](/docs/tutorials/graph)
 - [Build a PDF ingestion and Question/Answering system](/docs/tutorials/pdf_qa/)

-### Specialized tasks
+## Specialized tasks
 - [Build an Extraction Chain](/docs/tutorials/extraction)
 - [Generate synthetic data](/docs/tutorials/data_generation)
 - [Classify text into labels](/docs/tutorials/classification)
 - [Summarize text](/docs/tutorials/summarization)

-### LangGraph
+## LangGraph

 LangGraph is an extension of LangChain aimed at
 building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
@@ -35,7 +35,7 @@ building robust and stateful multi-actor applications with LLMs by modeling step
 LangGraph documentation is currently hosted on a separate site.
 You can peruse [LangGraph tutorials here](https://langchain-ai.github.io/langgraph/tutorials/).

-### LangSmith
+## LangSmith

 LangSmith allows you to closely trace, monitor and evaluate your LLM application.
 It seamlessly integrates with LangChain, and you can use it to inspect and debug individual steps of your chains as you build.
@@ -43,4 +43,12 @@ It seamlessly integrates with LangChain, and you can use it to inspect and debug
 LangSmith documentation is hosted on a separate site.
 You can peruse [LangSmith tutorials here](https://docs.smith.langchain.com/tutorials/).

-For a longer list of tutorials, see our [cookbook section](https://github.com/langchain-ai/langchain/tree/master/cookbook).
+### Evaluation
+
+LangSmith helps you evaluate the performance of your LLM applications. The below tutorial is a great way to get started:
+
+- [Evaluate your LLM application](https://docs.smith.langchain.com/tutorials/Developers/evaluation)
+
+## More
+
+For more tutorials, see our [cookbook section](https://github.com/langchain-ai/langchain/tree/master/cookbook).
--- a/docs/static/img/langsmith_evaluate.png
+++ b/docs/static/img/langsmith_evaluate.png