extraction docs (#1898)

2 years ago · 1f93c5cf69
parent 15b5a08f4b
commit 1f93c5cf69
2 changed files with 21 additions and 0 deletions
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -120,6 +120,7 @@ The above modules can be used in a variety of ways. LangChain also provides guid
   ./use_cases/question_answering.md
   ./use_cases/summarization.md
   ./use_cases/tabular.rst
+   ./use_cases/extraction.md
   ./use_cases/evaluation.rst
   ./use_cases/model_laboratory.ipynb

--- a/docs/use_cases/extraction.md
+++ b/docs/use_cases/extraction.md
@@ -0,0 +1,20 @@
+# Extraction
+
+Most APIs and databases still deal with structured information.
+To work with them, it is often useful to extract structured information from text.
+Examples of this include:
+
+- Extracting a structured row to insert into a database from a sentence
+- Extracting multiple rows to insert into a database from a long document
+- Extracting the correct API parameters from a user query
+
+This work is closely related to [output parsing](../modules/prompts/examples/output_parsers.ipynb).
+Output parsers are responsible for instructing the LLM to respond in a specific format.
+In this case, the output parser specifies the format of the data you would like to extract from the document.
+In addition to the output format instructions, the prompt should also contain the text you would like to extract information from.
+
+While normal output parsers are good enough for basic structuring of response data,
+when doing extraction you often need more complicated or nested structures.
+For a deep dive on extraction, we recommend checking out [`kor`](https://eyurtsev.github.io/kor/),
+a library that uses the existing LangChain chain and OutputParser abstractions
+but focuses on extracting more complicated schemas.
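
As a rough illustration of the pattern the new page describes (format instructions plus the source text in one prompt, then parsing the reply into rows), here is a minimal sketch in plain Python. It deliberately does not use LangChain's or `kor`'s actual API: the `Person` schema, `FORMAT_INSTRUCTIONS`, `build_prompt`, `parse_people`, `extract_people`, and `fake_llm` are all hypothetical names invented for this example, and `fake_llm` is a canned stand-in for a real model call.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Person:
    """One structured row to extract (hypothetical schema)."""
    name: str
    age: int


# Hypothetical format instructions. They play the role an output parser
# plays: telling the model what shape the answer must have.
FORMAT_INSTRUCTIONS = (
    "Respond with only a JSON array of objects, each with the keys "
    '"name" (string) and "age" (integer).'
)


def build_prompt(text: str) -> str:
    """Combine the format instructions with the text to extract from."""
    return (
        "Extract every person mentioned in the text below.\n"
        f"{FORMAT_INSTRUCTIONS}\n\n"
        f"Text: {text}"
    )


def parse_people(response: str) -> List[Person]:
    """Parse the model's reply into rows, failing loudly on a bad shape."""
    rows = json.loads(response)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array")
    return [Person(name=str(r["name"]), age=int(r["age"])) for r in rows]


def extract_people(text: str, llm: Callable[[str], str]) -> List[Person]:
    """Run the extraction: prompt the model, then parse its reply."""
    return parse_people(llm(build_prompt(text)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns a canned reply."""
    return '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]'


people = extract_people("Alice is 30 and Bob is 25.", fake_llm)
```

Passing the model in as a plain callable keeps the sketch testable without network access; in practice the same two pieces (instructions in the prompt, a parser on the reply) would be supplied by an output parser.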