"The tagging chain uses the OpenAI `functions` parameter to specify a schema to tag a document with. This helps us make sure that the model outputs exactly the tags that we want, with their appropriate types.\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/use_cases/tagging.ipynb)\n",
"\n",
"The tagging chain is useful when we want to tag a passage with a specific attribute (e.g. what is the sentiment of this message?)"
"## Use case\n",
"\n",
"Tagging means labeling a document with classes such as:\n",
"\n",
"- sentiment\n",
"- language\n",
"- style (formal, informal etc.)\n",
"- covered topics\n",
"- political tendency\n",
"\n",
"![Image description](/img/tagging.png)\n",
"\n",
"## Overview\n",
"\n",
"Tagging has a few components:\n",
"\n",
"* `function`: Like [extraction](/docs/use_cases/extraction), tagging uses [functions](https://openai.com/blog/function-calling-and-other-api-updates) to specify how the model should tag a document\n",
"* `schema`: defines how we want to tag the document\n",
"\n",
"## Quickstart\n",
"\n",
"Let's see a very straightforward example of how we can use OpenAI functions for tagging in LangChain."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "bafb496a",
"execution_count": null,
"id": "dc5cbb6f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.4) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
"As the examples show, the chain correctly interprets what we want, but the results vary: for example, sentiments come back in different languages ('positive', 'enojado', etc.).\n",
"}\n",
"\n",
"We will see how to control these results in the next section."
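To make the idea of controlling results concrete, here is a minimal sketch in plain Python, assuming a JSON-schema-style dictionary with `enum` constraints. The field names, the allowed values, and the `find_violations` helper are hypothetical and are not part of LangChain or the OpenAI API; the helper only illustrates what conforming to the schema means for a tagging result.

```python
# Hypothetical schema in the JSON-schema style used for OpenAI functions.
# The field names and allowed values are illustrative assumptions.
schema = {
    "properties": {
        "sentiment": {"type": "string", "enum": ["happy", "neutral", "sad"]},
        "language": {"type": "string", "enum": ["spanish", "english", "french"]},
    },
    "required": ["sentiment", "language"],
}


def find_violations(tags: dict, schema: dict) -> list:
    """Return a list of problems with `tags`; an empty list means it conforms.

    Hypothetical helper, not a LangChain function: it only checks
    required keys and enum membership.
    """
    problems = []
    for name in schema.get("required", []):
        if name not in tags:
            problems.append(f"missing required tag: {name}")
    for name, value in tags.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected tag: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}={value!r} not in {spec['enum']}")
    return problems


# An uncontrolled result (sentiment in another language) versus a controlled one.
uncontrolled = {"sentiment": "enojado", "language": "spanish"}
controlled = {"sentiment": "sad", "language": "spanish"}
```

With a schema like this, `find_violations(uncontrolled, schema)` reports the out-of-vocabulary sentiment, while the controlled result passes. Listing the allowed values in the schema is what keeps the tags to a fixed vocabulary.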
"The [LangSmith trace](https://smith.langchain.com/public/311e663a-bbe8-4053-843e-5735055c032d/r) lets us peek under the hood:\n",
"\n",
"* As with [extraction](/docs/use_cases/extraction), we call the `information_extraction` function [here](https://github.com/langchain-ai/langchain/blob/269f85b7b7ffd74b38cd422d4164fc033388c3d0/libs/langchain/langchain/chains/openai_functions/extraction.py#L20) on the input string.\n",
"* This OpenAI function extracts information based on the provided schema.\n",
"\n",
"![Image description](/img/tagging_trace.png)"
]
},
{
"cell_type": "markdown",
"id": "e68ad17e",
"metadata": {},
"source": [
"## Specifying schema with Pydantic"
"## Pydantic"
]
},
{
@ -304,11 +301,11 @@
"id": "2f5970ec",
"metadata": {},
"source": [
"We can also use a Pydantic schema to specify the required properties and types. We can also send other arguments, such as 'enum' or 'description' as can be seen in the example below.\n",
"We can also use a Pydantic schema to specify the required properties and types. \n",
"\n",
"By using the `create_tagging_chain_pydantic` function, we can send a Pydantic schema as input and the output will be an instantiated object that respects our desired schema. \n",
"We can also send other arguments, such as `enum` or `description`, to each field.\n",
"\n",
"In this way, we can specify our schema in the same manner that we would define a new class or function in Python - with purely Pythonic types."
"This lets us specify our schema in the same way that we would define a new class or function in Python, using purely Pythonic types."
"* You can use the [metadata tagger](https://python.langchain.com/docs/integrations/document_transformers/openai_metadata_tagger) document transformer to extract metadata from a LangChain `Document`. \n",
"* This covers the same basic functionality as the tagging chain, only applied to a LangChain `Document`."