mv docs extras (#11399)

pull/11491/head
Bagatur 9 months ago committed by GitHub
parent 53887242a1
commit 88ab69c288

@@ -10,7 +10,6 @@ cd "${SCRIPT_DIR}"
mkdir -p _dist/docs_skeleton
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
poetry run nbdoc_build
poetry run python generate_api_reference_links.py

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# Custom Pairwise Evaluator\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/comparison/custom.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/comparison/custom.ipynb)\n",
"\n",
"You can make your own pairwise string evaluators by inheriting from `PairwiseStringEvaluator` class and overwriting the `_evaluate_string_pairs` method (and the `_aevaluate_string_pairs` method if you want to use the evaluator asynchronously).\n",
"\n",

@@ -8,7 +8,7 @@
},
"source": [
"# Pairwise Embedding Distance \n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/comparison/pairwise_embedding_distance.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/comparison/pairwise_embedding_distance.ipynb)\n",
"\n",
"One way to measure the similarity (or dissimilarity) between two predictions on a shared or similar input is to embed the predictions and compute a vector distance between the two embeddings.<a name=\"cite_ref-1\"></a>[<sup>[1]</sup>](#cite_note-1)\n",
"\n",

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# Pairwise String Comparison\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/comparison/pairwise_string.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/comparison/pairwise_string.ipynb)\n",
"\n",
"Often you will want to compare predictions of an LLM, Chain, or Agent for a given input. The `StringComparison` evaluators facilitate this so you can answer questions like:\n",
"\n",

@@ -5,7 +5,7 @@
"metadata": {},
"source": [
"# Comparing Chain Outputs\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/examples/comparisons.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/examples/comparisons.ipynb)\n",
"\n",
"Suppose you have two different prompts (or LLMs). How do you know which will generate \"better\" results?\n",
"\n",

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# Criteria Evaluation\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/string/criteria_eval_chain.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/string/criteria_eval_chain.ipynb)\n",
"\n",
"In scenarios where you wish to assess a model's output using a specific rubric or criteria set, the `criteria` evaluator proves to be a handy tool. It allows you to verify if an LLM or Chain's output complies with a defined set of criteria.\n",
"\n",

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# Custom String Evaluator\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/string/custom.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/string/custom.ipynb)\n",
"\n",
"You can make your own custom string evaluators by inheriting from the `StringEvaluator` class and implementing the `_evaluate_strings` (and `_aevaluate_strings` for async support) methods.\n",
"\n",

@@ -7,7 +7,7 @@
},
"source": [
"# Embedding Distance\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/string/embedding_distance.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/string/embedding_distance.ipynb)\n",
"\n",
"To measure semantic similarity (or dissimilarity) between a prediction and a reference label string, you could use a vector vector distance metric the two embedded representations using the `embedding_distance` evaluator.<a name=\"cite_ref-1\"></a>[<sup>[1]</sup>](#cite_note-1)\n",
"\n",

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# Exact Match\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/string/exact_match.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/string/exact_match.ipynb)\n",
"\n",
"Probably the simplest ways to evaluate an LLM or runnable's string output against a reference label is by a simple string equivalence.\n",
"\n",

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# Regex Match\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/string/regex_match.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/string/regex_match.ipynb)\n",
"\n",
"To evaluate chain or runnable string predictions against a custom regex, you can use the `regex_match` evaluator."
]
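A short sketch of that usage, assuming the `regex_match` evaluator name; the reference string is treated as the pattern:

```python
from langchain.evaluation import load_evaluator

evaluator = load_evaluator("regex_match")

# The reference is interpreted as a regular expression.
result = evaluator.evaluate_strings(
    prediction="The delivery will be made on 2024-01-05",
    reference=r"\d{4}-\d{2}-\d{2}",
)
print(result)  # {'score': 1}
```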

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# String Distance\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/string/string_distance.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/string/string_distance.ipynb)\n",
"\n",
"One of the simplest ways to compare an LLM or chain's string output against a reference label is by using string distance measurements such as Levenshtein or postfix distance. This can be used alongside approximate/fuzzy matching criteria for very basic unit testing.\n",
"\n",

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"# Custom Trajectory Evaluator\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/trajectory/custom.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/trajectory/custom.ipynb)\n",
"\n",
"You can make your own custom trajectory evaluators by inheriting from the [AgentTrajectoryEvaluator](https://api.python.langchain.com/en/latest/evaluation/langchain.evaluation.schema.AgentTrajectoryEvaluator.html#langchain.evaluation.schema.AgentTrajectoryEvaluator) class and overwriting the `_evaluate_agent_trajectory` (and `_aevaluate_agent_action`) method.\n",
"\n",

@@ -8,7 +8,7 @@
},
"source": [
"# Agent Trajectory\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/evaluation/trajectory/trajectory_eval.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/evaluation/trajectory/trajectory_eval.ipynb)\n",
"\n",
"Agents can be difficult to holistically evaluate due to the breadth of actions and generation they can make. We recommend using multiple evaluation techniques appropriate to your use case. One way to evaluate an agent is to look at the whole trajectory of actions taken along with their responses.\n",
"\n",

@@ -8,7 +8,7 @@
},
"source": [
"# LangSmith Walkthrough\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/langsmith/walkthrough.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/langsmith/walkthrough.ipynb)\n",
"\n",
"LangChain makes it easy to prototype LLM applications and Agents. However, delivering LLM applications to production can be deceptively difficult. You will likely have to heavily customize and iterate on your prompts, chains, and other components to create a high-quality product.\n",
"\n",

@@ -6,7 +6,7 @@
"source": [
"# Data anonymization with Microsoft Presidio\n",
"\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/privacy/presidio_data_anonymization/index.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/privacy/presidio_data_anonymization/index.ipynb)\n",
"\n",
"## Use case\n",
"\n",

@@ -6,7 +6,7 @@
"source": [
"# Mutli-language data anonymization with Microsoft Presidio\n",
"\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/privacy/presidio_data_anonymization/multi_language.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/privacy/presidio_data_anonymization/multi_language.ipynb)\n",
"\n",
"\n",
"## Use case\n",

@@ -6,7 +6,7 @@
"source": [
"# Reversible data anonymization with Microsoft Presidio\n",
"\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/extras/guides/privacy/presidio_data_anonymization/reversible.ipynb)\n",
"[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs_skeleton/docs/guides/privacy/presidio_data_anonymization/reversible.ipynb)\n",
"\n",
"\n",
"## Use case\n",

@@ -22,16 +22,6 @@
{
"cell_type": "code",
"execution_count": null,
"id": "b39ac41a",
"metadata": {},
"outputs": [],
"source": [
"%pip install -U langchain"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3f8518ad-c762-413c-b8c9-f1c211fc311d",
"metadata": {
"tags": []
@@ -53,7 +43,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": null,
"id": "74550d74-3c01-4ba7-ad32-ca66d955d001",
"metadata": {
"tags": []
@@ -117,8 +107,7 @@
"\n",
"responses = [\n",
" \"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.\", \n",
" # replace with your own expletive\n",
" \"Final Answer: This is a really <expletive> way of constructing a birdhouse. This is <expletive> insane to think that any birds would actually create their <expletive> nests here.\"\n",
" \"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here.\"\n",
"]\n",
"llm = FakeListLLM(responses=responses)\n",
"\n",
@@ -134,9 +123,9 @@
")\n",
"\n",
"try:\n",
" response = chain.invoke({\"question\": \"A sample SSN number looks like this . Can you give me some more samples?\"})\n",
" response = chain.invoke({\"question\": \"A sample SSN number looks like this 123-456-7890. Can you give me some more samples?\"})\n",
"except ModerationPiiError as e:\n",
" print(str(e))\n",
" print(e.message)\n",
"else:\n",
" print(response['output'])\n"
]
@@ -166,36 +155,36 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"id": "d6e8900a-44ef-4967-bde8-b88af282139d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain_experimental.comprehend_moderation import (BaseModerationConfig, \n",
" ModerationIntentConfig, \n",
" ModerationPiiConfig, \n",
" ModerationToxicityConfig\n",
")\n",
"\n",
"pii_config = ModerationPiiConfig(\n",
" labels=[\"SSN\"],\n",
" redact=True,\n",
" mask_character=\"X\"\n",
")\n",
"\n",
"toxicity_config = ModerationToxicityConfig(\n",
" threshold=0.5\n",
")\n",
"\n",
"intent_config = ModerationIntentConfig(\n",
" threshold=0.5\n",
")\n",
"\n",
"moderation_config = BaseModerationConfig(\n",
" filters=[pii_config, toxicity_config, intent_config]\n",
")"
"from langchain_experimental.comprehend_moderation import BaseModerationActions, BaseModerationFilters\n",
"\n",
"moderation_config = { \n",
" \"filters\":[ \n",
" BaseModerationFilters.PII, \n",
" BaseModerationFilters.TOXICITY,\n",
" BaseModerationFilters.INTENT\n",
" ],\n",
" \"pii\":{ \n",
" \"action\": BaseModerationActions.ALLOW, \n",
" \"threshold\":0.5, \n",
" \"labels\":[\"SSN\"],\n",
" \"mask_character\": \"X\"\n",
" },\n",
" \"toxicity\":{ \n",
" \"action\": BaseModerationActions.STOP, \n",
" \"threshold\":0.5\n",
" },\n",
" \"intent\":{ \n",
" \"action\": BaseModerationActions.STOP, \n",
" \"threshold\":0.5\n",
" }\n",
"}"
]
},
{
@@ -203,20 +192,16 @@
"id": "3634376b-5938-43df-9ed6-70ca7e99290f",
"metadata": {},
"source": [
"At the core of the the configuration there are three configuration models to be used\n",
"At the core of the configuration you have three filters specified in the `filters` key:\n",
"\n",
"1. `BaseModerationFilters.PII`\n",
"2. `BaseModerationFilters.TOXICITY`\n",
"3. `BaseModerationFilters.INTENT`\n",
"\n",
"- `ModerationPiiConfig` used for configuring the behavior of the PII validations. Following are the parameters it can be initialized with\n",
" - `labels` the PII entity labels. Defaults to an empty list which means that the PII validation will consider all PII entities.\n",
" - `threshold` the confidence threshold for the detected entities, defaults to 0.5 or 50%\n",
" - `redact` a boolean flag to enforce whether redaction should be performed on the text, defaults to `False`. When `False`, the PII validation will error out when it detects any PII entity, when set to `True` it simply redacts the PII values in the text.\n",
" - `mask_character` the character used for masking, defaults to asterisk (*)\n",
"- `ModerationToxicityConfig` used for configuring the behavior of the toxicity validations. Following are the parameters it can be initialized with\n",
" - `labels` the Toxic entity labels. Defaults to an empty list which means that the toxicity validation will consider all toxic entities. all\n",
" - `threshold` the confidence threshold for the detected entities, defaults to 0.5 or 50% \n",
"- `ModerationIntentConfig` used for configuring the behavior of the intent validation\n",
" - `threshold` the confidence threshold for the the intent classification, defaults to 0.5 or 50% \n",
"And an `action` key that defines two possible actions for each moderation function:\n",
"\n",
"Finally, you use the `BaseModerationConfig` to define the order in which each of these checks are to be performed. The `BaseModerationConfig` takes an optional `filters` parameter which can be a list of one or more than one of the above validation checks, as seen in the previous code block. The `BaseModerationConfig` can also be initialized with any `filters` in which case it will use all the checks with default configuration (more on this explained later).\n",
"1. `BaseModerationActions.ALLOW` - `allows` the prompt to pass through but masks detected PII in case of PII check. The default behavior is to run and redact all PII entities. If there is an entity specified in the `labels` field, then only those entities will go through the PII check and masked.\n",
"2. `BaseModerationActions.STOP` - `stops` the prompt from passing through to the next step in case any PII, Toxicity, or incorrect Intent is detected. The action of `BaseModerationActions.STOP` will raise a Python `Exception` essentially stopping the chain in progress.\n",
"\n",
"Using the configuration in the previous cell will perform PII checks and will allow the prompt to pass through however it will mask any SSN numbers present in either the prompt or the LLM output.\n"
]
@@ -254,8 +239,7 @@
"\n",
"responses = [\n",
" \"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.\", \n",
" # replace with your own expletive\n",
" \"Final Answer: This is a really <expletive> way of constructing a birdhouse. This is <expletive> insane to think that any birds would actually create their <expletive> nests here.\"\n",
" \"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here.\"\n",
"]\n",
"llm = FakeListLLM(responses=responses)\n",
"\n",
@@ -380,19 +364,22 @@
},
"outputs": [],
"source": [
"pii_config = ModerationPiiConfig(\n",
" labels=[\"SSN\"],\n",
" redact=True,\n",
" mask_character=\"X\"\n",
")\n",
"\n",
"toxicity_config = ModerationToxicityConfig(\n",
" threshold=0.5\n",
")\n",
"\n",
"moderation_config = BaseModerationConfig(\n",
" filters=[pii_config, toxicity_config]\n",
")\n",
"moderation_config = { \n",
" \"filters\": [ \n",
" BaseModerationFilters.PII, \n",
" BaseModerationFilters.TOXICITY\n",
" ],\n",
" \"pii\":{ \n",
" \"action\": BaseModerationActions.STOP, \n",
" \"threshold\":0.5, \n",
" \"labels\":[\"SSN\"], \n",
" \"mask_character\": \"X\" \n",
" },\n",
" \"toxicity\":{ \n",
" \"action\": BaseModerationActions.STOP, \n",
" \"threshold\":0.5 \n",
" }\n",
"}\n",
"\n",
"comp_moderation_with_config = AmazonComprehendModerationChain(\n",
" moderation_config=moderation_config, # specify the configuration\n",
@@ -423,8 +410,7 @@
"\n",
"responses = [\n",
" \"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.\", \n",
" # replace with your own expletive\n",
" \"Final Answer: This is a really <expletive> way of constructing a birdhouse. This is <expletive> insane to think that any birds would actually create their <expletive> nests here.\"\n",
" \"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here.\"\n",
"]\n",
"\n",
"llm = FakeListLLM(responses=responses)\n",
@@ -458,7 +444,7 @@
"## `moderation_config` and moderation execution order\n",
"---\n",
"\n",
"If `AmazonComprehendModerationChain` is not initialized with any `moderation_config` then it is initialized with the default values of `BaseModerationConfig`. If no `filters` are used then the sequence of moderation check is as follows.\n",
"If `AmazonComprehendModerationChain` is not initialized with any `moderation_config` then the default action is `STOP` and default order of moderation check is as follows.\n",
"\n",
"```\n",
"AmazonComprehendModerationChain\n",
@@ -478,25 +464,32 @@
" └── Return Prompt\n",
"```\n",
"\n",
"If any of the check raises a validation exception then the subsequent checks will not be performed. If a `callback` is provided in this case, then it will be called for each of the checks that have been performed. For example, in the case above, if the Chain fails due to presence of PII then the Toxicity and Intent checks will not be performed.\n",
"If any of the check raises exception then the subsequent checks will not be performed. If a `callback` is provided in this case, then it will be called for each of the checks that have been performed. For example, in the case above, if the Chain fails due to presence of PII then the Toxicity and Intent checks will not be performed.\n",
"\n",
"You can override the execution order by passing `moderation_config` and simply specifying the desired order in the `filters` parameter of the `BaseModerationConfig`. In case you specify the filters, then the order of the checks as specified in the `filters` parameter will be maintained. For example, in the configuration below, first Toxicity check will be performed, then PII, and finally Intent validation will be performed. In this case, `AmazonComprehendModerationChain` will perform the desired checks in the specified order with default values of each model `kwargs`.\n",
"You can override the execution order by passing `moderation_config` and simply specifying the desired order in the `filters` key of the configuration. In case you use `moderation_config` then the order of the checks as specified in the `filters` key will be maintained. For example, in the configuration below, first Toxicity check will be performed, then PII, and finally Intent validation will be performed. In this case, `AmazonComprehendModerationChain` will perform the desired checks in the specified order with default values of each model `kwargs`.\n",
"\n",
"```python\n",
"pii_check = ModerationPiiConfig()\n",
"toxicity_check = ModerationToxicityConfig()\n",
"intent_check = ModerationIntentConfig()\n",
"\n",
"moderation_config = BaseModerationConfig(filters=[toxicity_check, pii_check, intent_check])\n",
"moderation_config = { \n",
" \"filters\":[ BaseModerationFilters.TOXICITY, \n",
" BaseModerationFilters.PII, \n",
" BaseModerationFilters.INTENT]\n",
" }\n",
"```\n",
"\n",
"You can have also use more than one configuration for a specific moderation check, for example in the sample below, two consecutive PII checks are performed. First the configuration checks for any SSN, if found it would raise an error. If any SSN isn't found then it will next check if any NAME and CREDIT_DEBIT_NUMBER is present in the prompt and will mask it.\n",
"Model `kwargs` are specified by the `pii`, `toxicity`, and `intent` keys within the `moderation_config` dictionary. For example, in the `moderation_config` below, the default order of moderation is overriden and the `pii` & `toxicity` model `kwargs` have been overriden. For `intent` the chain's default `kwargs` will be used.\n",
"\n",
"```python\n",
"pii_check_1 = ModerationPiiConfig(labels=[\"SSN\"])\n",
"pii_check_2 = ModerationPiiConfig(labels=[\"NAME\", \"CREDIT_DEBIT_NUMBER\"], redact=True)\n",
"\n",
"moderation_config = BaseModerationConfig(filters=[pii_check_1, pii_check_2])\n",
" moderation_config = { \n",
" \"filters\":[ BaseModerationFilters.TOXICITY, \n",
" BaseModerationFilters.PII, \n",
" BaseModerationFilters.INTENT],\n",
" \"pii\":{ \"action\": BaseModerationActions.ALLOW, \n",
" \"threshold\":0.5, \n",
" \"labels\":[\"SSN\"], \n",
" \"mask_character\": \"X\" },\n",
" \"toxicity\":{ \"action\": BaseModerationActions.STOP, \n",
" \"threshold\":0.5 }\n",
" }\n",
"```\n",
"\n",
"1. For a list of PII labels see Amazon Comprehend Universal PII entity types - https://docs.aws.amazon.com/comprehend/latest/dg/how-pii.html#how-pii-types\n",
@@ -519,9 +512,9 @@
"# Examples\n",
"---\n",
"\n",
"## With HuggingFace Hub Models\n",
"## With Hugging Face Hub Models\n",
"\n",
"Get your API Key from Huggingface hub - https://huggingface.co/docs/api-inference/quicktour#get-your-api-token"
"Get your API Key from Hugging Face hub - https://huggingface.co/docs/api-inference/quicktour#get-your-api-token"
]
},
{
@@ -546,8 +539,7 @@
},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"HUGGINGFACEHUB_API_TOKEN\"] = \"<YOUR HF TOKEN HERE>\""
"%env HUGGINGFACEHUB_API_TOKEN=\"<HUGGINGFACEHUB_API_TOKEN>\""
]
},
{
@@ -560,7 +552,7 @@
"outputs": [],
"source": [
"# See https://huggingface.co/models?pipeline_tag=text-generation&sort=downloads for some other options\n",
"repo_id = \"google/flan-t5-xxl\" "
"repo_id = \"google/flan-t5-xxl\" \n"
]
},
{
@@ -575,9 +567,12 @@
"from langchain.llms import HuggingFaceHub\n",
"from langchain.prompts import PromptTemplate\nfrom langchain.chains import LLMChain\n",
"\n",
"template = \"\"\"Question: {question}\"\"\"\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer:\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"\n",
"llm = HuggingFaceHub(\n",
" repo_id=repo_id, model_kwargs={\"temperature\": 0.5, \"max_length\": 256}\n",
")\n",
@@ -601,32 +596,22 @@
},
"outputs": [],
"source": [
"pii_config = ModerationPiiConfig(\n",
" labels=[\"SSN\", \"CREDIT_DEBIT_NUMBER\"],\n",
" redact=True,\n",
" mask_character=\"X\"\n",
")\n",
"\n",
"toxicity_config = ModerationToxicityConfig(\n",
" threshold=0.5\n",
")\n",
"\n",
"intent_config = ModerationIntentConfig(\n",
" threshold=0.8\n",
")\n",
"\n",
"moderation_config = BaseModerationConfig(\n",
" filters=[pii_config, toxicity_config, intent_config]\n",
")\n",
"# with callback\n",
"moderation_config = { \n",
" \"filters\":[ BaseModerationFilters.PII, BaseModerationFilters.TOXICITY, BaseModerationFilters.INTENT ],\n",
" \"pii\":{\"action\": BaseModerationActions.ALLOW, \"threshold\":0.5, \"labels\":[\"SSN\",\"CREDIT_DEBIT_NUMBER\"], \"mask_character\": \"X\"},\n",
" \"toxicity\":{\"action\": BaseModerationActions.STOP, \"threshold\":0.5},\n",
" \"intent\":{\"action\": BaseModerationActions.ALLOW, \"threshold\":0.5,},\n",
" }\n",
"\n",
"# without any callback\n",
"amazon_comp_moderation = AmazonComprehendModerationChain(moderation_config=moderation_config, \n",
" client=comprehend_client,\n",
" moderation_callback=my_callback,\n",
" verbose=True)\n",
"\n",
"# without callback\n",
"# with callback\n",
"amazon_comp_moderation_out = AmazonComprehendModerationChain(moderation_config=moderation_config, \n",
" client=comprehend_client,\n",
" moderation_callback=my_callback,\n",
" verbose=True)"
]
},
@@ -657,10 +642,7 @@
")\n",
"\n",
"try:\n",
" response = chain.invoke({\"question\": \"\"\"What is John Doe's address, phone number and SSN from the following text?\n",
"\n",
"John Doe, a resident of 1234 Elm Street in Springfield, recently celebrated his birthday on January 1st. Turning 43 this year, John reflected on the years gone by. He often shares memories of his younger days with his close friends through calls on his phone, (555) 123-4567. Meanwhile, during a casual evening, he received an email at johndoe@example.com reminding him of an old acquaintance's reunion. As he navigated through some old documents, he stumbled upon a paper that listed his SSN as 123-45-6789, reminding him to store it in a safer place.\n",
"\"\"\"})\n",
" response = chain.invoke({\"question\": \"My AnyCompany Financial Services, LLC credit card account 1111-0000-1111-0008 has 24$ due by July 31st. Can you give me some more credit car number samples?\"})\n",
"except Exception as e:\n",
" print(str(e))\n",
"else:\n",
@@ -753,26 +735,15 @@
},
"outputs": [],
"source": [
"pii_config = ModerationPiiConfig(\n",
" labels=[\"SSN\"],\n",
" redact=True,\n",
" mask_character=\"X\"\n",
")\n",
"\n",
"toxicity_config = ModerationToxicityConfig(\n",
" threshold=0.5\n",
")\n",
"\n",
"intent_config = ModerationIntentConfig(\n",
" threshold=0.8\n",
")\n",
"\n",
"moderation_config = BaseModerationConfig(\n",
" filters=[pii_config, toxicity_config, intent_config]\n",
")\n",
"moderation_config = { \n",
" \"filters\":[ BaseModerationFilters.PII, BaseModerationFilters.TOXICITY ],\n",
" \"pii\":{\"action\": BaseModerationActions.ALLOW, \"threshold\":0.5, \"labels\":[\"SSN\"], \"mask_character\": \"X\"},\n",
" \"toxicity\":{\"action\": BaseModerationActions.STOP, \"threshold\":0.5},\n",
" \"intent\":{\"action\": BaseModerationActions.ALLOW, \"threshold\":0.5,},\n",
" }\n",
"\n",
"amazon_comp_moderation = AmazonComprehendModerationChain(moderation_config=moderation_config, \n",
" client=comprehend_client,\n",
" client=comprehend_client ,\n",
" verbose=True)"
]
},
@@ -803,10 +774,7 @@
")\n",
"\n",
"try:\n",
" response = chain.invoke({\"question\": \"\"\"What is John Doe's address, phone number and SSN from the following text?\n",
"\n",
"John Doe, a resident of 1234 Elm Street in Springfield, recently celebrated his birthday on January 1st. Turning 43 this year, John reflected on the years gone by. He often shares memories of his younger days with his close friends through calls on his phone, (555) 123-4567. Meanwhile, during a casual evening, he received an email at johndoe@example.com reminding him of an old acquaintance's reunion. As he navigated through some old documents, he stumbled upon a paper that listed his SSN as 123-45-6789, reminding him to store it in a safer place.\n",
"\"\"\"})\n",
" response = chain.invoke({\"question\": \"My AnyCompany Financial Services, LLC credit card account 1111-0000-1111-0008 has 24$ due by July 31st. Can you give me some more samples?\"})\n",
"except Exception as e:\n",
" print(str(e))\n",
"else:\n",
