{ "cells": [ { "cell_type": "markdown", "id": "25a3f834-60b7-4c21-bfb4-ad16d30fd3f7", "metadata": {}, "source": [ "# Amazon Comprehend Moderation Chain\n", "---" ] }, { "cell_type": "code", "execution_count": null, "id": "2c4236d8-4054-473d-84a4-87a4db278a62", "metadata": {}, "outputs": [], "source": [ "%pip install boto3 nltk" ] }, { "cell_type": "code", "execution_count": null, "id": "3f8518ad-c762-413c-b8c9-f1c211fc311d", "metadata": { "tags": [] }, "outputs": [], "source": [ "import boto3\n", "\n", "comprehend_client = boto3.client('comprehend', region_name='us-east-1')" ] }, { "cell_type": "markdown", "id": "d1f0ba28", "metadata": {}, "source": [ "Import `AmazonComprehendModerationChain`" ] }, { "cell_type": "code", "execution_count": null, "id": "74550d74-3c01-4ba7-ad32-ca66d955d001", "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain_experimental.comprehend_moderation import AmazonComprehendModerationChain" ] }, { "cell_type": "markdown", "id": "f00c338b-de9f-40e5-9295-93c9e26058e3", "metadata": {}, "source": [ "Initialize an instance of the Amazon Comprehend Moderation Chain to be used with your LLM chain" ] }, { "cell_type": "code", "execution_count": null, "id": "cde58cc6-ff83-493a-9aed-93d755f984a7", "metadata": { "tags": [] }, "outputs": [], "source": [ "comprehend_moderation = AmazonComprehendModerationChain(\n", " client=comprehend_client, #optional\n", " verbose=True\n", ")" ] }, { "cell_type": "markdown", "id": "ad646d01-82d2-435a-939b-c450693857ab", "metadata": {}, "source": [ "Using it with your LLM chain. \n", "\n", "**Note**: The example below uses the _Fake LLM_ from LangChain, but same concept could be applied to other LLMs." 
] }, { "cell_type": "code", "execution_count": null, "id": "0efa1946-d4a9-467a-920a-a8fb78720fc2", "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain import PromptTemplate, LLMChain\n", "from langchain.llms.fake import FakeListLLM\n", "from langchain_experimental.comprehend_moderation.base_moderation_exceptions import ModerationPiiError\n", "\n", "template = \"\"\"Question: {question}\n", "\n", "Answer:\"\"\"\n", "\n", "prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n", "\n", "responses = [\n", " \"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.\", \n", " \"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here.\"\n", "]\n", "llm = FakeListLLM(responses=responses)\n", "\n", "llm_chain = LLMChain(prompt=prompt, llm=llm)\n", "\n", "chain = (\n", " prompt \n", " | comprehend_moderation \n", " | {llm_chain.input_keys[0]: lambda x: x['output'] } \n", " | llm_chain \n", " | { \"input\": lambda x: x['text'] } \n", " | comprehend_moderation \n", ")\n", "\n", "try:\n", " response = chain.invoke({\"question\": \"A sample SSN number looks like this 123-456-7890. Can you give me some more samples?\"})\n", "except ModerationPiiError as e:\n", " print(e.message)\n", "else:\n", " print(response['output'])\n" ] }, { "cell_type": "markdown", "id": "6da25d96-0d96-4c01-94ae-a2ead17f10aa", "metadata": {}, "source": [ "## Using `moderation_config` to customize your moderation\n", "---" ] }, { "cell_type": "markdown", "id": "bfd550e7-5012-41fa-9546-8b78ddf1c673", "metadata": {}, "source": [ "Use Amazon Comprehend Moderation with a configuration to control what moderations you wish to perform and what actions should be taken for each of them. 
There are three moderations that happen by default when no configuration is passed, as demonstrated above:\n", "\n", "- PII (Personally Identifiable Information) checks\n", "- Toxic content detection\n", "- Intent detection\n", "\n", "Here is an example of a moderation config:" ] }, { "cell_type": "code", "execution_count": null, "id": "d6e8900a-44ef-4967-bde8-b88af282139d", "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain_experimental.comprehend_moderation import BaseModerationActions, BaseModerationFilters\n", "\n", "moderation_config = {\n", "    \"filters\": [\n", "        BaseModerationFilters.PII,\n", "        BaseModerationFilters.TOXICITY,\n", "        BaseModerationFilters.INTENT\n", "    ],\n", "    \"pii\": {\n", "        \"action\": BaseModerationActions.ALLOW,\n", "        \"threshold\": 0.5,\n", "        \"labels\": [\"SSN\"],\n", "        \"mask_character\": \"X\"\n", "    },\n", "    \"toxicity\": {\n", "        \"action\": BaseModerationActions.STOP,\n", "        \"threshold\": 0.5\n", "    },\n", "    \"intent\": {\n", "        \"action\": BaseModerationActions.STOP,\n", "        \"threshold\": 0.5\n", "    }\n", "}" ] }, { "cell_type": "markdown", "id": "3634376b-5938-43df-9ed6-70ca7e99290f", "metadata": {}, "source": [ "At the core of the configuration are the three filters specified in the `filters` key:\n", "\n", "1. `BaseModerationFilters.PII`\n", "2. `BaseModerationFilters.TOXICITY`\n", "3. `BaseModerationFilters.INTENT`\n", "\n", "Each moderation function then has an `action` key that defines one of two possible actions:\n", "\n", "1. `BaseModerationActions.ALLOW` - allows the prompt to pass through, but masks detected PII in the case of the PII check. The default behavior is to check and redact all PII entities; if entities are specified in the `labels` field, only those entities are checked and masked.\n", "2. `BaseModerationActions.STOP` - stops the prompt from passing through to the next step if any PII, toxic content, or incorrect intent is detected. 
The `BaseModerationActions.STOP` action raises a Python `Exception`, stopping the chain in progress.\n", "\n", "Using the configuration in the previous cell will perform PII checks and allow the prompt to pass through; however, it will mask any SSNs present in either the prompt or the LLM output.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3a4f7e65-f733-4863-ae6d-34c9faffd849", "metadata": { "tags": [] }, "outputs": [], "source": [ "comp_moderation_with_config = AmazonComprehendModerationChain(\n", "    moderation_config=moderation_config,  # specify the configuration\n", "    client=comprehend_client,  # optionally pass the boto3 client\n", "    verbose=True\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "a25e6f93-765b-4f99-8c1c-929157dbd4aa", "metadata": { "tags": [] }, "outputs": [], "source": [ "template = \"\"\"Question: {question}\n", "\n", "Answer:\"\"\"\n", "\n", "prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n", "\n", "responses = [\n", "    \"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.\",\n", "    \"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here.\"\n", "]\n", "llm = FakeListLLM(responses=responses)\n", "\n", "llm_chain = LLMChain(prompt=prompt, llm=llm)\n", "\n", "chain = (\n", "    prompt\n", "    | comp_moderation_with_config\n", "    | {llm_chain.input_keys[0]: lambda x: x['output']}\n", "    | llm_chain\n", "    | {\"input\": lambda x: x['text']}\n", "    | comp_moderation_with_config\n", ")\n", "\n", "try:\n", "    response = chain.invoke({\"question\": \"A sample SSN number looks like this 123-456-7890. 
Can you give me some more samples?\"})\n", "except Exception as e:\n", "    print(str(e))\n", "else:\n", "    print(response['output'])" ] }, { "cell_type": "markdown", "id": "ba890681-feeb-43ca-a0d5-9c11d2d9de3e", "metadata": { "tags": [] }, "source": [ "## Unique ID and Moderation Callbacks\n", "---\n", "\n", "When an Amazon Comprehend moderation action is specified as `STOP`, the chain will raise one of the following exceptions:\n", "\n", "- `ModerationPiiError` for PII checks\n", "- `ModerationToxicityError` for Toxicity checks\n", "- `ModerationIntentionError` for Intent checks\n", "\n", "In addition to the moderation configuration, the `AmazonComprehendModerationChain` can also be initialized with the following parameters:\n", "\n", "- `unique_id` [Optional] a string parameter that can be used to pass any string value or ID. For example, in a chat application you may want to keep track of abusive users; in that case you can pass the user's username, email ID, etc. Defaults to `None`.\n", "\n", "- `moderation_callback` [Optional] the `BaseModerationCallbackHandler` that will be called asynchronously (non-blocking to the chain). Callback functions are useful when you want to perform additional actions when the moderation functions are executed, for example logging to a database or writing to a log file. You can override three functions by subclassing `BaseModerationCallbackHandler` - `on_after_pii()`, `on_after_toxicity()`, and `on_after_intent()`. Note that all three functions must be `async` functions. These callback functions receive two arguments:\n", "  - `moderation_beacon` a dictionary that contains information about the moderation function, the full response from the Amazon Comprehend model, a unique chain ID, the moderation status, and the input string that was validated. 
The dictionary has the following schema:\n", "\n", "    ```\n", "    {\n", "        'moderation_chain_id': 'xxx-xxx-xxx',  # unique chain ID\n", "        'moderation_type': 'Toxicity' | 'PII' | 'Intent',\n", "        'moderation_status': 'LABELS_FOUND' | 'LABELS_NOT_FOUND',\n", "        'moderation_input': 'A sample SSN number looks like this 123-456-7890. Can you give me some more samples?',\n", "        'moderation_output': {...}  # full Amazon Comprehend PII, Toxicity, or Intent model output\n", "    }\n", "    ```\n", "\n", "  - `unique_id`, if it was passed to the `AmazonComprehendModerationChain`" ] }, { "cell_type": "markdown", "id": "3c178835-0264-4ac6-aef4-091d2993d06c", "metadata": {}, "source": [ "
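For example, you can subclass `BaseModerationCallbackHandler` to log every moderation result. This is a minimal sketch; the `ModerationLogger` name and the email value are illustrative, and the exact method signatures should be checked against your installed version of `langchain_experimental`:\n", "\n", "```python\n", "from langchain_experimental.comprehend_moderation import BaseModerationCallbackHandler\n", "\n", "class ModerationLogger(BaseModerationCallbackHandler):\n", "    async def on_after_pii(self, moderation_beacon, unique_id, **kwargs):\n", "        # e.g. write to a database or a log file instead of printing\n", "        print(f\"PII check for {unique_id}: {moderation_beacon['moderation_status']}\")\n", "\n", "    async def on_after_toxicity(self, moderation_beacon, unique_id, **kwargs):\n", "        print(f\"Toxicity check for {unique_id}: {moderation_beacon['moderation_status']}\")\n", "\n", "    async def on_after_intent(self, moderation_beacon, unique_id, **kwargs):\n", "        print(f\"Intent check for {unique_id}: {moderation_beacon['moderation_status']}\")\n", "\n", "comprehend_moderation = AmazonComprehendModerationChain(\n", "    moderation_callback=ModerationLogger(),\n", "    unique_id='john.doe@email.com',\n", "    client=comprehend_client,\n", "    verbose=True\n", ")\n", "```\n", "\n", "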
`moderation_callback` is different from LangChain Chain callbacks. You can still use LangChain Chain callbacks with `AmazonComprehendModerationChain` via the `callbacks` parameter. For example:\n", "\n", "```python\n", "from langchain.callbacks.stdout import StdOutCallbackHandler\n", "\n", "comp_moderation_with_config = AmazonComprehendModerationChain(verbose=True, callbacks=[StdOutCallbackHandler()])\n", "```\n", "\n", "