- "NOTE: `moderation_callback` is different from LangChain Chain Callbacks. You can still use LangChain Chain callbacks with `AmazonComprehendModerationChain` via the `callbacks` parameter. Example:\n",
- "```python\n",
- "from langchain.callbacks.stdout import StdOutCallbackHandler\n",
- "comp_moderation_with_config = AmazonComprehendModerationChain(verbose=True, callbacks=[StdOutCallbackHandler()])\n",
- "```\n",
- ""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "0ec38536-8cc9-408e-860b-e4a439283643",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "from langchain_experimental.comprehend_moderation import BaseModerationCallbackHandler"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "1be744c7-3f99-4165-bf7f-9c5c249bbb53",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Define callback handlers by subclassing BaseModerationCallbackHandler\n",
- "\n",
- "class MyModCallback(BaseModerationCallbackHandler):\n",
- " \n",
- " async def on_after_pii(self, output_beacon, unique_id):\n",
- " import json\n",
- " moderation_type = output_beacon['moderation_type']\n",
- " chain_id = output_beacon['moderation_chain_id']\n",
- " with open(f'output-{moderation_type}-{chain_id}.json', 'w') as file:\n",
- " data = { 'beacon_data': output_beacon, 'unique_id': unique_id }\n",
- " json.dump(data, file)\n",
- " \n",
- " '''\n",
- " async def on_after_toxicity(self, output_beacon, unique_id):\n",
- " pass\n",
- " \n",
- " async def on_after_intent(self, output_beacon, unique_id):\n",
- " pass\n",
- " '''\n",
- " \n",
- "\n",
- "my_callback = MyModCallback()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "362a3fe0-f09f-411e-9df1-d79b3e87510c",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "moderation_config = { \n",
- " \"filters\": [ \n",
- " BaseModerationFilters.PII, \n",
- " BaseModerationFilters.TOXICITY\n",
- " ],\n",
- " \"pii\":{ \n",
- " \"action\": BaseModerationActions.STOP, \n",
- " \"threshold\":0.5, \n",
- " \"labels\":[\"SSN\"], \n",
- " \"mask_character\": \"X\" \n",
- " },\n",
- " \"toxicity\":{ \n",
- " \"action\": BaseModerationActions.STOP, \n",
- " \"threshold\":0.5 \n",
- " }\n",
- "}\n",
- "\n",
- "comp_moderation_with_config = AmazonComprehendModerationChain(\n",
- " moderation_config=moderation_config, # specify the configuration\n",
- " client=comprehend_client, # optionally pass the Boto3 Client\n",
- " unique_id='john.doe@email.com', # A unique ID\n",
- " moderation_callback=my_callback, # BaseModerationCallbackHandler\n",
- " verbose=True\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "2af07937-67ea-4738-8343-c73d4d28c2cc",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "from langchain.prompts import PromptTemplate\n",
- "from langchain.chains import LLMChain\n",
- "from langchain.llms.fake import FakeListLLM\n",
- "\n",
- "template = \"\"\"Question: {question}\n",
- "\n",
- "Answer:\"\"\"\n",
- "\n",
- "prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
- "\n",
- "responses = [\n",
- "    \"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN looks like 323-22-9980. John Doe's phone number is (999)253-9876.\", \n",
- " \"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here.\"\n",
- "]\n",
- "\n",
- "llm = FakeListLLM(responses=responses)\n",
- "\n",
- "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
- "\n",
- "chain = (\n",
- " prompt \n",
- " | comp_moderation_with_config \n",
- " | {llm_chain.input_keys[0]: lambda x: x['output'] } \n",
- " | llm_chain \n",
- " | { \"input\": lambda x: x['text'] } \n",
- " | comp_moderation_with_config \n",
- ") \n",
- "\n",
- "try:\n",
- "    response = chain.invoke({\"question\": \"A sample SSN looks like this: 123-45-6789. Can you give me some more samples?\"})\n",
- "except Exception as e:\n",
- " print(str(e))\n",
- "else:\n",
- " print(response['output'])"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "706454b2-2efa-4d41-abc8-ccf2b4e87822",
- "metadata": {
- "tags": []
- },
- "source": [
- "## `moderation_config` and moderation execution order\n",
- "---\n",
- "\n",
- "If `AmazonComprehendModerationChain` is not initialized with a `moderation_config`, the default action is `STOP` and the default order of moderation checks is as follows.\n",
- "\n",
- "```\n",
- "AmazonComprehendModerationChain\n",
- "│\n",
- "└──Check PII with Stop Action\n",
- " ├── Callback (if available)\n",
- " ├── Label Found ⟶ [Error Stop]\n",
- " └── No Label Found \n",
- " └──Check Toxicity with Stop Action\n",
- " ├── Callback (if available)\n",
- " ├── Label Found ⟶ [Error Stop]\n",
- " └── No Label Found\n",
- " └──Check Intent with Stop Action\n",
- " ├── Callback (if available)\n",
- " ├── Label Found ⟶ [Error Stop]\n",
- " └── No Label Found\n",
- " └── Return Prompt\n",
- "```\n",
- "\n",
- "If any of the checks raises an exception, the subsequent checks are not performed. If a `callback` is provided, it is called only for the checks that were actually performed. For example, if the chain fails due to the presence of PII, the Toxicity and Intent checks are not performed.\n",
- "\n",
- "You can override the execution order by passing a `moderation_config` and listing the checks in the desired order under its `filters` key; the checks then run in exactly that order. For example, with the configuration below, the Toxicity check is performed first, then PII, and finally Intent. Because no per-check keys are supplied, `AmazonComprehendModerationChain` performs each check with the default values of its model `kwargs`.\n",
- "\n",
- "```python\n",
- "moderation_config = { \n",
- " \"filters\":[ BaseModerationFilters.TOXICITY, \n",
- " BaseModerationFilters.PII, \n",
- " BaseModerationFilters.INTENT]\n",
- " }\n",
- "```\n",
- "\n",
- "Model `kwargs` are specified by the `pii`, `toxicity`, and `intent` keys within the `moderation_config` dictionary. For example, in the `moderation_config` below, the default order of moderation is overridden, and the `pii` and `toxicity` model `kwargs` are overridden as well; for `intent`, the chain's default `kwargs` are used.\n",
- "\n",
- "```python\n",
- " moderation_config = { \n",
- " \"filters\":[ BaseModerationFilters.TOXICITY, \n",
- " BaseModerationFilters.PII, \n",
- " BaseModerationFilters.INTENT],\n",
- " \"pii\":{ \"action\": BaseModerationActions.ALLOW, \n",
- " \"threshold\":0.5, \n",
- " \"labels\":[\"SSN\"], \n",
- " \"mask_character\": \"X\" },\n",
- " \"toxicity\":{ \"action\": BaseModerationActions.STOP, \n",
- " \"threshold\":0.5 }\n",
- " }\n",
- "```\n",
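- "\n",
- "As a minimal sketch (not part of the original example), `filters` may also list a single check; only that check runs and all others are skipped. The threshold value here is illustrative:\n",
- "\n",
- "```python\n",
- "# Hypothetical single-filter config: only the Toxicity check runs\n",
- "moderation_config = { \n",
- "    \"filters\": [ BaseModerationFilters.TOXICITY ],\n",
- "    \"toxicity\": { \"action\": BaseModerationActions.STOP, \"threshold\": 0.7 }\n",
- "}\n",
- "```\n",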
- "\n",
- "1. For a list of PII labels see Amazon Comprehend Universal PII entity types - https://docs.aws.amazon.com/comprehend/latest/dg/how-pii.html#how-pii-types\n",
- "2. The available Toxicity labels are:\n",
- "    - `HATE_SPEECH`: Speech that criticizes, insults, denounces, or dehumanizes a person or group on the basis of an identity, be it race, ethnicity, gender identity, religion, sexual orientation, ability, national origin, or another identity group.\n",
- "    - `GRAPHIC`: Speech that uses visually descriptive, detailed, and unpleasantly vivid imagery is considered graphic. Such language is often made verbose so as to amplify an insult, discomfort, or harm to the recipient.\n",
- "    - `HARASSMENT_OR_ABUSE`: Speech that imposes disruptive power dynamics between the speaker and hearer, regardless of intent, seeks to affect the psychological well-being of the recipient, or objectifies a person should be classified as harassment.\n",
- "    - `SEXUAL`: Speech that indicates sexual interest, activity, or arousal by using direct or indirect references to body parts, physical traits, or sex is considered toxic with toxicityType \"sexual\".\n",
- "    - `VIOLENCE_OR_THREAT`: Speech that includes threats seeking to inflict pain, injury, or hostility towards a person or group.\n",
- "    - `INSULT`: Speech that includes demeaning, humiliating, mocking, insulting, or belittling language.\n",
- "    - `PROFANITY`: Speech that contains words, phrases, or acronyms that are impolite, vulgar, or offensive is considered profane.\n",
- "3. For a list of Intent labels refer to documentation [link here]"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "78905aec-55ae-4fc3-a23b-8a69bd1e33f2",
- "metadata": {},
- "source": [
- "# Examples\n",
- "---\n",
- "\n",
- "## With Hugging Face Hub Models\n",
- "\n",
- "Get your API Key from Hugging Face hub - https://huggingface.co/docs/api-inference/quicktour#get-your-api-token"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "359b9627-769b-46ce-8be2-c8a5cf7728ba",
- "metadata": {
- "scrolled": true,
- "tags": []
- },
- "outputs": [],
- "source": [
- "%pip install huggingface_hub"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "41b7ea98-ad16-4454-8f12-c03c17113a86",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "%env HUGGINGFACEHUB_API_TOKEN=\"