langchain/libs/experimental/langchain_experimental/comprehend_moderation/base_moderation_config.py
Nikhil Jha dff24285ea
Comprehend Moderation 0.2 (#11730)
This PR replaces the previous `Intent` check with the new `Prompt
Safety` check. The logic and steps to enable chain moderation via the
Amazon Comprehend service, allowing you to detect and redact PII, Toxic,
and Prompt Safety information in the LLM prompt or answer remains
unchanged.
This implementation updates the code and configuration types with
respect to `Prompt Safety`.


### Usage sample

```python
from langchain_experimental.comprehend_moderation import (BaseModerationConfig, 
                                 ModerationPromptSafetyConfig, 
                                 ModerationPiiConfig, 
                                 ModerationToxicityConfig
)

pii_config = ModerationPiiConfig(
    labels=["SSN"],
    redact=True,
    mask_character="X"
)

toxicity_config = ModerationToxicityConfig(
    threshold=0.5
)

prompt_safety_config = ModerationPromptSafetyConfig(
    threshold=0.5
)

moderation_config = BaseModerationConfig(
    filters=[pii_config, toxicity_config, prompt_safety_config]
)

comp_moderation_with_config = AmazonComprehendModerationChain(
    moderation_config=moderation_config, #specify the configuration
    client=comprehend_client,            #optionally pass the Boto3 Client
    verbose=True
)

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])

responses = [
    "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", 
    "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]
llm = FakeListLLM(responses=responses)

llm_chain = LLMChain(prompt=prompt, llm=llm)

chain = ( 
    prompt 
    | comp_moderation_with_config 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | comp_moderation_with_config 
)

try:
    response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})
except Exception as e:
    print(str(e))
else:
    print(response['output'])

```

### Output

```python
> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...

> Finished chain.


> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...

> Finished chain.
Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like XXXXXXXXXXXX John Doe's phone number is (999)253-9876.
```

---------

Co-authored-by: Jha <nikjha@amazon.com>
Co-authored-by: Anjan Biswas <anjanavb@amazon.com>
Co-authored-by: Anjan Biswas <84933469+anjanvb@users.noreply.github.com>
2023-10-26 09:42:18 -07:00

54 lines
1.4 KiB
Python

from typing import List, Union
from pydantic import BaseModel
class ModerationPiiConfig(BaseModel):
threshold: float = 0.5
"""Threshold for PII confidence score, defaults to 0.5 i.e. 50%"""
labels: List[str] = []
"""
List of PII Universal Labels.
Defaults to `list[]`
"""
redact: bool = False
"""Whether to perform redaction of detected PII entities"""
mask_character: str = "*"
"""Redaction mask character in case redact=True, defaults to asterisk (*)"""
class ModerationToxicityConfig(BaseModel):
threshold: float = 0.5
"""Threshold for Toxic label confidence score, defaults to 0.5 i.e. 50%"""
labels: List[str] = []
"""List of toxic labels, defaults to `list[]`"""
class ModerationPromptSafetyConfig(BaseModel):
threshold: float = 0.5
"""
Threshold for Prompt Safety classification
confidence score, defaults to 0.5 i.e. 50%
"""
class BaseModerationConfig(BaseModel):
filters: List[
Union[
ModerationPiiConfig, ModerationToxicityConfig, ModerationPromptSafetyConfig
]
] = [
ModerationPiiConfig(),
ModerationToxicityConfig(),
ModerationPromptSafetyConfig(),
]
"""
Filters applied to the moderation chain, defaults to
`[ModerationPiiConfig(), ModerationToxicityConfig(),
ModerationPromptSafetyConfig()]`
"""