mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
2aae1102b0
### Description
Add instance anonymization: if `John Doe` appears twice in a text, both occurrences are treated as the same entity.
The difference between `PresidioAnonymizer` and `PresidioReversibleAnonymizer` is that only the latter has built-in memory, so it remembers the anonymization mapping across multiple texts:

```
>>> anonymizer = PresidioAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Brett Russell. Hi Brett Russell!'
```

```
>>> anonymizer = PresidioReversibleAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
```

### Twitter handle
@deepsense_ai / @MaksOpp

### Tag maintainer
@baskaryan @hwchase17 @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
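
The behaviour described in the commit message can be illustrated with a toy model. This is a minimal sketch, not the `langchain_experimental` implementation: `ToyAnonymizer`, its `names` and `remember` parameters, and the `PERSON_n` placeholders are all hypothetical, and the entities to replace are assumed to be known up front instead of being detected by Presidio.

```python
import itertools


class ToyAnonymizer:
    """Hypothetical stand-in used only to illustrate the mapping behaviour."""

    def __init__(self, names, remember=False):
        self.names = names          # entities to replace (assumed known in advance)
        self.remember = remember    # keep the mapping between calls?
        self._mapping = {}
        self._counter = itertools.count(1)

    def anonymize(self, text):
        if not self.remember:
            # Without memory, every text starts from an empty mapping.
            self._mapping = {}
        for name in self.names:
            if name in text and name not in self._mapping:
                # One placeholder per entity, reused for every occurrence.
                self._mapping[name] = f"PERSON_{next(self._counter)}"
        for name, placeholder in self._mapping.items():
            text = text.replace(name, placeholder)
        return text


text = "My name is John Doe. Hi John Doe!"

stateless = ToyAnonymizer(["John Doe"])
first_stateless = stateless.anonymize(text)
second_stateless = stateless.anonymize(text)

stateful = ToyAnonymizer(["John Doe"], remember=True)
first_stateful = stateful.anonymize(text)
second_stateful = stateful.anonymize(text)
```

In both modes the two occurrences of `John Doe` within one text receive the same placeholder. Only the instance created with `remember=True` returns the same placeholder on the second call, which mirrors the difference between the two classes shown in the commit message.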
||
---|---|---|
autonomous_agents | ||
comprehend_moderation | ||
cpal | ||
data_anonymizer | ||
fallacy_removal | ||
generative_agents | ||
graph_transformers | ||
llm_bash | ||
llm_symbolic_math | ||
llms | ||
pal_chain | ||
plan_and_execute | ||
prompt_injection_identifier | ||
prompts | ||
pydantic_v1 | ||
retrievers | ||
smart_llm | ||
sql | ||
synthetic_data | ||
tabular_synthetic_data | ||
tot | ||
__init__.py | ||
py.typed |