langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

maks-operlejn-ds 274c3dc3a8 Multilingual anonymization (#10327)	2023-09-07 14:42:24 -07:00

### Description

Add multiple language support to Anonymizer.

PII detection in Microsoft Presidio relies on several components: in addition to the usual pattern matching (e.g. using regex), the analyzer uses a model for Named Entity Recognition (NER) to extract entities such as:
- `PERSON`
- `LOCATION`
- `DATE_TIME`
- `NRP`
- `ORGANIZATION`

[[Source]](https://github.com/microsoft/presidio/blob/main/presidio-analyzer/presidio_analyzer/predefined_recognizers/spacy_recognizer.py)

To handle NER in specific languages, we use language-specific models from the `spaCy` library, which offers a wide selection of models across many languages and sizes. This is not a hard requirement, however: alternative frameworks such as [Stanza](https://microsoft.github.io/presidio/analyzer/nlp_engines/spacy_stanza/) or [transformers](https://microsoft.github.io/presidio/analyzer/nlp_engines/transformers/) can be integrated when necessary.

### Future work

- automatic language detection: instead of passing the language as a parameter in `anonymizer.anonymize`, we could detect the language(s) beforehand and then use the corresponding NER model. We have discussed this internally and @mateusz-wosinski-ds will look into a standalone language detection tool/chain for LangChain 😄

### Twitter handle

@deepsense_ai / @MaksOpp

### Tag maintainer

@baskaryan @hwchase17 @hinthornw
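As an illustration of the per-language NER setup described in the commit message, here is a minimal sketch that builds a spaCy NLP engine configuration in the dictionary shape Presidio documents (`nlp_engine_name`, `models`, `lang_code`, `model_name`). The helper names `build_nlp_config` and `supported_languages`, and the particular choice of models, are hypothetical and not taken from the commit. The commented-out lines show how such a configuration might be handed to the anonymizer; the `languages_config` argument there is an assumption and should be checked against the `data_anonymizer` module itself.

```python
from typing import Dict, List

# Hypothetical language -> spaCy model mapping. The model names are real spaCy
# packages, but which ones to use is an assumption made for this sketch.
DEFAULT_MODELS: Dict[str, str] = {
    "en": "en_core_web_lg",
    "pl": "pl_core_news_md",
    "es": "es_core_news_md",
}


def build_nlp_config(models: Dict[str, str]) -> dict:
    """Build a Presidio-style spaCy NLP engine configuration dictionary."""
    if not models:
        raise ValueError("at least one language model is required")
    return {
        "nlp_engine_name": "spacy",
        "models": [
            {"lang_code": code, "model_name": name}
            for code, name in models.items()
        ],
    }


def supported_languages(config: dict) -> List[str]:
    """Return the language codes declared in a configuration, in order."""
    return [entry["lang_code"] for entry in config["models"]]


config = build_nlp_config(DEFAULT_MODELS)
languages = supported_languages(config)

# Assumed usage (not executed here; requires presidio, spaCy and the models):
#
#   from langchain_experimental.data_anonymizer import PresidioAnonymizer
#   anonymizer = PresidioAnonymizer(languages_config=config)  # assumed argument
#   anonymizer.anonymize("Nazywam się Jan Kowalski", language="pl")
```

Keeping the mapping in one dictionary means that adding a language is a one-line change, and the list returned by `supported_languages` can be used to validate a `language` argument before calling the anonymizer.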
autonomous_agents	Harrison/string inplace (#10153)	2023-09-03 14:25:29 -07:00
comprehend_moderation	Harrison/string inplace (#10153)	2023-09-03 14:25:29 -07:00
cpal	Add security notices on PAL and CPAL experimental chains. (#9938)	2023-08-29 13:51:56 -04:00
data_anonymizer	Multilingual anonymization (#10327)	2023-09-07 14:42:24 -07:00
fallacy_removal	adding new chain for logical fallacy removal from model output in chain (#9887)	2023-09-03 15:44:27 -07:00
generative_agents	Use a submodule for pydantic v1 compat (#9371)	2023-08-17 16:35:49 +01:00
graph_transformers	Diffbot Graph Transformer / Neo4j Graph document ingestion (#9979)	2023-09-06 13:32:59 -07:00
llms	Use a submodule for pydantic v1 compat (#9371)	2023-08-17 16:35:49 +01:00
pal_chain	Add security notices on PAL and CPAL experimental chains. (#9938)	2023-08-29 13:51:56 -04:00
plan_and_execute	Use a submodule for pydantic v1 compat (#9371)	2023-08-17 16:35:49 +01:00
prompts	Harrison/official pre release (#8106)	2023-07-21 18:44:32 -07:00
pydantic_v1	`poetry lock` the experimental package. (#9478)	2023-08-22 14:09:35 -04:00
retrievers	Resolve: VectorSearch enabled SQLChain? (#10177)	2023-09-06 17:08:12 -07:00
smart_llm	Use a submodule for pydantic v1 compat (#9371)	2023-08-17 16:35:49 +01:00
sql	Resolve: VectorSearch enabled SQLChain? (#10177)	2023-09-06 17:08:12 -07:00
tot	Use a submodule for pydantic v1 compat (#9371)	2023-08-17 16:35:49 +01:00
__init__.py	Use a submodule for pydantic v1 compat (#9371)	2023-08-17 16:35:49 +01:00
py.typed	Add `py.typed` file to `langchain-experimental`. (#9557)	2023-08-21 15:37:16 -04:00