langchain/docs/extras/integrations/providers/doctran.mdx
Leonid Ganeline cb84f612c9
docs: document_transformers consistency (#10467)
- Updated `document_transformers` examples: titles, descriptions, links
- Added `integrations/providers` for missed document_transformers
2023-09-30 16:36:23 -07:00

# Doctran
> [Doctran](https://github.com/psychic-api/doctran) is a Python package. It uses LLMs and open-source
> NLP libraries to transform raw text into clean, structured, information-dense documents
> that are optimized for vector space retrieval. You can think of `Doctran` as a black box where
> messy strings go in and nice, clean, labelled strings come out.
## Installation and Setup
```bash
pip install doctran
```
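
Before running any of the transformers below, it can help to confirm that the prerequisites are in place. The following is a minimal sketch and not part of Doctran or LangChain: `missing_prerequisites` is a hypothetical helper, and it assumes the transformers pick up the OpenAI key from the `OPENAI_API_KEY` environment variable when no key is passed explicitly.

```python
import importlib.util
import os


def missing_prerequisites(env=None):
    """Return a list of problems that would likely stop the Doctran transformers.

    Hypothetical helper for illustration only. `env` defaults to os.environ;
    pass a plain dict to check a different set of variables.
    """
    env = os.environ if env is None else env
    problems = []

    # Both packages must be importable for the LangChain wrappers to work.
    for package in ("doctran", "langchain"):
        if importlib.util.find_spec(package) is None:
            problems.append(f"package not installed: {package}")

    # Assumption: the key is read from OPENAI_API_KEY when not passed in code.
    if not env.get("OPENAI_API_KEY"):
        problems.append("environment variable not set: OPENAI_API_KEY")

    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty list means the imports and the key lookup should succeed; it does not verify that the key itself is valid.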
## Document Transformers
### Document Interrogator
See a [usage example for DoctranQATransformer](/docs/integrations/document_transformers/doctran_interrogate_document).
```python
from langchain.document_transformers import DoctranQATransformer
```
### Property Extractor
See a [usage example for DoctranPropertyExtractor](/docs/integrations/document_transformers/doctran_extract_properties).
```python
from langchain.document_transformers import DoctranPropertyExtractor
```
### Document Translator
See a [usage example for DoctranTextTranslator](/docs/integrations/document_transformers/doctran_translate_document).
```python
from langchain.document_transformers import DoctranTextTranslator
```