mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
# Doctran
>[Doctran](https://github.com/psychic-api/doctran) is a Python package. It uses LLMs and open-source
> NLP libraries to transform raw text into clean, structured, information-dense documents
> that are optimized for vector space retrieval. You can think of `Doctran` as a black box where
> messy strings go in and nice, clean, labelled strings come out.
## Installation and Setup
```bash
pip install doctran
```
## Document Transformers
### Document Interrogator
See a [usage example for DoctranQATransformer](/docs/integrations/document_transformers/doctran_interrogate_document).
```python
from langchain.document_transformers import DoctranQATransformer
```
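As a rough sketch of how the interrogator might be used, the snippet below wraps a string in a `Document`, runs the transformer, and formats the generated question/answer pairs. It assumes `doctran` is installed, OpenAI credentials are in the environment (`OPENAI_API_KEY`, and in some releases `OPENAI_API_MODEL`), the class is importable from `langchain.document_transformers`, and results are stored under the `questions_and_answers` metadata key. `interrogate` and `format_qa` are hypothetical helper names, not part of the library.

```python
def format_qa(metadata):
    """Render question/answer pairs from document metadata as plain text.

    Assumes the pairs are stored under "questions_and_answers" as dicts
    with "question" and "answer" keys.
    """
    pairs = metadata.get("questions_and_answers", [])
    return "\n".join(f"Q: {p['question']}\nA: {p['answer']}" for p in pairs)


def interrogate(text):
    """Hypothetical helper: run DoctranQATransformer over a single string.

    Needs langchain, doctran and OpenAI credentials; makes LLM calls.
    """
    # Imported lazily so format_qa stays usable without these packages.
    from langchain.document_transformers import DoctranQATransformer
    from langchain.schema import Document

    transformer = DoctranQATransformer()
    # Assumption: the synchronous method is implemented in the installed
    # release; some releases only provide the async atransform_documents.
    docs = transformer.transform_documents([Document(page_content=text)])
    return docs[0].metadata
```

Calling `format_qa(interrogate(some_text))` would then print the pairs, provided the assumptions above hold for the installed versions.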
### Property Extractor
See a [usage example for DoctranPropertyExtractor](/docs/integrations/document_transformers/doctran_extract_properties).
```python
from langchain.document_transformers import DoctranPropertyExtractor
```
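The extractor is driven by a list of property definitions. The sketch below shows what such a schema might look like and how it could be passed in; the property names (`category`, `mentions`) and the `extract_properties` helper are illustrative assumptions, as is reading the result from the `extracted_properties` metadata key.

```python
# Illustrative schema: each entry describes one field the LLM should extract.
EXAMPLE_PROPERTIES = [
    {
        "name": "category",
        "description": "What type of document this is.",
        "type": "string",
        "enum": ["update", "action_item", "announcement", "other"],
        "required": True,
    },
    {
        "name": "mentions",
        "description": "A list of all people mentioned in the document.",
        "type": "array",
        "items": {
            "name": "full_name",
            "description": "The full name of the person mentioned.",
            "type": "string",
        },
        "required": True,
    },
]


def extract_properties(text, properties=None):
    """Hypothetical helper: extract structured properties from one string.

    Needs langchain, doctran and OpenAI credentials; makes LLM calls.
    """
    # Imported lazily so the schema above can be inspected without langchain.
    from langchain.document_transformers import DoctranPropertyExtractor
    from langchain.schema import Document

    extractor = DoctranPropertyExtractor(properties=properties or EXAMPLE_PROPERTIES)
    # Assumption: the synchronous method is available in the installed release.
    docs = extractor.transform_documents([Document(page_content=text)])
    return docs[0].metadata.get("extracted_properties", {})
```

Keeping the schema as plain data makes it easy to review or reuse across documents before any LLM call is made.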
### Document Translator
See a [usage example for DoctranTextTranslator](/docs/integrations/document_transformers/doctran_translate_document).
```python
from langchain.document_transformers import DoctranTextTranslator
```
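Unlike the two transformers above, the translator is expected to rewrite the document text itself rather than add metadata. A minimal sketch, assuming the target language is set through a `language` constructor argument and the translated text comes back in `page_content`; the `translate` helper name is hypothetical.

```python
def translate(text, language="spanish"):
    """Hypothetical helper: translate one string with DoctranTextTranslator.

    Needs langchain, doctran and OpenAI credentials; makes LLM calls.
    """
    # Imported lazily so defining the helper does not require langchain.
    from langchain.document_transformers import DoctranTextTranslator
    from langchain.schema import Document

    translator = DoctranTextTranslator(language=language)
    # Assumption: the synchronous method is available in the installed release.
    docs = translator.transform_documents([Document(page_content=text)])
    return docs[0].page_content
```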