langchain/tests/integration_tests
Jason Fan 8effd90be0
Add new types of document transformers (#7379)
- Description: Add two new document transformers: one translates
documents into other languages, and one converts documents into Q&A
format to improve vector search results. Both use OpenAI function calling
via the [doctran](https://github.com/psychic-api/doctran/tree/main)
library.
  - Issue: N/A
  - Dependencies: `doctran = "^0.0.5"`
  - Tag maintainer: @rlancemartin @eyurtsev @hwchase17 
  - Twitter handle: @psychicapi or @jfan001

Notes
- Adheres to the `DocumentTransformer` abstraction set by @dev2049 in
#3182
- Moved `EmbeddingsRedundantFilter` into its own file under a new
`document_transformers` module
- Added basic docs for `DocumentInterrogator`, `DocumentTransformer` as
well as the existing `EmbeddingsRedundantFilter`
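
For readers unfamiliar with the abstraction mentioned in the notes, the following is a minimal, self-contained sketch of what a document transformer could look like: an object that takes a sequence of documents and returns a transformed sequence. The `Document` dataclass, `DocumentTransformerSketch`, and `ExactDuplicateFilter` names are hypothetical stand-ins written for this illustration, and the method name `transform_documents` is an assumption about the interface. The toy filter uses exact string matching in place of embedding similarity, so it only illustrates the shape of the idea and is not how `EmbeddingsRedundantFilter` works.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Document:
    """Hypothetical stand-in for a document: text plus free-form metadata."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class DocumentTransformerSketch(ABC):
    """Hypothetical interface: documents in, transformed documents out."""

    @abstractmethod
    def transform_documents(self, documents: Sequence[Document]) -> List[Document]:
        """Return a new list of documents derived from the input."""


class ExactDuplicateFilter(DocumentTransformerSketch):
    """Toy transformer that drops documents whose text was already seen.

    Assumption: exact string equality replaces embedding similarity here,
    purely to keep the example runnable without external services.
    """

    def transform_documents(self, documents: Sequence[Document]) -> List[Document]:
        seen = set()
        kept: List[Document] = []
        for doc in documents:
            # Keep only the first occurrence of each distinct text.
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                kept.append(doc)
        return kept


docs = [Document("hello"), Document("world"), Document("hello")]
filtered = ExactDuplicateFilter().transform_documents(docs)
```

Because every transformer in this sketch shares one method, a translator or a Q&A converter could be swapped in for the filter without changing the calling code.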

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-07-12 23:53:30 -04:00
agent
cache
callbacks support adding custom metadata to runs (#7120) 2023-07-05 11:11:38 -07:00
chains codespell: workflow, config + some (quite a few) typos fixed (#6785) 2023-07-12 16:20:08 -04:00
chat_models Added support for chat_history (#7555) 2023-07-11 15:27:26 -04:00
client Load Run Evaluator (#7101) 2023-07-07 19:57:59 -07:00
document_loaders codespell: workflow, config + some (quite a few) typos fixed (#6785) 2023-07-12 16:20:08 -04:00
embeddings Harrison/octo ml (#6897) 2023-06-28 23:04:11 -07:00
evaluation/embedding_distance Add String Distance and Embedding Evaluators (#7123) 2023-07-07 21:44:31 -07:00
examples feat: Add UnstructuredTSVLoader (#7367) 2023-07-10 03:07:10 -04:00
graphs Add HugeGraphQAChain to support gremlin generating chain (#7132) 2023-07-04 10:21:21 -06:00
llms feat: ctransformers support async chain (#6859) 2023-07-10 04:23:41 -04:00
memory Harrison/split schema dir (#7025) 2023-07-01 13:39:19 -04:00
prompts
retrievers Add new types of document transformers (#7379) 2023-07-12 23:53:30 -04:00
utilities Harrison/dataforseo (#7214) 2023-07-05 16:02:02 -04:00
vectorstores codespell: workflow, config + some (quite a few) typos fixed (#6785) 2023-07-12 16:20:08 -04:00
__init__.py
.env.example
conftest.py
test_document_transformers.py Add new types of document transformers (#7379) 2023-07-12 23:53:30 -04:00
test_kuzu.py
test_nebulagraph.py
test_nlp_text_splitters.py Add spacy sentencizer (#7442) 2023-07-10 02:52:05 -04:00
test_pdf_pagesplitter.py
test_schema.py Base language model docstrings (#7104) 2023-07-07 16:09:10 -04:00
test_text_splitter.py