langchain/tests/integration_tests
Jens Madsen 8d9e9e013c
refactor: extract token text splitter function (#5179)
# Token text splitter for sentence transformers

The current `TokenTextSplitter` only works with OpenAI models via the
`tiktoken` package, which is not clear from the name `TokenTextSplitter`.
This (first) PR adds a token-based text splitter for sentence transformer
models. In the future I think we should work towards injecting
a tokenizer into the `TokenTextSplitter` to make it more flexible.
Could perhaps be reviewed by @dev2049.
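
To illustrate the "inject a tokenizer" idea, here is a minimal sketch and not the code from this PR. The names `SimpleTokenizer` and `split_on_tokens` are hypothetical, and a whitespace tokenizer stands in for `tiktoken` or a sentence transformer tokenizer so that the example runs without extra dependencies.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SimpleTokenizer:
    """Hypothetical bundle of an injected tokenizer and chunking settings."""

    encode: Callable[[str], List[str]]  # text -> tokens
    decode: Callable[[List[str]], str]  # tokens -> text
    tokens_per_chunk: int
    chunk_overlap: int = 0


def split_on_tokens(text: str, tokenizer: SimpleTokenizer) -> List[str]:
    """Split text into chunks of at most tokens_per_chunk tokens.

    Consecutive chunks share chunk_overlap tokens.
    """
    if tokenizer.tokens_per_chunk <= 0:
        raise ValueError("tokens_per_chunk must be positive")
    if not 0 <= tokenizer.chunk_overlap < tokenizer.tokens_per_chunk:
        raise ValueError("chunk_overlap must be in [0, tokens_per_chunk)")

    tokens = tokenizer.encode(text)
    step = tokenizer.tokens_per_chunk - tokenizer.chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        window = tokens[start : start + tokenizer.tokens_per_chunk]
        chunks.append(tokenizer.decode(window))
        # Stop once this window has reached the end of the token list.
        if start + tokenizer.tokens_per_chunk >= len(tokens):
            break
    return chunks


# Assumption: whitespace "tokens" stand in for a real model tokenizer.
whitespace_tokenizer = SimpleTokenizer(
    encode=str.split,
    decode=" ".join,
    tokens_per_chunk=4,
    chunk_overlap=1,
)

if __name__ == "__main__":
    print(split_on_tokens("a b c d e f g", whitespace_tokenizer))
```

Because the splitter only depends on the `encode` and `decode` callables, swapping in a different tokenizer would not require changes to the chunking logic.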

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-04 14:41:44 -07:00
agent Add Multi-CSV/DF support in CSV and DataFrame Toolkits (#5009) 2023-05-25 14:23:11 -07:00
cache feat: add Momento as a standard cache and chat message history provider (#5221) 2023-05-25 19:13:21 -07:00
callbacks Update Tracer Auth / Reduce Num Calls (#5517) 2023-06-02 12:13:56 -07:00
chains Harrison/neo4j (#5078) 2023-05-22 07:31:48 -07:00
chat_models Harrison/vertex (#5049) 2023-05-24 15:51:12 -07:00
client Add Feedback Methods + Evaluation examples (#5166) 2023-05-31 11:14:27 -07:00
document_loaders feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617) 2023-06-03 12:44:12 -07:00
embeddings encoding_kwargs for InstructEmbeddings (#5450) 2023-05-30 11:57:04 -07:00
examples feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617) 2023-06-03 12:44:12 -07:00
llms Harrison/prediction guard update (#5404) 2023-05-29 07:14:59 -07:00
memory feat: add Momento as a standard cache and chat message history provider (#5221) 2023-05-25 19:13:21 -07:00
prompts
retrievers Harrison/pubmed integration (#5664) 2023-06-03 16:25:28 -07:00
utilities Harrison/pubmed integration (#5664) 2023-06-03 16:25:28 -07:00
vectorstores removing client+namespace in favor of collection (#5610) 2023-06-03 16:27:31 -07:00
__init__.py
.env.example adding MongoDBAtlasVectorSearch (#5338) 2023-05-30 07:59:01 -07:00
conftest.py
test_document_transformers.py
test_nlp_text_splitters.py
test_pdf_pagesplitter.py
test_schema.py Add 'get_token_ids' method (#4784) 2023-05-22 13:17:26 +00:00
test_text_splitter.py refactor: extract token text splitter function (#5179) 2023-06-04 14:41:44 -07:00