langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-29 17:07:25 +00:00

Jens Madsen 8d9e9e013c refactor: extract token text splitter function (#5179) # Token text splitter for sentence transformers The current TokenTextSplitter only works with OpenAI models via the `tiktoken` package. This is not clear from the name `TokenTextSplitter`. In this (first) PR, a token-based text splitter for sentence transformer models is added. In the future I think we should work towards injecting a tokenizer into the TokenTextSplitter to make it more flexible. Could perhaps be reviewed by @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>		2023-06-04 14:41:44 -07:00
agent	Add Multi-CSV/DF support in CSV and DataFrame Toolkits (#5009)	2023-05-25 14:23:11 -07:00
cache	feat: add Momento as a standard cache and chat message history provider (#5221)	2023-05-25 19:13:21 -07:00
callbacks	Update Tracer Auth / Reduce Num Calls (#5517)	2023-06-02 12:13:56 -07:00
chains	Harrison/neo4j (#5078)	2023-05-22 07:31:48 -07:00
chat_models	Harrison/vertex (#5049)	2023-05-24 15:51:12 -07:00
client	Add Feedback Methods + Evaluation examples (#5166)	2023-05-31 11:14:27 -07:00
document_loaders	feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617)	2023-06-03 12:44:12 -07:00
embeddings	`encoding_kwargs` for InstructEmbeddings (#5450)	2023-05-30 11:57:04 -07:00
examples	feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617)	2023-06-03 12:44:12 -07:00
llms	Harrison/prediction guard update (#5404)	2023-05-29 07:14:59 -07:00
memory	feat: add Momento as a standard cache and chat message history provider (#5221)	2023-05-25 19:13:21 -07:00
prompts
retrievers	Harrison/pubmed integration (#5664)	2023-06-03 16:25:28 -07:00
utilities	Harrison/pubmed integration (#5664)	2023-06-03 16:25:28 -07:00
vectorstores	removing client+namespace in favor of collection (#5610)	2023-06-03 16:27:31 -07:00
__init__.py
.env.example	adding MongoDBAtlasVectorSearch (#5338)	2023-05-30 07:59:01 -07:00
conftest.py
test_document_transformers.py
test_nlp_text_splitters.py
test_pdf_pagesplitter.py
test_schema.py	Add 'get_token_ids' method (#4784)	2023-05-22 13:17:26 +00:00
test_text_splitter.py	refactor: extract token text splitter function (#5179)	2023-06-04 14:41:44 -07:00