mirror of https://github.com/hwchase17/langchain
Improve effeciency of TextSplitter.split_documents, iterate once (#5111)
# Improve TextSplitter.split_documents, collect page_content and metadata in one iteration ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev In the case where documents is a generator that can only be iterated once making this change is a huge help. Otherwise a silent issue happens where metadata is empty for all documents when documents is a generator. So we expand the argument from `List[Document]` to `Union[Iterable[Document], Sequence[Document]]` --------- Co-authored-by: Steven Tartakovsky <tartakovsky.developer@gmail.com>pull/5115/head
parent
b950022894
commit
d56313acba
Loading…
Reference in New Issue