langchain/libs/text-splitters/tests
Matthew DeGenaro 66828f4ecc
text-splitters[patch]: Modified SpacyTextSplitter to fully keep whitespace when strip_whitespace is false (#23272)
Previously, regardless of whether strip_whitespace was set to true or
false, the strip text method in the SpacyTextSplitter class used
`sent.text` to get the sentence. I modified this to include a ternary
such that, if strip_whitespace is false, it uses `sent.text_with_ws`
instead. I also modified pyproject.toml to include the spaCy pipeline
package and to lock the numpy version, as higher versions break spaCy.
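
A minimal sketch of the ternary described above, not the actual patch. To stay runnable without spaCy installed, it uses a hypothetical `FakeSent` dataclass in place of a spaCy sentence span; only the `text` and `text_with_ws` attribute names come from spaCy, and the helper `sentence_texts` is an illustrative name, not a function from the library.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakeSent:
    """Hypothetical stand-in for a spaCy sentence span (assumption)."""

    text: str          # sentence without trailing whitespace
    text_with_ws: str  # sentence including trailing whitespace, if any


def sentence_texts(sents: Iterable[FakeSent], strip_whitespace: bool) -> List[str]:
    # The ternary from the commit message: keep trailing whitespace
    # only when strip_whitespace is false.
    return [s.text if strip_whitespace else s.text_with_ws for s in sents]


sents = [FakeSent("Hello world.", "Hello world.  "), FakeSent("Bye.", "Bye.")]
kept = sentence_texts(sents, strip_whitespace=False)
stripped = sentence_texts(sents, strip_whitespace=True)
```

With `strip_whitespace=False`, joining the pieces reproduces the original spacing between sentences; with `strip_whitespace=True`, the trailing spaces are dropped.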

- **Issue:** N/A
- **Dependencies:** None
2024-09-02 21:15:56 +00:00
..
integration_tests text-splitters[patch]: Modified SpacyTextSplitter to fully keep whitespace when strip_whitespace is false (#23272) 2024-09-02 21:15:56 +00:00
test_data text-splitters[patch]: fix HTMLSectionSplitter parsing of xslt paths (#22176) 2024-06-03 20:26:59 +00:00
unit_tests text-splitters[patch]: fix typing for keep_separator (#25706) 2024-08-23 17:22:02 +00:00
__init__.py