Roma
2b4e9a3efa
Add unit test for _merge_splits function ( #3513 )
This commit adds a new unit test for the _merge_splits function in the
text splitter. The new test verifies that the function merges text into
chunks of the correct size and overlap, using a specified separator. The
test passes on the current implementation of the function.
2023-04-25 10:02:59 -07:00
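
The merging behaviour this test targets can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual _merge_splits: the function name merge_splits, its parameters, and the greedy drop-from-the-front overlap policy are all hypothetical, and lengths are counted in characters.

```python
from typing import List


def merge_splits(
    splits: List[str], separator: str, chunk_size: int, chunk_overlap: int
) -> List[str]:
    """Greedily join small pieces into chunks of at most chunk_size characters.

    Hypothetical sketch: after a chunk is emitted, trailing pieces are kept
    as overlap only if their joined length is at most chunk_overlap.
    """
    sep_len = len(separator)
    chunks: List[str] = []
    current: List[str] = []
    total = 0  # summed length of the pieces in `current`, separators excluded

    for piece in splits:
        # Length the chunk would have if `piece` were appended.
        joined_len = total + len(piece) + sep_len * len(current)
        if current and joined_len > chunk_size:
            chunks.append(separator.join(current))
            # Drop leading pieces until what remains fits the overlap budget
            # and still leaves room for `piece`.
            while current and (
                total + sep_len * (len(current) - 1) > chunk_overlap
                or total + len(piece) + sep_len * len(current) > chunk_size
            ):
                total -= len(current[0])
                current.pop(0)
        current.append(piece)
        total += len(piece)

    if current:
        chunks.append(separator.join(current))
    return chunks


# A test in the spirit of the commit: check chunk size and overlap together.
assert merge_splits(["foo", "bar", "baz"], " ", chunk_size=9, chunk_overlap=2) == [
    "foo bar",
    "baz",
]
```

With a larger overlap budget (chunk_size=7, chunk_overlap=3) the last piece of one chunk is repeated at the start of the next, which is the property an overlap test would normally assert.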
Harrison Chase
f95d551f7a
Harrison/shallow metadata ( #1599 )
Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>
2023-03-11 09:18:25 -08:00
Harrison Chase
064741db58
Harrison/fix text splitter ( #1511 )
Co-authored-by: ajaysolanky <ajsolanky@gmail.com>
Co-authored-by: Ajay Solanky <ajaysolanky@saw-l14668307kd.myfiosgateway.com>
2023-03-07 15:42:28 -08:00
Harrison Chase
1511606799
Harrison/fix splitting ( #563 )
Fix an issue where text splitting could create empty docs.
2023-01-08 19:19:32 -08:00
Harrison Chase
1192cc0767
smart text splitter ( #530 )
Smart text splitter that tries different separators in turn until one
works.
2023-01-08 15:11:10 -08:00
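
The idea of trying separators one after another can be sketched like this. It is an illustrative assumption of how such a splitter might work, not the code from this commit: the name split_recursively, the default separator order, and the choice to skip empty pieces are hypothetical, and the step that merges small pieces back into larger chunks is left out.

```python
from typing import List, Sequence


def split_recursively(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = ("\n\n", "\n", " ", ""),
) -> List[str]:
    """Split text with the first separator that occurs in it, then recurse
    with the remaining separators on any piece that is still too long."""
    if len(text) <= chunk_size:
        return [text] if text else []

    # Find the first usable separator; "" means "fall back to characters".
    sep = None
    rest: Sequence[str] = ()
    for i, candidate in enumerate(separators):
        if candidate == "" or candidate in text:
            sep, rest = candidate, separators[i + 1:]
            break
    if sep is None:
        # No separator applies, so return the oversized text unchanged.
        return [text]

    pieces = list(text) if sep == "" else text.split(sep)
    out: List[str] = []
    for piece in pieces:
        if not piece:
            continue  # skip empty pieces so no empty docs are produced
        if len(piece) <= chunk_size or not rest:
            out.append(piece)
        else:
            out.extend(split_recursively(piece, chunk_size, rest))
    return out


# Paragraph breaks are tried first, then spaces inside each long paragraph.
assert split_recursively("aaa bbb\n\nccccc ddd", 5) == ["aaa", "bbb", "ccccc", "ddd"]
```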
Harrison Chase
c104d507bf
Harrison/improve data augmented generation docs ( #390 )
Co-authored-by: cameronccohen <cameron.c.cohen@gmail.com>
Co-authored-by: Cameron Cohen <cameron.cohen@quantco.com>
2022-12-20 22:24:08 -05:00
Harrison Chase
e7b625fe03
fix text splitter ( #375 )
2022-12-18 20:21:43 -05:00
Harrison Chase
160af4ba6b
Harrison/map reduce ( #36 )
2022-10-31 20:17:22 -07:00