docs: `chunkviz` reference (#14802)

Added a reference to the `Chunkviz` utility.
pull/14914/head
Leonid Ganeline 10 months ago committed by GitHub
parent 50381abc42
commit 922693caba
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@@ -39,7 +39,6 @@ In addition to controlling which characters you can split on, you can also contr
- `chunk_overlap`: the maximum overlap between chunks. It can be nice to have some overlap to maintain some continuity between chunks (e.g. do a sliding window).
- `add_start_index`: whether to include the starting position of each chunk within the original document in the metadata.
```python
# This is a long document we can split up.
with open('../../state_of_the_union.txt') as f:
@@ -79,6 +78,13 @@ print(texts[1])
</CodeOutputBlock>
### Evaluate text splitters
You can evaluate text splitters with the [Chunkviz utility](https://www.chunkviz.com/), created by Greg Kamradt.
`Chunkviz` is a tool for visualizing how your text splitter works. It shows how your text is
being split up and helps you tune the splitting parameters.
## Other transformations:
### Filter redundant docs, translate docs, extract metadata, and more
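
Reviewer note: the `chunk_overlap` and `add_start_index` parameters described in the first hunk can be checked locally before turning to a visual tool. The sketch below is a deliberately naive fixed-width character splitter, not LangChain's implementation (which, as the surrounding doc notes, also controls which characters to split on). `Chunk` and `split_with_overlap` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    start_index: int  # offset of the chunk within the original text


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[Chunk]:
    """Naive sliding-window splitter (hypothetical helper, not LangChain's algorithm)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each window starts `step` characters after the previous one,
    # so consecutive chunks share `chunk_overlap` characters.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(Chunk(text[start:start + chunk_size], start))
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
for c in chunks:
    print(c.start_index, repr(c.text))
```

Printing the start index next to each chunk makes the overlap visible in plain text: each chunk begins `chunk_size - chunk_overlap` characters after the previous one, which is the same relationship a visualizer would let you inspect on real documents.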
