langchain/libs/text-splitters
bilk0h 3d54784e6d
text-splitters: Fix/recursive json splitter data persistence issue (#21529)
Thank you for contributing to LangChain!

**Description:** Noticed an issue with when I was calling
`RecursiveJsonSplitter().split_json()` multiple times that I was getting
weird results. I found an issue where `chunks` list in the `_json_split`
method. If chunks is not provided when _json_split (which is the case
when split_json calls _json_split) then the same list is used for
subsequent calls to `_json_split`.


You can see this in the test case i also added to this commit.

Output should be: 
```
[{'a': 1, 'b': 2}]
[{'c': 3, 'd': 4}]
```

Instead you get:
```
[{'a': 1, 'b': 2}]
[{'a': 1, 'b': 2, 'c': 3, 'd': 4}]
```

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-18 20:21:55 -07:00
..
langchain_text_splitters text-splitters: Fix/recursive json splitter data persistence issue (#21529) 2024-06-18 20:21:55 -07:00
scripts text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
tests text-splitters: Fix/recursive json splitter data persistence issue (#21529) 2024-06-18 20:21:55 -07:00
extended_testing_deps.txt multiple: get rid of pyproject extras (#22581) 2024-06-06 15:45:22 -07:00
Makefile text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
poetry.lock core[patch]: Pin pydantic in py3.12.4 (#23130) 2024-06-18 12:00:02 -07:00
pyproject.toml multiple: get rid of pyproject extras (#22581) 2024-06-06 15:45:22 -07:00
README.md docs: text splitters readme (#18359) 2024-03-01 03:00:42 +00:00

🦜✂️ LangChain Text Splitters

Downloads License: MIT

Quick Install

pip install langchain-text-splitters

What is it?

LangChain Text Splitters contains utilities for splitting into chunks a wide variety of text documents.

For full documentation see the API reference and the Text Splitters module in the main docs.

📕 Releases & Versioning

langchain-text-splitters is currently on version 0.0.x.

Minor version increases will occur for:

  • Breaking changes for any public interfaces NOT marked beta

Patch version increases will occur for:

  • Bug fixes
  • New features
  • Any changes to private interfaces
  • Any changes to beta features

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see the Contributing Guide.