langchain/libs/text-splitters
Max Mulatz 058a64c563
Community[minor]: Add language parser for Elixir (#22742)
Hi 👋 

First off, thanks a ton for your work on this 💚 Really appreciate what
you're providing here for the community.

## Description

This PR adds a basic language parser for the
[Elixir](https://elixir-lang.org/) programming language. The parser code
is based upon the approach outlined in
https://github.com/langchain-ai/langchain/pull/13318: it's using
`tree-sitter` under the hood and aligns with all the other `tree-sitter`
based parsers added in that PR.

The `CHUNK_QUERY` I'm using here is probably not the most sophisticated
one, but it worked for my application. It's a starting point to provide
"core" parsing support for Elixir in LangChain. It enables people to use
the language parser in real-world applications, which may then lead
to further tweaking of the queries. I consider this PR just the
groundwork.
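
To illustrate what chunking Elixir code at definition boundaries means, here is a minimal standard-library sketch. It is not the `tree-sitter` `CHUNK_QUERY` from this PR: the regex, the `split_elixir_source` function, and the sample module are all hypothetical stand-ins chosen for illustration.

```python
import re

# Hypothetical boundary pattern: a line that opens a module, function, or macro.
# This regex is an illustrative stand-in, NOT the tree-sitter CHUNK_QUERY from the PR.
DEFINITION_START = re.compile(
    r"^[ \t]*(?:defmodule|defmacrop?|defp?)\s", re.MULTILINE
)


def split_elixir_source(source: str) -> list[str]:
    """Split Elixir source into chunks that each begin at a definition keyword."""
    starts = [match.start() for match in DEFINITION_START.finditer(source)]
    if not starts:
        # No definitions found: return the whole text as one chunk, if any.
        return [source] if source.strip() else []

    chunks = []
    # Text before the first definition (e.g. comments) becomes its own chunk.
    preamble = source[: starts[0]].strip()
    if preamble:
        chunks.append(preamble)

    # Each chunk runs from one definition start to the next (or to the end).
    for begin, end in zip(starts, starts[1:] + [len(source)]):
        chunks.append(source[begin:end].strip())
    return chunks


EXAMPLE = '''\
defmodule Greeter do
  def hello(name) do
    "Hello, " <> name
  end

  defp secret do
    :ok
  end
end
'''

chunks = split_elixir_source(EXAMPLE)
for chunk in chunks:
    print(chunk)
    print("---")
```

A line-based regex like this cannot tell where a definition ends, so the last chunk also swallows the module's closing `end`. A syntax-tree query can select whole definition nodes instead, which is one reason a parser-based approach is attractive.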

- **Dependencies:** requires `tree-sitter` and `tree-sitter-languages`
from the extended dependencies
- **Twitter handle:** `@bitcrowd`

## Checklist

- [x] **PR title**: "package: description"
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

3 months ago
| Name | Last commit | Updated |
| --- | --- | --- |
| langchain_text_splitters | Community[minor]: Add language parser for Elixir (#22742) | 3 months ago |
| scripts | text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) | 6 months ago |
| tests | text-splitters[patch]: fix HTMLSectionSplitter parsing of xslt paths (#22176) | 3 months ago |
| Makefile | text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) | 6 months ago |
| README.md | docs: text splitters readme (#18359) | 6 months ago |
| extended_testing_deps.txt | multiple: get rid of pyproject extras (#22581) | 3 months ago |
| poetry.lock | multiple: get rid of pyproject extras (#22581) | 3 months ago |
| pyproject.toml | multiple: get rid of pyproject extras (#22581) | 3 months ago |

README.md

🦜✂️ LangChain Text Splitters

License: MIT

Quick Install

pip install langchain-text-splitters

What is it?

LangChain Text Splitters contains utilities for splitting a wide variety of text documents into chunks.

For full documentation, see the API reference and the Text Splitters module in the main docs.
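
As a rough idea of what splitting a document into chunks involves, here is a minimal standard-library sketch of fixed-size chunking with overlap. The `split_text` function and its parameters are hypothetical names chosen for this illustration; they are not the package's API, and the package's splitters may work quite differently.

```python
def split_text(text: str, chunk_size: int = 20, chunk_overlap: int = 5) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so context that
    straddles a boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


pieces = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(pieces)
```

Cutting purely by character count can break words and sentences apart; splitting on natural separators such as paragraphs or code definitions first is a common refinement.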

📕 Releases & Versioning

langchain-text-splitters is currently on version 0.0.x.

Minor version increases will occur for:

  • Breaking changes for any public interfaces NOT marked beta

Patch version increases will occur for:

  • Bug fixes
  • New features
  • Any changes to private interfaces
  • Any changes to beta features

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see the Contributing Guide.