langchain/tests/unit_tests/document_loaders
Eugene Yurtsev e46202829f
feat #4479: TextLoader auto detect encoding and improved exceptions (#4927)
# TextLoader auto detect encoding and enhanced exception handling

- Add an option to enable encoding detection on `TextLoader`.
- Detection is done using `chardet`.
- Loading tries each detected encoding in order of confidence and raises
an exception if none of them succeeds.
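
The detect-then-fall-back behaviour described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code added by this PR: the function names `detect_candidates` and `decode_with_candidates` are hypothetical, and the sketch assumes `chardet.detect_all` is available (chardet 4.0 or later).

```python
from typing import Iterable, List, Tuple


def detect_candidates(raw: bytes) -> List[Tuple[str, float]]:
    """Return (encoding, confidence) guesses for raw bytes, best guess first.

    Hypothetical helper; assumes chardet >= 4.0, which provides detect_all().
    """
    # Imported lazily so the fallback logic below can be used without chardet.
    import chardet

    guesses = chardet.detect_all(raw)
    # chardet may report an encoding of None when it has no usable guess.
    pairs = [(g["encoding"], g["confidence"]) for g in guesses if g["encoding"]]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def decode_with_candidates(
    raw: bytes, candidates: Iterable[Tuple[str, float]]
) -> Tuple[str, str]:
    """Try each candidate encoding in the given order.

    Returns (decoded_text, encoding_used), or raises RuntimeError listing
    every encoding that was tried if none of them can decode the data.
    """
    tried = []
    for encoding, _confidence in candidates:
        try:
            return raw.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # LookupError covers encoding names that Python does not know.
            tried.append(encoding)
    raise RuntimeError(f"Could not decode data; tried encodings: {tried}")
```

A caller would read the file in binary mode, pass the bytes to `detect_candidates`, and hand the result to `decode_with_candidates`. Keeping detection separate from decoding means the fallback loop can be exercised with a hand-written candidate list, without `chardet` installed.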

### New Dependencies:
- `chardet`

Fixes #4479 

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @eyurtsev

---------

Co-authored-by: blob42 <spike@w530>
2023-05-18 09:55:14 -04:00
| Name | Last commit | Date |
| --- | --- | --- |
| blob_loaders | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| loaders | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| parsers | Add html parsers (#4874) | 2023-05-17 22:39:11 -04:00 |
| test_docs | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| __init__.py | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| test_base.py | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| test_bshtml.py | Add html parsers (#4874) | 2023-05-17 22:39:11 -04:00 |
| test_confluence.py | Add Confluence Loader unit tests (#3333) | 2023-05-16 15:17:07 -07:00 |
| test_csv_loader.py | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| test_detect_encoding.py | feat #4479: TextLoader auto detect encoding and improved exceptions (#4927) | 2023-05-18 09:55:14 -04:00 |
| test_generic_loader.py | Add a generic document loader (#4875) | 2023-05-17 22:38:55 -04:00 |
| test_json_loader.py | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| test_telegram.py | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| test_web_base.py | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |
| test_youtube.py | fix(document_loaders/telegram): fix pandas calls + add tests (#4806) | 2023-05-16 14:35:25 -07:00 |