langchain/tests/integration_tests/document_loaders
Yaohui Wang 9d1bd18596
feat (documents): add LarkSuite document loader (#6420)

### Summary

This PR adds a LarkSuite (FeiShu) document loader. 
> [LarkSuite](https://www.larksuite.com/) is an enterprise collaboration
platform developed by ByteDance.

### Tests

- Adds an integration test case.
- Adds an example notebook showing usage. [Notebook preview](https://github.com/yaohui-wyh/langchain/blob/master/docs/extras/modules/data_connection/document_loaders/integrations/larksuite.ipynb)
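
For reviewers who want to exercise loader logic without network access, here is a minimal sketch of one way to do it. It is not the code in this PR and not the LangChain API: `SketchDocLoader`, `Document`, `fake_fetch`, the `example.invalid` domain, and the `/docs/<id>` path are all hypothetical names chosen for illustration. The idea is to inject the fetch function so a test can replace it with a stub.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Stand-in for a loader's output: text plus metadata (hypothetical type)."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class SketchDocLoader:
    """Hypothetical loader that fetches one document by ID via an injected callable."""

    def __init__(self, domain: str, document_id: str, fetch: Callable[[str], dict]):
        self.domain = domain
        self.document_id = document_id
        self._fetch = fetch  # a real loader would wrap an HTTP client here

    def load(self) -> List[Document]:
        # Illustrative URL layout only; this is not a real LarkSuite endpoint.
        url = f"{self.domain}/docs/{self.document_id}"
        payload = self._fetch(url)
        return [Document(page_content=payload["content"], metadata={"source": url})]


def fake_fetch(url: str) -> dict:
    """Stub that replaces the network call in tests."""
    return {"content": f"stub body for {url}"}


loader = SketchDocLoader("https://example.invalid", "doc123", fake_fetch)
docs = loader.load()
```

Because the stub is passed in at construction time, the same `load()` path runs in the test as would run against a real service, with only the transport swapped out.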

### Who can review?

- PTAL @eyurtsev @hwchase17

---------

Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>
1 year ago
parsers feat (documents): add a source code loader based on AST manipulation (#6486) 1 year ago
__init__.py Add new iFixit document loader (#1333) 2 years ago
test_arxiv.py `Arxiv` document loader (#3627) 1 year ago
test_bigquery.py Harrison/big query (#2100) 1 year ago
test_bilibili.py Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619) 1 year ago
test_blockchain.py Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797) 1 year ago
test_confluence.py Several confluence loader improvements (#3300) 1 year ago
test_csv_loader.py feat: Add `UnstructuredCSVLoader` for CSV files (#5844) 1 year ago
test_dataframe.py rm pandas dependency (#2102) 1 year ago
test_duckdb.py Harrison/duckdb (#2064) 2 years ago
test_email.py Harrison/msg files (#2375) 1 year ago
test_embaas.py Add embaas document extraction api endpoints (#6048) 1 year ago
test_excel.py feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617) 1 year ago
test_facebook_chat.py Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) 1 year ago
test_fauna.py Harrison/fauna loader (#5864) 1 year ago
test_figma.py Harrison/figma doc loader (#1908) 2 years ago
test_gitbook.py Harrison/gitbook (#2044) 1 year ago
test_github.py DocumentLoader for GitHub (#5408) 1 year ago
test_ifixit.py Add new iFixit document loader (#1333) 2 years ago
test_joplin.py Add Joplin document loader (#5153) 1 year ago
test_json_loader.py JSON loader (#4067) 1 year ago
test_language.py feat (documents): add a source code loader based on AST manipulation (#6486) 1 year ago
test_larksuite.py feat (documents): add LarkSuite document loader (#6420) 1 year ago
test_mastodon.py Add Mastodon toots loader (#5036) 1 year ago
test_max_compute.py add maxcompute (#5533) 1 year ago
test_modern_treasury.py Dev2049/add modern treasury (#3924) 1 year ago
test_odt.py feat: add loader for open office odt files (#4405) 1 year ago
test_org_mode.py feat: Add `UnstructuredOrgModeLoader` (#6842) 1 year ago
test_pdf.py Harrison/unstructured page number (#6464) 1 year ago
test_pyspark_dataframe_loader.py Harrison/spark reader (#5405) 1 year ago
test_python.py Add PythonLoader which auto-detects encoding of Python files (#3311) 1 year ago
test_rst.py feat: Add `UnstructuredRSTLoader` (#6594) 1 year ago
test_sitemap.py Harrison/sitemap local (#4704) 1 year ago
test_slack.py Add Slack Directory Loader (#2841) 1 year ago
test_spreedly.py Harrison/spreedly (#3937) 1 year ago
test_stripe.py Dev2049/add modern treasury (#3924) 1 year ago
test_unstructured.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
test_url.py add continue to fix 'continue_on_failure' parameter for URL doc loader (#2735) 1 year ago
test_url_playwright.py Harrison/playwright selector (#3185) 1 year ago
test_whatsapp_chat.py Fix WhatsAppChatLoader : Enable parsing additional formats (#6663) 1 year ago
test_xml.py feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955) 1 year ago