langchain/tests/integration_tests/document_loaders
Yaohui Wang 9d1bd18596
feat (documents): add LarkSuite document loader (#6420)

### Summary

This PR adds a document loader for LarkSuite (Feishu).
> [LarkSuite](https://www.larksuite.com/) is an enterprise collaboration
platform developed by ByteDance.

### Tests

- An integration test case is added.
- An example notebook showing usage is added. [Notebook preview](https://github.com/yaohui-wyh/langchain/blob/master/docs/extras/modules/data_connection/document_loaders/integrations/larksuite.ipynb)
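
For illustration only, here is a minimal sketch of how a document loader of this kind could be exercised without network access. It is not the code from this PR: `Document`, `SketchDocLoader`, and `fake_fetch` are hypothetical stand-ins, and the real loader's constructor arguments and API calls may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for a loaded document: text plus metadata."""

    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class SketchDocLoader:
    """Hypothetical loader; `fetch` is injected so tests can avoid the network."""

    def __init__(self, document_id: str, fetch: Callable[[str], str]):
        self.document_id = document_id
        self.fetch = fetch

    def load(self) -> List[Document]:
        # Retrieve the raw text and wrap it in a single Document.
        text = self.fetch(self.document_id)
        return [
            Document(page_content=text, metadata={"document_id": self.document_id})
        ]


def fake_fetch(document_id: str) -> str:
    # Canned response standing in for a real API call (assumption).
    return f"content of {document_id}"


docs = SketchDocLoader("doc-123", fake_fetch).load()
```

Injecting the fetch function keeps the loader logic testable offline, while an integration test can pass a function that calls the real service.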


### Who can review?

- PTAL @eyurtsev @hwchase17


---------

Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>
2023-06-27 23:08:05 -07:00
parsers feat (documents): add a source code loader based on AST manipulation (#6486) 2023-06-27 15:58:47 -07:00
__init__.py Add new iFixit document loader (#1333) 2023-02-27 20:40:20 -08:00
test_arxiv.py Arxiv document loader (#3627) 2023-04-26 21:04:56 -07:00
test_bigquery.py Harrison/big query (#2100) 2023-03-28 08:17:22 -07:00
test_bilibili.py Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619) 2023-05-16 13:13:57 -04:00
test_blockchain.py Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797) 2023-05-03 15:46:44 -07:00
test_confluence.py Several confluence loader improvements (#3300) 2023-04-23 15:06:10 -07:00
test_csv_loader.py feat: Add UnstructuredCSVLoader for CSV files (#5844) 2023-06-07 19:18:01 -07:00
test_dataframe.py rm pandas dependency (#2102) 2023-03-28 08:38:19 -07:00
test_duckdb.py Harrison/duckdb (#2064) 2023-03-27 19:51:34 -07:00
test_email.py Harrison/msg files (#2375) 2023-04-04 06:48:34 -07:00
test_embaas.py Add embaas document extraction api endpoints (#6048) 2023-06-12 19:13:52 -07:00
test_excel.py feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617) 2023-06-03 12:44:12 -07:00
test_facebook_chat.py Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) 2023-05-03 15:59:19 -07:00
test_fauna.py Harrison/fauna loader (#5864) 2023-06-07 21:32:23 -07:00
test_figma.py Harrison/figma doc loader (#1908) 2023-03-22 19:57:46 -07:00
test_gitbook.py Harrison/gitbook (#2044) 2023-03-28 15:28:33 -07:00
test_github.py DocumentLoader for GitHub (#5408) 2023-05-29 20:11:21 -07:00
test_ifixit.py Add new iFixit document loader (#1333) 2023-02-27 20:40:20 -08:00
test_joplin.py Add Joplin document loader (#5153) 2023-05-24 12:31:55 -07:00
test_json_loader.py JSON loader (#4067) 2023-05-05 14:48:13 -07:00
test_language.py feat (documents): add a source code loader based on AST manipulation (#6486) 2023-06-27 15:58:47 -07:00
test_larksuite.py feat (documents): add LarkSuite document loader (#6420) 2023-06-27 23:08:05 -07:00
test_mastodon.py Add Mastodon toots loader (#5036) 2023-05-22 16:43:07 -07:00
test_max_compute.py add maxcompute (#5533) 2023-06-01 00:54:42 -07:00
test_modern_treasury.py Dev2049/add modern treasury (#3924) 2023-05-01 20:28:02 -07:00
test_odt.py feat: add loader for open office odt files (#4405) 2023-05-10 01:37:17 -07:00
test_org_mode.py feat: Add UnstructuredOrgModeLoader (#6842) 2023-06-27 16:34:17 -07:00
test_pdf.py Harrison/unstructured page number (#6464) 2023-06-19 22:31:43 -07:00
test_pyspark_dataframe_loader.py Harrison/spark reader (#5405) 2023-05-29 20:23:17 -07:00
test_python.py Add PythonLoader which auto-detects encoding of Python files (#3311) 2023-04-21 10:47:57 -07:00
test_rst.py feat: Add UnstructuredRSTLoader (#6594) 2023-06-25 12:41:57 -07:00
test_sitemap.py Harrison/sitemap local (#4704) 2023-05-14 22:04:38 -07:00
test_slack.py Add Slack Directory Loader (#2841) 2023-04-13 21:31:59 -07:00
test_spreedly.py Harrison/spreedly (#3937) 2023-05-01 20:56:56 -07:00
test_stripe.py Dev2049/add modern treasury (#3924) 2023-05-01 20:28:02 -07:00
test_unstructured.py feat: batch multiple files in a single Unstructured API request (#4525) 2023-05-21 20:48:20 -07:00
test_url_playwright.py Harrison/playwright selector (#3185) 2023-04-19 16:54:15 -07:00
test_url.py add continue to fix 'continue_on_failure' parameter for URL doc loader (#2735) 2023-04-11 21:12:39 -07:00
test_whatsapp_chat.py Fix WhatsAppChatLoader : Enable parsing additional formats (#6663) 2023-06-25 12:08:43 -07:00
test_xml.py feat: Add UnstructuredXMLLoader for .xml files (#5955) 2023-06-10 16:24:42 -07:00