langchain/tests/integration_tests/document_loaders
Julius Lipp 5b6bbf4ab2
Add embaas document extraction api endpoints (#6048)
# Introduces embaas document extraction api endpoints

In this PR, we add support for embaas document extraction endpoints in
addition to Text Embedding Models (LLM support is coming in separate PRs). We currently
offer the MTEB leaderboard top performers, will continue to add top
embedding models, and will soon add support for customers to deploy their own
models. Additional documentation and information can be found
[here](https://embaas.io).

While developing this integration, I closely followed the patterns
established by other langchain integrations. Nonetheless, if there are
any aspects that require adjustments or if there's a better way to
present a new integration, let me know! :)

Additionally, I fixed some docs in the embeddings integration.

Related PR: #5976 

#### Who can review?
  DataLoaders
  - @eyurtsev
2023-06-12 19:13:52 -07:00
parsers Add html parsers (#4874) 2023-05-17 22:39:11 -04:00
__init__.py Add new iFixit document loader (#1333) 2023-02-27 20:40:20 -08:00
test_arxiv.py Arxiv document loader (#3627) 2023-04-26 21:04:56 -07:00
test_bigquery.py Harrison/big query (#2100) 2023-03-28 08:17:22 -07:00
test_bilibili.py Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619) 2023-05-16 13:13:57 -04:00
test_blockchain.py Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797) 2023-05-03 15:46:44 -07:00
test_confluence.py Several confluence loader improvements (#3300) 2023-04-23 15:06:10 -07:00
test_csv_loader.py feat: Add UnstructuredCSVLoader for CSV files (#5844) 2023-06-07 19:18:01 -07:00
test_dataframe.py rm pandas dependency (#2102) 2023-03-28 08:38:19 -07:00
test_duckdb.py Harrison/duckdb (#2064) 2023-03-27 19:51:34 -07:00
test_email.py Harrison/msg files (#2375) 2023-04-04 06:48:34 -07:00
test_embaas.py Add embaas document extraction api endpoints (#6048) 2023-06-12 19:13:52 -07:00
test_excel.py feat: add UnstructuredExcelLoader for .xlsx and .xls files (#5617) 2023-06-03 12:44:12 -07:00
test_facebook_chat.py Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) 2023-05-03 15:59:19 -07:00
test_fauna.py Harrison/fauna loader (#5864) 2023-06-07 21:32:23 -07:00
test_figma.py Harrison/figma doc loader (#1908) 2023-03-22 19:57:46 -07:00
test_gitbook.py Harrison/gitbook (#2044) 2023-03-28 15:28:33 -07:00
test_github.py DocumentLoader for GitHub (#5408) 2023-05-29 20:11:21 -07:00
test_ifixit.py Add new iFixit document loader (#1333) 2023-02-27 20:40:20 -08:00
test_joplin.py Add Joplin document loader (#5153) 2023-05-24 12:31:55 -07:00
test_json_loader.py JSON loader (#4067) 2023-05-05 14:48:13 -07:00
test_mastodon.py Add Mastodon toots loader (#5036) 2023-05-22 16:43:07 -07:00
test_max_compute.py add maxcompute (#5533) 2023-06-01 00:54:42 -07:00
test_modern_treasury.py Dev2049/add modern treasury (#3924) 2023-05-01 20:28:02 -07:00
test_odt.py feat: add loader for open office odt files (#4405) 2023-05-10 01:37:17 -07:00
test_pdf.py Dev2049/pypdfium2 (#4209) 2023-05-05 17:55:31 -07:00
test_pyspark_dataframe_loader.py Harrison/spark reader (#5405) 2023-05-29 20:23:17 -07:00
test_python.py Add PythonLoader which auto-detects encoding of Python files (#3311) 2023-04-21 10:47:57 -07:00
test_sitemap.py Harrison/sitemap local (#4704) 2023-05-14 22:04:38 -07:00
test_slack.py Add Slack Directory Loader (#2841) 2023-04-13 21:31:59 -07:00
test_spreedly.py Harrison/spreedly (#3937) 2023-05-01 20:56:56 -07:00
test_stripe.py Dev2049/add modern treasury (#3924) 2023-05-01 20:28:02 -07:00
test_unstructured.py feat: batch multiple files in a single Unstructured API request (#4525) 2023-05-21 20:48:20 -07:00
test_url_playwright.py Harrison/playwright selector (#3185) 2023-04-19 16:54:15 -07:00
test_url.py add continue to fix 'continue_on_failure' parameter for URL doc loader (#2735) 2023-04-11 21:12:39 -07:00
test_whatsapp_chat.py Update WhatsAppChatLoader to include the character ~ in the sender name (#4420) 2023-05-09 15:00:04 -07:00
test_xml.py feat: Add UnstructuredXMLLoader for .xml files (#5955) 2023-06-10 16:24:42 -07:00