langchain/tests/integration_tests/document_loaders
Julius Lipp 5b6bbf4ab2
Add embaas document extraction api endpoints (#6048)
# Introduces embaas document extraction api endpoints

In this PR, we add support for embaas document extraction endpoints to
Text Embedding Models (with LLMs coming in separate PRs). We currently
offer the MTEB leaderboard top performers, will continue to add top
embedding models, and will soon add support for customers to deploy their
own models. Additional documentation and information can be found
[here](https://embaas.io).

While developing this integration, I closely followed the patterns
established by other langchain integrations. Nonetheless, if there are
any aspects that require adjustments or if there's a better way to
present a new integration, let me know! :)

Additionally, I fixed some docs in the embeddings integration.

Related PR: #5976 

#### Who can review?
  DataLoaders
  - @eyurtsev
1 year ago
parsers Add html parsers (#4874) 1 year ago
__init__.py Add new iFixit document loader (#1333) 2 years ago
test_arxiv.py `Arxiv` document loader (#3627) 1 year ago
test_bigquery.py Harrison/big query (#2100) 2 years ago
test_bilibili.py Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619) 1 year ago
test_blockchain.py Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797) 1 year ago
test_confluence.py Several confluence loader improvements (#3300) 1 year ago
test_csv_loader.py feat: Add `UnstructuredCSVLoader` for CSV files (#5844) 1 year ago
test_dataframe.py rm pandas dependency (#2102) 2 years ago
test_duckdb.py Harrison/duckdb (#2064) 2 years ago
test_email.py Harrison/msg files (#2375) 1 year ago
test_embaas.py Add embaas document extraction api endpoints (#6048) 1 year ago
test_excel.py feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617) 1 year ago
test_facebook_chat.py Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) 1 year ago
test_fauna.py Harrison/fauna loader (#5864) 1 year ago
test_figma.py Harrison/figma doc loader (#1908) 2 years ago
test_gitbook.py Harrison/gitbook (#2044) 2 years ago
test_github.py DocumentLoader for GitHub (#5408) 1 year ago
test_ifixit.py Add new iFixit document loader (#1333) 2 years ago
test_joplin.py Add Joplin document loader (#5153) 1 year ago
test_json_loader.py JSON loader (#4067) 1 year ago
test_mastodon.py Add Mastodon toots loader (#5036) 1 year ago
test_max_compute.py add maxcompute (#5533) 1 year ago
test_modern_treasury.py Dev2049/add modern treasury (#3924) 1 year ago
test_odt.py feat: add loader for open office odt files (#4405) 1 year ago
test_pdf.py Dev2049/pypdfium2 (#4209) 1 year ago
test_pyspark_dataframe_loader.py Harrison/spark reader (#5405) 1 year ago
test_python.py Add PythonLoader which auto-detects encoding of Python files (#3311) 1 year ago
test_sitemap.py Harrison/sitemap local (#4704) 1 year ago
test_slack.py Add Slack Directory Loader (#2841) 1 year ago
test_spreedly.py Harrison/spreedly (#3937) 1 year ago
test_stripe.py Dev2049/add modern treasury (#3924) 1 year ago
test_unstructured.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
test_url.py add continue to fix 'continue_on_failure' parameter for URL doc loader (#2735) 1 year ago
test_url_playwright.py Harrison/playwright selector (#3185) 1 year ago
test_whatsapp_chat.py Update WhatsAppChatLoader to include the character ~ in the sender name (#4420) 1 year ago
test_xml.py feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955) 1 year ago