langchain/tests/integration_tests/document_loaders
Aarav Borthakur 210296a71f
Integrate Rockset as a document loader (#7681)

Integrate [Rockset](https://rockset.com/docs/) as a document loader.

Issue: None
Dependencies: None new (the `rockset` dependency was already added
[here](https://github.com/hwchase17/langchain/pull/6216))
Tag maintainer: @rlancemartin

I have added a test for the integration and an example notebook showing
its use. I ran `make lint` and everything looks good.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
parsers feat (documents): add a source code loader based on AST manipulation (#6486) 1 year ago
__init__.py Add new iFixit document loader (#1333) 2 years ago
test_arxiv.py `Arxiv` document loader (#3627) 1 year ago
test_bigquery.py Harrison/big query (#2100) 1 year ago
test_bilibili.py Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619) 1 year ago
test_blockchain.py codespell: workflow, config + some (quite a few) typos fixed (#6785) 1 year ago
test_confluence.py Several confluence loader improvements (#3300) 1 year ago
test_csv_loader.py feat: Add `UnstructuredCSVLoader` for CSV files (#5844) 1 year ago
test_dataframe.py rm pandas dependency (#2102) 1 year ago
test_duckdb.py Harrison/duckdb (#2064) 2 years ago
test_email.py feat: enable `UnstructuredEmailLoader` to process attachments (#6977) 1 year ago
test_embaas.py Add embaas document extraction api endpoints (#6048) 1 year ago
test_excel.py feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617) 1 year ago
test_facebook_chat.py Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) 1 year ago
test_fauna.py Harrison/fauna loader (#5864) 1 year ago
test_figma.py Harrison/figma doc loader (#1908) 2 years ago
test_gitbook.py Harrison/gitbook (#2044) 1 year ago
test_github.py DocumentLoader for GitHub (#5408) 1 year ago
test_ifixit.py Add new iFixit document loader (#1333) 2 years ago
test_joplin.py Add Joplin document loader (#5153) 1 year ago
test_json_loader.py JSON loader (#4067) 1 year ago
test_language.py feat (documents): add a source code loader based on AST manipulation (#6486) 1 year ago
test_larksuite.py feat (documents): add LarkSuite document loader (#6420) 1 year ago
test_mastodon.py Add Mastodon toots loader (#5036) 1 year ago
test_max_compute.py add maxcompute (#5533) 1 year ago
test_modern_treasury.py Dev2049/add modern treasury (#3924) 1 year ago
test_odt.py feat: add loader for open office odt files (#4405) 1 year ago
test_org_mode.py feat: Add `UnstructuredOrgModeLoader` (#6842) 1 year ago
test_pdf.py Harrison/unstructured page number (#6464) 1 year ago
test_pyspark_dataframe_loader.py Harrison/spark reader (#5405) 1 year ago
test_python.py Add PythonLoader which auto-detects encoding of Python files (#3311) 1 year ago
test_rocksetdb.py Integrate Rockset as a document loader (#7681) 1 year ago
test_rst.py feat: Add `UnstructuredRSTLoader` (#6594) 1 year ago
test_sitemap.py Harrison/sitemap local (#4704) 1 year ago
test_slack.py Add Slack Directory Loader (#2841) 1 year ago
test_spreedly.py Harrison/spreedly (#3937) 1 year ago
test_stripe.py Dev2049/add modern treasury (#3924) 1 year ago
test_tsv.py feat: Add `UnstructuredTSVLoader` (#7367) 1 year ago
test_unstructured.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
test_url.py add continue to fix 'continue_on_failure' parameter for URL doc loader (#2735) 1 year ago
test_url_playwright.py Added matching async load func to PlaywrightURLLoader (#5938) 1 year ago
test_whatsapp_chat.py Fix WhatsAppChatLoader : Enable parsing additional formats (#6663) 1 year ago
test_xml.py feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955) 1 year ago
test_xorbits.py Add Xorbits Dataframe as a Document Loader (#7319) 1 year ago