langchain/libs/community/langchain_community/document_loaders
Christophe Bornet af8c5c185b
langchain[minor],community[minor]: Add async methods in BaseLoader (#16634)
Adds:
* methods `aload()` and `alazy_load()` to interface `BaseLoader`
* implementation for class `MergedDataLoader`
* support for class `BaseLoader` in async function `aindex()` with unit
tests

Note: this is compatible with existing `aload()` methods that some
loaders already had.
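
To illustrate what adding `aload()` and `alazy_load()` to a loader interface can look like, here is a minimal sketch. It is not the code from this commit: `SketchLoader` and `LinesLoader` are hypothetical stand-ins, documents are plain strings rather than `Document` objects, and the use of `asyncio.to_thread` (Python 3.9+) to wrap the synchronous iterator is an assumption about one reasonable default, not a statement of how `BaseLoader` does it.

```python
import asyncio
from typing import AsyncIterator, Iterator, List


class SketchLoader:
    """Hypothetical stand-in for a loader interface (not LangChain's BaseLoader)."""

    def lazy_load(self) -> Iterator[str]:
        # Subclasses provide the synchronous, lazy implementation.
        raise NotImplementedError

    def load(self) -> List[str]:
        # Eager synchronous load built on top of lazy_load().
        return list(self.lazy_load())

    async def alazy_load(self) -> AsyncIterator[str]:
        # Assumed default: drive the sync iterator from a worker thread so
        # the event loop is not blocked while each item is produced.
        iterator = await asyncio.to_thread(self.lazy_load)
        done = object()  # sentinel, avoids raising StopIteration across threads
        while True:
            item = await asyncio.to_thread(next, iterator, done)
            if item is done:
                break
            yield item

    async def aload(self) -> List[str]:
        # Eager async load built on top of alazy_load().
        return [doc async for doc in self.alazy_load()]


class LinesLoader(SketchLoader):
    """Hypothetical loader that yields pre-supplied strings."""

    def __init__(self, lines: List[str]) -> None:
        self.lines = lines

    def lazy_load(self) -> Iterator[str]:
        yield from self.lines


docs = asyncio.run(LinesLoader(["first", "second"]).aload())
```

With defaults like these, a loader that only implements `lazy_load()` gets working async methods for free, while a loader that already defines its own `aload()` simply overrides the default, which is consistent with the compatibility note above.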

**Twitter handle:** @cbornet_

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-01-31 11:08:11 -08:00
blob_loaders
parsers community[patch]: avoid KeyError when language not in LANGUAGE_SEGMENTERS (#15212) 2024-01-23 21:09:43 -08:00
__init__.py [Fix] Fix Cassandra Document loader default page content mapper (#16273) 2024-01-27 11:23:02 -08:00
acreom.py
airbyte_json.py
airbyte.py
airtable.py
apify_dataset.py
arcgis_loader.py
arxiv.py Update arxiv.py with get_summaries_as_docs inside of Arxivloader (#14953) 2023-12-22 13:14:22 -08:00
assemblyai.py community[minor]: add the ability to load existing transcripts from AssemblyAI by their id. (#16051) 2024-01-30 13:47:45 -08:00
astradb.py Use Postponed Evaluation of Annotations in Astra and Cassandra doc loaders (#16694) 2024-01-28 16:39:27 -08:00
async_html.py
azlyrics.py
azure_ai_data.py
azure_blob_storage_container.py
azure_blob_storage_file.py
baiducloud_bos_directory.py
baiducloud_bos_file.py
base_o365.py
base.py langchain[minor],community[minor]: Add async methods in BaseLoader (#16634) 2024-01-31 11:08:11 -08:00
bibtex.py
bigquery.py
bilibili.py
blackboard.py
blockchain.py
brave_search.py
browserless.py
cassandra.py Use Postponed Evaluation of Annotations in Astra and Cassandra doc loaders (#16694) 2024-01-28 16:39:27 -08:00
chatgpt.py
chm.py Feat: add CHM file loader (#15519) 2024-01-07 09:28:52 -08:00
chromium.py
college_confidential.py
concurrent.py
confluence.py
conllu.py
couchbase.py
csv_loader.py
cube_semantic.py
datadog_logs.py
dataframe.py
diffbot.py
directory.py Report which file was errored on in DirectoryLoader (#16790) 2024-01-30 09:14:58 -08:00
discord.py
doc_intelligence.py community: fix the "page" mode in the AzureAIDocumentIntelligenceParser (bug) (#15958) 2024-01-12 11:01:28 -08:00
docugami.py
docusaurus.py
dropbox.py
duckdb_loader.py
email.py
epub.py
etherscan.py
evernote.py
excel.py Docs: fix excel document loader typo (#15470) 2024-01-07 09:33:35 -08:00
facebook_chat.py
fauna.py
figma.py
gcs_directory.py
gcs_file.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
generic.py
geodataframe.py
git.py
gitbook.py
github.py
google_speech_to_text.py
googledrive.py
gutenberg.py
helpers.py
hn.py
html_bs.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
html.py
hugging_face_dataset.py
ifixit.py
image_captions.py
image.py
imsdb.py
iugu.py
joplin.py
json_loader.py
lakefs.py
larksuite.py
markdown.py corrected outdated link (#15053) 2023-12-22 12:39:38 -08:00
mastodon.py
max_compute.py
mediawikidump.py community:Lazy load wikipedia dump file (#15111) 2024-01-01 14:02:56 -08:00
merge.py langchain[minor],community[minor]: Add async methods in BaseLoader (#16634) 2024-01-31 11:08:11 -08:00
mhtml.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
modern_treasury.py
mongodb.py
news.py
notebook.py
notion.py
notiondb.py
nuclia.py
obs_directory.py
obs_file.py
obsidian.py
odt.py
onedrive_file.py
onedrive.py
onenote.py
open_city_data.py
org_mode.py
pdf.py community: Include PDF ID in MathPix metadata (#15629) 2024-01-07 08:31:53 -08:00
polars_dataframe.py
powerpoint.py
psychic.py
pubmed.py
pyspark_dataframe.py
python.py
quip.py
readthedocs.py
recursive_url_loader.py
reddit.py
roam.py
rocksetdb.py
rspace.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
rss.py
rst.py
rtf.py
s3_directory.py
s3_file.py
sharepoint.py
sitemap.py
slack_directory.py
snowflake_loader.py
spreedly.py
srt.py
stripe.py
surrealdb.py community[patch]: SurrealDB fix for asyncio (#16092) 2024-01-23 19:46:19 -08:00
telegram.py
tencent_cos_directory.py
tencent_cos_file.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
tensorflow_datasets.py
text.py
tomarkdown.py
toml.py
trello.py
tsv.py
twitter.py
unstructured.py community[patch]: Load list of files using UnstructuredFileLoader (#16216) 2024-01-23 19:37:37 -08:00
url_playwright.py
url_selenium.py
url.py
vsdx.py community[minor]: New documents loader for visio files (with extension .vsdx) (#16171) 2024-01-22 22:07:03 -08:00
weather.py
web_base.py community[patch]: Add Cookie Support to Fetch Method (#16673) 2024-01-27 16:03:53 -08:00
whatsapp_chat.py
wikipedia.py
word_document.py
xml.py
xorbits.py
youtube.py community[patch]: youtube loader transcript format (#16625) 2024-01-26 15:26:09 -08:00