langchain/langchain/document_loaders
Kenzie Mihardja b8d78424ab
Change Data Loader Namespace (#6568)
Description:
Update the artifact name of the xml file and the namespaces. Co-authored
with @tjaffri
Co-authored-by: Kenzie Mihardja <kenzie@docugami.com>
11 months ago
blob_loaders YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772) 1 year ago
parsers YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772) 1 year ago
__init__.py Fix class promotion (#6187) 12 months ago
acreom.py Feature/add acreom loader (#5780) 12 months ago
airbyte_json.py Dev2049/add modern treasury (#3924) 1 year ago
airtable.py Create Airtable loader (#5958) 12 months ago
apify_dataset.py changed ValueError to ImportError (#5103) 1 year ago
arxiv.py `Arxiv` document loader (#3627) 1 year ago
azlyrics.py clean up loaders (#1178) 1 year ago
azure_blob_storage_container.py Minor: Remove duplicated word in error message (#2706) 1 year ago
azure_blob_storage_file.py Minor: Remove duplicated word in error message (#2706) 1 year ago
base.py docs `retriever` improvements (#4430) 1 year ago
bibtex.py Bibtex integration for document loader and retriever (#5137) 1 year ago
bigquery.py Ability to specify credentials when using Google BigQuery as a data loader (#5466) 1 year ago
bilibili.py Fix bilibili (#4860) 1 year ago
blackboard.py hotfix (#1742) 1 year ago
blockchain.py Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797) 1 year ago
chatgpt.py Add ChatGPT Data Loader (#3336) 1 year ago
college_confidential.py clean up loaders (#1178) 1 year ago
confluence.py feat: add content_format param to ConfluenceLoader.load() (#5922) 12 months ago
conllu.py add CoNLL-U document loader (#1297) 1 year ago
csv_loader.py feat: Add `UnstructuredCSVLoader` for CSV files (#5844) 12 months ago
dataframe.py Fix typo in dataframe.py (#4786) 1 year ago
diffbot.py consistently use getLogger(__name__), no root logger (#2989) 1 year ago
directory.py Add path validation to DirectoryLoader (#5327) 1 year ago
discord.py Harrison/discord loader (#3200) 1 year ago
docugami.py Change Data Loader Namespace (#6568) 11 months ago
duckdb_loader.py changed ValueError to ImportError (#5103) 1 year ago
email.py fix: pass unstructured kwargs down in all unstructured loaders (#2506) 1 year ago
embaas.py Add embaas document extraction api endpoints (#6048) 12 months ago
epub.py fix: pass unstructured kwargs down in all unstructured loaders (#2506) 1 year ago
evernote.py feature/4493 Improve Evernote Document Loader (#4577) 1 year ago
excel.py feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617) 1 year ago
facebook_chat.py Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) 1 year ago
fauna.py Harrison/fauna loader (#5864) 12 months ago
figma.py Dev2049/add modern treasury (#3924) 1 year ago
gcs_directory.py Support GCS Objects with `/` in GCS Loaders (#3356) 1 year ago
gcs_file.py Support GCS Objects with `/` in GCS Loaders (#3356) 1 year ago
generic.py Add a generic document loader (#4875) 1 year ago
git.py Add source field to metadata (#4462) 1 year ago
gitbook.py Gitbook enhancements (#2279) 1 year ago
github.py DocumentLoader for GitHub (#5408) 1 year ago
googledrive.py Iterate through filtered file types instead of all listed files (#6258) 12 months ago
gutenberg.py gutenberg books (#946) 1 year ago
helpers.py feat #4479: TextLoader auto detect encoding and improved exceptions (#4927) 1 year ago
hn.py Refactor some loops into list comprehensions (#1185) 1 year ago
html.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
html_bs.py Add get_text_separator parameter to BSHTMLLoader (#3551) 1 year ago
hugging_face_dataset.py Hugging Face Loader: Add lazy load (#4799) 1 year ago
ifixit.py Harrison/ifixit (#1680) 1 year ago
image.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
image_captions.py changed ValueError to ImportError (#5103) 1 year ago
imsdb.py clean up loaders (#1178) 1 year ago
iugu.py Add Iugu document loader (#5162) 1 year ago
joplin.py Fixed regression in JoplinLoader's get note url (#5265) 1 year ago
json_loader.py changed ValueError to ImportError (#5103) 1 year ago
markdown.py fix: pass unstructured kwargs down in all unstructured loaders (#2506) 1 year ago
mastodon.py Add Mastodon toots loader (#5036) 1 year ago
max_compute.py add maxcompute (#5533) 1 year ago
mediawikidump.py Harrison/media wiki xml (#4072) 1 year ago
modern_treasury.py Dev2049/add modern treasury (#3924) 1 year ago
notebook.py changed ValueError to ImportError (#5103) 1 year ago
notion.py Harrison/unstructured support (#903) 1 year ago
notiondb.py Harrison/param notion db (#4689) 1 year ago
obsidian.py Dev2049/obsidian patch (#4204) 1 year ago
odt.py feat: add loader for open office odt files (#4405) 1 year ago
onedrive.py changed ValueError to ImportError (#5103) 1 year ago
onedrive_file.py Harrison/one drive loader (#4081) 1 year ago
pdf.py Fixed PermissionError on windows (#6170) 12 months ago
powerpoint.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
psychic.py Harrison/psychic (#5063) 1 year ago
pyspark_dataframe.py fix (#5457) 1 year ago
python.py Add PythonLoader which auto-detects encoding of Python files (#3311) 1 year ago
readthedocs.py Allow readthedoc loader to pass custom html tag (#5175) 1 year ago
reddit.py Langchain with reddit (#3661) (#3768) 1 year ago
roam.py Harrison/add roam loader (#939) 1 year ago
rtf.py feat: add loader for rich text files (#3227) 1 year ago
s3_directory.py changed ValueError to ImportError (#5103) 1 year ago
s3_file.py changed ValueError to ImportError (#5103) 1 year ago
sitemap.py Strips whitespace and \n from loc before filtering urls from sitemap (#5728) 1 year ago
slack_directory.py Add Slack Directory Loader (#2841) 1 year ago
snowflake_loader.py Fix: SnowflakeLoader returning empty documents (#5967) 12 months ago
spreedly.py Harrison/spreedly (#3937) 1 year ago
srt.py changed ValueError to ImportError (#5103) 1 year ago
stripe.py Dev2049/add modern treasury (#3924) 1 year ago
telegram.py changed ValueError to ImportError (#5103) 1 year ago
text.py feat #4479: TextLoader auto detect encoding and improved exceptions (#4927) 1 year ago
tomarkdown.py 2markdown loader (#4796) 1 year ago
toml.py Add PDF parser implementations (#4356) 1 year ago
trello.py New Trello document loader (#4767) 1 year ago
twitter.py changed ValueError to ImportError (#5103) 1 year ago
unstructured.py Harrison/unstructured page number (#6464) 12 months ago
url.py enhancement: add elements mode to `UnstructuredURLLoader` (#3456) 1 year ago
url_playwright.py changed ValueError to ImportError (#5103) 1 year ago
url_selenium.py changed ValueError to ImportError (#5103) 1 year ago
weather.py Adding Weather Loader (#5056) 1 year ago
web_base.py Update web_base.py _fetch() method For SiteMapLoader (#6256) 12 months ago
whatsapp_chat.py Fix whatsappchatloader - enable parsing new datetime format on WhatsApp chat (#6555) 11 months ago
wikipedia.py added `Wikipedia` document loader (#4141) 1 year ago
word_document.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
xml.py feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955) 12 months ago
youtube.py Harrison/youtube multi language (#5758) 1 year ago