langchain/langchain/document_loaders
Tim Asp 030ce9f506
fix import error of bs4 (#1952)
Ran into a broken build if bs4 wasn't installed in the project.

Minor tweak to follow the other doc loaders' optional package-loading
conventions.
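
For illustration, the optional package-loading convention mentioned above can be sketched roughly as follows: the third-party import is deferred until the loader actually runs, so importing the package never fails just because `bs4` is missing. The names `import_optional` and `LazyHTMLLoader` are hypothetical and are not taken from `html_bs.py`; this is a minimal sketch of the pattern, not the code in this commit.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency only at the point it is needed.

    Hypothetical helper: raises ImportError with an install hint
    instead of failing when the enclosing package is imported.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import `{module_name}`. "
            f"Install it with `pip install {pip_name}`."
        ) from exc


class LazyHTMLLoader:
    """Hypothetical loader: constructing it never touches bs4."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

    def load(self) -> str:
        # Deferred import: bs4 is only required once load() is called.
        bs4 = import_optional("bs4", "beautifulsoup4")
        with open(self.file_path, encoding="utf-8") as f:
            soup = bs4.BeautifulSoup(f, "html.parser")
        return soup.get_text()
```

With this shape, a project that never calls `load()` can import the module without having `bs4` installed.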

Also updated the HTML docs to include a reference to this new HTML loader.

Side note: Should there be two different HTML-to-text document loaders?
This new one only handles local files, while the existing unstructured
HTML loader handles HTML from both local and remote sources. So it seems
the main improvement was adding the title to the metadata, which is
useful but could also be added to `html.py`.
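
As a rough illustration of the point about title metadata, the sketch below extracts the page title alongside the text and stores it in a metadata dictionary. It uses the standard library's `html.parser` as a stand-in so it runs without `bs4`; the function name `html_to_document` and the dictionary layout are assumptions for this example, not the structure used by either loader.

```python
from html.parser import HTMLParser


class _TitleAndText(HTMLParser):
    """Collects the <title> text separately from the rest of the page text."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif data.strip():
            # Keep non-empty text nodes; whitespace-only nodes are skipped.
            self.chunks.append(data.strip())


def html_to_document(html: str, source: str) -> dict:
    """Hypothetical helper: return page text plus source and title metadata."""
    parser = _TitleAndText()
    parser.feed(html)
    parser.close()
    return {
        "page_content": "\n".join(parser.chunks),
        "metadata": {"source": source, "title": parser.title.strip()},
    }
```

The same `title` key could in principle be attached by any HTML loader, which is the consolidation the side note suggests. Note that this simplified parser does not filter out script or style contents.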
1 year ago
__init__.py Remove redundant .docx loader (closes #1716) + update how_to_guides.rst (#1891) 1 year ago
airbyte_json.py Harrison/airbyte (#989) 1 year ago
azlyrics.py clean up loaders (#1178) 1 year ago
base.py Harrison/unstructured support (#903) 1 year ago
blackboard.py hotfix (#1742) 1 year ago
college_confidential.py clean up loaders (#1178) 1 year ago
conllu.py add CoNLL-U document loader (#1297) 1 year ago
csv_loader.py Allow passing in encoding to csv_loader (#1836) 1 year ago
directory.py Add HTML document_loader that includes page title metadata (#1720) 1 year ago
email.py Harrison/unstructured structured (#1004) 1 year ago
evernote.py Update and rename everynote.py to evernote.py (#1060) 1 year ago
facebook_chat.py Harrison/fb loader (#1277) 1 year ago
figma.py Harrison/figma doc loader (#1908) 1 year ago
gcs_directory.py Harrison/add roam loader (#939) 1 year ago
gcs_file.py Harrison/add roam loader (#939) 1 year ago
gitbook.py Add optional `base_url` arg to `GitbookLoader` (#1552) 1 year ago
googledrive.py Add service account support to Google Drive (#1761) 1 year ago
gutenberg.py gutenberg books (#946) 1 year ago
hn.py Refactor some loops into list comprehensions (#1185) 1 year ago
html.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
html_bs.py fix import error of bs4 (#1952) 1 year ago
ifixit.py Harrison/ifixit (#1680) 1 year ago
image.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
imsdb.py clean up loaders (#1178) 1 year ago
markdown.py feat: document loader for markdown files (#1558) 1 year ago
notebook.py fix imports (#1288) 1 year ago
notion.py Harrison/unstructured support (#903) 1 year ago
obsidian.py add encoding parameter to ObsidianLoader (#1752) 1 year ago
pdf.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
powerpoint.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
readthedocs.py Harrison/rtd loader (#1513) 1 year ago
roam.py Harrison/add roam loader (#939) 1 year ago
s3_directory.py Harrison/add roam loader (#939) 1 year ago
s3_file.py Support S3 Object keys with `/` in `S3FileLoader` (#1517) 1 year ago
srt.py add srt loader (#1140) 1 year ago
telegram.py fix telegram imports (#1110) 1 year ago
text.py Harrison/0083 (#996) 1 year ago
unstructured.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
url.py Fix description of UnstructuredURLLoader & UnstructuredHTMLLoader (#1570) 1 year ago
web_base.py Harrison/headers (#1696) 1 year ago
word_document.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
youtube.py Harrison/subtitles (#1842) 1 year ago