forked from Archives/langchain
030ce9f506
Ran into a broken build if bs4 wasn't installed in the project. Minor tweak to follow the other doc loaders' optional package-loading conventions. Also updated the HTML docs to include a reference to this new HTML loader. Side note: should there be two different HTML-to-text document loaders? This new one only handles local files, while the existing unstructured HTML loader handles HTML from both local and remote sources. So it seems the improvement was adding the title to the metadata, which is useful but could also be added to `html.py`.

| Name |
|---|
| conllu.conllu |
| facebook_chat.json |
| fake-content.html |
| fake-email.eml |
| fake-power-point.pptx |
| fake.docx |
| layout-parser-paper.pdf |
| mlb_teams_2012.csv |
| notebook.ipynb |
| telegram.json |
| testing.enex |
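
The optional package-loading convention mentioned in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual loader code from the repository: `require_optional` and `LocalHTMLLoader` are hypothetical names, and the returned dictionary shape is an assumption. The idea is that `bs4` is imported only when the loader actually runs, so a project without `bs4` installed can still import the module and build.

```python
import importlib


def require_optional(module_name: str, pip_name: str):
    """Import an optional dependency on demand, or fail with an install hint.

    Hypothetical helper for illustration; not part of any library API.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import {module_name}. "
            f"Install it with `pip install {pip_name}`."
        ) from exc


class LocalHTMLLoader:
    """Hypothetical loader for a local HTML file that records the page title."""

    def __init__(self, file_path: str):
        # No bs4 import here, so constructing the loader never breaks the build.
        self.file_path = file_path

    def load(self) -> dict:
        # The optional dependency is resolved only when load() is called.
        bs4 = require_optional("bs4", "beautifulsoup4")
        with open(self.file_path, encoding="utf-8") as f:
            soup = bs4.BeautifulSoup(f, "html.parser")
        # soup.title or its string may be missing; fall back to an empty title.
        title = soup.title.string if soup.title and soup.title.string else ""
        return {
            "text": soup.get_text(),
            "metadata": {"source": self.file_path, "title": title},
        }
```

With `bs4` installed, something like `LocalHTMLLoader("fake-content.html").load()` would return the page text along with the title in the metadata. Without it, only the call to `load()` fails, and the error names the package to install.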