langchain/docs/modules/indexes/document_loaders/examples
Mike McGarry ddd595fe81
feature/4493 Improve Evernote Document Loader (#4577)
# Improve Evernote Document Loader

An Evernote export can contain more than one note. Currently, the
Evernote loader concatenates the content of all notes in the export
into a single document and attaches only the name of the export file
as metadata on that document.

This change ensures that each note is loaded as an independent document
and that all available metadata on the note (e.g. author, title,
created, updated) is added as metadata on each document.

It also uses `html2text`, an existing optional dependency, instead of
`pypandoc`. This removes the need to download the pandoc application via
`download_pandoc()` before the `pypandoc` Python bindings can be used.
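
The per-note behaviour described above can be illustrated with a minimal,
standard-library-only sketch. It is not the loader's actual implementation:
`load_notes`, `html_to_text`, and `SAMPLE_ENEX` are hypothetical names, the
sample export is made up, and a small `html.parser` subclass stands in for
`html2text` so the example runs without optional dependencies.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up two-note export in the ENEX layout (assumption: illustrative data only).
SAMPLE_ENEX = """<en-export>
  <note>
    <title>First note</title>
    <content><![CDATA[<en-note><div>Hello <b>world</b></div></en-note>]]></content>
    <created>20230501T120000Z</created>
    <updated>20230502T130000Z</updated>
    <note-attributes><author>Alice</author></note-attributes>
  </note>
  <note>
    <title>Second note</title>
    <content><![CDATA[<en-note><div>Another note</div></en-note>]]></content>
    <created>20230503T090000Z</created>
    <note-attributes><author>Bob</author></note-attributes>
  </note>
</en-export>
"""


class _TextExtractor(HTMLParser):
    """Collects text nodes only; a stand-in for html2text in this sketch."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip tags and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def load_notes(enex_xml, source):
    """Return one record per <note>, each carrying its own metadata."""
    root = ET.fromstring(enex_xml)
    docs = []
    for note in root.iter("note"):
        metadata = {"source": source}
        for child in note:
            if child.tag == "content":
                continue  # the body becomes page_content, not metadata
            if child.tag == "note-attributes":
                # Flatten nested attributes such as <author> into the metadata.
                for attr in child:
                    metadata[attr.tag] = (attr.text or "").strip()
            elif child.text and child.text.strip():
                metadata[child.tag] = child.text.strip()
        content = note.findtext("content", default="")
        docs.append({"page_content": html_to_text(content), "metadata": metadata})
    return docs


if __name__ == "__main__":
    for doc in load_notes(SAMPLE_ENEX, "sample.enex"):
        print(doc["metadata"].get("title"), "|", doc["page_content"])
```

Each note yields its own record, so a field missing from one note (such as
`updated` on the second note here) does not affect the others.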

Fixes #4493 

Co-authored-by: Mike McGarry <mike.mcgarry@finbourne.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-19 14:28:17 -07:00
example_data feature/4493 Improve Evernote Document Loader (#4577) 2023-05-19 14:28:17 -07:00
airbyte_json.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
apify_dataset.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
arxiv.ipynb Deleted importing Document from document_loaders.base because Documen… (#4068) 2023-05-03 17:54:30 -07:00
aws_s3_directory.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
aws_s3_file.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
azlyrics.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
azure_blob_storage_container.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
azure_blob_storage_file.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
bilibili.ipynb Fix bilibili (#4860) 2023-05-18 09:56:51 -04:00
blackboard.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
blockchain.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
chatgpt_loader.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
college_confidential.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
confluence.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
conll-u.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
copypaste.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
csv.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
diffbot.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
discord_loader.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
docugami.ipynb Docugami docs: First cell should be a title cell (#4735) 2023-05-16 13:12:14 -04:00
duckdb.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
email.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
epub.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
evernote.ipynb feature/4493 Improve Evernote Document Loader (#4577) 2023-05-19 14:28:17 -07:00
facebook_chat.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
figma.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
file_directory.ipynb feat #4479: TextLoader auto detect encoding and improved exceptions (#4927) 2023-05-18 09:55:14 -04:00
git.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
gitbook.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
google_bigquery.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
google_cloud_storage_directory.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
google_cloud_storage_file.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
google_drive.ipynb Load specific file types from Google Drive (issue #4878) (#4926) 2023-05-18 09:27:53 -04:00
gutenberg.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
hacker_news.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
html.ipynb 2markdown loader (#4796) 2023-05-16 23:42:53 -07:00
hugging_face_dataset.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
ifixit.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
image_captions.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
image.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
imsdb.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
json_loader.ipynb JSON loader (#4067) 2023-05-05 14:48:13 -07:00
jupyter_notebook.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
markdown.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
mediawikidump.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
microsoft_onedrive.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
microsoft_powerpoint.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
microsoft_word.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
modern_treasury.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
notion.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
notiondb.ipynb Harrison/param notion db (#4689) 2023-05-14 18:26:25 -07:00
obsidian.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
odt.ipynb feat: add loader for open office odt files (#4405) 2023-05-10 01:37:17 -07:00
pandas_dataframe.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
pdf.ipynb Feature: pdfplumber PDF loader with BaseBlobParser (#4552) 2023-05-15 09:47:02 -04:00
readthedocs_documentation.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
reddit.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
roam.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
sitemap.ipynb Harrison/sitemap local (#4704) 2023-05-14 22:04:38 -07:00
slack.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
spreedly.ipynb Vwp/docs improved document loaders (#4006) 2023-05-02 15:24:53 -07:00
stripe.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
subtitle.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
telegram.ipynb fix(document_loaders/telegram): fix pandas calls + add tests (#4806) 2023-05-16 14:35:25 -07:00
tomarkdown.ipynb 2markdown loader (#4796) 2023-05-16 23:42:53 -07:00
toml.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
twitter.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
unstructured_file.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
url.ipynb Harrison/playwright (#2871) 2023-04-13 22:15:03 -07:00
web_base.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
whatsapp_chat.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00
wikipedia.ipynb added Wikipedia document loader (#4141) 2023-05-06 09:32:45 -07:00
youtube_transcript.ipynb docs: document_loaders improvements (#4200) 2023-05-05 17:44:54 -07:00