langchain/docs/modules/indexes/document_loaders/examples
Eugene Yurtsev 5cfa72a130
Bibtex integration for document loader and retriever (#5137)
# Bibtex integration

Wrap `bibtexparser` to retrieve a list of documents from a BibTeX file.
* Metadata is taken from the BibTeX entries.
* `page_content` is read with `pymupdf` from the local PDF referenced in
the `file` field of the BibTeX entry.
* If there is no valid PDF file, `page_content` is set to the `abstract`
field of the BibTeX entry.
* The Zotero flavour is supported, using a regex to extract the file path.
* A usage example was added in
`docs/modules/indexes/document_loaders/examples/bibtex.ipynb`.
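
The fallback described in the list above can be sketched roughly as follows. This is a minimal, standard-library illustration, not the code from the pull request: the function names, the `read_pdf` hook (standing in for the `pymupdf` extraction step), and the assumed Zotero `file` layout of `<label>:<path>:application/pdf` are all hypothetical.

```python
import re
from pathlib import Path
from typing import Callable, Dict, Optional

# ASSUMPTION: a Zotero-style `file` value looks like
# "<label>:<path>:application/pdf". The regex used by the real loader may differ.
ZOTERO_FILE = re.compile(r"^[^:]*:(?P<path>.+):application/pdf$")


def pdf_path_from_entry(entry: Dict[str, str]) -> Optional[str]:
    """Return the path named in the entry's `file` field, or None if absent."""
    raw = entry.get("file", "").strip()
    if not raw:
        return None
    match = ZOTERO_FILE.match(raw)
    # Plain (non-Zotero) values are treated as the path itself.
    return match.group("path") if match else raw


def page_content_for(entry: Dict[str, str], read_pdf: Callable[[str], str]) -> str:
    """Use the PDF text when the file exists, otherwise fall back to `abstract`."""
    path = pdf_path_from_entry(entry)
    if path and path.lower().endswith(".pdf") and Path(path).is_file():
        # `read_pdf` is a hypothetical hook for the PDF text extraction step.
        return read_pdf(path)
    return entry.get("abstract", "")


if __name__ == "__main__":
    entry = {
        "title": "An example entry",
        "abstract": "An abstract.",
        "file": "Full Text PDF:/no/such/dir/paper.pdf:application/pdf",
    }
    # The referenced file does not exist, so the abstract is used instead.
    print(page_content_for(entry, read_pdf=lambda p: "PDF TEXT"))
```

Passing the PDF reader in as a callable keeps the sketch free of third-party dependencies; in practice that hook would wrap whatever PDF library is in use.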
---------

Co-authored-by: Sébastien M. Popoff <sebastien.popoff@espci.fr>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
example_data feature/4493 Improve Evernote Document Loader (#4577) 1 year ago
airbyte_json.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
apify_dataset.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
arxiv.ipynb Deleted importing Document from document_loaders.base because Documen… (#4068) 1 year ago
aws_s3_directory.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
aws_s3_file.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
azlyrics.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
azure_blob_storage_container.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
azure_blob_storage_file.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
bibtex.ipynb Bibtex integration for document loader and retriever (#5137) 1 year ago
bilibili.ipynb Fix bilibili (#4860) 1 year ago
blackboard.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
blockchain.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
chatgpt_loader.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
college_confidential.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
confluence.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
conll-u.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
copypaste.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
csv.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
diffbot.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
discord_loader.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
docugami.ipynb Docugami docs: First cell should be a title cell (#4735) 1 year ago
duckdb.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
email.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
epub.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
evernote.ipynb feature/4493 Improve Evernote Document Loader (#4577) 1 year ago
facebook_chat.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
figma.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
file_directory.ipynb feat #4479: TextLoader auto detect encoding and improved exceptions (#4927) 1 year ago
git.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
gitbook.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
google_bigquery.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
google_cloud_storage_directory.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
google_cloud_storage_file.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
google_drive.ipynb Load specific file types from Google Drive (issue #4878) (#4926) 1 year ago
gutenberg.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
hacker_news.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
html.ipynb 2markdown loader (#4796) 1 year ago
hugging_face_dataset.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
ifixit.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
image.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
image_captions.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
imsdb.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
iugu.ipynb Add Iugu document loader (#5162) 1 year ago
joplin.ipynb Add Joplin document loader (#5153) 1 year ago
json.ipynb docs: added missed `document_loaders` examples (#5150) 1 year ago
jupyter_notebook.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
markdown.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
mastodon.ipynb Add Mastodon toots loader (#5036) 1 year ago
mediawikidump.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
microsoft_onedrive.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
microsoft_powerpoint.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
microsoft_word.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
modern_treasury.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
notion.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
notiondb.ipynb Harrison/param notion db (#4689) 1 year ago
obsidian.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
odt.ipynb docs: added missed `document_loaders` examples (#5150) 1 year ago
pandas_dataframe.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
pdf.ipynb Feature: pdfplumber PDF loader with BaseBlobParser (#4552) 1 year ago
psychic.ipynb Harrison/psychic (#5063) 1 year ago
readthedocs_documentation.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
reddit.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
roam.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
sitemap.ipynb Harrison/sitemap local (#4704) 1 year ago
slack.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
spreedly.ipynb Vwp/docs improved document loaders (#4006) 1 year ago
stripe.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
subtitle.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
telegram.ipynb fix(document_loaders/telegram): fix pandas calls + add tests (#4806) 1 year ago
tomarkdown.ipynb docs: added missed `document_loaders` examples (#5150) 1 year ago
toml.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
twitter.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
unstructured_file.ipynb feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
url.ipynb Harrison/playwright (#2871) 1 year ago
weather.ipynb Adding Weather Loader (#5056) 1 year ago
web_base.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
whatsapp_chat.ipynb docs: `document_loaders` improvements (#4200) 1 year ago
wikipedia.ipynb added `Wikipedia` document loader (#4141) 1 year ago
youtube_transcript.ipynb docs: `document_loaders` improvements (#4200) 1 year ago