langchain/langchain/document_loaders
Dhruvil Shah 2eec687474
update web_base.py to have verify option (#6107)
This change adds a "verify" option to the web-based loader's initialize
method. It addresses the SSL verification errors encountered on certain
web pages: users who hit such an error can set the verify parameter to
False to skip certificate verification for that loader.
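
As a rough illustration of what a verify switch controls, here is a minimal sketch built only on the Python standard library. It is not the code in web_base.py: the names `build_ssl_context` and `fetch` are hypothetical, and the loader's real HTTP client and parameter handling may differ. The sketch only shows the general idea of turning certificate and hostname checks off when `verify` is False.

```python
import ssl
import urllib.request


def build_ssl_context(verify: bool = True) -> ssl.SSLContext:
    """Return a TLS context; hypothetical helper, not part of LangChain."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def fetch(url: str, verify: bool = True, timeout: float = 10.0) -> str:
    """Download a page, optionally skipping certificate verification."""
    context = build_ssl_context(verify)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch(url, verify=False)` would accept a self-signed or otherwise invalid certificate. Disabling verification removes protection against man-in-the-middle attacks, so it is probably best reserved for hosts that are already trusted.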

### Fixes #6079 

#### Who can review?
@eyurtsev @hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
blob_loaders YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772) 1 year ago
parsers YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772) 1 year ago
__init__.py Feature/add acreom loader (#5780) 12 months ago
acreom.py Feature/add acreom loader (#5780) 12 months ago
airbyte_json.py Dev2049/add modern treasury (#3924) 1 year ago
airtable.py Create Airtable loader (#5958) 12 months ago
apify_dataset.py changed ValueError to ImportError (#5103) 1 year ago
arxiv.py `Arxiv` document loader (#3627) 1 year ago
azlyrics.py clean up loaders (#1178) 1 year ago
azure_blob_storage_container.py Minor: Remove duplicated word in error message (#2706) 1 year ago
azure_blob_storage_file.py Minor: Remove duplicated word in error message (#2706) 1 year ago
base.py docs `retriever` improvements (#4430) 1 year ago
bibtex.py Bibtex integration for document loader and retriever (#5137) 1 year ago
bigquery.py Ability to specify credentials when using Google BigQuery as a data loader (#5466) 1 year ago
bilibili.py Fix bilibili (#4860) 1 year ago
blackboard.py hotfix (#1742) 1 year ago
blockchain.py Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797) 1 year ago
chatgpt.py Add ChatGPT Data Loader (#3336) 1 year ago
college_confidential.py clean up loaders (#1178) 1 year ago
confluence.py feat: add content_format param to ConfluenceLoader.load() (#5922) 12 months ago
conllu.py add CoNLL-U document loader (#1297) 1 year ago
csv_loader.py feat: Add `UnstructuredCSVLoader` for CSV files (#5844) 12 months ago
dataframe.py Fix typo in dataframe.py (#4786) 1 year ago
diffbot.py consistently use getLogger(__name__), no root logger (#2989) 1 year ago
directory.py Add path validation to DirectoryLoader (#5327) 1 year ago
discord.py Harrison/discord loader (#3200) 1 year ago
docugami.py changed ValueError to ImportError (#5103) 1 year ago
duckdb_loader.py changed ValueError to ImportError (#5103) 1 year ago
email.py fix: pass unstructured kwargs down in all unstructured loaders (#2506) 1 year ago
embaas.py Add embaas document extraction api endpoints (#6048) 12 months ago
epub.py fix: pass unstructured kwargs down in all unstructured loaders (#2506) 1 year ago
evernote.py feature/4493 Improve Evernote Document Loader (#4577) 1 year ago
excel.py feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617) 1 year ago
facebook_chat.py Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) 1 year ago
fauna.py Harrison/fauna loader (#5864) 12 months ago
figma.py Dev2049/add modern treasury (#3924) 1 year ago
gcs_directory.py Support GCS Objects with `/` in GCS Loaders (#3356) 1 year ago
gcs_file.py Support GCS Objects with `/` in GCS Loaders (#3356) 1 year ago
generic.py Add a generic document loader (#4875) 1 year ago
git.py Add source field to metadata (#4462) 1 year ago
gitbook.py Gitbook enhancements (#2279) 1 year ago
github.py DocumentLoader for GitHub (#5408) 1 year ago
googledrive.py Allow GoogleDrive to authenticate via application default credentials on Cloud Run/GCE etc without service key (#6035) 12 months ago
gutenberg.py gutenberg books (#946) 1 year ago
helpers.py feat #4479: TextLoader auto detect encoding and improved exceptions (#4927) 1 year ago
hn.py Refactor some loops into list comprehensions (#1185) 1 year ago
html.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
html_bs.py Add get_text_separator parameter to BSHTMLLoader (#3551) 1 year ago
hugging_face_dataset.py Hugging Face Loader: Add lazy load (#4799) 1 year ago
ifixit.py Harrison/ifixit (#1680) 1 year ago
image.py feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667) 1 year ago
image_captions.py changed ValueError to ImportError (#5103) 1 year ago
imsdb.py clean up loaders (#1178) 1 year ago
iugu.py Add Iugu document loader (#5162) 1 year ago
joplin.py Fixed regression in JoplinLoader's get note url (#5265) 1 year ago
json_loader.py changed ValueError to ImportError (#5103) 1 year ago
markdown.py fix: pass unstructured kwargs down in all unstructured loaders (#2506) 1 year ago
mastodon.py Add Mastodon toots loader (#5036) 1 year ago
max_compute.py add maxcompute (#5533) 1 year ago
mediawikidump.py Harrison/media wiki xml (#4072) 1 year ago
modern_treasury.py Dev2049/add modern treasury (#3924) 1 year ago
notebook.py changed ValueError to ImportError (#5103) 1 year ago
notion.py Harrison/unstructured support (#903) 1 year ago
notiondb.py Harrison/param notion db (#4689) 1 year ago
obsidian.py Dev2049/obsidian patch (#4204) 1 year ago
odt.py feat: add loader for open office odt files (#4405) 1 year ago
onedrive.py changed ValueError to ImportError (#5103) 1 year ago
onedrive_file.py Harrison/one drive loader (#4081) 1 year ago
pdf.py changed ValueError to ImportError (#5103) 1 year ago
powerpoint.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
psychic.py Harrison/psychic (#5063) 1 year ago
pyspark_dataframe.py fix (#5457) 1 year ago
python.py Add PythonLoader which auto-detects encoding of Python files (#3311) 1 year ago
readthedocs.py Allow readthedoc loader to pass custom html tag (#5175) 1 year ago
reddit.py Langchain with reddit (#3661) (#3768) 1 year ago
roam.py Harrison/add roam loader (#939) 1 year ago
rtf.py feat: add loader for rich text files (#3227) 1 year ago
s3_directory.py changed ValueError to ImportError (#5103) 1 year ago
s3_file.py changed ValueError to ImportError (#5103) 1 year ago
sitemap.py Strips whitespace and \n from loc before filtering urls from sitemap (#5728) 1 year ago
slack_directory.py Add Slack Directory Loader (#2841) 1 year ago
snowflake_loader.py Fix: SnowflakeLoader returning empty documents (#5967) 12 months ago
spreedly.py Harrison/spreedly (#3937) 1 year ago
srt.py changed ValueError to ImportError (#5103) 1 year ago
stripe.py Dev2049/add modern treasury (#3924) 1 year ago
telegram.py changed ValueError to ImportError (#5103) 1 year ago
text.py feat #4479: TextLoader auto detect encoding and improved exceptions (#4927) 1 year ago
tomarkdown.py 2markdown loader (#4796) 1 year ago
toml.py Add PDF parser implementations (#4356) 1 year ago
trello.py New Trello document loader (#4767) 1 year ago
twitter.py changed ValueError to ImportError (#5103) 1 year ago
unstructured.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
url.py enhancement: add elements mode to `UnstructuredURLLoader` (#3456) 1 year ago
url_playwright.py changed ValueError to ImportError (#5103) 1 year ago
url_selenium.py changed ValueError to ImportError (#5103) 1 year ago
weather.py Adding Weather Loader (#5056) 1 year ago
web_base.py update web_base.py to have verify option (#6107) 12 months ago
whatsapp_chat.py Update WhatsAppChatLoader to include the character ~ in the sender name (#4420) 1 year ago
wikipedia.py added `Wikipedia` document loader (#4141) 1 year ago
word_document.py feat: batch multiple files in a single Unstructured API request (#4525) 1 year ago
xml.py feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955) 12 months ago
youtube.py Harrison/youtube multi language (#5758) 1 year ago