Mirror of https://github.com/hwchase17/langchain, synced 2024-11-08 07:10:35 +00:00
e4cfaa5680
### Summary

This PR introduces a `SeleniumURLLoader` which, similar to `UnstructuredURLLoader`, loads data from URLs. However, it utilizes `selenium` to fetch page content, enabling it to work with JavaScript-rendered pages. The `unstructured` library is also employed for loading the HTML content.

### Testing

```bash
pip install selenium
pip install unstructured
```

```python
from langchain.document_loaders import SeleniumURLLoader

urls = [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://goo.gl/maps/NDSHwePEyaHMFGwh8"
]

loader = SeleniumURLLoader(urls=urls)
data = loader.load()
```
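To make the fetch-then-parse flow in the summary concrete, here is a minimal sketch of how such a loader might work. It is not the PR's actual implementation: `PageDocument`, `fetch_with_selenium`, `html_to_text`, and `load_urls` are hypothetical names, and the sketch assumes headless Chrome and a matching driver are available when the default fetcher is used.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PageDocument:
    """Hypothetical stand-in for a loaded document: text plus metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def fetch_with_selenium(url: str) -> str:
    """Return the page HTML as the browser sees it when page_source is read."""
    # Imported lazily so the rest of the sketch runs without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # assumption: headless Chrome is available
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser, even on errors


def html_to_text(html: str) -> str:
    """Extract readable text from HTML using the unstructured library."""
    from unstructured.partition.html import partition_html

    elements = partition_html(text=html)
    return "\n\n".join(str(element) for element in elements)


def load_urls(
    urls: List[str],
    fetch: Callable[[str], str] = fetch_with_selenium,
    parse: Callable[[str], str] = html_to_text,
) -> List[PageDocument]:
    """Fetch each URL, parse its HTML, and record the source URL as metadata."""
    documents = []
    for url in urls:
        text = parse(fetch(url))
        documents.append(PageDocument(page_content=text, metadata={"source": url}))
    return documents
```

Passing `fetch` and `parse` as parameters is a design choice of this sketch only: it lets the flow be exercised with stub functions where no browser is installed.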
| Name |
|---|
| _static |
| ecosystem |
| getting_started |
| modules |
| reference |
| tracing |
| use_cases |
| conf.py |
| deployments.md |
| ecosystem.rst |
| gallery.rst |
| glossary.md |
| index.rst |
| make.bat |
| Makefile |
| model_laboratory.ipynb |
| reference.rst |
| requirements.txt |
| tracing.md |