langchain/docs
Kevin Huang e4cfaa5680
Introduces SeleniumURLLoader for JavaScript-Dependent Web Page Data Retrieval (#2291)
### Summary
This PR introduces `SeleniumURLLoader`, which, like
`UnstructuredURLLoader`, loads data from URLs. Unlike that loader, it uses
`selenium` to fetch page content, so it also works with
JavaScript-rendered pages. The fetched HTML is then parsed with the
`unstructured` library.
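
The two-stage flow described above (render the page in a browser, then extract text from the resulting HTML) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code added in this PR: `load_rendered_pages`, `PageDriver`, `make_chrome_driver`, and `unstructured_text` are hypothetical names, and the choice of headless Chrome and the skip-on-error behaviour are assumptions.

```python
from typing import Any, Callable, Dict, Iterable, List, Protocol


class PageDriver(Protocol):
    """Hypothetical minimal interface; a Selenium WebDriver satisfies it."""

    page_source: str

    def get(self, url: str) -> None: ...


def load_rendered_pages(
    urls: Iterable[str],
    driver: PageDriver,
    extract_text: Callable[[str], str],
) -> List[Dict[str, Any]]:
    """Fetch each URL through a browser driver and extract text from its HTML."""
    docs: List[Dict[str, Any]] = []
    for url in urls:
        try:
            driver.get(url)            # the browser runs the page's JavaScript
            html = driver.page_source  # HTML as it stands after rendering
        except Exception as exc:       # assumption: skip pages that fail to load
            print(f"Error fetching {url}: {exc}")
            continue
        docs.append({"page_content": extract_text(html), "metadata": {"source": url}})
    return docs


def make_chrome_driver():
    """Assumed setup: headless Chrome (needs `selenium` and a local Chrome)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def unstructured_text(html: str) -> str:
    """Assumed parsing step: join the elements `unstructured` finds in the HTML."""
    from unstructured.partition.html import partition_html

    return "\n\n".join(str(element) for element in partition_html(text=html))


# Usage (not executed here, since it needs a browser):
#   driver = make_chrome_driver()
#   docs = load_rendered_pages(["https://example.com"], driver, unstructured_text)
#   driver.quit()
```

Passing the driver and the text extractor in as arguments keeps the fetching loop independent of any particular browser or parser, which also makes it easy to exercise with a stub driver.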

### Testing
```bash
pip install selenium
pip install unstructured
```

```python
from langchain.document_loaders import SeleniumURLLoader

urls = [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://goo.gl/maps/NDSHwePEyaHMFGwh8"
]

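# Fetch each URL with Selenium so JavaScript-rendered content is included.
loader = SeleniumURLLoader(urls=urls)
data = loader.load()  # list of loaded documents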
```
2023-04-02 14:05:00 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| _static | Harrison/apify (#2215) | 2023-03-30 20:58:14 -07:00 |
| ecosystem | Harrison/apify (#2215) | 2023-03-30 20:58:14 -07:00 |
| getting_started | Fixed 1 missing line in getting_started.md (#2107) | 2023-03-28 15:03:28 -07:00 |
| modules | Introduces SeleniumURLLoader for JavaScript-Dependent Web Page Data Retrieval (#2291) | 2023-04-02 14:05:00 -07:00 |
| reference | Harrison/deeplake (#1316) | 2023-02-26 22:35:04 -08:00 |
| tracing | tracing improvements to docs (#1947) | 2023-03-23 19:00:18 -07:00 |
| use_cases | Update apis.md (#2278) | 2023-04-01 12:48:16 -07:00 |
| conf.py | Corrects copyright year (#1762) | 2023-03-18 19:55:05 -07:00 |
| deployments.md | docs(deployment): add langchain-serve (#2006) | 2023-03-27 23:32:04 -07:00 |
| ecosystem.rst | Docs refactor (#480) | 2023-01-02 08:24:09 -08:00 |
| gallery.rst | docs: update gpt index references to LlamaIndex (#1856) | 2023-03-21 22:01:05 -07:00 |
| glossary.md | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| index.rst | small nit on index page (#2018) | 2023-03-27 00:15:24 -04:00 |
| make.bat | initial commit | 2022-10-24 14:51:15 -07:00 |
| Makefile | Feature: linkcheck-action (#534) (#542) | 2023-01-04 21:39:50 -08:00 |
| model_laboratory.ipynb | big docs refactor (#1978) | 2023-03-26 19:49:46 -07:00 |
| reference.rst | Feature: linkcheck-action (#534) (#542) | 2023-01-04 21:39:50 -08:00 |
| requirements.txt | Harrison/docs reqs (#2199) | 2023-03-30 08:20:30 -07:00 |
| tracing.md | Harrison/tracing docs (#806) | 2023-01-29 20:49:35 -08:00 |