mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
c2b25c17c5
We may want to load all URLs under a root directory. For example, consider the [LangChain JS documentation](https://js.langchain.com/docs/), which has many child pages that we may want to read in bulk. The `WebBaseLoader` can load a list of pages, but the challenge is traversing the tree of child pages and assembling that list. The `RecursiveUrlLoader` does this, and it also lets us exclude some children (e.g., the `api` directory, which has more than 800 child pages).

- document_loaders/integrations
- document_transformers/text_splitters
- retrievers
- text_embedding/integrations
- vectorstores/integrations
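
The commit message above describes walking the tree of child pages under a root URL and skipping excluded directories. The following is a minimal standard-library sketch of that idea, not the `RecursiveUrlLoader` implementation itself. The function name `collect_child_urls`, the `fetch` callback, the `exclude_dirs` parameter as used here, and the `example.com` pages are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from typing import Callable, Iterable, List, Set
from urllib.parse import urldefrag, urljoin


class _LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_child_urls(
    root: str,
    fetch: Callable[[str], str],
    exclude_dirs: Iterable[str] = (),
) -> List[str]:
    """Return every URL reachable from `root` that stays under `root`.

    `fetch` is a hypothetical callback mapping a URL to its HTML text,
    so the traversal can be tried without network access.
    URLs starting with any prefix in `exclude_dirs` are skipped.
    """
    excluded = tuple(exclude_dirs)
    visited: Set[str] = set()
    ordered: List[str] = []
    stack = [root]  # explicit stack instead of recursion (depth-first)
    while stack:
        url = stack.pop()
        if url in visited or url.startswith(excluded):
            continue
        visited.add(url)
        ordered.append(url)
        parser = _LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            child, _ = urldefrag(urljoin(url, href))
            if child.startswith(root) and child not in visited:
                stack.append(child)
    return ordered


if __name__ == "__main__":
    # Hypothetical in-memory site standing in for real HTTP responses.
    site = {
        "https://example.com/docs/": '<a href="guide/">Guide</a><a href="api/">API</a>',
        "https://example.com/docs/guide/": '<a href="../">Up</a>',
    }
    found = collect_child_urls(
        "https://example.com/docs/",
        lambda u: site.get(u, ""),
        exclude_dirs=["https://example.com/docs/api"],
    )
    print(found)
```

The resulting list could then be handed to a loader that accepts a list of pages, which is the role the commit message gives to `WebBaseLoader`. Passing the fetcher in as a callback is a design choice of this sketch that keeps the traversal logic testable offline.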