langchain/docs/extras/modules/data_connection
Lance Martin c2b25c17c5
Recursive URL loader (#6455)
We may want to load all URLs under a root directory.

For example, let's look at the [LangChain JS
documentation](https://js.langchain.com/docs/).

This has many interesting child pages that we may want to read in bulk.

Of course, the `WebBaseLoader` can load a list of pages.

The challenge is traversing the tree of child pages and assembling
that list in the first place.
 
We do this using the `RecursiveUrlLoader`.

The loader also gives us the flexibility to exclude some children (e.g., the
`api` directory, which has more than 800 child pages).
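
The traversal described above can be sketched with the standard library alone. This is a minimal illustration of the idea, not the loader's actual implementation: the names `collect_child_urls`, `fetch_html`, and `exclude_dirs` are hypothetical, and the caller supplies the function that returns a page's HTML.

```python
from html.parser import HTMLParser
from typing import Callable, Iterable, List, Set
from urllib.parse import urldefrag, urljoin


class _LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_child_urls(
    root: str,
    fetch_html: Callable[[str], str],  # hypothetical: returns the HTML for a URL
    exclude_dirs: Iterable[str] = (),  # hypothetical: URL prefixes to skip
) -> List[str]:
    """Walk every page reachable under `root` and return the URLs visited."""
    excluded = tuple(exclude_dirs)
    visited: Set[str] = set()
    ordered: List[str] = []
    stack = [root]
    while stack:
        url = stack.pop()
        if url in visited:
            continue
        visited.add(url)
        ordered.append(url)
        parser = _LinkParser()
        parser.feed(fetch_html(url))
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            child, _fragment = urldefrag(urljoin(url, href))
            if not child.startswith(root):
                continue  # outside the root directory
            if any(child.startswith(prefix) for prefix in excluded):
                continue  # explicitly excluded subtree
            if child not in visited:
                stack.append(child)
    return ordered
```

The resulting list of URLs is the kind of input a list-based loader such as `WebBaseLoader` expects. Passing the fetch function in keeps the sketch testable without network access.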
2023-06-23 13:09:00 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| document_loaders/integrations | Recursive URL loader (#6455) | 2023-06-23 13:09:00 -07:00 |
| document_transformers/text_splitters | MD header text splitter returns Documents (#6571) | 2023-06-22 09:25:38 -07:00 |
| retrievers | docs/fix links (#6498) | 2023-06-20 14:06:50 -07:00 |
| text_embedding/integrations | Doc refactor (#6300) | 2023-06-16 11:52:56 -07:00 |
| vectorstores/integrations | added redis method to delete entries by keys (#6222) | 2023-06-22 13:26:47 -07:00 |