Mirror of https://github.com/hwchase17/langchain (synced 2024-11-08 07:10:35 +00:00)
bd4865b6fe
Description: This PR improves recursive_url_loader by adding a limit on the crawl depth and customizable extractors (which turn the raw webpage into the text of the Document object), so that users can plug in other tools to extract webpage content. It also includes documentation and tests for the new loader. The old PR was closed due to a project structure change. #7756 Because socket requests are not allowed, the old unit test was removed. Issue: N/A. Dependencies: asyncio, aiohttp. Tag maintainer: @rlancemartin. Twitter handle: @Zend_Nihility. Co-authored-by: Lance Martin <lance@langchain.dev> | ||
---|---|---|
callbacks | ||
chat | ||
document_loaders | ||
document_transformers | ||
llms | ||
memory | ||
providers | ||
retrievers | ||
text_embedding | ||
toolkits | ||
tools | ||
vectorstores |
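
The commit description above mentions two ideas: limiting crawl depth and letting the caller supply an extractor that turns raw HTML into document text. The sketch below illustrates both in plain Python. It is not the loader's actual implementation or API: `PAGES`, `strip_tags`, and `crawl` are hypothetical names, the pages live in an in-memory dict so no socket requests are needed, and the depth convention (the root counts as depth 0) is an assumption made for this example.

```python
import re
from typing import Callable, Dict, List, Optional, Set

# Hypothetical in-memory "site" (URL -> raw HTML), standing in for real HTTP fetches.
PAGES: Dict[str, str] = {
    "https://example.com/": '<a href="https://example.com/a">A</a><p>root</p>',
    "https://example.com/a": '<a href="https://example.com/b">B</a><p>page a</p>',
    "https://example.com/b": "<p>page b</p>",
}


def strip_tags(raw_html: str) -> str:
    """Default extractor (illustrative): drop tags and collapse whitespace."""
    return " ".join(re.sub(r"<[^>]+>", " ", raw_html).split())


def crawl(
    url: str,
    max_depth: int,
    extractor: Callable[[str], str] = strip_tags,
    fetch: Callable[[str], Optional[str]] = PAGES.get,
    _depth: int = 0,
    _visited: Optional[Set[str]] = None,
) -> List[dict]:
    """Depth-first crawl; the root is depth 0 and pages deeper than max_depth are skipped."""
    visited = _visited if _visited is not None else set()
    if _depth > max_depth or url in visited:
        return []
    visited.add(url)

    raw = fetch(url)
    if raw is None:  # unknown URL: nothing to extract
        return []

    # The extractor decides how raw HTML becomes document text.
    docs = [{"source": url, "text": extractor(raw)}]

    # Follow every href found on the page, one level deeper.
    for link in re.findall(r'href="([^"]+)"', raw):
        docs.extend(crawl(link, max_depth, extractor, fetch, _depth + 1, visited))
    return docs
```

Calling `crawl("https://example.com/", max_depth=1)` visits the root and the page it links to, but not `https://example.com/b`. Passing a different `extractor` (for example `str.upper`, or a function wrapping an HTML parsing library) changes only how the text is produced, not which pages are visited.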