mirror of https://github.com/hwchase17/langchain
synced 2024-11-08 07:10:35 +00:00
Bagatur/apify (#8008)
Co-authored-by: Jiří Moravčík <jiri.moravcik@gmail.com>
Co-authored-by: Jan Čurn <jan.curn@gmail.com>
parent 1d7414a371
commit 7c24a6b9d1
@@ -7,7 +7,23 @@ from langchain.document_loaders.base import BaseLoader
 
 
 class ApifyDatasetLoader(BaseLoader, BaseModel):
-    """Loading Documents from Apify datasets."""
+    """Loads datasets from Apify-a web scraping, crawling, and data extraction platform.
+    For details, see https://docs.apify.com/platform/integrations/langchain
+
+    Example:
+        .. code-block:: python
+
+            from langchain.document_loaders import ApifyDatasetLoader
+            from langchain.schema import Document
+
+            loader = ApifyDatasetLoader(
+                dataset_id="YOUR-DATASET-ID",
+                dataset_mapping_function=lambda dataset_item: Document(
+                    page_content=dataset_item["text"], metadata={"source": dataset_item["url"]}
+                ),
+            )
+            documents = loader.load()
+    """  # noqa: E501
 
     apify_client: Any
     """An instance of the ApifyClient class from the apify-client Python package."""
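The mapping step from the docstring example can be tried without network access. What follows is a minimal sketch, assuming each dataset item is a dict with "text" and "url" keys, as in the docstring. The `Document` dataclass is a stand-in for `langchain.schema.Document`, and `map_items` is a hypothetical helper written for illustration; neither is part of LangChain or the Apify client, and the sketch does not show how `ApifyDatasetLoader.load()` is implemented.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List


@dataclass
class Document:
    """Stand-in for langchain.schema.Document (assumption, illustration only)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def map_items(
    items: Iterable[Dict[str, Any]],
    mapping_function: Callable[[Dict[str, Any]], Document],
) -> List[Document]:
    """Hypothetical helper: apply the mapping function to every dataset item."""
    return [mapping_function(item) for item in items]


# Made-up items shaped like the ones the docstring example expects.
sample_items = [
    {"text": "First page", "url": "https://example.com/a"},
    {"text": "Second page", "url": "https://example.com/b"},
]

# Same lambda shape as in the docstring: item text becomes the page content,
# item URL is recorded as the document's source.
documents = map_items(
    sample_items,
    lambda dataset_item: Document(
        page_content=dataset_item["text"],
        metadata={"source": dataset_item["url"]},
    ),
)
```

Keeping the mapping as a plain function of one item makes it easy to unit-test against sample dicts before pointing the loader at a real dataset.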