# Apify
This page covers how to use [Apify](https://apify.com) within LangChain.
## Overview
Apify is a cloud platform for web scraping and data extraction,
which provides an [ecosystem](https://apify.com/store) of more than a thousand
ready-made apps called *Actors* for various scraping, crawling, and extraction use cases.
[![Apify Actors](../_static/ApifyActors.png)](https://apify.com/store)
This integration enables you to run Actors on the Apify platform and load their results into LangChain to feed your vector
indexes with documents and data from the web, e.g. to generate answers from websites with documentation,
blogs, or knowledge bases.
## Installation and Setup
- Install the Apify API client for Python with `pip install apify-client`
- Get your [Apify API token](https://console.apify.com/account/integrations) and either set it as
an environment variable (`APIFY_API_TOKEN`) or pass it to the `ApifyWrapper` as `apify_api_token` in the constructor.
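The two ways of supplying the token described above can be sketched as a small lookup. The helper name `resolve_apify_token` is hypothetical and not part of LangChain or the Apify client; it only illustrates the assumed precedence of an explicit value over the `APIFY_API_TOKEN` environment variable.
```python
import os


def resolve_apify_token(explicit_token=None):
    """Return an explicitly passed token, else the APIFY_API_TOKEN environment variable.

    Hypothetical helper for illustration only; not a LangChain or Apify API.
    """
    token = explicit_token or os.environ.get("APIFY_API_TOKEN")
    if not token:
        raise ValueError(
            "No Apify token found: set APIFY_API_TOKEN or pass apify_api_token."
        )
    return token
```
Calling a check like this at startup surfaces a missing token before any Actor run is attempted.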
## Wrappers
### Utility
You can use the `ApifyWrapper` to run Actors on the Apify platform.
```python
from langchain.utilities import ApifyWrapper
```
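As a minimal sketch of running an Actor, the example below calls `ApifyWrapper.call_actor` and loads the resulting documents. Several details are assumptions: the Actor ID `apify/website-content-crawler`, its `startUrls` input format, and the `text` and `url` fields in its output items. The helper names `item_to_fields` and `crawl_to_documents` are hypothetical. Running the crawl requires `langchain`, `apify-client`, and a valid `APIFY_API_TOKEN`.
```python
def item_to_fields(item):
    """Map one dataset item to Document keyword arguments.

    The "text" and "url" keys are assumptions about the Actor's output format.
    """
    return {
        "page_content": item.get("text") or "",
        "metadata": {"source": item.get("url", "")},
    }


def crawl_to_documents(start_url):
    """Run a crawler Actor on start_url and return its results as Documents."""
    # Imported here so the mapping helper above works without LangChain installed.
    from langchain.docstore.document import Document
    from langchain.utilities import ApifyWrapper

    apify = ApifyWrapper()  # picks up APIFY_API_TOKEN from the environment
    loader = apify.call_actor(
        actor_id="apify/website-content-crawler",  # assumed Actor; swap in your own
        run_input={"startUrls": [{"url": start_url}]},  # assumed input schema
        dataset_mapping_function=lambda item: Document(**item_to_fields(item)),
    )
    return loader.load()
```
Keeping the item-to-document mapping in its own function makes it easy to adjust when a different Actor produces items with different field names.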
For a more detailed walkthrough of this wrapper, see [this notebook](../modules/agents/tools/examples/apify.ipynb).
### Loader
You can also use our `ApifyDatasetLoader` to get data from an Apify dataset.
```python
from langchain.document_loaders import ApifyDatasetLoader
```
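A short sketch of loading an existing dataset follows. It assumes the dataset items carry their content and origin under the keys `text` and `url`, which depends on the Actor that produced the dataset; the helper names `make_field_mapper` and `load_apify_dataset` are hypothetical. Loading requires `langchain` and `apify-client` to be installed.
```python
def make_field_mapper(text_key="text", source_key="url"):
    """Build a function that turns one dataset item into Document keyword arguments.

    The default key names are assumptions; adjust them to your dataset's schema.
    """
    def to_kwargs(item):
        return {
            "page_content": item.get(text_key) or "",
            "metadata": {"source": item.get(source_key, "")},
        }
    return to_kwargs


def load_apify_dataset(dataset_id, text_key="text", source_key="url"):
    """Load every item of an Apify dataset as a LangChain Document."""
    # Imported here so the mapper above can be used without LangChain installed.
    from langchain.docstore.document import Document
    from langchain.document_loaders import ApifyDatasetLoader

    to_kwargs = make_field_mapper(text_key, source_key)
    loader = ApifyDatasetLoader(
        dataset_id=dataset_id,
        dataset_mapping_function=lambda item: Document(**to_kwargs(item)),
    )
    return loader.load()
```
The `dataset_mapping_function` is where the dataset's schema meets LangChain's `Document` type, so that is the only part to change for datasets with other field names.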
For a more detailed walkthrough of this loader, see [this notebook](../modules/indexes/document_loaders/examples/apify_dataset.ipynb).