langchain/docs/integrations/apify.md
Leonid Ganeline e2d7677526
docs: compound ecosystem and integrations (#4870)

**Problem statement:** The References/Integrations and
Ecosystem/LangChain Ecosystem pages overlap heavily, which confuses
users. A new integration often gets added to only one of these pages,
which creates even more confusion.
- removed the References/Integrations page (all its information will
move into the individual integration pages in the next PR).
- renamed Ecosystem/LangChain Ecosystem to Integrations/Integrations.
I like the term Ecosystem. It is more generic and semantically richer
than the term Integration, but it mentally overloads users. The term
`integration` is more concrete.
UPDATE: after discussion, Ecosystem is the term.
Ecosystem/Integrations is the page (in place of Ecosystem/LangChain
Ecosystem).

As a result, a user gets a single place to start with the individual
integration.
2023-05-18 09:29:57 -07:00


Apify

This page covers how to use Apify within LangChain.

Overview

Apify is a cloud platform for web scraping and data extraction, which provides an ecosystem of more than a thousand ready-made apps called Actors for various scraping, crawling, and extraction use cases.

Apify Actors

This integration enables you to run Actors on the Apify platform and load their results into LangChain, so that you can feed your vector indexes with documents and data from the web, e.g. to generate answers from websites with documentation, blogs, or knowledge bases.

Installation and Setup

  • Install the Apify API client for Python with pip install apify-client
  • Get your Apify API token and either set it as an environment variable (APIFY_API_TOKEN) or pass it to the ApifyWrapper as apify_api_token in the constructor.
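
The two ways of supplying the token described above can be sketched as follows. The helper name resolve_apify_token is hypothetical and only for illustration; the environment variable name and the apify_api_token constructor argument are the ones mentioned above.

```python
import os


def resolve_apify_token(explicit_token=None):
    """Hypothetical helper: prefer an explicitly passed token,
    otherwise fall back to the APIFY_API_TOKEN environment variable."""
    token = explicit_token or os.environ.get("APIFY_API_TOKEN")
    if not token:
        raise ValueError(
            "No Apify API token found: set APIFY_API_TOKEN or pass a token explicitly."
        )
    return token


# Usage (requires langchain and apify-client to be installed):
#   from langchain.utilities import ApifyWrapper
#   apify = ApifyWrapper(apify_api_token=resolve_apify_token())
```

If APIFY_API_TOKEN is already set in the environment, the wrapper can be constructed without arguments and a helper like this is not needed.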

Wrappers

Utility

You can use the ApifyWrapper to run Actors on the Apify platform.

from langchain.utilities import ApifyWrapper

For a more detailed walkthrough of this wrapper, see this notebook.
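
As a rough sketch of what running an Actor through the wrapper might look like: the Actor ID apify/website-content-crawler, its startUrls input, and the "text" and "url" item keys are assumptions chosen for illustration, and the function names item_to_fields and crawl_to_documents are hypothetical. Check the input and output schema of the Actor you actually run.

```python
def item_to_fields(item):
    """Map one dataset item to Document fields.
    Assumes the Actor emits items with "text" and "url" keys."""
    return {
        "page_content": item.get("text") or "",
        "metadata": {"source": item.get("url", "")},
    }


def crawl_to_documents(start_url):
    """Run an Actor and return its results as LangChain documents."""
    # Imported here so item_to_fields can be used without LangChain installed.
    from langchain.docstore.document import Document
    from langchain.utilities import ApifyWrapper

    apify = ApifyWrapper()  # picks up APIFY_API_TOKEN from the environment
    loader = apify.call_actor(
        actor_id="apify/website-content-crawler",  # assumed Actor
        run_input={"startUrls": [{"url": start_url}]},  # assumed input shape
        dataset_mapping_function=lambda item: Document(**item_to_fields(item)),
    )
    return loader.load()
```

The call waits for the Actor run to finish and returns a dataset loader, so load() is what actually fetches the items and converts them to documents.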

Loader

You can also use our ApifyDatasetLoader to get data from an Apify dataset.

from langchain.document_loaders import ApifyDatasetLoader

For a more detailed walkthrough of this loader, see this notebook.
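
A minimal sketch of loading an existing dataset follows. The dataset ID is a placeholder you supply yourself, the "text" and "url" keys are assumptions about how the dataset items are shaped, and map_item and load_dataset_documents are hypothetical names.

```python
def map_item(item, text_key="text", source_key="url"):
    """Pick the page text and source out of one dataset item.
    The default key names are assumptions; adjust them to your dataset."""
    return (item.get(text_key) or "", {"source": item.get(source_key, "")})


def load_dataset_documents(dataset_id, text_key="text", source_key="url"):
    """Load all items of an Apify dataset as LangChain documents."""
    # Imported here so map_item can be used without LangChain installed.
    from langchain.docstore.document import Document
    from langchain.document_loaders import ApifyDatasetLoader

    def to_document(item):
        text, metadata = map_item(item, text_key, source_key)
        return Document(page_content=text, metadata=metadata)

    loader = ApifyDatasetLoader(
        dataset_id=dataset_id,
        dataset_mapping_function=to_document,
    )
    return loader.load()
```

Keeping the key names as parameters makes it easy to reuse the same function for datasets produced by different Actors, for example by passing text_key="answer".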