langchain/docs/integrations/arxiv.md
Leonid Ganeline b201cfaa0f
docs ecosystem/integrations update 4 (#5590)
2023-06-03 15:29:03 -07:00

arXiv

arXiv is an open-access archive for 2 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics.

Installation and Setup

First, install the arxiv Python package.

pip install arxiv

Second, install the PyMuPDF Python package, which converts the PDF files downloaded from arxiv.org into plain text.

pip install pymupdf
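
To confirm that both installs succeeded, a small check along these lines may help. It is only a sketch: the helper name `missing_pip_packages` and the `REQUIRED_MODULES` mapping are illustrative and not part of LangChain. The one detail worth noting is that PyMuPDF is imported under the module name `fitz`, not `pymupdf`.

```python
import importlib.util

# Maps import name -> pip package name. PyMuPDF is imported as "fitz".
# This mapping is illustrative, not something LangChain defines.
REQUIRED_MODULES = {"arxiv": "arxiv", "fitz": "pymupdf"}


def missing_pip_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_pip_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("arxiv and PyMuPDF are importable.")
```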

Document Loader

See a usage example.

from langchain.document_loaders import ArxivLoader
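
As a minimal sketch of how the loader might be used, the snippet below wraps it in two small helpers. It assumes the mid-2023 LangChain API, where `ArxivLoader` accepts `query` and `load_max_docs` and returns documents whose metadata includes `Title` and `Published`. The helper names `load_arxiv_documents` and `describe` are hypothetical, and later LangChain releases may expose the loader from a different module.

```python
def load_arxiv_documents(query, max_docs=2):
    """Fetch up to max_docs papers matching query (needs network access)."""
    # Imported lazily so that describe() below works without langchain installed.
    from langchain.document_loaders import ArxivLoader

    loader = ArxivLoader(query=query, load_max_docs=max_docs)
    return loader.load()


def describe(docs, preview_chars=200):
    """Build one summary line per document: title, publication date, text preview."""
    lines = []
    for doc in docs:
        title = doc.metadata.get("Title", "<untitled>")
        published = doc.metadata.get("Published", "unknown date")
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        lines.append(f"{title} ({published}): {preview}")
    return lines
```

Calling `describe(load_arxiv_documents("1605.08386"))` would download and parse the matching paper, so it requires network access and both packages from the installation step. The query can be free text or an arXiv identifier; the identifier here is only an example.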

Retriever

See a usage example.

from langchain.retrievers import ArxivRetriever
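
The retriever can be sketched in a similar way. This assumes the mid-2023 LangChain API, in which `ArxivRetriever` takes `load_max_docs` and exposes `get_relevant_documents`. The helpers `fetch_relevant` and `build_context` are hypothetical names used for illustration; the second one shows one possible way to join retrieved documents into a single context string for a prompt.

```python
def fetch_relevant(query, max_docs=2):
    """Retrieve up to max_docs arXiv documents for query (needs network access)."""
    # Imported lazily so that build_context() below works without langchain installed.
    from langchain.retrievers import ArxivRetriever

    retriever = ArxivRetriever(load_max_docs=max_docs)
    return retriever.get_relevant_documents(query=query)


def build_context(docs, max_chars_per_doc=500):
    """Join retrieved documents into one numbered, length-limited context string."""
    parts = []
    for index, doc in enumerate(docs, start=1):
        title = doc.metadata.get("Title", "<untitled>")
        parts.append(f"[{index}] {title}\n{doc.page_content[:max_chars_per_doc]}")
    return "\n\n".join(parts)
```

Truncating each document keeps the combined context bounded; the 500-character default is an arbitrary choice for this sketch.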