langchain/docs/integrations/arxiv.md
Leonid Ganeline 1837caa70d
docs: ecosystem/integrations update 1 (#5219)
# docs: ecosystem/integrations update

It is the first in a series of `ecosystem/integrations` updates.

The ecosystem/integrations list is missing many integrations.
I'm adding the missing integrations in a consistent format: 
1. description of the integrated system
2. `Installation and Setup` section with 'pip install ...`, Key setup,
and other necessary settings
3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with
links to correspondent examples and imports of the used classes.

This PR keeps new docs, that are presented in the
`docs/modules/models/text_embedding/examples` but missed in the
`ecosystem/integrations`. The next PRs will cover the next example
sections.

Also updated `integrations.rst`: added the `Dependencies` section with a
link to the packages used in LangChain.

## Who can review?

@hwchase17
@eyurtsev
@dev2049
2023-05-29 07:25:17 -07:00

727 B

Arxiv

arXiv is an open-access archive for 2 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics.

Installation and Setup

First, you need to install arxiv python package.

pip install arxiv

Second, you need to install PyMuPDF python package which transforms PDF files downloaded from the arxiv.org site into the text format.

pip install pymupdf

Document Loader

See a usage example.

from langchain.document_loaders import ArxivLoader