langchain/docs/integrations/arxiv.md
Leonid Ganeline 1837caa70d
docs: ecosystem/integrations update 1 (#5219)
# docs: ecosystem/integrations update

It is the first in a series of `ecosystem/integrations` updates.

The ecosystem/integrations list is missing many integrations.
I'm adding the missing integrations in a consistent format: 
1. description of the integrated system
2. `Installation and Setup` section with 'pip install ...`, Key setup,
and other necessary settings
3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with
links to correspondent examples and imports of the used classes.

This PR keeps new docs, that are presented in the
`docs/modules/models/text_embedding/examples` but missed in the
`ecosystem/integrations`. The next PRs will cover the next example
sections.

Also updated `integrations.rst`: added the `Dependencies` section with a
link to the packages used in LangChain.

## Who can review?

@hwchase17
@eyurtsev
@dev2049
2023-05-29 07:25:17 -07:00

29 lines
727 B
Markdown

# Arxiv
>[arXiv](https://arxiv.org/) is an open-access archive for 2 million scholarly articles in the fields of physics,
> mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and
> systems science, and economics.
## Installation and Setup
First, you need to install `arxiv` python package.
```bash
pip install arxiv
```
Second, you need to install `PyMuPDF` python package which transforms PDF files downloaded from the `arxiv.org` site into the text format.
```bash
pip install pymupdf
```
## Document Loader
See a [usage example](../modules/indexes/document_loaders/examples/arxiv.ipynb).
```python
from langchain.document_loaders import ArxivLoader
```