langchain/templates/rag-semi-structured
David Duong d39b4b61b6
Batch apply `poetry lock --no-update` for all templates (#12531)
Ran the following bash script for all templates

```bash
#!/bin/bash

set -e
current_dir="$(pwd)"
for directory in */; do
    if [ -d "$directory" ]; then
        (cd "$directory" && poetry lock --no-update)
    fi
done

cd "$current_dir"
```

Co-authored-by: Bagatur <baskaryan@gmail.com>
11 months ago
docs
rag_semi_structured
tests
LICENSE
README.md
poetry.lock Batch apply `poetry lock --no-update` for all templates (#12531) 11 months ago
pyproject.toml
rag_semi_structured.ipynb notebook fmt (#12498) 11 months ago

README.md

Semi-structured RAG

This template performs RAG on semi-structured data (e.g., a PDF with text and tables).

See this blog post for useful background context.

Data loading

We use partition_pdf from Unstructured to extract both table and text elements.

This requires some system-level packages; for example, on macOS:

brew install tesseract poppler
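
As a rough illustration of the extraction step, the sketch below partitions a PDF and separates table elements from text chunks. It is not taken from the template's chain.py: the helper names `split_elements` and `load_pdf`, the `by_title` chunking strategy, and the 4000-character limit are assumptions chosen for the example.

```python
from collections import namedtuple

# Hypothetical container for the two kinds of content we care about.
Extracted = namedtuple("Extracted", ["tables", "texts"])


def split_elements(elements):
    """Separate table elements from text chunks by element class name."""
    tables, texts = [], []
    for element in elements:
        kind = type(element).__name__
        if kind == "Table":
            tables.append(str(element))
        elif kind == "CompositeElement":
            # Chunked text produced when a chunking strategy is applied.
            texts.append(str(element))
    return Extracted(tables, texts)


def load_pdf(path):
    """Partition a PDF with Unstructured and split the result.

    The import is local so the helper above can be used (and tested)
    without Unstructured installed.
    """
    from unstructured.partition.pdf import partition_pdf

    elements = partition_pdf(
        filename=path,
        infer_table_structure=True,    # keep table structure where detected
        chunking_strategy="by_title",  # assumption: group text under titles
        max_characters=4000,           # assumption: illustrative chunk size
    )
    return split_elements(elements)
```

Calling `load_pdf("example.pdf")` (a placeholder path) would return the table and text strings separately, so each group can be summarized or embedded on its own.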

Chroma

Chroma is an open-source vector database.

This template will create and add documents to the vector database in chain.py.

These documents can be loaded from many sources.

LLM

Be sure that OPENAI_API_KEY is set in order to access the OpenAI models.
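
One way to fail early when the key is missing is a small check at startup. This is a minimal sketch; the `require_env` helper is hypothetical and not part of the template.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the server.")
    return value
```

Calling `require_env("OPENAI_API_KEY")` before building the chain surfaces a missing key immediately rather than on the first model call.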

Adding the template

Create your LangServe app:

langchain serve new my-app
cd my-app

Add template:

langchain serve add rag-semi-structured

Start server:

langchain start

See the Jupyter notebook rag_semi_structured for various ways to connect to the template.
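
For example, a running server can be queried from Python through LangServe's `RemoteRunnable` client. The sketch below assumes the server listens on http://localhost:8000 and that the template is mounted at the /rag-semi-structured route; both depend on how the app was configured, and the helpers `template_url` and `connect` are hypothetical.

```python
def template_url(base_url, path="rag-semi-structured"):
    """Join the server base URL and the template route without doubled slashes."""
    return f"{base_url.rstrip('/')}/{path.strip('/')}"


def connect(base_url="http://localhost:8000"):
    """Return a client for the served template.

    The import is local so the URL helper works without langserve installed.
    """
    from langserve.client import RemoteRunnable

    return RemoteRunnable(template_url(base_url))
```

With the server running, `connect().invoke(...)` sends the input to the template's chain; the expected input shape is defined by the chain itself.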