---
sidebar_position: 0
---

# QA over Documents

## Use case
Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents. LLMs, given their proficiency in understanding text, are a great tool for this.

In this walkthrough we'll build a question-answering application over documents using LLMs. Two closely related use cases, covered elsewhere, are:
- [QA over structured data](/docs/use_cases/tabular) (e.g., SQL)
- [QA over code](/docs/use_cases/code) (e.g., Python)

![intro.png](/img/qa_intro.png)

## Overview
The pipeline for converting raw unstructured data into a QA chain looks like this:
1. `Loading`: First we need to load our data. Unstructured data can be loaded from many sources. Use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders. Each loader returns data as a LangChain [`Document`](https://docs.langchain.com/docs/components/schema/document).
2. `Splitting`: [Text splitters](/docs/modules/data_connection/document_transformers/) break `Documents` into splits of a specified size.
3. `Storage`: Storage (often a [vectorstore](/docs/modules/data_connection/vectorstores/)) houses [and often embeds](https://www.pinecone.io/learn/vector-embeddings/) the splits.
4. `Retrieval`: The app retrieves splits from storage (often those [with embeddings similar](https://www.pinecone.io/learn/k-nearest-neighbor/) to the input question).
5. `Generation`: An [LLM](/docs/modules/model_io/models/llms/) produces an answer using a prompt that includes the question and the retrieved data.
6. `Conversation` (Extension): Hold a multi-turn conversation by adding [Memory](/docs/modules/memory/) to your QA chain.

![flow.jpeg](/img/qa_flow.jpeg)

## Quickstart
To give you a sneak preview, the above pipeline can all be wrapped in a single object: `VectorstoreIndexCreator`. Suppose we want a QA app over this [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/). We can create this in a few lines of code:

First install packages and set environment variables:
```bash
pip install langchain openai chromadb beautifulsoup4 tiktoken
export OPENAI_API_KEY="..."
```

Then run:
```python
from langchain.document_loaders import WebBaseLoader
from langchain.indexes import VectorstoreIndexCreator

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
index = VectorstoreIndexCreator().from_loaders([loader])
```

And now ask your questions:
```python
index.query("What is Task Decomposition?")
```

    ' Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done using LLM with simple prompting, task-specific instructions, or human inputs. Tree of Thoughts (Yao et al. 2023) is an example of a task decomposition technique that explores multiple reasoning possibilities at each step and generates multiple thoughts per step, creating a tree structure.'

Ok, but what's going on under the hood, and how could we customize this for our specific use case? For that, let's take a look at how we can construct this pipeline piece by piece.

## Step 1. Load

Specify a `DocumentLoader` to load in your unstructured data as `Documents`. A `Document` is a piece of text (the `page_content`) and associated metadata.

```python
from langchain.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()
```
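
To make the shape of a `Document` concrete, here is a minimal sketch of the idea in plain Python. `SimpleDocument` and `load_texts` are hypothetical names invented for this illustration; they are not LangChain classes, and the URL is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LangChain `Document` (not the real class)."""

    page_content: str                             # the text itself
    metadata: dict = field(default_factory=dict)  # e.g. source URL, title


def load_texts(sources):
    """Toy loader: turn a {source: text} mapping into a list of documents."""
    return [
        SimpleDocument(page_content=text, metadata={"source": source})
        for source, text in sources.items()
    ]


toy_docs = load_texts(
    {"https://example.com/post": "Task decomposition breaks work into steps."}
)
print(toy_docs[0].metadata["source"])  # https://example.com/post
```

A real loader does the same job at a larger scale: it fetches or parses the source and attaches whatever metadata it can recover.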

### Go deeper
- Browse the > 120 data loader integrations [here](https://integrations.langchain.com/).
- See further documentation on loaders [here](/docs/modules/data_connection/document_loaders/).

## Step 2. Split

Split the `Document` into chunks for embedding and vector storage.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
```
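
To build intuition for `chunk_size` and `chunk_overlap`, the sketch below cuts a string into fixed-size character windows. It is a deliberate simplification under stated assumptions: `split_text` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself first tries to split on separators such as paragraph breaks, line breaks, and spaces, so its chunk boundaries will generally differ.

```python
def split_text(text, chunk_size=500, chunk_overlap=0):
    """Cut `text` into windows of at most `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


toy_chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(toy_chunks)  # ['abcd', 'defg', 'ghij']
```

A non-zero overlap repeats the tail of one chunk at the head of the next, which can help when a relevant sentence straddles a boundary.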

### Go deeper

- `DocumentSplitters` are just one type of the more generic `DocumentTransformers`, which can all be useful in this preprocessing step.
- See further documentation on transformers [here](/docs/modules/data_connection/document_transformers/).
- `Context-aware splitters` keep the location ("context") of each split in the original `Document`:
    - [Markdown files](/docs/use_cases/question_answering/document-context-aware-QA)
    - [Code (py or js)](/docs/modules/data_connection/document_loaders/integrations/source_code)
    - [Documents](/docs/modules/data_connection/document_loaders/integrations/grobid)

## Step 3. Store

To look up our document splits later, we first need to store them.
The most common way to do this is to embed the contents of each document, then store both the embedding and the document in a vector store, with the embedding used to index the document.

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
```
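
The sketch below illustrates the "embed, then store" idea without any external service. Everything in it is an assumption made for illustration: `toy_embed` is a hashed bag-of-words vector (far cruder than a learned embedding such as `OpenAIEmbeddings`), and `ToyVectorStore` is a hypothetical in-memory class, not Chroma.

```python
import math
import zlib


def toy_embed(text, dim=64):
    """Hypothetical embedding: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash()
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class ToyVectorStore:
    """Keeps each text next to its embedding, which acts as the index key."""

    def __init__(self):
        self.texts = []
        self.vectors = []

    def add_texts(self, texts):
        for text in texts:
            self.texts.append(text)
            self.vectors.append(toy_embed(text))


toy_store = ToyVectorStore()
toy_store.add_texts(["task decomposition with prompting", "memory in agents"])
print(len(toy_store.vectors), len(toy_store.vectors[0]))  # 2 64
```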

### Go deeper
- Browse the > 40 vectorstores integrations [here](https://integrations.langchain.com/).
- See further documentation on vectorstores [here](/docs/modules/data_connection/vectorstores/).
- Browse the > 30 text embedding integrations [here](https://integrations.langchain.com/).
- See further documentation on embedding models [here](/docs/modules/data_connection/text_embedding/).

Here are Steps 1-3:

![lc.png](/img/qa_data_load.png)

## Step 4. Retrieve

Retrieve relevant splits for any question using [similarity search](https://www.pinecone.io/learn/what-is-similarity-search/).

```python
question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
len(docs)
```

    4
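
Under the hood, similarity search amounts to ranking stored vectors by their closeness to the query vector. The following is a minimal sketch, assuming cosine similarity as the metric and tiny hand-written vectors in place of real embeddings; `top_k` is a hypothetical helper, and `k=4` is chosen only to mirror the four results shown above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=4):
    """Return the k texts whose vectors are most similar to the query."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# (text, vector) pairs; the vectors are made up for the example
indexed = [
    ("about task decomposition", [1.0, 0.1, 0.0]),
    ("about memory", [0.0, 1.0, 0.2]),
    ("about planning", [0.9, 0.3, 0.0]),
]
nearest = top_k([1.0, 0.0, 0.0], indexed, k=2)
print(nearest)  # ['about task decomposition', 'about planning']
```

Production vector stores typically replace the exhaustive sort with an approximate nearest-neighbor index, but the ranking idea is the same.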

### Go deeper

Vectorstores are commonly used for retrieval, but they are not the only option. For example, SVMs (see thread [here](https://twitter.com/karpathy/status/1647025230546886658?s=20)) can also be used.

LangChain [has many retrievers](/docs/modules/data_connection/retrievers/) including, but not limited to, vectorstores. All retrievers implement a common method `get_relevant_documents()` (and its asynchronous variant `aget_relevant_documents()`).

```python
from langchain.retrievers import SVMRetriever

svm_retriever = SVMRetriever.from_documents(all_splits, OpenAIEmbeddings())
docs_svm = svm_retriever.get_relevant_documents(question)
len(docs_svm)
```

    4

Some common ways to improve on vector similarity search include:
- `MultiQueryRetriever` [generates variants of the input question](/docs/modules/data_connection/retrievers/MultiQueryRetriever) to improve retrieval.
- `Max marginal relevance` selects for [relevance and diversity](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) among the retrieved documents.
- Documents can be filtered during retrieval using [`metadata` filters](/docs/use_cases/question_answering/how_to/document-context-aware-QA).


```python
import logging

from langchain.chat_models import ChatOpenAI
from langchain.retrievers.multi_query import MultiQueryRetriever

logging.basicConfig()
logging.getLogger('langchain.retrievers.multi_query').setLevel(logging.INFO)

retriever_from_llm = MultiQueryRetriever.from_llm(retriever=vectorstore.as_retriever(),
                                                  llm=ChatOpenAI(temperature=0))
unique_docs = retriever_from_llm.get_relevant_documents(query=question)
len(unique_docs)
```

    INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can Task Decomposition be approached?', '2. What are the different methods for Task Decomposition?', '3. What are the various approaches to decomposing tasks?']
    5
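
Max marginal relevance, mentioned in the list above, can also be sketched in a few lines. This is an illustrative implementation of the general technique, not LangChain's code: the `mmr` function, its `lambda_mult` weight, and the toy vectors are assumptions of this example. At each step it picks the candidate with the best trade-off between similarity to the query and dissimilarity to what has already been selected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, candidates, k=2, lambda_mult=0.5):
    """Return indices of up to k candidates, balancing relevance and diversity.

    lambda_mult=1.0 ranks by relevance only; lower values penalise
    candidates that resemble ones already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query_vec = [1.0, 0.2]
candidates = [[1.0, 0.1], [1.0, 0.0], [0.3, 1.0]]  # the first two are near-duplicates
print(mmr(query_vec, candidates, k=2, lambda_mult=0.5))  # [0, 2]
print(mmr(query_vec, candidates, k=2, lambda_mult=1.0))  # [0, 1]
```

With `lambda_mult=0.5` the near-duplicate second candidate is skipped in favor of the less similar but more novel third one.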

## Step 5. Generate

Distill the retrieved documents into an answer using an LLM/Chat model (e.g., `gpt-3.5-turbo`) with a `RetrievalQA` chain.

```python
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
qa_chain({"query": question})
```

    {
        'query': 'What are the approaches to Task Decomposition?',
        'result': 'The approaches to task decomposition include:\n\n1. Simple prompting: This approach involves using simple prompts or questions to guide the agent in breaking down a task into smaller subgoals. For example, the agent can be prompted with "Steps for XYZ" and asked to list the subgoals for achieving XYZ.\n\n2. Task-specific instructions: In this approach, task-specific instructions are provided to the agent to guide the decomposition process. For example, if the task is to write a novel, the agent can be instructed to "Write a story outline" as a subgoal.\n\n3. Human inputs: This approach involves incorporating human inputs in the task decomposition process. Humans can provide guidance, feedback, and suggestions to help the agent break down complex tasks into manageable subgoals.\n\nThese approaches aim to enable efficient handling of complex tasks by breaking them down into smaller, more manageable parts.'
    }

Note that you can pass either an `LLM` or a `ChatModel` (as we did here) to the `RetrievalQA` chain.

### Go deeper

#### Choosing LLMs
- Browse the > 55 LLM and chat model integrations [here](https://integrations.langchain.com/).
- See further documentation on LLMs and chat models [here](/docs/modules/model_io/models/).
- Use local LLMs: The popularity of [PrivateGPT](https://github.com/imartinez/privateGPT) and [GPT4All](https://github.com/nomic-ai/gpt4all) underscores the importance of running LLMs locally.
Using `GPT4All` is as simple as [downloading the binary](/docs/integrations/llms/gpt4all) and then:

    from langchain.llms import GPT4All
    from langchain.chains import RetrievalQA

    llm = GPT4All(model="/Users/rlm/Desktop/Code/gpt4all/models/nous-hermes-13b.ggmlv3.q4_0.bin", max_tokens=2048)
    qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())

#### Customizing the prompt

The prompt used by the `RetrievalQA` chain is easy to customize.

```python
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end. 
If you don't know the answer, just say that you don't know, don't try to make up an answer. 
Use three sentences maximum and keep the answer as concise as possible. 
Always say "thanks for asking!" at the end of the answer. 
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate.from_template(template)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
)
result = qa_chain({"query": question})
result["result"]
```

    'The approaches to Task Decomposition are (1) using simple prompting by LLM, (2) using task-specific instructions, and (3) with human inputs. Thanks for asking!'
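
To see what the `{context}` and `{question}` placeholders do, the sketch below fills a shortened version of the template with plain `str.format`. This is only an approximation of what happens inside the chain, assuming the template uses simple f-string-style placeholders; the two retrieved snippets are made up for the example.

```python
short_template = """Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
Helpful Answer:"""

# Stand-ins for the text of the retrieved splits
retrieved_texts = [
    "Task decomposition can be done by simple prompting.",
    "It can also rely on human inputs.",
]

filled_prompt = short_template.format(
    context="\n\n".join(retrieved_texts),
    question="What are the approaches to Task Decomposition?",
)
print(filled_prompt)
```

The printed string is what a model would receive: the instructions, the joined context, and the question, ending with the `Helpful Answer:` cue.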


#### Return source documents

The full set of retrieved documents used for answer distillation can be returned using `return_source_documents=True`.

```python
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever(),
                                       return_source_documents=True)
result = qa_chain({"query": question})
print(len(result['source_documents']))
result['source_documents'][0]
```

    4
    Document(page_content='Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': "LLM Powered Autonomous Agents | Lil'Log", 'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:', 'language': 'en'})


#### Return citations

Answer citations can be returned using `RetrievalQAWithSourcesChain`.


```python
from langchain.chains import RetrievalQAWithSourcesChain

qa_chain = RetrievalQAWithSourcesChain.from_chain_type(llm, retriever=vectorstore.as_retriever())

result = qa_chain({"question": question})
result
```

    {
        'question': 'What are the approaches to Task Decomposition?',
        'answer': 'The approaches to Task Decomposition include (1) using LLM with simple prompting, (2) using task-specific instructions, and (3) incorporating human inputs.\n',
        'sources': 'https://lilianweng.github.io/posts/2023-06-23-agent/'
    }

#### Customizing retrieved document processing

Retrieved documents can be fed to an LLM for answer distillation in a few different ways.

`stuff`, `refine`, `map-reduce`, and `map-rerank` chains for passing documents to an LLM prompt are well summarized [here](/docs/modules/chains/document/).
 
`stuff` is commonly used because it simply "stuffs" all retrieved documents into the prompt.
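
The difference between chain types is easiest to see with a stand-in for the model. In the sketch below, `fake_llm` is a hypothetical function that only records the prompts it receives, and the prompt wording is invented for illustration; LangChain's chains use their own prompts. "Stuff" makes one call with all documents, whereas a map-reduce style makes one call per document plus a final combining call.

```python
def stuff_prompt(texts, question):
    """'stuff': place every text into a single prompt."""
    context = "\n\n".join(texts)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def map_reduce(texts, question, llm):
    """'map-reduce': answer per text, then combine the partial answers."""
    partial_answers = [llm(stuff_prompt([text], question)) for text in texts]
    return llm(stuff_prompt(partial_answers, question))


calls = []  # every prompt sent to the fake model


def fake_llm(prompt):
    """Hypothetical model stand-in: records the prompt, returns a label."""
    calls.append(prompt)
    return f"summary {len(calls)}"


sample_chunks = ["Chunk one.", "Chunk two."]
single_prompt = stuff_prompt(sample_chunks, "What do the chunks say?")
final_answer = map_reduce(sample_chunks, "What do the chunks say?", fake_llm)
print(len(calls), final_answer)  # 3 summary 3
```

"Stuff" is cheapest when everything fits in the context window; approaches that make several calls trade extra model calls for the ability to handle more text.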

[`load_qa_chain`](/docs/use_cases/question_answering/how_to/question_answering.html) is an easy way to pass documents to an LLM using these approaches (see the `chain_type` argument).


```python
from langchain.chains.question_answering import load_qa_chain

chain = load_qa_chain(llm, chain_type="stuff")
chain({"input_documents": unique_docs, "question": question}, return_only_outputs=True)
```

    {'output_text': 'The approaches to task decomposition include (1) using simple prompting to break down tasks into subgoals, (2) providing task-specific instructions to guide the decomposition process, and (3) incorporating human inputs for task decomposition.'}

We can also pass the `chain_type` to `RetrievalQA`.


```python
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever(),
                                       chain_type="stuff")
result = qa_chain({"query": question})
```

In summary, the user can choose the desired level of abstraction for QA:

![summary_chains.png](/img/summary_chains.png)

## Step 6. Converse (Extension)

To hold a conversation, a chain needs to be able to refer to past interactions. Chain `Memory` allows us to do this. To keep chat history, we can specify a Memory buffer to track the conversation inputs / outputs.

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
```

The `ConversationalRetrievalChain` uses the chat history stored in the `Memory` buffer.

```python
from langchain.chains import ConversationalRetrievalChain

retriever = vectorstore.as_retriever()
chat = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory)
```
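
As a rough mental model of what a buffer memory contributes on each turn, consider the sketch below. `ToyBufferMemory` and `condense_prompt` are hypothetical names, and the prompt wording is invented; the point is only that prior turns are kept in order and placed in front of a follow-up so that a model can rewrite it as a standalone question before retrieval.

```python
class ToyBufferMemory:
    """Hypothetical minimal buffer: an ordered list of (role, text) turns."""

    def __init__(self):
        self.turns = []

    def save(self, question, answer):
        self.turns.append(("Human", question))
        self.turns.append(("AI", answer))

    def as_text(self):
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


def condense_prompt(memory, follow_up):
    """Build a prompt asking a model to make the follow-up self-contained."""
    return (
        "Given the conversation, rephrase the follow-up as a standalone question.\n\n"
        f"{memory.as_text()}\n\n"
        f"Follow-up: {follow_up}\n"
        "Standalone question:"
    )


toy_memory = ToyBufferMemory()
toy_memory.save(
    "What are some of the main ideas in self-reflection?",
    "Iterative improvement and learning from mistakes.",
)
condensed = condense_prompt(toy_memory, "How does the Reflexion paper handle it?")
print(condensed)
```

Because the earlier turn mentions self-reflection, a model reading this prompt has what it needs to resolve a pronoun in the follow-up.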

```python
result = chat({"question": "What are some of the main ideas in self-reflection?"})
result['answer']
```

    "Some of the main ideas in self-reflection include:\n1. Iterative improvement: Self-reflection allows autonomous agents to improve by refining past action decisions and correcting mistakes.\n2. Trial and error: Self-reflection is crucial in real-world tasks where trial and error are inevitable.\n3. Two-shot examples: Self-reflection is created by showing pairs of failed trajectories and ideal reflections for guiding future changes in the plan.\n4. Working memory: Reflections are added to the agent's working memory, up to three, to be used as context for querying.\n5. Performance evaluation: Self-reflection involves continuously reviewing and analyzing actions, self-criticizing behavior, and reflecting on past decisions and strategies to refine approaches.\n6. Efficiency: Self-reflection encourages being smart and efficient, aiming to complete tasks in the least number of steps."

The Memory buffer has the context needed to resolve `"it"` ("self-reflection") in the question below.

```python
result = chat({"question": "How does the Reflexion paper handle it?"})
result['answer']
```

    "The Reflexion paper handles self-reflection by showing two-shot examples to the Learning Language Model (LLM). Each example consists of a failed trajectory and an ideal reflection that guides future changes in the agent's plan. These reflections are then added to the agent's working memory, up to a maximum of three, to be used as context for querying the LLM. This allows the agent to iteratively improve its reasoning skills by refining past action decisions and correcting previous mistakes."

### Go deeper

The [documentation](/docs/use_cases/question_answering/how_to/chat_vector_db) on `ConversationalRetrievalChain` offers a few extensions, such as streaming and source documents.


## Further reading
- Check out the [How to](/docs/use_cases/question_answering/how_to/) section for all the variations of chains that can be used for QA over docs in different settings.
- Check out the [Integrations-specific](/docs/use_cases/question_answering/integrations/) section for chains that use specific integrations.
-												use top nav docs (#8090)


											
										
										
											2023-07-21 20:52:03 +00:00
+								---
 								sidebar_position: 0
 								---
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								# QA over Documents
-												big docs refactor (#1978)

Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
											
										
										
											2023-03-27 02:49:46 +00:00
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								## Use case
 								Suppose you have some text documents (PDF, blog, Notion pages, etc.) and want to ask questions related to the contents of those documents. LLMs, given their proficiency in understanding text, are a great tool for this.
-												bump version to 0.0.95 (#1324)


											
										
										
											2023-02-27 15:45:54 +00:00
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								In this walkthrough we'll go over how to build a question-answering over documents application using LLMs. Two very related use cases which we cover elsewhere are:
 								- [QA over structured data](/docs/use_cases/tabular) (e.g., SQL)
 								- [QA over code](/docs/use_cases/code) (e.g., Python)
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
 								![intro.png](/img/qa_intro.png)
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								## Overview
 								The pipeline for converting raw unstructured data into a QA chain looks like this:
 . `Loading`: First we need to load our data. Unstructured data can be loaded from many sources. Use the [LangChain integration hub](https://integrations.langchain.com/) to browse the full set of loaders.
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
+								Each loader returns data as a LangChain [`Document`](https://docs.langchain.com/docs/components/schema/document).
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+. `Splitting`: [Text splitters](/docs/modules/data_connection/document_transformers/) break `Documents` into splits of specified size
 . `Storage`: Storage (e.g., often a [vectorstore](/docs/modules/data_connection/vectorstores/)) will house [and often embed](https://www.pinecone.io/learn/vector-embeddings/) the splits
 . `Retrieval`: The app retrieves splits from storage (e.g., often [with similar embeddings](https://www.pinecone.io/learn/k-nearest-neighbor/) to the input question)
 . `Generation`: An [LLM](/docs/modules/model_io/models/llms/) produces an answer using a prompt that includes the question and the retrieved data
 . `Conversation` (Extension): Hold a multi-turn conversation by adding [Memory](/docs/modules/memory/) to your QA chain.
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
 								![flow.jpeg](/img/qa_flow.jpeg)
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								## Quickstart
 								To give you a sneak preview, the above pipeline can be all be wrapped in a single object: `VectorstoreIndexCreator`. Suppose we want a QA app over this [blog post](https://lilianweng.github.io/posts/2023-06-23-agent/). We can create this in a few lines of code:
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								First set environment variables and install packages:
 								```bash
 								pip install openai chromadb
-												Add env setup (#7550)

Include setup
											
										
										
											2023-07-11 16:48:40 +00:00
+								export OPENAI_API_KEY="..."
 								```
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								Then run:
-												bump version to 0.0.95 (#1324)


											
										
										
											2023-02-27 15:45:54 +00:00
+								```python
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
+								from langchain.document_loaders import WebBaseLoader
-												bump version to 0.0.95 (#1324)


											
										
										
											2023-02-27 15:45:54 +00:00
+								from langchain.indexes import VectorstoreIndexCreator
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
+								loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
-												bump version to 0.0.95 (#1324)


											
										
										
											2023-02-27 15:45:54 +00:00
+								index = VectorstoreIndexCreator().from_loaders([loader])
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
+								```
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								And now ask your questions:
 								```python
 								index.query("What is Task Decomposition?")
 								```
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
 								    ' Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done using LLM with simple prompting, task-specific instructions, or human inputs. Tree of Thoughts (Yao et al. 2023) is an example of a task decomposition technique that explores multiple reasoning possibilities at each step and generates multiple thoughts per step, creating a tree structure.'
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								Ok, but what's going on under the hood, and how could we customize this for our specific use case? For that, let's take a look at how we can construct this pipeline piece by piece.
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								## Step 1. Load
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								Specify a `DocumentLoader` to load in your unstructured data as `Documents`. A `Document` is a piece of text (the `page_content`) and associated metadata.
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
 								```python
 								from langchain.document_loaders import WebBaseLoader
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
-												Update landing page for "question answering over documents" (#7152)

Improve documentation for a central use-case, qa / chat over documents.

This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).

Testing w/ local Docusaurus server:

```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
											
										
										
											2023-07-10 21:15:13 +00:00
+								loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
 								data = loader.load()
 								```
-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
### Go deeper
- Browse the > 120 data loader integrations [here](https://integrations.langchain.com/).
- See further documentation on loaders [here](/docs/modules/data_connection/document_loaders/).

## Step 2. Split
Split the `Document` into chunks for embedding and vector storage.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
```

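To show what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified sketch that cuts text into fixed-size character windows. The function `split_text` is hypothetical and is not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers natural boundaries such as paragraph breaks); it only illustrates the two parameters.

```python
from typing import List


def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> List[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

A non-zero overlap repeats a little text at each boundary, so a sentence that straddles two chunks is less likely to be lost to retrieval.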
### Go deeper
- `DocumentSplitters` are just one type of the more generic `DocumentTransformers`, which can all be useful in this preprocessing step.
- See further documentation on transformers [here](/docs/modules/data_connection/document_transformers/).
- `Context-aware splitters` keep the location ("context") of each split in the original `Document`:
    - [Markdown files](/docs/use_cases/question_answering/document-context-aware-QA)
    - [Code (py or js)](/docs/modules/data_connection/document_loaders/integrations/source_code)
    - [Documents](/docs/modules/data_connection/document_loaders/integrations/grobid)

## Step 3. Store
To look up our document splits later, we first need to store them somewhere searchable.
The most common way to do this is to embed the contents of each split, then store the embedding together with the split in a vector store, where the embedding serves as the index for the split.

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
```

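As a rough illustration of "embed, then store the embedding next to the text", the sketch below builds a tiny in-memory store. Everything here is an assumption made for illustration: `toy_embed` uses normalized word counts instead of a learned embedding model, and `ToyVectorStore` is a hypothetical class, not Chroma's API.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Dict[str, float]:
    """Toy embedding: L2-normalized word counts (a real model returns dense vectors)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


class ToyVectorStore:
    """Hypothetical in-memory store that keeps each split next to its embedding."""

    def __init__(self) -> None:
        self.rows: List[Tuple[Dict[str, float], str]] = []

    def add_texts(self, texts: List[str]) -> None:
        # Embed once at write time so lookups only need to embed the query.
        for text in texts:
            self.rows.append((toy_embed(text), text))


store = ToyVectorStore()
store.add_texts(
    [
        "task decomposition with chain of thought",
        "long-term memory uses a vector store",
    ]
)
print(len(store.rows))
```

Embedding at write time is the design choice that matters: the expensive step happens once per split, and each later query costs one embedding plus a comparison against the stored vectors.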
### Go deeper
- Browse the > 40 vectorstore integrations [here](https://integrations.langchain.com/).
- See further documentation on vectorstores [here](/docs/modules/data_connection/vectorstores/).
- Browse the > 30 text embedding integrations [here](https://integrations.langchain.com/).
- See further documentation on embedding models [here](/docs/modules/data_connection/text_embedding/).

Here are Steps 1-3:

![lc.png](/img/qa_data_load.png)

## Step 4. Retrieve
Retrieve relevant splits for any question using [similarity search](https://www.pinecone.io/learn/what-is-similarity-search/).

```python
question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
len(docs)
```

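Conceptually, similarity search scores every stored embedding against the query embedding and returns the best matches. The sketch below shows that idea with cosine similarity over small hand-written vectors. The names `cosine_similarity`, `top_k`, and the example vectors are assumptions for illustration; real vector stores typically use approximate indexes rather than this exhaustive scan.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    indexed: List[Tuple[Sequence[float], str]],
    k: int = 4,
) -> List[Tuple[float, str]]:
    """Score every (vector, text) pair against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), text) for vec, text in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


indexed_splits = [
    ([1.0, 0.0], "split about task decomposition"),
    ([0.0, 1.0], "split about memory"),
    ([0.7, 0.7], "split about both topics"),
]
hits = top_k([1.0, 0.1], indexed_splits, k=2)
for score, text in hits:
    print(round(score, 3), text)
```

Here the query vector points mostly along the first axis, so the split about task decomposition ranks first and the mixed split second.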
### Go deeper

Vectorstores are commonly used for retrieval, but they are not the only option. For example, SVMs (see thread [here](https://twitter.com/karpathy/status/1647025230546886658?s=20)) can also be used.

LangChain [has many retrievers](/docs/modules/data_connection/retrievers/) including, but not limited to, vectorstores. All retrievers implement a common method `get_relevant_documents()` (and its asynchronous variant `aget_relevant_documents()`).

```python
from langchain.retrievers import SVMRetriever
svm_retriever = SVMRetriever.from_documents(all_splits, OpenAIEmbeddings())
docs_svm = svm_retriever.get_relevant_documents(question)
len(docs_svm)
```

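The value of a common `get_relevant_documents()` method is that downstream code does not need to know which retriever it was given. The sketch below illustrates that with a hypothetical keyword-overlap retriever. `SketchRetriever`, `KeywordRetriever`, and `collect_context` are made-up names, and plain strings stand in for `Document` objects to keep the example short.

```python
from typing import List, Protocol


class SketchRetriever(Protocol):
    """Anything with this method can be used as a retriever in this sketch."""

    def get_relevant_documents(self, query: str) -> List[str]:
        ...


class KeywordRetriever:
    """Hypothetical retriever that ranks texts by words shared with the query."""

    def __init__(self, texts: List[str], k: int = 2) -> None:
        self.texts = texts
        self.k = k

    def get_relevant_documents(self, query: str) -> List[str]:
        query_words = set(query.lower().split())
        ranked = sorted(
            self.texts,
            key=lambda text: len(query_words & set(text.lower().split())),
            reverse=True,
        )
        return ranked[: self.k]


def collect_context(retriever: SketchRetriever, query: str) -> str:
    """Works with any retriever, because it relies only on the shared method."""
    return "\n".join(retriever.get_relevant_documents(query))


keyword_retriever = KeywordRetriever(
    ["task decomposition splits a task into steps", "memory stores past observations"],
    k=1,
)
print(collect_context(keyword_retriever, "how does task decomposition work"))
```

Swapping in a different retriever only requires an object with the same method; `collect_context` stays unchanged.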
Some common ways to improve on vector similarity search include:
- `MultiQueryRetriever` [generates variants of the input question](/docs/modules/data_connection/retrievers/MultiQueryRetriever) to improve retrieval.
- `Max marginal relevance` selects for [relevance and diversity](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) among the retrieved documents.
- Documents can be filtered during retrieval using [`metadata` filters](/docs/use_cases/question_answering/how_to/document-context-aware-QA).

```python
import logging

from langchain.chat_models import ChatOpenAI
from langchain.retrievers.multi_query import MultiQueryRetriever

logging.basicConfig()
logging.getLogger('langchain.retrievers.multi_query').setLevel(logging.INFO)

retriever_from_llm = MultiQueryRetriever.from_llm(retriever=vectorstore.as_retriever(),
                                                  llm=ChatOpenAI(temperature=0))
unique_docs = retriever_from_llm.get_relevant_documents(query=question)
len(unique_docs)
```

    INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can Task Decomposition be approached?', '2. What are the different methods for Task Decomposition?', '3. What are the various approaches to decomposing tasks?']
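
The max marginal relevance idea from the list above can also be sketched in a few lines. The code below is an illustrative greedy selection under stated assumptions: `mmr_select`, the `lambda_mult` weighting, and the example vectors are chosen for this sketch and are not taken from LangChain's implementation. Each round picks the candidate that is most relevant to the query while being least similar to what was already selected.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(
    query_vec: Sequence[float],
    doc_vecs: List[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedily pick k indices, trading relevance against redundancy.

    lambda_mult=1.0 means pure relevance; lower values favor diversity.
    """
    selected: List[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.2]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.8]]  # the first two are near-duplicates
print(mmr_select(query, vectors, k=2, lambda_mult=0.5))
print(mmr_select(query, vectors, k=2, lambda_mult=1.0))
```

With `lambda_mult=0.5` the second pick skips the near-duplicate in favor of the more distinct vector, while `lambda_mult=1.0` reduces to plain relevance ranking.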
 
## Step 5. Generate
Distill the retrieved documents into an answer using an LLM/Chat model (e.g., `gpt-3.5-turbo`) with the `RetrievalQA` chain.

```python
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
qa_chain({"query": question})
```

-												mv popular and additional chains to use cases (#8242)


											
										
										
											2023-07-27 19:55:13 +00:00
+								    {
 								        'query': 'What are the approaches to Task Decomposition?',
 								        'result': 'The approaches to task decomposition include:\n\n1. Simple prompting: This approach involves using simple prompts or questions to guide the agent in breaking down a task into smaller subgoals. For example, the agent can be prompted with "Steps for XYZ" and asked to list the subgoals for achieving XYZ.\n\n2. Task-specific instructions: In this approach, task-specific instructions are provided to the agent to guide the decomposition process. For example, if the task is to write a novel, the agent can be instructed to "Write a story outline" as a subgoal.\n\n3. Human inputs: This approach involves incorporating human inputs in the task decomposition process. Humans can provide guidance, feedback, and suggestions to help the agent break down complex tasks into manageable subgoals.\n\nThese approaches aim to enable efficient handling of complex tasks by breaking them down into smaller, more manageable parts.'
 								    }
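
Conceptually, the chain above retrieves documents relevant to the query, places them in a prompt, and asks the model for an answer. The sketch below mimics that flow in plain Python; it is not LangChain's implementation. `toy_retriever`, `fake_llm`, and `run_retrieval_qa` are hypothetical stand-ins, and the word-overlap scoring is an assumption used only to keep the example self-contained.

```python
from typing import Callable, Dict, List


def toy_retriever(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank docs by shared words with the query (a stand-in for embedding similarity)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def fake_llm(prompt: str) -> str:
    """Hypothetical model: reports the prompt length instead of generating text."""
    return f"(answer based on a prompt of {len(prompt)} characters)"


def run_retrieval_qa(query: str, docs: List[str], llm: Callable[[str], str]) -> Dict[str, str]:
    # Retrieve, build a prompt from the retrieved text, then call the model.
    context = "\n\n".join(toy_retriever(query, docs))
    prompt = f"{context}\n\nQuestion: {query}\nHelpful Answer:"
    # Mirror the {'query': ..., 'result': ...} shape of the output shown above.
    return {"query": query, "result": llm(prompt)}


docs = [
    "Task decomposition can be done by simple prompting.",
    "Memory lets an agent recall past interactions.",
    "Task decomposition can also use human inputs.",
]
query = "What are the approaches to task decomposition?"
out = run_retrieval_qa(query, docs, fake_llm)
print(out["result"])
```

Swapping `fake_llm` for a real model call and `toy_retriever` for a vectorstore lookup gives roughly the shape of what the chain automates.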

Note that you can pass either an `LLM` or a `ChatModel` (as we did here) to the `RetrievalQA` chain.

### Go deeper

#### Choosing LLMs
- Browse the more than 55 LLM and chat model integrations [here](https://integrations.langchain.com/).
- See further documentation on LLMs and chat models [here](/docs/modules/model_io/models/).
- Use local LLMs: the popularity of [PrivateGPT](https://github.com/imartinez/privateGPT) and [GPT4All](https://github.com/nomic-ai/gpt4all) underscores the importance of running LLMs locally.
Using `GPT4All` is as simple as [downloading the binary](/docs/integrations/llms/gpt4all) and then:

    from langchain.llms import GPT4All
    from langchain.chains import RetrievalQA

    llm = GPT4All(model="/Users/rlm/Desktop/Code/gpt4all/models/nous-hermes-13b.ggmlv3.q4_0.bin", max_tokens=2048)
    qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())

#### Customizing the prompt
The prompt in the `RetrievalQA` chain can be easily customized.
```python
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate.from_template(template)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
)
result = qa_chain({"query": question})
result["result"]
```

    'The approaches to Task Decomposition are (1) using simple prompting by LLM, (2) using task-specific instructions, and (3) with human inputs. Thanks for asking!'
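
To see what the model receives, it can help to fill the template by hand. The sketch below uses Python's built-in `str.format` rather than `PromptTemplate`, with a shortened template and a made-up `context` string; it only illustrates how the `{context}` and `{question}` placeholders are substituted.

```python
# Shortened copy of the template above; the placeholders are the same.
template = """Use the following pieces of context to answer the question at the end.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""

# Hypothetical retrieved text, standing in for what a retriever would return.
context = "Task decomposition can be done by simple prompting or with human inputs."
question = "What are the approaches to Task Decomposition?"

# Substitute both placeholders to produce the final prompt string.
filled = template.format(context=context, question=question)
print(filled)
```

Printing the filled prompt like this is a quick way to check that instructions, context, and question appear in the intended order.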

#### Return source documents
The full set of retrieved documents used for answer distillation can be returned using `return_source_documents=True`.
```python
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever(),
                                       return_source_documents=True)
result = qa_chain({"query": question})
print(len(result['source_documents']))
result['source_documents'][0]
```

    Document(page_content='Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.', metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': "LLM Powered Autonomous Agents | Lil'Log", 'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:', 'language': 'en'})
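
Once the source documents are returned, a common next step is to list where each one came from. The sketch below assumes a `result` dictionary with a `source_documents` list like the one above; the `Doc` dataclass is a hypothetical stand-in for LangChain's `Document`, and `unique_sources` is an illustrative helper, not a library function.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def unique_sources(docs: List[Doc]) -> List[str]:
    """Collect each document's 'source' metadata once, in first-seen order."""
    seen: List[str] = []
    for doc in docs:
        source = doc.metadata.get("source", "unknown")
        if source not in seen:
            seen.append(source)
    return seen


url = "https://lilianweng.github.io/posts/2023-06-23-agent/"
result = {
    "result": "(answer text)",
    "source_documents": [
        Doc("first chunk of the post", {"source": url}),
        Doc("second chunk of the post", {"source": url}),
    ],
}
sources = unique_sources(result["source_documents"])
print(sources)
```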

#### Return citations
Answer citations can be returned using `RetrievalQAWithSourcesChain`.
```python
from langchain.chains import RetrievalQAWithSourcesChain

qa_chain = RetrievalQAWithSourcesChain.from_chain_type(llm, retriever=vectorstore.as_retriever())

result = qa_chain({"question": question})
result
```

    {
        'question': 'What are the approaches to Task Decomposition?',
        'answer': 'The approaches to Task Decomposition include (1) using LLM with simple prompting, (2) using task-specific instructions, and (3) incorporating human inputs.\n',
        'sources': 'https://lilianweng.github.io/posts/2023-06-23-agent/'
    }
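
The `sources` value in this output is a single string. If an answer cited several documents, downstream code would probably want a list instead. The helper below is a hypothetical sketch that assumes multiple sources are separated by commas; that separator is an assumption and should be checked against real output before relying on it.

```python
from typing import Dict, List


def split_sources(result: Dict[str, str]) -> List[str]:
    """Split the 'sources' string on commas (an assumed separator) and drop blanks."""
    raw = result.get("sources", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


result = {
    "question": "What are the approaches to Task Decomposition?",
    "answer": "(answer text)",
    "sources": "https://lilianweng.github.io/posts/2023-06-23-agent/",
}
source_list = split_sources(result)
print(source_list)
```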

#### Customizing retrieved document processing
Retrieved documents can be fed to an LLM for answer distillation in a few different ways.

The `stuff`, `refine`, `map_reduce`, and `map_rerank` chains for passing documents to an LLM prompt are summarized [here](/docs/modules/chains/document/).

`stuff` is commonly used because it simply "stuffs" all retrieved documents into the prompt.
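
The difference between these strategies is easiest to see in code. The following is a plain-Python sketch, not LangChain's implementation: `fake_llm` is a hypothetical model that only records its prompts, and `stuff` and `map_reduce` are illustrative helpers. The `stuff` approach makes one call with every document in the prompt, while a map-reduce approach makes one call per document plus a final call to combine the partial answers.

```python
from typing import Callable, List

calls: List[str] = []  # records every prompt sent to the fake model


def fake_llm(prompt: str) -> str:
    """Hypothetical model: logs the prompt and returns a placeholder answer."""
    calls.append(prompt)
    return f"summary of {len(prompt)} chars"


def stuff(docs: List[str], question: str, llm: Callable[[str], str]) -> str:
    # One call: all documents are concatenated into a single prompt.
    context = "\n\n".join(docs)
    return llm(f"{context}\n\nQuestion: {question}")


def map_reduce(docs: List[str], question: str, llm: Callable[[str], str]) -> str:
    # Map: ask the question of each document separately.
    partial = [llm(f"{doc}\n\nQuestion: {question}") for doc in docs]
    # Reduce: combine the partial answers with one more call.
    return llm("\n".join(partial) + f"\n\nQuestion: {question}")


docs = ["chunk one", "chunk two", "chunk three"]
stuff(docs, "What is this about?", fake_llm)
stuff_calls = len(calls)
map_reduce(docs, "What is this about?", fake_llm)
map_reduce_calls = len(calls) - stuff_calls
print(stuff_calls, map_reduce_calls)  # 1 call vs. 4 calls (3 map + 1 reduce)
```

The single-call approach is the simplest when all retrieved text fits in the model's context window; per-document strategies trade extra calls for the ability to handle more text.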

[`load_qa_chain`](/docs/use_cases/question_answering/how_to/question_answering.html) is an easy way to pass documents to an LLM using these various approaches (see the `chain_type` argument).

```python
from langchain.chains.question_answering import load_qa_chain

chain = load_qa_chain(llm, chain_type="stuff")
chain({"input_documents": unique_docs, "question": question}, return_only_outputs=True)
```

    {'output_text': 'The approaches to task decomposition include (1) using simple prompting to break down tasks into subgoals, (2) providing task-specific instructions to guide the decomposition process, and (3) incorporating human inputs for task decomposition.'}

We can also pass the `chain_type` to `RetrievalQA`.
```python
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever(),
                                       chain_type="stuff")
result = qa_chain({"query": question})
```

In summary, the user can choose the desired level of abstraction for QA:

![summary_chains.png](/img/summary_chains.png)

## Step 6. Converse (Extension)
To hold a conversation, a chain needs to be able to refer to past interactions. Chain `Memory` allows us to do this. To keep chat history, we can specify a memory buffer that tracks the conversation inputs and outputs.
```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
```
The `ConversationalRetrievalChain` uses the chat history stored in the memory buffer.
```python
from langchain.chains import ConversationalRetrievalChain

retriever = vectorstore.as_retriever()
chat = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory)
```

```python
result = chat({"question": "What are some of the main ideas in self-reflection?"})
result['answer']
```

    "Some of the main ideas in self-reflection include:\n1. Iterative improvement: Self-reflection allows autonomous agents to improve by refining past action decisions and correcting mistakes.\n2. Trial and error: Self-reflection is crucial in real-world tasks where trial and error are inevitable.\n3. Two-shot examples: Self-reflection is created by showing pairs of failed trajectories and ideal reflections for guiding future changes in the plan.\n4. Working memory: Reflections are added to the agent's working memory, up to three, to be used as context for querying.\n5. Performance evaluation: Self-reflection involves continuously reviewing and analyzing actions, self-criticizing behavior, and reflecting on past decisions and strategies to refine approaches.\n6. Efficiency: Self-reflection encourages being smart and efficient, aiming to complete tasks in the least number of steps."
											2023-07-27 19:55:13 +00:00
+								The Memory buffer has context to resolve `"it"` ("self-reflection") in the below question.
```python
result = chat({"question": "How does the Reflexion paper handle it?"})
result['answer']
```

    "The Reflexion paper handles self-reflection by showing two-shot examples to the Learning Language Model (LLM). Each example consists of a failed trajectory and an ideal reflection that guides future changes in the agent's plan. These reflections are then added to the agent's working memory, up to a maximum of three, to be used as context for querying the LLM. This allows the agent to iteratively improve its reasoning skills by refining past action decisions and correcting previous mistakes."
### Go deeper
The [documentation](/docs/use_cases/question_answering/how_to/chat_vector_db) on `ConversationalRetrievalChain` offers a few extensions, such as streaming and source documents.
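For intuition about how a memory buffer lets the chain resolve a pronoun such as "it" in a follow-up question, the sketch below builds the kind of prompt that asks an LLM to rewrite the follow-up as a standalone question. This is a toy illustration using only the standard library: `ToyMemoryBuffer` and `build_condense_prompt` are hypothetical names, the prompt wording is an assumption, and the stored answer is abbreviated. It is not how LangChain implements memory internally.

```python
# Toy illustration only: ToyMemoryBuffer and build_condense_prompt are
# hypothetical helpers, not LangChain APIs.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyMemoryBuffer:
    """Stores (question, answer) turns, loosely mimicking a conversation buffer."""

    turns: List[Tuple[str, str]] = field(default_factory=list)

    def save(self, question: str, answer: str) -> None:
        """Append one completed turn to the history."""
        self.turns.append((question, answer))

    def as_text(self) -> str:
        """Render the history as plain text for inclusion in a prompt."""
        return "\n".join(f"Human: {q}\nAI: {a}" for q, a in self.turns)


def build_condense_prompt(memory: ToyMemoryBuffer, follow_up: str) -> str:
    """Combine the history and a follow-up into a 'standalone question' prompt."""
    return (
        "Given the conversation below, rephrase the follow-up question "
        "as a standalone question.\n\n"
        f"Chat history:\n{memory.as_text()}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


memory = ToyMemoryBuffer()
memory.save(
    "What are some of the main ideas in self-reflection?",
    "Some of the main ideas in self-reflection include: ...",  # abbreviated
)
prompt = build_condense_prompt(memory, "How does the Reflexion paper handle it?")
print(prompt)
```

Because the earlier turn mentions self-reflection, a model reading this prompt has what it needs to replace "it" with the right topic before any retrieval happens.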
## Further reading
- Check out the [How to](/docs/use_cases/question_answering/how_to/) section for all the variations of chains that can be used for QA over docs in different settings.
- Check out the [Integrations-specific](/docs/use_cases/question_answering/integrations/) section for chains that use specific integrations.