# Retrieve as you generate with FLARE

This notebook is an implementation of Forward-Looking Active REtrieval augmented generation (FLARE).

Please see the original repo [here](https://github.com/jzbjyb/FLARE/tree/main).

The basic idea is:

- Start answering a question
- If you start generating tokens the model is uncertain about, look up relevant documents
- Use those documents to continue generating
- Repeat until finished
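
The steps above can be sketched as a small loop. This is a toy illustration, not the `FlareChain` implementation: `flare_loop`, `fake_generate`, and `fake_retrieve` are hypothetical names, the token probabilities are made up, and the tentative chunk itself is used as the search query (a simplification of the question-generation step described below).

```python
from typing import Callable, List, Tuple

# A generated chunk: its tokens plus one probability per token.
Chunk = Tuple[List[str], List[float]]


def flare_loop(
    question: str,
    generate_chunk: Callable[[str, str, str], Chunk],  # hypothetical: (question, context, response_so_far)
    retrieve: Callable[[str], str],  # hypothetical: query -> retrieved text
    min_prob: float = 0.3,
    max_iterations: int = 5,
) -> str:
    response = ""
    for _ in range(max_iterations):
        # 1. Tentatively continue the answer without retrieved context.
        tokens, probs = generate_chunk(question, "", response)
        # 2. If any token is uncertain, look up documents and regenerate the chunk.
        if any(p < min_prob for p in probs):
            context = retrieve("".join(tokens))
            tokens, probs = generate_chunk(question, context, response)
        chunk = "".join(tokens)
        # 3. Stop once the model emits the end marker.
        if "FINISHED" in chunk:
            response += chunk.split("FINISHED")[0]
            break
        response += chunk
    return response.strip()


# Stand-ins for the LLM and the retriever (assumptions for this sketch).
def fake_generate(question: str, context: str, response_so_far: str) -> Chunk:
    if response_so_far:
        return ["FINISHED"], [0.99]
    if not context:
        # Without context the "model" is unsure about the year.
        return ["LangChain ", "was ", "created ", "in ", "2020", "."], [0.9, 0.9, 0.8, 0.9, 0.1, 0.9]
    return ["LangChain ", "was ", "created ", "in ", "2022", "."], [0.9] * 6


def fake_retrieve(query: str) -> str:
    return "LangChain was created in late October 2022 by Harrison Chase."


answer = flare_loop("when was langchain created?", fake_generate, fake_retrieve)
print(answer)
```

In this sketch, the low-probability year in the first draft triggers a lookup, and the chunk is regenerated with the retrieved text as context before being appended to the response.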

There is a lot of interesting detail in how the lookup of relevant documents is done.
Basically, the tokens that the model is uncertain about are highlighted, and then an LLM is called to generate a question that would lead to that answer. For example, if the generated text is `Joe Biden went to Harvard`, and the token the model was uncertain about was `Harvard`, then a good generated question would be `where did Joe Biden go to college`. This generated question is then used in a retrieval step to fetch relevant documents.
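
As a rough sketch of that highlighting step, adjacent low-probability tokens can be merged into spans, and each span turned into a prompt for the question-generating LLM. The helper `low_confidence_spans`, the wording of `QUESTION_TEMPLATE`, and the probabilities are all assumptions made for illustration; the chain's real prompt and span logic may differ.

```python
from typing import List


def low_confidence_spans(tokens: List[str], probs: List[float], min_prob: float) -> List[str]:
    """Merge runs of adjacent tokens whose probability is below min_prob."""
    spans: List[str] = []
    current: List[str] = []
    for token, prob in zip(tokens, probs):
        if prob < min_prob:
            current.append(token)
        elif current:
            # A confident token closes the current uncertain run.
            spans.append("".join(current).strip())
            current = []
    if current:
        spans.append("".join(current).strip())
    return spans


# Hypothetical prompt wording, for illustration only.
QUESTION_TEMPLATE = (
    "Partial response: {response}\n"
    'Ask a question to which the answer is "{span}":'
)

tokens = ["Joe", " Biden", " went", " to", " Harvard"]
probs = [0.95, 0.97, 0.80, 0.90, 0.12]  # made-up probabilities

spans = low_confidence_spans(tokens, probs, min_prob=0.3)
prompts = [QUESTION_TEMPLATE.format(response="".join(tokens), span=s) for s in spans]
print(spans)
print(prompts[0])
```

Each resulting prompt would be sent to the question-generating LLM, and the questions it returns become the retrieval queries.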

In order to set up this chain, we will need three things:

- An LLM to generate the answer
- An LLM to generate hypothetical questions to use in retrieval
- A retriever to look up documents that answer those questions

The LLM that we use to generate the answer needs to return logprobs so we can identify uncertain tokens. For that reason, we HIGHLY recommend that you use the OpenAI wrapper (NB: not the ChatOpenAI wrapper, as that does not return logprobs).

The LLM we use to generate hypothetical questions to use in retrieval can be anything. In this notebook we will use ChatOpenAI because it is fast and cheap.

The retriever can be anything. In this notebook we will use the [SERPER](https://serper.dev/) search engine, because it is cheap.

Other important parameters to understand:

- `max_generation_len`: The maximum number of tokens to generate before stopping to check if any are uncertain
- `min_prob`: Any tokens generated with probability below this will be considered uncertain
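
To make `min_prob` concrete: completion APIs typically report log-probabilities, so a token's probability is `exp(logprob)`. The sketch below flags tokens under the threshold; `flag_uncertain` is a hypothetical helper and the logprob values are invented, so treat it as an illustration of the idea and not as the chain's internal code.

```python
import math
from typing import List, Tuple


def flag_uncertain(tokens: List[str], logprobs: List[float], min_prob: float) -> List[Tuple[str, bool]]:
    """Pair each token with True when exp(logprob) falls below min_prob."""
    return [(token, math.exp(lp) < min_prob) for token, lp in zip(tokens, logprobs)]


tokens = ["Bitcoin", " was", " created", " in", " 2008"]
logprobs = [-0.05, -0.10, -0.20, -0.02, -1.90]  # made-up values

flags = flag_uncertain(tokens, logprobs, min_prob=0.3)
uncertain = [token for token, is_uncertain in flags if is_uncertain]
print(uncertain)

# The same threshold expressed in log space: log(0.3) is roughly -1.204.
log_threshold = math.log(0.3)
```

With `min_prob=0.3`, only tokens whose logprob is below roughly -1.204 are treated as uncertain, so raising `min_prob` makes retrieval trigger more often.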

## Imports

In [1]:
import os

os.environ["SERPER_API_KEY"] = ""
os.environ["OPENAI_API_KEY"] = ""

In [2]:
from typing import Any, List

from langchain.callbacks.manager import (
    AsyncCallbackManagerForRetrieverRun,
    CallbackManagerForRetrieverRun,
)
from langchain_community.utilities import GoogleSerperAPIWrapper
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
from langchain_openai import ChatOpenAI, OpenAI

## Retriever

In [3]:
class SerperSearchRetriever(BaseRetriever):
    search: GoogleSerperAPIWrapper = None

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun, **kwargs: Any
    ) -> List[Document]:
        return [Document(page_content=self.search.run(query))]

    async def _aget_relevant_documents(
        self,
        query: str,
        *,
        run_manager: AsyncCallbackManagerForRetrieverRun,
        **kwargs: Any,
    ) -> List[Document]:
        raise NotImplementedError()


retriever = SerperSearchRetriever(search=GoogleSerperAPIWrapper())

## FLARE Chain

In [4]:
# We set this so we can see what exactly is going on
from langchain.globals import set_verbose

set_verbose(True)

In [5]:
from langchain.chains import FlareChain

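flare = FlareChain.from_llm(
    ChatOpenAI(temperature=0),
    retriever=retriever,
    max_generation_len=164,  # max tokens to generate before checking for uncertain ones
    min_prob=0.3,  # tokens with probability below this are considered uncertain
)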

In [6]:
query = "explain in great detail the difference between the langchain framework and baby agi"

In [7]:
flare.run(query)



> Entering new FlareChain chain...
Current Response: 
Prompt after formatting:
Respond to the user message using any relevant context. If context is provided, you should ground your answer in that context. Once you're done responding return FINISHED.

>>> CONTEXT: 
>>> USER INPUT: explain in great detail the difference between the langchain framework and baby agi
>>> RESPONSE: 


> Entering new QuestionGeneratorChain chain...
Prompt after formatting:
Given a user input and an existing partial response as context, ask a question to which the answer is the given term/entity/phrase:

>>> USER INPUT: explain in great detail the difference between the langchain framework and baby agi
>>> EXISTING PARTIAL RESPONSE:  
The Langchain Framework is a decentralized platform for natural language processing (NLP) applications. It uses a blockchain-based distributed ledger to store and process data, allowing for secure and transparent d


> Finished chain.
Generated Questions: ['What is the Langchain Framework?', 'What technology does the Langchain Framework use to store and process data for secure and transparent data sharing?', 'What technology does the Langchain Framework use to store and process data?', 'What does the Langchain Framework use a blockchain-based distributed ledger for?', 'What does the Langchain Framework provide in addition to a decentralized platform for natural language processing applications?', 'What set of tools and services does the Langchain Framework provide?', 'What is the purpose of Baby AGI?', 'What type of applications is the Langchain Framework designed for?']


> Entering new _OpenAIResponseChain chain...
Prompt after formatting:
Respond to the user message using any relevant context. If context is provided, you should ground your answer in that context. Once you're done responding return FINISHED.

>>> CONTEXT: LangChain: Software. LangCha


> Finished chain.

> Finished chain.


' LangChain is a framework for developing applications powered by language models. It provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications. On the other hand, Baby AGI is an AI system that is exploring and demonstrating the potential of large language models, such as GPT, and how it can autonomously perform tasks. Baby AGI has the ability to complete tasks, generate new tasks based on previous results, and prioritize tasks in real-time. '

In [8]:
llm = OpenAI()
llm.invoke(query)

'\n\nThe Langchain framework and Baby AGI are both artificial intelligence (AI) frameworks that are used to create intelligent agents. The Langchain framework is a supervised learning system that is based on the concept of “language chains”. It uses a set of rules to map natural language inputs to specific outputs. It is a general-purpose AI framework and can be used to build applications such as natural language processing (NLP), chatbots, and more.\n\nBaby AGI, on the other hand, is an unsupervised learning system that uses neural networks and reinforcement learning to learn from its environment. It is used to create intelligent agents that can adapt to changing environments. It is a more advanced AI system and can be used to build more complex applications such as game playing, robotic vision, and more.\n\nThe main difference between the two is that the Langchain framework uses supervised learning while Baby AGI uses unsupervised learning. The Langchain framework is a general-purpos

In [9]:
flare.run("how are the origin stories of langchain and bitcoin similar or different?")



> Entering new FlareChain chain...
Current Response: 
Prompt after formatting:
Respond to the user message using any relevant context. If context is provided, you should ground your answer in that context. Once you're done responding return FINISHED.

>>> CONTEXT: 
>>> USER INPUT: how are the origin stories of langchain and bitcoin similar or different?
>>> RESPONSE: 


> Entering new QuestionGeneratorChain chain...
Prompt after formatting:
Given a user input and an existing partial response as context, ask a question to which the answer is the given term/entity/phrase:

>>> USER INPUT: how are the origin stories of langchain and bitcoin similar or different?
>>> EXISTING PARTIAL RESPONSE:  

Langchain and Bitcoin have very different origin stories. Bitcoin was created by the mysterious Satoshi Nakamoto in 2008 as a decentralized digital currency. Langchain, on the other hand, was created in 2020 by a team of developers a

' The origin stories of LangChain and Bitcoin are quite different. Bitcoin was created in 2009 by an unknown person using the alias Satoshi Nakamoto. LangChain was created in late October 2022 by Harrison Chase. Bitcoin is a decentralized cryptocurrency, while LangChain is a framework built around LLMs. '