This will go over how to get started building an agent.
We will create this agent from scratch, using LangChain Expression Language.
We will then define custom tools, and then run it in a custom loop (we will also show how to use the standard LangChain `AgentExecutor`).
### Set up the agent
We first need to create our agent.
This is the chain responsible for determining what action to take next.
In this example, we will use OpenAI Function Calling to create this agent.
This is generally the most reliable way create agents.
In this example we will show what it is like to construct this agent from scratch, using LangChain Expression Language.
For this guide, we will construct a custom agent that has access to a custom tool.
We are choosing this example because we think for most use cases you will NEED to customize either the agent or the tools.
The tool we will give the agent is a tool to calculate the length of a word.
This is useful because this is actually something LLMs can mess up due to tokenization.
We will first create it WITHOUT memory, but we will then show how to add memory in.
Memory is needed to enable conversation.
First, let's load the language model we're going to use to control the agent.
```python
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(temperature=0)
```
Next, let's define some tools to use.
Let's write a really simple Python function to calculate the length of a word that is passed in.
```python
from langchain.agents import tool
@tool
def get_word_length(word: str) -> int:
"""Returns the length of a word."""
return len(word)
tools = [get_word_length]
```
Now let us create the prompt.
Because OpenAI Function Calling is finetuned for tool usage, we hardly need any instructions on how to reason, or how to output format.
We will just have two input variables: `input` (for the user question) and `agent_scratchpad` (for any previous steps taken)
```python
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages([
("system", "You are very powerful assistant, but bad at calculating lengths of words."),
("user", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad"),
])
```
How does the agent know what tools it can use?
Those are passed in as a separate argument, so we can bind those as keyword arguments to the LLM.
```python
from langchain.tools.render import format_tool_to_openai_function
llm_with_tools = llm.bind(
functions=[format_tool_to_openai_function(t) for t in tools]
)
```
Putting those pieces together, we can now create the agent.
We will import two last utility functions: a component for formatting intermediate steps to messages, and a component for converting the output message into an agent action/agent finish.
```python
from langchain.agents.format_scratchpad import format_to_openai_functions
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser
agent = {
"input": lambda x: x["input"],
"agent_scratchpad": lambda x: format_to_openai_functions(x['intermediate_steps'])
} | prompt | llm_with_tools | OpenAIFunctionsAgentOutputParser()
```
Now that we have our agent, let's play around with it!
Let's pass in a simple question and empty intermediate steps and see what it returns:
```python
agent.invoke({
"input": "how many letters in the word educa?",
"intermediate_steps": []
})
```
We can see that it responds with an `AgentAction` to take (it's actually an `AgentActionMessageLog` - a subclass of `AgentAction` which also tracks the full message log).
So this is just the first step - now we need to write a runtime for this.
The simplest one is just one that continuously loops, calling the agent, then taking the action, and repeating until an `AgentFinish` is returned.
Let's code that up below:
```python
from langchain.schema.agent import AgentFinish
intermediate_steps = []
while True:
output = agent.invoke({
"input": "how many letters in the word educa?",
"intermediate_steps": intermediate_steps
})
if isinstance(output, AgentFinish):
final_result = output.return_values["output"]
break
else:
print(output.tool, output.tool_input)
tool = {
"get_word_length": get_word_length
}[output.tool]
observation = tool.run(output.tool_input)
intermediate_steps.append((output, observation))
print(final_result)
```
We can see this prints out the following:
```
get_word_length {'word': 'educa'}
There are 5 letters in the word "educa".
```
Woo! It's working.
To simplify this a bit, we can import and use the `AgentExecutor` class.
This bundles up all of the above and adds in error handling, early stopping, tracing, and other quality-of-life improvements that reduce safeguards you need to write.
```python
from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
```
Now let's test it out!
```python
agent_executor.invoke({"input": "how many letters in the word educa?"})
```
```
> Entering new AgentExecutor chain...
Invoking: `get_word_length` with `{'word': 'educa'}`
5
There are 5 letters in the word "educa".
> Finished chain.
'There are 5 letters in the word "educa".'
```
This is great - we have an agent!
However, this agent is stateless - it doesn't remember anything about previous interactions.
This means you can't ask follow up questions easily.
Let's fix that by adding in memory.
In order to do this, we need to do two things:
1. Add a place for memory variables to go in the prompt
2. Keep track of the chat history
First, let's add a place for memory in the prompt.
We do this by adding a placeholder for messages with the key `"chat_history"`.
Notice that we put this ABOVE the new user input (to follow the conversation flow).
```python
from langchain.prompts import MessagesPlaceholder
MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages([
("system", "You are very powerful assistant, but bad at calculating lengths of words."),
MessagesPlaceholder(variable_name=MEMORY_KEY),
("user", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad"),
])
```
We can then set up a list to track the chat history
```
from langchain.schema.messages import HumanMessage, AIMessage
chat_history = []
```
We can then put it all together!
```python
agent = {
"input": lambda x: x["input"],
"agent_scratchpad": lambda x: format_to_openai_functions(x['intermediate_steps']),
"chat_history": lambda x: x["chat_history"]
} | prompt | llm_with_tools | OpenAIFunctionsAgentOutputParser()
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
```
When running, we now need to track the inputs and outputs as chat history
```
input1 = "how many letters in the word educa?"
result = agent_executor.invoke({"input": input1, "chat_history": chat_history})
chat_history.append(HumanMessage(content=input1))
chat_history.append(AIMessage(content=result['output']))
agent_executor.invoke({"input": "is that a real word?", "chat_history": chat_history})
```