This guide covers how to get started building an agent. We will create the agent from scratch using LangChain Expression Language, define custom tools, and then run it in a custom loop (we will also show how to use the standard LangChain `AgentExecutor`).

### Set up the agent

We first need to create our agent. This is the chain responsible for determining what action to take next.

In this example, we will use OpenAI Function Calling to create this agent. This is generally the most reliable way to create agents. We will construct the agent from scratch, using LangChain Expression Language, to show what that looks like.

For this guide, we will construct a custom agent that has access to a custom tool. We are choosing this example because we think for most use cases you will NEED to customize either the agent or the tools. The tool we will give the agent calculates the length of a word. This is useful because it is something LLMs can get wrong due to tokenization. We will first create the agent WITHOUT memory, and then show how to add memory in. Memory is needed to enable conversation.

First, let's load the language model we're going to use to control the agent.

```python
from langchain.chat_models import ChatOpenAI

# temperature=0 keeps the model's tool-calling decisions as deterministic as possible
llm = ChatOpenAI(temperature=0)
```

Next, let's define some tools to use. Let's write a really simple Python function to calculate the length of a word that is passed in.

```python
from langchain.agents import tool


@tool
def get_word_length(word: str) -> int:
    """Returns the length of a word."""
    return len(word)


tools = [get_word_length]
```

Now let us create the prompt. Because OpenAI Function Calling is finetuned for tool usage, we hardly need any instructions on how to reason or how to format the output.
We will just have two input variables: `input` (for the user question) and `agent_scratchpad` (for any previous steps taken).

```python
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are very powerful assistant, but bad at calculating lengths of words."),
    ("user", "{input}"),
    # Filled at run time with the messages for any previous agent steps
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])
```

How does the agent know what tools it can use? With OpenAI Function Calling, the available functions are passed to the model as a separate argument rather than written into the prompt, so we can bind them as keyword arguments to the LLM.

```python
from langchain.tools.render import format_tool_to_openai_function

llm_with_tools = llm.bind(
    functions=[format_tool_to_openai_function(t) for t in tools]
)
```

Putting those pieces together, we can now create the agent. We will import two last utility functions: a component for formatting intermediate steps to messages, and a component for converting the output message into an agent action/agent finish.

```python
from langchain.agents.format_scratchpad import format_to_openai_functions
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser

agent = {
    "input": lambda x: x["input"],
    "agent_scratchpad": lambda x: format_to_openai_functions(x['intermediate_steps'])
} | prompt | llm_with_tools | OpenAIFunctionsAgentOutputParser()
```

Now that we have our agent, let's play around with it! Let's pass in a simple question and empty intermediate steps and see what it returns:

```python
agent.invoke({
    "input": "how many letters in the word educa?",
    "intermediate_steps": []
})
```

We can see that it responds with an `AgentAction` to take (it's actually an `AgentActionMessageLog` - a subclass of `AgentAction` which also tracks the full message log).

So this is just the first step - now we need to write a runtime for this. The simplest runtime is a loop that calls the agent, takes the action it returns, and repeats until an `AgentFinish` is returned.
Let's code that up below:

```python
from langchain.schema.agent import AgentFinish

intermediate_steps = []
while True:
    # Ask the agent for its next step, given everything done so far.
    output = agent.invoke({
        "input": "how many letters in the word educa?",
        "intermediate_steps": intermediate_steps
    })
    if isinstance(output, AgentFinish):
        # The agent is done; pull out the final answer.
        final_result = output.return_values["output"]
        break
    else:
        print(output.tool, output.tool_input)
        # Look up the requested tool by name and run it.
        tool = {
            "get_word_length": get_word_length
        }[output.tool]
        observation = tool.run(output.tool_input)
        intermediate_steps.append((output, observation))
print(final_result)
```

We can see this prints out the following:

```
get_word_length {'word': 'educa'}
There are 5 letters in the word "educa".
```

Woo! It's working.

To simplify this a bit, we can import and use the `AgentExecutor` class. This bundles up all of the above and adds error handling, early stopping, tracing, and other quality-of-life improvements that reduce the number of safeguards you need to write yourself.

```python
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
```

Now let's test it out!

```python
agent_executor.invoke({"input": "how many letters in the word educa?"})
```

```
> Entering new AgentExecutor chain...

Invoking: `get_word_length` with `{'word': 'educa'}`

5
There are 5 letters in the word "educa".

> Finished chain.

'There are 5 letters in the word "educa".'
```

This is great - we have an agent! However, this agent is stateless - it doesn't remember anything about previous interactions. This means you can't easily ask follow-up questions. Let's fix that by adding in memory.

In order to do this, we need to do two things:

1. Add a place for memory variables to go in the prompt
2. Keep track of the chat history

First, let's add a place for memory in the prompt. We do this by adding a placeholder for messages with the key `"chat_history"`. Notice that we put this ABOVE the new user input (to follow the conversation flow).
```python
from langchain.prompts import MessagesPlaceholder

MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are very powerful assistant, but bad at calculating lengths of words."),
    # Earlier turns of the conversation go before the new user input
    MessagesPlaceholder(variable_name=MEMORY_KEY),
    ("user", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])
```

We can then set up a list to track the chat history:

```python
from langchain.schema.messages import HumanMessage, AIMessage

chat_history = []
```

We can then put it all together!

```python
agent = {
    "input": lambda x: x["input"],
    "agent_scratchpad": lambda x: format_to_openai_functions(x['intermediate_steps']),
    "chat_history": lambda x: x["chat_history"]
} | prompt | llm_with_tools | OpenAIFunctionsAgentOutputParser()
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
```

When running, we now need to track the inputs and outputs as chat history:

```python
input1 = "how many letters in the word educa?"
result = agent_executor.invoke({"input": input1, "chat_history": chat_history})
# Record both sides of the first turn so the follow-up question has context
chat_history.append(HumanMessage(content=input1))
chat_history.append(AIMessage(content=result['output']))
agent_executor.invoke({"input": "is that a real word?", "chat_history": chat_history})
```
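
The invoke-then-append bookkeeping repeats on every turn, so it can be wrapped in a small helper. The following is only a minimal sketch, not part of LangChain: `chat_turn`, `make_human`, `make_ai`, and `fake_invoke` are hypothetical names introduced for illustration. The stub `fake_invoke` stands in for `agent_executor.invoke` so the example runs without an API key, and plain `(role, content)` tuples stand in for `HumanMessage` and `AIMessage`.

```python
from typing import Any, Callable, Dict, List


def chat_turn(
    invoke: Callable[[Dict[str, Any]], Dict[str, Any]],
    chat_history: List[Any],
    user_input: str,
    make_human: Callable[[str], Any],
    make_ai: Callable[[str], Any],
) -> str:
    """Run one conversation turn and record both sides in chat_history."""
    # The history passed in contains only the turns before this one.
    result = invoke({"input": user_input, "chat_history": chat_history})
    # Append the new exchange so the next turn can see it.
    chat_history.append(make_human(user_input))
    chat_history.append(make_ai(result["output"]))
    return result["output"]


def fake_invoke(inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for agent_executor.invoke (no model call)."""
    return {"output": f"seen {len(inputs['chat_history'])} prior messages"}


history: List[Any] = []
first = chat_turn(
    fake_invoke, history, "how many letters in the word educa?",
    make_human=lambda c: ("human", c), make_ai=lambda c: ("ai", c),
)
second = chat_turn(
    fake_invoke, history, "is that a real word?",
    make_human=lambda c: ("human", c), make_ai=lambda c: ("ai", c),
)
print(first)
print(second)
```

With the real agent, the same helper would presumably be called as `chat_turn(agent_executor.invoke, chat_history, input1, lambda c: HumanMessage(content=c), lambda c: AIMessage(content=c))`. Passing the message constructors in as arguments keeps the helper free of LangChain imports, which is a design choice made for this sketch rather than anything the library requires.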