(rfc) chat models (#1424)

Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
Harrison Chase 1 year ago committed by GitHub
parent dec3750875
commit 0e21463f07

@ -63,6 +63,8 @@ These modules are, in increasing order of complexity:
- `Memory <./modules/memory.html>`_: Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
- `Chat <./modules/chat.html>`_: Chat models are a variation on Language Models that expose a different API - rather than working with raw text, they work with messages. LangChain provides a standard interface for working with them and doing all the same things as above.
.. toctree::
:maxdepth: 1
@ -78,6 +80,7 @@ These modules are, in increasing order of complexity:
./modules/chains.md
./modules/agents.md
./modules/memory.md
./modules/chat.md
Use Cases
----------

@ -205,7 +205,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
}
},
"nbformat": 4,

@ -81,7 +81,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.0 64-bit ('llm-env')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -95,7 +95,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.0"
"version": "3.9.1"
},
"vscode": {
"interpreter": {

@ -32,7 +32,9 @@
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
@ -55,7 +57,9 @@
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
@ -63,7 +67,7 @@
"text": [
"\n",
"\n",
"Vibrancy Socks.\n"
"Rainbow Socks Co.\n"
]
}
],
@ -75,6 +79,48 @@
"print(chain.run(\"colorful socks\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can use a chat model in an `LLMChain` as well:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Rainbow Threads\n"
]
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"human_message_prompt = HumanMessagePromptTemplate(\n",
" prompt=PromptTemplate(\n",
" template=\"What is a good name for a company that makes {product}?\",\n",
" input_variables=[\"product\"],\n",
" )\n",
" )\n",
"chat_prompt_template = ChatPromptTemplate.from_messages([human_message_prompt])\n",
"chat = ChatOpenAI(temperature=0.9)\n",
"chain = LLMChain(llm=chat, prompt=chat_prompt_template)\n",
"print(chain.run(\"colorful socks\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
@ -274,5 +320,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

@ -0,0 +1,26 @@
Chat
==========================
Chat models are a variation on language models.
While chat models use language models under the hood, the interface they expose is a bit different.
Rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
Chat model APIs are fairly new, so we are still figuring out the correct abstractions.
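For example, here is a minimal sketch of the message-based interface (adapted from the getting started notebook; it assumes the ``ChatOpenAI`` wrapper and an OpenAI API key are available):

.. code-block:: python

    from langchain.chat_models import ChatOpenAI
    from langchain.schema import HumanMessage, SystemMessage

    chat = ChatOpenAI(temperature=0)
    # The input is a list of chat messages; the output is a single AIMessage.
    response = chat([
        SystemMessage(content="You are a helpful assistant that translates English to French."),
        HumanMessage(content="Translate this sentence from English to French. I love programming."),
    ])
    print(response.content)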
The following sections of documentation are provided:
- `Getting Started <./chat/getting_started.html>`_: An overview of the basics of chat models.
- `Key Concepts <./chat/key_concepts.html>`_: A conceptual guide going over the various concepts related to chat models.
- `How-To Guides <./chat/how_to_guides.html>`_: A collection of how-to guides. These highlight how to accomplish various objectives with our chat model class, as well as how to integrate with various chat model providers.
.. toctree::
:maxdepth: 1
:name: LLMs
:hidden:
./chat/getting_started.ipynb
./chat/key_concepts.md
./chat/how_to_guides.rst

@ -0,0 +1,208 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e58f4d5a",
"metadata": {},
"source": [
"# Agent\n",
"This notebook covers how to create a custom agent for a chat model. It will utilize chat specific prompts."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5268c7fa",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import ZeroShotAgent, Tool, AgentExecutor\n",
"from langchain.chains import LLMChain\n",
"from langchain.utilities import SerpAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "fbaa4dbe",
"metadata": {},
"outputs": [],
"source": [
"search = SerpAPIWrapper()\n",
"tools = [\n",
" Tool(\n",
" name = \"Search\",\n",
" func=search.run,\n",
" description=\"useful for when you need to answer questions about current events\"\n",
" )\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f3ba6f08",
"metadata": {},
"outputs": [],
"source": [
"prefix = \"\"\"Answer the following questions as best you can, but speaking as a pirate might speak. You have access to the following tools:\"\"\"\n",
"suffix = \"\"\"Begin! Remember to speak as a pirate when giving your final answer. Use lots of \"Args\"\"\"\n",
"\n",
"prompt = ZeroShotAgent.create_prompt(\n",
" tools, \n",
" prefix=prefix, \n",
" suffix=suffix, \n",
" input_variables=[]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "3547a37d",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import (\n",
" AIMessage,\n",
" HumanMessage,\n",
" SystemMessage\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a78f886f",
"metadata": {},
"outputs": [],
"source": [
"messages = [\n",
" SystemMessagePromptTemplate(prompt=prompt),\n",
" HumanMessagePromptTemplate.from_template(\"{input}\\n\\nThis was your previous work \"\n",
" f\"(but I haven't seen any of it! I only see what \"\n",
" \"you return as final answer):\\n{agent_scratchpad}\")\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "dadadd70",
"metadata": {},
"outputs": [],
"source": [
"prompt = ChatPromptTemplate.from_messages(messages)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b7180182",
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(llm=ChatOpenAI(temperature=0), prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "ddddb07b",
"metadata": {},
"outputs": [],
"source": [
"tool_names = [tool.name for tool in tools]\n",
"agent = ZeroShotAgent(llm_chain=llm_chain, allowed_tools=tool_names)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "36aef054",
"metadata": {},
"outputs": [],
"source": [
"agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "33a4d6cc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mArrr, ye be in luck, matey! I'll find ye the answer to yer question.\n",
"\n",
"Thought: I need to search for the current population of Canada.\n",
"Action: Search\n",
"Action Input: \"current population of Canada 2023\"\n",
"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe current population of Canada is 38,623,091 as of Saturday, March 4, 2023, based on Worldometer elaboration of the latest United Nations data.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mAhoy, me hearties! I've found the answer to yer question.\n",
"\n",
"Final Answer: As of March 4, 2023, the population of Canada be 38,623,091. Arrr!\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'As of March 4, 2023, the population of Canada be 38,623,091. Arrr!'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.run(\"How many people live in canada as of 2023?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6aefe978",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,376 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "134a0785",
"metadata": {},
"source": [
"# Chat Vector DB\n",
"\n",
"This notebook goes over how to set up a chat model to chat with a vector database.\n",
"\n",
"This notebook is very similar to the example of using an LLM in the ChatVectorDBChain. The only differences here are (1) using a ChatModel, and (2) passing in a ChatPromptTemplate (optimized for chat models)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "70c4e529",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores import Chroma\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.chains import ChatVectorDBChain"
]
},
{
"cell_type": "markdown",
"id": "cdff94be",
"metadata": {},
"source": [
"Load in documents. You can replace this with a loader for whatever type of data you want"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "01c46e92",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../state_of_the_union.txt')\n",
"documents = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "e9be4779",
"metadata": {},
"source": [
"If you had multiple loaders that you wanted to combine, you do something like:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "433363a5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# loaders = [....]\n",
"# docs = []\n",
"# for loader in loaders:\n",
"# docs.extend(loader.load())"
]
},
{
"cell_type": "markdown",
"id": "239475d2",
"metadata": {},
"source": [
"We now split the documents, create embeddings for them, and put them in a vectorstore. This allows us to do semantic search over them."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a8930cf7",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running Chroma using direct local API.\n",
"Using DuckDB in-memory for database. Data will be transient.\n"
]
}
],
"source": [
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"documents = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"vectorstore = Chroma.from_documents(documents, embeddings)"
]
},
{
"cell_type": "markdown",
"id": "18415aca",
"metadata": {},
"source": [
"We are now going to construct a prompt specifically designed for chat models."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "c8805230",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import (\n",
" AIMessage,\n",
" HumanMessage,\n",
" SystemMessage\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "cc86c30e",
"metadata": {},
"outputs": [],
"source": [
"system_template=\"\"\"Use the following pieces of context to answer the users question. \n",
"If you don't know the answer, just say that you don't know, don't try to make up an answer.\n",
"----------------\n",
"{context}\"\"\"\n",
"messages = [\n",
" SystemMessagePromptTemplate.from_template(system_template),\n",
" HumanMessagePromptTemplate.from_template(\"{question}\")\n",
"]\n",
"prompt = ChatPromptTemplate.from_messages(messages)"
]
},
{
"cell_type": "markdown",
"id": "3c96b118",
"metadata": {},
"source": [
"We now initialize the ChatVectorDBChain"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7b4110f3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"qa = ChatVectorDBChain.from_llm(ChatOpenAI(temperature=0), vectorstore,qa_prompt=prompt)"
]
},
{
"cell_type": "markdown",
"id": "3872432d",
"metadata": {},
"source": [
"Here's an example of asking a question with no chat history"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "7fe3e730",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = qa({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "bfff9cc8",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"\"The President nominated Circuit Court of Appeals Judge Ketanji Brown Jackson to serve on the United States Supreme Court. He described her as one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and a consensus builder. He also mentioned that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"answer\"]"
]
},
{
"cell_type": "markdown",
"id": "9e46edf7",
"metadata": {},
"source": [
"Here's an example of asking a question with some chat history"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "00b4cf00",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat_history = [(query, result[\"answer\"])]\n",
"query = \"Did he mention who came before her\"\n",
"result = qa({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "f01828d1",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'The context does not provide information about the predecessor of Ketanji Brown Jackson.'"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result['answer']"
]
},
{
"cell_type": "markdown",
"id": "2324cdc6-98bf-4708-b8cd-02a98b1e5b67",
"metadata": {},
"source": [
"## Chat Vector DB with streaming to `stdout`\n",
"\n",
"Output from the chain will be streamed to `stdout` token by token in this example."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "2efacec3-2690-4b05-8de3-a32fd2ac3911",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chains.llm import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain.chains.chat_vector_db.prompts import CONDENSE_QUESTION_PROMPT\n",
"from langchain.chains.question_answering import load_qa_chain\n",
"\n",
"# Construct a ChatVectorDBChain with a streaming llm for combine docs\n",
"# and a separate, non-streaming llm for question generation\n",
"llm = OpenAI(temperature=0)\n",
"streaming_llm = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"\n",
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
"doc_chain = load_qa_chain(streaming_llm, chain_type=\"stuff\", prompt=prompt)\n",
"\n",
"qa = ChatVectorDBChain(vectorstore=vectorstore, combine_docs_chain=doc_chain, question_generator=question_generator)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "fd6d43f4-7428-44a4-81bc-26fe88a98762",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The President nominated Circuit Court of Appeals Judge Ketanji Brown Jackson to serve on the United States Supreme Court. He described her as one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and a consensus builder. He also mentioned that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."
]
}
],
"source": [
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = qa({\"question\": query, \"chat_history\": chat_history})"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "5ab38978-f3e8-4fa7-808c-c79dec48379a",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The context does not provide information on who Ketanji Brown Jackson succeeded on the United States Supreme Court."
]
}
],
"source": [
"chat_history = [(query, result[\"answer\"])]\n",
"query = \"Did he mention who she suceeded\"\n",
"result = qa({\"question\": query, \"chat_history\": chat_history})\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8e8d0055",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,166 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "bb0735c0",
"metadata": {},
"source": [
"# Few Shot Examples\n",
"\n",
"This notebook covers how to use few shot examples in chat models.\n",
"\n",
"There does not appear to be solid consensus on how best to do few shot prompting. As a result, we are not solidifying any abstractions around this yet but rather using existing abstractions."
]
},
{
"cell_type": "markdown",
"id": "c6e9664c",
"metadata": {},
"source": [
"## Alternating Human/AI messages\n",
"The first way of doing few shot prompting relies on using alternating human/ai messages. See an example of this below."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "62156fe4",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain import PromptTemplate, LLMChain\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import (\n",
" AIMessage,\n",
" HumanMessage,\n",
" SystemMessage\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "ed7ac3c6",
"metadata": {},
"outputs": [],
"source": [
"chat = ChatOpenAI(temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "98791aa9",
"metadata": {},
"outputs": [],
"source": [
"template=\"You are a helpful assistant that translates english to pirate.\"\n",
"system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n",
"example_human = HumanMessagePromptTemplate.from_template(\"Hi\")\n",
"example_ai = AIMessagePromptTemplate.from_template(\"Argh me mateys\")\n",
"human_template=\"{text}\"\n",
"human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4eebdcd7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"I be lovin' programmin', me hearty!\""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, example_human, example_ai, human_message_prompt])\n",
"chain = LLMChain(llm=chat, prompt=chat_prompt)\n",
"# get a chat completion from the formatted messages\n",
"chain.run(\"I love programming.\")"
]
},
{
"cell_type": "markdown",
"id": "5c4135d7",
"metadata": {},
"source": [
"## System Messages\n",
"\n",
"OpenAI provides an optional `name` parameter that they also recommend using in conjunction with system messages to do few shot prompting. Here is an example of how to do that below."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "1ba92d59",
"metadata": {},
"outputs": [],
"source": [
"template=\"You are a helpful assistant that translates english to pirate.\"\n",
"system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n",
"example_human = SystemMessagePromptTemplate.from_template(\"Hi\", additional_kwargs={\"name\": \"example_user\"})\n",
"example_ai = SystemMessagePromptTemplate.from_template(\"Argh me mateys\", additional_kwargs={\"name\": \"example_assistant\"})\n",
"human_template=\"{text}\"\n",
"human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "56e488a7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"I be lovin' programmin', me hearty.\""
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, example_human, example_ai, human_message_prompt])\n",
"chain = LLMChain(llm=chat, prompt=chat_prompt)\n",
"# get a chat completion from the formatted messages\n",
"chain.run(\"I love programming.\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,119 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "fe4e96b5",
"metadata": {},
"source": [
"# Streaming\n",
"\n",
"This notebook goes over how to use streaming with a chat model."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e0244f2a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema import (\n",
" HumanMessage,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ad342bfa",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Verse 1:\n",
"Bubbles rising to the top\n",
"A refreshing drink that never stops\n",
"Clear and crisp, it's pure delight\n",
"A taste that's sure to excite\n",
"\n",
"Chorus:\n",
"Sparkling water, oh so fine\n",
"A drink that's always on my mind\n",
"With every sip, I feel alive\n",
"Sparkling water, you're my vibe\n",
"\n",
"Verse 2:\n",
"No sugar, no calories, just pure bliss\n",
"A drink that's hard to resist\n",
"It's the perfect way to quench my thirst\n",
"A drink that always comes first\n",
"\n",
"Chorus:\n",
"Sparkling water, oh so fine\n",
"A drink that's always on my mind\n",
"With every sip, I feel alive\n",
"Sparkling water, you're my vibe\n",
"\n",
"Bridge:\n",
"From the mountains to the sea\n",
"Sparkling water, you're the key\n",
"To a healthy life, a happy soul\n",
"A drink that makes me feel whole\n",
"\n",
"Chorus:\n",
"Sparkling water, oh so fine\n",
"A drink that's always on my mind\n",
"With every sip, I feel alive\n",
"Sparkling water, you're my vibe\n",
"\n",
"Outro:\n",
"Sparkling water, you're the one\n",
"A drink that's always so much fun\n",
"I'll never let you go, my friend\n",
"Sparkling"
]
}
],
"source": [
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"chat = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"resp = chat([HumanMessage(content=\"Write me a song about sparkling water.\")])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "67c44deb",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,169 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "07c1e3b9",
"metadata": {},
"source": [
"# Vector DB Question/Answering\n",
"\n",
"This example showcases using a chat model to do question answering over a vector database.\n",
"\n",
"This notebook is very similar to the example of using an LLM in the ChatVectorDBChain. The only differences here are (1) using a ChatModel, and (2) passing in a ChatPromptTemplate (optimized for chat models)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "82525493",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores import Chroma\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.chains import VectorDBQA"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "5c7049db",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running Chroma using direct local API.\n",
"Using DuckDB in-memory for database. Data will be transient.\n"
]
}
],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"docsearch = Chroma.from_documents(texts, embeddings)"
]
},
{
"cell_type": "markdown",
"id": "35f99145",
"metadata": {},
"source": [
"We can now set up the chat model and chat model specific prompt"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "32a49412",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import (\n",
" AIMessage,\n",
" HumanMessage,\n",
" SystemMessage\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f231fb9b",
"metadata": {},
"outputs": [],
"source": [
"system_template=\"\"\"Use the following pieces of context to answer the users question. \n",
"If you don't know the answer, just say that you don't know, don't try to make up an answer.\n",
"----------------\n",
"{context}\"\"\"\n",
"messages = [\n",
" SystemMessagePromptTemplate.from_template(system_template),\n",
" HumanMessagePromptTemplate.from_template(\"{question}\")\n",
"]\n",
"prompt = ChatPromptTemplate.from_messages(messages)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3018f865",
"metadata": {},
"outputs": [],
"source": [
"chain_type_kwargs = {\"prompt\": prompt}\n",
"qa = VectorDBQA.from_chain_type(llm=ChatOpenAI(), chain_type=\"stuff\", vectorstore=docsearch, chain_type_kwargs=chain_type_kwargs)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "032a47f8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"The President nominated Ketanji Brown Jackson as a Judge for the United States Supreme Court. He described her as one of the nation's top legal minds and a former top litigator in private practice, a former federal public defender, and a consensus builder.\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"qa.run(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b403637",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
},
"vscode": {
"interpreter": {
"hash": "b1677b440931f40d89ef8be7bf03acb108ce003de0ac9b18e8d43753ea2e7103"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,218 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "efc5be67",
"metadata": {},
"source": [
"# VectorDB Question Answering with Sources\n",
"\n",
"This notebook goes over how to do question-answering with sources with a chat model over a vector database. It does this by using the `VectorDBQAWithSourcesChain`, which does the lookup of the documents from a vector database. \n",
"\n",
"This notebook is very similar to the example of using an LLM in the ChatVectorDBChain. The only differences here are (1) using a ChatModel, and (2) passing in a ChatPromptTemplate (optimized for chat models)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1c613960",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.embeddings.cohere import CohereEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch\n",
"from langchain.vectorstores import Chroma"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "17d1306e",
"metadata": {},
"outputs": [],
"source": [
"with open('../../state_of_the_union.txt') as f:\n",
" state_of_the_union = f.read()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_text(state_of_the_union)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "0e745d99",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running Chroma using direct local API.\n",
"Using DuckDB in-memory for database. Data will be transient.\n"
]
}
],
"source": [
"docsearch = Chroma.from_texts(texts, embeddings, metadatas=[{\"source\": f\"{i}-pl\"} for i in range(len(texts))])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "8aa571ae",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import VectorDBQAWithSourcesChain"
]
},
{
"cell_type": "markdown",
"id": "1f73b14a",
"metadata": {},
"source": [
"We can now set up the chat model and chat model specific prompt"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9643c775",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import (\n",
" AIMessage,\n",
" HumanMessage,\n",
" SystemMessage\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "ed00e906",
"metadata": {},
"outputs": [],
"source": [
"system_template=\"\"\"Use the following pieces of context to answer the users question. \n",
"If you don't know the answer, just say that you don't know, don't try to make up an answer.\n",
"ALWAYS return a \"SOURCES\" part in your answer.\n",
"The \"SOURCES\" part should be a reference to the source of the document from which you got your answer.\n",
"\n",
"Example of your response should be:\n",
"\n",
"```\n",
"The answer is foo\n",
"SOURCES: xyz\n",
"```\n",
"\n",
"Begin!\n",
"----------------\n",
"{summaries}\"\"\"\n",
"messages = [\n",
" SystemMessagePromptTemplate.from_template(system_template),\n",
" HumanMessagePromptTemplate.from_template(\"{question}\")\n",
"]\n",
"prompt = ChatPromptTemplate.from_messages(messages)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "aa859d4c",
"metadata": {},
"outputs": [],
"source": [
"chain_type_kwargs = {\"prompt\": prompt}\n",
"chain = VectorDBQAWithSourcesChain.from_chain_type(\n",
" ChatOpenAI(temperature=0), \n",
" chain_type=\"stuff\", \n",
" vectorstore=docsearch,\n",
" chain_type_kwargs=chain_type_kwargs\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "8ba36fa7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'answer': 'The President honored Justice Stephen Breyer, an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court, for his dedicated service to the country. \\n',\n",
" 'sources': '30-pl'}"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain({\"question\": \"What did the president say about Justice Breyer\"}, return_only_outputs=True)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "c91fdc8a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'answer': ' The president honored Justice Stephen Breyer for his service.\\n',\n",
" 'sources': '30-pl'}"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qa({\"question\": \"What did the president say about Justice Breyer\"}, return_only_outputs=True)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
},
"vscode": {
"interpreter": {
"hash": "b1677b440931f40d89ef8be7bf03acb108ce003de0ac9b18e8d43753ea2e7103"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,380 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"# Getting Started\n",
"\n",
"This notebook covers how to get started with chat models. The interface is based around messages rather than raw text."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "522686de",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain import PromptTemplate, LLMChain\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import (\n",
" AIMessage,\n",
" HumanMessage,\n",
" SystemMessage\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "62e0dbc3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chat = ChatOpenAI(temperature=0)"
]
},
{
"cell_type": "markdown",
"id": "bbaec18e-3684-4eef-955f-c1cec8bf765d",
"metadata": {},
"source": [
"You can get chat completions by passing one or more messages to the chat model. The response will be a message. The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage` -- `ChatMessage` takes in an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "76a6e7b0-e927-4bfb-a414-1332a4149106",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'aime programmer.\", additional_kwargs={})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat([HumanMessage(content=\"Translate this sentence from English to French. I love programming.\")])"
]
},
{
"cell_type": "markdown",
"id": "a62153d4-1211-411b-a493-3febfe446ae0",
"metadata": {},
"source": [
"OpenAI's chat model supports multiple messages as input. See [here](https://platform.openai.com/docs/guides/chat/chat-vs-completions) for more information. Here is an example of sending a system and user message to the chat model:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ce16ad78-8e6f-48cd-954e-98be75eb5836",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'aime programmer.\", additional_kwargs={})"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" SystemMessage(content=\"You are a helpful assistant that translates English to French.\"),\n",
" HumanMessage(content=\"Translate this sentence from English to French. I love programming.\")\n",
"]\n",
"chat(messages)"
]
},
{
"cell_type": "markdown",
"id": "36dc8d7e-bd25-47ac-8c1b-60e3422603d3",
"metadata": {},
"source": [
"You can go one step further and generate completions for multiple sets of messages using `generate`. This returns an `LLMResult` with an additional `message` parameter."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2b21fc52-74b6-4950-ab78-45d12c68fb4d",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[ChatGeneration(text=\"J'aime programmer.\", generation_info=None, message=AIMessage(content=\"J'aime programmer.\", additional_kwargs={}))], [ChatGeneration(text=\"J'aime l'intelligence artificielle.\", generation_info=None, message=AIMessage(content=\"J'aime l'intelligence artificielle.\", additional_kwargs={}))]], llm_output=None)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"batch_messages = [\n",
" [\n",
" SystemMessage(content=\"You are a helpful assistant that translates English to French.\"),\n",
" HumanMessage(content=\"Translate this sentence from English to French. I love programming.\")\n",
" ],\n",
" [\n",
" SystemMessage(content=\"You are a helpful assistant that translates English to French.\"),\n",
" HumanMessage(content=\"Translate this sentence from English to French. I love artificial intelligence.\")\n",
" ],\n",
"]\n",
"chat.generate(batch_messages)"
]
},
{
"cell_type": "markdown",
"id": "b10b00ef-f373-4bc3-8302-2dfc28033734",
"metadata": {},
"source": [
"## PromptTemplates"
]
},
{
"cell_type": "markdown",
"id": "778f912a-66ea-4a5d-b3de-6c7db4baba26",
"metadata": {},
"source": [
"You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use `ChatPromptTemplate`'s `format_prompt` -- this returns a `PromptValue`, which you can convert to a string or Message object, depending on whether you want to use the formatted value as input to an llm or chat model.\n",
"\n",
"For convience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "180c5cc8",
"metadata": {},
"outputs": [],
"source": [
"template=\"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
"system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n",
"human_template=\"{text}\"\n",
"human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "fbb043e6",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'adore la programmation.\", additional_kwargs={})"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])\n",
"\n",
"# get a chat completion from the formatted messages\n",
"chat(chat_prompt.format_prompt(input_language=\"English\", output_language=\"French\", text=\"I love programming.\").to_messages())"
]
},
{
"cell_type": "markdown",
"id": "e28b98da",
"metadata": {},
"source": [
"If you wanted to construct the MessagePromptTemplate more directly, you could create a PromptTemplate outside and then pass it in, eg:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d5b1ab1c",
"metadata": {},
"outputs": [],
"source": [
"prompt=PromptTemplate(\n",
" template=\"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" input_variables=[\"input_language\", \"output_language\"],\n",
")\n",
"system_message_prompt = SystemMessagePromptTemplate(prompt=prompt)"
]
},
{
"cell_type": "markdown",
"id": "92af0bba",
"metadata": {},
"source": [
"## LLMChain\n",
"You can use the existing LLMChain in a very similar way to before - provide a prompt and a model."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "f2cbfe3d",
"metadata": {},
"outputs": [],
"source": [
"chain = LLMChain(llm=chat, prompt=chat_prompt)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "268543b1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"J'adore la programmation.\""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(input_language=\"English\", output_language=\"French\", text=\"I love programming.\")"
]
},
{
"cell_type": "markdown",
"id": "eb779f3f",
"metadata": {},
"source": [
"## Streaming\n",
"\n",
"Streaming is supported for `ChatOpenAI` through callback handling."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "509181be",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Verse 1:\n",
"Bubbles rising to the top\n",
"A refreshing drink that never stops\n",
"Clear and crisp, it's pure delight\n",
"A taste that's sure to excite\n",
"\n",
"Chorus:\n",
"Sparkling water, oh so fine\n",
"A drink that's always on my mind\n",
"With every sip, I feel alive\n",
"Sparkling water, you're my vibe\n",
"\n",
"Verse 2:\n",
"No sugar, no calories, just pure bliss\n",
"A drink that's hard to resist\n",
"It's the perfect way to quench my thirst\n",
"A drink that always comes first\n",
"\n",
"Chorus:\n",
"Sparkling water, oh so fine\n",
"A drink that's always on my mind\n",
"With every sip, I feel alive\n",
"Sparkling water, you're my vibe\n",
"\n",
"Bridge:\n",
"From the mountains to the sea\n",
"Sparkling water, you're the key\n",
"To a healthy life, a happy soul\n",
"A drink that makes me feel whole\n",
"\n",
"Chorus:\n",
"Sparkling water, oh so fine\n",
"A drink that's always on my mind\n",
"With every sip, I feel alive\n",
"Sparkling water, you're my vibe\n",
"\n",
"Outro:\n",
"Sparkling water, you're the one\n",
"A drink that's always so much fun\n",
"I'll never let you go, my friend\n",
"Sparkling"
]
}
],
"source": [
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"chat = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"resp = chat([HumanMessage(content=\"Write me a song about sparkling water.\")])\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c095285d",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,10 @@
How-To Guides
=============
The examples here address common "how-to" questions about working with chat models.
.. toctree::
:maxdepth: 1
:glob:
./examples/*

@ -0,0 +1,29 @@
# Key Concepts
## ChatMessage
A chat message is what we refer to as the modular unit of information.
At the moment, this consists of "content", which refers to the content of the chat message.
At the moment, most chat models are trained to predict sequences of Human <> AI messages.
This is because so far the primary interaction mode has been between a human user and a singular AI system.
At the moment, there are four different classes of Chat Messages:
### HumanMessage
A HumanMessage is a ChatMessage that is sent as if from a Human's point of view.
### AIMessage
An AIMessage is a ChatMessage that is sent from the point of view of the AI system to which the Human is corresponding.
### SystemMessage
A SystemMessage is still a bit ambiguous, and so far seems to be a concept unique to OpenAI.
### ChatMessage
A ChatMessage is a generic message with not only a "content" field but also a "role" field.
With this field, arbitrary roles may be assigned to a message.
## ChatGeneration
The output of a single prediction of a chat message.
Currently this is just a chat message itself (e.g. content and a role).
## Chat Model
A model which takes in a list of chat messages, and predicts a chat message in response.
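
As a rough sketch of how these concepts map onto code (based on the getting started notebook; the exact constructor arguments are assumptions and may shift while the abstractions settle):

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import ChatMessage, HumanMessage, SystemMessage

chat = ChatOpenAI(temperature=0)

# A chat model takes a list of chat messages...
messages = [
    SystemMessage(content="You are a helpful assistant that translates English to French."),
    HumanMessage(content="I love programming."),
]

# ...and predicts a chat message (an AIMessage) in response.
ai_message = chat(messages)
print(ai_message.content)

# A ChatMessage is the generic form: "content" plus an arbitrary "role" (assumed constructor).
example = ChatMessage(role="example_user", content="Hi")

# generate() wraps each predicted message in a ChatGeneration inside an LLMResult.
result = chat.generate([messages])
prediction = result.generations[0][0].message
```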

@ -21,7 +21,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 1,
"id": "e9db25f3",
"metadata": {},
"outputs": [],
@ -81,17 +81,17 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 7,
"id": "5cfa89b2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" In response to Russia's aggression in Ukraine, the United States and its allies have imposed economic sanctions and are taking other measures to hold Putin accountable. The US is also providing economic and military assistance to Ukraine, protecting NATO countries, and investing in American products to create jobs. President Biden and Vice President Harris have passed the American Rescue Plan and the Bipartisan Infrastructure Law to help working people and rebuild America.\""
"' In response to Russian aggression in Ukraine, the United States and its allies are taking action to hold Putin accountable, including economic sanctions, asset seizures, and military assistance. The US is also providing economic and humanitarian aid to Ukraine, and has passed the American Rescue Plan and the Bipartisan Infrastructure Law to help struggling families and create jobs. The US remains unified and determined to protect Ukraine and the free world.'"
]
},
"execution_count": 14,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@ -470,7 +470,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
},
"vscode": {
"interpreter": {

@ -9,7 +9,7 @@
"\n",
"LangChain provides async support for LLMs by leveraging the [asyncio](https://docs.python.org/3/library/asyncio.html) library.\n",
"\n",
"Async support is particularly useful for calling multiple LLMs concurrently, as these calls are network-bound. Currently, only `OpenAI` `OpenAIChat`, and `PromptLayerOpenAI` are supported, but async support for other LLMs is on the roadmap.\n",
"Async support is particularly useful for calling multiple LLMs concurrently, as these calls are network-bound. Currently, only `OpenAI` and `PromptLayerOpenAI` are supported, but async support for other LLMs is on the roadmap.\n",
"\n",
"You can use the `agenerate` method to call an OpenAI LLM asynchronously."
]
@ -28,66 +28,65 @@
"text": [
"\n",
"\n",
"As an AI language model, I don't have feelings like humans, but I'm functioning properly. How may I assist you?\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"\n",
"I'm an AI language model, so I don't have emotions, but I'm functioning properly. How may I assist you today?\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"\n",
"As an AI language model, I do not have emotions like humans, but I'm functioning normally. How can I assist you today?\n",
"I'm doing well, how about you?\n",
"\n",
"\n",
"I am an AI language model, so I do not have feelings, but I am here to assist you. How may I help you today?\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"\n",
"As an AI language model, I do not have feelings or emotions but I'm always ready to assist you. How may I assist you today?\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"\n",
"As an AI language model, I don't have feelings, but I'm functioning normally. How may I assist you today?\n",
"I'm doing well, thank you. How about yourself?\n",
"\n",
"\n",
"As an AI language model, I don't have feelings, but I'm functioning properly. Thank you. How may I assist you today?\n",
"I'm doing well, thank you! How about you?\n",
"\n",
"\n",
"As an AI language model, I don't have emotions, so I don't have a specific feeling or emotion. How can I assist you today?\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"\n",
"As an AI language model, I do not have feelings or emotions. However, I am functioning as intended and ready to assist you with any queries you may have. How can I be of assistance today?\n",
"I'm doing well, thank you! How about you?\n",
"\n",
"\n",
"As an AI language model, I do not have feelings, but I am functioning well. Thank you for asking. How can I assist you today?\n",
"\u001b[1mConcurrent executed in 0.92 seconds.\u001b[0m\n",
"I'm doing well, thank you. How about you?\n",
"\u001b[1mConcurrent executed in 1.39 seconds.\u001b[0m\n",
"\n",
"\n",
"As an AI language model, I don't have feelings, but I'm functioning well. How can I assist you today?\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"\n",
"As an AI language model, I don't have feelings, but I'm functioning well. Thank you for asking. How may I assist you today?\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"I'm an AI language model, so I don't have feelings, but I'm functioning well. How can I assist you today?\n",
"\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"As an AI language model, I don't have feelings, but I'm functioning well. Thank you for asking. How may I assist you today?\n",
"\n",
"I'm doing well, thank you. How about yourself?\n",
"\n",
"As an AI language model, I don't have feelings, but I am functioning well. How can I assist you today?\n",
"\n",
"I'm doing well, thanks for asking. How about you?\n",
"\n",
"As an AI language model, I don't have feelings but I'm functioning well. How can I assist you today?\n",
"\n",
"I'm doing well, thanks! How about you?\n",
"\n",
"As an AI language model, I do not have personal emotions. However, I am functioning well and ready to assist you with any queries or tasks you have. How may I assist you today?\n",
"\n",
"I'm doing well, thank you. How about you?\n",
"\n",
"As an AI language model, I do not have feelings or emotions, but I'm functioning well. How can I assist you today?\n",
"\n",
"I'm doing well, thank you. How about yourself?\n",
"\n",
"I am an AI language model and do not have feelings. But I am functioning properly and ready to assist you with any task. How may I help you today?\n",
"\n",
"\n",
"As an AI language model, I do not have emotions, but I am functioning well. How can I assist you today?\n",
"\u001b[1mSerial executed in 5.00 seconds.\u001b[0m\n"
"I'm doing well, thanks for asking. How about you?\n",
"\u001b[1mSerial executed in 5.77 seconds.\u001b[0m\n"
]
}
],
@ -95,10 +94,10 @@
"import time\n",
"import asyncio\n",
"\n",
"from langchain.llms import OpenAIChat\n",
"from langchain.llms import OpenAI\n",
"\n",
"def generate_serially():\n",
" llm = OpenAIChat(temperature=0.9)\n",
" llm = OpenAI(temperature=0.9)\n",
" for _ in range(10):\n",
" resp = llm.generate([\"Hello, how are you?\"])\n",
" print(resp.generations[0][0].text)\n",
@ -110,7 +109,7 @@
"\n",
"\n",
"async def generate_concurrently():\n",
" llm = OpenAIChat(temperature=0.9)\n",
" llm = OpenAI(temperature=0.9)\n",
" tasks = [async_generate(llm) for _ in range(10)]\n",
" await asyncio.gather(*tasks)\n",
"\n",
@ -152,7 +151,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.9.1"
}
},
"nbformat": 4,

@ -1,245 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"# OpenAIChat\n",
"\n",
"OpenAI also has a [chat model](https://platform.openai.com/docs/guides/chat) you can use. The interface is very similar to the normal OpenAI model."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "522686de",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import OpenAIChat\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "62e0dbc3",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAIChat(temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "fbb043e6",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "3f945b76",
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "25260808",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nJustin Bieber was born on March 1, 1994. \\n\\nThe Super Bowl is played in February of each year. \\n\\nTherefore, the Super Bowl that was played in the year Justin Bieber was born was Super Bowl XXVIII, which was played on January 30, 1994. \\n\\nThe Dallas Cowboys won Super Bowl XXVIII by defeating the Buffalo Bills with a score of 30-13.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
},
{
"cell_type": "markdown",
"id": "75a05b79",
"metadata": {},
"source": [
"## Prefix Messages\n",
"\n",
"OpenAI Chat also supports the idea of [prefix messages](https://platform.openai.com/docs/guides/chat/chat-vs-completions), eg messages that would appear before the user input. These can be used as system messages to give more context/purpose the LLM."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "c27a1501",
"metadata": {},
"outputs": [],
"source": [
"prefix_messages = [{\"role\": \"system\", \"content\": \"You are a helpful assistant that is very good at problem solving who thinks step by step.\"}]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "e46a914e",
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAIChat(temperature=0, prefix_messages=prefix_messages)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d683d9f2",
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "6f5b8e78",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Step 1: Justin Bieber was born on March 1, 1994.\\nStep 2: The Super Bowl is played in February of each year.\\nStep 3: Therefore, the Super Bowl that was played in the year Justin Bieber was born was Super Bowl XXVIII, which was played on January 30, 1994.\\nStep 4: The team that won Super Bowl XXVIII was the Dallas Cowboys.'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
},
{
"cell_type": "markdown",
"id": "f6d5dda8",
"metadata": {},
"source": [
"## Async"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "1973b9bb",
"metadata": {},
"outputs": [],
"source": [
"result = await llm_chain.arun(question)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "5815178f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Step 1: Justin Bieber was born on March 1, 1994.\\nStep 2: The Super Bowl is played in February of each year.\\nStep 3: Therefore, the Super Bowl that was played in the year Justin Bieber was born was Super Bowl XXVIII, which was played on January 30, 1994.\\nStep 4: The team that won Super Bowl XXVIII was the Dallas Cowboys.'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result"
]
},
{
"cell_type": "markdown",
"id": "eb779f3f",
"metadata": {},
"source": [
"## Streaming"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "509181be",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Justin Bieber was born on March 1, 1994. The NFL team that won the Super Bowl in the same year was the Dallas Cowboys. They defeated the Buffalo Bills 30-13 in Super Bowl XXVIII on January 30, 1994."
]
}
],
"source": [
"\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"llm = OpenAIChat(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"resp = llm(question)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c095285d",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -7,7 +7,7 @@
"source": [
"# Streaming with LLMs\n",
"\n",
"LangChain provides streaming support for LLMs. Currently, we only support streaming for the `OpenAI` and `OpenAIChat` LLM implementation, but streaming support for other LLM implementations is on the roadmap. To utilize streaming, use a [`CallbackHandler`](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/base.py) that implements `on_llm_new_token`. In this example, we are using [`StreamingStdOutCallbackHandler`]()."
"LangChain provides streaming support for LLMs. Currently, we only support streaming for the `OpenAI` and `ChatOpenAI` LLM implementation, but streaming support for other LLM implementations is on the roadmap. To utilize streaming, use a [`CallbackHandler`](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/base.py) that implements `on_llm_new_token`. In this example, we are using [`StreamingStdOutCallbackHandler`]()."
]
},
{
@ -63,9 +63,11 @@
}
],
"source": [
"from langchain.llms import OpenAI, OpenAIChat\n",
"from langchain.llms import OpenAI\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain.schema import HumanMessage\n",
"\n",
"\n",
"llm = OpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
@ -84,7 +86,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 6,
"id": "a35373f1-9ee6-4753-a343-5aee749b8527",
"metadata": {
"tags": []
@ -106,7 +108,7 @@
"LLMResult(generations=[[Generation(text='\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', generation_info={'finish_reason': None, 'logprobs': None})]], llm_output={'token_usage': {}})"
]
},
"execution_count": 3,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@ -120,12 +122,12 @@
"id": "a93a4d61-0476-49db-8321-7de92bd74059",
"metadata": {},
"source": [
"Here's an example with `OpenAIChat`:"
"Here's an example with `ChatOpenAI`:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "22665f16-e05b-473c-a4bd-ad75744ea024",
"metadata": {
"tags": []
@ -177,13 +179,13 @@
"Sparkling water, you're the one\n",
"A drink that's always so much fun\n",
"I'll never let you go, my friend\n",
"Sparkling water, until the end."
"Sparkling"
]
}
],
"source": [
"llm = OpenAIChat(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"resp = llm(\"Write me a song about sparkling water.\")"
"chat = ChatOpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"resp = chat([HumanMessage(content=\"Write me a song about sparkling water.\")])"
]
},
{

@ -6,7 +6,7 @@ from langchain.chains.constitutional_ai.models import ConstitutionalPrinciple
from langchain.chains.constitutional_ai.prompts import CRITIQUE_PROMPT, REVISION_PROMPT
from langchain.chains.llm import LLMChain
from langchain.llms.base import BaseLLM
from langchain.prompts.prompt import BasePromptTemplate
from langchain.prompts.base import BasePromptTemplate
class ConstitutionalChain(Chain):

@ -1,14 +1,15 @@
"""Chain that just formats a prompt and calls an LLM."""
from __future__ import annotations
from typing import Any, Dict, List, Optional, Sequence, Tuple, Union
from pydantic import BaseModel, Extra
from langchain.chains.base import Chain
from langchain.input import get_colored_text
from langchain.llms.base import BaseLLM
from langchain.prompts.base import BasePromptTemplate
from langchain.prompts.prompt import PromptTemplate
from langchain.schema import LLMResult
from langchain.schema import BaseLanguageModel, LLMResult, PromptValue
class LLMChain(Chain, BaseModel):
@ -27,8 +28,7 @@ class LLMChain(Chain, BaseModel):
prompt: BasePromptTemplate
"""Prompt object to use."""
llm: BaseLLM
"""LLM wrapper to use."""
llm: BaseLanguageModel
output_key: str = "text" #: :meta private:
class Config:
@ -53,21 +53,22 @@ class LLMChain(Chain, BaseModel):
"""
return [self.output_key]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
return self.apply([inputs])[0]
def generate(self, input_list: List[Dict[str, Any]]) -> LLMResult:
"""Generate LLM result from inputs."""
prompts, stop = self.prep_prompts(input_list)
response = self.llm.generate(prompts, stop=stop)
return response
return self.llm.generate_prompt(prompts, stop)
async def agenerate(self, input_list: List[Dict[str, Any]]) -> LLMResult:
"""Generate LLM result from inputs."""
prompts, stop = await self.aprep_prompts(input_list)
response = await self.llm.agenerate(prompts, stop=stop)
return response
return await self.llm.agenerate_prompt(prompts, stop)
def prep_prompts(
self, input_list: List[Dict[str, Any]]
) -> Tuple[List[str], Optional[List[str]]]:
) -> Tuple[List[PromptValue], Optional[List[str]]]:
"""Prepare prompts from inputs."""
stop = None
if "stop" in input_list[0]:
@ -75,8 +76,8 @@ class LLMChain(Chain, BaseModel):
prompts = []
for inputs in input_list:
selected_inputs = {k: inputs[k] for k in self.prompt.input_variables}
prompt = self.prompt.format(**selected_inputs)
_colored_text = get_colored_text(prompt, "green")
prompt = self.prompt.format_prompt(**selected_inputs)
_colored_text = get_colored_text(prompt.to_string(), "green")
_text = "Prompt after formatting:\n" + _colored_text
self.callback_manager.on_text(_text, end="\n", verbose=self.verbose)
if "stop" in inputs and inputs["stop"] != stop:
@ -88,7 +89,7 @@ class LLMChain(Chain, BaseModel):
async def aprep_prompts(
self, input_list: List[Dict[str, Any]]
) -> Tuple[List[str], Optional[List[str]]]:
) -> Tuple[List[PromptValue], Optional[List[str]]]:
"""Prepare prompts from inputs."""
stop = None
if "stop" in input_list[0]:
@ -96,8 +97,8 @@ class LLMChain(Chain, BaseModel):
prompts = []
for inputs in input_list:
selected_inputs = {k: inputs[k] for k in self.prompt.input_variables}
prompt = self.prompt.format(**selected_inputs)
_colored_text = get_colored_text(prompt, "green")
prompt = self.prompt.format_prompt(**selected_inputs)
_colored_text = get_colored_text(prompt.to_string(), "green")
_text = "Prompt after formatting:\n" + _colored_text
if self.callback_manager.is_async:
await self.callback_manager.on_text(
@ -130,13 +131,8 @@ class LLMChain(Chain, BaseModel):
for generation in response.generations
]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, str]:
known_values = self.prep_inputs(inputs.copy())
return self.apply([known_values])[0]
async def _acall(self, inputs: Dict[str, Any]) -> Dict[str, str]:
known_values = self.prep_inputs(inputs.copy())
return (await self.aapply([known_values]))[0]
return (await self.aapply([inputs]))[0]
def predict(self, **kwargs: Any) -> str:
"""Format prompt with kwargs and pass to LLM.
@ -207,7 +203,7 @@ class LLMChain(Chain, BaseModel):
return "llm_chain"
@classmethod
def from_string(cls, llm: BaseLLM, template: str) -> Chain:
def from_string(cls, llm: BaseLanguageModel, template: str) -> Chain:
"""Create LLMChain from LLM and template."""
prompt_template = PromptTemplate.from_template(template)
return cls(llm=llm, prompt=prompt_template)

@ -3,7 +3,7 @@
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import Any, Dict, List
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
@ -62,10 +62,17 @@ class BaseQAWithSourcesChain(Chain, BaseModel, ABC):
@classmethod
def from_chain_type(
cls, llm: BaseLLM, chain_type: str = "stuff", **kwargs: Any
cls,
llm: BaseLLM,
chain_type: str = "stuff",
chain_type_kwargs: Optional[dict] = None,
**kwargs: Any,
) -> BaseQAWithSourcesChain:
"""Load chain from chain type."""
combine_document_chain = load_qa_with_sources_chain(llm, chain_type=chain_type)
_chain_kwargs = chain_type_kwargs or {}
combine_document_chain = load_qa_with_sources_chain(
llm, chain_type=chain_type, **_chain_kwargs
)
return cls(combine_documents_chain=combine_document_chain, **kwargs)
class Config:

@ -0,0 +1,3 @@
from langchain.chat_models.openai import ChatOpenAI
__all__ = ["ChatOpenAI"]

@ -0,0 +1,105 @@
from abc import ABC, abstractmethod
from typing import List, Optional
from pydantic import BaseModel, Extra, Field, validator
import langchain
from langchain.callbacks import get_callback_manager
from langchain.callbacks.base import BaseCallbackManager
from langchain.schema import (
AIMessage,
BaseLanguageModel,
BaseMessage,
ChatGeneration,
ChatResult,
LLMResult,
PromptValue,
)
def _get_verbosity() -> bool:
return langchain.verbose
class BaseChatModel(BaseLanguageModel, BaseModel, ABC):
verbose: bool = Field(default_factory=_get_verbosity)
"""Whether to print out response text."""
callback_manager: BaseCallbackManager = Field(default_factory=get_callback_manager)
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
@validator("callback_manager", pre=True, always=True)
def set_callback_manager(
cls, callback_manager: Optional[BaseCallbackManager]
) -> BaseCallbackManager:
"""If callback manager is None, set it.
This allows users to pass in None as callback manager, which is a nice UX.
"""
return callback_manager or get_callback_manager()
def generate(
self, messages: List[List[BaseMessage]], stop: Optional[List[str]] = None
) -> LLMResult:
"""Top Level call"""
results = []
for m in messages:
results.append(self._generate(m, stop=stop))
return LLMResult(generations=[res.generations for res in results])
async def agenerate(
self, messages: List[List[BaseMessage]], stop: Optional[List[str]] = None
) -> LLMResult:
results = []
for m in messages:
results.append(await self._agenerate(m, stop=stop))
return LLMResult(generations=[res.generations for res in results])
def generate_prompt(
self, prompts: List[PromptValue], stop: Optional[List[str]] = None
) -> LLMResult:
prompt_messages = [p.to_messages() for p in prompts]
return self.generate(prompt_messages, stop=stop)
async def agenerate_prompt(
self, prompts: List[PromptValue], stop: Optional[List[str]] = None
) -> LLMResult:
prompt_messages = [p.to_messages() for p in prompts]
return await self.agenerate(prompt_messages, stop=stop)
@abstractmethod
def _generate(
self, messages: List[BaseMessage], stop: Optional[List[str]] = None
) -> ChatResult:
"""Top Level call"""
@abstractmethod
async def _agenerate(
self, messages: List[BaseMessage], stop: Optional[List[str]] = None
) -> ChatResult:
"""Top Level call"""
def __call__(
self, messages: List[BaseMessage], stop: Optional[List[str]] = None
) -> BaseMessage:
return self._generate(messages, stop=stop).generations[0].message
class SimpleChatModel(BaseChatModel):
def _generate(
self, messages: List[BaseMessage], stop: Optional[List[str]] = None
) -> ChatResult:
output_str = self._call(messages, stop=stop)
message = AIMessage(content=output_str)
generation = ChatGeneration(message=message)
return ChatResult(generations=[generation])
@abstractmethod
def _call(
self, messages: List[BaseMessage], stop: Optional[List[str]] = None
) -> str:
"""Simpler interface."""

@ -0,0 +1,319 @@
"""OpenAI chat wrapper."""
from __future__ import annotations
import logging
import sys
from typing import Any, Callable, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, Field, root_validator
from tenacity import (
before_sleep_log,
retry,
retry_if_exception_type,
stop_after_attempt,
wait_exponential,
)
from langchain.chat_models.base import BaseChatModel
from langchain.schema import (
AIMessage,
BaseMessage,
ChatGeneration,
ChatMessage,
ChatResult,
HumanMessage,
SystemMessage,
)
from langchain.utils import get_from_dict_or_env
logger = logging.getLogger(__file__)
def _create_retry_decorator(llm: ChatOpenAI) -> Callable[[Any], Any]:
import openai
min_seconds = 4
max_seconds = 10
# Wait 2^x * 1 second between each retry starting with
# 4 seconds, then up to 10 seconds, then 10 seconds afterwards
return retry(
reraise=True,
stop=stop_after_attempt(llm.max_retries),
wait=wait_exponential(multiplier=1, min=min_seconds, max=max_seconds),
retry=(
retry_if_exception_type(openai.error.Timeout)
| retry_if_exception_type(openai.error.APIError)
| retry_if_exception_type(openai.error.APIConnectionError)
| retry_if_exception_type(openai.error.RateLimitError)
| retry_if_exception_type(openai.error.ServiceUnavailableError)
),
before_sleep=before_sleep_log(logger, logging.WARNING),
)
async def acompletion_with_retry(llm: ChatOpenAI, **kwargs: Any) -> Any:
"""Use tenacity to retry the async completion call."""
retry_decorator = _create_retry_decorator(llm)
@retry_decorator
async def _completion_with_retry(**kwargs: Any) -> Any:
# Use OpenAI's async api https://github.com/openai/openai-python#async-api
return await llm.client.acreate(**kwargs)
return await _completion_with_retry(**kwargs)
def _convert_dict_to_message(_dict: dict) -> BaseMessage:
role = _dict["role"]
if role == "user":
return HumanMessage(content=_dict["content"])
elif role == "assistant":
return AIMessage(content=_dict["content"])
elif role == "system":
return SystemMessage(content=_dict["content"])
else:
return ChatMessage(content=_dict["content"], role=role)
def _convert_message_to_dict(message: BaseMessage) -> dict:
if isinstance(message, ChatMessage):
message_dict = {"role": message.role, "content": message.content}
elif isinstance(message, HumanMessage):
message_dict = {"role": "user", "content": message.content}
elif isinstance(message, AIMessage):
message_dict = {"role": "assistant", "content": message.content}
elif isinstance(message, SystemMessage):
message_dict = {"role": "system", "content": message.content}
else:
raise ValueError(f"Got unknown type {message}")
if "name" in message.additional_kwargs:
message_dict["name"] = message.additional_kwargs["name"]
return message_dict
class ChatOpenAI(BaseChatModel, BaseModel):
"""Wrapper around OpenAI Chat large language models.
To use, you should have the ``openai`` python package installed, and the
environment variable ``OPENAI_API_KEY`` set with your API key.
Any parameters that are valid to be passed to the openai.create call can be passed
in, even if not explicitly saved on this class.
Example:
.. code-block:: python
from langchain.chat_models import ChatOpenAI
openai = ChatOpenAI(model_name="gpt-3.5-turbo")
"""
client: Any #: :meta private:
model_name: str = "gpt-3.5-turbo"
"""Model name to use."""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Holds any model parameters valid for `create` call not explicitly specified."""
openai_api_key: Optional[str] = None
max_retries: int = 6
"""Maximum number of retries to make when generating."""
streaming: bool = False
"""Whether to stream the results or not."""
n: int = 1
"""Number of chat completions to generate for each prompt."""
max_tokens: int = 256
"""Maximum number of tokens to generate."""
class Config:
"""Configuration for this pydantic object."""
extra = Extra.ignore
@root_validator(pre=True)
def build_extra(cls, values: Dict[str, Any]) -> Dict[str, Any]:
"""Build extra kwargs from additional params that were passed in."""
all_required_field_names = {field.alias for field in cls.__fields__.values()}
extra = values.get("model_kwargs", {})
for field_name in list(values):
if field_name not in all_required_field_names:
if field_name in extra:
raise ValueError(f"Found {field_name} supplied twice.")
extra[field_name] = values.pop(field_name)
values["model_kwargs"] = extra
return values
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
openai_api_key = get_from_dict_or_env(
values, "openai_api_key", "OPENAI_API_KEY"
)
try:
import openai
openai.api_key = openai_api_key
except ImportError:
raise ValueError(
"Could not import openai python package. "
"Please it install it with `pip install openai`."
)
try:
values["client"] = openai.ChatCompletion
except AttributeError:
raise ValueError(
"`openai` has no `ChatCompletion` attribute, this is likely "
"due to an old version of the openai package. Try upgrading it "
"with `pip install --upgrade openai`."
)
if values["n"] < 1:
raise ValueError("n must be at least 1.")
if values["n"] > 1 and values["streaming"]:
raise ValueError("n must be 1 when streaming.")
return values
@property
def _default_params(self) -> Dict[str, Any]:
"""Get the default parameters for calling OpenAI API."""
return {
"model": self.model_name,
"max_tokens": self.max_tokens,
"stream": self.streaming,
"n": self.n,
**self.model_kwargs,
}
def _create_retry_decorator(self) -> Callable[[Any], Any]:
import openai
min_seconds = 4
max_seconds = 10
# Wait 2^x * 1 second between each retry starting with
# 4 seconds, then up to 10 seconds, then 10 seconds afterwards
return retry(
reraise=True,
stop=stop_after_attempt(self.max_retries),
wait=wait_exponential(multiplier=1, min=min_seconds, max=max_seconds),
retry=(
retry_if_exception_type(openai.error.Timeout)
| retry_if_exception_type(openai.error.APIError)
| retry_if_exception_type(openai.error.APIConnectionError)
| retry_if_exception_type(openai.error.RateLimitError)
| retry_if_exception_type(openai.error.ServiceUnavailableError)
),
before_sleep=before_sleep_log(logger, logging.WARNING),
)
def completion_with_retry(self, **kwargs: Any) -> Any:
"""Use tenacity to retry the completion call."""
retry_decorator = self._create_retry_decorator()
@retry_decorator
def _completion_with_retry(**kwargs: Any) -> Any:
return self.client.create(**kwargs)
return _completion_with_retry(**kwargs)
def _generate(
self, messages: List[BaseMessage], stop: Optional[List[str]] = None
) -> ChatResult:
params: Dict[str, Any] = {**{"model": self.model_name}, **self._default_params}
if stop is not None:
if "stop" in params:
raise ValueError("`stop` found in both the input and default params.")
params["stop"] = stop
message_dicts = [_convert_message_to_dict(m) for m in messages]
if self.streaming:
inner_completion = ""
role = "assistant"
params["stream"] = True
for stream_resp in self.completion_with_retry(
messages=message_dicts, **params
):
role = stream_resp["choices"][0]["delta"].get("role", role)
token = stream_resp["choices"][0]["delta"].get("content", "")
inner_completion += token
self.callback_manager.on_llm_new_token(
token,
verbose=self.verbose,
)
message = _convert_dict_to_message(
{"content": inner_completion, "role": role}
)
return ChatResult(generations=[ChatGeneration(message=message)])
response = self.completion_with_retry(messages=message_dicts, **params)
generations = []
for res in response["choices"]:
message = _convert_dict_to_message(res["message"])
gen = ChatGeneration(message=message)
generations.append(gen)
return ChatResult(generations=generations)
async def _agenerate(
self, messages: List[BaseMessage], stop: Optional[List[str]] = None
) -> ChatResult:
params: Dict[str, Any] = {**{"model": self.model_name}, **self._default_params}
if stop is not None:
if "stop" in params:
raise ValueError("`stop` found in both the input and default params.")
params["stop"] = stop
message_dicts = [_convert_message_to_dict(m) for m in messages]
if self.streaming:
inner_completion = ""
role = "assistant"
params["stream"] = True
async for stream_resp in await acompletion_with_retry(
self, messages=message_dicts, **params
):
role = stream_resp["choices"][0]["delta"].get("role", role)
token = stream_resp["choices"][0]["delta"].get("content", "")
inner_completion += token
if self.callback_manager.is_async:
await self.callback_manager.on_llm_new_token(
token,
verbose=self.verbose,
)
else:
self.callback_manager.on_llm_new_token(
token,
verbose=self.verbose,
)
message = _convert_dict_to_message(
{"content": inner_completion, "role": role}
)
return ChatResult(generations=[ChatGeneration(message=message)])
else:
full_response = await acompletion_with_retry(
self, messages=message_dicts, **params
)
generations = []
for res in full_response["choices"]:
message = _convert_dict_to_message(res["message"])
gen = ChatGeneration(message=message)
generations.append(gen)
return ChatResult(generations=generations)
@property
def _identifying_params(self) -> Mapping[str, Any]:
"""Get the identifying parameters."""
return {**{"model_name": self.model_name}, **self._default_params}
def get_num_tokens(self, text: str) -> int:
"""Calculate num tokens with tiktoken package."""
# tiktoken NOT supported for Python 3.8 or below
if sys.version_info[1] <= 8:
return super().get_num_tokens(text)
try:
import tiktoken
except ImportError:
raise ValueError(
"Could not import tiktoken python package. "
"This is needed in order to calculate get_num_tokens. "
"Please it install it with `pip install tiktoken`."
)
# create a GPT-3.5-Turbo encoder instance
enc = tiktoken.encoding_for_model(self.model_name)
# encode the text using the GPT-3.5-Turbo encoder
tokenized_text = enc.encode(text)
# calculate the number of tokens in the encoded text
return len(tokenized_text)

@ -10,7 +10,7 @@ from pydantic import BaseModel, Extra, Field, validator
import langchain
from langchain.callbacks import get_callback_manager
from langchain.callbacks.base import BaseCallbackManager
from langchain.schema import Generation, LLMResult
from langchain.schema import BaseLanguageModel, Generation, LLMResult, PromptValue
def _get_verbosity() -> bool:
@ -53,7 +53,7 @@ def update_cache(
return llm_output
class BaseLLM(BaseModel, ABC):
class BaseLLM(BaseLanguageModel, BaseModel, ABC):
"""LLM wrapper should take in a prompt and return a string."""
cache: Optional[bool] = None
@ -100,6 +100,18 @@ class BaseLLM(BaseModel, ABC):
) -> LLMResult:
"""Run the LLM on the given prompts."""
def generate_prompt(
self, prompts: List[PromptValue], stop: Optional[List[str]] = None
) -> LLMResult:
prompt_strings = [p.to_string() for p in prompts]
return self.generate(prompt_strings, stop=stop)
async def agenerate_prompt(
self, prompts: List[PromptValue], stop: Optional[List[str]] = None
) -> LLMResult:
prompt_strings = [p.to_string() for p in prompts]
return await self.agenerate(prompt_strings, stop=stop)
def generate(
self, prompts: List[str], stop: Optional[List[str]] = None
) -> LLMResult:
@ -229,27 +241,6 @@ class BaseLLM(BaseModel, ABC):
generations = [existing_prompts[i] for i in range(len(prompts))]
return LLMResult(generations=generations, llm_output=llm_output)
def get_num_tokens(self, text: str) -> int:
"""Get the number of tokens present in the text."""
# TODO: this method may not be exact.
# TODO: this method may differ based on model (eg codex).
try:
from transformers import GPT2TokenizerFast
except ImportError:
raise ValueError(
"Could not import transformers python package. "
"This is needed in order to calculate get_num_tokens. "
"Please it install it with `pip install transformers`."
)
# create a GPT-3 tokenizer instance
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
# tokenize the text using the GPT-3 tokenizer
tokenized_text = tokenizer.tokenize(text)
# calculate the number of tokens in the tokenized text
return len(tokenized_text)
def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""Check Cache and run the LLM on the given prompt and input."""
return self.generate([prompt], stop=stop).generations[0][0].text

@ -11,6 +11,7 @@ import yaml
from pydantic import BaseModel, Extra, Field, root_validator
from langchain.formatting import formatter
from langchain.schema import BaseMessage, HumanMessage, PromptValue
def jinja2_formatter(template: str, **kwargs: Any) -> str:
@ -115,6 +116,18 @@ class RegexParser(BaseOutputParser, BaseModel):
}
class StringPromptValue(PromptValue):
text: str
def to_string(self) -> str:
"""Return prompt as string."""
return self.text
def to_messages(self) -> List[BaseMessage]:
"""Return prompt as messages."""
return [HumanMessage(content=self.text)]
class BasePromptTemplate(BaseModel, ABC):
"""Base prompt should expose the format method, returning a prompt."""
@ -132,6 +145,10 @@ class BasePromptTemplate(BaseModel, ABC):
extra = Extra.forbid
arbitrary_types_allowed = True
@abstractmethod
def format_prompt(self, **kwargs: Any) -> PromptValue:
"""Create Chat Messages."""
@root_validator()
def validate_variable_names(cls, values: Dict) -> Dict:
"""Validate variable names do not include restricted names."""
@ -233,3 +250,9 @@ class BasePromptTemplate(BaseModel, ABC):
yaml.dump(prompt_dict, f, default_flow_style=False)
else:
raise ValueError(f"{save_path} must be json or yaml")
class StringPromptTemplate(BasePromptTemplate, ABC):
def format_prompt(self, **kwargs: Any) -> PromptValue:
"""Create Chat Messages."""
return StringPromptValue(text=self.format(**kwargs))

@ -0,0 +1,134 @@
"""Chat prompt template."""
from __future__ import annotations
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Callable, List, Sequence, Tuple, Type, Union
from pydantic import BaseModel, Field
from langchain.prompts.base import BasePromptTemplate, StringPromptTemplate
from langchain.prompts.prompt import PromptTemplate
from langchain.schema import (
AIMessage,
BaseMessage,
ChatMessage,
HumanMessage,
PromptValue,
SystemMessage,
)
class BaseMessagePromptTemplate(BaseModel, ABC):
prompt: StringPromptTemplate
additional_kwargs: dict = Field(default_factory=dict)
@classmethod
def from_template(cls, template: str, **kwargs: Any) -> BaseMessagePromptTemplate:
prompt = PromptTemplate.from_template(template)
return cls(prompt=prompt, **kwargs)
@abstractmethod
def format(self, **kwargs: Any) -> BaseMessage:
"""To a BaseMessage."""
class ChatMessagePromptTemplate(BaseMessagePromptTemplate):
role: str
def format(self, **kwargs: Any) -> BaseMessage:
text = self.prompt.format(**kwargs)
return ChatMessage(
content=text, role=self.role, additional_kwargs=self.additional_kwargs
)
class HumanMessagePromptTemplate(BaseMessagePromptTemplate):
def format(self, **kwargs: Any) -> BaseMessage:
text = self.prompt.format(**kwargs)
return HumanMessage(content=text, additional_kwargs=self.additional_kwargs)
class AIMessagePromptTemplate(BaseMessagePromptTemplate):
def format(self, **kwargs: Any) -> BaseMessage:
text = self.prompt.format(**kwargs)
return AIMessage(content=text, additional_kwargs=self.additional_kwargs)
class SystemMessagePromptTemplate(BaseMessagePromptTemplate):
def format(self, **kwargs: Any) -> BaseMessage:
text = self.prompt.format(**kwargs)
return SystemMessage(content=text, additional_kwargs=self.additional_kwargs)
class ChatPromptValue(PromptValue):
messages: List[BaseMessage]
def to_string(self) -> str:
"""Return prompt as string."""
return str(self.messages)
def to_messages(self) -> List[BaseMessage]:
"""Return prompt as messages."""
return self.messages
class ChatPromptTemplate(BasePromptTemplate, ABC):
input_variables: List[str]
messages: List[BaseMessagePromptTemplate]
@classmethod
def from_role_strings(
cls, string_messages: List[Tuple[str, str]]
) -> ChatPromptTemplate:
messages = [
ChatMessagePromptTemplate(
prompt=PromptTemplate.from_template(template), role=role
)
for role, template in string_messages
]
return cls.from_messages(messages)
@classmethod
def from_strings(
cls, string_messages: List[Tuple[Type[BaseMessagePromptTemplate], str]]
) -> ChatPromptTemplate:
messages = [
role(prompt=PromptTemplate.from_template(template))
for role, template in string_messages
]
return cls.from_messages(messages)
@classmethod
def from_messages(
cls, messages: Sequence[BaseMessagePromptTemplate]
) -> ChatPromptTemplate:
input_vars = set()
for message in messages:
input_vars.update(message.prompt.input_variables)
return cls(input_variables=list(input_vars), messages=messages)
def format(self, **kwargs: Any) -> str:
return self.format_prompt(**kwargs).to_string()
def format_prompt(self, **kwargs: Any) -> PromptValue:
result = []
for message_template in self.messages:
rel_params = {
k: v
for k, v in kwargs.items()
if k in message_template.prompt.input_variables
}
message = message_template.format(**rel_params)
result.append(message)
return ChatPromptValue(messages=result)
def partial(self, **kwargs: Union[str, Callable[[], str]]) -> BasePromptTemplate:
raise NotImplementedError
@property
def _prompt_type(self) -> str:
raise NotImplementedError
def save(self, file_path: Union[Path, str]) -> None:
raise NotImplementedError

@ -5,14 +5,14 @@ from pydantic import BaseModel, Extra, root_validator
from langchain.prompts.base import (
DEFAULT_FORMATTER_MAPPING,
BasePromptTemplate,
StringPromptTemplate,
check_valid_template,
)
from langchain.prompts.example_selector.base import BaseExampleSelector
from langchain.prompts.prompt import PromptTemplate
class FewShotPromptTemplate(BasePromptTemplate, BaseModel):
class FewShotPromptTemplate(StringPromptTemplate, BaseModel):
"""Prompt template that contains few shot examples."""
examples: Optional[List[dict]] = None

@ -3,12 +3,15 @@ from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Extra, root_validator
from langchain.prompts.base import DEFAULT_FORMATTER_MAPPING, BasePromptTemplate
from langchain.prompts.base import (
DEFAULT_FORMATTER_MAPPING,
StringPromptTemplate,
)
from langchain.prompts.example_selector.base import BaseExampleSelector
from langchain.prompts.prompt import PromptTemplate
class FewShotPromptWithTemplates(BasePromptTemplate, BaseModel):
class FewShotPromptWithTemplates(StringPromptTemplate, BaseModel):
"""Prompt template that contains few shot examples."""
examples: Optional[List[dict]] = None
@ -22,7 +25,7 @@ class FewShotPromptWithTemplates(BasePromptTemplate, BaseModel):
example_prompt: PromptTemplate
"""PromptTemplate used to format an individual example."""
suffix: BasePromptTemplate
suffix: StringPromptTemplate
"""A PromptTemplate to put after the examples."""
input_variables: List[str]
@ -31,7 +34,7 @@ class FewShotPromptWithTemplates(BasePromptTemplate, BaseModel):
example_separator: str = "\n\n"
"""String separator used to join the prefix, the examples, and suffix."""
prefix: Optional[BasePromptTemplate] = None
prefix: Optional[StringPromptTemplate] = None
"""A PromptTemplate to put before the examples."""
template_format: str = "f-string"

@ -9,12 +9,12 @@ from pydantic import BaseModel, Extra, root_validator
from langchain.prompts.base import (
DEFAULT_FORMATTER_MAPPING,
BasePromptTemplate,
StringPromptTemplate,
check_valid_template,
)
class PromptTemplate(BasePromptTemplate, BaseModel):
class PromptTemplate(StringPromptTemplate, BaseModel):
"""Schema to represent a prompt for an LLM.
Example:

@ -1,9 +1,10 @@
"""Common schema objects."""
from __future__ import annotations
from dataclasses import dataclass
from abc import ABC, abstractmethod
from typing import Any, Dict, List, NamedTuple, Optional
from dataclasses_json import dataclass_json
from pydantic import BaseModel, Field, root_validator
class AgentAction(NamedTuple):
@ -21,9 +22,7 @@ class AgentFinish(NamedTuple):
log: str
@dataclass_json
@dataclass
class Generation:
class Generation(BaseModel):
"""Output of a single generation."""
text: str
@ -35,9 +34,53 @@ class Generation:
# TODO: add log probs
@dataclass_json
@dataclass
class LLMResult:
class BaseMessage(BaseModel):
"""Message object."""
content: str
additional_kwargs: dict = Field(default_factory=dict)
class HumanMessage(BaseMessage):
"""Type of message that is spoken by the human."""
class AIMessage(BaseMessage):
"""Type of message that is spoken by the AI."""
class SystemMessage(BaseMessage):
"""Type of message that is a system message."""
class ChatMessage(BaseMessage):
"""Type of message with arbitrary speaker."""
role: str
class ChatGeneration(Generation):
"""Output of a single generation."""
text = ""
message: BaseMessage
@root_validator
def set_text(cls, values: Dict[str, Any]) -> Dict[str, Any]:
values["text"] = values["message"].content
return values
class ChatResult(BaseModel):
"""Class that contains all relevant information for a Chat Result."""
generations: List[ChatGeneration]
"""List of the things generated."""
llm_output: Optional[dict] = None
"""For arbitrary LLM provider specific output."""
class LLMResult(BaseModel):
"""Class that contains all relevant information for an LLM Result."""
generations: List[List[Generation]]
@ -45,3 +88,48 @@ class LLMResult:
each input could have multiple generations."""
llm_output: Optional[dict] = None
"""For arbitrary LLM provider specific output."""
class PromptValue(BaseModel, ABC):
@abstractmethod
def to_string(self) -> str:
"""Return prompt as string."""
@abstractmethod
def to_messages(self) -> List[BaseMessage]:
"""Return prompt as messages."""
class BaseLanguageModel(BaseModel, ABC):
@abstractmethod
def generate_prompt(
self, prompts: List[PromptValue], stop: Optional[List[str]] = None
) -> LLMResult:
"""Take in a list of prompt values and return an LLMResult."""
@abstractmethod
async def agenerate_prompt(
self, prompts: List[PromptValue], stop: Optional[List[str]] = None
) -> LLMResult:
"""Take in a list of prompt values and return an LLMResult."""
def get_num_tokens(self, text: str) -> int:
"""Get the number of tokens present in the text."""
# TODO: this method may not be exact.
# TODO: this method may differ based on model (eg codex).
try:
from transformers import GPT2TokenizerFast
except ImportError:
raise ValueError(
"Could not import transformers python package. "
"This is needed in order to calculate get_num_tokens. "
"Please it install it with `pip install transformers`."
)
# create a GPT-3 tokenizer instance
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
# tokenize the text using the GPT-3 tokenizer
tokenized_text = tokenizer.tokenize(text)
# calculate the number of tokens in the tokenized text
return len(tokenized_text)

@ -0,0 +1,89 @@
"""Test ChatOpenAI wrapper."""
import pytest
from langchain.callbacks.base import CallbackManager
from langchain.chat_models.openai import ChatOpenAI
from langchain.schema import (
BaseMessage,
ChatGeneration,
ChatResult,
HumanMessage,
LLMResult,
SystemMessage,
)
from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler
def test_chat_openai() -> None:
"""Test ChatOpenAI wrapper."""
chat = ChatOpenAI(max_tokens=10)
message = HumanMessage(content="Hello")
response = chat([message])
assert isinstance(response, BaseMessage)
assert isinstance(response.content, str)
def test_chat_openai_system_message() -> None:
"""Test ChatOpenAI wrapper with system message."""
chat = ChatOpenAI(max_tokens=10)
system_message = SystemMessage(content="You are to chat with the user.")
human_message = HumanMessage(content="Hello")
response = chat([system_message, human_message])
assert isinstance(response, BaseMessage)
assert isinstance(response.content, str)
def test_chat_openai_generate() -> None:
"""Test ChatOpenAI wrapper with generate."""
chat = ChatOpenAI(max_tokens=10, n=2)
message = HumanMessage(content="Hello")
response = chat.generate([[message], [message]])
assert isinstance(response, LLMResult)
assert len(response.generations) == 2
for generations in response.generations:
assert len(generations) == 2
for generation in generations:
assert isinstance(generation, ChatGeneration)
assert isinstance(generation.text, str)
assert generation.text == generation.message.content
def test_chat_openai_multiple_completions() -> None:
"""Test ChatOpenAI wrapper with multiple completions."""
chat = ChatOpenAI(max_tokens=10, n=5)
message = HumanMessage(content="Hello")
response = chat._generate([message])
assert isinstance(response, ChatResult)
assert len(response.generations) == 5
for generation in response.generations:
assert isinstance(generation.message, BaseMessage)
assert isinstance(generation.message.content, str)
def test_chat_openai_streaming() -> None:
"""Test that streaming correctly invokes on_llm_new_token callback."""
callback_handler = FakeCallbackHandler()
callback_manager = CallbackManager([callback_handler])
chat = ChatOpenAI(
max_tokens=10,
streaming=True,
temperature=0,
callback_manager=callback_manager,
verbose=True,
)
message = HumanMessage(content="Hello")
response = chat([message])
assert callback_handler.llm_streams > 0
assert isinstance(response, BaseMessage)
def test_chat_openai_invalid_streaming_params() -> None:
"""Test that streaming correctly invokes on_llm_new_token callback."""
with pytest.raises(ValueError):
ChatOpenAI(
max_tokens=10,
streaming=True,
temperature=0,
n=5,
)

@ -60,7 +60,7 @@ def _get_compare_run() -> Union[LLMRun, ChainRun, ToolRun]:
execution_order=3,
serialized={},
prompts=[],
response=LLMResult([[]]),
response=LLMResult(generations=[[]]),
session_id=TEST_SESSION_ID,
)
],
@ -74,7 +74,7 @@ def _get_compare_run() -> Union[LLMRun, ChainRun, ToolRun]:
execution_order=4,
serialized={},
prompts=[],
response=LLMResult([[]]),
response=LLMResult(generations=[[]]),
session_id=TEST_SESSION_ID,
),
],
@ -86,10 +86,10 @@ def _perform_nested_run(tracer: BaseTracer) -> None:
tracer.on_chain_start(serialized={}, inputs={})
tracer.on_tool_start(serialized={}, input_str="test")
tracer.on_llm_start(serialized={}, prompts=[])
tracer.on_llm_end(response=LLMResult([[]]))
tracer.on_llm_end(response=LLMResult(generations=[[]]))
tracer.on_tool_end("test")
tracer.on_llm_start(serialized={}, prompts=[])
tracer.on_llm_end(response=LLMResult([[]]))
tracer.on_llm_end(response=LLMResult(generations=[[]]))
tracer.on_chain_end(outputs={})
@ -209,7 +209,7 @@ def test_tracer_llm_run() -> None:
execution_order=1,
serialized={},
prompts=[],
response=LLMResult([[]]),
response=LLMResult(generations=[[]]),
session_id=TEST_SESSION_ID,
error=None,
)
@ -217,7 +217,7 @@ def test_tracer_llm_run() -> None:
tracer.new_session()
tracer.on_llm_start(serialized={}, prompts=[])
tracer.on_llm_end(response=LLMResult([[]]))
tracer.on_llm_end(response=LLMResult(generations=[[]]))
assert tracer.runs == [compare_run]
@ -237,7 +237,7 @@ def test_tracer_llm_run_errors_no_start() -> None:
tracer.new_session()
with pytest.raises(TracerException):
tracer.on_llm_end(response=LLMResult([[]]))
tracer.on_llm_end(response=LLMResult(generations=[[]]))
@freeze_time("2023-01-01")
@ -251,7 +251,7 @@ def test_tracer_multiple_llm_runs() -> None:
execution_order=1,
serialized={},
prompts=[],
response=LLMResult([[]]),
response=LLMResult(generations=[[]]),
session_id=TEST_SESSION_ID,
error=None,
)
@ -261,7 +261,7 @@ def test_tracer_multiple_llm_runs() -> None:
num_runs = 10
for _ in range(num_runs):
tracer.on_llm_start(serialized={}, prompts=[])
tracer.on_llm_end(response=LLMResult([[]]))
tracer.on_llm_end(response=LLMResult(generations=[[]]))
assert tracer.runs == [compare_run] * num_runs
@ -409,9 +409,9 @@ def test_tracer_nested_runs_on_error() -> None:
for _ in range(3):
tracer.on_chain_start(serialized={}, inputs={})
tracer.on_llm_start(serialized={}, prompts=[])
tracer.on_llm_end(response=LLMResult([[]]))
tracer.on_llm_end(response=LLMResult(generations=[[]]))
tracer.on_llm_start(serialized={}, prompts=[])
tracer.on_llm_end(response=LLMResult([[]]))
tracer.on_llm_end(response=LLMResult(generations=[[]]))
tracer.on_tool_start(serialized={}, input_str="test")
tracer.on_llm_start(serialized={}, prompts=[])
tracer.on_llm_error(exception)

@ -31,7 +31,7 @@ def test_caching() -> None:
[Generation(text="fizz")],
]
expected_output = LLMResult(
expected_generations,
generations=expected_generations,
llm_output=None,
)
assert output == expected_output
@ -69,7 +69,7 @@ def test_custom_caching() -> None:
[Generation(text="fizz")],
]
expected_output = LLMResult(
expected_generations,
generations=expected_generations,
llm_output=None,
)
assert output == expected_output

@ -0,0 +1,91 @@
from typing import List
from langchain.prompts import PromptTemplate
from langchain.prompts.chat import (
AIMessagePromptTemplate,
BaseMessagePromptTemplate,
ChatMessagePromptTemplate,
ChatPromptTemplate,
ChatPromptValue,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
def create_messages() -> List[BaseMessagePromptTemplate]:
"""Create messages."""
system_message_prompt = SystemMessagePromptTemplate(
prompt=PromptTemplate(
template="Here's some context: {context}",
input_variables=["context"],
)
)
human_message_prompt = HumanMessagePromptTemplate(
prompt=PromptTemplate(
template="Hello {foo}, I'm {bar}. Thanks for the {context}",
input_variables=["foo", "bar", "context"],
)
)
ai_message_prompt = AIMessagePromptTemplate(
prompt=PromptTemplate(
template="I'm an AI. I'm {foo}. I'm {bar}.",
input_variables=["foo", "bar"],
)
)
chat_message_prompt = ChatMessagePromptTemplate(
role="test",
prompt=PromptTemplate(
template="I'm a generic message. I'm {foo}. I'm {bar}.",
input_variables=["foo", "bar"],
),
)
return [
system_message_prompt,
human_message_prompt,
ai_message_prompt,
chat_message_prompt,
]
def create_chat_prompt_template() -> ChatPromptTemplate:
"""Create a chat prompt template."""
return ChatPromptTemplate(
input_variables=["foo", "bar", "context"],
messages=create_messages(),
)
def test_chat_prompt_template() -> None:
"""Test chat prompt template."""
prompt_template = create_chat_prompt_template()
prompt = prompt_template.format_prompt(foo="foo", bar="bar", context="context")
assert isinstance(prompt, ChatPromptValue)
messages = prompt.to_messages()
assert len(messages) == 4
assert messages[0].content == "Here's some context: context"
assert messages[1].content == "Hello foo, I'm bar. Thanks for the context"
assert messages[2].content == "I'm an AI. I'm foo. I'm bar."
assert messages[3].content == "I'm a generic message. I'm foo. I'm bar."
string = prompt.to_string()
expected = (
'[SystemMessage(content="Here\'s some context: context", '
'additional_kwargs={}), HumanMessage(content="Hello foo, '
"I'm bar. Thanks for the context\", additional_kwargs={}), "
"AIMessage(content=\"I'm an AI. I'm foo. I'm bar.\", additional_kwargs={}), "
"ChatMessage(content=\"I'm a generic message. I'm foo. I'm bar.\","
" additional_kwargs={}, role='test')]"
)
assert string == expected
string = prompt_template.format(foo="foo", bar="bar", context="context")
assert string == expected
def test_chat_prompt_template_from_messages() -> None:
"""Test creating a chat prompt template from messages."""
chat_prompt_template = ChatPromptTemplate.from_messages(create_messages())
assert sorted(chat_prompt_template.input_variables) == sorted(
["context", "foo", "bar"]
)
assert len(chat_prompt_template.messages) == 4