"In this guide, we'll learn how to create a custom chat model using LangChain abstractions.\n",
"\n",
"Wrapping your LLM with the standard `BaseChatModel` interface allows you to use your LLM in existing LangChain programs with minimal code modifications!\n",
"\n",
"As a bonus, your LLM will automatically become a LangChain `Runnable` and will benefit from some optimizations out of the box (e.g., batch via a threadpool, async support, the `astream_events` API, etc.).\n",
"\n",
"## Inputs and outputs\n",
"\n",
"First, we need to talk about **messages**, which are the inputs and outputs of chat models.\n",
"\n",
"### Messages\n",
"\n",
"\n",
"LangChain has a few built-in message types:\n",
"\n",
"- `SystemMessage`: Used for priming AI behavior, usually passed in as the first of a sequence of input messages.\n",
"- `HumanMessage`: Represents a message from a person interacting with the chat model.\n",
"- `AIMessage`: Represents a message from the chat model. This can be either text or a request to invoke a tool.\n",
"- `FunctionMessage` / `ToolMessage`: Message for passing the results of tool invocation back to the model.\n",
"## Simple Chat Model\n",
"\n",
"Inheriting from `SimpleChatModel` is great for prototyping!\n",
"\n",
"It won't allow you to implement all features that you might want out of a chat model, but it's quick to implement, and if you need more you can transition to `BaseChatModel` shown below.\n",
"\n",
"Let's implement a chat model that echoes back the last `n` characters of the prompt!\n",
"\n",
"You need to implement the following:\n",
"\n",
"* The method `_call` - Use to generate a chat result from a prompt.\n",
"\n",
"In addition, you have the option to implement the following:\n",
"\n",
"* The property `_identifying_params` - Represent model parameterization for logging purposes.\n",
"* `_stream` - Use to implement streaming.\n"
]
},
{
"cell_type": "markdown",
"id": "bbfebea1",
"\n",
"Let's implement a chat model that echoes back the first `n` characters of the last message in the prompt!\n",
"\n",
"To do so, we will inherit from `BaseChatModel`, and we'll need to implement the following methods and properties:\n",
"\n",
"| Method/Property | Description | Required/Optional |\n",
"|---|---|---|\n",
"| `_generate` | Use to generate a chat result from a prompt. | Required |\n",
"| `_llm_type` (property) | Used to uniquely identify the type of the model. Used for logging. | Required |\n",
"| `_identifying_params` (property) | Represent model parameterization for tracing purposes. | Optional |\n",
"| `_stream` | Use to implement streaming. | Optional |\n",
"| `_agenerate` | Use to implement a native async method. | Optional |\n",
"| `_astream` | Use to implement async version of `_stream`. | Optional |\n",
"\n",
":::{.callout-caution}\n",
"\n",
"Currently, to get async streaming to work (via `astream`), you must provide an implementation of `_astream`.\n",
"\n",
"By default if `_astream` is not provided, then async streaming falls back on `_agenerate`, which does not support token by token streaming.\n",
":::\n",
"\n",
":::{.callout-tip}\n",
"The default `_astream` implementation uses `run_in_executor` to launch the sync `_stream` in a separate thread if `_stream` is implemented; otherwise it falls back to using `_agenerate`.\n",
"\n",
"You can use this trick if you want to reuse the `_stream` implementation, but if you're able to implement code that's natively async, that's a better solution since that code will run with less overhead.\n",
":::\n",
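The run-sync-in-a-thread trick can be illustrated outside of LangChain with plain `asyncio` (a self-contained sketch; the function names here are made up):

```python
import asyncio
from typing import AsyncIterator, Iterator


def stream_sync(text: str, n: int) -> Iterator[str]:
    """Sync 'streaming': yield the first n characters one at a time."""
    for ch in text[:n]:
        yield ch


async def astream_via_thread(text: str, n: int) -> AsyncIterator[str]:
    """Drive the sync generator from a worker thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    it = stream_sync(text, n)
    done = object()  # sentinel marking generator exhaustion
    while True:
        # next(it, done) runs in the default executor (a threadpool).
        token = await loop.run_in_executor(None, next, it, done)
        if token is done:
            break
        yield token


async def main() -> list:
    return [tok async for tok in astream_via_thread("hello world", 5)]


print(asyncio.run(main()))  # ['h', 'e', 'l', 'l', 'o']
```

Writing a natively async `_astream` avoids the per-token thread hop shown here, which is why it is the better option when your backend supports it.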
"/home/eugene/src/langchain/libs/core/langchain_core/_api/beta_decorator.py:86: LangChainBetaWarning: This API is in beta and may change in the future.\n",
" warn_beta(\n"
]
}
" print(event)"
]
},
{
"cell_type": "markdown",
"id": "42f9553f-7d8c-4277-aeb4-d80d77839d90",
"metadata": {},
"source": [
"## Identifying Params\n",
"\n",
"LangChain has a callback system which allows implementing loggers to monitor the behavior of LLM applications.\n",
"\n",
"Remember the `_identifying_params` property from earlier? \n",
"\n",
"It's passed to the callback system and is accessible for user specified loggers.\n",
"\n",
"Below we'll implement a handler with just a single `on_chat_model_start` event to see where `_identifying_params` appears.\n",
"\n",
"## Contributing\n",
"\n",
"Here is a checklist to work through when contributing a custom chat model:\n",
"\n",
"Tests:\n",
"\n",
"* [ ] Add unit or integration tests to the overridden methods. Verify that `invoke`, `ainvoke`, `batch`, and `stream` work if you've overridden the corresponding code.\n",
"\n",
"\n",
"Streaming (if you're implementing it):\n",
"\n",
"* [ ] Implement the `_stream` method to get streaming working\n",
"* [ ] Provided an async implementation via `_astream`\n",
"* [ ] Make sure to invoke the `on_llm_new_token` callback\n",
"* [ ] `on_llm_new_token` is invoked BEFORE yielding the chunk\n",
"\n",
"Stop Token Behavior:\n",
"\n",
"\n",
"Secret API Keys:\n",
"\n",
"* [ ] If your model connects to an API it will likely accept API keys as part of its initialization. Use Pydantic's `SecretStr` type for secrets, so they don't get accidentally printed out when folks print the model.\n",
"\n",
"\n",
"Identifying Params:\n",
"\n",
"* [ ] Include a `model_name` in identifying params\n",
"\n",
"\n",
"Optimizations:\n",
"\n",
"Consider providing native async support to reduce the overhead from the model!\n",
"\n",
"* [ ] Provided a native async implementation of `_agenerate` (used by `ainvoke`)\n",
"* [ ] Provided a native async implementation of `_astream` (used by `astream`)"
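The `SecretStr` point above can be sketched like this (assuming Pydantic v2; the config class and key value are made up for illustration):

```python
from pydantic import BaseModel, SecretStr


class MyChatModelConfig(BaseModel):
    # Hypothetical config object; a real chat model would take the key
    # as a field on the model class itself.
    api_key: SecretStr


cfg = MyChatModelConfig(api_key=SecretStr("sk-not-a-real-key"))

# The raw key never appears in repr/str output; it renders masked as '**********'.
print(cfg)

# The actual value is still retrievable when you need to call the API.
print(cfg.api_key.get_secret_value() == "sk-not-a-real-key")  # True
```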