Add Streaming for GPT4All (#2642)

- Adds support for callback handlers in GPT4All models
- Updates notebook and docs
Branch: fix_agent_callbacks
Author: William FH, committed via GitHub
Parent: c51753250d
Commit: 10ff1fda8e

@@ -1,21 +1,21 @@
 # GPT4All
-This page covers how to use the `GPT4All` wrapper within LangChain.
-It is broken into two parts: installation and setup, and then usage with an example.
+This page covers how to use the `GPT4All` wrapper within LangChain. The tutorial is divided into two parts: installation and setup, followed by usage with an example.
 ## Installation and Setup
 - Install the Python package with `pip install pyllamacpp`
-- Download a [GPT4All model](https://github.com/nomic-ai/gpt4all) and place it in your desired directory
+- Download a [GPT4All model](https://github.com/nomic-ai/pyllamacpp#supported-model) and place it in your desired directory
 ## Usage
 ### GPT4All
 To use the GPT4All wrapper, you need to provide the path to the pre-trained model file and the model's configuration.
 ```python
 from langchain.llms import GPT4All
-# Instantiate the model
+# Instantiate the model. Callbacks support token-wise streaming
 model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8)
 # Generate text
@@ -24,14 +24,24 @@ response = model("Once upon a time, ")
 You can also customize the generation parameters, such as n_predict, temp, top_p, top_k, and others.
-Example:
+To stream the model's predictions, add in a CallbackManager.
 ```python
-model = GPT4All(model="./models/gpt4all-model.bin", n_predict=55, temp=0)
-response = model("Once upon a time, ")
+from langchain.llms import GPT4All
+from langchain.callbacks.base import CallbackManager
+from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
+# There are many CallbackHandlers supported, such as
+# from langchain.callbacks.streamlit import StreamlitCallbackHandler
+callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
+model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8, callback_manager=callback_manager, verbose=True)
+# Generate text. Tokens are streamed through the callback manager.
+model("Once upon a time, ")
 ```
 ## Model File
-You can find links to model file downloads at the [GPT4all](https://github.com/nomic-ai/gpt4all) repository. They will need to be converted to `ggml` format to work, as specified in the [pyllamacpp](https://github.com/nomic-ai/pyllamacpp) repository.
+You can find links to model file downloads in the [pyllamacpp](https://github.com/nomic-ai/pyllamacpp) repository.
 For a more detailed walkthrough of this, see [this notebook](../modules/models/llms/integrations/gpt4all.ipynb)
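
Editor's note: the docs above mention that many CallbackHandlers are supported. A minimal sketch of rolling your own, not part of this commit: `TokenCollector` is a hypothetical name, and it subclasses the concrete stdout handler so no other callback methods need implementing.

```python
# Hedged sketch: buffer streamed tokens in memory instead of printing them.
from typing import Any, List

from langchain.llms import GPT4All
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


class TokenCollector(StreamingStdOutCallbackHandler):
    """Hypothetical handler: collects each streamed token in a list."""

    def __init__(self) -> None:
        super().__init__()
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # Called once per generated token when verbose=True.
        self.tokens.append(token)


collector = TokenCollector()
model = GPT4All(
    model="./models/gpt4all-model.bin",  # path assumed, as in the docs above
    callback_manager=CallbackManager([collector]),
    verbose=True,
)
model("Once upon a time, ")
print("".join(collector.tokens))
```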

@@ -4,7 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# GPT4all\n",
+"# GPT4All\n",
 "\n",
 "This example goes over how to use LangChain to interact with GPT4All models"
 ]
@@ -15,7 +15,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"!pip install pyllamacpp"
+"%pip install pyllamacpp > /dev/null"
 ]
 },
 {
@@ -24,8 +24,10 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"from langchain import PromptTemplate, LLMChain\n",
 "from langchain.llms import GPT4All\n",
-"from langchain import PromptTemplate, LLMChain"
+"from langchain.callbacks.base import CallbackManager\n",
+"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler"
 ]
 },
 {
@@ -41,15 +43,70 @@
 "prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Specify Model\n",
+"\n",
+"To run locally, download a compatible ggml-formatted model. For more info, visit https://github.com/nomic-ai/pyllamacpp\n",
+"\n",
+"Note that new models are uploaded regularly - check the link above for the most recent `.bin` URL"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"local_path = './models/gpt4all-lora-quantized-ggml.bin'  # replace with your desired local file path"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Uncomment the below block to download a model. You may want to update `url` to a new version."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"# import requests\n",
+"\n",
+"# from pathlib import Path\n",
+"# from tqdm import tqdm\n",
+"\n",
+"# Path(local_path).parent.mkdir(parents=True, exist_ok=True)\n",
+"\n",
+"# # Example model. Check https://github.com/nomic-ai/pyllamacpp for the latest models.\n",
+"# url = 'https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized-ggml.bin'\n",
+"\n",
+"# # send a GET request to the URL to download the file. Stream since it's large\n",
+"# response = requests.get(url, stream=True)\n",
+"\n",
+"# # open the file in binary mode and write the contents of the response to it in chunks\n",
+"# # This is a large file, so be prepared to wait.\n",
+"# with open(local_path, 'wb') as f:\n",
+"#     for chunk in tqdm(response.iter_content(chunk_size=8192)):\n",
+"#         if chunk:\n",
+"#             f.write(chunk)"
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
-"# You'll need to download a compatible model and convert it to ggml.\n",
-"# See: https://github.com/nomic-ai/gpt4all for more information.\n",
-"llm = GPT4All(model=\"./models/gpt4all-model.bin\")"
+"# Callbacks support token-wise streaming\n",
+"callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])\n",
+"# Verbose is required to pass to the callback manager\n",
+"llm = GPT4All(model=local_path, callback_manager=callback_manager, verbose=True)"
 ]
 },
 {
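
Editor's note: the diff stops before the notebook cells that actually run the chain; those are unchanged and not shown. For context, a sketch of that follow-on usage under this version's API. The template and question strings here are illustrative, not taken from this diff.

```python
# Sketch of the notebook's unchanged follow-on cells: wire the streaming
# LLM into an LLMChain and run it. Tokens print to stdout as they stream.
from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
llm = GPT4All(
    model="./models/gpt4all-lora-quantized-ggml.bin",  # local_path from above
    callback_manager=callback_manager,
    verbose=True,
)

llm_chain = LLMChain(prompt=prompt, llm=llm)
llm_chain.run("What is a good name for a company that makes colorful socks?")
```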
@@ -89,7 +146,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.9.1"
+"version": "3.11.2"
 }
 },
 "nbformat": 4,

@@ -1,4 +1,5 @@
 """Wrapper for the GPT4All model."""
+from functools import partial
 from typing import Any, Dict, List, Mapping, Optional, Set

 from pydantic import Extra, Field, root_validator
@@ -174,8 +175,12 @@ class GPT4All(LLM):
                 prompt = "Once upon a time, "
                 response = model(prompt, n_predict=55)
         """
+        text_callback = partial(
+            self.callback_manager.on_llm_new_token, verbose=self.verbose
+        )
         text = self.client.generate(
             prompt,
+            new_text_callback=text_callback,
             **self._default_params,
         )
         if stop is not None:
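
Editor's note: this wrapper change is the heart of the commit. `partial` pre-binds `verbose`, producing a one-argument callable that matches the shape of pyllamacpp's `new_text_callback`, which then forwards every token to the callback manager's `on_llm_new_token`. A self-contained toy sketch of that binding pattern; `fake_generate` is a stand-in for the real client, not the library's API.

```python
# Toy illustration of the callback wiring in the diff above.
from functools import partial
from typing import Callable


def on_llm_new_token(token: str, verbose: bool = False) -> None:
    # Stands in for callback_manager.on_llm_new_token.
    if verbose:
        print(token, end="", flush=True)


def fake_generate(prompt: str, new_text_callback: Callable[[str], None]) -> str:
    # Stands in for client.generate: invokes the callback once per token.
    text = prompt + "there was a token stream."
    for token in text.split(" "):
        new_text_callback(token + " ")
    return text


# partial() fixes verbose=True, leaving the one-argument callable
# that new_text_callback expects.
text_callback = partial(on_llm_new_token, verbose=True)
fake_generate("Once upon a time, ", new_text_callback=text_callback)
```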
