mirror of https://github.com/hwchase17/langchain
Add C Transformers for GGML Models (#5218)
# Add C Transformers for GGML Models I created Python bindings for the GGML models: https://github.com/marella/ctransformers Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See [Supported Models](https://github.com/marella/ctransformers#supported-models). It provides a unified interface for all models: ```python from langchain.llms import CTransformers llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2') print(llm('AI is going to')) ``` It can be used with models hosted on the Hugging Face Hub: ```py llm = CTransformers(model='marella/gpt-2-ggml') ``` It supports streaming: ```py from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()]) ``` Please see [README](https://github.com/marella/ctransformers#readme) for more details. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>pull/5272/head
parent
ca88b25da6
commit
b3988621c5
@ -0,0 +1,57 @@
|
||||
# C Transformers
|
||||
|
||||
This page covers how to use the [C Transformers](https://github.com/marella/ctransformers) library within LangChain.
|
||||
It is broken into two parts: installation and setup, and then references to specific C Transformers wrappers.
|
||||
|
||||
## Installation and Setup
|
||||
|
||||
- Install the Python package with `pip install ctransformers`
|
||||
- Download a supported [GGML model](https://huggingface.co/TheBloke) (see [Supported Models](https://github.com/marella/ctransformers#supported-models))
|
||||
|
||||
## Wrappers
|
||||
|
||||
### LLM
|
||||
|
||||
There exists a CTransformers LLM wrapper, which you can access with:
|
||||
|
||||
```python
|
||||
from langchain.llms import CTransformers
|
||||
```
|
||||
|
||||
It provides a unified interface for all models:
|
||||
|
||||
```python
|
||||
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')
|
||||
|
||||
print(llm('AI is going to'))
|
||||
```
|
||||
|
||||
If you are getting `illegal instruction` error, try using `lib='avx'` or `lib='basic'`:
|
||||
|
||||
```py
|
||||
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
|
||||
```
|
||||
|
||||
It can be used with models hosted on the Hugging Face Hub:
|
||||
|
||||
```py
|
||||
llm = CTransformers(model='marella/gpt-2-ggml')
|
||||
```
|
||||
|
||||
If a model repo has multiple model files (`.bin` files), specify a model file using:
|
||||
|
||||
```py
|
||||
llm = CTransformers(model='marella/gpt-2-ggml', model_file='ggml-model.bin')
|
||||
```
|
||||
|
||||
Additional parameters can be passed using the `config` parameter:
|
||||
|
||||
```py
|
||||
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}
|
||||
|
||||
llm = CTransformers(model='marella/gpt-2-ggml', config=config)
|
||||
```
|
||||
|
||||
See [Documentation](https://github.com/marella/ctransformers#config) for a list of available parameters.
|
||||
|
||||
For a more detailed walkthrough of this, see [this notebook](../modules/models/llms/integrations/ctransformers.ipynb).
|
@ -0,0 +1,125 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# C Transformers\n",
|
||||
"\n",
|
||||
"The [C Transformers](https://github.com/marella/ctransformers) library provides Python bindings for GGML models.\n",
|
||||
"\n",
|
||||
"This example goes over how to use LangChain to interact with `C Transformers` [models](https://github.com/marella/ctransformers#supported-models)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Install**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install ctransformers"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Load Model**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.llms import CTransformers\n",
|
||||
"\n",
|
||||
"llm = CTransformers(model='marella/gpt-2-ggml')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Generate Text**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"print(llm('AI is going to'))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Streaming**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
|
||||
"\n",
|
||||
"llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()])\n",
|
||||
"\n",
|
||||
"response = llm('AI is going to')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**LLMChain**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain import PromptTemplate, LLMChain\n",
|
||||
"\n",
|
||||
"template = \"\"\"Question: {question}\n",
|
||||
"\n",
|
||||
"Answer:\"\"\"\n",
|
||||
"\n",
|
||||
"prompt = PromptTemplate(template=template, input_variables=['question'])\n",
|
||||
"\n",
|
||||
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
|
||||
"\n",
|
||||
"response = llm_chain.run('What is AI?')"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
},
|
||||
"orig_nbformat": 4
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
@ -0,0 +1,104 @@
|
||||
"""Wrapper around the C Transformers library."""
|
||||
from typing import Any, Dict, Optional, Sequence
|
||||
|
||||
from pydantic import root_validator
|
||||
|
||||
from langchain.callbacks.manager import CallbackManagerForLLMRun
|
||||
from langchain.llms.base import LLM
|
||||
|
||||
|
||||
class CTransformers(LLM):
|
||||
"""Wrapper around the C Transformers LLM interface.
|
||||
|
||||
To use, you should have the ``ctransformers`` python package installed.
|
||||
See https://github.com/marella/ctransformers
|
||||
|
||||
Example:
|
||||
.. code-block:: python
|
||||
|
||||
from langchain.llms import CTransformers
|
||||
|
||||
llm = CTransformers(model="/path/to/ggml-gpt-2.bin", model_type="gpt2")
|
||||
"""
|
||||
|
||||
client: Any #: :meta private:
|
||||
|
||||
model: str
|
||||
"""The path to a model file or directory or the name of a Hugging Face Hub
|
||||
model repo."""
|
||||
|
||||
model_type: Optional[str] = None
|
||||
"""The model type."""
|
||||
|
||||
model_file: Optional[str] = None
|
||||
"""The name of the model file in repo or directory."""
|
||||
|
||||
config: Optional[Dict[str, Any]] = None
|
||||
"""The config parameters.
|
||||
See https://github.com/marella/ctransformers#config"""
|
||||
|
||||
lib: Optional[str] = None
|
||||
"""The path to a shared library or one of `avx2`, `avx`, `basic`."""
|
||||
|
||||
@property
|
||||
def _identifying_params(self) -> Dict[str, Any]:
|
||||
"""Get the identifying parameters."""
|
||||
return {
|
||||
"model": self.model,
|
||||
"model_type": self.model_type,
|
||||
"model_file": self.model_file,
|
||||
"config": self.config,
|
||||
}
|
||||
|
||||
@property
|
||||
def _llm_type(self) -> str:
|
||||
"""Return type of llm."""
|
||||
return "ctransformers"
|
||||
|
||||
@root_validator()
|
||||
def validate_environment(cls, values: Dict) -> Dict:
|
||||
"""Validate that ``ctransformers`` package is installed."""
|
||||
try:
|
||||
from ctransformers import AutoModelForCausalLM
|
||||
except ImportError:
|
||||
raise ImportError(
|
||||
"Could not import `ctransformers` package. "
|
||||
"Please install it with `pip install ctransformers`"
|
||||
)
|
||||
|
||||
config = values["config"] or {}
|
||||
values["client"] = AutoModelForCausalLM.from_pretrained(
|
||||
values["model"],
|
||||
model_type=values["model_type"],
|
||||
model_file=values["model_file"],
|
||||
lib=values["lib"],
|
||||
**config,
|
||||
)
|
||||
return values
|
||||
|
||||
def _call(
|
||||
self,
|
||||
prompt: str,
|
||||
stop: Optional[Sequence[str]] = None,
|
||||
run_manager: Optional[CallbackManagerForLLMRun] = None,
|
||||
) -> str:
|
||||
"""Generate text from a prompt.
|
||||
|
||||
Args:
|
||||
prompt: The prompt to generate text from.
|
||||
stop: A list of sequences to stop generation when encountered.
|
||||
|
||||
Returns:
|
||||
The generated text.
|
||||
|
||||
Example:
|
||||
.. code-block:: python
|
||||
|
||||
response = llm("Tell me a joke.")
|
||||
"""
|
||||
text = []
|
||||
_run_manager = run_manager or CallbackManagerForLLMRun.get_noop_manager()
|
||||
for chunk in self.client(prompt, stop=stop, stream=True):
|
||||
text.append(chunk)
|
||||
_run_manager.on_llm_new_token(chunk, verbose=self.verbose)
|
||||
return "".join(text)
|
@ -0,0 +1,21 @@
|
||||
"""Test C Transformers wrapper."""
|
||||
|
||||
from langchain.llms import CTransformers
|
||||
from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler
|
||||
|
||||
|
||||
def test_ctransformers_call() -> None:
|
||||
"""Test valid call to C Transformers."""
|
||||
config = {"max_new_tokens": 5}
|
||||
callback_handler = FakeCallbackHandler()
|
||||
|
||||
llm = CTransformers(
|
||||
model="marella/gpt-2-ggml",
|
||||
config=config,
|
||||
callbacks=[callback_handler],
|
||||
)
|
||||
|
||||
output = llm("Say foo:")
|
||||
assert isinstance(output, str)
|
||||
assert len(output) > 1
|
||||
assert 0 < callback_handler.llm_streams <= config["max_new_tokens"]
|
Loading…
Reference in New Issue