langchain/docs/integrations/ctransformers.md
Ravindra Marella b3988621c5
Add C Transformers for GGML Models (#5218)
# Add C Transformers for GGML Models
I created Python bindings for the GGML models:
https://github.com/marella/ctransformers

Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See
[Supported
Models](https://github.com/marella/ctransformers#supported-models).


It provides a unified interface for all models:

```python
from langchain.llms import CTransformers

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))
```

It can be used with models hosted on the Hugging Face Hub:

```py
llm = CTransformers(model='marella/gpt-2-ggml')
```

It supports streaming:

```py
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()])
```

Please see [README](https://github.com/marella/ctransformers#readme) for
more details.
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-25 13:42:44 -07:00

58 lines
1.7 KiB
Markdown

# C Transformers
This page covers how to use the [C Transformers](https://github.com/marella/ctransformers) library within LangChain.
It is broken into two parts: installation and setup, and then references to specific C Transformers wrappers.
## Installation and Setup
- Install the Python package with `pip install ctransformers`
- Download a supported [GGML model](https://huggingface.co/TheBloke) (see [Supported Models](https://github.com/marella/ctransformers#supported-models))
## Wrappers
### LLM
There exists a CTransformers LLM wrapper, which you can access with:
```python
from langchain.llms import CTransformers
```
It provides a unified interface for all models:
```python
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')
print(llm('AI is going to'))
```
If you are getting `illegal instruction` error, try using `lib='avx'` or `lib='basic'`:
```py
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
```
It can be used with models hosted on the Hugging Face Hub:
```py
llm = CTransformers(model='marella/gpt-2-ggml')
```
If a model repo has multiple model files (`.bin` files), specify a model file using:
```py
llm = CTransformers(model='marella/gpt-2-ggml', model_file='ggml-model.bin')
```
Additional parameters can be passed using the `config` parameter:
```py
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}
llm = CTransformers(model='marella/gpt-2-ggml', config=config)
```
See [Documentation](https://github.com/marella/ctransformers#config) for a list of available parameters.
For a more detailed walkthrough of this, see [this notebook](../modules/models/llms/integrations/ctransformers.ipynb).