Ravindra Marella b3988621c5

Add C Transformers for GGML Models (#5218 )

# Add C Transformers for GGML Models
I created Python bindings for the GGML models:
https://github.com/marella/ctransformers

Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See
[Supported
Models](https://github.com/marella/ctransformers#supported-models).


It provides a unified interface for all models:

```python
from langchain.llms import CTransformers

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))
```

It can be used with models hosted on the Hugging Face Hub:

```py
llm = CTransformers(model='marella/gpt-2-ggml')
```

It supports streaming:

```py
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()])
```

Please see [README](https://github.com/marella/ctransformers#readme) for
more details.
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>

2023-05-25 13:42:44 -07:00

1.7 KiB

Raw Blame History

C Transformers

This page covers how to use the C Transformers library within LangChain. It is broken into two parts: installation and setup, and then references to specific C Transformers wrappers.

Installation and Setup

Install the Python package with pip install ctransformers
Download a supported GGML model (see Supported Models)

Wrappers

LLM

There exists a CTransformers LLM wrapper, which you can access with:

from langchain.llms import CTransformers

It provides a unified interface for all models:

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))

If you are getting illegal instruction error, try using lib='avx' or lib='basic':

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')

It can be used with models hosted on the Hugging Face Hub:

llm = CTransformers(model='marella/gpt-2-ggml')

If a model repo has multiple model files (.bin files), specify a model file using:

llm = CTransformers(model='marella/gpt-2-ggml', model_file='ggml-model.bin')

Additional parameters can be passed using the config parameter:

config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}

llm = CTransformers(model='marella/gpt-2-ggml', config=config)

See Documentation for a list of available parameters.

For a more detailed walkthrough of this, see this notebook.

1.7 KiB Raw Blame History

C Transformers

Installation and Setup

Wrappers

LLM

1.7 KiB

Raw Blame History