mirror of
https://github.com/hwchase17/langchain
synced 2024-10-31 15:20:26 +00:00
b3988621c5
# Add C Transformers for GGML Models I created Python bindings for the GGML models: https://github.com/marella/ctransformers Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See [Supported Models](https://github.com/marella/ctransformers#supported-models). It provides a unified interface for all models: ```python from langchain.llms import CTransformers llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2') print(llm('AI is going to')) ``` It can be used with models hosted on the Hugging Face Hub: ```py llm = CTransformers(model='marella/gpt-2-ggml') ``` It supports streaming: ```py from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()]) ``` Please see [README](https://github.com/marella/ctransformers#readme) for more details. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1.7 KiB
1.7 KiB
C Transformers
This page covers how to use the C Transformers library within LangChain. It is broken into two parts: installation and setup, and then references to specific C Transformers wrappers.
Installation and Setup
- Install the Python package with
pip install ctransformers
- Download a supported GGML model (see Supported Models)
Wrappers
LLM
There exists a CTransformers LLM wrapper, which you can access with:
from langchain.llms import CTransformers
It provides a unified interface for all models:
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')
print(llm('AI is going to'))
If you are getting illegal instruction
error, try using lib='avx'
or lib='basic'
:
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
It can be used with models hosted on the Hugging Face Hub:
llm = CTransformers(model='marella/gpt-2-ggml')
If a model repo has multiple model files (.bin
files), specify a model file using:
llm = CTransformers(model='marella/gpt-2-ggml', model_file='ggml-model.bin')
Additional parameters can be passed using the config
parameter:
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}
llm = CTransformers(model='marella/gpt-2-ggml', config=config)
See Documentation for a list of available parameters.
For a more detailed walkthrough of this, see this notebook.