Currently, there are six different model architectures that are supported:

1. GPT-J - Based off of the GPT-J architecture with examples found [here](https://huggingface.co/EleutherAI/gpt-j-6b)
2. LLAMA - Based off of the LLAMA architecture with examples found [here](https://huggingface.co/models?filter=llama)
3. MPT - Based off of Mosaic ML's MPT architecture with examples found [here](https://huggingface.co/mosaicml/mpt-7b)
4. Replit - Based off of Replit Inc.'s Replit architecture with examples found [here](https://huggingface.co/replit/replit-code-v1-3b)
5. Falcon - Based off of TII's Falcon architecture with examples found [here](https://huggingface.co/tiiuae/falcon-40b)
6. StarCoder - Based off of BigCode's StarCoder architecture with examples found [here](https://huggingface.co/bigcode/starcoder)
## Why so many different architectures? What differentiates them?
One of the major differences is license. Currently, the LLAMA based models are subject to a non-commercial license, whereas the GPTJ and MPT based models allow commercial usage.
## How does GPT4All make these models available for CPU inference?
By leveraging the ggml library written by Georgi Gerganov and a growing community of developers. There are currently several versions of this library. The original GitHub repo can be found [here](https://github.com/ggerganov/ggml), but the developer of the library has also created a LLAMA based version [here](https://github.com/ggerganov/llama.cpp). Currently, this backend uses the latter as a submodule.
## Does that mean GPT4All is compatible with all llama.cpp models and vice versa?
Your CPU needs to support [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions).
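On Linux, one informal way to verify this is to look for the `avx`/`avx2` flags in `/proc/cpuinfo`; other platforms need different tooling.

```python
# Informal Linux-only sketch: report whether the CPU advertises AVX/AVX2.
def cpu_flags() -> set[str]:
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
print("AVX supported: ", "avx" in flags)
print("AVX2 supported:", "avx2" in flags)
```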
In newer versions of llama.cpp, support has been added for inference on NVIDIA GPUs. We're investigating how to incorporate this into our downloadable installers.
## Ok, so bottom line... how do I make my model on Hugging Face compatible with the GPT4All ecosystem right now?
1. Check to make sure the Hugging Face model is available in one of our supported architectures
2. If it is, you can use the conversion script inside our pinned llama.cpp submodule for GPTJ and LLAMA based models (a rough sketch of invoking a conversion script follows this list)
3. If your model is an MPT model, you can instead use the conversion script located directly in this backend directory under the `scripts` subdirectory
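As an illustration of step 2, the conversion can be driven from Python. This is a hypothetical sketch: the script name `convert.py`, the submodule path, and the model directory are assumptions that vary between llama.cpp versions, so check your checkout for the actual files.

```python
# Hypothetical sketch -- the script name, submodule path, and model directory
# are assumptions; verify them against your llama.cpp checkout before running.
import subprocess

subprocess.run(
    ["python", "llama.cpp/convert.py", "models/my-llama-model"],
    check=True,  # raise CalledProcessError if the conversion fails
)
```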
## Language Bindings
#### There's a problem with the download
Some bindings can download a model if allowed to do so. For example, in Python or TypeScript, if `allow_download=True`
or `allowDownload=true` (the default) is set, a model is automatically downloaded into `.cache/gpt4all/` in the user's
home folder, unless it already exists there.
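For instance, a minimal sketch with the Python bindings (the model file name here is only an example; any model from the official list works):

```python
# Minimal sketch with the Python bindings. With allow_download=True (the
# default), the model file is fetched into ~/.cache/gpt4all/ on first use
# unless it already exists there. The file name is an example.
from gpt4all import GPT4All

model = GPT4All("ggml-gpt4all-j-v1.3-groovy.bin", allow_download=True)
print(model.generate("The capital of France is"))
```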
In case of connection issues or errors during the download, you might want to manually verify the model file's MD5
checksum by comparing it with the one listed in [models.json].
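One way to compute that checksum, assuming the default cache location and an example file name:

```python
# Compute the MD5 of a downloaded model file so it can be compared against
# the checksum listed in models.json. The path and file name are examples.
import hashlib
from pathlib import Path

def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(md5sum(Path.home() / ".cache" / "gpt4all" / "ggml-gpt4all-j-v1.3-groovy.bin"))
```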
As an alternative to the basic downloader built into the bindings, you can download models from the
<https://gpt4all.io/> website. Scroll down to 'Model Explorer' and pick your preferred model.