Commit Graph

166 Commits (005c092943e573c015f0bb23727fd8e576ef0ee8)

Author SHA1 Message Date
Aaron Miller afaa291eab python bindings should be quiet by default
* disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is
  nonempty
* make verbose flag for retrieve_model default false (but also be
  overridable via gpt4all constructor)

should be able to run a basic test:

```python
import gpt4all
model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
print(model.generate('def fib(n):'))
```

and see no non-model output when successful
1 year ago
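The verbosity rules the commit above describes (llama.cpp logging only when the GPT4ALL_VERBOSE_LLAMACPP environment variable is nonempty, quiet otherwise) can be sketched as a minimal check; the helper name below is hypothetical, not part of the bindings:

```python
import os

# Hypothetical sketch of the rule this commit describes: llama.cpp
# logging is enabled only when GPT4ALL_VERBOSE_LLAMACPP is set to a
# nonempty value; everything else stays quiet by default.
def llamacpp_logging_enabled() -> bool:
    return bool(os.environ.get("GPT4ALL_VERBOSE_LLAMACPP", ""))

os.environ.pop("GPT4ALL_VERBOSE_LLAMACPP", None)
print(llamacpp_logging_enabled())  # quiet by default

os.environ["GPT4ALL_VERBOSE_LLAMACPP"] = "1"
print(llamacpp_logging_enabled())  # any nonempty value enables logging
```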
cebtenzzre 7b611b49f2 llmodel: print an error if the CPU does not support AVX (#1499) 1 year ago
Aaron Miller 043617168e do not process prompts on gpu yet 1 year ago
Aaron Miller 64001a480a mat*mat for q4_0, q8_0 1 year ago
cebtenzzre 7a19047329 llmodel: do not call magic_match unless build variant is correct (#1488) 1 year ago
Cebtenzzre 5fe685427a chat: clearer CPU fallback messages 1 year ago
Adam Treat eec906aa05 Speculative fix for build on mac. 1 year ago
Adam Treat a9acdd25de Push a new version number for llmodel backend now that it is based on gguf. 1 year ago
Cebtenzzre 8bb6a6c201 rebase on newer llama.cpp 1 year ago
Cebtenzzre d87573ea75 remove old llama.cpp submodules 1 year ago
Cebtenzzre cc6db61c93 backend: fix build with Visual Studio generator
Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
is needed because Visual Studio is a multi-configuration generator, so
we do not know what the build type will be until `cmake --build` is
called.

Fixes #1470
1 year ago
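The fix above swaps CMAKE_BUILD_TYPE for the $<CONFIG> generator expression; a minimal sketch of the difference, assuming a hypothetical target name:

```cmake
# With a multi-config generator (e.g. Visual Studio), CMAKE_BUILD_TYPE is
# empty at configure time -- the configuration is only chosen at
# `cmake --build` time, so this path would be wrong:
#   "${CMAKE_BINARY_DIR}/${CMAKE_BUILD_TYPE}"
#
# The $<CONFIG> generator expression is expanded at build time instead,
# so it is correct for both single- and multi-config generators.
# (The `llmodel` target name here is illustrative.)
add_custom_command(TARGET llmodel POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E echo "built configuration: $<CONFIG>")
```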
Adam Treat f605a5b686 Add q8_0 kernels to kompute shaders and bump to latest llama/gguf. 1 year ago
Cebtenzzre 672cb850f9 differentiate between init failure and unsupported models 1 year ago
Adam Treat 906699e8e9 Bump to latest llama/gguf branch. 1 year ago
Cebtenzzre 088afada49 llamamodel: fix static vector in LLamaModel::endTokens 1 year ago
Adam Treat b4d82ea289 Bump to the latest fixes for vulkan in llama. 1 year ago
Adam Treat 12f943e966 Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf. 1 year ago
Adam Treat 5d346e13d7 Add q6_k kernels for vulkan. 1 year ago
Adam Treat 4eefd386d0 Refactor for subgroups on mat * vec kernel. 1 year ago
Cebtenzzre 3c2aa299d8 gptj: remove unused variables 1 year ago
Cebtenzzre f9deb87d20 convert scripts: add feed-forward length for better compatibility
This GGUF key is used by all llama.cpp models with upstream support.
1 year ago
Cebtenzzre cc7675d432 convert scripts: make gptj script executable 1 year ago
Cebtenzzre 0493e6eb07 convert scripts: use bytes_to_unicode from transformers 1 year ago
Cebtenzzre d5d72f0361 gpt-j: update inference to match latest llama.cpp insights
- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
1 year ago
Cebtenzzre 050e7f076e backend: port GPT-J to GGUF 1 year ago
Cebtenzzre 8f3abb37ca fix references to removed model types 1 year ago
Cebtenzzre 4219c0e2e7 convert scripts: make them directly executable 1 year ago
Cebtenzzre ce7be1db48 backend: use llamamodel.cpp for Falcon 1 year ago
Cebtenzzre cca9e6ce81 convert_mpt_hf_to_gguf.py: better tokenizer decoding 1 year ago
Cebtenzzre 25297786db convert scripts: load model as late as possible 1 year ago
Cebtenzzre fd47088f2b conversion scripts: cleanup 1 year ago
Cebtenzzre 6277eac9cc backend: use llamamodel.cpp for StarCoder 1 year ago
Cebtenzzre 17fc9e3e58 backend: port Replit to GGUF 1 year ago
Cebtenzzre 7c67262a13 backend: port MPT to GGUF 1 year ago
Cebtenzzre 42bcb814b3 backend: port BERT to GGUF 1 year ago
Cebtenzzre 1d29e4696c llamamodel: metal supports all quantization types now 1 year ago
Aaron Miller 507753a37c macos build fixes 1 year ago
Adam Treat d90d003a1d Latest rebase on llama.cpp with gguf support. 1 year ago
Adam Treat 99c106e6b5 Fix a bug seen on AMD RADEON cards with vulkan backend. 1 year ago
Jacob Nguyen e86c63750d Update llama.cpp.cmake
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
1 year ago
Adam Treat 84905aa281 Fix for crashes on systems where vulkan is not installed properly. 1 year ago
Adam Treat 045f6e6cdc Link against ggml in bin so we can get the available devices without loading a model. 1 year ago
Adam Treat aa33419c6e Fallback to CPU more robustly. 1 year ago
Adam Treat 9013a089bd Bump to new llama with new bugfix. 1 year ago
Adam Treat 3076e0bf26 Only show GPU when we're actually using it. 1 year ago
Adam Treat cf4eb530ce Sync to a newer version of llama.cpp with bugfix for vulkan. 1 year ago
Adam Treat 4b9a345aee Update the submodule. 1 year ago
Aaron Miller 6f038c136b init at most one vulkan device, submodule update
fixes issues w/ multiple of the same gpu
1 year ago
Adam Treat 8f99dca70f Bring the vulkan backend to the GUI. 1 year ago
Aaron Miller f0735efa7d vulkan python bindings on windows fixes 1 year ago