Adam Treat
5d346e13d7
Add q6_k kernels for vulkan.
2023-10-05 18:16:19 -04:00
Adam Treat
4eefd386d0
Refactor for subgroups on mat * vec kernel.
2023-10-05 18:16:19 -04:00
Cebtenzzre
3c2aa299d8
gptj: remove unused variables
2023-10-05 18:16:19 -04:00
Cebtenzzre
f9deb87d20
convert scripts: add feed-forward length for better compatibility
...
This GGUF key is used by all llama.cpp models with upstream support.
2023-10-05 18:16:19 -04:00
Cebtenzzre
cc7675d432
convert scripts: make gptj script executable
2023-10-05 18:16:19 -04:00
Cebtenzzre
0493e6eb07
convert scripts: use bytes_to_unicode from transformers
2023-10-05 18:16:19 -04:00
Cebtenzzre
a49a1dcdf4
chatllm: grammar fix
2023-10-05 18:16:19 -04:00
Cebtenzzre
d5d72f0361
gpt-j: update inference to match latest llama.cpp insights
...
- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
2023-10-05 18:16:19 -04:00
Cebtenzzre
050e7f076e
backend: port GPT-J to GGUF
2023-10-05 18:16:19 -04:00
Cebtenzzre
31b20f093a
modellist: fix the system prompt
2023-10-05 18:16:19 -04:00
Cebtenzzre
8f3abb37ca
fix references to removed model types
2023-10-05 18:16:19 -04:00
Cebtenzzre
4219c0e2e7
convert scripts: make them directly executable
2023-10-05 18:16:19 -04:00
Cebtenzzre
ce7be1db48
backend: use llamamodel.cpp for Falcon
2023-10-05 18:16:19 -04:00
Cebtenzzre
cca9e6ce81
convert_mpt_hf_to_gguf.py: better tokenizer decoding
2023-10-05 18:16:19 -04:00
Cebtenzzre
25297786db
convert scripts: load model as late as possible
2023-10-05 18:16:19 -04:00
Cebtenzzre
fd47088f2b
conversion scripts: cleanup
2023-10-05 18:16:19 -04:00
Cebtenzzre
6277eac9cc
backend: use llamamodel.cpp for StarCoder
2023-10-05 18:16:19 -04:00
Cebtenzzre
aa706ab1ff
backend: use gguf branch of llama.cpp-mainline
2023-10-05 18:16:19 -04:00
Cebtenzzre
17fc9e3e58
backend: port Replit to GGUF
2023-10-05 18:16:19 -04:00
Cebtenzzre
7c67262a13
backend: port MPT to GGUF
2023-10-05 18:16:19 -04:00
Cebtenzzre
42bcb814b3
backend: port BERT to GGUF
2023-10-05 18:16:19 -04:00
Cebtenzzre
4392bf26e0
pyllmodel: print specific error message
2023-10-05 18:16:19 -04:00
Cebtenzzre
34f2ec2b33
gpt4all.py: GGUF
2023-10-05 18:16:19 -04:00
Cebtenzzre
1d29e4696c
llamamodel: metal supports all quantization types now
2023-10-05 18:16:19 -04:00
Aaron Miller
507753a37c
macos build fixes
2023-10-05 18:16:19 -04:00
Adam Treat
d90d003a1d
Latest rebase on llama.cpp with gguf support.
2023-10-05 18:16:19 -04:00
Akarshan Biswas
5f3d739205
appdata: update software description
2023-10-05 10:12:43 -04:00
Akarshan Biswas
b4cf12e1bd
Update to 2.4.19
2023-10-05 10:12:43 -04:00
Akarshan Biswas
21a5709b07
Remove unnecessary stuff from manifest
2023-10-05 10:12:43 -04:00
Akarshan Biswas
4426640f44
Add flatpak manifest
2023-10-05 10:12:43 -04:00
Aaron Miller
6711bddc4c
launch browser instead of maintenancetool from offline builds
2023-09-27 11:24:21 -07:00
Aaron Miller
7f979c8258
Build offline installers in CircleCI
2023-09-27 11:24:21 -07:00
Adam Treat
99c106e6b5
Fix a bug seen on AMD Radeon cards with the Vulkan backend.
2023-09-26 11:59:47 -04:00
Andriy Mulyar
9611c4081a
Update README.md
...
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-09-20 15:50:28 -04:00
kevinbazira
17cb4a86d1
Replace git clone SSH URI with HTTPS URL
...
Running `git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git`
returns `Permission denied (publickey)` as shown below:
```
git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git
Cloning into gpt4all...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
```
This change replaces `git@github.com:nomic-ai/gpt4all.git` with
`https://github.com/nomic-ai/gpt4all.git`, which runs without permission issues.
resolves nomic-ai/gpt4all#8, resolves nomic-ai/gpt4all#49
2023-09-20 09:48:47 -04:00
Andriy Mulyar
0d1edaf029
Update README.md with GPU support
...
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-09-19 10:51:17 -04:00
Adam Treat
dc80d1e578
Fix up the offline installer.
2023-09-18 16:21:50 -04:00
Jacob Nguyen
e86c63750d
Update llama.cpp.cmake
...
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
2023-09-16 11:42:56 -07:00
Adam Treat
f47e698193
Release notes for v2.4.19 and bump the version.
2023-09-16 12:35:08 -04:00
Adam Treat
84905aa281
Fix for crashes on systems where vulkan is not installed properly.
2023-09-16 12:19:46 -04:00
Adam Treat
ecf014f03b
Release notes for v2.4.18 and bump the version.
2023-09-16 10:21:50 -04:00
Adam Treat
e6e724d2dc
Actually bump the version.
2023-09-16 10:07:20 -04:00
Adam Treat
06a833e652
Send actual and requested device info for those who have opted in.
2023-09-16 09:42:22 -04:00
Adam Treat
045f6e6cdc
Link against ggml in bin so we can get the available devices without loading a model.
2023-09-15 14:45:25 -04:00
Adam Treat
0f046cf905
Bump the Python version to python-v1.0.12 to restrict the quants that vulkan recognizes.
2023-09-15 09:12:20 -04:00
Adam Treat
655372dbfa
Release notes for v2.4.17 and bump the version.
2023-09-14 17:11:04 -04:00
Adam Treat
aa33419c6e
Fall back to CPU more robustly.
2023-09-14 16:53:11 -04:00
Adam Treat
79843c269e
Release notes for v2.4.16 and bump the version.
2023-09-14 11:24:25 -04:00
Adam Treat
9013a089bd
Bump to new llama with new bugfix.
2023-09-14 10:02:11 -04:00
Adam Treat
3076e0bf26
Only show GPU when we're actually using it.
2023-09-14 09:59:19 -04:00