Commit Graph

1463 Commits (7a190473292604734f02f1a347020ab4f44a4617)
Author SHA1 Message Date
cebtenzzre 7a19047329 llmodel: do not call magic_match unless build variant is correct (#1488) 11 months ago
Adam Treat df8528df73 Another codespell attempted fix. 11 months ago
Adam Treat f0742c22f4 Restore state from text if necessary. 11 months ago
Adam Treat 35f9cdb70a Do not delete saved chats if we fail to serialize properly. 11 months ago
cebtenzzre 9fb135e020 cmake: install the GPT-J plugin (#1487) 11 months ago
Cebtenzzre df66226f7d issue template: remove "Related Components" section 11 months ago
Aaron Miller 3c25d81759 make codespell happy 11 months ago
Jan Philipp Harries 4f0cee9330 added EM German Mistral Model 11 months ago
Adam Treat 56c0d2898d Update the language here to avoid misunderstanding. 11 months ago
Adam Treat b2cd3bdb3f Fix crasher with an empty string for prompt template. 11 months ago
Cebtenzzre 5fe685427a chat: clearer CPU fallback messages 11 months ago
Adam Treat eec906aa05 Speculative fix for build on mac. 11 months ago
Aaron Miller 9325075f80 fix stray comma in models2.json 11 months ago
    Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
Adam Treat a9acdd25de Push a new version number for llmodel backend now that it is based on gguf. 11 months ago
Adam Treat f028f67c68 Add starcoder, rift and sbert to our models2.json. 11 months ago
Aaron Miller a10f3aea5e python/embed4all: use gguf model, allow passing kwargs/overriding model 11 months ago
Cebtenzzre 8bb6a6c201 rebase on newer llama.cpp 11 months ago
Adam Treat 4528f73479 Reorder and refresh our models2.json. 11 months ago
Cebtenzzre d87573ea75 remove old llama.cpp submodules 11 months ago
Cebtenzzre cc6db61c93 backend: fix build with Visual Studio generator 11 months ago
    Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
    is needed because Visual Studio is a multi-configuration generator, so
    we do not know what the build type will be until `cmake --build` is
    called.

    Fixes #1470
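As an aside, the $<CONFIG> fix described in that commit body can be illustrated with a minimal sketch. This is not the repository's actual CMakeLists.txt; the target and destination names below are hypothetical, and only the technique (a generator expression in place of CMAKE_BUILD_TYPE) comes from the commit message.

```cmake
# With a single-configuration generator (Makefiles, Ninja), CMAKE_BUILD_TYPE
# is fixed at configure time. Visual Studio is a multi-configuration
# generator: the configuration is only chosen at build time, e.g.
#   cmake --build . --config Release

# Fragile: CMAKE_BUILD_TYPE is typically empty under Visual Studio,
# because no build type has been selected yet at configure time.
install(TARGETS llmodel DESTINATION lib/${CMAKE_BUILD_TYPE})

# Robust: $<CONFIG> is a generator expression, expanded at build/install
# time to whichever configuration was actually built (Debug, Release, ...).
install(TARGETS llmodel DESTINATION lib/$<CONFIG>)
```

With the generator expression, the same CMakeLists.txt behaves identically for single- and multi-configuration generators, which is why it is the usual replacement for CMAKE_BUILD_TYPE in paths and conditions.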
Adam Treat f605a5b686 Add q8_0 kernels to kompute shaders and bump to latest llama/gguf. 11 months ago
Cebtenzzre 1534df3e9f backend: do not use Vulkan with non-LLaMA models 11 months ago
Cebtenzzre 672cb850f9 differentiate between init failure and unsupported models 11 months ago
Cebtenzzre a5b93cf095 more accurate fallback descriptions 11 months ago
Cebtenzzre 75deee9adb chat: make sure to clear fallback reason on success 11 months ago
Cebtenzzre 2eb83b9f2a chat: report reason for fallback to CPU 11 months ago
Adam Treat 906699e8e9 Bump to latest llama/gguf branch. 11 months ago
Adam Treat ea66669cef Switch to new models2.json for new gguf release and bump our version to 2.5.0. 11 months ago
Cebtenzzre 088afada49 llamamodel: fix static vector in LLamaModel::endTokens 11 months ago
Adam Treat b4d82ea289 Bump to the latest fixes for vulkan in llama. 11 months ago
Adam Treat 12f943e966 Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf. 11 months ago
Cebtenzzre 40c78d2f78 python binding: print debug message to stderr 11 months ago
Adam Treat 5d346e13d7 Add q6_k kernels for vulkan. 11 months ago
Adam Treat 4eefd386d0 Refactor for subgroups on mat * vec kernel. 11 months ago
Cebtenzzre 3c2aa299d8 gptj: remove unused variables 11 months ago
Cebtenzzre f9deb87d20 convert scripts: add feed-forward length for better compatiblilty 11 months ago
    This GGUF key is used by all llama.cpp models with upstream support.
Cebtenzzre cc7675d432 convert scripts: make gptj script executable 11 months ago
Cebtenzzre 0493e6eb07 convert scripts: use bytes_to_unicode from transformers 11 months ago
Cebtenzzre a49a1dcdf4 chatllm: grammar fix 11 months ago
Cebtenzzre d5d72f0361 gpt-j: update inference to match latest llama.cpp insights 11 months ago
    - Use F16 KV cache
    - Store transposed V in the cache
    - Avoid unnecessary Q copy

    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

    ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
Cebtenzzre 050e7f076e backend: port GPT-J to GGUF 11 months ago
Cebtenzzre 31b20f093a modellist: fix the system prompt 11 months ago
Cebtenzzre 8f3abb37ca fix references to removed model types 11 months ago
Cebtenzzre 4219c0e2e7 convert scripts: make them directly executable 11 months ago
Cebtenzzre ce7be1db48 backend: use llamamodel.cpp for Falcon 11 months ago
Cebtenzzre cca9e6ce81 convert_mpt_hf_to_gguf.py: better tokenizer decoding 11 months ago
Cebtenzzre 25297786db convert scripts: load model as late as possible 11 months ago
Cebtenzzre fd47088f2b conversion scripts: cleanup 11 months ago
Cebtenzzre 6277eac9cc backend: use llamamodel.cpp for StarCoder 11 months ago
Cebtenzzre aa706ab1ff backend: use gguf branch of llama.cpp-mainline 11 months ago