Commit Graph

1463 Commits (7a190473292604734f02f1a347020ab4f44a4617)

Author SHA1 Message Date
cebtenzzre 7a19047329 llmodel: do not call magic_match unless build variant is correct (#1488) 9 months ago
Adam Treat df8528df73 Another codespell attempted fix. 9 months ago
Adam Treat f0742c22f4 Restore state from text if necessary. 9 months ago
Adam Treat 35f9cdb70a Do not delete saved chats if we fail to serialize properly. 9 months ago
cebtenzzre 9fb135e020 cmake: install the GPT-J plugin (#1487) 9 months ago
Cebtenzzre df66226f7d issue template: remove "Related Components" section 9 months ago
Aaron Miller 3c25d81759 make codespell happy 9 months ago
Jan Philipp Harries 4f0cee9330 added EM German Mistral Model 9 months ago
Adam Treat 56c0d2898d Update the language here to avoid misunderstanding. 9 months ago
Adam Treat b2cd3bdb3f Fix crasher with an empty string for prompt template. 9 months ago
Cebtenzzre 5fe685427a chat: clearer CPU fallback messages 9 months ago
Adam Treat eec906aa05 Speculative fix for build on mac. 9 months ago
Aaron Miller 9325075f80 fix stray comma in models2.json 9 months ago
    Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
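The stray-comma fix above is worth a quick illustration: strict JSON parsers reject trailing commas, so a single stray comma breaks consumers of models2.json entirely. A minimal sketch, using Python's standard `json` module (the snippet below is hypothetical, not the real models2.json contents):

```python
import json

# A stray trailing comma, as fixed in this commit, makes strict JSON invalid.
bad = '{"models": [{"name": "mistral"},]}'   # hypothetical snippet
good = '{"models": [{"name": "mistral"}]}'

def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as strict JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json(bad))   # False
print(is_valid_json(good))  # True
```

Running a file through a check like this (or `python -m json.tool`) in CI is a common way to catch such regressions before release.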
Adam Treat a9acdd25de Push a new version number for llmodel backend now that it is based on gguf. 9 months ago
Adam Treat f028f67c68 Add starcoder, rift and sbert to our models2.json. 9 months ago
Aaron Miller a10f3aea5e python/embed4all: use gguf model, allow passing kwargs/overriding model 9 months ago
Cebtenzzre 8bb6a6c201 rebase on newer llama.cpp 9 months ago
Adam Treat 4528f73479 Reorder and refresh our models2.json. 9 months ago
Cebtenzzre d87573ea75 remove old llama.cpp submodules 9 months ago
Cebtenzzre cc6db61c93 backend: fix build with Visual Studio generator 9 months ago
    Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This is needed because Visual Studio is a multi-configuration generator, so we do not know what the build type will be until `cmake --build` is called.
    Fixes #1470
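To make the `$<CONFIG>` fix above concrete, here is a minimal CMake sketch of the pattern (the target and path names are hypothetical, not taken from this repository). With a multi-configuration generator, `CMAKE_BUILD_TYPE` is empty at configure time, while the `$<CONFIG>` generator expression is resolved at build time, after `cmake --build . --config Release` has chosen a configuration:

```cmake
# Wrong: resolved at configure time; empty under Visual Studio/Xcode.
# set(PLUGIN_DIR "${CMAKE_BINARY_DIR}/${CMAKE_BUILD_TYPE}/plugins")

# Right: $<CONFIG> expands to the active configuration at build time.
add_custom_command(TARGET myplugin POST_BUILD
    COMMAND ${CMAKE_COMMAND} -E copy
        $<TARGET_FILE:myplugin>
        "${CMAKE_BINARY_DIR}/$<CONFIG>/plugins/")
```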
Adam Treat f605a5b686 Add q8_0 kernels to kompute shaders and bump to latest llama/gguf. 9 months ago
Cebtenzzre 1534df3e9f backend: do not use Vulkan with non-LLaMA models 9 months ago
Cebtenzzre 672cb850f9 differentiate between init failure and unsupported models 9 months ago
Cebtenzzre a5b93cf095 more accurate fallback descriptions 9 months ago
Cebtenzzre 75deee9adb chat: make sure to clear fallback reason on success 9 months ago
Cebtenzzre 2eb83b9f2a chat: report reason for fallback to CPU 9 months ago
Adam Treat 906699e8e9 Bump to latest llama/gguf branch. 9 months ago
Adam Treat ea66669cef Switch to new models2.json for new gguf release and bump our version to 2.5.0. 9 months ago
Cebtenzzre 088afada49 llamamodel: fix static vector in LLamaModel::endTokens 9 months ago
Adam Treat b4d82ea289 Bump to the latest fixes for vulkan in llama. 9 months ago
Adam Treat 12f943e966 Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf. 9 months ago
Cebtenzzre 40c78d2f78 python binding: print debug message to stderr 9 months ago
Adam Treat 5d346e13d7 Add q6_k kernels for vulkan. 9 months ago
Adam Treat 4eefd386d0 Refactor for subgroups on mat * vec kernel. 9 months ago
Cebtenzzre 3c2aa299d8 gptj: remove unused variables 9 months ago
Cebtenzzre f9deb87d20 convert scripts: add feed-forward length for better compatibility 9 months ago
    This GGUF key is used by all llama.cpp models with upstream support.
Cebtenzzre cc7675d432 convert scripts: make gptj script executable 9 months ago
Cebtenzzre 0493e6eb07 convert scripts: use bytes_to_unicode from transformers 9 months ago
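For context on the `bytes_to_unicode` commit above: this is the well-known GPT-2 byte-to-unicode mapping that BPE tokenizers use to represent arbitrary bytes as printable characters. A sketch of the standard algorithm (printable bytes map to themselves; the rest are shifted past 255), reproduced here for illustration rather than copied from the repository:

```python
def bytes_to_unicode() -> dict[int, str]:
    """Map every byte 0-255 to a unique printable unicode character,
    as in the GPT-2 BPE tokenizer."""
    # Bytes that are already printable keep their own code point.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("\u00a1"), ord("\u00ac") + 1))
          + list(range(ord("\u00ae"), ord("\u00ff") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted into the 256+ range.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))
```

Reusing the canonical implementation (as the commit does, importing it from `transformers`) avoids subtle divergences between conversion scripts and the tokenizer at runtime.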
Cebtenzzre a49a1dcdf4 chatllm: grammar fix 9 months ago
Cebtenzzre d5d72f0361 gpt-j: update inference to match latest llama.cpp insights 9 months ago
    - Use F16 KV cache
    - Store transposed V in the cache
    - Avoid unnecessary Q copy
    Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
    ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
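The "store transposed V" point above can be illustrated numerically: keeping V transposed in the KV cache changes the memory layout (each output dimension becomes a contiguous row), not the math. A small NumPy sketch of the equivalence, as an illustration of the idea rather than the actual ggml code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_head = 5, 4
probs = rng.random((1, n_tokens))   # one row of softmax(QK^T)
V = rng.random((n_tokens, d_head))  # conventional cache layout

# Conventional layout: out = probs @ V
out_plain = probs @ V

# Transposed cache layout: store V^T so each head dimension is a
# contiguous row, then compute (V^T @ probs^T)^T instead.
V_T = V.T.copy()
out_transposed = (V_T @ probs.T).T

assert np.allclose(out_plain, out_transposed)
```

The F16 cache mentioned in the same commit is an orthogonal change: it halves KV-cache memory at a small precision cost.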
Cebtenzzre 050e7f076e backend: port GPT-J to GGUF 9 months ago
Cebtenzzre 31b20f093a modellist: fix the system prompt 9 months ago
Cebtenzzre 8f3abb37ca fix references to removed model types 9 months ago
Cebtenzzre 4219c0e2e7 convert scripts: make them directly executable 9 months ago
Cebtenzzre ce7be1db48 backend: use llamamodel.cpp for Falcon 9 months ago
Cebtenzzre cca9e6ce81 convert_mpt_hf_to_gguf.py: better tokenizer decoding 9 months ago
Cebtenzzre 25297786db convert scripts: load model as late as possible 9 months ago
Cebtenzzre fd47088f2b conversion scripts: cleanup 9 months ago
Cebtenzzre 6277eac9cc backend: use llamamodel.cpp for StarCoder 9 months ago
Cebtenzzre aa706ab1ff backend: use gguf branch of llama.cpp-mainline 9 months ago