Commit Graph

39 Commits

Author SHA1 Message Date
Aaron Miller
821b28a4fa mpt: allow q4_2 quantized models to load 2023-05-08 18:23:36 -04:00
Aaron Miller
49fc7b315a mpt tokenizer: better special token handling
closer to the behavior of huggingface `tokenizers`,
do not attempt to handle additional tokens as if they were part
of the original vocabulary as this cannot prevent them from being
split into smaller chunks - handle added tokens *before*
the regular tokenizing pass

note this is still necessary even with a "proper" tokenizer implementation
2023-05-08 18:23:36 -04:00
Adam Treat
9da4fac023 Fix gptj to have lower memory requirements for kv cache and add versioning to the internal state to smoothly handle such a fix in the future. 2023-05-08 17:23:02 -04:00
Adam Treat
be9e748abe Remove as upstream has removed. 2023-05-08 15:09:23 -04:00
Adam Treat
3a8ad1f700 Update to the alibi version that Zach made. 2023-05-08 12:27:01 -04:00
Adam Treat
90b2bcfebe Match Helly's impl of kv cache. 2023-05-08 12:21:30 -04:00
Adam Treat
368886015d Use F16 for kv cache on mpt. 2023-05-08 12:21:30 -04:00
Adam Treat
00804c4e3e Fix for special tokens. 2023-05-08 12:21:30 -04:00
Adam Treat
98e19ebc25 Fix up mpt. 2023-05-08 12:21:30 -04:00
Zach Nussbaum
712aeb8866 fix: helly changes 2023-05-08 12:21:30 -04:00
Zach Nussbaum
d14b93222f fix: model loading 2023-05-08 12:21:30 -04:00
Zach Nussbaum
d928540a08 feat: load model 2023-05-08 12:21:30 -04:00
Zach Nussbaum
285e57ca68 feat: build works + tokenizer 2023-05-08 12:21:30 -04:00
Zach Nussbaum
199a585ad1 feat: add ln 2, rename vars 2023-05-08 12:21:30 -04:00
Zach Nussbaum
21f2aa4911 feat: mpt wip 2023-05-08 12:21:30 -04:00
Adam Treat
a066cba17d Scaffolding for the mpt <-> ggml project. 2023-05-08 12:21:30 -04:00
Adam Treat
e0c9d7f8e0 Fail early/gracefully if incompatible hardware detected. And default to universal builds on mac. 2023-05-08 08:23:00 -04:00
Adam Treat
fb464bb60e Add debug for chatllm model loading and fix order of getting rid of the
dummy chat when no models are restored.
2023-05-07 14:40:02 -04:00
Adam Treat
b7b2ff8bab Add reverse prompt support for gptj too. 2023-05-05 11:16:24 -04:00
Adam Treat
cd83723ed7 Persistent state for gpt-j models too. 2023-05-05 10:00:17 -04:00
Aaron Miller
56e9fd7e63 include <cstdint> in llmodel.h 2023-05-04 20:36:19 -04:00
Adam Treat
01e582f15b First attempt at providing a persistent chat list experience.
Limitations:

1) Context is not restored for gpt-j models
2) When you switch between different model types in an existing chat
   the context and all the conversation is lost
3) The settings are not chat or conversation specific
4) The sizes of the chat persisted files are very large due to how much
   data the llama.cpp backend tries to persist. Need to investigate how
   we can shrink this.
2023-05-04 15:31:41 -04:00
Adam Treat
97ec9074e5 Add reverse prompts for llama models. 2023-05-03 11:58:26 -04:00
Adam Treat
34407f1563 Don't set the app version in the llmodel. 2023-04-29 10:31:12 -04:00
Adam Treat
2a5b34b193 Load models from filepath only. 2023-04-28 20:15:10 -04:00
Adam Treat
70ab18f644 Update to latest llama.cpp 2023-04-28 11:03:16 -04:00
Adam Treat
a3253c4ab1 Move the saving of the tokens to the impl and not the callbacks responsibility. 2023-04-27 11:16:51 -04:00
Adam Treat
9a65f73392 Move the promptCallback to own function. 2023-04-27 11:08:15 -04:00
Adam Treat
ebf660d2bd Provide an initial impl. of the C interface. NOTE: has not been tested. 2023-04-27 09:43:24 -04:00
Adam Treat
368cd8e119 Add this and unbreak the build. 2023-04-26 22:45:10 -04:00
Adam Treat
eafb98b3a9 Initial support for opt-in telemetry. 2023-04-26 22:05:56 -04:00
Adam Treat
70e6b45123 Don't crash when prompt is too large. 2023-04-26 19:08:37 -04:00
Adam Treat
b04ab8fb5c Update llama.cpp submodule to latest. 2023-04-26 11:50:05 -04:00
Adam Treat
ebc51b3e8d Clean up the docs a bit more still. 2023-04-26 08:22:38 -04:00
Adam Treat
ae7ca04408 Clean up the docs a bit more. 2023-04-26 08:22:38 -04:00
Adam Treat
4e5c4927fc Clean up the docs a bit. 2023-04-26 08:22:38 -04:00
Adam Treat
04190e6107 Only need one opaque pointer. 2023-04-26 08:22:38 -04:00
Adam Treat
d86b441c5d Fixup the api a bit. 2023-04-26 08:22:38 -04:00
Adam Treat
4b47478626 Move the backend code into own subdirectory and make it a shared library. Begin fleshing out the C api wrapper that bindings can use. 2023-04-26 08:22:38 -04:00