Commit Graph

872 Commits

Author SHA1 Message Date
Adam Treat
5f372bd881 Gracefully handle the case where a previous chat's model has gone away. 2023-05-08 20:51:03 -04:00
Adam Treat
8b80345c98 Copy pasta. 2023-05-08 19:10:22 -04:00
Adam Treat
af4a67c109 Fix for special im_end token in mpt-7b-chat model. 2023-05-08 18:57:40 -04:00
Adam Treat
d3ec333314 Allow these to load for gptj too. 2023-05-08 18:31:20 -04:00
Aaron Miller
5002614b20 mpt: allow q4_2 quantized models to load 2023-05-08 18:23:36 -04:00
Aaron Miller
832720dd27 mpt tokenizer: better special token handling
Closer to the behavior of huggingface `tokenizers`: do not attempt
to handle added tokens as if they were part of the original
vocabulary, since that cannot prevent them from being split into
smaller chunks. Instead, handle added tokens *before* the regular
tokenizing pass.

Note this is still necessary even with a "proper" tokenizer implementation.
2023-05-08 18:23:36 -04:00
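
To make the technique above concrete, here is a minimal sketch (not the repo's actual code) of handling added special tokens before the regular tokenizing pass; `regular_tokenize` and the `added_tokens` map are hypothetical stand-ins:

```cpp
// Sketch only: emit added tokens as single ids, and run the regular
// tokenizer only on the text between them.
#include <cstdint>
#include <map>
#include <string>
#include <vector>

std::vector<int32_t> tokenize_with_added(
        const std::string &text,
        const std::map<std::string, int32_t> &added_tokens, // e.g. "<|im_end|>" -> its id
        std::vector<int32_t> (*regular_tokenize)(const std::string &)) {
    std::vector<int32_t> out;
    size_t pos = 0;
    while (pos < text.size()) {
        // Find the earliest added token at or after `pos`.
        size_t best = std::string::npos;
        const std::pair<const std::string, int32_t> *hit = nullptr;
        for (const auto &tok : added_tokens) {
            size_t found = text.find(tok.first, pos);
            if (found != std::string::npos && found < best) {
                best = found;
                hit = &tok;
            }
        }
        if (hit == nullptr)
            break; // no more added tokens; the tail is handled below
        // Regular-tokenize the ordinary text before the added token...
        std::vector<int32_t> ids = regular_tokenize(text.substr(pos, best - pos));
        out.insert(out.end(), ids.begin(), ids.end());
        // ...then emit the added token as one id, so the regular pass can
        // never split it into smaller chunks.
        out.push_back(hit->second);
        pos = best + hit->first.size();
    }
    std::vector<int32_t> tail = regular_tokenize(text.substr(pos));
    out.insert(out.end(), tail.begin(), tail.end());
    return out;
}
```

Because each added token is emitted as a single id and never fed through the regular pass, it can never be split into smaller pieces, which is exactly the failure mode the commit describes.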
Adam Treat
8c4b8f215f Fix gptj to have lower memory requirements for the kv cache, and add versioning to the internal state to smoothly handle such a fix in the future. 2023-05-08 17:23:02 -04:00
Adam Treat
ccbd16cf18 Fix the version. 2023-05-08 16:50:21 -04:00
Adam Treat
a549871220 Remove this, as upstream has removed it. 2023-05-08 15:09:23 -04:00
Adam Treat
dfe85386b5 This shouldn't have snuck in. 2023-05-08 15:09:23 -04:00
Adam Treat
992e553cfa Update to the alibi version that Zach made. 2023-05-08 12:27:01 -04:00
Adam Treat
98aedd2173 Match Helly's impl of kv cache. 2023-05-08 12:21:30 -04:00
Adam Treat
eb77d5157b Use F16 for kv cache on mpt. 2023-05-08 12:21:30 -04:00
Adam Treat
dc559c1575 Fix for special tokens. 2023-05-08 12:21:30 -04:00
Adam Treat
b6886c0e31 Fix up mpt. 2023-05-08 12:21:30 -04:00
Zach Nussbaum
61e2aabadb fix: helly changes 2023-05-08 12:21:30 -04:00
Zach Nussbaum
d30be81506 fix: model loading 2023-05-08 12:21:30 -04:00
Zach Nussbaum
f732ba2d56 fix: convert script working 2023-05-08 12:21:30 -04:00
Zach Nussbaum
6a56bcaf06 feat: load model 2023-05-08 12:21:30 -04:00
Zach Nussbaum
58069dc8b9 chore: import for mpt 2023-05-08 12:21:30 -04:00
Zach Nussbaum
03bde18e49 feat: mpt convert from hf to ggml 2023-05-08 12:21:30 -04:00
Zach Nussbaum
2f6ecbe798 feat: build works + tokenizer 2023-05-08 12:21:30 -04:00
Zach Nussbaum
525b703984 feat: add ln 2, rename vars 2023-05-08 12:21:30 -04:00
Zach Nussbaum
aef524b460 feat: mpt wip 2023-05-08 12:21:30 -04:00
Adam Treat
159053be5a Scaffolding for the mpt <-> ggml project. 2023-05-08 12:21:30 -04:00
Adam Treat
40b976436a Only generate three words max. 2023-05-08 12:21:30 -04:00
Adam Treat
49a6a6ed65 Restore defaults for repeat penalty too. 2023-05-08 12:21:30 -04:00
Adam Treat
c054efa6ac Send info on how many users are running into this error. 2023-05-08 08:31:35 -04:00
Adam Treat
6d943917f1 Fail early/gracefully if incompatible hardware detected. And default to universal builds on mac. 2023-05-08 08:23:00 -04:00
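
As an illustration of the early-failure idea (not the app's actual check), a build that requires AVX on x86 might probe the CPU up front and exit with a clear message instead of crashing later:

```cpp
// Hypothetical sketch of failing early on incompatible hardware; assumes a
// GCC/Clang x86 build that requires AVX. Not the actual gpt4all check.
#include <cstdio>
#include <cstdlib>

int main() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (!__builtin_cpu_supports("avx")) {
        std::fprintf(stderr, "Incompatible hardware detected: CPU lacks AVX.\n");
        return EXIT_FAILURE; // fail early and gracefully, before loading a model
    }
#endif
    std::puts("Hardware check passed.");
    return EXIT_SUCCESS;
}
```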
Adam Treat
3c30310539 Convert the old format properly. 2023-05-08 05:53:16 -04:00
Adam Treat
7b66cb7119 Add debug output for chatllm model loading and fix the order of removing
the dummy chat when no models are restored.
2023-05-07 14:40:02 -04:00
Adam Treat
9bd5609ba0 Deserialize chats one at a time and don't block the GUI while waiting for all of them to finish. 2023-05-07 09:20:09 -04:00
Adam Treat
86da175e1c Use the last LTS release for this. 2023-05-07 06:39:32 -04:00
Adam Treat
ab13148430 The GUI should come up immediately and not wait on deserializing from disk. 2023-05-06 20:01:14 -04:00
Adam Treat
eb7b61a76d Move the location of the chat files to the model download directory and add a magic+version. 2023-05-06 18:51:49 -04:00
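
A magic number plus version at the front of each chat file lets the loader reject foreign files outright and migrate older formats deliberately. A minimal sketch, with hypothetical constants and layout (not the actual gpt4all-chat on-disk format):

```cpp
#include <cstdint>
#include <fstream>

// Hypothetical constants; the real magic and version live in the app.
static const uint32_t CHAT_MAGIC   = 0x43484154; // "CHAT"
static const uint32_t CHAT_VERSION = 1;

bool writeHeader(std::ofstream &out) {
    out.write(reinterpret_cast<const char *>(&CHAT_MAGIC), sizeof CHAT_MAGIC);
    out.write(reinterpret_cast<const char *>(&CHAT_VERSION), sizeof CHAT_VERSION);
    return out.good();
}

// Unknown magic means "not one of our files": refuse to load. A version
// lower than current is where a migration path would hook in.
bool readHeader(std::ifstream &in, uint32_t &version) {
    uint32_t magic = 0;
    in.read(reinterpret_cast<char *>(&magic), sizeof magic);
    in.read(reinterpret_cast<char *>(&version), sizeof version);
    return in.good() && magic == CHAT_MAGIC && version <= CHAT_VERSION;
}
```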
Aaron Miller
7a8f437f8f add name to LICENSE 2023-05-06 13:11:39 -04:00
Adam Treat
e397fda250 Bump the version and save up to an order of magnitude of disk space for chat files. 2023-05-05 20:12:00 -04:00
Adam Treat
8d2c8c8cb0 Turn off saving chats to disk by default as it eats so much disk space. 2023-05-05 12:30:11 -04:00
Adam Treat
6d4d86d07c Bump the version. 2023-05-05 11:43:25 -04:00
Adam Treat
d0d5d84e06 Add reverse prompt support for gptj too. 2023-05-05 11:16:24 -04:00
Adam Treat
06bb6960d4 Add about dialog. 2023-05-05 10:47:05 -04:00
Adam Treat
659442394f Persistent state for gpt-j models too. 2023-05-05 10:00:17 -04:00
Adam Treat
5b71d39024 Don't crash if state has not been set. 2023-05-05 10:00:17 -04:00
Richard Guo
7ab7d948b5 Update monorepo_plan.md 2023-05-05 09:32:45 -04:00
Aaron Miller
019f6d0103 include <cstdint> in llmodel.h 2023-05-04 20:36:19 -04:00
Adam Treat
f291853e51 First attempt at providing a persistent chat list experience.
Limitations:

1) Context is not restored for gpt-j models
2) When you switch between different model types in an existing chat,
   the context and the entire conversation are lost
3) The settings are not chat- or conversation-specific
4) The persisted chat files are very large due to how much data the
   llama.cpp backend tries to persist. We need to investigate how to
   shrink this.
2023-05-04 15:31:41 -04:00
Adam Treat
081d32bd97 Restore the model when switching chats. 2023-05-03 12:45:14 -04:00
Adam Treat
0bb52fc5fe Experiment with a much shorter default prompt template. 2023-05-03 12:19:14 -04:00
Adam Treat
82c1d08b33 Add reverse prompts for llama models. 2023-05-03 11:58:26 -04:00
Adam Treat
01accf9e33 Don't exceed the window size for dialogs. 2023-05-03 08:37:45 -04:00