Adam Treat
9c66308922
Fix for the special im_end token in the mpt-7b-chat model.
2023-05-08 18:57:40 -04:00
Adam Treat
a4bec78ec6
Allow these to load for gptj too.
2023-05-08 18:31:20 -04:00
Aaron Miller
821b28a4fa
mpt: allow q4_2 quantized models to load
2023-05-08 18:23:36 -04:00
Aaron Miller
49fc7b315a
mpt tokenizer: better special token handling
...
To get closer to the behavior of huggingface `tokenizers`, do not
attempt to handle added tokens as if they were part of the original
vocabulary, since that cannot prevent them from being split into
smaller chunks; instead, handle added tokens *before* the regular
tokenizing pass.
Note this is still necessary even with a "proper" tokenizer implementation.
2023-05-08 18:23:36 -04:00
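The pre-pass described here is straightforward to sketch. A minimal illustration in C++, where `bpe_tokenize` is a hypothetical stand-in for the regular pass over the original vocabulary: occurrences of added tokens are cut out of the input first, so the regular pass can never split them.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the regular pass over the original vocabulary.
std::vector<int32_t> bpe_tokenize(const std::string &text);

// Tokenize `text`, handling `added_tokens` (token string -> id) *before*
// the regular pass, so added tokens can never be split into smaller chunks.
std::vector<int32_t> tokenize_with_added(
        const std::string &text,
        const std::vector<std::pair<std::string, int32_t>> &added_tokens) {
    std::vector<int32_t> out;
    size_t pos = 0;
    while (pos < text.size()) {
        // Find the earliest occurrence of any added token at or after `pos`.
        size_t best_at = std::string::npos, best_idx = 0;
        for (size_t i = 0; i < added_tokens.size(); ++i) {
            size_t at = text.find(added_tokens[i].first, pos);
            if (at < best_at) { best_at = at; best_idx = i; }
        }
        if (best_at == std::string::npos) {
            // No added tokens remain: regular pass on the rest of the input.
            for (int32_t id : bpe_tokenize(text.substr(pos)))
                out.push_back(id);
            break;
        }
        // Regular pass on the text before the added token...
        for (int32_t id : bpe_tokenize(text.substr(pos, best_at - pos)))
            out.push_back(id);
        // ...then emit the added token's id verbatim.
        out.push_back(added_tokens[best_idx].second);
        pos = best_at + added_tokens[best_idx].first.size();
    }
    return out;
}
```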
Adam Treat
9da4fac023
Fix gptj to have lower memory requirements for the kv cache, and add versioning to the internal state so that such a fix can be handled smoothly in the future.
2023-05-08 17:23:02 -04:00
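The versioning idea is simple to sketch. A minimal illustration with made-up names and constants: a version word written ahead of the kv cache bytes lets a later reader detect a save produced under the old, larger layout and migrate or discard it instead of misreading it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Made-up version constant: bump whenever the serialized layout changes.
static const uint32_t kStateVersion = 1;

// Write a version word ahead of the kv cache bytes.
size_t save_state(uint8_t *dst, const uint8_t *kv_data, size_t kv_size) {
    std::memcpy(dst, &kStateVersion, sizeof(kStateVersion));
    std::memcpy(dst + sizeof(kStateVersion), kv_data, kv_size);
    return sizeof(kStateVersion) + kv_size;
}

// Refuse (or migrate) a save produced under a different layout instead of
// silently misreading it.
bool restore_state(const uint8_t *src, uint8_t *kv_data, size_t kv_size) {
    uint32_t version;
    std::memcpy(&version, src, sizeof(version));
    if (version != kStateVersion)
        return false; // different layout: caller migrates or discards
    std::memcpy(kv_data, src + sizeof(version), kv_size);
    return true;
}
```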
Adam Treat
c7f5280f9f
Fix the version.
2023-05-08 16:50:21 -04:00
Adam Treat
be9e748abe
Remove this, as upstream has removed it.
2023-05-08 15:09:23 -04:00
Adam Treat
126dd97b0a
This shouldn't have snuck in.
2023-05-08 15:09:23 -04:00
Adam Treat
3a8ad1f700
Update to the alibi version that Zach made.
2023-05-08 12:27:01 -04:00
Adam Treat
90b2bcfebe
Match Helly's implementation of the kv cache.
2023-05-08 12:21:30 -04:00
Adam Treat
368886015d
Use F16 for kv cache on mpt.
2023-05-08 12:21:30 -04:00
Adam Treat
00804c4e3e
Fix for special tokens.
2023-05-08 12:21:30 -04:00
Adam Treat
98e19ebc25
Fix up mpt.
2023-05-08 12:21:30 -04:00
Zach Nussbaum
712aeb8866
fix: Helly's changes
2023-05-08 12:21:30 -04:00
Zach Nussbaum
d14b93222f
fix: model loading
2023-05-08 12:21:30 -04:00
Zach Nussbaum
28f0f76b9f
fix: get convert script working
2023-05-08 12:21:30 -04:00
Zach Nussbaum
d928540a08
feat: load model
2023-05-08 12:21:30 -04:00
Zach Nussbaum
f8f248c18a
chore: import for mpt
2023-05-08 12:21:30 -04:00
Zach Nussbaum
e3f17c8e82
feat: mpt convert from hf to ggml
2023-05-08 12:21:30 -04:00
Zach Nussbaum
285e57ca68
feat: build works + tokenizer
2023-05-08 12:21:30 -04:00
Zach Nussbaum
199a585ad1
feat: add ln 2, rename vars
2023-05-08 12:21:30 -04:00
Zach Nussbaum
21f2aa4911
feat: mpt wip
2023-05-08 12:21:30 -04:00
Adam Treat
a066cba17d
Scaffolding for the mpt <-> ggml project.
2023-05-08 12:21:30 -04:00
Adam Treat
da5b057041
Only generate three words max.
2023-05-08 12:21:30 -04:00
Adam Treat
2b76fa6b20
Restore defaults for repeat penalty too.
2023-05-08 12:21:30 -04:00
Adam Treat
ee016e10ab
Send info on how many users are running into this error.
2023-05-08 08:31:35 -04:00
Adam Treat
e0c9d7f8e0
Fail early and gracefully if incompatible hardware is detected, and default to universal builds on Mac.
2023-05-08 08:23:00 -04:00
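One plausible shape for such a startup check, assuming the incompatibility in question is a missing AVX instruction set (the function name and message are illustrative, not the project's actual code); `__builtin_cpu_supports` is a GCC/Clang builtin:

```cpp
#include <cstdio>
#include <cstdlib>

// Assumed incompatibility: a CPU without AVX.
static bool hardwareIsCompatible() {
#if defined(__x86_64__) || defined(__i386__)
    return __builtin_cpu_supports("avx");
#else
    return true; // non-x86 builds (e.g. Apple Silicon) take a different path
#endif
}

int main() {
    if (!hardwareIsCompatible()) {
        std::fprintf(stderr, "Incompatible hardware detected: this CPU lacks "
                             "the AVX support required by the backend.\n");
        return EXIT_FAILURE; // fail early instead of crashing mid-inference
    }
    // ... normal application startup continues here ...
    return EXIT_SUCCESS;
}
```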
Adam Treat
4bcc88b051
Convert the old format properly.
2023-05-08 05:53:16 -04:00
Adam Treat
fb464bb60e
Add debug output for chatllm model loading, and fix the order of getting rid of the
...
dummy chat when no models are restored.
2023-05-07 14:40:02 -04:00
Adam Treat
3a039c8dc1
Deserialize chats one at a time, and don't block the GUI until all of them are done.
2023-05-07 09:20:09 -04:00
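A minimal sketch of the one-at-a-time idea in Qt, since the GUI is Qt-based; `deserializeChat` is hypothetical. Scheduling each load with a zero-delay single-shot timer yields to the event loop between chats, so painting and input keep flowing:

```cpp
#include <QString>
#include <QStringList>
#include <QTimer>

void deserializeChat(const QString &path); // hypothetical: load one chat file

// Restore saved chats one per event-loop pass instead of blocking the GUI
// until every chat on disk has been deserialized.
void restoreNext(QStringList pending) {
    if (pending.isEmpty())
        return;                           // all chats restored
    deserializeChat(pending.takeFirst()); // load exactly one chat
    // Yield to the event loop before the next one so the GUI stays responsive.
    QTimer::singleShot(0, [pending]() { restoreNext(pending); });
}
```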
Adam Treat
fc8c158fac
Use the last LTS for this.
2023-05-07 06:39:32 -04:00
Adam Treat
280ad04c63
The GUI should come up immediately and not wait on deserializing from disk.
2023-05-06 20:01:14 -04:00
Adam Treat
ec7ea8a550
Move the location of the chat files to the model download directory and add a magic+version.
2023-05-06 18:51:49 -04:00
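A minimal sketch of the magic+version idea using Qt's QDataStream; the constant values are made up. A reader can then reject files that are not chat saves at all (bad magic) or that were written by a newer, incompatible layout (unknown version):

```cpp
#include <QDataStream>
#include <QFile>

static const quint32 kChatMagic   = 0xDEADBEEF; // hypothetical value
static const qint32  kChatVersion = 1;          // hypothetical value

bool writeChatHeader(QFile &file) {
    QDataStream out(&file);
    out << kChatMagic << kChatVersion;
    return out.status() == QDataStream::Ok;
}

bool readChatHeader(QFile &file) {
    QDataStream in(&file);
    quint32 magic; qint32 version;
    in >> magic >> version;
    // Accept saves written at or below the version this reader understands.
    return magic == kChatMagic && version <= kChatVersion;
}
```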
Aaron Miller
516a7ffa23
add name to LICENSE
2023-05-06 13:11:39 -04:00
Adam Treat
eb294d5623
Bump the version and save up to an order of magnitude of disk space for chat files.
2023-05-05 20:12:00 -04:00
Adam Treat
6ba0a1b693
Turn off saving chats to disk by default as it eats so much disk space.
2023-05-05 12:30:11 -04:00
Adam Treat
ba76cecbdf
Bump the version.
2023-05-05 11:43:25 -04:00
Adam Treat
b7b2ff8bab
Add reverse prompt support for gptj too.
2023-05-05 11:16:24 -04:00
Adam Treat
c2a81e5692
Add about dialog.
2023-05-05 10:47:05 -04:00
Adam Treat
cd83723ed7
Persistent state for gpt-j models too.
2023-05-05 10:00:17 -04:00
Adam Treat
a548448fcf
Don't crash if state has not been set.
2023-05-05 10:00:17 -04:00
Richard Guo
561acf81d7
Update monorepo_plan.md
2023-05-05 09:32:45 -04:00
Aaron Miller
56e9fd7e63
include <cstdint> in llmodel.h
2023-05-04 20:36:19 -04:00
Adam Treat
01e582f15b
First attempt at providing a persistent chat list experience.
...
Limitations:
1) Context is not restored for gpt-j models
2) When you switch between different model types in an existing chat,
the context and the entire conversation are lost
3) The settings are not chat- or conversation-specific
4) The persisted chat files are very large due to how much data the
llama.cpp backend tries to persist. Need to investigate how we can
shrink this.
2023-05-04 15:31:41 -04:00
Adam Treat
02c9bb4ac7
Restore the model when switching chats.
2023-05-03 12:45:14 -04:00
Adam Treat
078675386f
Experiment with a much shorter default prompt template.
2023-05-03 12:19:14 -04:00
Adam Treat
97ec9074e5
Add reverse prompts for llama models.
2023-05-03 11:58:26 -04:00
Adam Treat
fec5093351
Don't exceed the window size for dialogs.
2023-05-03 08:37:45 -04:00
Adam Treat
005898b1bc
Changes the datalake feature so all conversations are captured when the user has opted in.
2023-05-03 07:54:45 -04:00
Aaron Miller
f487118007
download: make model downloads resumable
...
* save files as `incomplete-{filename}` in the destination folder
* rename into place after the hash is confirmed, or delete if the hash is bad
* resume downloads using HTTP `Range`
* if the download is resumed from a different app session, rewind a bit;
this deals with the case where the file size changes before the
content is fully flushed out
* flush the destination file at the end of readyRead; this mitigates the
above and provides backpressure on the download if the destination disk
is slower than the network connection
2023-05-02 20:36:25 -04:00
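A condensed sketch of this flow using Qt's network classes (the helper name and the 1 MiB rewind amount are made up; the real code may differ): the partial file is truncated back a little, a `Range` header asks the server to continue from there, and every `readyRead` is flushed to disk so a slow disk throttles the transfer.

```cpp
#include <QFile>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QUrl>

// `incomplete` is the already-open "incomplete-{filename}" file; the caller
// must keep it alive for the duration of the transfer.
QNetworkReply *startOrResume(QNetworkAccessManager &nam, const QUrl &url,
                             QFile &incomplete) {
    qint64 have = incomplete.size();
    qint64 rewind = qMin<qint64>(have, 1 << 20); // rewind 1 MiB, arbitrary
    qint64 from = have - rewind;
    incomplete.resize(from); // drop possibly unflushed tail from a prior run
    incomplete.seek(from);

    QNetworkRequest req(url);
    if (from > 0) // HTTP Range: continue where the previous session stopped
        req.setRawHeader("Range",
                         QByteArray("bytes=") + QByteArray::number(from) + "-");
    QNetworkReply *reply = nam.get(req);
    QObject::connect(reply, &QNetworkReply::readyRead, [reply, &incomplete]() {
        incomplete.write(reply->readAll());
        incomplete.flush(); // backpressure if the disk is slower than the net
    });
    // After the transfer: verify the hash, then rename into place or delete.
    return reply;
}
```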