Adam Treat
8b80345c98
Copy pasta.
2023-05-08 19:10:22 -04:00
Adam Treat
af4a67c109
Fix for special im_end token in mpt-7b-chat model.
2023-05-08 18:57:40 -04:00
Aaron Miller
5002614b20
mpt: allow q4_2 quantized models to load
2023-05-08 18:23:36 -04:00
Aaron Miller
832720dd27
mpt tokenizer: better special token handling
Closer to the behavior of huggingface `tokenizers`: do not attempt to handle added tokens as if they were part of the original vocabulary, since that cannot prevent them from being split into smaller chunks. Instead, handle added tokens *before* the regular tokenizing pass.
Note this is still necessary even with a "proper" tokenizer implementation.
2023-05-08 18:23:36 -04:00
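The approach described in the commit above can be sketched as follows. This is an illustrative sketch, not the actual mpt/ggml code: the function name `tokenize_with_added_tokens` and the `added_tokens` / `base_tokenize` parameters are assumptions. The idea is to split the input on added tokens first, so the base tokenizer never sees them and cannot break them into smaller pieces.

```python
# Sketch (not the actual repo code): handle added/special tokens by
# scanning for them *before* the regular tokenizing pass, so the base
# tokenizer can never split an added token into smaller chunks.
# `added_tokens` maps token string -> id; `base_tokenize` is any
# plain-vocabulary tokenizer for the spans between added tokens.

def tokenize_with_added_tokens(text, added_tokens, base_tokenize):
    # Match longest added tokens first so overlapping prefixes resolve greedily.
    specials = sorted(added_tokens, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for tok in specials:
            if text.startswith(tok, i):
                out.append(added_tokens[tok])  # emit the added token's id whole
                i += len(tok)
                break
        else:
            # No added token here: base-tokenize up to the next occurrence.
            nxt = min((text.find(t, i) for t in specials if text.find(t, i) != -1),
                      default=len(text))
            out.extend(base_tokenize(text[i:nxt]))
            i = nxt
    return out
```

With a toy byte-level base tokenizer, `tokenize_with_added_tokens("<|im_start|>hi<|im_end|>", {"<|im_start|>": 1, "<|im_end|>": 2}, lambda s: [ord(c) for c in s])` keeps both chat-special tokens intact as single ids, which is exactly what the `im_end` fix in the mpt-7b-chat commit relies on.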
Adam Treat
98aedd2173
Match Helly's impl of kv cache.
2023-05-08 12:21:30 -04:00
Adam Treat
eb77d5157b
Use F16 for kv cache on mpt.
2023-05-08 12:21:30 -04:00
Adam Treat
dc559c1575
Fix for special tokens.
2023-05-08 12:21:30 -04:00
Adam Treat
b6886c0e31
Fix up mpt.
2023-05-08 12:21:30 -04:00
Zach Nussbaum
61e2aabadb
fix: helly changes
2023-05-08 12:21:30 -04:00
Zach Nussbaum
d30be81506
fix: model loading
2023-05-08 12:21:30 -04:00
Zach Nussbaum
2f6ecbe798
feat: build works + tokenizer
2023-05-08 12:21:30 -04:00
Zach Nussbaum
525b703984
feat: add ln 2, rename vars
2023-05-08 12:21:30 -04:00
Zach Nussbaum
aef524b460
feat: mpt wip
2023-05-08 12:21:30 -04:00
Adam Treat
159053be5a
Scaffolding for the mpt <-> ggml project.
2023-05-08 12:21:30 -04:00