no effect, but avoids a *potential* bug later if we use
actualVocabSize - which is for when a model has a larger
embedding tensor/# of output logits than actually trained tokens,
to allow room for adding extras in finetuning - presently all of our
models have had "placeholder" tokens in the vocab so this hasn't broken
anything, but if the sizes did differ we would want the equivalent of
`logits[:actualVocabSize]` (the start point is unchanged), not
`logits[-actualVocabSize:]` (which is what this code did).
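For illustration, a minimal standalone sketch (not the repo's code) of why the two slices differ once the logits buffer is padded past the trained vocabulary:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int main() {
    // Hypothetical sizes: the output tensor is padded past the trained
    // vocabulary to leave room for tokens added during finetuning.
    const std::size_t paddedSize      = 8; // rows in the embedding/output tensor
    const std::size_t actualVocabSize = 6; // tokens the model was actually trained on

    std::vector<float> logits(paddedSize);
    for (std::size_t i = 0; i < paddedSize; ++i)
        logits[i] = float(i); // stand-in logit values

    // Equivalent of logits[:actualVocabSize]: trained tokens, start unchanged.
    std::vector<float> front(logits.begin(), logits.begin() + actualVocabSize);

    // Equivalent of logits[-actualVocabSize:]: silently drops the first
    // logits and keeps the padding instead - wrong whenever the sizes differ.
    std::vector<float> back(logits.end() - actualVocabSize, logits.end());

    assert(front.front() == 0.0f); // token 0's logit is preserved
    assert(back.front() == 2.0f);  // token 0's logit is lost
    return 0;
}
```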
* porting over replit code model to gpt4all
* replaced memory with kv_self struct
* continuing debug
* welp it built but a lot of sus things
* working model loading and somewhat working generate... need to format response?
* revert to semi-working version
* finally got rid of weird formatting
* figured out problem is with python bindings - this is good to go for testing
* addressing PR feedback
* output refactor
* fixed prompt response collection
* cleanup
* addressing PR comments
* building replit backend with new ggmlver code
* chatllm replit and clean python files
* cleanup
* updated replit to match new llmodel api
* match llmodel api and change size_t to Token (see the sketch after this list)
* resolve PR comments
* replit model commit comment
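As referenced above, a hedged sketch of what the size_t -> Token change enables; that `Token` aliases `int32_t` as in llmodel is an assumption here, and `pick_next` is purely illustrative:

```cpp
#include <cstdint>
#include <vector>

// A dedicated signed token type documents intent and allows sentinel
// values, which size_t cannot represent.
using Token = std::int32_t;

// e.g. a sampling hook that can signal "no token" with a negative value.
Token pick_next(const std::vector<float> &logits) {
    if (logits.empty())
        return -1; // sentinel impossible with size_t
    Token best = 0;
    for (Token i = 1; i < static_cast<Token>(logits.size()); ++i)
        if (logits[i] > logits[best])
            best = i;
    return best;
}
```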
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squash-merged from dlopen_backend_5, where the history is preserved.
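For context, a minimal sketch of the dlopen-based plugin pattern this refers to; the `create_model` entry point and library name are hypothetical, not the repo's actual symbols:

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical entry point each backend library is assumed to export.
typedef void *(*create_model_fn)(const char *modelPath);

int main() {
    // Load one of several interchangeable backend builds at runtime.
    void *handle = dlopen("./libllmodel-backend.so", RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    // Resolve the backend's constructor symbol by name.
    auto create = reinterpret_cast<create_model_fn>(dlsym(handle, "create_model"));
    if (!create) {
        std::fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    void *model = create("model.bin"); // backend-specific loading happens here
    // ... use the model through a shared abstract interface ...
    (void)model;
    dlclose(handle);
    return 0;
}
```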
Improves output quality by making these tokenizers more closely
match the behavior of the huggingface `tokenizers`-based BPE
tokenizers that these models were trained with.
Featuring:
* Fixed unicode handling (via ICU)
* Fixed BPE token merge handling (sketched below)
* Complete added vocabulary handling
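For reference, a minimal sketch of the greedy lowest-rank merge loop that huggingface-style BPE tokenizers use; the tiny merge table here is hypothetical, for illustration only:

```cpp
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Repeatedly merge the adjacent pair with the *lowest* merge rank, rather
// than e.g. merging pairs left-to-right; getting this order wrong is the
// kind of merge-handling bug the fix above addresses.
std::vector<std::string> bpe(std::vector<std::string> pieces,
                             const std::map<std::pair<std::string, std::string>, int> &ranks) {
    while (pieces.size() > 1) {
        int bestRank = -1;
        std::size_t bestIdx = 0;
        for (std::size_t i = 0; i + 1 < pieces.size(); ++i) {
            auto it = ranks.find({pieces[i], pieces[i + 1]});
            if (it != ranks.end() && (bestRank < 0 || it->second < bestRank)) {
                bestRank = it->second;
                bestIdx = i;
            }
        }
        if (bestRank < 0)
            break; // no learned merge applies
        pieces[bestIdx] += pieces[bestIdx + 1];
        pieces.erase(pieces.begin() + bestIdx + 1);
    }
    return pieces;
}

int main() {
    std::map<std::pair<std::string, std::string>, int> ranks{
        {{"h", "e"}, 0}, {{"l", "l"}, 1}, {{"he", "ll"}, 2}, {{"hell", "o"}, 3}};
    for (const auto &p : bpe({"h", "e", "l", "l", "o"}, ranks))
        std::cout << p << ' '; // prints: hello
    std::cout << '\n';
}
```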
Fix occurrences of the prompt callback being incorrectly specified, or
the response callback's prototype being incorrectly used in its place.
Signed-off-by: Juuso Alasuutari <juuso.alasuutari@gmail.com>
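To illustrate the class of bug, a self-contained sketch with hypothetical callback prototypes (modeled loosely on llmodel's C API, not copied from it):

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical prototypes: the prompt callback receives only a token id,
// while the response callback also receives the generated text. The bug
// class fixed above is using one prototype where the other is expected,
// which compiles with a cast but then reads arguments that were never
// passed - exactly the kind of illegal access a sanitizer flags.
typedef bool (*prompt_callback)(int32_t token_id);
typedef bool (*response_callback)(int32_t token_id, const char *response);

static bool on_prompt(int32_t token_id) {
    std::printf("prompt token %d\n", token_id);
    return true; // keep generating
}

static bool on_response(int32_t token_id, const char *response) {
    std::printf("response token %d: %s\n", token_id, response);
    return true; // keep generating
}

// Stand-in for a generate loop that invokes each callback with exactly
// the argument list its prototype promises.
static void generate(prompt_callback pcb, response_callback rcb) {
    pcb(1);
    rcb(2, "hello");
}

int main() {
    generate(on_prompt, on_response); // each callback matches its slot
}
```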
Caught with AddressSanitizer running a basic prompt test against llmodel
standalone. This fix allows ASan builds to complete a simple prompt
without illegal accesses, but notably there are still several leaks.
Fixes the bug where llmodel_model_create prints "Invalid model file" even though the model is loaded correctly. Credits and thanks to @serendipity for the fix.
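As a hedged illustration of the fix's shape (names here are hypothetical, not the repo's code): the error message should be gated on the actual load result rather than printed on a path that also runs on success.

```cpp
#include <cstdio>

struct Model { bool ok; };

// Stand-in loader; pretend the file parses fine.
static Model *try_load(const char *path) {
    static Model m{true};
    (void)path;
    return m.ok ? &m : nullptr;
}

Model *model_create_sketch(const char *path) {
    Model *model = try_load(path);
    if (!model) // only complain when loading really failed
        std::fprintf(stderr, "Invalid model file\n");
    return model;
}

int main() {
    model_create_sketch("model.bin"); // no spurious error printed
}
```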