8 Commits

Author SHA1 Message Date
niansa
0855c0df1d Fixed Replit implementation compile warnings 2023-06-26 14:49:58 -03:00
Aaron Miller
1290b32451 update to latest mainline llama.cpp
add max_size param to ggml_metal_add_buffer - introduced in https://github.com/ggerganov/llama.cpp/pull/1826
2023-06-26 14:40:52 -03:00
Adam Treat
bd58c46da0 Initialize these to nullptr to prevent double deletion when a model fails to load. 2023-06-20 18:23:45 -04:00
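
The fix described in this commit can be illustrated with a minimal sketch. Everything below (`ToyContext`, `ToyModel`, `g_liveContexts`, the `succeed` flag) is hypothetical and only stands in for whatever pointers the real change touched. The idea is that an owning pointer that starts as `nullptr` and is reset after every `delete` can be released on the failure path and again in the destructor without being freed twice.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are not the types used in the real backend.
int g_liveContexts = 0;  // counts ToyContext objects currently alive

struct ToyContext {
    ToyContext() { ++g_liveContexts; }
    ~ToyContext() { --g_liveContexts; }
};

class ToyModel {
public:
    ToyModel() = default;
    ToyModel(const ToyModel &) = delete;             // owning raw pointer: no copies
    ToyModel &operator=(const ToyModel &) = delete;
    ~ToyModel() { release(); }

    // 'succeed' simulates whether loading the model file worked.
    bool load(bool succeed) {
        release();                 // drop any previously loaded context
        ctx_ = new ToyContext();
        if (!succeed) {
            release();             // failure path frees and resets the pointer
            return false;
        }
        return true;
    }

private:
    void release() {
        delete ctx_;               // deleting nullptr is a no-op
        ctx_ = nullptr;            // so a later release() cannot free twice
    }

    ToyContext *ctx_ = nullptr;    // never left indeterminate
};

// Small self-check: a failed load followed by destruction frees exactly once.
void checkFailedLoad() {
    {
        ToyModel model;
        assert(!model.load(false));
    }
    assert(g_liveContexts == 0);
}
```

Without the `= nullptr` initializer and the reset in `release()`, the destructor could delete an indeterminate or already freed pointer after a failed load.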
niansa/tuxifan
68f9786ed9
Use operator ""_MiB (#991) 2023-06-16 15:56:22 -04:00
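
For readers unfamiliar with user-defined literals, a sketch of what an `operator ""_MiB` could look like follows. The definition below is an assumption made for illustration; the operator actually added in #991 may differ in return type or placement.

```cpp
#include <cassert>
#include <cstddef>

// Assumed definition: converts a count of mebibytes into bytes at compile time.
constexpr std::size_t operator""_MiB(unsigned long long mebibytes) {
    return static_cast<std::size_t>(mebibytes) * 1024u * 1024u;
}

// Buffer sizes then read as quantities with a unit instead of raw arithmetic.
constexpr std::size_t kExampleScratchSize = 256_MiB;  // hypothetical constant

static_assert(1_MiB == 1048576u, "1 MiB is 2^20 bytes");

// Small self-check of the hypothetical constant.
void checkMiB() {
    assert(kExampleScratchSize == 256u * 1024u * 1024u);
}
```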
Aaron Miller
f71d8efc71
metal replit (#931)
metal+replit

makes replit work with Metal and removes its use of `mem_per_token`
in favor of fixed size scratch buffers (closer to llama.cpp)
2023-06-13 07:29:14 -07:00
Aaron Miller
88616fde7f
llmodel: change tokenToString to not use string_view (#968)
fixes a definite use-after-free and likely avoids some other
potential ones - std::string will convert to a std::string_view
automatically but as soon as the std::string in question goes out of
scope it is already freed and the string_view is pointing at freed
memory - this is *mostly* fine if it's returning a reference to the
tokenizer's internal vocab table but it's, imo, too easy to return a
reference to a dynamically constructed string with this as replit is
doing (and unfortunately needs to do to convert the internal whitespace
replacement symbol back to a space)
2023-06-13 07:14:02 -04:00
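
The hazard described in this commit message can be sketched as follows. `ToyTokenizer`, its vocabulary, and the choice of U+2581 as the whitespace marker are assumptions made for illustration and are not taken from the gpt4all or replit sources. Returning `std::string` by value keeps the dynamically built text alive for the caller, whereas a `std::string_view` onto a local string would dangle as soon as the function returns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical tokenizer; the marker below is assumed to be U+2581 in UTF-8.
struct ToyTokenizer {
    std::vector<std::string> vocab{"int", "\xE2\x96\x81" "main", "()"};

    // Fine only because the view points into the long-lived vocab table.
    std::string_view rawToken(std::size_t id) const { return vocab.at(id); }

    // Unsafe pattern (left commented out): 'out' is destroyed on return,
    // so the returned view would refer to freed memory.
    //
    //   std::string_view tokenToStringBad(std::size_t id) const {
    //       std::string out = decode(id);
    //       return out;  // implicit std::string -> std::string_view
    //   }

    // Safe pattern: build the string and return it by value.
    std::string tokenToString(std::size_t id) const {
        static const std::string marker = "\xE2\x96\x81";
        std::string out = vocab.at(id);
        for (std::size_t pos = out.find(marker); pos != std::string::npos;
             pos = out.find(marker, pos + 1)) {
            out.replace(pos, marker.size(), " ");  // marker back to a space
        }
        return out;
    }
};

// Small self-check: the marker in front of "main" becomes one space.
void checkTokenToString() {
    ToyTokenizer tokenizer;
    assert(tokenizer.tokenToString(1) == " main");
}
```

Returning by value costs a copy for tokens that need no rewriting, but it removes the lifetime question from every call site.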
niansa/tuxifan
f03da8d732 Removed double-static from variables in replit.cpp
The anonymous namespace already makes it static.

Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-09 08:55:15 -04:00
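
A minimal sketch of the point made in this commit: names declared inside an unnamed namespace already have internal linkage, so adding `static` changes nothing. The identifiers below are hypothetical and are not taken from replit.cpp.

```cpp
#include <cassert>

namespace {

// Internal linkage comes from the unnamed namespace itself.
const int kExampleBufferCount = 2;
// static const int kExampleBufferCount = 2;  // 'static' here would be redundant

int exampleBufferCountImpl() { return kExampleBufferCount; }

}  // namespace

// Only this function is visible to other translation units.
int exampleBufferCount() { return exampleBufferCountImpl(); }

// Small self-check of the hypothetical constant.
void checkBufferCount() {
    assert(exampleBufferCount() == 2);
}
```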
Richard Guo
c4706d0c14
Replit Model (#713)
* porting over replit code model to gpt4all

* replaced memory with kv_self struct

* continuing debug

* welp it built but lot of sus things

* working model loading and somewhat working generate.. need to format response?

* revert back to semi working version

* finally got rid of weird formatting

* figured out problem is with python bindings - this is good to go for testing

* addressing PR feedback

* output refactor

* fixed prompt response collection

* cleanup

* addressing PR comments

* building replit backend with new ggmlver code

* chatllm replit and clean python files

* cleanup

* updated replit to match new llmodel api

* match llmodel api and change size_t to Token

* resolve PR comments

* replit model commit comment
2023-06-06 17:09:00 -04:00