Commit Graph

64 Commits (2eb83b9f2a4762898d8d16b57d6388f3ec10b03d)

Author SHA1 Message Date
Cebtenzzre 2eb83b9f2a chat: report reason for fallback to CPU 1 year ago
Adam Treat 12f943e966 Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf. 1 year ago
Cebtenzzre a49a1dcdf4 chatllm: grammar fix 1 year ago
Cebtenzzre 8f3abb37ca fix references to removed model types 1 year ago
Adam Treat d90d003a1d Latest rebase on llama.cpp with gguf support. 1 year ago
Adam Treat 045f6e6cdc Link against ggml in bin so we can get the available devices without loading a model. 1 year ago
Adam Treat aa33419c6e Fall back to CPU more robustly. 1 year ago
Adam Treat 3076e0bf26 Only show GPU when we're actually using it. 1 year ago
Adam Treat 1fa67a585c Report the actual device we're using. 1 year ago
Adam Treat 21a3244645 Fix a bug where we're not properly falling back to CPU. 1 year ago
Aaron Miller 6f038c136b init at most one vulkan device, submodule update. Fixes issues w/ multiple of the same gpu. 1 year ago
Adam Treat 891ddafc33 When device is Auto (the default) we will only consider discrete GPUs; otherwise fall back to CPU. 1 year ago
Adam Treat 8f99dca70f Bring the vulkan backend to the GUI. 1 year ago
Adam Treat 987546c63b Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0. 1 year ago
Adam Treat 6d03b3e500 Add starcoder support. 1 year ago
Adam Treat 0efdbfcffe Bert 1 year ago
Adam Treat 315a1f2aa2 Move it back as internal class. 1 year ago
Adam Treat 1f749d7633 Clean up backend code a bit and hide impl. details. 1 year ago
Adam Treat 8eb0844277 Check if the trimmed version is empty. 1 year ago
Adam Treat be395c12cc Make all system prompts empty by default if the model does not include one in its training data. 1 year ago
Adam Treat 34a3b9c857 Don't block on exit when not connected. 1 year ago
Adam Treat 88bbe30952 Provide a guardrail for OOM errors. 1 year ago
Adam Treat 99cd555743 Provide some guardrails for thread count. 1 year ago
Adam Treat 3e3b05a2a4 Don't process the system prompt when restoring state. 1 year ago
Adam Treat 12083fcdeb When deleting chats we sometimes have to update our modelinfo. 1 year ago
Adam Treat 59f3c093cb Stop generating anything on shutdown. 1 year ago
Adam Treat 6d9cdf228c Huge change that completely revamps the settings dialog and implements per-model settings as well as the ability to clone a model into a "character." This also implements system prompts as well as quite a few bugfixes; for instance, this fixes chatgpt. 1 year ago
Adam Treat 7f252b4970 This completes the work of consolidating all settings that can be changed by the user on new settings object. 1 year ago
Adam Treat 267601d670 Enable the force metal setting. 1 year ago
Aaron Miller e22dd164d8 add falcon to chatllm::serialize 1 year ago
Aaron Miller 198b5e4832 add Falcon 7B model. Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin 1 year ago
Adam Treat 7f01b153b3 Modellist temp 1 year ago
Adam Treat c8a590bc6f Get rid of last blocking operations and make the chat/llm thread safe. 1 year ago
Adam Treat 84ec4311e9 Remove duplicated state tracking for chatgpt. 1 year ago
Adam Treat 7d2ce06029 Start working on more thread safety and model load error handling. 1 year ago
Adam Treat aa2c824258 Initialize these. 1 year ago
Adam Treat a3a6a20146 Don't store db results in ChatLLM. 1 year ago
Adam Treat 0cfe225506 Remove this as unnecessary. 1 year ago
AT 2b6cc99a31 Show token generation speed in gui. (#1020) 1 year ago
AT a576220b18 Support loading files if 'ggml' is found anywhere in the name, not just at the beginning (#1001), and add deprecated flag to models.json so older versions will show a model but later versions don't. This will allow us to transition away from models < ggmlv2 and still allow older installs of gpt4all to work. 1 year ago
Richard Guo c4706d0c14 Replit Model (#713) 1 year ago
* porting over replit code model to gpt4all
* replaced memory with kv_self struct
* continuing debug
* welp it built but lot of sus things
* working model loading and somewhat working generate.. need to format response?
* revert back to semi working version
* finally got rid of weird formatting
* figured out problem is with python bindings - this is good to go for testing
* addressing PR feedback
* output refactor
* fixed prompt response collection
* cleanup
* addressing PR comments
* building replit backend with new ggmlver code
* chatllm replit and clean python files
* cleanup
* updated replit to match new llmodel api
* match llmodel api and change size_t to Token
* resolve PR comments
* replit model commit comment
Andriy Mulyar d8e821134e Revert "Fix bug with resetting context with chatgpt model." (#859) This reverts commit 031d7149a7. 1 year ago
Adam Treat 9f590db98d Better error handling when the model fails to load. 1 year ago
niansa/tuxifan f3564ac6b9 Fixed tons of warnings and clazy findings (#811) 1 year ago
Adam Treat 031d7149a7 Fix bug with resetting context with chatgpt model. 1 year ago
Adam Treat aea94f756d Better name for database results. 1 year ago
Adam Treat f62e439a2d Make localdocs work with server mode. 1 year ago
Adam Treat f74363bb3a Fix compile 1 year ago
niansa a3d08cdcd5 Dlopen better implementation management (Version 2) 1 year ago
AT 48275d0dcc Dlopen backend 5 (#779) Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squash-merged from dlopen_backend_5, where the history is preserved. 1 year ago