Commit Graph

107 Commits (69f766cbbbce858fdef866901f5255c353b93949)

Author SHA1 Message Date
Jared Van Bortel d3d777bc51
chat: fix #includes with include-what-you-use (#2401)
Also use qGuiApp instead of qApp.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
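
For context on the qApp to qGuiApp change above: the chat UI is QML-based, so it runs a QGuiApplication, and the qGuiApp macro yields a correctly typed QGuiApplication pointer without pulling in QtWidgets. A minimal sketch; the clipboard helper is illustrative, not code from the commit:

    #include <QClipboard>
    #include <QGuiApplication>
    #include <QString>

    void copyToClipboard(const QString &text)
    {
        // qGuiApp expands to a QGuiApplication *, so GUI-level services such
        // as the clipboard are reachable here; qApp only yields a
        // QApplication * when QtWidgets' <QApplication> header is included.
        qGuiApp->clipboard()->setText(text);
    }
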
Jared Van Bortel d2a99d9bc6
support the llama.cpp CUDA backend (#2310)
* rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f
* support for CUDA backend (enabled by default)
* partial support for Occam's Vulkan backend (disabled by default)
* partial support for HIP/ROCm backend (disabled by default)
* sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt
* changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA; see the sketch after this entry)
* ship CUDA runtime with installed version
* make device selection in the UI on macOS actually do something
* model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
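
The "choice of llama.cpp backend" item above implies the backend is picked at runtime rather than at compile time, with each build variant shipped as its own implementation library. A hypothetical sketch of that dispatch; the library names and the function are illustrative, not GPT4All's actual loader code:

    #include <string>

    // Map the requested backend to a separately built llama.cpp variant.
    std::string implLibraryFor(const std::string &requestedBackend)
    {
        if (requestedBackend == "cuda")
            return "llamamodel-cuda";
        if (requestedBackend == "kompute")
            return "llamamodel-kompute";
        return "llamamodel-cpu"; // safe fallback when no GPU backend applies
    }
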
Jared Van Bortel 7e1e00f331
chat: fix issues with quickly switching between multiple chats (#2343)
* prevent load progress from getting out of sync with the current chat
* fix memory leak on exit if the LLModelStore contains a model
* do not report cancellation as a failure in console/Mixpanel
* show "waiting for model" separately from "switching context" in UI
* do not show lower "reload" button on error
* skip context switch if unload is pending
* skip unnecessary calls to LLModel::saveState

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 7f1c3d4275
chatllm: fix model loading progress showing "Reload" sometimes (#2337)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 5fb9d17c00
chatllm: use a better prompt for the generated chat name (#2322)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel adaecb7a72
mixpanel: improved GPU device statistics (plus GPU sort order fix) (#2297)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel c622921894
improve mixpanel usage statistics (#2238)
Other changes:
- Always display first start dialog if privacy options are unset (e.g. if the user closed GPT4All without selecting them)
- LocalDocs scanQueue is now always deferred
- Fix a potential crash in magic_match
- LocalDocs indexing is now started after the first start dialog is dismissed so usage stats are included

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel 271d752701
localdocs: small but important fixes to local docs (#2236)
* chat: use .rmodel extension for Nomic Embed

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* database: fix order of SQL arguments in updateDocument (see the illustration after this entry)

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

---------

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
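
The updateDocument fix above is the classic positional-binding pitfall: with '?' placeholders, QSqlQuery binds values in the order addBindValue() is called, so swapped calls write values into the wrong columns without raising an error. An illustrative repair, with hypothetical table and column names:

    #include <QSqlQuery>

    bool updateDocument(QSqlQuery &q, int id, int documentTime)
    {
        if (!q.prepare("UPDATE documents SET document_time = ? WHERE id = ?"))
            return false;
        q.addBindValue(documentTime); // first '?': document_time
        q.addBindValue(id);           // second '?': id
        return q.exec();
    }
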
Jared Van Bortel ac498f79ac
fix regressions in system prompt handling (#2219)
* python: fix system prompt being ignored
* fix unintended whitespace after system prompt

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Olyxz16 2c0a660e6e
feat: Add support for Mistral API models (#2053)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Signed-off-by: Cédric Sazos <cedric.sazos@tutanota.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
6 months ago
Jared Van Bortel 406e88b59a
implement local Nomic Embed via llama.cpp (#2086)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
6 months ago
Xu Zhen 0072860d24 Fix compatibility with Qt 6.4
Signed-off-by: Xu Zhen <xuzhen@users.noreply.github.com>
6 months ago
Adam Treat 17dee02287 Fix for issue #2080 where the GUI appears to hang when a chat with a large
model is deleted. There is no reason to save the context for a chat that
is being deleted.

Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
Jared Van Bortel 44717682a7
chat: implement display of model loading warnings (#2034)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel a0bd96f75d
chat: join ChatLLM threads without calling destructors (#2043)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 2a91ffd73f chatllm: fix undefined behavior in resetContext
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
chrisbarrera f8b1069a1c
add min_p sampling parameter (#2014)
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
7 months ago
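
For context on min_p above: min_p sampling discards every candidate token whose probability is below min_p times the probability of the most likely token, then renormalizes the survivors. A sketch assuming a softmaxed distribution sorted in descending order; this is not the actual llama.cpp implementation:

    #include <vector>

    std::vector<float> minPFilter(const std::vector<float> &probs, float minP)
    {
        std::vector<float> kept;
        if (probs.empty())
            return kept;
        const float cutoff = minP * probs[0]; // probs[0] is the top token
        float total = 0.0f;
        for (float p : probs) {
            if (p < cutoff)
                break;             // sorted descending, so we can stop early
            kept.push_back(p);
            total += p;
        }
        for (float &p : kept)      // renormalize the surviving mass
            p /= total;
        return kept;
    }
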
Adam Treat 67bbce43ab Fix state issues with reloading the model.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
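
Regarding the chat-style prompt templates above: a template wraps the user's message in model-specific markers before it reaches the model. A minimal sketch assuming the %1-style placeholder used in GPT4All's settings; the template text itself is only an example:

    #include <QString>

    QString applyPromptTemplate(const QString &tmpl, const QString &userInput)
    {
        // QString::arg() substitutes the lowest-numbered %N placeholder,
        // here %1, with the user's message.
        return tmpl.arg(userInput);
    }

    // e.g. applyPromptTemplate("### Human:\n%1\n### Assistant:\n", "Hi!")
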
Adam Treat fa0a2129dc Don't try to detect model load errors on startup.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
Adam Treat 67099f80ba Add comment to make this clear.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
Adam Treat d948a4f2ee Complete revamp of model loading to allow for more discrete control by
the user over the model's loading behavior.

Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
Adam Treat 4461af35c7 Fix includes.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
8 months ago
Jared Van Bortel 10e3f7bbf5
Fix VRAM leak when model loading fails (#1901)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Adam Treat d14b95f4bd Add Nomic Embed model for Atlas with LocalDocs. 8 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
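
To illustrate the n_gpu_layers exposure above, here is a sketch against llama.cpp's C API of roughly that era (llama_model_default_params and llama_load_model_from_file; the latter has since been renamed upstream). The helper itself is hypothetical:

    #include "llama.h"

    llama_model *loadWithOffload(const char *modelPath, int nGpuLayers)
    {
        llama_model_params params = llama_model_default_params();
        // 0 keeps every layer on the CPU; larger values offload that many
        // transformer layers to the GPU, capped by the model's layer count.
        params.n_gpu_layers = nGpuLayers;
        return llama_load_model_from_file(modelPath, params);
    }
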
Jared Van Bortel c7ea283f1f
chatllm: fix deserialization version mismatch (#1859)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749) 9 months ago
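
The configurable context length in #1749 maps onto a context-creation parameter in the same-era llama.cpp C API. A hedged sketch; the helper is illustrative, not the actual implementation:

    #include "llama.h"

    llama_context *makeContext(llama_model *model, int nCtx)
    {
        llama_context_params cparams = llama_context_default_params();
        cparams.n_ctx = nCtx; // user-configurable context window, in tokens
        return llama_new_context_with_model(model, cparams);
    }
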
Jared Van Bortel 0600f551b3
chatllm: do not attempt to serialize incompatible state (#1742) 9 months ago
Adam Treat fb3b1ceba2 Do not attempt to do a blocking retrieval if we don't have any collections. 10 months ago
Moritz Tim W 012f399639
fix typo (#1697) 10 months ago
Adam Treat 9e27a118ed Fix system prompt. 10 months ago
Adam Treat 5c0d077f74 Remove leading whitespace in responses. 11 months ago
Adam Treat dc2e7d6e9b Don't start recalculating context immediately upon switching to a new chat
but rather wait until the first prompt. This allows users to switch between
chats quickly and to delete chats more easily.

Fixes issue #1545
11 months ago
cebtenzzre 4338e72a51
MPT: use upstream llama.cpp implementation (#1515) 11 months ago
cebtenzzre 04499d1c7d
chatllm: do not write uninitialized data to stream (#1486) 11 months ago
Adam Treat f0742c22f4 Restore state from text if necessary. 12 months ago
Adam Treat b2cd3bdb3f Fix crasher with an empty string for prompt template. 12 months ago
Cebtenzzre 5fe685427a chat: clearer CPU fallback messages 12 months ago
Cebtenzzre 1534df3e9f backend: do not use Vulkan with non-LLaMA models 12 months ago
Cebtenzzre 672cb850f9 differentiate between init failure and unsupported models 12 months ago
Cebtenzzre a5b93cf095 more accurate fallback descriptions 12 months ago
Cebtenzzre 75deee9adb chat: make sure to clear fallback reason on success 12 months ago
Cebtenzzre 2eb83b9f2a chat: report reason for fallback to CPU 12 months ago
Adam Treat 12f943e966 Fix the regenerate button to be deterministic and bump the llama.cpp version to the latest we have for GGUF. 12 months ago
Cebtenzzre a49a1dcdf4 chatllm: grammar fix 12 months ago
Cebtenzzre 8f3abb37ca fix references to removed model types 12 months ago
Adam Treat d90d003a1d Latest rebase on llama.cpp with GGUF support. 12 months ago
Adam Treat 045f6e6cdc Link against ggml in the binary so we can get the available devices without loading a model. 1 year ago
Adam Treat aa33419c6e Fallback to CPU more robustly. 1 year ago