Jared Van Bortel
271d752701
localdocs: small but important fixes to local docs ( #2236 )
* chat: use .rmodel extension for Nomic Embed
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
* database: fix order of SQL arguments in updateDocument
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
---------
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-18 14:51:13 -04:00
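The SQL-argument-order fix above is a classic parameterized-query pitfall: the bind tuple must line up with the `?` placeholders, and a swap corrupts data silently instead of raising an error. A minimal sketch in Python's `sqlite3` (the table and column names here are illustrative, not the actual localdocs schema):

```python
import sqlite3

def update_document(conn, doc_id, doc_time, doc_path):
    # The bind order of the tuple must match the order of the "?"
    # placeholders in the SQL text; swapping doc_id to the front would
    # silently write it into document_time rather than raising an error.
    conn.execute(
        "UPDATE documents SET document_time = ?, document_path = ? WHERE id = ?",
        (doc_time, doc_path, doc_id),
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY,"
    " document_time INTEGER, document_path TEXT)"
)
conn.execute("INSERT INTO documents VALUES (1, 0, 'old.txt')")
update_document(conn, 1, 1700000000, "new.txt")
row = conn.execute(
    "SELECT document_time, document_path FROM documents WHERE id = 1"
).fetchone()
```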
Jared Van Bortel
ac498f79ac
fix regressions in system prompt handling ( #2219 )
* python: fix system prompt being ignored
* fix unintended whitespace after system prompt
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-15 11:39:48 -04:00
Olyxz16
2c0a660e6e
feat: Add support for Mistral API models ( #2053 )
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Signed-off-by: Cédric Sazos <cedric.sazos@tutanota.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-03-13 18:23:57 -04:00
Jared Van Bortel
406e88b59a
implement local Nomic Embed via llama.cpp ( #2086 )
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-13 18:09:24 -04:00
Xu Zhen
0072860d24
Fix compatibility with Qt 6.4
Signed-off-by: Xu Zhen <xuzhen@users.noreply.github.com>
2024-03-12 07:42:22 -05:00
Adam Treat
17dee02287
Fix for issue #2080 where the GUI appears to hang when a chat with a large model is deleted.
There is no reason to save the context for a chat that is being deleted.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-03-06 16:52:17 -06:00
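The idea behind the fix above is simple: serializing a large model's context is slow, and a chat that is about to be deleted will never need its context again. A minimal sketch of the skip (class and attribute names are illustrative, not the actual gpt4all-chat API):

```python
class Chat:
    """Sketch: only serialize the model context on unload if the chat
    is not already marked for deletion."""

    def __init__(self):
        self.marked_for_deletion = False
        self.saved_context = None

    def unload(self, context):
        # Saving a large model's context can block for a long time, which
        # is what made the GUI appear to hang; skip it for doomed chats.
        if self.marked_for_deletion:
            return
        self.saved_context = context

doomed = Chat()
doomed.marked_for_deletion = True
doomed.unload(b"huge serialized KV cache")  # returns immediately, saves nothing

kept = Chat()
kept.unload(b"huge serialized KV cache")    # normal path still saves
```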
Jared Van Bortel
44717682a7
chat: implement display of model loading warnings ( #2034 )
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-06 17:14:54 -05:00
Jared Van Bortel
a0bd96f75d
chat: join ChatLLM threads without calling destructors ( #2043 )
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-06 16:42:59 -05:00
Jared Van Bortel
2a91ffd73f
chatllm: fix undefined behavior in resetContext
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-06 12:54:19 -06:00
chrisbarrera
f8b1069a1c
add min_p sampling parameter ( #2014 )
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
2024-02-24 17:51:34 -05:00
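Min-p sampling, added above, keeps only the tokens whose probability is at least `min_p` times that of the most likely token, then renormalizes. A simplified sketch of the filter (operating on plain probabilities rather than llama.cpp's logit arrays):

```python
def min_p_filter(probs, min_p):
    # Keep only tokens whose probability is at least min_p times the
    # probability of the most likely token, then renormalize the rest.
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# With min_p=0.5 the cutoff is 0.25, so only the first two tokens survive.
filtered = min_p_filter([0.5, 0.3, 0.15, 0.05], min_p=0.5)
```

Unlike top-k, the number of surviving tokens adapts to how peaked the distribution is: a confident model keeps few candidates, an uncertain one keeps many.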
Adam Treat
67bbce43ab
Fix state issues with reloading model.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 16:05:49 -05:00
Jared Van Bortel
4fc4d94be4
fix chat-style prompt templates ( #1970 )
Also use a new version of Mistral OpenOrca.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-21 15:45:32 -05:00
Adam Treat
fa0a2129dc
Don't try to detect model load errors on startup.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 10:15:20 -06:00
Adam Treat
67099f80ba
Add comment to make this clear.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 10:15:20 -06:00
Adam Treat
d948a4f2ee
Complete revamp of model loading to allow for more discrete control by the user of the model's loading behavior.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-21 10:15:20 -06:00
Adam Treat
4461af35c7
Fix includes.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
2024-02-05 16:46:16 -05:00
Jared Van Bortel
10e3f7bbf5
Fix VRAM leak when model loading fails ( #1901 )
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-01 15:45:45 -05:00
Adam Treat
d14b95f4bd
Add Nomic Embed model for atlas with localdocs.
2024-01-31 22:22:08 -05:00
Jared Van Bortel
061d1969f8
expose n_gpu_layers parameter of llama.cpp ( #1890 )
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-01-31 14:17:44 -05:00
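The dynamic limit described above amounts to clamping the user's requested value to what the model actually supports. A sketch of the clamp for GPU layers (the negative-means-everything convention follows common llama.cpp binding usage; the function name is illustrative):

```python
def clamp_gpu_layers(requested, model_layer_count):
    # A negative value is commonly used to mean "offload every layer";
    # anything above what the model actually has is capped, so the UI
    # field can never request more layers than exist.
    if requested < 0:
        return model_layer_count
    return min(requested, model_layer_count)

full = clamp_gpu_layers(-1, 32)    # offload all 32 layers
capped = clamp_gpu_layers(100, 32) # capped to the model's 32 layers
partial = clamp_gpu_layers(10, 32) # honored as-is
```

The same pattern applies to the context length field, clamped to the maximum the model was trained with.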
Jared Van Bortel
c7ea283f1f
chatllm: fix deserialization version mismatch ( #1859 )
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-01-22 10:01:31 -05:00
Jared Van Bortel
d1c56b8b28
Implement configurable context length ( #1749 )
2023-12-16 17:58:15 -05:00
Jared Van Bortel
0600f551b3
chatllm: do not attempt to serialize incompatible state ( #1742 )
2023-12-12 11:45:03 -05:00
Adam Treat
fb3b1ceba2
Do not attempt to do a blocking retrieval if we don't have any collections.
2023-12-04 12:58:40 -05:00
Moritz Tim W
012f399639
fix typo ( #1697 )
2023-11-30 12:37:52 -05:00
Adam Treat
9e27a118ed
Fix system prompt.
2023-11-21 10:42:12 -05:00
Adam Treat
5c0d077f74
Remove leading whitespace in responses.
2023-10-28 16:53:42 -04:00
Adam Treat
dc2e7d6e9b
Don't start recalculating context immediately upon switching to a new chat, but rather wait until the first prompt.
This allows users to switch between chats quickly and to delete chats more easily.
Fixes issue #1545
2023-10-28 16:41:23 -04:00
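The change above is a standard lazy-evaluation move: switching chats merely flags the context as stale, and the expensive rebuild is deferred until a prompt actually needs it. A minimal sketch (names are illustrative, not the real ChatLLM interface):

```python
class ChatSession:
    """Sketch: defer expensive context recalculation until first use."""

    def __init__(self):
        self.needs_recalc = True
        self.recalc_count = 0

    def switch_to(self):
        # Switching only marks the context stale; no work happens here,
        # so flipping between chats (or deleting one) stays fast.
        self.needs_recalc = True

    def prompt(self, text):
        if self.needs_recalc:
            self.recalc_count += 1   # the expensive rebuild happens here
            self.needs_recalc = False
        return f"response to {text}"

session = ChatSession()
session.switch_to()      # no work yet
session.switch_to()      # still no work
session.prompt("hello")  # context rebuilt exactly once, on demand
session.prompt("again")  # context still valid, no rebuild
```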
cebtenzzre
4338e72a51
MPT: use upstream llama.cpp implementation ( #1515 )
2023-10-19 15:25:17 -04:00
cebtenzzre
04499d1c7d
chatllm: do not write uninitialized data to stream ( #1486 )
2023-10-11 11:31:34 -04:00
Adam Treat
f0742c22f4
Restore state from text if necessary.
2023-10-11 09:16:02 -04:00
Adam Treat
b2cd3bdb3f
Fix crasher with an empty string for prompt template.
2023-10-06 12:44:53 -04:00
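A guard for the empty-template crash above can be sketched as falling back to a pass-through template when the user's template is empty. GPT4All-style templates use a `%1` placeholder for the prompt; the function name here is illustrative:

```python
def apply_prompt_template(template, prompt):
    # An empty (or placeholder-free) template leaves the prompt with
    # nowhere to go; fall back to a pass-through "%1" template instead
    # of crashing on the missing placeholder.
    if not template or "%1" not in template:
        template = "%1"
    return template.replace("%1", prompt)

bare = apply_prompt_template("", "hi")
wrapped = apply_prompt_template("### Human: %1\n### Assistant:", "hi")
```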
Cebtenzzre
5fe685427a
chat: clearer CPU fallback messages
2023-10-06 11:35:14 -04:00
Cebtenzzre
1534df3e9f
backend: do not use Vulkan with non-LLaMA models
2023-10-05 18:16:19 -04:00
Cebtenzzre
672cb850f9
differentiate between init failure and unsupported models
2023-10-05 18:16:19 -04:00
Cebtenzzre
a5b93cf095
more accurate fallback descriptions
2023-10-05 18:16:19 -04:00
Cebtenzzre
75deee9adb
chat: make sure to clear fallback reason on success
2023-10-05 18:16:19 -04:00
Cebtenzzre
2eb83b9f2a
chat: report reason for fallback to CPU
2023-10-05 18:16:19 -04:00
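The cluster of fallback commits above (report the reason, clear it on success) boils down to a small piece of bookkeeping. A sketch with an invented VRAM check standing in for the real device probing (all names are illustrative):

```python
class Backend:
    """Sketch of fallback-reason bookkeeping for the UI."""

    def __init__(self):
        self.fallback_reason = None

    def load_model(self, device, vram_needed, vram_available):
        if device == "gpu" and vram_needed > vram_available:
            self.fallback_reason = "not enough VRAM for this model"
            return "cpu"
        # Clearing on success matters: otherwise a stale reason from a
        # previous failed load would still be shown to the user.
        self.fallback_reason = None
        return device

backend = Backend()
first = backend.load_model("gpu", vram_needed=8, vram_available=4)   # falls back
second = backend.load_model("gpu", vram_needed=2, vram_available=4)  # succeeds
```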
Adam Treat
12f943e966
Fix the regenerate button to be deterministic and bump the llama.cpp version to the latest we have for GGUF.
2023-10-05 18:16:19 -04:00
Cebtenzzre
a49a1dcdf4
chatllm: grammar fix
2023-10-05 18:16:19 -04:00
Cebtenzzre
8f3abb37ca
fix references to removed model types
2023-10-05 18:16:19 -04:00
Adam Treat
d90d003a1d
Latest rebase on llama.cpp with gguf support.
2023-10-05 18:16:19 -04:00
Adam Treat
045f6e6cdc
Link against ggml in bin so we can get the available devices without loading a model.
2023-09-15 14:45:25 -04:00
Adam Treat
aa33419c6e
Fall back to CPU more robustly.
2023-09-14 16:53:11 -04:00
Adam Treat
3076e0bf26
Only show GPU when we're actually using it.
2023-09-14 09:59:19 -04:00
Adam Treat
1fa67a585c
Report the actual device we're using.
2023-09-14 08:25:37 -04:00
Adam Treat
21a3244645
Fix a bug where we're not properly falling back to CPU.
2023-09-13 19:30:27 -04:00
Aaron Miller
6f038c136b
init at most one Vulkan device, submodule update
fixes issues with multiple instances of the same GPU
2023-09-13 12:49:53 -07:00
Adam Treat
891ddafc33
When the device is Auto (the default), we will only consider discrete GPUs; otherwise we fall back to CPU.
2023-09-13 11:59:36 -04:00
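The Auto-device policy above can be sketched as a simple preference scan: pick the first discrete GPU if one exists, otherwise use the CPU; any explicit setting is honored as-is (the tuple representation here is an assumption, not the actual Vulkan device enumeration):

```python
def pick_device(setting, available):
    # available: list of (name, kind) pairs, kind in {"discrete", "integrated"}
    if setting == "Auto":
        for name, kind in available:
            if kind == "discrete":
                return name
        return "CPU"   # no discrete GPU found: fall back to CPU
    return setting     # an explicit user choice is honored as-is

choice_a = pick_device("Auto", [("Intel iGPU", "integrated")])
choice_b = pick_device("Auto", [("Intel iGPU", "integrated"), ("RTX 3060", "discrete")])
```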
Adam Treat
8f99dca70f
Bring the vulkan backend to the GUI.
2023-09-13 11:26:10 -04:00
Adam Treat
987546c63b
Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.
2023-08-31 15:29:54 -04:00