
226 Commits (0cc5a806563a9264b7c7751f4ca50e2f700c0847)

Author SHA1 Message Date
Jared Van Bortel 5c248dbec9
models: new MPT model file without duplicated token_embd.weight (#2006)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel c19b763e03
llmodel_c: expose fakeReply to the bindings (#2061)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel f500bcf6e5
llmodel: default to a blank line between reply and next prompt (#1996)
Also make some related adjustments to the provided Alpaca-style prompt templates
and system prompts.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 007d469034
bert: fix layer norm epsilon value (#1946)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Adam Treat f720261d46 Fix another vulnerable spot for crashes.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
chrisbarrera f8b1069a1c
add min_p sampling parameter (#2014)
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
7 months ago
Jared Van Bortel e7f2ff189f fix some compilation warnings on macOS
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 88e330ef0e
llama.cpp: enable Kompute support for 10 more model arches (#2005)
These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM,
MiniCPM, Orion, Qwen, and StarCoder.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel fc6c5ea0c7
llama.cpp: gemma: allow offloading the output tensor (#1997)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 7810b757c9 llamamodel: add gemma model support
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Adam Treat d948a4f2ee Complete revamp of model loading to allow for more discrete control by
the user of the model's loading behavior.

Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
Jared Van Bortel 6fdec808b2 backend: update llama.cpp for faster state serialization
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel a1471becf3 backend: update llama.cpp for Intel GPU blacklist
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel eb1081d37e cmake: fix LLAMA_DIR use before set
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel e60b388a2e cmake: fix backwards LLAMA_KOMPUTE default
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 92c025a7f6
llamamodel: add 12 new architectures for CPU inference (#1914)
Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2,
Plamo, Qwen, Qwen2, Refact, StableLM

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 10e3f7bbf5
Fix VRAM leak when model loading fails (#1901)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel eadc3b8d80 backend: bump llama.cpp for VRAM leak fix when switching models
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 6db5307730 update llama.cpp for unhandled Vulkan OOM exception fix
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 0a40e71652
Maxwell/Pascal GPU support and crash fix (#1895)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel b11c3f679e bump llama.cpp-mainline for C++11 compat
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel f549d5a70a backend: quick llama.cpp update to fix fallback to CPU
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 38c61493d2 backend: update to latest commit of llama.cpp Vulkan PR
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 26acdebafa
convert: replace GPTJConfig with AutoConfig (#1866)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel a9c5f53562 update llama.cpp for nomic-ai/llama.cpp#12
Fixes #1477

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
9 months ago
Jared Van Bortel b7c92c5afd
sync llama.cpp with latest Vulkan PR and newer upstream (#1819) 9 months ago
Jared Van Bortel 7e9786fccf chat: set search path early
This fixes the issues with installed versions of v2.6.0.
9 months ago
AT 96cee4f9ac
Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808) 9 months ago
ThiloteE 2d566710e5 Address review 9 months ago
ThiloteE a0f7d7ae0e Fix for "LLModel ERROR: Could not find CPU LLaMA implementation" v2 9 months ago
ThiloteE 38d81c14d0 Fixes https://github.com/nomic-ai/gpt4all/issues/1760 LLModel ERROR: Could not find CPU LLaMA implementation.
Inspired by Microsoft docs for LoadLibraryExA (https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa).
When using LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR, the lpFileName parameter must specify a fully qualified path, and that path must use backslashes (\), not forward slashes (/).
9 months ago
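The path requirement described in the commit above can be illustrated with a minimal sketch. It is not the code from that commit: the helper name to_load_library_path and the example DLL path are hypothetical, and the sketch only shows the normalization step (fully qualified, backslash-separated) that would happen before a path is handed to LoadLibraryExA. It uses pathlib.PureWindowsPath, which applies Windows path rules on any platform.

```python
from pathlib import PureWindowsPath


def to_load_library_path(path: str) -> str:
    """Return `path` as a fully qualified, backslash-separated Windows path.

    Hypothetical helper for illustration only: it mirrors the two
    requirements quoted in the commit message, nothing more.
    """
    p = PureWindowsPath(path)
    # A path with no drive and root (e.g. "lib/example.dll") is not
    # fully qualified, so reject it instead of guessing a base directory.
    if not p.is_absolute():
        raise ValueError(f"not a fully qualified Windows path: {path!r}")
    # str() on a PureWindowsPath renders separators as backslashes.
    return str(p)
```

For example, passing "C:/app/lib/example.dll" (a made-up path) yields the same path written with backslashes, while a relative path such as "lib/example.dll" raises ValueError rather than being resolved silently.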
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749) 10 months ago
Jared Van Bortel 3acbef14b7
fix AVX support by removing direct linking to AVX2 libs (#1750) 10 months ago
Jared Van Bortel 0600f551b3
chatllm: do not attempt to serialize incompatible state (#1742) 10 months ago
Jared Van Bortel 1df3da0a88 update llama.cpp for clang warning fix 10 months ago
Jared Van Bortel dfd8ef0186
backend: use ggml_new_graph for GGML backend v2 (#1719) 10 months ago
Jared Van Bortel 9e28dfac9c
Update to latest llama.cpp (#1706) 10 months ago
Adam Treat cce5fe2045 Fix macos build. 11 months ago
Adam Treat 371e2a5cbc LocalDocs version 2 with text embeddings. 11 months ago
Jared Van Bortel d4ce9f4a7c
llmodel_c: improve quality of error messages (#1625) 11 months ago
cebtenzzre 64101d3af5 update llama.cpp-mainline 11 months ago
Adam Treat ffef60912f Update to llama.cpp 11 months ago
Adam Treat f5f22fdbd0 Update llama.cpp for latest bugfixes. 11 months ago
cebtenzzre 7bcd9e8089 update llama.cpp-mainline 11 months ago
cebtenzzre fd0c501d68
backend: support GGUFv3 (#1582) 11 months ago
Adam Treat 14b410a12a Update to latest version of llama.cpp which fixes issue 1507. 11 months ago