223 Commits (fc1a2813811acac3e02f619ced9d350470a4e934)

Author SHA1 Message Date
Jared Van Bortel 007d469034
bert: fix layer norm epsilon value (#1946)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
6 months ago
Adam Treat f720261d46 Fix another vulnerable spot for crashes.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
6 months ago
chrisbarrera f8b1069a1c
add min_p sampling parameter (#2014)
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
6 months ago
Jared Van Bortel e7f2ff189f fix some compilation warnings on macOS
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 88e330ef0e
llama.cpp: enable Kompute support for 10 more model arches (#2005)
These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM,
MiniCPM, Orion, Qwen, and StarCoder.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel fc6c5ea0c7
llama.cpp: gemma: allow offloading the output tensor (#1997)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 7810b757c9 llamamodel: add gemma model support
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Adam Treat d948a4f2ee Complete revamp of model loading to allow for more discrete control by
the user of the model's loading behavior.

Signed-off-by: Adam Treat <treat.adam@gmail.com>
7 months ago
Jared Van Bortel 6fdec808b2 backend: update llama.cpp for faster state serialization
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel a1471becf3 backend: update llama.cpp for Intel GPU blacklist
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel eb1081d37e cmake: fix LLAMA_DIR use before set
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel e60b388a2e cmake: fix backwards LLAMA_KOMPUTE default
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 92c025a7f6
llamamodel: add 12 new architectures for CPU inference (#1914)
Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2,
Plamo, Qwen, Qwen2, Refact, StableLM

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 10e3f7bbf5
Fix VRAM leak when model loading fails (#1901)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel eadc3b8d80 backend: bump llama.cpp for VRAM leak fix when switching models
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 6db5307730 update llama.cpp for unhandled Vulkan OOM exception fix
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 0a40e71652
Maxwell/Pascal GPU support and crash fix (#1895)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel b11c3f679e bump llama.cpp-mainline for C++11 compat
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel f549d5a70a backend: quick llama.cpp update to fix fallback to CPU
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 38c61493d2 backend: update to latest commit of llama.cpp Vulkan PR
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 26acdebafa
convert: replace GPTJConfig with AutoConfig (#1866)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel a9c5f53562 update llama.cpp for nomic-ai/llama.cpp#12
Fixes #1477

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel b7c92c5afd
sync llama.cpp with latest Vulkan PR and newer upstream (#1819) 8 months ago
Jared Van Bortel 7e9786fccf chat: set search path early
This fixes the issues with installed versions of v2.6.0.
8 months ago
AT 96cee4f9ac
Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808) 8 months ago
ThiloteE 2d566710e5 Address review 8 months ago
ThiloteE a0f7d7ae0e Fix for "LLModel ERROR: Could not find CPU LLaMA implementation" v2 8 months ago
ThiloteE 38d81c14d0 Fixes https://github.com/nomic-ai/gpt4all/issues/1760 LLModel ERROR: Could not find CPU LLaMA implementation.
Inspired by Microsoft docs for LoadLibraryExA (https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa).
When using LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR, the lpFileName parameter must specify a fully qualified path, and the path must use backslashes (\), not forward slashes (/).
8 months ago
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749) 9 months ago
Jared Van Bortel 3acbef14b7
fix AVX support by removing direct linking to AVX2 libs (#1750) 9 months ago
Jared Van Bortel 0600f551b3
chatllm: do not attempt to serialize incompatible state (#1742) 9 months ago
Jared Van Bortel 1df3da0a88 update llama.cpp for clang warning fix 9 months ago
Jared Van Bortel dfd8ef0186
backend: use ggml_new_graph for GGML backend v2 (#1719) 9 months ago
Jared Van Bortel 9e28dfac9c
Update to latest llama.cpp (#1706) 9 months ago
Adam Treat cce5fe2045 Fix macos build. 10 months ago
Adam Treat 371e2a5cbc LocalDocs version 2 with text embeddings. 10 months ago
Jared Van Bortel d4ce9f4a7c
llmodel_c: improve quality of error messages (#1625) 10 months ago
cebtenzzre 64101d3af5 update llama.cpp-mainline 10 months ago
Adam Treat ffef60912f Update to llama.cpp 10 months ago
Adam Treat f5f22fdbd0 Update llama.cpp for latest bugfixes. 10 months ago
cebtenzzre 7bcd9e8089 update llama.cpp-mainline 10 months ago
cebtenzzre fd0c501d68
backend: support GGUFv3 (#1582) 10 months ago
Adam Treat 14b410a12a Update to latest version of llama.cpp which fixes issue 1507. 10 months ago
Adam Treat ab96035bec Update to llama.cpp submodule for some vulkan fixes. 11 months ago
cebtenzzre e90263c23f
make scripts executable (#1555) 11 months ago
Aaron Miller f414c28589 llmodel: whitelist library name patterns
This fixes some issues that were being seen on installed Windows builds of 2.5.0.

Only load DLLs that actually might be model implementation DLLs; otherwise we pull all sorts of random junk into the process before it might expect to be loaded.

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
11 months ago