Commit Graph

33 Commits (d3d777bc5197d319b7e90d6fac99385853757f00)

Author SHA1 Message Date
Jared Van Bortel a92d266cea
cmake: fix Metal build after #2310 (#2350)
I don't understand why this is needed, but it works.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel d2a99d9bc6
support the llama.cpp CUDA backend (#2310)
* rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f
* support for CUDA backend (enabled by default)
* partial support for Occam's Vulkan backend (disabled by default)
* partial support for HIP/ROCm backend (disabled by default)
* sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt
* changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA)
* ship CUDA runtime with installed version
* make device selection in the UI on macOS actually do something
* model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel eb1081d37e cmake: fix LLAMA_DIR use before set
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
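For context, a minimal CMake sketch of the use-before-set pattern this commit addresses; the location assigned to LLAMA_DIR below is hypothetical, not taken from the repo:

```cmake
# Reading a CMake variable before set() silently expands to the empty
# string, so any path built from it is wrong rather than a hard error:
#
#   include_directories(${LLAMA_DIR})              # expands to "" here
#   set(LLAMA_DIR ${CMAKE_SOURCE_DIR}/llama.cpp)   # assigned too late
#
# Fix: define the variable before its first use.
set(LLAMA_DIR ${CMAKE_SOURCE_DIR}/llama.cpp)  # hypothetical path
include_directories(${LLAMA_DIR})
```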
Jared Van Bortel e60b388a2e cmake: fix backwards LLAMA_KOMPUTE default
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 38c61493d2 backend: update to latest commit of llama.cpp Vulkan PR
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel b7c92c5afd
sync llama.cpp with latest Vulkan PR and newer upstream (#1819) 8 months ago
Jared Van Bortel 9e28dfac9c
Update to latest llama.cpp (#1706) 10 months ago
cebtenzzre 017c3a9649
python: prepare version 2.0.0rc1 (#1529) 11 months ago
cebtenzzre 9a19c740ee
kompute: fix library loading issues with kp_logger (#1517) 11 months ago
Aaron Miller f79557d2aa speedup: just use mat*vec shaders for mat*mat
So far my from-scratch mat*mat kernels are still slower than just running more
invocations of the existing Metal-ported mat*vec shaders. It should be
theoretically possible to make a mat*mat kernel that is faster (for actual
mat*mat cases) than an optimal mat*vec, but it will need to be at *least* as
fast as the mat*vec op and then take special care to be cache-friendly and
save memory bandwidth, since the number of compute ops is the same.
11 months ago
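A back-of-the-envelope count makes the "same number of compute ops" point above concrete (an editorial sketch, not from the repo):

```latex
% C = A B with A of size M x K and B of size K x N.
% Column j of C is one mat*vec, c_j = A b_j, costing MK multiply-adds;
% N such dispatches cost N(MK) = MKN in total -- exactly the arithmetic
% of a fused mat*mat kernel. A fused kernel can therefore only win via
% cache reuse and memory bandwidth, never by doing fewer operations.
C = AB, \qquad c_j = A b_j \quad (j = 1, \dots, N), \qquad
\underbrace{N \cdot MK}_{N \text{ mat*vec dispatches}} = MKN
```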
Aaron Miller 2490977f89 q6_k, q4_1 mat*mat 11 months ago
Aaron Miller 64001a480a mat*mat for q4_0, q8_0 11 months ago
Cebtenzzre cc6db61c93 backend: fix build with Visual Studio generator
Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
is needed because Visual Studio is a multi-configuration generator, so
we do not know what the build type will be until `cmake --build` is
called.

Fixes #1470
12 months ago
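A minimal sketch of the distinction (target and directory names below are hypothetical, not taken from the repo):

```cmake
# With a single-config generator (Ninja, Makefiles), CMAKE_BUILD_TYPE is
# fixed at configure time. Visual Studio is a multi-config generator, so
# CMAKE_BUILD_TYPE is empty at configure time; the configuration is only
# chosen later by `cmake --build . --config Release`. The $<CONFIG>
# generator expression is evaluated at build time, so it works for both:
#
#   set(OUT_DIR ${CMAKE_BINARY_DIR}/${CMAKE_BUILD_TYPE})   # "" under VS
set(OUT_DIR ${CMAKE_BINARY_DIR}/$<CONFIG>)                 # portable
add_library(llmodel SHARED llmodel.cpp)                    # hypothetical target
set_target_properties(llmodel PROPERTIES LIBRARY_OUTPUT_DIRECTORY ${OUT_DIR})
```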
Adam Treat f605a5b686 Add q8_0 kernels to kompute shaders and bump to latest llama/gguf. 12 months ago
Adam Treat 5d346e13d7 Add q6_k kernels for Vulkan. 12 months ago
Adam Treat 4eefd386d0 Refactor for subgroups on mat * vec kernel. 12 months ago
Aaron Miller 507753a37c macOS build fixes 12 months ago
Adam Treat d90d003a1d Latest rebase on llama.cpp with gguf support. 12 months ago
Jacob Nguyen e86c63750d Update llama.cpp.cmake
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
1 year ago
Adam Treat c953b321b7 Don't link against libvulkan. 1 year ago
Adam Treat 987546c63b Nomic Vulkan backend licensed under the Software for Open Models License (SOM), version 1.0. 1 year ago
Adam Treat d55cbbee32 Update to newer llama.cpp and disable older forks. 1 year ago
Adam Treat 84deebd223 Fix compile for Windows and Linux again. PLEASE DON'T REVERT THIS! 1 year ago
Cosmic Snow ae4a275bcd Fix Windows MSVC AVX builds
- bug introduced in 0cb2b86730
- currently getting: `warning C5102: ignoring invalid command-line macro definition '/arch:AVX2'`
- solution is to use `_options(...)` not `_definitions(...)`
1 year ago
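A minimal sketch of the `_options(...)` vs `_definitions(...)` distinction (the repo's actual call sites may differ):

```cmake
# /arch:AVX2 is a compiler flag, not a preprocessor macro. Routed through
# a *_definitions() command it reaches MSVC as a macro definition, which
# triggers warning C5102 and is ignored:
#
#   add_compile_definitions(/arch:AVX2)   # wrong: passed as a -D macro
#
# Compiler flags belong in the *_options() commands:
if(MSVC)
    add_compile_options(/arch:AVX2)
else()
    add_compile_options(-mavx2)
endif()
```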
Aaron Miller d3ba1295a7
Metal+Llama take two (#929)
Support latest llama with Metal
---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
Adam Treat b162b5c64e Revert "llama on Metal (#885)"
This reverts commit c55f81b860.
1 year ago
Aaron Miller c55f81b860
llama on Metal (#885)
Support latest llama with Metal

---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
niansa 0cb2b86730 Synced llama.cpp.cmake with upstream 1 year ago
Adam Treat 010a04d96f Revert "Synced llama.cpp.cmake with upstream (#887)"
This reverts commit 89910c7ca8.
1 year ago
niansa/tuxifan 89910c7ca8
Synced llama.cpp.cmake with upstream (#887) 1 year ago
Adam Treat c5de9634c9 Fix llama models on Linux and Windows. 1 year ago
AT 48275d0dcc
Dlopen backend 5 (#779)
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squash-merged from dlopen_backend_5, where the history is preserved.
1 year ago