Commit Graph

40 Commits

Author SHA1 Message Date
Jared Van Bortel
fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-09 21:40:32 -05:00
Jared Van Bortel
3acbef14b7
fix AVX support by removing direct linking to AVX2 libs (#1750) 2023-12-13 12:11:09 -05:00
cebtenzzre
4338e72a51
MPT: use upstream llama.cpp implementation (#1515) 2023-10-19 15:25:17 -04:00
Adam Treat
a9acdd25de Push a new version number for llmodel backend now that it is based on gguf. 2023-10-05 18:18:07 -04:00
Cebtenzzre
050e7f076e backend: port GPT-J to GGUF 2023-10-05 18:16:19 -04:00
Cebtenzzre
ce7be1db48 backend: use llamamodel.cpp for Falcon 2023-10-05 18:16:19 -04:00
Cebtenzzre
6277eac9cc backend: use llamamodel.cpp for StarCoder 2023-10-05 18:16:19 -04:00
Cebtenzzre
17fc9e3e58 backend: port Replit to GGUF 2023-10-05 18:16:19 -04:00
Cebtenzzre
7c67262a13 backend: port MPT to GGUF 2023-10-05 18:16:19 -04:00
Adam Treat
045f6e6cdc Link against ggml in bin so we can get the available devices without loading a model. 2023-09-15 14:45:25 -04:00
Adam Treat
85e34598f9 more circleci 2023-08-31 15:29:54 -04:00
Adam Treat
17d3e4976c Add a comment indicating future work. 2023-08-31 15:29:54 -04:00
Adam Treat
987546c63b Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0. 2023-08-31 15:29:54 -04:00
Adam Treat
d55cbbee32 Update to newer llama.cpp and disable older forks. 2023-08-31 15:29:54 -04:00
Aaron Miller
0bc2274869 bump llama.cpp version + needed fixes for that 2023-08-31 15:29:54 -04:00
aaron miller
33c22be2aa starcoder: use ggml_graph_plan 2023-08-31 15:29:54 -04:00
Adam Treat
6d03b3e500 Add starcoder support. 2023-07-27 09:15:16 -04:00
Adam Treat
4963db8f43 Bump the version numbers for both python and c backend. 2023-07-13 14:21:46 -04:00
Adam Treat
ae8eb297ac Add sbert backend. 2023-07-13 14:21:46 -04:00
Aaron Miller
f0faa23ad5
cmakelists: always export build commands (#1179)
friendly for using editors with clangd integration that don't also
manage the build themselves
2023-07-12 10:49:24 -04:00
Aaron Miller
8d19ef3909
backend: factor out common elements in model code (#1089)
* backend: factor out common structs in model code

prepping to hack on these by hopefully making there be fewer places to fix the same bug

rename

* use common buffer wrapper instead of manual malloc

* fix replit compile warnings
2023-06-28 17:35:07 -07:00
Aaron Miller
198b5e4832 add Falcon 7B model
Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin
2023-06-27 14:06:39 -03:00
Aaron Miller
abc081e48d
fix llama.cpp k-quants (#988)
* enable k-quants on *all* mainline builds
2023-06-15 14:06:14 -07:00
Aaron Miller
f71d8efc71
metal replit (#931)
metal+replit

makes replit work with Metal and removes its use of `mem_per_token`
in favor of fixed size scratch buffers (closer to llama.cpp)
2023-06-13 07:29:14 -07:00
Tim Miller
797891c995
Initial Library Loader for .NET Bindings / Update bindings to support newest changes (#763)
* Initial Library Loader

* Load library as part of Model factory

* Dynamically search and find the dlls

* Update tests to use locally built runtimes

* Fix dylib loading, add macos runtime support for sample/tests

* Bypass automatic loading by default.

* Only set CMAKE_OSX_ARCHITECTURES if not already set, allow cross-compile

* Switch Loading again

* Update build scripts for mac/linux

* Update bindings to support newest breaking changes

* Fix build

* Use llmodel for Windows

* Actually, it does need to be libllmodel

* Name

* Remove TFMs, bypass loading by default

* Fix script

* Delete mac script

---------

Co-authored-by: Tim Miller <innerlogic4321@ghmail.com>
2023-06-13 14:05:34 +02:00
Aaron Miller
d3ba1295a7
Metal+LLama take two (#929)
Support latest llama with Metal
---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-09 16:48:46 -04:00
Adam Treat
b162b5c64e Revert "llama on Metal (#885)"
This reverts commit c55f81b860.
2023-06-09 15:08:46 -04:00
Aaron Miller
c55f81b860
llama on Metal (#885)
Support latest llama with Metal

---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-09 14:58:12 -04:00
Adam Treat
7e304106cc Fix for windows. 2023-06-07 12:58:51 -04:00
Richard Guo
c4706d0c14
Replit Model (#713)
* porting over replit code model to gpt4all

* replaced memory with kv_self struct

* continuing debug

* welp it built but lot of sus things

* working model loading and somewhat working generate.. need to format response?

* revert back to semi working version

* finally got rid of weird formatting

* figured out problem is with python bindings - this is good to go for testing

* addressing PR feedback

* output refactor

* fixed prompt reponse collection

* cleanup

* addressing PR comments

* building replit backend with new ggmlver code

* chatllm replit and clean python files

* cleanup

* updated replit to match new llmodel api

* match llmodel api and change size_t to Token

* resolve PR comments

* replit model commit comment
2023-06-06 17:09:00 -04:00
Adam Treat
c5de9634c9 Fix llama models on linux and windows. 2023-06-05 14:31:15 -04:00
AT
bbe195ee02
Backend prompt dedup (#822)
* Deduplicated prompt() function code
2023-06-04 08:59:24 -04:00
Adam Treat
cec8831e12 Fix mac build again. 2023-06-02 10:51:09 -04:00
Adam Treat
70e3b7e907 Try and fix build on mac. 2023-06-02 10:47:12 -04:00
AT
48275d0dcc
Dlopen backend 5 (#779)
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squashed merged from dlopen_backend_5 where the history is preserved.
2023-05-31 17:04:01 -04:00
Adam Treat
7f9f91ad94 Revert "New tokenizer implementation for MPT and GPT-J"
This reverts commit bbcee1ced5.
2023-05-30 12:59:00 -04:00
Aaron Miller
bbcee1ced5 New tokenizer implementation for MPT and GPT-J
Improves output quality by making these tokenizers more closely
match the behavior of the huggingface `tokenizers` based BPE
tokenizers these models were trained with.

Featuring:
 * Fixed unicode handling (via ICU)
 * Fixed BPE token merge handling
 * Complete added vocabulary handling
2023-05-30 12:05:57 -04:00
Adam Treat
474c5387f9 Get the backend as well as the client building/working with msvc. 2023-05-25 15:22:45 -04:00
kuvaus
3cb6dd7a66
gpt4all-backend: Add llmodel create and destroy functions (#554)
* Add llmodel create and destroy functions

* Fix capitalization

* Fix capitalization

* Fix capitalization

* Update CMakeLists.txt

---------

Co-authored-by: kuvaus <kuvaus@users.noreply.github.com>
2023-05-16 11:36:46 -04:00
Adam Treat
d918b02c29 Move the llmodel C API to new top-level directory and version it. 2023-05-10 11:46:40 -04:00