73 Commits (d018b4c8219fcd23dfa44ccf6fb4e74f4d855eb2)

Author SHA1 Message Date
niansa/tuxifan 68f9786ed9
Use operator ""_MiB (#991) 1 year ago
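For readers unfamiliar with the technique named in this commit title, a user-defined literal for mebibytes might look like the sketch below. The return type and exact definition are assumptions for illustration; the commit itself is not shown here.

```cpp
#include <cstddef>

// Hypothetical definition: converts a count of mebibytes into bytes at compile time.
constexpr std::size_t operator""_MiB(unsigned long long mib) {
    return static_cast<std::size_t>(mib) * 1024u * 1024u;
}

// Buffer sizes can then be written as 256_MiB instead of 256 * 1024 * 1024.
static_assert(1_MiB == 1048576u, "1 MiB is 2^20 bytes");
```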
Aaron Miller abc081e48d
fix llama.cpp k-quants (#988)
* enable k-quants on *all* mainline builds
1 year ago
Aaron Miller c4319d2c8e
dlhandle: prevent libs from using each other's symbols (#977)
use RTLD_LOCAL so that symbols are *only* exposed via dlsym

without this all symbols exported by the libs are available for symbol
resolution, resulting in different lib versions potentially resolving
*each other's* symbols, causing incredibly cursed behavior such as
https://gist.github.com/apage43/085c1ff69f6dd05387793ebc301840f6
1 year ago
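A minimal POSIX sketch of the approach described in the commit above: each library is opened with RTLD_LOCAL and its symbols are looked up only through dlsym on its own handle. The LocalLib wrapper and its member names are hypothetical and not taken from the project's dlhandle code.

```cpp
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around dlopen/dlclose (illustrative names).
class LocalLib {
public:
    explicit LocalLib(const std::string& path)
        // RTLD_LOCAL keeps this library's symbols out of the global lookup
        // scope, so other loaded libraries cannot resolve against them.
        : handle_(dlopen(path.c_str(), RTLD_LAZY | RTLD_LOCAL)) {
        if (!handle_) {
            const char* err = dlerror();
            throw std::runtime_error(err ? err : "dlopen failed");
        }
    }
    ~LocalLib() { if (handle_) dlclose(handle_); }
    LocalLib(const LocalLib&) = delete;
    LocalLib& operator=(const LocalLib&) = delete;

    // Symbols are reachable only via an explicit lookup on this handle.
    // Returns a null pointer if the symbol is not found.
    template <typename T>
    T get(const char* name) const {
        return reinterpret_cast<T>(dlsym(handle_, name));
    }

private:
    void* handle_;
};
```

With this pattern, two builds of the same backend can be loaded side by side, and each caller picks the entry points it wants from a specific handle.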
Aaron Miller f71d8efc71
metal replit (#931)
metal+replit

makes replit work with Metal and removes its use of `mem_per_token`
in favor of fixed size scratch buffers (closer to llama.cpp)
1 year ago
Aaron Miller 85964a7635
bump llama.cpp mainline to latest (#964) 1 year ago
Tim Miller 797891c995
Initial Library Loader for .NET Bindings / Update bindings to support newest changes (#763)
* Initial Library Loader

* Load library as part of Model factory

* Dynamically search and find the dlls

* Update tests to use locally built runtimes

* Fix dylib loading, add macos runtime support for sample/tests

* Bypass automatic loading by default.

* Only set CMAKE_OSX_ARCHITECTURES if not already set, allow cross-compile

* Switch Loading again

* Update build scripts for mac/linux

* Update bindings to support newest breaking changes

* Fix build

* Use llmodel for Windows

* Actually, it does need to be libllmodel

* Name

* Remove TFMs, bypass loading by default

* Fix script

* Delete mac script

---------

Co-authored-by: Tim Miller <innerlogic4321@ghmail.com>
1 year ago
Aaron Miller 88616fde7f
llmodel: change tokenToString to not use string_view (#968)
fixes a definite use-after-free and likely avoids some other
potential ones - std::string will convert to a std::string_view
automatically but as soon as the std::string in question goes out of
scope it is already freed and the string_view is pointing at freed
memory - this is *mostly* fine if it's returning a reference to the
tokenizer's internal vocab table but it's, imo, too easy to return a
reference to a dynamically constructed string with this as replit is
doing (and unfortunately needs to do to convert the internal whitespace
replacement symbol back to a space)
1 year ago
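The hazard described in the commit above can be sketched as follows. ToyTokenizer, its vocab map, and the UTF-8 marker "\xE2\x96\x81" are assumptions for illustration and are not the project's actual types or constants.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical tokenizer used only to illustrate the lifetime issue.
struct ToyTokenizer {
    std::unordered_map<int, std::string> vocab;

    // Unsafe pattern, left as a comment on purpose:
    //   std::string_view tokenToString(int id) const {
    //       std::string s = vocab.at(id);   // local copy
    //       /* ...modify s... */
    //       return s;                       // view dangles once s is destroyed
    //   }

    // Safer pattern: return an owning std::string by value.
    std::string tokenToString(int id) const {
        static const std::string marker = "\xE2\x96\x81"; // assumed whitespace symbol
        std::string s = vocab.at(id);
        // Replace every marker with a plain space; this builds a new string,
        // which is why a non-owning view cannot be returned here.
        for (std::size_t pos = s.find(marker); pos != std::string::npos;
             pos = s.find(marker, pos + 1)) {
            s.replace(pos, marker.size(), " ");
        }
        return s;
    }
};
```

Returning by value costs a copy for tokens that need no rewriting, but it removes the lifetime question entirely.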
Adam Treat 84deebd223 Fix compile for windows and linux again. PLEASE DON'T REVERT THIS! 1 year ago
Juuso Alasuutari 5cfb1bda89
llmodel: add model wrapper destructor, fix mem leak in golang bindings (#862)
Signed-off-by: Juuso Alasuutari <juuso.alasuutari@gmail.com>
1 year ago
Cosmic Snow ae4a275bcd Fix Windows MSVC AVX builds
- bug introduced in 0cb2b86730
- currently getting: `warning C5102: ignoring invalid command-line macro definition '/arch:AVX2'`
- solution is to use `_options(...)` not `_definitions(...)`
1 year ago
Adam Treat b906fb4057 When recalculating context we can't erase the BOS. 1 year ago
Aaron Miller d3ba1295a7
Metal+LLama take two (#929)
Support latest llama with Metal
---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
Adam Treat b162b5c64e Revert "llama on Metal (#885)"
This reverts commit c55f81b860.
1 year ago
Aaron Miller c55f81b860
llama on Metal (#885)
Support latest llama with Metal

---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
niansa/tuxifan 14e9ccbc6a Do auto detection by default in C++ API
Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
niansa/tuxifan f03da8d732 Removed double-static from variables in replit.cpp
The anonymous namespace already makes it static.

Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
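As a short illustration of the point made in the commit above, names declared in an anonymous namespace already have internal linkage, so an extra `static` adds nothing. The variable and function names here are hypothetical.

```cpp
namespace {

// Internal linkage comes from the anonymous namespace itself;
// writing `static int call_count` here would be redundant.
int call_count = 0;

// Also internal to this translation unit for the same reason.
int next_id() { return ++call_count; }

} // namespace
```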
niansa 0cb2b86730 Synced llama.cpp.cmake with upstream 1 year ago
Aaron Miller 47fbc0e309
non-llama: explicitly greedy sampling for temp<=0 (#901)
copied directly from llama.cpp - without this temp=0.0 will just
scale all the logits to infinity and give bad output
1 year ago
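A minimal sketch of the behavior described in the commit above, assuming a simplified sampler without top-k or top-p: when the temperature is zero or negative, the most likely token is picked directly instead of dividing the logits by the temperature. The function name and signature are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical sampler; returns a token index, or -1 for empty input.
int sample_token(const std::vector<float>& logits, float temp, std::mt19937& rng) {
    if (logits.empty()) return -1;
    const auto max_it = std::max_element(logits.begin(), logits.end());

    // Greedy path: dividing by temp <= 0 would produce infinities or flip
    // the ordering, so take the argmax explicitly.
    if (temp <= 0.0f) {
        return static_cast<int>(std::distance(logits.begin(), max_it));
    }

    // Temperature-scaled softmax weights, shifted by the max for stability.
    std::vector<double> weights(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        weights[i] = std::exp((logits[i] - *max_it) / temp);
    }
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    return dist(rng);
}
```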
Aaron Miller b14953e136
sampling: remove incorrect offset for n_vocab (#900)
no effect, but avoids a *potential* bug later if we use
actualVocabSize - which is for when a model has a larger
embedding tensor/# of output logits than actually trained tokens
to allow room for adding extras in finetuning - presently all of our
models have had "placeholder" tokens in the vocab so this hasn't broken
anything, but if the sizes did differ we want the equivalent of
`logits[:actualVocabSize]` (the start point is unchanged), not
`logits[-actualVocabSize:]` (this.)
1 year ago
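The distinction drawn in the commit above can be sketched in C++ as keeping the logits from the unchanged start point up to the trained vocabulary size, rather than taking the last entries of the buffer. The helper name and its use of a copied vector are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: keeps the first actual_vocab_size logits and drops
// any trailing placeholder entries. The start of the range does not move.
std::vector<float> trained_logits(const std::vector<float>& logits,
                                  std::size_t actual_vocab_size) {
    const std::size_t n = std::min(actual_vocab_size, logits.size());
    return std::vector<float>(logits.begin(), logits.begin() + n);
}
```

When the two sizes are equal, as the message says has been the case so far, the result is simply the whole buffer.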
Adam Treat 010a04d96f Revert "Synced llama.cpp.cmake with upstream (#887)"
This reverts commit 89910c7ca8.
1 year ago
Adam Treat 7e304106cc Fix for windows. 1 year ago
niansa/tuxifan 89910c7ca8
Synced llama.cpp.cmake with upstream (#887) 1 year ago
Richard Guo c4706d0c14
Replit Model (#713)
* porting over replit code model to gpt4all

* replaced memory with kv_self struct

* continuing debug

* welp it built but lot of sus things

* working model loading and somewhat working generate.. need to format response?

* revert back to semi working version

* finally got rid of weird formatting

* figured out problem is with python bindings - this is good to go for testing

* addressing PR feedback

* output refactor

* fixed prompt response collection

* cleanup

* addressing PR comments

* building replit backend with new ggmlver code

* chatllm replit and clean python files

* cleanup

* updated replit to match new llmodel api

* match llmodel api and change size_t to Token

* resolve PR comments

* replit model commit comment
1 year ago
Adam Treat c5de9634c9 Fix llama models on linux and windows. 1 year ago
Adam Treat 8a9ad258f4 Fix symbol resolution on windows. 1 year ago
Adam Treat 812b2f4b29 Make installers work with mac/windows for big backend change. 1 year ago
Adam Treat f73333c6a1 Update to latest llama.cpp 1 year ago
Adam Treat 301d2fdbea Fix up for newer models on reset context. This fixes the model from totally failing after a reset context. 1 year ago
AT 5f95aa9fc6
We no longer have an avx_only repository and better error handling for minimum hardware requirements. (#833) 1 year ago
AT bbe195ee02
Backend prompt dedup (#822)
* Deduplicated prompt() function code
1 year ago
Ikko Eltociear Ashimine 945297d837 Update README.md
huggingface -> Hugging Face

Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
1 year ago
Peter Gagarinov 23391d44e0 Only default mlock on macOS where swap seems to be a problem
Repeating the change that once was done in https://github.com/nomic-ai/gpt4all/pull/663 but then was overridden by 48275d0dcc

Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com>
1 year ago
niansa/tuxifan f3564ac6b9
Fixed tons of warnings and clazy findings (#811) 1 year ago
niansa/tuxifan d6a70ddb5f
Fixed model type for GPT-J (#815)
Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
Richard Guo e709e58603 more cleanup 1 year ago
Richard Guo 98420ea6d5 cleanup 1 year ago
Richard Guo c54c42e3fb fixed finding model libs 1 year ago
Adam Treat cec8831e12 Fix mac build again. 1 year ago
Adam Treat 70e3b7e907 Try and fix build on mac. 1 year ago
Adam Treat a41bd6ac0a Trying to shrink the copy+paste code and do more code sharing between backend model impl. 1 year ago
Tim Miller 87cb3505d3 Fix MSVC Build, Update C# Binding Scripts 1 year ago
niansa/tuxifan 27e80e1d10
Allow user to specify custom search path via $GPT4ALL_IMPLEMENTATIONS_PATH (#789) 1 year ago
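One way such an override could be read is sketched below. Only the environment variable name comes from the commit title; the function name and the fallback behavior are assumptions for illustration.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: prefers $GPT4ALL_IMPLEMENTATIONS_PATH when it is set
// and non-empty, otherwise returns the caller-supplied default path.
std::string implementations_search_path(const std::string& fallback) {
    const char* custom = std::getenv("GPT4ALL_IMPLEMENTATIONS_PATH");
    if (custom != nullptr && *custom != '\0') {
        return std::string(custom);
    }
    return fallback;
}
```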
niansa 5175db2781 Fixed double-free in LLModel::Implementation destructor 1 year ago
niansa/tuxifan fc60f0c09c
Cleaned up implementation management (#787)
* Cleaned up implementation management

* Initialize LLModel::m_implementation to nullptr

* llmodel.h: Moved dlhandle fwd declare above LLModel class
1 year ago
Adam Treat 1eca524171 Add fixme's and clean up a bit. 1 year ago
niansa a3d08cdcd5 Dlopen better implementation management (Version 2) 1 year ago
niansa/tuxifan 92407438c8
Advanced avxonly autodetection (#744)
* Advanced avxonly requirement detection
1 year ago
AT 48275d0dcc
Dlopen backend 5 (#779)
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squashed merged from dlopen_backend_5 where the history is preserved.
1 year ago
Adam Treat 7f9f91ad94 Revert "New tokenizer implementation for MPT and GPT-J"
This reverts commit bbcee1ced5.
1 year ago
Adam Treat cdc7d6ccc4 Revert "buf_ref.into() can be const now"
This reverts commit d59c77ac55.
1 year ago