Jared Van Bortel
061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-01-31 14:17:44 -05:00
Jared Van Bortel
38c61493d2
backend: update to latest commit of llama.cpp Vulkan PR
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-01-29 15:47:26 -06:00
Jared Van Bortel
d1c56b8b28
Implement configurable context length (#1749)
2023-12-16 17:58:15 -05:00
Jared Van Bortel
dfd8ef0186
backend: use ggml_new_graph for GGML backend v2 (#1719)
2023-12-06 14:38:53 -05:00
Adam Treat
cce5fe2045
Fix macos build.
2023-11-17 11:59:31 -05:00
Adam Treat
371e2a5cbc
LocalDocs version 2 with text embeddings.
2023-11-17 11:59:31 -05:00
cebtenzzre
fd0c501d68
backend: support GGUFv3 (#1582)
2023-10-27 17:07:23 -04:00
Cebtenzzre
050e7f076e
backend: port GPT-J to GGUF
2023-10-05 18:16:19 -04:00
Cebtenzzre
42bcb814b3
backend: port BERT to GGUF
2023-10-05 18:16:19 -04:00
Adam Treat
d90d003a1d
Latest rebase on llama.cpp with gguf support.
2023-10-05 18:16:19 -04:00
Aaron Miller
0bc2274869
bump llama.cpp version + needed fixes for that
2023-08-31 15:29:54 -04:00
Aaron Miller
1c4a244291
bump mem allocation a bit
2023-07-14 09:48:57 -04:00
Adam Treat
ee4186d579
Fixup bert python bindings.
2023-07-14 09:48:57 -04:00
Adam Treat
0efdbfcffe
Bert
2023-07-13 14:21:46 -04:00
Adam Treat
ae8eb297ac
Add sbert backend.
2023-07-13 14:21:46 -04:00