Commit Graph

22 Commits (ba53ab5da0931b61cf1771aec27a3e390b36575c)

Author SHA1 Message Date
Jared Van Bortel ba53ab5da0
python: do not print GPU name with verbose=False, expose this info via properties (#2222)
* llamamodel: only print device used in verbose mode

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: expose backend and device via GPT4All properties

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* backend: const correctness fixes

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: bump version

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: typing fixups

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

* python: fix segfault with closed GPT4All

Signed-off-by: Jared Van Bortel <jared@nomic.ai>

---------

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 46818e466e
python: embedding cancel callback for nomic client dynamic mode (#2214)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 1b84a48c47
python: add list_gpus to the GPT4All API (#2194)
Other changes:
* fix memory leak in llmodel_available_gpu_devices
* drop model argument from llmodel_available_gpu_devices
* breaking: make GPT4All/Embed4All arguments past model_name keyword-only

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2 months ago
Jared Van Bortel 0455b80b7f
Embed4All: optionally count tokens, misc fixes (#2145)
Key changes:
* python: optionally return token count in Embed4All.embed
* python and docs: models2.json -> models3.json
* Embed4All: require explicit prefix for unknown models
* llamamodel: fix shouldAddBOS for Bert and Nomic Bert

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel 406e88b59a
implement local Nomic Embed via llama.cpp (#2086)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
Jared Van Bortel c19b763e03
llmodel_c: expose fakeReply to the bindings (#2061)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
3 months ago
chrisbarrera f8b1069a1c
add min_p sampling parameter (#2014)
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
4 months ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749)
6 months ago
Jared Van Bortel d4ce9f4a7c
llmodel_c: improve quality of error messages (#1625)
7 months ago
Adam Treat 987546c63b
Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.
10 months ago
cosmic-snow 2d02c65177
Handle edge cases when generating embeddings (#1215)
* Handle edge cases when generating embeddings
* Improve Python handling & add llmodel_c.h note
- In the Python bindings, fail fast with a ValueError when text is empty
- Advise other bindings authors to do likewise in llmodel_c.h
11 months ago
Adam Treat 0efdbfcffe
Bert
11 months ago
Aaron Miller b19a3e5b2c
add requiredMem method to llmodel impls
Most of these can just shortcut out of the model loading logic. llama is a bit worse to deal with because we submodule it, so I have to at least parse the hparams, and then I just use the size on disk as an estimate for the mem size (which seems reasonable since we mmap() the llama files anyway).
12 months ago
Adam Treat 812b2f4b29
Make installers work with mac/windows for big backend change.
1 year ago
Richard Guo e709e58603
more cleanup
1 year ago
Richard Guo c54c42e3fb
fixed finding model libs
1 year ago
AT 48275d0dcc
Dlopen backend 5 (#779)
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squash-merged from dlopen_backend_5, where the history is preserved.
1 year ago
Juuso Alasuutari 08ece43f0d
llmodel: fix wrong and/or missing prompt callback type
Fix occurrences of the prompt callback being incorrectly specified, or
the response callback's prototype being incorrectly used in its place.

Signed-off-by: Juuso Alasuutari <juuso.alasuutari@gmail.com>
1 year ago
kuvaus 3cb6dd7a66
gpt4all-backend: Add llmodel create and destroy functions (#554)
* Add llmodel create and destroy functions

* Fix capitalization

* Fix capitalization

* Fix capitalization

* Update CMakeLists.txt

---------

Co-authored-by: kuvaus <kuvaus@users.noreply.github.com>
1 year ago
Adam Treat d918b02c29
Move the llmodel C API to new top-level directory and version it.
1 year ago