Commit Graph

17 Commits (1272b694ae0aec987a8f260b836261221ae6d8b1)

Jared Van Bortel 636307160e
backend: fix #includes with include-what-you-use (#2371)
Also fix a PARENT_SCOPE warning when building the backend.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 46818e466e
python: embedding cancel callback for nomic client dynamic mode (#2214)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel 0455b80b7f
Embed4All: optionally count tokens, misc fixes (#2145)
Key changes:
* python: optionally return token count in Embed4All.embed
* python and docs: models2.json -> models3.json
* Embed4All: require explicit prefix for unknown models
* llamamodel: fix shouldAddBOS for Bert and Nomic Bert

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
6 months ago
Jared Van Bortel 406e88b59a
implement local Nomic Embed via llama.cpp (#2086)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
6 months ago
Jared Van Bortel f500bcf6e5
llmodel: default to a blank line between reply and next prompt (#1996)
Also make some related adjustments to the provided Alpaca-style prompt templates
and system prompts.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 3acbef14b7
fix AVX support by removing direct linking to AVX2 libs (#1750)
9 months ago
Adam Treat 12f943e966
Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf.
12 months ago
Adam Treat 045f6e6cdc
Link against ggml in bin so we can get the available devices without loading a model.
1 year ago
Adam Treat 0efdbfcffe
Bert
1 year ago
Adam Treat 315a1f2aa2
Move it back as internal class.
1 year ago
Adam Treat 1f749d7633
Clean up backend code a bit and hide impl. details.
1 year ago
Aaron Miller 7a5f6e4726
limit prompt batch size to 128
1 year ago
Aaron Miller 88616fde7f
llmodel: change tokenToString to not use string_view (#968)
Fixes a definite use-after-free and likely avoids some other potential ones. A std::string converts to a std::string_view automatically, but as soon as that std::string goes out of scope it is freed and the string_view points at freed memory. This is *mostly* fine when returning a view into the tokenizer's internal vocab table, but it makes it too easy to return a view of a dynamically constructed string, which is what the Replit code does (and unfortunately needs to do, to convert the internal whitespace replacement symbol back to a space). A short sketch of this dangling-view pattern follows this entry.
1 year ago
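
The lifetime bug described in the commit above comes down to returning a std::string_view that was implicitly converted from a local std::string. The sketch below is a minimal, hypothetical illustration of that pattern; the function names and signatures are invented for illustration and are not the actual llmodel or tokenizer API. The view returned by the first function dangles as soon as the local string is destroyed, while returning std::string by value (as the commit title suggests the fix does) keeps the data alive in the caller.

    // Minimal sketch of the dangling string_view pattern (hypothetical names,
    // not the actual gpt4all llmodel API). Compile with -std=c++17.
    #include <iostream>
    #include <string>
    #include <string_view>

    // BUGGY: the local std::string converts to std::string_view on return,
    // then is destroyed, so the returned view points at freed memory.
    std::string_view tokenToStringView(int token) {
        // Dynamically constructed text, e.g. mapping an internal whitespace
        // symbol back to a real space as the commit message describes.
        std::string decoded = "tok" + std::to_string(token) + " ";
        return decoded;  // dangling: 'decoded' is destroyed when we return
    }

    // SAFE: return by value so the caller owns the decoded text.
    std::string tokenToString(int token) {
        std::string decoded = "tok" + std::to_string(token) + " ";
        return decoded;
    }

    int main() {
        std::string_view bad = tokenToStringView(3);  // already dangling here
        (void)bad;                                    // reading it would be UB
        std::cout << tokenToString(3) << '\n';        // fine: owns its buffer
        return 0;
    }

Returning a view into the tokenizer's long-lived vocab table would not dangle, which is why the original code *mostly* worked; returning by value removes the need for callers to reason about lifetimes at all.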
Adam Treat 301d2fdbea
Fix up for newer models on reset context. This prevents the model from failing completely after a context reset.
1 year ago
AT bbe195ee02
Backend prompt dedup (#822)
* Deduplicated prompt() function code
1 year ago
Adam Treat 70e3b7e907
Try and fix build on mac.
1 year ago