Jared Van Bortel
bd307abfe6
backend: fix a crash on inputs greater than n_ctx (#2498)
This fixes a regression in commit 4fc4d94b
("fix chat-style prompt
templates (#1970)"), which moved some return statements into a new
function (LLModel::decodePrompt) without making them return from the
parent as well.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-07-01 11:33:46 -04:00
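A minimal sketch of the bug pattern described in the commit above, with assumed names and signatures (only LLModel::decodePrompt is from the commit): a return statement moved into a helper exits only the helper, so the caller must check the result and return as well, or execution falls through where it previously stopped.

```cpp
#include <cstdio>

// Hypothetical stand-in for LLModel::decodePrompt. Before the refactor
// the length check lived in the caller, where `return` aborted the whole
// prompt; afterwards it only exits this helper.
static bool decodePrompt(int nTokens, int nCtx) {
    if (nTokens > nCtx) {
        std::fprintf(stderr, "prompt too long (%d > %d tokens)\n", nTokens, nCtx);
        return false; // signal failure instead of a bare return
    }
    // ... decode the prompt in batches ...
    return true;
}

static void prompt(int nTokens, int nCtx) {
    if (!decodePrompt(nTokens, nCtx))
        return; // the propagation the regression was missing
    // ... sample and emit the response ...
}

int main() {
    prompt(/*nTokens=*/4096, /*nCtx=*/2048); // oversized input now fails cleanly
}
```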
AT
9273b49b62
chat: major UI redesign for v3.0.0 (#2396)
Signed-off-by: Adam Treat <treat.adam@gmail.com>
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
2024-06-24 18:49:23 -04:00
Jared Van Bortel
636307160e
backend: fix #includes with include-what-you-use (#2371)
Also fix a PARENT_SCOPE warning when building the backend.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-05-31 16:34:54 -04:00
Jared Van Bortel
46818e466e
python: embedding cancel callback for nomic client dynamic mode (#2214)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-04-12 16:00:39 -04:00
Jared Van Bortel
0455b80b7f
Embed4All: optionally count tokens, misc fixes (#2145)
Key changes:
* python: optionally return token count in Embed4All.embed
* python and docs: models2.json -> models3.json
* Embed4All: require explicit prefix for unknown models
* llamamodel: fix shouldAddBOS for Bert and Nomic Bert
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-20 11:24:02 -04:00
Jared Van Bortel
406e88b59a
implement local Nomic Embed via llama.cpp (#2086)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-03-13 18:09:24 -04:00
Jared Van Bortel
f500bcf6e5
llmodel: default to a blank line between reply and next prompt (#1996)
Also make some related adjustments to the provided Alpaca-style prompt templates
and system prompts.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-26 13:11:15 -05:00
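For illustration only, a generic Alpaca-style exchange (not the project's actual template) showing the blank-line separator this commit makes the default between one reply and the next prompt:

```
### Instruction:
What is the capital of France?

### Response:
Paris.

### Instruction:
And of Spain?

### Response:
```

Without the blank line after "Paris.", the next "### Instruction:" header would run directly against the previous reply.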
Jared Van Bortel
4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
2024-02-21 15:45:32 -05:00
Jared Van Bortel
3acbef14b7
fix AVX support by removing direct linking to AVX2 libs (#1750)
2023-12-13 12:11:09 -05:00
Adam Treat
12f943e966
Fix the regenerate button to be deterministic and bump the llama version to the latest we have for GGUF.
2023-10-05 18:16:19 -04:00
Adam Treat
045f6e6cdc
Link against ggml in the binary so we can get the available devices without loading a model.
2023-09-15 14:45:25 -04:00
Adam Treat
0efdbfcffe
Add Bert support.
2023-07-13 14:21:46 -04:00
Adam Treat
315a1f2aa2
Move it back to an internal class.
2023-07-13 14:21:46 -04:00
Adam Treat
1f749d7633
Clean up backend code a bit and hide implementation details.
2023-07-13 14:21:46 -04:00
Aaron Miller
7a5f6e4726
limit prompt batch size to 128
2023-06-30 21:07:21 -03:00
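A sketch of what such a cap implies, with assumed names (the commit states only the 128 limit):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxPromptBatch = 128; // cap from the commit above

// Hypothetical loop shape: feed the prompt to the model in slices of at
// most kMaxPromptBatch tokens instead of all at once.
void processPrompt(const std::vector<int> &tokens) {
    for (std::size_t i = 0; i < tokens.size(); i += kMaxPromptBatch) {
        std::size_t n = std::min(kMaxPromptBatch, tokens.size() - i);
        // evaluate(&tokens[i], n); // model-specific decode call (assumed)
        (void)n;
    }
}
```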
Aaron Miller
88616fde7f
llmodel: change tokenToString to not use string_view (#968)
fixes a definite use-after-free and likely avoids some other
potential ones. A std::string will convert to a std::string_view
automatically, but as soon as the std::string in question goes out of
scope it is freed and the string_view is left pointing at freed
memory. This is *mostly* fine if it's returning a reference to the
tokenizer's internal vocab table, but it is, imo, too easy to return a
reference to a dynamically constructed string this way, as Replit is
doing (and unfortunately needs to do, to convert the internal
whitespace-replacement symbol back to a space).
2023-06-13 07:14:02 -04:00
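A self-contained sketch of the dangling-view bug described above (illustrative strings; the real function is tokenToString):

```cpp
#include <string>
#include <string_view>

// BUG: the local std::string converts to std::string_view implicitly,
// but it is destroyed when the function returns, so the view dangles.
std::string_view tokenToStringBuggy() {
    std::string s = "de-whitespaced token"; // dynamically constructed
    return s; // view into memory freed at end of scope
}

// Mostly fine: the view refers to storage that outlives the call,
// such as the tokenizer's internal vocab table.
std::string_view tokenToStringVocab(const std::string &vocabEntry) {
    return vocabEntry;
}

// The commit's direction: return std::string by value so the caller
// owns the buffer.
std::string tokenToStringFixed() {
    return "de-whitespaced token";
}
```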
Adam Treat
301d2fdbea
Fix up newer models on context reset. This prevents the model from totally failing after a context reset.
2023-06-04 19:31:20 -04:00
AT
bbe195ee02
Backend prompt dedup (#822)
* Deduplicated prompt() function code
2023-06-04 08:59:24 -04:00
Adam Treat
70e3b7e907
Try to fix the build on macOS.
2023-06-02 10:47:12 -04:00