gpt4all

Commit Graph

Author	SHA1	Message	Date
Jared Van Bortel	01870b4a46	chat: fix blank device in UI and improve Mixpanel reporting (#2409) Also remove LLModel::hasGPUDevice. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	3 months ago
Jared Van Bortel	88d85be0f9	chat: fix build on Windows and Nomic Embed path on macOS (#2467) * chat: remove unused oscompat source files These files are no longer needed now that the hnswlib index is gone. This fixes an issue with the Windows build as there was a compilation error in oscompat.cpp. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llm: fix pragma to be recognized by MSVC Replaces this MSVC warning: C:\msys64\home\Jared\gpt4all\gpt4all-chat\llm.cpp(53,21): warning C4081: expected '('; found 'string' With this: C:\msys64\home\Jared\gpt4all\gpt4all-chat\llm.cpp : warning : offline installer build will not check for updates! Signed-off-by: Jared Van Bortel <jared@nomic.ai> * usearch: fork usearch to fix `CreateFile` build error Signed-off-by: Jared Van Bortel <jared@nomic.ai> * dlhandle: fix incorrect assertion on Windows SetErrorMode returns the previous value of the error mode flags, not an indicator of success. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * llamamodel: fix UB in LLamaModel::embedInternal It is undefined behavior to increment an STL iterator past the end of the container. Use offsets to do the math instead. Signed-off-by: Jared Van Bortel <jared@nomic.ai> * cmake: install embedding model to bundle's Resources dir on macOS Signed-off-by: Jared Van Bortel <jared@nomic.ai> * ci: fix macOS build by explicitly installing Rosetta Signed-off-by: Jared Van Bortel <jared@nomic.ai> --------- Signed-off-by: Jared Van Bortel <jared@nomic.ai>	3 months ago
AT	9273b49b62	chat: major UI redesign for v3.0.0 (#2396) Signed-off-by: Adam Treat <treat.adam@gmail.com> Signed-off-by: Jared Van Bortel <jared@nomic.ai> Co-authored-by: Jared Van Bortel <jared@nomic.ai>	3 months ago
Jared Van Bortel	636307160e	backend: fix #includes with include-what-you-use (#2371) Also fix a PARENT_SCOPE warning when building the backend. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	4 months ago
Jared Van Bortel	e94177ee9a	llamamodel: fix embedding crash for >512 tokens after #2310 (#2383) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	4 months ago
Jared Van Bortel	f1b4092ca6	llamamodel: fix BERT tokenization after llama.cpp update (#2381) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	4 months ago
Jared Van Bortel	d2a99d9bc6	support the llama.cpp CUDA backend (#2310) * rebase onto llama.cpp commit ggerganov/llama.cpp@d46dbc76f * support for CUDA backend (enabled by default) * partial support for Occam's Vulkan backend (disabled by default) * partial support for HIP/ROCm backend (disabled by default) * sync llama.cpp.cmake with upstream llama.cpp CMakeLists.txt * changes to GPT4All backend, bindings, and chat UI to handle choice of llama.cpp backend (Kompute or CUDA) * ship CUDA runtime with installed version * make device selection in the UI on macOS actually do something * model whitelist: remove dbrx, mamba, persimmon, plamo; add internlm and starcoder2 Signed-off-by: Jared Van Bortel <jared@nomic.ai>	4 months ago
Jared Van Bortel	9f9d8e636f	backend: do not crash if GGUF lacks general.architecture (#2346) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	4 months ago
Jared Van Bortel	6d8888b267	llamamodel: free the batch in embedInternal (#2348) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	4 months ago
Jared Van Bortel	c622921894	improve mixpanel usage statistics (#2238) Other changes: - Always display first start dialog if privacy options are unset (e.g. if the user closed GPT4All without selecting them) - LocalDocs scanQueue is now always deferred - Fix a potential crash in magic_match - LocalDocs indexing is now started after the first start dialog is dismissed so usage stats are included Signed-off-by: Jared Van Bortel <jared@nomic.ai>	5 months ago
Jared Van Bortel	ba53ab5da0	python: do not print GPU name with verbose=False, expose this info via properties (#2222) * llamamodel: only print device used in verbose mode Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: expose backend and device via GPT4All properties Signed-off-by: Jared Van Bortel <jared@nomic.ai> * backend: const correctness fixes Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: bump version Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: typing fixups Signed-off-by: Jared Van Bortel <jared@nomic.ai> * python: fix segfault with closed GPT4All Signed-off-by: Jared Van Bortel <jared@nomic.ai> --------- Signed-off-by: Jared Van Bortel <jared@nomic.ai>	5 months ago
Jared Van Bortel	ac498f79ac	fix regressions in system prompt handling (#2219) * python: fix system prompt being ignored * fix unintended whitespace after system prompt Signed-off-by: Jared Van Bortel <jared@nomic.ai>	5 months ago
Jared Van Bortel	3f8257c563	llamamodel: fix semantic typo in nomic client dynamic mode (#2216) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	5 months ago
Jared Van Bortel	46818e466e	python: embedding cancel callback for nomic client dynamic mode (#2214) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	5 months ago
Jared Van Bortel	459289b94c	embed4all: small fixes related to nomic client local embeddings (#2213) * actually submit larger batches with increased n_ctx * fix crash when llama_tokenize returns no tokens Signed-off-by: Jared Van Bortel <jared@nomic.ai>	5 months ago
Jared Van Bortel	0455b80b7f	Embed4All: optionally count tokens, misc fixes (#2145) Key changes: * python: optionally return token count in Embed4All.embed * python and docs: models2.json -> models3.json * Embed4All: require explicit prefix for unknown models * llamamodel: fix shouldAddBOS for Bert and Nomic Bert Signed-off-by: Jared Van Bortel <jared@nomic.ai>	6 months ago
Jared Van Bortel	a1bb6084ed	python: documentation update and typing improvements (#2129) Key changes: * revert "python: tweak constructor docstrings" * docs: update python GPT4All and Embed4All documentation * breaking: require keyword args to GPT4All.generate Signed-off-by: Jared Van Bortel <jared@nomic.ai>	6 months ago
Jared Van Bortel	255568fb9a	python: various fixes for GPT4All and Embed4All (#2130) Key changes: * honor empty system prompt argument * current_chat_session is now read-only and defaults to None * deprecate fallback prompt template for unknown models * fix mistakes from #2086 Signed-off-by: Jared Van Bortel <jared@nomic.ai>	6 months ago
Jared Van Bortel	53f109f519	llamamodel: fix macOS build (#2125) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	6 months ago
Jared Van Bortel	406e88b59a	implement local Nomic Embed via llama.cpp (#2086) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	6 months ago
Adam Treat	f720261d46	Fix another vulnerable spot for crashes. Signed-off-by: Adam Treat <treat.adam@gmail.com>	7 months ago
chrisbarrera	f8b1069a1c	add min_p sampling parameter (#2014) Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com> Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>	7 months ago
Jared Van Bortel	e7f2ff189f	fix some compilation warnings on macOS Signed-off-by: Jared Van Bortel <jared@nomic.ai>	7 months ago
Jared Van Bortel	4fc4d94be4	fix chat-style prompt templates (#1970) Also use a new version of Mistral OpenOrca. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	7 months ago
Jared Van Bortel	7810b757c9	llamamodel: add gemma model support Signed-off-by: Jared Van Bortel <jared@nomic.ai>	7 months ago
Adam Treat	d948a4f2ee	Complete revamp of model loading to allow for more discreet control by the user of the models loading behavior. Signed-off-by: Adam Treat <treat.adam@gmail.com>	7 months ago
Jared Van Bortel	fc7e5f4a09	ci: fix missing Kompute support in python bindings (#1953) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	7 months ago
Jared Van Bortel	bf493bb048	Mixtral crash fix and python bindings v2.2.0 (#1931) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	8 months ago
Jared Van Bortel	92c025a7f6	llamamodel: add 12 new architectures for CPU inference (#1914) Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2, Plamo, Qwen, Qwen2, Refact, StableLM Signed-off-by: Jared Van Bortel <jared@nomic.ai>	8 months ago
Jared Van Bortel	10e3f7bbf5	Fix VRAM leak when model loading fails (#1901) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	8 months ago
Jared Van Bortel	061d1969f8	expose n_gpu_layers parameter of llama.cpp (#1890) Also dynamically limit the GPU layers and context length fields to the maximum supported by the model. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	8 months ago
Jared Van Bortel	38c61493d2	backend: update to latest commit of llama.cpp Vulkan PR Signed-off-by: Jared Van Bortel <jared@nomic.ai>	8 months ago
Jared Van Bortel	b7c92c5afd	sync llama.cpp with latest Vulkan PR and newer upstream (#1819)	8 months ago
AT	96cee4f9ac	Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808)	9 months ago
Jared Van Bortel	d1c56b8b28	Implement configurable context length (#1749)	9 months ago
Jared Van Bortel	0600f551b3	chatllm: do not attempt to serialize incompatible state (#1742)	9 months ago
Jared Van Bortel	9e28dfac9c	Update to latest llama.cpp (#1706)	10 months ago
Jared Van Bortel	d4ce9f4a7c	llmodel_c: improve quality of error messages (#1625)	11 months ago
cebtenzzre	fd0c501d68	backend: support GGUFv3 (#1582)	11 months ago
cebtenzzre	4338e72a51	MPT: use upstream llama.cpp implementation (#1515)	11 months ago
cebtenzzre	0fe2e19691	llamamodel: re-enable error messages by default (#1537)	11 months ago
Aaron Miller	afaa291eab	python bindings should be quiet by default * disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is nonempty * make verbose flag for retrieve_model default false (but also be overridable via gpt4all constructor) should be able to run a basic test: ```python import gpt4all model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf') print(model.generate('def fib(n):')) ``` and see no non-model output when successful	11 months ago
Cebtenzzre	5fe685427a	chat: clearer CPU fallback messages	12 months ago
Cebtenzzre	d87573ea75	remove old llama.cpp submodules	12 months ago
Cebtenzzre	672cb850f9	differentiate between init failure and unsupported models	12 months ago
Cebtenzzre	088afada49	llamamodel: fix static vector in LLamaModel::endTokens	12 months ago
Adam Treat	12f943e966	Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf.	12 months ago
Cebtenzzre	ce7be1db48	backend: use llamamodel.cpp for Falcon	12 months ago
Cebtenzzre	6277eac9cc	backend: use llamamodel.cpp for StarCoder	12 months ago
Cebtenzzre	1d29e4696c	llamamodel: metal supports all quantization types now	12 months ago

76 Commits (d515ad3b183fe06a9d5c2b84a26a7384d24fc616)