Commit Graph

56 Commits (0cc5a806563a9264b7c7751f4ca50e2f700c0847)

Author SHA1 Message Date
Adam Treat f720261d46 Fix another spot vulnerable to crashes.
Signed-off-by: Adam Treat <treat.adam@gmail.com>
4 months ago
chrisbarrera f8b1069a1c
add min_p sampling parameter (#2014)
Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com>
Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>
4 months ago
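For context, min_p sampling keeps only the tokens whose probability is at least min_p times that of the most likely token, then renormalizes what survives. An illustrative sketch of the filter (not the actual llama.cpp sampler):

```cpp
#include <algorithm>
#include <vector>

// Sketch of min_p filtering: keep tokens whose probability is at least
// min_p times the top token's probability, then renormalize.
std::vector<float> min_p_filter(std::vector<float> probs, float min_p) {
    float p_max = *std::max_element(probs.begin(), probs.end());
    float threshold = min_p * p_max;
    for (float &p : probs)
        if (p < threshold)
            p = 0.0f; // excluded from sampling
    float sum = 0.0f;
    for (float p : probs) sum += p;
    for (float &p : probs) p /= sum; // p_max always survives, so sum > 0
    return probs;
}
```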
Jared Van Bortel e7f2ff189f fix some compilation warnings on macOS
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 4fc4d94be4
fix chat-style prompt templates (#1970)
Also use a new version of Mistral OpenOrca.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Jared Van Bortel 7810b757c9 llamamodel: add gemma model support
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
4 months ago
Adam Treat d948a4f2ee Complete revamp of model loading to allow the user more
fine-grained control over the model's loading behavior.

Signed-off-by: Adam Treat <treat.adam@gmail.com>
4 months ago
Jared Van Bortel fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel 92c025a7f6
llamamodel: add 12 new architectures for CPU inference (#1914)
Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2,
Plamo, Qwen, Qwen2, Refact, StableLM

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel 10e3f7bbf5
Fix VRAM leak when model loading fails (#1901)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
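For reference, n_gpu_layers controls how many transformer layers llama.cpp offloads to the GPU. A minimal sketch against llama.cpp's C API (the gpt4all plumbing differs; this only shows the underlying knob):

```cpp
#include "llama.h"

// Sketch: expose llama.cpp's n_gpu_layers knob at model load time.
// 0 keeps everything on the CPU; a large value offloads all layers.
llama_model * load_with_offload(const char * path, int ngl) {
    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = ngl;
    return llama_load_model_from_file(path, mparams);
}
```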
Jared Van Bortel 38c61493d2 backend: update to latest commit of llama.cpp Vulkan PR
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
5 months ago
Jared Van Bortel b7c92c5afd
sync llama.cpp with latest Vulkan PR and newer upstream (#1819) 6 months ago
AT 96cee4f9ac
Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808) 6 months ago
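A sketch of that pattern, assuming llama.cpp's llama_kv_cache_clear and llama_decode (the real change lives in the backend's prompt path):

```cpp
#include "llama.h"

// Sketch: clear stale KV cache entries before evaluating tokens so the
// cache contents always correspond to n_past. Assumes llama.cpp's C API.
void eval_from_scratch(llama_context * ctx, llama_batch batch) {
    llama_kv_cache_clear(ctx); // cache now matches n_past == 0
    llama_decode(ctx, batch);  // re-evaluate the tokens we want cached
}
```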
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749) 7 months ago
Jared Van Bortel 0600f551b3
chatllm: do not attempt to serialize incompatible state (#1742) 7 months ago
Jared Van Bortel 9e28dfac9c
Update to latest llama.cpp (#1706) 7 months ago
Jared Van Bortel d4ce9f4a7c
llmodel_c: improve quality of error messages (#1625) 8 months ago
cebtenzzre fd0c501d68
backend: support GGUFv3 (#1582) 8 months ago
cebtenzzre 4338e72a51
MPT: use upstream llama.cpp implementation (#1515) 9 months ago
cebtenzzre 0fe2e19691
llamamodel: re-enable error messages by default (#1537) 9 months ago
Aaron Miller afaa291eab python bindings should be quiet by default
* disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is
  nonempty
* make verbose flag for retrieve_model default false (but also be
  overridable via gpt4all constructor)

You should be able to run a basic test:

```python
import gpt4all
model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
print(model.generate('def fib(n):'))
```

and see no non-model output when successful
9 months ago
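The gating described above amounts to installing a no-op log callback unless the environment variable is set. A rough sketch, assuming llama.cpp's llama_log_set hook:

```cpp
#include <cstdlib>
#include "llama.h"

// Sketch: silence llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP is
// set to a nonempty value.
static void null_logger(ggml_log_level, const char *, void *) {}

void configure_llama_logging() {
    const char * verbose = std::getenv("GPT4ALL_VERBOSE_LLAMACPP");
    if (verbose == nullptr || verbose[0] == '\0')
        llama_log_set(null_logger, nullptr);
}
```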
Cebtenzzre 5fe685427a chat: clearer CPU fallback messages 9 months ago
Cebtenzzre d87573ea75 remove old llama.cpp submodules 9 months ago
Cebtenzzre 672cb850f9 differentiate between init failure and unsupported models 9 months ago
Cebtenzzre 088afada49 llamamodel: fix static vector in LLamaModel::endTokens 9 months ago
Adam Treat 12f943e966 Fix the regenerate button to be deterministic and bump the llama version to the latest we have for GGUF. 9 months ago
Cebtenzzre ce7be1db48 backend: use llamamodel.cpp for Falcon 9 months ago
Cebtenzzre 6277eac9cc backend: use llamamodel.cpp for StarCoder 9 months ago
Cebtenzzre 1d29e4696c llamamodel: metal supports all quantization types now 9 months ago
Adam Treat d90d003a1d Latest rebase on llama.cpp with GGUF support. 9 months ago
Adam Treat aa33419c6e Fallback to CPU more robustly. 10 months ago
Adam Treat 3076e0bf26 Only show GPU when we're actually using it. 10 months ago
Adam Treat 987546c63b Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0. 10 months ago
Aaron Miller 198b5e4832 add Falcon 7B model
Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin
1 year ago
Aaron Miller db34a2f670 llmodel: skip attempting Metal if model+kvcache > 53% of system ram 1 year ago
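That guard is a plain arithmetic check; a sketch of the comparison (the 53% threshold is from the commit, the helper name and parameters are hypothetical):

```cpp
#include <cstddef>

// Sketch: skip the Metal backend when the model plus KV cache would
// occupy more than 53% of system RAM. Names here are hypothetical.
bool should_try_metal(size_t model_bytes, size_t kvcache_bytes, size_t system_ram_bytes) {
    return model_bytes + kvcache_bytes <= system_ram_bytes * 53 / 100;
}
```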
Aaron Miller b19a3e5b2c add requiredMem method to llmodel impls
Most of these can just shortcut out of the model loading logic. Llama is a bit worse to deal with because we submodule it, so I have to at least parse the hparams; then I just use the size on disk as an estimate for the memory size (which seems reasonable since we mmap() the llama files anyway).
1 year ago
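Since the llama files are mmap()ed, on-disk size is a fair proxy for resident memory. A minimal sketch of that estimate (std::filesystem stands in for whatever the implementations actually use):

```cpp
#include <cstdint>
#include <filesystem>

// Sketch: estimate a llama model's required memory as its on-disk size,
// reasonable because the file is mmap()ed rather than copied into RAM.
std::uintmax_t required_mem_estimate(const std::filesystem::path & model_path) {
    return std::filesystem::file_size(model_path);
}
```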
Aaron Miller 88616fde7f
llmodel: change tokenToString to not use string_view (#968)
fixes a definite use-after-free and likely avoids some other
potential ones. std::string will convert to a std::string_view
automatically, but as soon as the std::string in question goes out of
scope it is freed and the string_view is pointing at freed memory.
This is *mostly* fine if it's returning a reference to the tokenizer's
internal vocab table, but it's, imo, too easy to return a reference to
a dynamically constructed string with this, as replit is doing (and
unfortunately needs to do, to convert the internal whitespace
replacement symbol back to a space)
1 year ago
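To illustrate the bug class: a std::string_view that refers to a function-local std::string dangles the moment the function returns. A distilled example (not the actual replit code):

```cpp
#include <string>
#include <string_view>

// BUG: the local std::string is destroyed when the function returns,
// so the returned view points at freed memory.
std::string_view token_to_string_bad(int token_id) {
    std::string s = "token-" + std::to_string(token_id); // dynamically constructed
    return s; // implicit conversion to string_view; dangles immediately
}

// FIX: return the string by value so the caller owns the storage.
std::string token_to_string_good(int token_id) {
    return "token-" + std::to_string(token_id);
}
```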
Adam Treat b906fb4057 When recalculating context, we can't erase the BOS. 1 year ago
Aaron Miller d3ba1295a7
Metal+LLama take two (#929)
Support latest llama with Metal
---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
Adam Treat b162b5c64e Revert "llama on Metal (#885)"
This reverts commit c55f81b860.
1 year ago
Aaron Miller c55f81b860
llama on Metal (#885)
Support latest llama with Metal

---------

Co-authored-by: Adam Treat <adam@nomic.ai>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
1 year ago
Adam Treat 301d2fdbea Fix up newer models on context reset. This prevents the model from failing completely after a context reset. 1 year ago
AT bbe195ee02
Backend prompt dedup (#822)
* Deduplicated prompt() function code
1 year ago
Peter Gagarinov 23391d44e0 Only enable mlock by default on macOS, where swap seems to be a problem
Repeats the change that was once made in https://github.com/nomic-ai/gpt4all/pull/663 but was then overridden by 48275d0dcc

Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com>
1 year ago
niansa/tuxifan f3564ac6b9
Fixed tons of warnings and clazy findings (#811) 1 year ago
Adam Treat a41bd6ac0a Trying to shrink the copy-and-paste code and do more code sharing between backend model implementations. 1 year ago
niansa a3d08cdcd5 Dlopen better implementation management (Version 2) 1 year ago
AT 48275d0dcc
Dlopen backend 5 (#779)
Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squash-merged from dlopen_backend_5, where the history is preserved.
1 year ago
Adam Treat 9bfff8bfcb Add new reverse prompt for new localdocs context feature. 1 year ago