gpt4all

mirror of https://github.com/nomic-ai/gpt4all synced 2024-11-02 09:40:42 +00:00

Author	SHA1	Message	Date
Adam Treat	f720261d46	Fix another vulnerable spot for crashes. Signed-off-by: Adam Treat <treat.adam@gmail.com>	2024-02-26 12:04:16 -06:00
chrisbarrera	f8b1069a1c	add min_p sampling parameter (#2014 ) Signed-off-by: Christopher Barrera <cb@arda.tx.rr.com> Co-authored-by: Jared Van Bortel <cebtenzzre@gmail.com>	2024-02-24 17:51:34 -05:00
Jared Van Bortel	e7f2ff189f	fix some compilation warnings on macOS Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-22 15:09:06 -05:00
Jared Van Bortel	4fc4d94be4	fix chat-style prompt templates (#1970 ) Also use a new version of Mistral OpenOrca. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-21 15:45:32 -05:00
Jared Van Bortel	7810b757c9	llamamodel: add gemma model support Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-21 13:36:31 -06:00
Adam Treat	d948a4f2ee	Complete revamp of model loading to allow for more discreet control by the user of the models loading behavior. Signed-off-by: Adam Treat <treat.adam@gmail.com>	2024-02-21 10:15:20 -06:00
Jared Van Bortel	fc7e5f4a09	ci: fix missing Kompute support in python bindings (#1953 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-09 21:40:32 -05:00
Jared Van Bortel	bf493bb048	Mixtral crash fix and python bindings v2.2.0 (#1931 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-06 11:01:15 -05:00
Jared Van Bortel	92c025a7f6	llamamodel: add 12 new architectures for CPU inference (#1914 ) Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2, Plamo, Qwen, Qwen2, Refact, StableLM Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-05 16:49:31 -05:00
Jared Van Bortel	10e3f7bbf5	Fix VRAM leak when model loading fails (#1901 ) Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-02-01 15:45:45 -05:00
Jared Van Bortel	061d1969f8	expose n_gpu_layers parameter of llama.cpp (#1890 ) Also dynamically limit the GPU layers and context length fields to the maximum supported by the model. Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-31 14:17:44 -05:00
Jared Van Bortel	38c61493d2	backend: update to latest commit of llama.cpp Vulkan PR Signed-off-by: Jared Van Bortel <jared@nomic.ai>	2024-01-29 15:47:26 -06:00
Jared Van Bortel	b7c92c5afd	sync llama.cpp with latest Vulkan PR and newer upstream (#1819 )	2024-01-16 16:36:21 -05:00
AT	96cee4f9ac	Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808 )	2024-01-03 14:06:08 -05:00
Jared Van Bortel	d1c56b8b28	Implement configurable context length (#1749 )	2023-12-16 17:58:15 -05:00
Jared Van Bortel	0600f551b3	chatllm: do not attempt to serialize incompatible state (#1742 )	2023-12-12 11:45:03 -05:00
Jared Van Bortel	9e28dfac9c	Update to latest llama.cpp (#1706 )	2023-12-01 16:51:15 -05:00
Jared Van Bortel	d4ce9f4a7c	llmodel_c: improve quality of error messages (#1625 )	2023-11-07 11:20:14 -05:00
cebtenzzre	fd0c501d68	backend: support GGUFv3 (#1582 )	2023-10-27 17:07:23 -04:00
cebtenzzre	4338e72a51	MPT: use upstream llama.cpp implementation (#1515 )	2023-10-19 15:25:17 -04:00
cebtenzzre	0fe2e19691	llamamodel: re-enable error messages by default (#1537 )	2023-10-19 13:46:33 -04:00
Aaron Miller	afaa291eab	python bindings should be quiet by default * disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is nonempty * make verbose flag for retrieve_model default false (but also be overridable via gpt4all constructor) should be able to run a basic test: ```python import gpt4all model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf') print(model.generate('def fib(n):')) ``` and see no non-model output when successful	2023-10-11 14:14:36 -07:00
Cebtenzzre	5fe685427a	chat: clearer CPU fallback messages	2023-10-06 11:35:14 -04:00
Cebtenzzre	d87573ea75	remove old llama.cpp submodules	2023-10-05 18:16:19 -04:00
Cebtenzzre	672cb850f9	differentiate between init failure and unsupported models	2023-10-05 18:16:19 -04:00
Cebtenzzre	088afada49	llamamodel: fix static vector in LLamaModel::endTokens	2023-10-05 18:16:19 -04:00
Adam Treat	12f943e966	Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf.	2023-10-05 18:16:19 -04:00
Cebtenzzre	ce7be1db48	backend: use llamamodel.cpp for Falcon	2023-10-05 18:16:19 -04:00
Cebtenzzre	6277eac9cc	backend: use llamamodel.cpp for StarCoder	2023-10-05 18:16:19 -04:00
Cebtenzzre	1d29e4696c	llamamodel: metal supports all quantization types now	2023-10-05 18:16:19 -04:00
Adam Treat	d90d003a1d	Latest rebase on llama.cpp with gguf support.	2023-10-05 18:16:19 -04:00
Adam Treat	aa33419c6e	Fallback to CPU more robustly.	2023-09-14 16:53:11 -04:00
Adam Treat	3076e0bf26	Only show GPU when we're actually using it.	2023-09-14 09:59:19 -04:00
Adam Treat	987546c63b	Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.	2023-08-31 15:29:54 -04:00
Aaron Miller	198b5e4832	add Falcon 7B model Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin	2023-06-27 14:06:39 -03:00
Aaron Miller	db34a2f670	llmodel: skip attempting Metal if model+kvcache > 53% of system ram	2023-06-26 19:46:49 -03:00
Aaron Miller	b19a3e5b2c	add requiredMem method to llmodel impls most of these can just shortcut out of the model loading logic llama is a bit worse to deal with because we submodule it so I have to at least parse the hparams, and then I just use the size on disk as an estimate for the mem size (which seems reasonable since we mmap() the llama files anyway)	2023-06-26 18:27:58 -03:00
Aaron Miller	88616fde7f	llmodel: change tokenToString to not use string_view (#968 ) fixes a definite use-after-free and likely avoids some other potential ones - std::string will convert to a std::string_view automatically but as soon as the std::string in question goes out of scope it is already freed and the string_view is pointing at freed memory - this is mostly fine if its returning a reference to the tokenizer's internal vocab table but it's, imo, too easy to return a reference to a dynamically constructed string with this as replit is doing (and unfortunately needs to do to convert the internal whitespace replacement symbol back to a space)	2023-06-13 07:14:02 -04:00
Adam Treat	b906fb4057	When recalculating context we can't erase the BOS.	2023-06-12 08:43:20 -07:00
Aaron Miller	d3ba1295a7	Metal+LLama take two (#929 ) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>	2023-06-09 16:48:46 -04:00
Adam Treat	b162b5c64e	Revert "llama on Metal (#885 )" This reverts commit `c55f81b860`.	2023-06-09 15:08:46 -04:00
Aaron Miller	c55f81b860	llama on Metal (#885 ) Support latest llama with Metal --------- Co-authored-by: Adam Treat <adam@nomic.ai> Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>	2023-06-09 14:58:12 -04:00
Adam Treat	301d2fdbea	Fix up for newer models on reset context. This fixes the model from totally failing after a reset context.	2023-06-04 19:31:20 -04:00
AT	bbe195ee02	Backend prompt dedup (#822 ) * Deduplicated prompt() function code	2023-06-04 08:59:24 -04:00
Peter Gagarinov	23391d44e0	Only default mlock on macOS where swap seems to be a problem Repeating the change that once was done in https://github.com/nomic-ai/gpt4all/pull/663 but then was overriden by `48275d0dcc` Signed-off-by: Peter Gagarinov <pgagarinov@users.noreply.github.com>	2023-06-03 07:51:18 -04:00
niansa/tuxifan	f3564ac6b9	Fixed tons of warnings and clazy findings (#811 )	2023-06-02 15:46:41 -04:00
Adam Treat	a41bd6ac0a	Trying to shrink the copy+paste code and do more code sharing between backend model impl.	2023-06-02 07:20:59 -04:00
niansa	a3d08cdcd5	Dlopen better implementation management (Version 2)	2023-06-01 07:44:15 -04:00
AT	48275d0dcc	Dlopen backend 5 (#779 ) Major change to the backend that allows for pluggable versions of llama.cpp/ggml. This was squashed merged from dlopen_backend_5 where the history is preserved.	2023-05-31 17:04:01 -04:00
Adam Treat	9bfff8bfcb	Add new reverse prompt for new localdocs context feature.	2023-05-25 11:28:06 -04:00

1 2

56 Commits