gpt4all

mirror of https://github.com/nomic-ai/gpt4all synced 2024-11-06 09:20:33 +00:00

Author	SHA1	Message	Date
Adam Treat	045f6e6cdc	Link against ggml in bin so we can get the available devices without loading a model.	2023-09-15 14:45:25 -04:00
Adam Treat	aa33419c6e	Fallback to CPU more robustly.	2023-09-14 16:53:11 -04:00
Adam Treat	9013a089bd	Bump to new llama with new bugfix.	2023-09-14 10:02:11 -04:00
Adam Treat	3076e0bf26	Only show GPU when we're actually using it.	2023-09-14 09:59:19 -04:00
Adam Treat	cf4eb530ce	Sync to a newer version of llama.cpp with bugfix for vulkan.	2023-09-13 21:01:44 -04:00
Adam Treat	4b9a345aee	Update the submodule.	2023-09-13 17:05:46 -04:00
Aaron Miller	6f038c136b	init at most one vulkan device, submodule update fixes issues w/ multiple of the same gpu	2023-09-13 12:49:53 -07:00
Adam Treat	8f99dca70f	Bring the vulkan backend to the GUI.	2023-09-13 11:26:10 -04:00
Aaron Miller	f0735efa7d	vulkan python bindings on windows fixes	2023-09-12 14:16:02 -07:00
Adam Treat	c953b321b7	Don't link against libvulkan.	2023-09-12 14:26:56 -04:00
Aaron Miller	c4d23512e4	remove extra dynamic linker deps when building with vulkan	2023-09-11 08:44:39 -07:00
Adam Treat	85e34598f9	more circleci	2023-08-31 15:29:54 -04:00
Adam Treat	f578fa6cdf	Fix for windows.	2023-08-31 15:29:54 -04:00
Adam Treat	17d3e4976c	Add a comment indicating future work.	2023-08-31 15:29:54 -04:00
Adam Treat	987546c63b	Nomic vulkan backend licensed under the Software for Open Models License (SOM), version 1.0.	2023-08-31 15:29:54 -04:00
Adam Treat	d55cbbee32	Update to newer llama.cpp and disable older forks.	2023-08-31 15:29:54 -04:00
Aaron Miller	0bc2274869	bump llama.cpp version + needed fixes for that	2023-08-31 15:29:54 -04:00
aaron miller	33c22be2aa	starcoder: use ggml_graph_plan	2023-08-31 15:29:54 -04:00
Cosmic Snow	108d950874	Fix Windows unable to load models on older Windows builds - Replace high-level IsProcessorFeaturePresent - Reintroduce low-level compiler intrinsics implementation	2023-08-09 09:27:43 +02:00
Adam Treat	6d03b3e500	Add starcoder support.	2023-07-27 09:15:16 -04:00
cosmic-snow	2d02c65177	Handle edge cases when generating embeddings (#1215) * Handle edge cases when generating embeddings * Improve Python handling & add llmodel_c.h note - In the Python bindings fail fast with a ValueError when text is empty - Advice other bindings authors to do likewise in llmodel_c.h	2023-07-17 13:21:03 -07:00
Aaron Miller	1c4a244291	bump mem allocation a bit	2023-07-14 09:48:57 -04:00
Adam Treat	ee4186d579	Fixup bert python bindings.	2023-07-14 09:48:57 -04:00
cosmic-snow	6200900677	Fix Windows MSVC arch detection (#1194) - in llmodel.cpp to fix AVX-only handling Signed-off-by: cosmic-snow <134004613+cosmic-snow@users.noreply.github.com>	2023-07-13 14:44:17 -04:00
Adam Treat	4963db8f43	Bump the version numbers for both python and c backend.	2023-07-13 14:21:46 -04:00
Adam Treat	0efdbfcffe	Bert	2023-07-13 14:21:46 -04:00
Adam Treat	315a1f2aa2	Move it back as internal class.	2023-07-13 14:21:46 -04:00
Adam Treat	ae8eb297ac	Add sbert backend.	2023-07-13 14:21:46 -04:00
Adam Treat	1f749d7633	Clean up backend code a bit and hide impl. details.	2023-07-13 14:21:46 -04:00
Adam Treat	33557b1f39	Move the implementation out of llmodel class.	2023-07-13 14:21:46 -04:00
Aaron Miller	432b7ebbd7	include windows.h just to be safe	2023-07-12 12:46:46 -04:00
Aaron Miller	95b8fb312e	windows/msvc: use high level processor feature detection API see https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-isprocessorfeaturepresent	2023-07-12 12:46:46 -04:00
Aaron Miller	f0faa23ad5	cmakelists: always export build commands (#1179) friendly for using editors with clangd integration that don't also manage the build themselves	2023-07-12 10:49:24 -04:00
Aaron Miller	4a24b586df	llama.cpp: metal buffer freeing	2023-06-30 21:07:21 -03:00
Aaron Miller	137bc2c367	replit: free metal context	2023-06-30 21:07:21 -03:00
Aaron Miller	57dc0c8953	adjust eval buf sizes to pass long input test	2023-06-30 21:07:21 -03:00
Aaron Miller	7a5f6e4726	limit prompt batch size to 128	2023-06-30 21:07:21 -03:00
Aaron Miller	883775bc5f	move 230511 submodule to nomic fork, fix alibi assert	2023-06-30 21:07:21 -03:00
Andriy Mulyar	46a0762bd5	Python Bindings: Improved unit tests, documentation and unification of API (#1090) * Makefiles, black, isort * Black and isort * unit tests and generation method * chat context provider * context does not reset * Current state * Fixup * Python bindings with unit tests * GPT4All Python Bindings: chat contexts, tests * New python bindings and backend fixes * Black and Isort * Documentation error * preserved n_predict for backwords compat with langchain --------- Co-authored-by: Adam Treat <treat.adam@gmail.com>	2023-06-30 16:02:02 -04:00
Aaron Miller	40a3faeb05	Use ggml scratch bufs for mpt and gptj models (#1104) * backend/gptj: use scratch buffers reduces total memory required and makes eval buf not grow with n_past * backend/mpt: use scratch bufs * fix format-related compile warnings	2023-06-30 10:53:45 -07:00
Aaron Miller	8d19ef3909	backend: factor out common elements in model code (#1089) * backend: factor out common structs in model code prepping to hack on these by hopefully making there be fewer places to fix the same bug rename * use common buffer wrapper instead of manual malloc * fix replit compile warnings	2023-06-28 17:35:07 -07:00
Aaron Miller	28d41d4f6d	falcon: use model-local eval & scratch bufs (#1079) fixes memory leaks copied from ggml/examples based implementation	2023-06-27 16:09:11 -07:00
Zach Nussbaum	2565f6a94a	feat: add conversion script	2023-06-27 14:06:39 -03:00
Aaron Miller	198b5e4832	add Falcon 7B model Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin	2023-06-27 14:06:39 -03:00
Aaron Miller	db34a2f670	llmodel: skip attempting Metal if model+kvcache > 53% of system ram	2023-06-26 19:46:49 -03:00
Aaron Miller	b19a3e5b2c	add requiredMem method to llmodel impls most of these can just shortcut out of the model loading logic llama is a bit worse to deal with because we submodule it so I have to at least parse the hparams, and then I just use the size on disk as an estimate for the mem size (which seems reasonable since we mmap() the llama files anyway)	2023-06-26 18:27:58 -03:00
Adam Treat	a0f80453e5	Use sysinfo in backend.	2023-06-26 14:14:49 -04:00
niansa/tuxifan	47323f8591	Update replit.cpp replit_tokenizer_detokenize returnins std::string now Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>	2023-06-26 14:49:58 -03:00
niansa	0855c0df1d	Fixed Replit implementation compile warnings	2023-06-26 14:49:58 -03:00
Aaron Miller	1290b32451	update to latest mainline llama.cpp add max_size param to ggml_metal_add_buffer - introduced in https://github.com/ggerganov/llama.cpp/pull/1826	2023-06-26 14:40:52 -03:00

125 Commits