Commit Graph

261 Commits (d92252cab15e3f9724157b4d336a536c52bc4c78)

Author SHA1 Message Date
Jared Van Bortel e60b388a2e cmake: fix backwards LLAMA_KOMPUTE default
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel fc7e5f4a09
ci: fix missing Kompute support in python bindings (#1953)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 92c025a7f6
llamamodel: add 12 new architectures for CPU inference (#1914)
Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2,
Plamo, Qwen, Qwen2, Refact, StableLM

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 10e3f7bbf5
Fix VRAM leak when model loading fails (#1901)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel eadc3b8d80 backend: bump llama.cpp for VRAM leak fix when switching models
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 6db5307730 update llama.cpp for unhandled Vulkan OOM exception fix
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 0a40e71652
Maxwell/Pascal GPU support and crash fix (#1895)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel b11c3f679e bump llama.cpp-mainline for C++11 compat
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
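For context, a minimal sketch of using this from the Python bindings, assuming a bindings version that exposes the layer count as a constructor keyword (shown here as `ngl`; the keyword name and model filename are illustrative, not taken from this commit):

```python
from gpt4all import GPT4All

# ngl: how many transformer layers to offload to the GPU; lower it to fit
# a model into less VRAM. Check your installed bindings' signature.
model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf", device="gpu", ngl=32)
print(model.generate("def fib(n):", max_tokens=64))
```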
Jared Van Bortel f549d5a70a backend : quick llama.cpp update to fix fallback to CPU
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 38c61493d2 backend: update to latest commit of llama.cpp Vulkan PR
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 26acdebafa
convert: replace GPTJConfig with AutoConfig (#1866)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel a9c5f53562 update llama.cpp for nomic-ai/llama.cpp#12
Fixes #1477

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel b7c92c5afd
sync llama.cpp with latest Vulkan PR and newer upstream (#1819) 8 months ago
Jared Van Bortel 7e9786fccf chat: set search path early
This fixes the issues with installed versions of v2.6.0.
8 months ago
AT 96cee4f9ac
Explicitly clear the kv cache each time we eval tokens to match n_past. (#1808) 9 months ago
ThiloteE 2d566710e5 Address review 9 months ago
ThiloteE a0f7d7ae0e Fix for "LLModel ERROR: Could not find CPU LLaMA implementation" v2 9 months ago
ThiloteE 38d81c14d0 Fixes https://github.com/nomic-ai/gpt4all/issues/1760 ("LLModel ERROR: Could not find CPU LLaMA implementation").
Inspired by the Microsoft docs for LoadLibraryExA (https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa):
when using LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR, the lpFileName parameter must specify a fully qualified path, and it must use backslashes (\), not forward slashes (/).
9 months ago
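A minimal ctypes sketch of that rule (Windows only; the helper name is illustrative, the flag values are the documented constants from libloaderapi.h):

```python
import ctypes
import os

LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR = 0x00000100
LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000

def load_model_dll(path):
    # LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR requires a fully qualified path,
    # and it must use backslashes, not forward slashes.
    path = os.path.abspath(path).replace('/', '\\')
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.LoadLibraryExW.restype = ctypes.c_void_p
    handle = kernel32.LoadLibraryExW(
        path, None,
        LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR | LOAD_LIBRARY_SEARCH_DEFAULT_DIRS)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    return handle
```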
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749) 9 months ago
Jared Van Bortel 3acbef14b7
fix AVX support by removing direct linking to AVX2 libs (#1750) 9 months ago
Jared Van Bortel 0600f551b3
chatllm: do not attempt to serialize incompatible state (#1742) 9 months ago
Jared Van Bortel 1df3da0a88 update llama.cpp for clang warning fix 9 months ago
Jared Van Bortel dfd8ef0186
backend: use ggml_new_graph for GGML backend v2 (#1719) 10 months ago
Jared Van Bortel 9e28dfac9c
Update to latest llama.cpp (#1706) 10 months ago
Adam Treat cce5fe2045 Fix macos build. 10 months ago
Adam Treat 371e2a5cbc LocalDocs version 2 with text embeddings. 10 months ago
Jared Van Bortel d4ce9f4a7c
llmodel_c: improve quality of error messages (#1625) 11 months ago
cebtenzzre 64101d3af5 update llama.cpp-mainline 11 months ago
Adam Treat ffef60912f Update to llama.cpp 11 months ago
Adam Treat f5f22fdbd0 Update llama.cpp for latest bugfixes. 11 months ago
cebtenzzre 7bcd9e8089 update llama.cpp-mainline 11 months ago
cebtenzzre fd0c501d68
backend: support GGUFv3 (#1582) 11 months ago
Adam Treat 14b410a12a Update to latest version of llama.cpp which fixes issue 1507. 11 months ago
Adam Treat ab96035bec Update to llama.cpp submodule for some vulkan fixes. 11 months ago
cebtenzzre e90263c23f
make scripts executable (#1555) 11 months ago
Aaron Miller f414c28589 llmodel: whitelist library name patterns
This fixes some issues seen on installed Windows builds of 2.5.0.

Only load DLLs that might actually be model implementation DLLs; otherwise we pull all sorts of random junk into the process before it expects to be loaded.

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
11 months ago
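To illustrate the whitelisting idea (the patterns and helper below are hypothetical; the real list lives in llmodel's dynamic-loader code):

```python
import fnmatch
import os

# Hypothetical name patterns for model implementation libraries.
IMPL_PATTERNS = ("*llamamodel-mainline*", "*gptj*", "*bert*")

def candidate_impl_libs(search_dir):
    """Yield only files whose names plausibly match a model impl DLL."""
    for name in os.listdir(search_dir):
        if any(fnmatch.fnmatch(name.lower(), pat) for pat in IMPL_PATTERNS):
            yield os.path.join(search_dir, name)
```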
cebtenzzre 4338e72a51
MPT: use upstream llama.cpp implementation (#1515) 11 months ago
cebtenzzre 0fe2e19691
llamamodel: re-enable error messages by default (#1537) 11 months ago
cebtenzzre 017c3a9649
python: prepare version 2.0.0rc1 (#1529) 11 months ago
cebtenzzre 9a19c740ee
kompute: fix library loading issues with kp_logger (#1517) 11 months ago
Aaron Miller f79557d2aa speedup: just use mat*vec shaders for mat*mat
So far my from-scratch mat*mats are still slower than just running more
invocations of the existing Metal-ported mat*vec shaders. It should be
theoretically possible to make a mat*mat that's faster (for actual
mat*mat cases) than an optimal mat*vec, but it will need to be at
*least* as fast as the mat*vec op and then take special care to be
cache-friendly and save memory bandwidth, since the number of compute
ops is the same.
11 months ago
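The strategy reads as follows in a NumPy sketch (illustrative only; the real work happens in the Kompute/Metal shaders):

```python
import numpy as np

def matmat_via_matvec(A, B):
    """Compute A @ B by invoking a mat*vec "kernel" once per column of B."""
    out = np.empty((A.shape[0], B.shape[1]), dtype=A.dtype)
    for j in range(B.shape[1]):
        out[:, j] = A @ B[:, j]   # one mat*vec invocation per column
    return out

A = np.random.rand(64, 128).astype(np.float32)
B = np.random.rand(128, 32).astype(np.float32)
assert np.allclose(matmat_via_matvec(A, B), A @ B, atol=1e-4)
```

The op count matches a dedicated mat*mat kernel; any speedup from a real mat*mat shader has to come from reusing A across columns (cache friendliness and memory bandwidth), exactly as the commit message argues.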
cebtenzzre 22de3c56bd
convert scripts: fix AutoConfig typo (#1512) 11 months ago
Aaron Miller 2490977f89 q6k, q4_1 mat*mat 11 months ago
Aaron Miller afaa291eab python bindings should be quiet by default
* disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is
  nonempty
* make verbose flag for retrieve_model default false (but also be
  overridable via gpt4all constructor)

should be able to run a basic test:

```python
import gpt4all
model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
print(model.generate('def fib(n):'))
```

and see no non-model output when successful
11 months ago
cebtenzzre 7b611b49f2
llmodel: print an error if the CPU does not support AVX (#1499) 11 months ago
Aaron Miller 043617168e do not process prompts on gpu yet 11 months ago
Aaron Miller 64001a480a mat*mat for q4_0, q8_0 11 months ago
cebtenzzre 7a19047329
llmodel: do not call magic_match unless build variant is correct (#1488) 11 months ago
Cebtenzzre 5fe685427a chat: clearer CPU fallback messages 12 months ago
Adam Treat eec906aa05 Speculative fix for build on mac. 12 months ago
Adam Treat a9acdd25de Push a new version number for llmodel backend now that it is based on gguf. 12 months ago
Cebtenzzre 8bb6a6c201 rebase on newer llama.cpp 12 months ago
Cebtenzzre d87573ea75 remove old llama.cpp submodules 12 months ago
Cebtenzzre cc6db61c93 backend: fix build with Visual Studio generator
Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
is needed because Visual Studio is a multi-configuration generator, so
we do not know what the build type will be until `cmake --build` is
called.

Fixes #1470
12 months ago
Adam Treat f605a5b686 Add q8_0 kernels to kompute shaders and bump to latest llama/gguf. 12 months ago
Cebtenzzre 672cb850f9 differentiate between init failure and unsupported models 12 months ago
Adam Treat 906699e8e9 Bump to latest llama/gguf branch. 12 months ago
Cebtenzzre 088afada49 llamamodel: fix static vector in LLamaModel::endTokens 12 months ago
Adam Treat b4d82ea289 Bump to the latest fixes for vulkan in llama. 12 months ago
Adam Treat 12f943e966 Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf. 12 months ago
Adam Treat 5d346e13d7 Add q6_k kernels for vulkan. 12 months ago
Adam Treat 4eefd386d0 Refactor for subgroups on mat * vec kernel. 12 months ago
Cebtenzzre 3c2aa299d8 gptj: remove unused variables 12 months ago
Cebtenzzre f9deb87d20 convert scripts: add feed-forward length for better compatibility
This GGUF key is used by all llama.cpp models with upstream support.
12 months ago
Cebtenzzre cc7675d432 convert scripts: make gptj script executable 12 months ago
Cebtenzzre 0493e6eb07 convert scripts: use bytes_to_unicode from transformers 12 months ago
Cebtenzzre d5d72f0361 gpt-j: update inference to match latest llama.cpp insights
- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78
12 months ago
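A NumPy sketch of the first two cache changes described above, for a single head with hypothetical shapes (the real implementation is ggml tensors in C; the Q-copy point is not shown):

```python
import numpy as np

n_ctx, d_head = 8, 4
k_cache = np.zeros((n_ctx, d_head), dtype=np.float16)  # F16 K cache
v_cache = np.zeros((d_head, n_ctx), dtype=np.float16)  # V stored transposed

def attend(q, k, v, n_past):
    k_cache[n_past] = k.astype(np.float16)
    v_cache[:, n_past] = v.astype(np.float16)  # write this token's V as a column
    n = n_past + 1
    scores = (k_cache[:n].astype(np.float32) @ q) / np.sqrt(d_head)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    # With V transposed, the weighted sum over positions is a plain mat*vec.
    return v_cache[:, :n].astype(np.float32) @ w
```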
Cebtenzzre 050e7f076e backend: port GPT-J to GGUF 12 months ago
Cebtenzzre 8f3abb37ca fix references to removed model types 12 months ago
Cebtenzzre 4219c0e2e7 convert scripts: make them directly executable 12 months ago
Cebtenzzre ce7be1db48 backend: use llamamodel.cpp for Falcon 12 months ago
Cebtenzzre cca9e6ce81 convert_mpt_hf_to_gguf.py: better tokenizer decoding 12 months ago
Cebtenzzre 25297786db convert scripts: load model as late as possible 12 months ago
Cebtenzzre fd47088f2b conversion scripts: cleanup 12 months ago
Cebtenzzre 6277eac9cc backend: use llamamodel.cpp for StarCoder 12 months ago
Cebtenzzre 17fc9e3e58 backend: port Replit to GGUF 12 months ago
Cebtenzzre 7c67262a13 backend: port MPT to GGUF 12 months ago
Cebtenzzre 42bcb814b3 backend: port BERT to GGUF 12 months ago
Cebtenzzre 1d29e4696c llamamodel: metal supports all quantization types now 12 months ago
Aaron Miller 507753a37c macos build fixes 12 months ago
Adam Treat d90d003a1d Latest rebase on llama.cpp with gguf support. 12 months ago
Adam Treat 99c106e6b5 Fix a bug seen on AMD RADEON cards with vulkan backend. 12 months ago
Jacob Nguyen e86c63750d Update llama.cpp.cmake
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
1 year ago
Adam Treat 84905aa281 Fix for crashes on systems where vulkan is not installed properly. 1 year ago
Adam Treat 045f6e6cdc Link against ggml in bin so we can get the available devices without loading a model. 1 year ago
Adam Treat aa33419c6e Fallback to CPU more robustly. 1 year ago
Adam Treat 9013a089bd Bump to new llama with new bugfix. 1 year ago
Adam Treat 3076e0bf26 Only show GPU when we're actually using it. 1 year ago
Adam Treat cf4eb530ce Sync to a newer version of llama.cpp with bugfix for vulkan. 1 year ago
Adam Treat 4b9a345aee Update the submodule. 1 year ago
Aaron Miller 6f038c136b init at most one vulkan device, submodule update
fixes issues w/ multiple of the same gpu
1 year ago
Adam Treat 8f99dca70f Bring the vulkan backend to the GUI. 1 year ago
Aaron Miller f0735efa7d vulkan python bindings on windows fixes 1 year ago
Adam Treat c953b321b7 Don't link against libvulkan. 1 year ago
Aaron Miller c4d23512e4 remove extra dynamic linker deps when building with vulkan 1 year ago
Adam Treat 85e34598f9 more circleci 1 year ago
Adam Treat f578fa6cdf Fix for windows. 1 year ago
Adam Treat 17d3e4976c Add a comment indicating future work. 1 year ago