gpt4all

mirror of https://github.com/nomic-ai/gpt4all synced 2024-11-10 01:10:35 +00:00

Author	SHA1	Message	Date
Andriy Mulyar	3d10110314	Moved model check into cpu only paths	2023-07-24 11:34:50 -04:00
Zach Nussbaum	8aba2c9009	GPU Inference Server (#1112) * feat: local inference server * fix: source to use bash + vars * chore: isort and black * fix: make file + inference mode * chore: logging * refactor: remove old links * fix: add new env vars * feat: hf inference server * refactor: remove old links * test: batch and single response * chore: black + isort * separate gpu and cpu dockerfiles * moved gpu to separate dockerfile * Fixed test endpoints * Edits to API. server won't start due to failed instantiation error * Method signature * fix: gpu_infer * tests: fix tests --------- Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>	2023-07-21 15:13:29 -04:00
Andriy Mulyar	58f0fcab57	Added health endpoint Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>	2023-07-20 21:23:29 -04:00
385olt	b4dbbd1485	Python bindings: Custom callbacks, chat session improvement, refactoring (#1145) * Added the following features: 1) Now prompt_model uses the positional argument callback to return the response tokens. 2) Due to the callback argument of prompt_model, prompt_model_streaming only manages the queue and threading now, which reduces duplication of the code. 3) Added optional verbose argument to prompt_model which prints out the prompt that is passed to the model. 4) Chat sessions can now have a header, i.e. an instruction before the transcript of the conversation. The header is set at the creation of the chat session context. 5) generate function now accepts an optional callback. 6) When streaming and using chat session, the user doesn't need to save assistant's messages by himself. This is done automatically. * added _empty_response_callback so I don't have to check if callback is None * added docs * now if the callback stop generation, the last token is ignored * fixed type hints, reimplemented chat session header as a system prompt, minor refactoring, docs: removed section about manual update of chat session for streaming * forgot to add some type hints! * keep the config of the model in GPT4All class which is taken from models.json if the download is allowed * During chat sessions, the model-specific systemPrompt and promptTemplate are applied. * implemented the changes * Fixed typing. Now the user can set a prompt template that will be applied even outside of a chat session. The template can also have multiple placeholders that can be filled by passing a dictionary to the generate function * reversed some changes concerning the prompt templates and their functionality * fixed some type hints, changed list[float] to List[Float] * fixed type hints, changed List[Float] to List[float] * fix typo in the comment: Pepare => Prepare --------- Signed-off-by: 385olt <385olt@gmail.com>	2023-07-19 18:36:49 -04:00
AMOGUS	5f0aaf8bdb	python binding's TopP also needs some love Changed the Python binding's TopP from 0.1 to 0.4 Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>	2023-07-19 10:36:23 -04:00
AMOGUS	4974ae917c	Update default TopP to 0.4 TopP 0.1 was found to be somewhat too aggressive, so a more moderate default of 0.4 would be better suited for general use. Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>	2023-07-19 10:36:23 -04:00
cosmic-snow	63849d9afc	Add AVX/AVX2 requirement to main README.md Signed-off-by: cosmic-snow <134004613+cosmic-snow@users.noreply.github.com>	2023-07-19 13:05:42 +02:00
cosmic-snow	2d02c65177	Handle edge cases when generating embeddings (#1215) * Handle edge cases when generating embeddings * Improve Python handling & add llmodel_c.h note - In the Python bindings fail fast with a ValueError when text is empty - Advice other bindings authors to do likewise in llmodel_c.h	2023-07-17 13:21:03 -07:00
Felix Zaslavskiy	1e74171a7b	Java binding - Improve error check before loading Model file (#1206) * Javav binding - Add check for Model file be Readable. * add todo for java binding. --------- Co-authored-by: Feliks Zaslavskiy <feliks.zaslavskiy@optum.com> Co-authored-by: felix <felix@zaslavskiy.net>	2023-07-15 18:07:42 -04:00
Andriy Mulyar	cfd70b69fc	Update gpt4all_python_embedding.md Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>	2023-07-14 14:54:56 -04:00
Andriy Mulyar	306105e62f	Update gpt4all_python_embedding.md Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>	2023-07-14 14:54:36 -04:00
Andriy Mulyar	89e277bb3c	Update gpt4all_python_embedding.md Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>	2023-07-14 14:30:14 -04:00
Adam Treat	f543affa9a	Add better docs and threading support to bert.	2023-07-14 14:14:22 -04:00
Lakshay Kansal	6c8669cad3	highlighting rules for html and php and latex	2023-07-14 11:36:01 -04:00
Adam Treat	0c0a4f2c22	Add the docs.	2023-07-14 10:48:18 -04:00
Adam Treat	6656f0f41e	Fix the test to work and not do timings.	2023-07-14 09:48:57 -04:00
Adam Treat	bb2b82e1b9	Add docs and bump version since we changed python api again.	2023-07-14 09:48:57 -04:00
Aaron Miller	c77ab849c0	LLModel objects should hold a reference to the library prevents llmodel lib from being gc'd before live model objects	2023-07-14 09:48:57 -04:00
Aaron Miller	1c4a244291	bump mem allocation a bit	2023-07-14 09:48:57 -04:00
Aaron Miller	936dcd2bfc	use default n_threads	2023-07-14 09:48:57 -04:00
Aaron Miller	15f1fe5445	rename embedder	2023-07-14 09:48:57 -04:00
Adam Treat	ee4186d579	Fixup bert python bindings.	2023-07-14 09:48:57 -04:00
cosmic-snow	6200900677	Fix Windows MSVC arch detection (#1194) - in llmodel.cpp to fix AVX-only handling Signed-off-by: cosmic-snow <134004613+cosmic-snow@users.noreply.github.com>	2023-07-13 14:44:17 -04:00
Adam Treat	4963db8f43	Bump the version numbers for both python and c backend.	2023-07-13 14:21:46 -04:00
Adam Treat	0efdbfcffe	Bert	2023-07-13 14:21:46 -04:00
Adam Treat	315a1f2aa2	Move it back as internal class.	2023-07-13 14:21:46 -04:00
Adam Treat	ae8eb297ac	Add sbert backend.	2023-07-13 14:21:46 -04:00
Adam Treat	1f749d7633	Clean up backend code a bit and hide impl. details.	2023-07-13 14:21:46 -04:00
Adam Treat	33557b1f39	Move the implementation out of llmodel class.	2023-07-13 14:21:46 -04:00
Adam Treat	64b409e0b8	keep trying	2023-07-13 13:57:22 -04:00
Adam Treat	e59946f05d	try again to unbreak circleci	2023-07-13 13:55:22 -04:00
Adam Treat	b72b409d40	try again to unbreak circlci	2023-07-13 13:52:55 -04:00
Adam Treat	59cae1132c	Try and unbreak circleci.	2023-07-13 13:45:47 -04:00
Adam Treat	a0dae86a95	Add bert to models.json	2023-07-13 13:37:12 -04:00
AT	18ca8901f0	Update README.md Signed-off-by: AT <manyoso@users.noreply.github.com>	2023-07-12 16:30:56 -04:00
cosmic-snow	00a945eaee	Update gpt4all_faq.md - Add information about AVX/AVX2. - Update supported architectures. Signed-off-by: cosmic-snow <134004613+cosmic-snow@users.noreply.github.com>	2023-07-12 15:19:26 -04:00
Zach Nussbaum	6c4f449b7a	fix: update train scripts and configs for other models (#1164) * feat: falcon config * feat: mpt config * chore: gitignore * refactor: step calculation * fix: attention mask + shuffle on epoch end * fix: return tensors * fix: wait for everyone * chore: config * chore: ds config * fix: remove ccols * fix: logging and saving * chore: add einops	2023-07-12 15:18:24 -04:00
Adam Treat	e8b19b8e82	Bump version to 2.4.14 and provide release notes.	2023-07-12 14:58:45 -04:00
Adam Treat	8eb0844277	Check if the trimmed version is empty.	2023-07-12 14:31:43 -04:00
Adam Treat	be395c12cc	Make all system prompts empty by default if model does not include in training data.	2023-07-12 14:31:43 -04:00
Aaron Miller	6a8fa27c8d	Correctly find models in subdirs of model dir QDirIterator doesn't seem particular subdir aware, its path() returns the iterated dir. This was the simplest way I found to get this right.	2023-07-12 14:18:40 -04:00
Adam Treat	8893db5896	Add wizard model and rename orca to be more specific.	2023-07-12 14:12:46 -04:00
Adam Treat	60627bd41f	Prefer 7b models in order of default model load.	2023-07-12 12:50:18 -04:00
Aaron Miller	5df4f1bf8c	codespell	2023-07-12 12:49:06 -04:00
Aaron Miller	10ca2c4475	center the spinner	2023-07-12 12:49:06 -04:00
Adam Treat	e9897518d1	Show busy if models.json download taking longer than expected.	2023-07-12 12:49:06 -04:00
Aaron Miller	432b7ebbd7	include windows.h just to be safe	2023-07-12 12:46:46 -04:00
Aaron Miller	95b8fb312e	windows/msvc: use high level processor feature detection API see https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-isprocessorfeaturepresent	2023-07-12 12:46:46 -04:00
Aaron Miller	ad0e7fd01f	chatgpt: ensure no extra newline in header	2023-07-12 10:53:25 -04:00
Aaron Miller	f0faa23ad5	cmakelists: always export build commands (#1179) friendly for using editors with clangd integration that don't also manage the build themselves	2023-07-12 10:49:24 -04:00

1501 Commits