Andriy Mulyar
390994ea5e
Update README.md to include inference example
...
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-06-28 16:24:48 -04:00
Andriy Mulyar
a67f8132e1
Update README.md
...
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-06-28 14:29:15 -04:00
Andriy Mulyar
633e2a2137
GPT4All API Scaffolding. Matches OpenAI OpenAPI spec for chats and completions ( #839 )
...
* GPT4All API Scaffolding. Matches OpenAI OpenAPI spec for engines, chats and completions
* Edits for docker building
* FastAPI app builds and pydantic models are accurate
* Added groovy download into dockerfile
* improved dockerfile
* Chat completions endpoint edits
* API unit test sketch
* Working example of groovy inference with the OpenAI API
* Added lines to test
* Set default to mpt
2023-06-28 14:28:52 -04:00
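The scaffolding commit above describes mirroring the OpenAI completions schema with pydantic models and a default model of mpt. A minimal, hypothetical sketch of those request/response shapes (using plain dataclasses here for self-containment; the actual app uses FastAPI + pydantic, and all names are illustrative, not the project's code):

```python
# Hypothetical sketch of OpenAI-style completions request/response shapes.
# Field names follow the public OpenAI completions spec; the handler is a
# stand-in for real inference.
from dataclasses import dataclass
from typing import List

@dataclass
class CompletionRequest:
    prompt: str
    model: str = "mpt"        # the commit sets mpt as the default model
    max_tokens: int = 16
    temperature: float = 1.0

@dataclass
class Choice:
    text: str
    index: int = 0
    finish_reason: str = "stop"

@dataclass
class CompletionResponse:
    model: str
    choices: List[Choice]
    object: str = "text_completion"

def complete(req: CompletionRequest) -> CompletionResponse:
    # Stand-in for actual model inference: echo the prompt back.
    return CompletionResponse(model=req.model,
                              choices=[Choice(text=f"echo: {req.prompt}")])
```

In the real server these dataclasses would be pydantic `BaseModel`s so FastAPI can validate the JSON body and generate the OpenAPI schema automatically.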
Andriy Mulyar
6b8456bf99
Update README.md ( #1086 )
...
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-06-28 12:15:05 -04:00
Adam Treat
e70899a26c
Make the retrieval/parsing of models.json sync on startup. We were jumping through too many hoops to mitigate the async behavior.
2023-06-28 12:32:22 -03:00
Adam Treat
9560336490
Match on the filename too for server mode.
2023-06-28 09:20:05 -04:00
Aaron Miller
28d41d4f6d
falcon: use *model-local* eval & scratch bufs ( #1079 )
...
fixes memory leaks copied from ggml/examples based implementation
2023-06-27 16:09:11 -07:00
Adam Treat
58cd346686
Bump release again and new release notes.
2023-06-27 18:01:23 -04:00
Adam Treat
0f8f364d76
Fix mac again for falcon.
2023-06-27 17:20:40 -04:00
Adam Treat
8aae4e52b3
Fix for falcon on mac.
2023-06-27 17:13:13 -04:00
Adam Treat
9375c71aa7
New release notes for 2.4.9 and bump version.
2023-06-27 17:01:49 -04:00
Adam Treat
71449bbc4b
Fix this correctly?
2023-06-27 16:01:11 -04:00
Adam Treat
07a5405618
Make it clear this is our finetune.
2023-06-27 15:33:38 -04:00
Adam Treat
189ac82277
Fix server mode.
2023-06-27 15:01:16 -04:00
Adam Treat
b56cc61ca2
Don't allow setting an invalid prompt template.
2023-06-27 14:52:44 -04:00
Adam Treat
0780393d00
Don't use local.
2023-06-27 14:13:42 -04:00
Adam Treat
924efd9e25
Add falcon to our models.json
2023-06-27 13:56:16 -04:00
Adam Treat
d3b8234106
Fix spelling.
2023-06-27 14:23:56 -03:00
Adam Treat
42c0a6673a
Don't persist the force metal setting.
2023-06-27 14:23:56 -03:00
Adam Treat
267601d670
Enable the force metal setting.
2023-06-27 14:23:56 -03:00
Zach Nussbaum
2565f6a94a
feat: add conversion script
2023-06-27 14:06:39 -03:00
Aaron Miller
e22dd164d8
add falcon to chatllm::serialize
2023-06-27 14:06:39 -03:00
Aaron Miller
198b5e4832
add Falcon 7B model
...
Tested with https://huggingface.co/TheBloke/falcon-7b-instruct-GGML/blob/main/falcon7b-instruct.ggmlv3.q4_0.bin
2023-06-27 14:06:39 -03:00
AMOGUS
b8464073b8
Update gpt4all_chat.md ( #1050 )
...
* Update gpt4all_chat.md
Cleaned up and made the sideloading part more readable, also moved Replit architecture to supported ones. (+ renamed all "ggML" to "GGML" because who calls it "ggML"??)
Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>
* Removed the prefixing part
Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>
* Bump version
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
---------
Signed-off-by: AMOGUS <137312610+Amogus8P@users.noreply.github.com>
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
Co-authored-by: Andriy Mulyar <andriy.mulyar@gmail.com>
2023-06-27 10:49:45 -04:00
Adam Treat
985d3bbfa4
Add Orca models to list.
2023-06-27 09:38:43 -04:00
Adam Treat
8558fb4297
Fix models.json for spanning multiple lines with string.
2023-06-26 21:35:56 -04:00
Adam Treat
c24ad02a6a
Wait just a bit to set the model name so that we can display the proper name instead of filename.
2023-06-26 21:00:09 -04:00
Aaron Miller
db34a2f670
llmodel: skip attempting Metal if model+kvcache > 53% of system ram
2023-06-26 19:46:49 -03:00
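The commit above adds a heuristic: don't attempt the Metal backend when the model plus KV cache would exceed 53% of system RAM. A hedged sketch of that check (the 53% threshold is from the commit subject; the function and parameter names are hypothetical):

```python
# Illustrative version of the "skip Metal" heuristic from the commit:
# only try the Metal backend if model + kvcache fits in 53% of system RAM.
def should_try_metal(model_bytes: int, kv_cache_bytes: int,
                     system_ram_bytes: int) -> bool:
    """Return True if the Metal backend is worth attempting for this load."""
    required = model_bytes + kv_cache_bytes
    return required <= 0.53 * system_ram_bytes
```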
Adam Treat
57fa8644d6
Make spelling check happy.
2023-06-26 17:56:56 -04:00
Adam Treat
d0a3e82ffc
Restore feature I accidentally erased in modellist update.
2023-06-26 17:50:45 -04:00
Aaron Miller
b19a3e5b2c
add requiredMem method to llmodel impls
...
Most of these can just shortcut out of the model loading logic. llama is a bit worse to deal with because we submodule it, so I have to at least parse the hparams; then I just use the size on disk as an estimate for the mem size (which seems reasonable since we mmap() the llama files anyway).
2023-06-26 18:27:58 -03:00
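The commit body above notes that for the mmap()ed llama files, the size on disk serves as the memory estimate. A minimal sketch of that idea (the function name mirrors the commit's `requiredMem`; everything else is illustrative):

```python
# Hedged sketch: estimate memory required to load a model by its file size,
# which is reasonable for mmap()ed model files as the commit explains.
import os

def required_mem(model_path: str) -> int:
    """Estimate bytes of memory needed to load the model at model_path."""
    return os.path.getsize(model_path)
```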
Adam Treat
dead954134
Fix save chats setting.
2023-06-26 16:43:37 -04:00
Adam Treat
26c9193227
Sigh. Windows.
2023-06-26 16:34:35 -04:00
Adam Treat
5deec2afe1
Change this back now that it is ready.
2023-06-26 16:21:09 -04:00
Adam Treat
676248fe8f
Update the language.
2023-06-26 14:14:49 -04:00
Adam Treat
ef92492d8c
Add better warnings and links.
2023-06-26 14:14:49 -04:00
Adam Treat
71c972f8fa
Provide a more stark warning for localdocs and add more size to dialogs.
2023-06-26 14:14:49 -04:00
Adam Treat
1b5aa4617f
Enable the add button always, but show an error in placeholder text.
2023-06-26 14:14:49 -04:00
Adam Treat
a0f80453e5
Use sysinfo in backend.
2023-06-26 14:14:49 -04:00
Adam Treat
5e520bb775
Fix so that models are searched in subdirectories.
2023-06-26 14:14:49 -04:00
Adam Treat
64e98b8ea9
Fix bug with model loading on initial load.
2023-06-26 14:14:49 -04:00
Adam Treat
3ca9e8692c
Don't try and load incomplete files.
2023-06-26 14:14:49 -04:00
Adam Treat
27f25d5878
Get rid of recursive mutex.
2023-06-26 14:14:49 -04:00
Adam Treat
7f01b153b3
Modellist temp
2023-06-26 14:14:46 -04:00
Adam Treat
c1794597a7
Revert "Enable Wayland in build"
...
This reverts commit d686a583f9.
2023-06-26 14:10:27 -04:00
Akarshan Biswas
d686a583f9
Enable Wayland in build
...
The patch includes support for running natively on a Linux Wayland display server/compositor, which is the successor to the old Xorg.
The CMakeLists was missing WaylandClient, so it was added back.
Will fix #1047 .
Signed-off-by: Akarshan Biswas <akarshan.biswas@gmail.com>
2023-06-26 14:58:23 -03:00
niansa/tuxifan
47323f8591
Update replit.cpp
...
replit_tokenizer_detokenize returns std::string now
Signed-off-by: niansa/tuxifan <tuxifan@posteo.de>
2023-06-26 14:49:58 -03:00
niansa
0855c0df1d
Fixed Replit implementation compile warnings
2023-06-26 14:49:58 -03:00
Aaron Miller
1290b32451
update to latest mainline llama.cpp
...
add max_size param to ggml_metal_add_buffer - introduced in https://github.com/ggerganov/llama.cpp/pull/1826
2023-06-26 14:40:52 -03:00
AMOGUS
3417a37c54
Change "web server" to "API server" for less confusion ( #1039 )
...
* Change "Web server" to "API server"
* Changed "API server" to "OpenAPI server"
* Reversed back to "API server" and updated tooltip
2023-06-23 16:28:52 -04:00