Commit Graph

1194 Commits

Author SHA1 Message Date
Adam Treat
7f9f91ad94 Revert "New tokenizer implementation for MPT and GPT-J"
This reverts commit bbcee1ced5.
2023-05-30 12:59:00 -04:00
Adam Treat
cdc7d6ccc4 Revert "buf_ref.into() can be const now"
This reverts commit d59c77ac55.
2023-05-30 12:58:53 -04:00
Adam Treat
b5edaa2656 Revert "add tokenizer readme w/ instructions for convert script"
This reverts commit 5063c2c1b2.
2023-05-30 12:58:18 -04:00
aaron miller
5063c2c1b2 add tokenizer readme w/ instructions for convert script 2023-05-30 12:05:57 -04:00
Aaron Miller
d59c77ac55 buf_ref.into() can be const now 2023-05-30 12:05:57 -04:00
Aaron Miller
bbcee1ced5 New tokenizer implementation for MPT and GPT-J
Improves output quality by making these tokenizers more closely
match the behavior of the huggingface `tokenizers` based BPE
tokenizers these models were trained with.

Featuring:
 * Fixed unicode handling (via ICU)
 * Fixed BPE token merge handling
 * Complete added vocabulary handling
2023-05-30 12:05:57 -04:00
Andriy Mulyar
6ed9c1a8d8
Improved localdocs documentation (#762)
* Improved localdocs documentation

* Improved localdocs documentation

* Improved localdocs documentation

* Improved localdocs documentation
2023-05-30 11:26:34 -04:00
Andriy Mulyar
02290fd881
LocalDocs documentation initial (#761)
* LocalDocs documentation initial
2023-05-30 08:35:26 -04:00
mvenditto
9eb81cb549
C# Bindings - Prompt formatting (#712)
* Added support for custom prompt formatting

* more docs added

* bump version
2023-05-28 19:57:00 -04:00
Chase McDougall
44c23cd2e8
fix(training instructions): model repo name (#728)
Signed-off-by: Chase McDougall <chasemcdougall@hotmail.com>
2023-05-28 19:56:24 -04:00
Nandakumar
d101ca06d4
Update README.md (#738)
* Update README.md

fix golang gpt4all import path

Signed-off-by: Nandakumar <nandagunasekaran@gmail.com>

* Update README.md

Signed-off-by: Nandakumar <nandagunasekaran@gmail.com>

---------

Signed-off-by: Nandakumar <nandagunasekaran@gmail.com>
2023-05-28 19:51:11 -04:00
Joseph Mearman
020f64b9a4
tiny typo (#739) 2023-05-28 19:50:45 -04:00
Richard Guo
73db20ba85 hotfix default verbose option 2023-05-26 12:49:32 -04:00
Konstantin Gukov
a6f3e94458 one function to append .bin suffix 2023-05-26 09:24:03 -04:00
Konstantin Gukov
659244f0a2 Correct indentation of the multiline error message 2023-05-26 09:24:03 -04:00
Konstantin Gukov
5e61008424 Add optional verbosity 2023-05-26 09:24:03 -04:00
Konstantin Gukov
e05ee9466a Correct return type 2023-05-26 09:24:03 -04:00
Konstantin Gukov
100c809f1e Do not ignore explicitly passed 4 threads 2023-05-26 09:24:03 -04:00
Konstantin Gukov
dcbdd369ad Redundant else 2023-05-26 09:24:03 -04:00
Konstantin Gukov
ace34afef2 1. Cleanup the interrupted download
2. with-syntax
2023-05-26 09:24:03 -04:00
Konstantin Gukov
8053dc014b less magic number 2023-05-26 09:24:03 -04:00
Konstantin Gukov
e98cfd97b3 convert to f-strings 2023-05-26 09:24:03 -04:00
Konstantin Gukov
2b6fb7b95e reduce nesting, better error reporting 2023-05-26 09:24:03 -04:00
Konstantin Gukov
a067f38544 Concise model matching 2023-05-26 09:24:03 -04:00
Konstantin Gukov
c1f3dd310c Log where the model was found 2023-05-26 09:24:03 -04:00
Konstantin Gukov
f96300534b Nicer handling of missing model directory.
Correct exception message.
2023-05-26 09:24:03 -04:00
Konstantin Gukov
59d7db9aad More precise condition 2023-05-26 09:24:03 -04:00
Konstantin Gukov
adc599b0a6 rm redundant json 2023-05-26 09:24:03 -04:00
Adam Treat
810a3b12cc This time remember to bump the version right after a release. 2023-05-25 18:26:33 -04:00
Adam Treat
d1ff7132c5 Bump the version number. 2023-05-25 17:08:50 -04:00
Adam Treat
afe3870b7a Libraries named differently on msvc. 2023-05-25 16:27:09 -04:00
Adam Treat
474c5387f9 Get the backend as well as the client building/working with msvc. 2023-05-25 15:22:45 -04:00
redthing1
63f57635d8 make sample print usage and cleaner 2023-05-25 11:34:21 -04:00
redthing1
dec8546abe create test project and basic model loading tests 2023-05-25 11:34:07 -04:00
redthing1
0cc86d19be ignore rider and vscode dirs 2023-05-25 11:34:07 -04:00
Adam Treat
265488e54a Add a newline 2023-05-25 11:28:06 -04:00
Adam Treat
98201540a2 Various fixes to remove unnecessary warnings. 2023-05-25 11:28:06 -04:00
Adam Treat
0403a122ca Don't use the full path in reference text. 2023-05-25 11:28:06 -04:00
Adam Treat
9b0629db8b Add context link to references. 2023-05-25 11:28:06 -04:00
Adam Treat
db9eecdce4 Store the references separately so they are not sent to datalake. 2023-05-25 11:28:06 -04:00
Adam Treat
b5380c9b7f Adds the collections to serialize and implement references for localdocs. 2023-05-25 11:28:06 -04:00
Adam Treat
d81302950e Complete the settings for localdocs. 2023-05-25 11:28:06 -04:00
Adam Treat
01b8c7617f Add more of the UI for selecting collections for chats. 2023-05-25 11:28:06 -04:00
Adam Treat
2827c5876c Clean up the settings dialog for localdocs a bit. 2023-05-25 11:28:06 -04:00
Adam Treat
d555ed3b07 Begin implementing the localdocs ui in earnest. 2023-05-25 11:28:06 -04:00
Adam Treat
120fbbf67d Start fleshing out the localdocs ui. 2023-05-25 11:28:06 -04:00
Adam Treat
af33be7b3e Add a localdocs tab. 2023-05-25 11:28:06 -04:00
Adam Treat
d9eddbec45 Add a collection list to support a UI. 2023-05-25 11:28:06 -04:00
Adam Treat
68ba9c564b Specify a large number of suffixes we will search for now. 2023-05-25 11:28:06 -04:00
Adam Treat
c800291e7f Add prompt processing and localdocs to the busy indicator in UI. 2023-05-25 11:28:06 -04:00