Commit Graph

237 Commits (6da62a62f062704cc96f32d75d411c4428f23a4a)

Author SHA1 Message Date
Jared Van Bortel 6da62a62f0 python: this was supposed to be an f-string
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 059afb8ee8
csharp: update README to reflect new NuGet package
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
7 months ago
Jared Van Bortel 5dd7378db4
csharp: fix NuGet package build (#1951)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
Signed-off-by: Konstantin Semenenko <mail@ksemenenko.com>
Co-authored-by: Konstantin Semenenko <mail@ksemenenko.com>
7 months ago
Jared Van Bortel ec13ba2818
docs: update list of supported localdocs formats (#1944)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel bf493bb048
Mixtral crash fix and python bindings v2.2.0 (#1931)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 061d1969f8
expose n_gpu_layers parameter of llama.cpp (#1890)
Also dynamically limit the GPU layers and context length fields to the maximum supported by the model.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel b881598166
py: improve README (#1860)
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
8 months ago
Jared Van Bortel 03a9f0bedf
csharp: update C# bindings to work with GGUF (#1651) 8 months ago
Jared Van Bortel f8564398fc minor change to trigger CircleCI 8 months ago
Jared Van Bortel eef604fd64 python: release bindings version 2.1.0
The backend has a breaking change for Falcon and MPT models, so we need
to make a new release.
8 months ago
Daniel Salvatierra c72c73a94f
app.py: add --device option for GPU support (#1769)
Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>
Co-authored-by: Jared Van Bortel <jared@nomic.ai>
9 months ago
Jared Van Bortel d1c56b8b28
Implement configurable context length (#1749) 9 months ago
Jacob Nguyen 7aa0f779de
Update mkdocs.yml (#1759)
update doc routing
9 months ago
Jacob Nguyen a1f27072c2
fix/macm1ts (#1746)
* make runtime library backend universal searchable

* corepack enable

* fix

* pass tests

* simpler

* add more jsdoc

* fix testS

* fix up circle ci

* bump version

* remove false positive warning

* add disclaimer

* update readme

* revert

* update ts docs

---------

Co-authored-by: Matthew Nguyen <matthewpnguyen@Matthews-MacBook-Pro-7.local>
9 months ago
Jared Van Bortel 778264fbab python: don't use importlib as_file for a directory
The only reason to use as_file is to support copying a file from a
frozen package. We don't currently support this anyway, and as_file
isn't supported until Python 3.9, so get rid of it.

Fixes #1605
9 months ago
aj-gameon 7facb8207b
docs: golang --recurse-submodules (#1720)
Co-authored-by: aj-gameon <aj@gameontechnology.com>
9 months ago
AT 84749a4ced Update gpt4all_chat.md
Signed-off-by: AT <manyoso@users.noreply.github.com>
10 months ago
AT f1c58d0e2c Update gpt4all_chat.md
Signed-off-by: AT <manyoso@users.noreply.github.com>
10 months ago
Jared Van Bortel d4ce9f4a7c
llmodel_c: improve quality of error messages (#1625) 11 months ago
aj-gameon 8fabf0be4a
Updated readme for correct install instructions (#1607)
Co-authored-by: aj-gameon <aj@gameontechnology.com>
11 months ago
Jacob Nguyen 45d76d6234
ts/tooling (#1602) 11 months ago
Jacob Nguyen da95bcfb4b
vulkan support for typescript bindings, gguf support (#1390)
* adding some native methods to cpp wrapper

* gpu seems to work

* typings and add availibleGpus method

* fix spelling

* fix syntax

* more

* normalize methods to conform to py

* remove extra dynamic linker deps when building with vulkan

* bump python version (library linking fix)

* Don't link against libvulkan.

* vulkan python bindings on windows fixes

* Bring the vulkan backend to the GUI.

* When device is Auto (the default) then we will only consider discrete GPU's otherwise fallback to CPU.

* Show the device we're currently using.

* Fix up the name and formatting.

* init at most one vulkan device, submodule update

fixes issues w/ multiple of the same gpu

* Update the submodule.

* Add version 2.4.15 and bump the version number.

* Fix a bug where we're not properly falling back to CPU.

* Sync to a newer version of llama.cpp with bugfix for vulkan.

* Report the actual device we're using.

* Only show GPU when we're actually using it.

* Bump to new llama with new bugfix.

* Release notes for v2.4.16 and bump the version.

* Fallback to CPU more robustly.

* Release notes for v2.4.17 and bump the version.

* Bump the Python version to python-v1.0.12 to restrict the quants that vulkan recognizes.

* Link against ggml in bin so we can get the available devices without loading a model.

* Send actual and requested device info for those who have opt-in.

* Actually bump the version.

* Release notes for v2.4.18 and bump the version.

* Fix for crashes on systems where vulkan is not installed properly.

* Release notes for v2.4.19 and bump the version.

* fix typings and vulkan build works on win

* Add flatpak manifest

* Remove unnecessary stuffs from manifest

* Update to 2.4.19

* appdata: update software description

* Latest rebase on llama.cpp with gguf support.

* macos build fixes

* llamamodel: metal supports all quantization types now

* gpt4all.py: GGUF

* pyllmodel: print specific error message

* backend: port BERT to GGUF

* backend: port MPT to GGUF

* backend: port Replit to GGUF

* backend: use gguf branch of llama.cpp-mainline

* backend: use llamamodel.cpp for StarCoder

* conversion scripts: cleanup

* convert scripts: load model as late as possible

* convert_mpt_hf_to_gguf.py: better tokenizer decoding

* backend: use llamamodel.cpp for Falcon

* convert scripts: make them directly executable

* fix references to removed model types

* modellist: fix the system prompt

* backend: port GPT-J to GGUF

* gpt-j: update inference to match latest llama.cpp insights

- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78

* chatllm: grammar fix

* convert scripts: use bytes_to_unicode from transformers

* convert scripts: make gptj script executable

* convert scripts: add feed-forward length for better compatiblilty

This GGUF key is used by all llama.cpp models with upstream support.

* gptj: remove unused variables

* Refactor for subgroups on mat * vec kernel.

* Add q6_k kernels for vulkan.

* python binding: print debug message to stderr

* Fix regenerate button to be deterministic and bump the llama version to latest we have for gguf.

* Bump to the latest fixes for vulkan in llama.

* llamamodel: fix static vector in LLamaModel::endTokens

* Switch to new models2.json for new gguf release and bump our version to
2.5.0.

* Bump to latest llama/gguf branch.

* chat: report reason for fallback to CPU

* chat: make sure to clear fallback reason on success

* more accurate fallback descriptions

* differentiate between init failure and unsupported models

* backend: do not use Vulkan with non-LLaMA models

* Add q8_0 kernels to kompute shaders and bump to latest llama/gguf.

* backend: fix build with Visual Studio generator

Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
is needed because Visual Studio is a multi-configuration generator, so
we do not know what the build type will be until `cmake --build` is
called.

Fixes #1470

* remove old llama.cpp submodules

* Reorder and refresh our models2.json.

* rebase on newer llama.cpp

* python/embed4all: use gguf model, allow passing kwargs/overriding model

* Add starcoder, rift and sbert to our models2.json.

* Push a new version number for llmodel backend now that it is based on gguf.

* fix stray comma in models2.json

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>

* Speculative fix for build on mac.

* chat: clearer CPU fallback messages

* Fix crasher with an empty string for prompt template.

* Update the language here to avoid misunderstanding.

* added EM German Mistral Model

* make codespell happy

* issue template: remove "Related Components" section

* cmake: install the GPT-J plugin (#1487)

* Do not delete saved chats if we fail to serialize properly.

* Restore state from text if necessary.

* Another codespell attempted fix.

* llmodel: do not call magic_match unless build variant is correct (#1488)

* chatllm: do not write uninitialized data to stream (#1486)

* mat*mat for q4_0, q8_0

* do not process prompts on gpu yet

* python: support Path in GPT4All.__init__ (#1462)

* llmodel: print an error if the CPU does not support AVX (#1499)

* python bindings should be quiet by default

* disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is
  nonempty
* make verbose flag for retrieve_model default false (but also be
  overridable via gpt4all constructor)

should be able to run a basic test:

```python
import gpt4all
model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
print(model.generate('def fib(n):'))
```

and see no non-model output when successful

* python: always check status code of HTTP responses (#1502)

* Always save chats to disk, but save them as text by default. This also changes
the UI behavior to always open a 'New Chat' and setting it as current instead
of setting a restored chat as current. This improves usability by not requiring
the user to wait if they want to immediately start chatting.

* Update README.md

Signed-off-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>

* fix embed4all filename

https://discordapp.com/channels/1076964370942267462/1093558720690143283/1161778216462192692

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>

* Improves Java API signatures maintaining back compatibility

* python: replace deprecated pkg_resources with importlib (#1505)

* Updated chat wishlist (#1351)

* q6k, q4_1 mat*mat

* update mini-orca 3b to gguf2, license

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>

* convert scripts: fix AutoConfig typo (#1512)

* publish config https://docs.npmjs.com/cli/v9/configuring-npm/package-json#publishconfig (#1375)

merge into my branch

* fix appendBin

* fix gpu not initializing first

* sync up

* progress, still wip on destructor

* some detection work

* untested dispose method

* add js side of dispose

* Update gpt4all-bindings/typescript/index.cc

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/index.cc

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/index.cc

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/src/gpt4all.d.ts

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/src/gpt4all.js

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/src/util.js

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* fix tests

* fix circleci for nodejs

* bump version

---------

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
Signed-off-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
Co-authored-by: Aaron Miller <apage43@ninjawhale.com>
Co-authored-by: Adam Treat <treat.adam@gmail.com>
Co-authored-by: Akarshan Biswas <akarshan.biswas@gmail.com>
Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
Co-authored-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>
Co-authored-by: Alex Soto <asotobu@gmail.com>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>
11 months ago
cebtenzzre 3c561bcdf2 python: bump bindings version for AMD fixes 11 months ago
cebtenzzre 79a5522931 fix references to old backend implementations 11 months ago
cebtenzzre 78d930516d
app.py: change default model to Mistral Instruct (#1564) 11 months ago
cebtenzzre e90263c23f
make scripts executable (#1555) 11 months ago
cebtenzzre 7e5e84fbb7
python: change default extension to .gguf (#1559) 11 months ago
cebtenzzre 37b007603a
bindings: replace references to GGMLv3 models with GGUF (#1547) 11 months ago
Andriy Mulyar d50803ff8e
GGUF Python Release (#1539) 11 months ago
cebtenzzre 245c5ce5ea
update default model URLs (#1538) 11 months ago
cebtenzzre 0fe2e19691
llamamodel: re-enable error messages by default (#1537) 11 months ago
cebtenzzre 5fbeeb1cb4
python: connection resume and MSVC support (#1535) 11 months ago
cebtenzzre 017c3a9649
python: prepare version 2.0.0rc1 (#1529) 11 months ago
cebtenzzre fd3014016b
docs: clarify Vulkan dep in build instructions for bindings (#1525) 11 months ago
cebtenzzre 4d4275d1b8
python: replace deprecated pkg_resources with importlib (#1505) 11 months ago
Alex Soto 3c45a555e9 Improves Java API signatures maintaining back compatibility 11 months ago
Aaron Miller f39df0906e fix embed4all filename
https://discordapp.com/channels/1076964370942267462/1093558720690143283/1161778216462192692

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
11 months ago
cebtenzzre aed2068342
python: always check status code of HTTP responses (#1502) 11 months ago
Aaron Miller afaa291eab python bindings should be quiet by default
* disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is
  nonempty
* make verbose flag for retrieve_model default false (but also be
  overridable via gpt4all constructor)

should be able to run a basic test:

```python
import gpt4all
model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
print(model.generate('def fib(n):'))
```

and see no non-model output when successful
11 months ago
cebtenzzre f81b4b45bf
python: support Path in GPT4All.__init__ (#1462) 11 months ago
Aaron Miller a10f3aea5e python/embed4all: use gguf model, allow passing kwargs/overriding model 12 months ago
Adam Treat ea66669cef Switch to new models2.json for new gguf release and bump our version to
2.5.0.
12 months ago
Cebtenzzre 40c78d2f78 python binding: print debug message to stderr 12 months ago
Cebtenzzre 4392bf26e0 pyllmodel: print specific error message 12 months ago
Cebtenzzre 34f2ec2b33 gpt4all.py: GGUF 12 months ago
kevinbazira 17cb4a86d1 Replace git clone SSH URI with HTTPS URL
Running `git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git`
returns `Permission denied (publickey)` as shown below:
```
git clone --recurse-submodules git@github.com:nomic-ai/gpt4all.git
Cloning into gpt4all...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
```

This change replaces `git@github.com:nomic-ai/gpt4all.git` with
`https://github.com/nomic-ai/gpt4all.git` which runs without permission issues.

resolves nomic-ai/gpt4all#8, resolves nomic-ai/gpt4all#49
1 year ago
Adam Treat 0f046cf905 Bump the Python version to python-v1.0.12 to restrict the quants that vulkan recognizes. 1 year ago
Aaron Miller f0735efa7d vulkan python bindings on windows fixes 1 year ago
Aaron Miller 0ad1472b62 bump python version (library linking fix) 1 year ago
Andriy Mulyar b6e38d69ed
Python version bump
Signed-off-by: Andriy Mulyar <andriy.mulyar@gmail.com>
1 year ago