diff --git a/README.md b/README.md
index 9a5bfee2..79d88582 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 GPT4All
 
-Open-source assistant-style large language models that run locally on CPU
+Open-source assistant-style large language models that run locally on your CPU
 
 GPT4All Website
@@ -80,7 +80,6 @@ If you have older hardware that only supports avx and not avx2 you can use these
 * [Ubuntu - avx-only](https://gpt4all.io/installers/gpt4all-installer-linux-avx-only.run)
 
-
 Find the most up-to-date information on the [GPT4All Website](https://gpt4all.io/)
 
 ### Chat Client building and running
 
@@ -93,43 +92,9 @@ Find the most up-to-date information on the [GPT4All Website](https://gpt4all.io
 * :computer: Official Typescript Bindings
 
-## Training GPT4All-J
-
-Please see [GPT4All-J Technical Report](https://static.nomic.ai/gpt4all/2023_GPT4All-J_Technical_Report_2.pdf) for details.
-
-### GPT4All-J Training Data
-
-- We are releasing the curated training data for anyone to replicate GPT4All-J here: [GPT4All-J Training Data](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations)
-  - [Atlas Map of Prompts](https://atlas.nomic.ai/map/gpt4all-j-prompts-curated)
-  - [Atlas Map of Responses](https://atlas.nomic.ai/map/gpt4all-j-response-curated)
-
-We have released updated versions of our `GPT4All-J` model and training data.
-
-- `v1.0`: The original model trained on the v1.0 dataset
-- `v1.1-breezy`: Trained on a filtered dataset where we removed all instances of AI language model
-- `v1.2-jazzy`: Trained on a filtered dataset where we also removed instances like I'm sorry, I can't answer... and AI language model
-
-The [models](https://huggingface.co/nomic-ai/gpt4all-j) and [data](https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations) versions can be specified by passing a `revision` argument.
-
-For example, to load the `v1.2-jazzy` model and dataset, run:
-
-```python
-from datasets import load_dataset
-from transformers import AutoModelForCausalLM
-
-dataset = load_dataset("nomic-ai/gpt4all-j-prompt-generations", revision="v1.2-jazzy")
-model = AutoModelForCausalLM.from_pretrained("nomic-ai/gpt4all-j", revision="v1.2-jazzy")
-```
-
-### GPT4All-J Training Instructions
-
-```bash
-accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16 --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config_gptj.json train.py --config configs/train/finetune_gptj.yaml
-```
-
 ## Contributing
-GPT4All welcomes contribution, involvment, and discussion from the open source community!
-Please see CONTRIBUTING.md and follow the issue, bug report, and PR markdown templates.
+GPT4All welcomes contributions, involvement, and discussion from the open source community!
+Please see CONTRIBUTING.md and follow the issues, bug reports, and PR markdown templates.
 
 Check project discord, with project owners, or through existing issues/PRs to avoid duplicate work.
 Please make sure to tag all of the above with relevant project identifiers or your contribution could potentially get lost.
 