GPT4All
Demo, data, and code to train an assistant-style large language model with ~800k GPT-3.5-Turbo generations based on LLaMa
[Demo GIF: running on an M1 Mac (not sped up!)]
Try it yourself
Here's how to get started with the CPU quantized gpt4all model checkpoint:
- Download the gpt4all-lora-quantized.bin file from Direct Link or [Torrent-Magnet].
- Clone this repository, navigate to chat, and place the downloaded file there.
- Run the appropriate command for your OS:
  - M1 Mac/OSX: cd chat;./gpt4all-lora-quantized-OSX-m1
  - Linux: cd chat;./gpt4all-lora-quantized-linux-x86
  - Windows (PowerShell): cd chat;./gpt4all-lora-quantized-win64.exe
  - Intel Mac/OSX: cd chat;./gpt4all-lora-quantized-OSX-intel
For custom hardware compilation, see our Alpaca C++ repository.
Secret Unfiltered Checkpoint - [Torrent]
This model had all refusal-to-answer responses removed from its training data. Try it with:
cd chat;./gpt4all-lora-quantized-OSX-m1 -m gpt4all-lora-unfiltered-quantized.bin
Note: the full model on GPU (16GB of RAM required) performs much better in our qualitative evaluations.
Reproducibility
Trained LoRA Weights:
- gpt4all-lora (four full epochs of training): https://huggingface.co/nomic-ai/gpt4all-lora
- gpt4all-lora-epoch-2 (three full epochs of training): https://huggingface.co/nomic-ai/gpt4all-lora-epoch-2
Raw Data:
We are not distributing a LLaMa 7B checkpoint.
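Since the base checkpoint is not distributed, one way to use the released LoRA weights is to apply them to your own converted LLaMa 7B checkpoint with the Hugging Face peft library. A minimal sketch, assuming a transformers version with LLaMA support; the base-model path is a placeholder for weights you supply yourself:

# Minimal sketch: apply the released gpt4all-lora adapter to your own
# LLaMa 7B base weights. "path/to/llama-7b" is a placeholder; this repo
# does not distribute the base checkpoint.
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base = LlamaForCausalLM.from_pretrained("path/to/llama-7b")
tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-7b")
model = PeftModel.from_pretrained(base, "nomic-ai/gpt4all-lora")

inputs = tokenizer("Write a haiku about open source.", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))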
You can reproduce our trained model by doing the following:
Setup
Clone the repo
git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git
git submodule update --init
Set up the environment; the transformers and peft directories are the repo's submodule forks, installed in editable mode:
python -m pip install -r requirements.txt
cd transformers
pip install -e .
cd ../peft
pip install -e .
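As a quick sanity check, both packages should now resolve to the local submodule checkouts rather than to PyPI releases; a minimal verification in Python:

# Both __file__ paths should point into the cloned repo, confirming the
# editable installs of the bundled forks took effect.
import transformers
import peft
print(transformers.__file__)
print(peft.__file__)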
Training
accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16 --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config.json train.py --config configs/train/finetune-7b.yaml
Generate
python generate.py --config configs/generate/generate.yaml --prompt "Write a script to reverse a string in Python"
Sample Generations
Provide instructions for the given exercise. Leg Raises
Leg Raises
- Stand with your feet shoulder-width apart and your knees slightly bent.
- Raise your arms straight out in front of you.
- Slowly bend your knees and raise your heels off the ground.
- Hold this position for a count of 3, then slowly lower your heels back to the ground.
- Repeat the exercise 10 times.
- Do not let your knees go beyond your toes.
- Keep your back straight and your abdominal muscles engaged.
- Do not let your hips sag or your back arch.
- Keep your arms straight and your elbows locked.
- Do not let your head drop or your shoulders rise.
- Breathe normally throughout the exercise.
- Rest for 30 seconds before repeating the exercise.
- Gradually increase the number of repetitions as you become stronger.
A color description has been provided. Find the CSS code associated with that color. A light red color with a medium light shade of pink
The CSS code associated with the color provided is: #FF6347
Come up with an interesting idea for a new movie plot. Your plot should be described with a title and a summary.
1. The title of your movie plot should be "The Last Stand".
2. The summary of your movie plot should be: A group of mercenaries are hired to protect a small town from a dangerous criminal who is on the run. However, the town's sheriff is not willing to let them take over and decides to take matters into his own hands.
Reverse a string in python.
In Python, you can reverse a list or tuple by using the reversed() function on it. Here's an example of how to use this method with strings:
my_string = "Hello World" # Define your original string here reversed_str = my_string[::-1] # Use a slice and negative index for reverse order print(reversed_str) # Output will be reversed version of the inputted string.
List 10 dogs.
- Labrador Retriever
- Golden Retriever
- Beagle
- Poodle
- German Shepherd
- Bulldog
- Rottweiler
- Boxer
- Shih Tzu
- Pug
Write me a poem about the fall of Julius Ceasar into a ceasar salad in iambic pentameter.
The fall of Julius Ceasar into a ceasar salad
In iambic pentameter,
The mighty Roman emperor
Fell into a salad of lettuce and croutons,
His empire crumbling around him,
As he was devoured by the hungry diners.
The once mighty emperor
Was now just a salad topping,
His legacy reduced to a mere garnish,
As he was consumed by the masses.
The fall of Julius Ceasar into a ceasar salad,
In iambic pentameter,
Was a tragic end to a great man,
Who had once ruled the world with an iron fist.
But now, he was just a salad topping,
His legacy reduced to a mere garnish,
As he was consumed by the masses.
What is a three word topic describing the following keywords: baseball, football, soccer:
Sports, athletics, games
GPU Interface
There are two ways to get up and running with this model on GPU.
- Clone the nomic client repo and run pip install .[GPT4All] in the home dir.
- Run pip install nomic and install the additional deps from the wheels built here.
Once this is done, you can run the model on GPU with a script like the following:
from nomic import GPT4AllGPU
m = GPT4AllGPU(LLAMA_PATH)
config = {'num_beams': 2,
          'min_new_tokens': 10,
          'max_length': 100,
          'repetition_penalty': 2.0}
out = m.generate('write me a story about a lonely computer', config)
print(out)
You can pass any of the Hugging Face generation config params in the config.
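For example, standard Hugging Face sampling parameters can go straight into the dict (the values below are illustrative, not tuned):

# Illustrative only: do_sample, temperature, top_p, and max_new_tokens are
# standard Hugging Face generation parameters.
config = {'do_sample': True,
          'temperature': 0.7,
          'top_p': 0.95,
          'max_new_tokens': 150}
out = m.generate('explain what model quantization is', config)
print(out)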
If you use this repository, models, or data in a downstream project, please consider citing it with:
@misc{gpt4all,
author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}