
<h1 align="center">GPT4All</h1>
<p align="center">Demo, data, and code to train an assistant-style large language model with ~800k GPT-3.5-Turbo Generations based on LLaMa</p>

<p align="center">
<a href="https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf">:green_book: Technical Report</a>
</p>

<p align="center">
<a href="https://github.com/nomic-ai/pyllamacpp">:snake: Official Python Bindings</a>
</p>

<p align="center">
<a href="https://github.com/nomic-ai/gpt4all-ts">:computer: Official Typescript Bindings</a>
</p>

<p align="center">
<a href="https://python.langchain.com/en/latest/modules/models/llms/integrations/gpt4all.html">🦜️🔗 Official Langchain Backend</a> 
</p>


<p align="center">
<a href="https://discord.gg/mGZE39AS3e">Discord</a>
</p>


![gpt4all-lora-demo](https://user-images.githubusercontent.com/13879686/228352356-de66ca7a-df70-474e-b929-2e3656165051.gif)

Run on M1 Mac (not sped up!)

# Try it yourself

Here's how to get started with the CPU quantized GPT4All model checkpoint:

1. Download the `gpt4all-lora-quantized.bin` file from [Direct Link](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin) or [[Torrent-Magnet]](https://tinyurl.com/gpt4all-lora-quantized).
2. Clone this repository, navigate to `chat`, and place the downloaded file there.
3. Run the appropriate command for your OS:
   - M1 Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-m1`
   - Linux: `cd chat;./gpt4all-lora-quantized-linux-x86`
   - Windows (PowerShell): `cd chat;./gpt4all-lora-quantized-win64.exe`
   - Intel Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-intel`

For custom hardware compilation, see our [llama.cpp](https://github.com/zanussbaum/gpt4all.cpp) fork.
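
The per-OS commands in step 3 can also be selected programmatically. The sketch below is a hypothetical helper (`chat_binary` is not part of this repository) that maps the current platform to one of the binary names listed above. It assumes `platform.system()` and `platform.machine()` report the usual values (`Darwin`/`arm64` for M1 Macs, `Windows`/`AMD64` for 64-bit Windows); other values are treated as unsupported.

```python
import platform

# Binary names are copied from the commands in step 3.
# The (system, machine) keys are assumptions about what Python reports on each OS.
_BINARIES = {
    ("Darwin", "arm64"): "gpt4all-lora-quantized-OSX-m1",
    ("Darwin", "x86_64"): "gpt4all-lora-quantized-OSX-intel",
    ("Linux", "x86_64"): "gpt4all-lora-quantized-linux-x86",
    ("Windows", "AMD64"): "gpt4all-lora-quantized-win64.exe",
}


def chat_binary(system=None, machine=None):
    """Return the chat binary name for the given (or current) platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return _BINARIES[(system, machine)]
    except KeyError:
        # No prebuilt binary is listed for this platform.
        raise RuntimeError(
            f"No prebuilt binary listed for {system}/{machine}; "
            "compile from the llama.cpp fork instead"
        ) from None
```

Calling `chat_binary()` with no arguments inspects the current machine; passing explicit values such as `chat_binary("Linux", "x86_64")` is useful for testing.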

-----------
Find all compatible models in the GPT4All Ecosystem section.

[Secret Unfiltered Checkpoint](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin) - [[Torrent]](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin.torrent)

This model was trained with all refusal-to-answer responses removed from the training data. Try it with:
- M1 Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-m1 -m gpt4all-lora-unfiltered-quantized.bin`
- Linux: `cd chat;./gpt4all-lora-quantized-linux-x86 -m gpt4all-lora-unfiltered-quantized.bin`
- Windows (PowerShell): `cd chat;./gpt4all-lora-quantized-win64.exe -m gpt4all-lora-unfiltered-quantized.bin`
- Intel Mac/OSX: `cd chat;./gpt4all-lora-quantized-OSX-intel -m gpt4all-lora-unfiltered-quantized.bin`
-----------
Note: the full model on GPU (16GB of RAM required) performs much better in our qualitative evaluations.

# Python Client
## CPU Interface
To run GPT4All in Python, see the new [official Python bindings](https://github.com/nomic-ai/pyllamacpp).

The old bindings are still available but are now deprecated, and they do not work in a notebook environment.
To use the old Python client with the CPU interface, first install the [nomic client](https://github.com/nomic-ai/nomic) with `pip install nomic`.
Then use the following script to interact with GPT4All:
```python
from nomic.gpt4all import GPT4All

m = GPT4All()
# Open the model before sending any prompts
m.open()
m.prompt('write me a story about a lonely computer')
```

## GPU Interface
There are two ways to set up this model on GPU.
Either one is slightly more involved than the CPU setup.
1. Clone the nomic client [repo](https://github.com/nomic-ai/nomic) and run `pip install .[GPT4All]` in the repository root.
2. Run `pip install nomic` and install the additional dependencies from the wheels built [here](https://github.com/nomic-ai/nomic/tree/main/bin).

Once this is done, you can run the model on GPU with a script like the following:
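```python
from nomic.gpt4all import GPT4AllGPU

# Placeholder: replace with the path to your own LLaMA checkpoint
LLAMA_PATH = "path/to/llama-model"

m = GPT4AllGPU(LLAMA_PATH)
# Generation settings passed through to the model
config = {'num_beams': 2,
          'min_new_tokens': 10,
          'max_length': 100,
          'repetition_penalty': 2.0}
out = m.generate('write me a story about a lonely computer', config)
print(out)
```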
Here, `LLAMA_PATH` is the path to a Hugging Face AutoModel-compatible LLaMA model.
Nomic is unable to distribute this file at this time.
We are working on a GPT4All model that does not have this limitation.

You can pass any of the [Hugging Face generation config parameters](https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig) in `config`.
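
As an illustration of varying those parameters, here is a minimal sketch that starts from the values in the GPU example above and applies overrides. `make_config` is a hypothetical helper, not part of the nomic client; `do_sample`, `temperature`, and `top_p` are standard `GenerationConfig` fields, and the values chosen for them are arbitrary.

```python
# Defaults copied from the GPU example above.
DEFAULT_CONFIG = {
    "num_beams": 2,
    "min_new_tokens": 10,
    "max_length": 100,
    "repetition_penalty": 2.0,
}


def make_config(**overrides):
    """Return a copy of DEFAULT_CONFIG with the given overrides applied."""
    config = dict(DEFAULT_CONFIG)  # copy so the defaults are never mutated
    config.update(overrides)
    return config


# Example: switch from beam search to sampling (illustrative values only).
sampling_config = make_config(num_beams=1, do_sample=True, temperature=0.7, top_p=0.9)

# The resulting dict would be passed exactly like `config` in the GPU example:
# out = m.generate('write me a story about a lonely computer', sampling_config)
```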

# GPT4All Compatibility Ecosystem
Edge models in the GPT4All ecosystem. Please open a PR to add models as the [community grows](https://huggingface.co/models?sort=modified&search=4bit).
Feel free to convert this list to a more structured table.

- [gpt4all](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized.bin) [[MD5 Signature](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized-ggml.bin.md5)]
   - [gpt4all-ggml-converted](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized-ggml.bin) [[MD5 Signature](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-quantized-ggml.bin.md5)]
- [gpt4all-unfiltered](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin) [[MD5 Signature](https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/gpt4all-lora-unfiltered-quantized.bin.md5)]
- [ggml-vicuna-7b-4bit](https://huggingface.co/eachadea/ggml-vicuna-7b-4bit)
- [vicuna-13b-GPTQ-4bit-128g](https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g)
- [LLaMa-Storytelling-4Bit](https://huggingface.co/GamerUntouch/LLaMa-Storytelling-4Bit)
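
The MD5 signatures linked above can be used to check a download before running it. The sketch below uses only the Python standard library. It assumes the `.md5` file contains the hex digest as its first whitespace-separated token (the common `md5sum` layout), which has not been verified against the hosted files; the function names are illustrative.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so multi-gigabyte checkpoints fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, md5_text):
    """Compare a file against the contents of its .md5 signature file.

    Assumes the digest is the first whitespace-separated token of md5_text.
    """
    tokens = md5_text.split()
    if not tokens:
        raise ValueError("signature text is empty")
    return md5_of_file(path) == tokens[0].lower()
```

For example, after downloading both a checkpoint and its signature, `matches_md5("gpt4all-lora-quantized.bin", open("signature.md5").read())` reports whether they agree (the local file names here are placeholders).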


# Roadmap
## Short Term
 - <span style="color:green">(IN PROGRESS)</span> Train a GPT4All model based on GPTJ to alleviate llama distribution issues.
 - <span style="color:green">(IN PROGRESS)</span> Create improved CPU and GPU interfaces for this model.
 - <span style="color:red">(NOT STARTED)</span> Integrate llama.cpp bindings.
 - <span style="color:red">(NOT STARTED)</span> Create a good conversational chat interface for the model.
 - <span style="color:red">(NOT STARTED)</span> Allow users to opt in and submit their chats for subsequent training runs.

## Medium Term
 - <span style="color:red">(NOT STARTED)</span> Integrate GPT4All with [Atlas](https://atlas.nomic.ai) to allow for document retrieval.
   - BLOCKED by GPT4All based on GPTJ
 - <span style="color:red">(NOT STARTED)</span> Integrate GPT4All with Langchain.
 - <span style="color:green">(IN PROGRESS)</span> Build easy custom training scripts to allow users to fine tune models.

## Long Term
 - <span style="color:red">(NOT STARTED)</span> Allow anyone to curate training data for subsequent GPT4All releases using Atlas.
 - <span style="color:green">(IN PROGRESS)</span> Democratize AI. 

# Reproducibility

Trained LoRA Weights:
- gpt4all-lora (four full epochs of training): https://huggingface.co/nomic-ai/gpt4all-lora
- gpt4all-lora-epoch-2 (three full epochs of training): https://huggingface.co/nomic-ai/gpt4all-lora-epoch-2

Raw Data:
- [Training Data Without P3](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations)
  - Explorer: https://atlas.nomic.ai/map/gpt4all_data_clean_without_p3
- [Full Dataset with P3](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations_with_p3)
  - Explorer: https://atlas.nomic.ai/map/gpt4all_data_clean

We are not distributing a LLaMa 7B checkpoint.

You can reproduce our trained model by doing the following:

## Setup

Clone the repo

```bash
git clone --recurse-submodules https://github.com/nomic-ai/gpt4all.git
# Enter the repository before running any further git commands
cd gpt4all
git submodule update --init
```

Set up the environment

```
python -m pip install -r requirements.txt

cd transformers
pip install -e . 

cd ../peft
pip install -e .
```

## Training

```bash
accelerate launch --dynamo_backend=inductor --num_processes=8 --num_machines=1 --machine_rank=0 --deepspeed_multinode_launcher standard --mixed_precision=bf16  --use_deepspeed --deepspeed_config_file=configs/deepspeed/ds_config.json train.py --config configs/train/finetune-7b.yaml
```

## Generate

```bash
python generate.py --config configs/generate/generate.yaml --prompt "Write a script to reverse a string in Python"
```
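
For readers who prefer to call the model from their own code rather than through `generate.py`, the following is a minimal sketch and not the repository's implementation. It assumes `torch`, `transformers`, and `peft` are installed, that you have your own Hugging Face-format LLaMA checkpoint at `base_model_path`, and that the published `nomic-ai/gpt4all-lora` weights are applied on top of it. The function names, the `float16` dtype, and the `max_new_tokens` default are illustrative choices.

```python
def load_gpt4all_lora(base_model_path, lora_id="nomic-ai/gpt4all-lora"):
    """Load a local LLaMA checkpoint and apply LoRA weights on top of it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_path)
    base = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16)
    # Wrap the base model with the LoRA adapter weights.
    model = PeftModel.from_pretrained(base, lora_id)
    model.eval()
    return model, tokenizer


def generate(model, tokenizer, prompt, max_new_tokens=128):
    """Generate a completion for a single prompt and return it as text."""
    import torch

    # Place the inputs on whatever device the model's parameters live on.
    device = next(model.parameters()).device
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage would look like `model, tokenizer = load_gpt4all_lora("path/to/llama-model")` followed by `print(generate(model, tokenizer, "Write a script to reverse a string in Python"))`, where the path is a placeholder for your own checkpoint.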

## Need Help?

Join the <a href="https://discord.gg/kvmy6dQB">Discord</a> and ask for help in `#gpt4all-help`.

# Sample Generations

### Provide instructions for the given exercise. Leg Raises
>Leg Raises
>1. Stand with your feet shoulder-width apart and your knees slightly bent.
>2. Raise your arms straight out in front of you.
>3. Slowly bend your knees and raise your heels off the ground.
>4. Hold this position for a count of 3, then slowly lower your heels back to the ground.
>5. Repeat the exercise 10 times.
>6. Do not let your knees go beyond your toes.
>7. Keep your back straight and your abdominal muscles engaged.
>8. Do not let your hips sag or your back arch.
>9. Keep your arms straight and your elbows locked.
>10. Do not let your head drop or your shoulders rise.
>11. Breathe normally throughout the exercise.
>12. Rest for 30 seconds before repeating the exercise.
>13. Gradually increase the number of repetitions as you become stronger.


### A color description has been provided. Find the CSS code associated with that color. A light red color with a medium light shade of pink

>The CSS code associated with the color provided is: #FF6347


### Come up with an interesting idea for a new movie plot. Your plot should be described with a title and a summary.

> 1. The title of your movie plot should be "The Last Stand". 
> 2. The summary of your movie plot should be: A group of mercenaries are hired to protect a small town from a dangerous criminal who is on the run. However, the town's sheriff is not willing to let them take over and decides to take matters into his own hands.

### Reverse a string in python.
> In Python, you can reverse a list or tuple by using the reversed() function on it. Here's an example of how to use this method with strings: 
>   ```python
>   my_string = "Hello World" # Define your original string here
>   reversed_str = my_string[::-1]   # Use a slice and negative index for reverse order
>   print(reversed_str)     # Output will be reversed version of the inputted string.
>   ```

### List 10 dogs.
> 1. Labrador Retriever
> 2. Golden Retriever
> 3. Beagle
> 4. Poodle
> 5. German Shepherd
> 6. Bulldog
> 7. Rottweiler
> 8. Boxer
> 9. Shih Tzu
> 10. Pug

### Write me a poem about the fall of Julius Caesar into a caesar salad in iambic pentameter. 

> The fall of Julius Caesar into a caesar salad<br>
> In iambic pentameter,<br>
> The mighty Roman emperor<br>
> Fell into a salad of lettuce and croutons,<br>
> His empire crumbling around him,<br>
> As he was devoured by the hungry diners.<br>
> The once mighty emperor<br>
> Was now just a salad topping,<br>
> His legacy reduced to a mere garnish,<br>
> As he was consumed by the masses.<br>
> The fall of Julius Caesar into a caesar salad,<br>
> In iambic pentameter,<br>
> Was a tragic end to a great man,<br>
> Who had once ruled the world with an iron fist.<br>
> But now, he was just a salad topping,<br>
> His legacy reduced to a mere garnish,<br>
> As he was consumed by the masses.
    
### What is a three word topic describing the following keywords: baseball, football, soccer: 
> Sports, athletics, games
    
## Citation

If you use this repository, its models, or its data in a downstream project, please consider citing it with:
```
@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}
```