# Adding a concept to Stable Diffusion

You can use Imaginairy to teach the model a new concept (a person, thing, style, etc.) using the `aimg train-concept` command.

## Requirements

- Graphics card: NVIDIA RTX 3090 or better
- Linux
- A working Imaginairy installation
- A folder of images of the concept you want to teach the model

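If you want to sanity-check your Imaginairy installation first, something like the following should work (a minimal sketch: `pip install imaginairy` is the standard install route, and `aimg --help` simply confirms the CLI is available):

```
pip install imaginairy
aimg --help
```
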
## Background

To train the model we show it many images of the concept we want to teach it. The problem is that the model can easily overfit to the images we show it. To prevent this we also show it images of the class of thing being trained. Imaginairy will generate the images needed for this before running the training job.

Provided a directory of concept images, a concept token, and a class token, this command will train the model to generate images of that concept.

This happens in a 3-step process:

1. Cropping and resizing your training images. If `--person` is set we crop to include the face.
2. Generating a set of class images to train on. This helps prevent overfitting.
3. Training the model on the concept and class images.

The output of this command is a new model weights file that you can use with the `--model` option.

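For illustration, the concept images are just ordinary image files gathered in a single folder. A hypothetical layout might look like this (the folder and file names are placeholders):

```
./concept-images/
├── photo-01.jpg
├── photo-02.jpg
└── photo-03.jpg
```
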
## Instructions

1. Gather a set of images of the concept you want to train on. The images should show the subject from a variety of angles and in a variety of situations.
2. Run `aimg train-concept` to train the model.
    - Concept label: For a person, `firstnamelastname` should be fine.
        - If all the training images are photos you should add "a photo of" to the beginning of the concept label.
    - Class label: This is the category of the thing being trained on. For people this is typically "person", "man", or "woman".
        - If all the training images are photos you should add "a photo of" to the beginning of the class label.
        - Class images will be generated for you if you do not provide them.

For example, if you were training on photos of a man named Bill Hamilton you could run the following:

```
aimg train-concept \
    --person \
    --concept-label "photo of billhamilton man" \
    --concept-images-dir ./images/billhamilton \
    --class-label "photo of a man" \
    --class-images-dir ./images/man
```

3. Stop training before it overfits.
    - The training script will output checkpoint (`.ckpt`) files into the `logs` folder of wherever it is run from. You can also monitor generated images in the `logs/images` folder. They will be the ones named "sample".
    - I don't have great advice on when to stop training yet. I stopped mine at epoch 62 and it didn't seem quite good enough; at epoch 111 it produced my face correctly 50% of the time but also seemed overfit in some ways (always placing me in the same clothes or background as the training photos).
    - You can monitor model training progress in TensorBoard. Run `tensorboard --logdir lightning_logs` and open the link it gives you in your browser.

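If you prefer to keep an eye on things from a terminal, a couple of commands like these can help (illustrative only; the `logs` and `lightning_logs` paths are the ones mentioned above):

```
# list the checkpoints written so far
ls -lh logs/*/checkpoints/

# watch training curves and sample images in TensorBoard
tensorboard --logdir lightning_logs
```
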
4. Prune the model to bring the size down from 11 GB to ~4 GB: `aimg prune-ckpt logs/2023-01-15T05-52-06/checkpoints/epoch\=000049.ckpt`. Copy it somewhere and give it a meaningful name.

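For example (the checkpoint path matches the one above; `prune-ckpt`'s exact output filename isn't documented here, so the source of the copy is a placeholder):

```
# prune the checkpoint down to ~4 GB
aimg prune-ckpt logs/2023-01-15T05-52-06/checkpoints/epoch\=000049.ckpt

# copy the pruned file somewhere permanent with a meaningful name
# (replace <pruned-file> with whatever file prune-ckpt reports writing)
mkdir -p my-models
cp <pruned-file>.ckpt my-models/billhamilton-man-e049.ckpt
```
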
## Using the new model

You can reference the model like this in imaginairy:

`imagine --model my-models/billhamilton-man-e111.ckpt`

When you use the model you should prompt with `firstnamelastname classname` (e.g. `billhamilton man`).

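Putting the pieces together, a full generation command might look like this (the model path and prompt text are illustrative):

```
imagine --model my-models/billhamilton-man-e111.ckpt "billhamilton man riding a horse"
```
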
## Disclaimers

- The settings imaginairy uses to train the model are different from those of other software projects. As such you cannot follow advice you may read in other tutorials regarding learning rate, epochs, steps, or batch size; they are not directly comparable. In layman's terms, the "steps" are much bigger in imaginairy.
- I consider this training feature experimental and don't currently plan to offer support for it. Any further work will be at my leisure. As a result I may close any reported issues related to this feature.
- You can find a lot more relevant information here: https://github.com/JoePenna/Dreambooth-Stable-Diffusion

## Todo

- figure out how to improve consistency of quality from the trained model
- train on the depth-guided model instead of SD 1.5 since that will enable more consistent output
- figure out a metric to use for stopping training
- possibly swap out and randomize backgrounds on training photos so over-fitting does not occur