docs: update docs

Bryce 2023-02-21 20:41:29 -08:00 committed by Bryce Drennan
parent 54c3ad51d6
commit b261c62d4e
28 changed files with 168 additions and 52 deletions

README.md

@@ -9,78 +9,156 @@ AI imagined images. Pythonic generation of stable diffusion images.
"just works" on Linux and macOS (M1) (and maybe Windows?).
## Examples
```bash
# on macOS, make sure rust is installed first
>> pip install imaginairy
>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman" "a bluejay"
# Stable Diffusion 2.1
>> imagine --model SD-2.1 "a forest"
# Make an animation showing the generation process
>> imagine --gif "a flower"
```
<p float="left">
<img src="assets/000019_786355545_PLMS50_PS7.5_a_scenic_landscape.jpg" height="256">
<img src="assets/000032_337692011_PLMS40_PS7.5_a_photo_of_a_dog.jpg" height="256">
<img src="assets/000056_293284644_PLMS40_PS7.5_photo_of_a_bowl_of_fruit.jpg" height="256">
<img src="assets/000078_260972468_PLMS40_PS7.5_portrait_photo_of_a_freckled_woman.jpg" height="256">
<img src="assets/013986_1_kdpmpp2m59_PS7.5_a_bluejay_[generated].jpg" height="256">
<img src="assets/009719_942389026_kdpmpp2m15_PS7.5_a_flower.gif" height="256">
</p>
<details closed>
<summary>Console Output</summary>

```
🤖🧠 received 4 prompt(s) and will repeat them 1 times to create 4 images.
Loading model onto mps backend...
Generating 🖼 : "a scenic landscape" 512x512px seed:557988237 prompt-strength:7.5 steps:40 sampler-type:PLMS
PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:29<00:00, 1.36it/s]
🖼 saved to: ./outputs/000001_557988237_PLMS40_PS7.5_a_scenic_landscape.jpg
Generating 🖼 : "a photo of a dog" 512x512px seed:277230171 prompt-strength:7.5 steps:40 sampler-type:PLMS
PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:28<00:00, 1.41it/s]
🖼 saved to: ./outputs/000002_277230171_PLMS40_PS7.5_a_photo_of_a_dog.jpg
Generating 🖼 : "photo of a fruit bowl" 512x512px seed:639753980 prompt-strength:7.5 steps:40 sampler-type:PLMS
PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:28<00:00, 1.40it/s]
🖼 saved to: ./outputs/000003_639753980_PLMS40_PS7.5_photo_of_a_fruit_bowl.jpg
Generating 🖼 : "portrait photo of a freckled woman" 512x512px seed:500686645 prompt-strength:7.5 steps:40 sampler-type:PLMS
PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:29<00:00, 1.37it/s]
🖼 saved to: ./outputs/000004_500686645_PLMS40_PS7.5_portrait_photo_of_a_freckled_woman.jpg
```
</details>

## Features
- Instruction based image edits (InstructPix2Pix)
- Control image generation structure (ControlNet)
- Seamless tiled images
- Text-based masking (clipseg)
- Face enhancement (CodeFormer)
- Upscaling
- Outpainting
- Prompt expansion
- Image captioning
- Different generation models

### Image Structure Control [by ControlNet](https://github.com/lllyasviel/ControlNet)
Generate images guided by body poses, depth maps, Canny edges, HED boundaries, or normal maps.

<details>
<summary>Openpose Control</summary>

```bash
imagine --control-image assets/indiana.jpg --control-mode openpose --caption-text openpose "photo of a polar bear"
```
</details>
<p float="left">
<img src="assets/indiana.jpg" height="256">
<img src="assets/indiana-pose.jpg" height="256">
<img src="assets/indiana-pose-polar-bear.jpg" height="256">
</p>
<details>
<summary>Canny Edge Control</summary>
```bash
imagine --control-image assets/lena.png --control-mode canny --caption-text canny "photo of a woman with a hat looking at the camera"
```
</details>
<p float="left">
<img src="assets/lena.png" height="256">
<img src="assets/lena-canny.jpg" height="256">
<img src="assets/lena-canny-generated.jpg" height="256">
</p>
<details>
<summary>HED Boundary Control</summary>
```bash
imagine --control-image dog.jpg --control-mode hed "photo of a dalmatian"
```
</details>
<p float="left">
<img src="assets/000032_337692011_PLMS40_PS7.5_a_photo_of_a_dog.jpg" height="256">
<img src="assets/dog-hed-boundary.jpg" height="256">
<img src="assets/dog-hed-boundary-dalmation.jpg" height="256">
</p>
<details>
<summary>Depth Map Control</summary>
```bash
imagine --control-image fancy-living.jpg --control-mode depth "a modern living room"
```
</details>
<p float="left">
<img src="assets/fancy-living.jpg" height="256">
<img src="assets/fancy-living-depth.jpg" height="256">
<img src="assets/fancy-living-depth-generated.jpg" height="256">
</p>
<details>
<summary>Normal Map Control</summary>
```bash
imagine --control-image bird.jpg --control-mode normal "a bird"
```
</details>
<p float="left">
<img src="assets/013986_1_kdpmpp2m59_PS7.5_a_bluejay_[generated].jpg" height="256">
<img src="assets/bird-normal.jpg" height="256">
<img src="assets/bird-normal-generated.jpg" height="256">
</p>
### Instruction based image edits [by InstructPix2Pix](https://github.com/timothybrooks/instruct-pix2pix)
Just tell imaginairy how to edit the image and it will do it for you!
<p float="left">
<img src="assets/scenic_landscape_winter.jpg" height="256">
<img src="assets/dog_red.jpg" height="256">
<img src="assets/bowl_of_fruit_strawberries.jpg" height="256">
<img src="assets/freckled_woman_cyborg.jpg" height="256">
<img src="assets/014214_51293814_kdpmpp2m30_PS10.0_img2img-1.0_make_the_bird_wear_a_cowboy_hat_[generated].jpg" height="256">
<img src="assets/flower-make-the-flower-out-of-paper-origami.gif" height="256">
<img src="assets/girl-pearl-clown-compare.gif" height="256">
<img src="assets/mona-lisa-headshot-anim.gif" height="256">
<img src="assets/make-it-night-time.gif" height="256">
</p>
<details>
<summary>Click to see shell commands</summary>
Use prompt strength to control how strong the edit is. For extra control you can combine with prompt-based masking.
```bash
# enter imaginairy shell
>> aimg
🤖🧠> edit scenic_landscape.jpg -p "make it winter" --prompt-strength 20
🤖🧠> edit scenic_landscape.jpg -p "make it winter" --steps 30 --arg-schedule "prompt_strength[2:25:0.5]" --compilation-anim gif
🤖🧠> edit dog.jpg -p "make the dog red" --prompt-strength 5
🤖🧠> edit bowl_of_fruit.jpg -p "replace the fruit with strawberries"
🤖🧠> edit freckled_woman.jpg -p "make her a cyborg" --prompt-strength 13
🤖🧠> edit bluebird.jpg -p "make the bird wear a cowboy hat" --prompt-strength 10
🤖🧠> edit flower.jpg -p "make the flower out of paper origami" --arg-schedule prompt-strength[1:11:0.3] --steps 25 --compilation-anim gif
# create a comparison gif
🤖🧠> edit pearl_girl.jpg -p "make her wear clown makeup" --compare-gif
# create an animation showing the edit with increasing prompt strengths
🤖🧠> edit mona-lisa.jpg -p "make it a color professional photo headshot" --negative-prompt "old, ugly, blurry" --arg-schedule "prompt-strength[2:8:0.5]" --compilation-anim gif
🤖🧠> edit gg-bridge.jpg -p "make it night time" --prompt-strength 15 --steps 30 --arg-schedule prompt-strength[1:15:1] --compilation-anim gif
```
</details>
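The `--arg-schedule` flags above sweep a single argument across a range of values while everything else stays fixed. A rough sketch of how a schedule string like `prompt-strength[2:8:0.5]` could expand into concrete values (a hypothetical helper for illustration, not imaginairy's actual `parse_schedule_strs` implementation):

```python
import re


def parse_arg_schedule(schedule: str):
    """Parse 'name[start:end:step]' or 'name[a,b,c]' into (name, values)."""
    match = re.fullmatch(r"([a-z_-]+)\[(.+)\]", schedule.strip())
    if not match:
        raise ValueError(f"invalid schedule: {schedule!r}")
    name, body = match.groups()
    if ":" in body:
        # numeric range: sweep from start to end (inclusive) by step
        start, end, step = (float(part) for part in body.split(":"))
        values = []
        current = start
        while current <= end:
            values.append(round(current, 10))
            current += step
    else:
        # explicit list of values, e.g. model[sd14,sd15]
        values = body.split(",")
    return name, values


name, values = parse_arg_schedule("prompt-strength[2:8:0.5]")
```

One image is generated per value, and `--compilation-anim gif` stitches the frames together, which is what produces the increasing-strength animations shown above.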
### Quick Image Edit Demo
Want to just have some quick fun? Try `edit-demo` to apply some predefined edits.
```bash
>> aimg edit-demo pearl_girl.jpg
>> aimg edit-demo mona-lisa.jpg
>> aimg edit-demo luke.jpg
>> aimg edit-demo spock.jpg
```
<p float="left">
<img src="assets/girl_with_a_pearl_earring_suprise.gif" height="256">
<img src="assets/mona-lisa-suprise.gif" height="256">
<img src="assets/luke-suprise.gif" height="256">
<img src="assets/spock-suprise.gif" height="256">
<img src="assets/gg-bridge-suprise.gif" height="256">
<img src="assets/shire-suprise.gif" height="256">
</p>
### Prompt Based Masking [by clipseg](https://github.com/timojl/clipseg)
@@ -167,10 +245,11 @@ Use depth maps for amazing "translations" of existing images.
```bash
>> imagine --model SD-2.0-depth --init-image girl_with_a_pearl_earring_large.jpg --init-image-strength 0.05 "professional headshot photo of a woman with a pearl earring" -r 4 -w 1024 -h 1024 --steps 50
```
<p float="left">
<img src="tests/data/girl_with_a_pearl_earring.jpg" width="256"> ➡️
<img src="assets/pearl_depth_1.jpg" width="256">
<img src="assets/pearl_depth_2.jpg" width="256">
</p>
### Outpainting
@@ -179,9 +258,28 @@ Given a starting image, one can generate its "surroundings".
Example:
`imagine --init-image pearl-earring.jpg --init-image-strength 0 --outpaint all250,up0,down600 "woman standing"`
<img src="https://raw.githubusercontent.com/brycedrennan/imaginAIry/master/tests/data/girl_with_a_pearl_earring.jpg" height="256"> ➡️
<img src="https://raw.githubusercontent.com/brycedrennan/imaginAIry/master/tests/expected_output/test_outpainting_outpaint_.png" height="256">
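The `--outpaint` argument packs per-side pixel amounts into one string: `all` sets a default for every side, and named directions (`up`, `down`, `left`, `right`) override it. A minimal sketch of that parsing rule (an assumption about the exact semantics, for illustration only — not imaginairy's implementation):

```python
def parse_outpaint_arg(spec: str) -> dict:
    """Parse e.g. 'all250,up0,down600' into per-side pixel expansions."""
    sides = {"up": 0, "down": 0, "left": 0, "right": 0}
    for part in spec.split(","):
        key = part.rstrip("0123456789")   # side name
        amount = int(part[len(key):])     # trailing digits
        if key == "all":
            sides = {side: amount for side in sides}
        elif key in sides:
            sides[key] = amount
        else:
            raise ValueError(f"unknown outpaint side: {key!r}")
    return sides


sides = parse_outpaint_arg("all250,up0,down600")
# → expand 250px left/right, 0px up, 600px down
```

Under this reading, `all250,up0,down600` grows the canvas 250px on the sides and 600px below, which matches the "woman standing" example above.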
### Work with different generation models
<p float="left">
<img src="assets/fairytale-treehouse-sd14.jpg" height="256">
<img src="assets/fairytale-treehouse-sd15.jpg" height="256">
<img src="assets/fairytale-treehouse-sd20.jpg" height="256">
<img src="assets/fairytale-treehouse-sd21.jpg" height="256">
<img src="assets/fairytale-treehouse-openjourney-v1.jpg" height="256">
<img src="assets/fairytale-treehouse-openjourney-v2.jpg" height="256">
</p>
<details>
<summary>Click to see shell command</summary>
```bash
imagine "valley, fairytale treehouse village covered, matte painting, highly detailed, dynamic lighting, cinematic, realism, realistic, photo real, sunset, detailed, high contrast, denoised, centered, michael whelan" --steps 60 --seed 1 --arg-schedule model[sd14,sd15,sd20,sd21,openjourney-v1,openjourney-v2] --arg-schedule "caption-text[sd14,sd15,sd20,sd21,openjourney-v1,openjourney-v2]"
```
</details>
### Prompt Expansion
You can use `{}` to randomly pull values from lists. A list of values separated by `|` will have one option chosen at random each time the prompt is used.
@@ -218,21 +316,16 @@ You can use `{}` to randomly pull values from lists. A list of values separated
a bowl full of gold bars sitting on a table
```
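The `{}` expansion can be sketched as random substitution: each `{a|b|c}` group is replaced by one of its `|`-separated options. A toy illustration of the idea (not imaginairy's actual prompt-expansion code):

```python
import random
import re


def expand_prompt(template: str, rng: random.Random) -> str:
    """Replace each {a|b|c} group with one randomly chosen option."""
    def pick(match):
        options = match.group(1).split("|")
        return rng.choice(options).strip()

    return re.sub(r"\{([^{}]+)\}", pick, template)


rng = random.Random(0)
prompt = expand_prompt("a bowl full of {gold bars|fruit|dice} sitting on a table", rng)
```

Running the same template repeatedly yields a different concrete prompt each time, which is how one prompt string fans out into varied images.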
### Additional Features
- Generate images either in code or from command line.
- It just works. Proper requirements are installed. Model weights are automatically downloaded. No Hugging Face account needed.
  (if you have the right hardware... and aren't on Windows)
- No more distorted faces!
- Noisy logs are gone (which was surprisingly hard to accomplish)
- WeightedPrompts let you smash together separate prompts (cat-dog)
- Tile Mode creates tileable images
- Prompt metadata saved into image file metadata
- Edit images by describing the part you want edited (see example above)
- Have AI generate captions for images `aimg describe <filename-or-url>`
- Interactive prompt: just run `aimg`
- Finetune your own image model, kind of like DreamBooth. Read instructions on the ["Concept Training"](docs/concept-training.md) page.
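The WeightedPrompts feature blends separate prompts by weight; conceptually this amounts to a weighted average of each prompt's text conditioning. A toy sketch with plain Python lists standing in for the embedding vectors (illustrative only, not how the model actually conditions):

```python
def blend_weighted_embeddings(weighted):
    """Blend (embedding, weight) pairs into a single conditioning vector."""
    total = sum(weight for _, weight in weighted)
    dim = len(weighted[0][0])
    blended = [0.0] * dim
    for embedding, weight in weighted:
        for i, value in enumerate(embedding):
            blended[i] += value * (weight / total)
    return blended


# "cat-dog": an even blend of the two prompt embeddings
cat = [1.0, 0.0]
dog = [0.0, 1.0]
mix = blend_weighted_embeddings([(cat, 1.0), (dog, 1.0)])
```

Unequal weights shift the result toward one prompt, which is what produces the in-between "cat-dog" imagery.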
## How To
@@ -581,6 +674,10 @@ would be uncorrelated to the rest of the surrounding image. It created terrible
- ✅ add k-diffusion sampling methods
- ✅ tiling
- ✅ generation videos/gifs
- ✅ controlnet
  - scribbles input
  - segmentation input
  - mlsd input
- [Attend and Excite](https://attendandexcite.github.io/Attend-and-Excite/)
- Compositional Visual Generation
- https://github.com/energy-based-model/Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch

BIN assets/bird-normal.jpg (new file, 32 KiB)
BIN assets/dog-hed-boundary.jpg (new file, 20 KiB)
BIN assets/fancy-living.jpg (new file, 662 KiB)
BIN assets/indiana-pose.jpg (new file, 11 KiB)
BIN assets/indiana.jpg (new file, 436 KiB)
BIN assets/lena-canny.jpg (new file, 54 KiB)
BIN assets/lena.png (new file, 463 KiB)
(plus other new binary image assets whose filenames are not shown in this view)


@@ -3,6 +3,7 @@ import math
import os
import re
from imaginairy.img_utils import add_caption_to_image
from imaginairy.schema import SafetyMode
logger = logging.getLogger(__name__)
@@ -571,6 +572,9 @@ def _generate_single_image(
        mask_img=mask_image_orig,
    )
    if prompt.caption_text:
        add_caption_to_image(gen_img, prompt.caption_text)
    result = ImagineResult(
        img=gen_img,
        prompt=prompt,
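The `caption_text` hook above hands the generated image to `add_caption_to_image` from `imaginairy.img_utils`. A minimal Pillow sketch of what such a helper might look like (assumed behavior for illustration; the real implementation may differ):

```python
from PIL import Image, ImageDraw


def add_caption_to_image(img, caption):
    """Draw the caption in a dark strip along the bottom edge, in place."""
    draw = ImageDraw.Draw(img)
    strip_height = 20
    # solid background strip so the text stays readable on any image
    draw.rectangle(
        (0, img.height - strip_height, img.width, img.height),
        fill=(0, 0, 0),
    )
    draw.text((5, img.height - strip_height + 4), caption, fill=(255, 255, 255))


img = Image.new("RGB", (128, 64), color=(200, 200, 200))
add_caption_to_image(img, "openpose")
```

This is what lets `--caption-text openpose` label each frame in the comparison images above.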


@@ -247,6 +247,13 @@ common_options = [
        type=click.Choice(["gif", "mp4"]),
        help="Generate an animation composed of all the images generated in this run. Defaults to gif but `--compilation-anim mp4` will generate an mp4 instead.",
    ),
    click.option(
        "--caption-text",
        "caption_text",
        default=None,
        help="Specify the text to write onto the image",
        type=str,
    ),
]
@@ -391,6 +398,7 @@ def imagine_cmd(
make_compare_gif,
arg_schedules,
make_compilation_animation,
caption_text,
control_image,
control_image_raw,
control_mode,
@@ -440,6 +448,7 @@
make_compare_gif,
arg_schedules,
make_compilation_animation,
caption_text,
control_image,
control_image_raw,
control_mode,
@@ -559,6 +568,7 @@ def edit_image( # noqa
make_compare_gif,
arg_schedules,
make_compilation_animation,
caption_text,
):
"""
Edit an image via AI.
@@ -611,6 +621,7 @@ def edit_image( # noqa
make_compare_gif,
arg_schedules,
make_compilation_animation,
caption_text,
)
@@ -654,6 +665,7 @@ def _imagine_cmd(
make_compare_gif=False,
arg_schedules=None,
make_compilation_animation=False,
caption_text="",
control_image=None,
control_image_raw=None,
control_mode="",
@@ -762,6 +774,7 @@
allow_compose_phase=allow_compose_phase,
model=model_weights_path,
model_config_path=model_config_path,
caption_text=caption_text,
)
from imaginairy.prompt_schedules import (
    parse_schedule_strs,


@@ -118,6 +118,7 @@ class ImaginePrompt:
model_config_path=None,
is_intermediate=False,
collect_progress_latents=False,
caption_text="",
):
self.prompts = prompt
self.negative_prompt = negative_prompt
@@ -146,6 +147,7 @@
self.allow_compose_phase = allow_compose_phase
self.model = model
self.model_config_path = model_config_path
self.caption_text = caption_text
# we don't want to save intermediate images
self.is_intermediate = is_intermediate
@@ -227,7 +229,7 @@
self.steps = self.steps or SamplerCls.default_steps
self.width = self.width or get_model_default_image_size(self.model)
self.height = self.height or get_model_default_image_size(self.model)
self.steps = int(self.steps)
if self.negative_prompt is None:
model_config = config.MODEL_CONFIG_SHORTCUTS.get(self.model, None)
if model_config:

BIN tests/data/red.png (new file, 319 B)