# ImaginAIry 🤖🧠

AI imagined images. Pythonic generation of Stable Diffusion images.

"Just works" on Linux and macOS (M1).

## Examples
### Multiple Prompts
```bash
>> pip install imaginairy
>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman"
🤖🧠 received 4 prompt(s) and will repeat them 1 times to create 4 images.
Loading model onto mps backend...
Generating 🖼  : "a scenic landscape" 512x512px seed:557988237 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:29<00:00,  1.36it/s]
    🖼  saved to: ./outputs/000001_557988237_PLMS40_PS7.5_a_scenic_landscape.jpg
Generating 🖼  : "a photo of a dog" 512x512px seed:277230171 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:28<00:00,  1.41it/s]
    🖼  saved to: ./outputs/000002_277230171_PLMS40_PS7.5_a_photo_of_a_dog.jpg
Generating 🖼  : "photo of a fruit bowl" 512x512px seed:639753980 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:28<00:00,  1.40it/s]
    🖼  saved to: ./outputs/000003_639753980_PLMS40_PS7.5_photo_of_a_fruit_bowl.jpg
Generating 🖼  : "portrait photo of a freckled woman" 512x512px seed:500686645 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:29<00:00,  1.37it/s]
    🖼  saved to: ./outputs/000004_500686645_PLMS40_PS7.5_portrait_photo_of_a_freckled_woman.jpg
```
<img src="assets/000019_786355545_PLMS50_PS7.5_a_scenic_landscape.jpg" width="256" height="256">
<img src="assets/000032_337692011_PLMS40_PS7.5_a_photo_of_a_dog.jpg" width="256" height="256">
<img src="assets/000056_293284644_PLMS40_PS7.5_photo_of_a_bowl_of_fruit.jpg" width="256" height="256">
<img src="assets/000078_260972468_PLMS40_PS7.5_portrait_photo_of_a_freckled_woman.jpg" width="256" height="256">
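
The saved filenames in the log above appear to follow a pattern: a zero-padded sequence number, the seed, the sampler name with its step count, `PS` plus the prompt strength, and the prompt with spaces replaced by underscores. This is inferred from the sample output only and is not a documented format. `OutputName` and `parse_output_name` below are hypothetical helpers, not part of imaginAIry. A minimal sketch for recovering the settings from a filename:

```python
import re
from typing import NamedTuple, Optional


class OutputName(NamedTuple):
    """Settings recovered from an output filename (hypothetical helper type)."""
    index: int
    seed: int
    sampler: str
    steps: int
    prompt_strength: float
    prompt_slug: str


# Pattern inferred from the example log lines; an assumption, not a spec.
_NAME_RE = re.compile(
    r"^(?P<index>\d+)_(?P<seed>\d+)_(?P<sampler>[A-Za-z]+)(?P<steps>\d+)"
    r"_PS(?P<ps>\d+(?:\.\d+)?)_(?P<slug>.+)\.jpg$"
)


def parse_output_name(filename: str) -> Optional[OutputName]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    return OutputName(
        index=int(match["index"]),
        seed=int(match["seed"]),
        sampler=match["sampler"],
        steps=int(match["steps"]),
        prompt_strength=float(match["ps"]),
        prompt_slug=match["slug"],
    )


if __name__ == "__main__":
    print(parse_output_name("000001_557988237_PLMS40_PS7.5_a_scenic_landscape.jpg"))
```

Re-using the recovered seed with the same prompt and settings is the usual way to reproduce a Stable Diffusion image, which is why having it in the filename is handy.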

### Tiled Images
```bash
>> imagine "Art Nouveau mosaic" --tile
🤖🧠 received 1 prompt(s) and will repeat them 1 times to create 1 images.
Loading model onto mps backend...
Generating 🖼  : "Art Nouveau mosaic" 512x512px seed:658241102 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:31<00:00,  1.28it/s]
    🖼  saved to: ./outputs/000058_658241102_PLMS40_PS7.5_Art_Nouveau_mosaic.jpg
```

<img src="assets/000057_802839261_PLMS40_PS7.5_Art_Nouveau_mosaic._ornate,_highly_detailed,_sharp_focus.jpg" width="256" height="256"><img src="assets/000057_802839261_PLMS40_PS7.5_Art_Nouveau_mosaic._ornate,_highly_detailed,_sharp_focus.jpg" width="256" height="256">
## Features
 
 - It makes images from text descriptions! 🎉
 - Generate images either in code or from the command line.
 - It just works: the required dependencies are installed and the model weights are downloaded automatically. No Hugging Face account needed (provided you have the right hardware and aren't on Windows).
 - Noisy logs are gone (which was surprisingly hard to accomplish)
 - WeightedPrompts let you smash together separate prompts (cat-dog)
 - Tile Mode creates tileable images
 - Prompt metadata is saved into the image file metadata

## How To

```python
from imaginairy import imagine_images, imagine_image_files, ImaginePrompt, WeightedPrompt

prompts = [
    ImaginePrompt("a scenic landscape", seed=1),
    ImaginePrompt("a bowl of fruit"),
    ImaginePrompt([
       WeightedPrompt("cat", weight=1),
       WeightedPrompt("dog", weight=1),
    ])
]
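for i, result in enumerate(imagine_images(prompts)):
    # do something with each result; a distinct filename per image
    # keeps later results from overwriting earlier ones
    result.save(f"my_image_{i}.jpg")
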
# or

imagine_image_files(prompts, outdir="./my-art")

```

## Requirements

- A computer with a CUDA-capable graphics card and ~10 GB of video RAM, OR
- An Apple M1 computer
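
To check which of these backends a machine offers, PyTorch's `torch.cuda.is_available()` and `torch.backends.mps.is_available()` can be queried. The sketch below is a standalone check that assumes PyTorch may or may not be installed; it is not imaginAIry's own device-selection code. `pick_device` and `detect_device` are hypothetical names, and a `"cpu"` result simply signals that neither requirement is met.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, else report 'cpu'."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends exist; report 'cpu' if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    cuda = torch.cuda.is_available()
    # torch.backends.mps only exists in newer PyTorch builds, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(cuda, mps)


if __name__ == "__main__":
    print(f"Best available backend: {detect_device()}")
```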

## Improvements from CompVis
 - img2img actually runs the number of steps you specify
 - performance optimizations
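
For context on the img2img item: the CompVis reference script (`scripts/img2img.py`), as we understand it, runs only `int(strength * steps)` denoising steps, so a request for 50 steps at strength 0.75 executes 37. One way to compensate is to schedule more steps so the executed count matches the request. That compensation is an assumption for illustration and not necessarily how imaginAIry does it; both function names are hypothetical.

```python
import math


def compvis_executed_steps(steps: int, strength: float) -> int:
    """Steps actually run by the reference script: int(strength * steps)."""
    return int(strength * steps)


def scheduled_steps_for(target_steps: int, strength: float) -> int:
    """Hypothetical fix: schedule enough steps that `target_steps` get executed."""
    if not 0.0 < strength <= 1.0:
        raise ValueError("strength must be in (0, 1]")
    return math.ceil(target_steps / strength)


if __name__ == "__main__":
    requested, strength = 50, 0.75
    print(compvis_executed_steps(requested, strength))   # 37
    scheduled = scheduled_steps_for(requested, strength)
    print(scheduled)                                      # 67
    print(compvis_executed_steps(scheduled, strength))   # 50
```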

## Models Used
 - CLIP - https://openai.com/blog/clip/
 - LDM - Latent Diffusion
 - Stable Diffusion 
   - https://github.com/CompVis/stable-diffusion
   - https://huggingface.co/CompVis/stable-diffusion-v1-4
   - https://laion.ai/blog/laion-5b/

## Todo
 - performance optimizations 
   - https://github.com/huggingface/diffusers/blob/main/docs/source/optimization/fp16.mdx
   - https://github.com/neonsecret/stable-diffusion
   - ✅ https://github.com/CompVis/stable-diffusion/compare/main...Doggettx:stable-diffusion:autocast-improvements#
   - ✅ https://www.reddit.com/r/StableDiffusion/comments/xalaws/test_update_for_less_memory_usage_and_higher/
 - deploy to pypi
 - add tests
 - set up ci (test/lint/format)
 - add docs
 - notify https://github.com/CompVis/stable-diffusion/issues/25
 - remove yaml config
 - delete more unused code
 - Interface improvements
   - init-image at command line
   - prompt expansion?
   - webserver interface (low priority, this is a library)
 - Image Generation Features
   - upscaling
     - https://github.com/lowfuel/progrock-stable
   - face improvements
     - codeformer
   - image describe feature - https://replicate.com/methexis-inc/img2prompt
   - outpainting
   - inpainting
     - https://github.com/andreas128/RePaint
   - add more sampling methods?
   - img2img but keeps img stable
     - https://www.reddit.com/r/StableDiffusion/comments/xboy90/a_better_way_of_doing_img2img_by_finding_the/
     - https://gist.github.com/trygvebw/c71334dd127d537a15e9d59790f7f5e1
   - img2img for plms?
   - images as actual prompts instead of just init images
   - cross-attention control: 
     - https://github.com/bloc97/CrossAttentionControl/blob/main/CrossAttention_Release_NoImages.ipynb
   - guided generation https://colab.research.google.com/drive/1dlgggNa5Mz8sEAGU0wFCHhGLFooW_pf1#scrollTo=UDeXQKbPTdZI
   - tiling
   - output show-work videos
   - image variations https://github.com/lstein/stable-diffusion/blob/main/VARIATIONS.md
   - textual inversion 
     - https://www.reddit.com/r/StableDiffusion/comments/xbwb5y/how_to_run_textual_inversion_locally_train_your/
     - https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_textual_inversion_training.ipynb#scrollTo=50JuJUM8EG1h
   - zooming videos? a la disco diffusion
   - fix saturation at high CFG https://www.reddit.com/r/StableDiffusion/comments/xalo78/fixing_excessive_contrastsaturation_resulting/