Go to file
2022-09-18 06:10:39 -07:00
assets feature: txt2mask - automated text replacement 2022-09-18 06:07:47 -07:00
imaginairy feature: txt2mask - automated text replacement 2022-09-18 06:07:47 -07:00
tests feature: txt2mask - automated text replacement 2022-09-18 06:07:47 -07:00
.gitignore refactor/test: logging suppression + hashed image test 2022-09-14 19:40:50 -07:00
Dockerfile feature: urls as init images 2022-09-15 23:06:59 -07:00
LICENSE docs: update readme. add docs to package 2022-09-11 21:36:14 -07:00
Makefile feature: automatic mask generation 2022-09-17 17:08:48 -07:00
README.md feature: txt2mask - automated text replacement 2022-09-18 06:07:47 -07:00
requirements-dev.in ci: add linter, import sorting 2022-09-11 13:57:36 -07:00
requirements-dev.txt fix: restrict usage of difficult protobuf versions 2022-09-16 08:18:50 -07:00
setup.py feature: txt2mask - automated text replacement 2022-09-18 06:07:47 -07:00
STABLE_DIFFUSION_LICENSE refactor: simplify structure 2022-09-11 00:59:03 -07:00
tox.ini refactor: cleanup ddim 2022-09-17 14:02:27 -07:00

ImaginAIry 🤖🧠

AI imagined images. Pythonic generation of stable diffusion images.

"just works" on Linux and OSX(M1).

Examples

>> pip install imaginairy
>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman"
Console Output
🤖🧠 received 4 prompt(s) and will repeat them 1 times to create 4 images.
Loading model onto mps backend...
Generating 🖼  : "a scenic landscape" 512x512px seed:557988237 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:29<00:00,  1.36it/s]
    🖼  saved to: ./outputs/000001_557988237_PLMS40_PS7.5_a_scenic_landscape.jpg
Generating 🖼  : "a photo of a dog" 512x512px seed:277230171 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:28<00:00,  1.41it/s]
    🖼  saved to: ./outputs/000002_277230171_PLMS40_PS7.5_a_photo_of_a_dog.jpg
Generating 🖼  : "photo of a fruit bowl" 512x512px seed:639753980 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:28<00:00,  1.40it/s]
    🖼  saved to: ./outputs/000003_639753980_PLMS40_PS7.5_photo_of_a_fruit_bowl.jpg
Generating 🖼  : "portrait photo of a freckled woman" 512x512px seed:500686645 prompt-strength:7.5 steps:40 sampler-type:PLMS
    PLMS Sampler: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [00:29<00:00,  1.37it/s]
    🖼  saved to: ./outputs/000004_500686645_PLMS40_PS7.5_portrait_photo_of_a_freckled_woman.jpg


Automated Replacement (txt2mask) by clipseg

>> imagine --init-image pearl_earring.jpg --mask-prompt face --mask-mode keep --init-image-strength .4 "a female doctor" "an elegant woman"

➡️

>> imagine --init-image fruit-bowl.jpg --mask-prompt fruit --mask-mode replace --init-image-strength .1 "a bowl of pears" "a bowl of gold" "a bowl of popcorn" "a bowl of spaghetti"

➡️

Face Enhancement by CodeFormer

>> imagine "a couple smiling" --steps 40 --seed 1 --fix-faces

➡️

Upscaling by RealESRGAN

>> imagine "colorful smoke" --steps 40 --upscale

➡️

Tiled Images

>> imagine  "gold coins" "a lush forest" "piles of old books" leaves --tile


Image-to-Image

>> imagine "portrait of a smiling lady. oil painting" --init-image girl_with_a_pearl_earring.jpg

➡️

Features

  • It makes images from text descriptions! 🎉
  • Generate images either in code or from command line.
  • It just works. Proper requirements are installed. model weights are automatically downloaded. No huggingface account needed. (if you have the right hardware... and aren't on windows)
  • No more distorted faces!
  • Noisy logs are gone (which was surprisingly hard to accomplish)
  • WeightedPrompts let you smash together separate prompts (cat-dog)
  • Tile Mode creates tileable images
  • Prompt metadata saved into image file metadata

How To

from imaginairy import imagine, imagine_image_files, ImaginePrompt, WeightedPrompt, LazyLoadingImage

url = "https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Thomas_Cole_-_Architect%E2%80%99s_Dream_-_Google_Art_Project.jpg/540px-Thomas_Cole_-_Architect%E2%80%99s_Dream_-_Google_Art_Project.jpg"
prompts = [
    ImaginePrompt("a scenic landscape", seed=1),
    ImaginePrompt("a bowl of fruit"),
    ImaginePrompt([
        WeightedPrompt("cat", weight=1),
        WeightedPrompt("dog", weight=1),
    ]),
    ImaginePrompt(
        "a spacious building", 
        init_image=LazyLoadingImage(url)
    )
]
for result in imagine(prompts):
    # do something
    result.save("my_image.jpg")

# or

imagine_image_files(prompts, outdir="./my-art")

Requirements

  • ~10 gb space for models to download
  • A decent computer with either a CUDA supported graphics card or M1 processor.

Running in Docker

See example Dockerfile (works on machine where you can pass the gpu into the container)

docker build . -t imaginairy
# you really want to map the cache or you end up wasting a lot of time and space redownloading the model weights
docker run -it --gpus all -v $HOME/.cache/huggingface:/root/.cache/huggingface -v $HOME/.cache/torch:/root/.cache/torch -v `pwd`/outputs:/outputs imaginairy /bin/bash

Improvements from CompVis

  • img2img actually does # of steps you specify
  • performance optimizations

Models Used

Not Supported

  • a web interface. this is a python library

Todo

Noteable Stable Diffusion Implementations

Further Reading