# ImaginAIry 🤖🧠

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1rOvQNs0Cmn_yU1bKWjCOHzGVDgZkaTtO?usp=sharing)
[![Downloads](https://pepy.tech/badge/imaginairy)](https://pepy.tech/project/imaginairy)
[![image](https://img.shields.io/pypi/v/imaginairy.svg)](https://pypi.org/project/imaginairy/)
[![image](https://img.shields.io/badge/license-MIT-green)](https://github.com/brycedrennan/imaginAIry/blob/master/LICENSE/)
[![Discord](https://flat.badgen.net/discord/members/FdD7ut3YjW)](https://discord.gg/FdD7ut3YjW)

AI imagined images. Pythonic generation of stable diffusion images **and videos**. "Just works" on Linux and macOS (M1) (and sometimes Windows).

```bash
# on macOS, make sure rust is installed first
# be sure to use Python 3.10; Python 3.11 is not supported at the moment
>> pip install imaginairy
>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman" "a bluejay"

# Make an AI video
>> aimg videogen --start-image rocket.png
```
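Generation can also be driven from Python. A minimal sketch, assuming the `ImaginePrompt`/`imagine_image_files` API (exact import paths may vary between versions):

```python
# minimal sketch of the Python API; import paths may vary by version
from imaginairy.api import imagine_image_files
from imaginairy.schema import ImaginePrompt

prompts = [
    ImaginePrompt("a scenic landscape", seed=1),
    ImaginePrompt("a photo of a dog"),
]

# render each prompt and write the resulting images to ./outputs
imagine_image_files(prompts, outdir="./outputs")
```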
## Stable Video Diffusion

### Rushed release of Stable Video Diffusion!

Works with Nvidia GPUs. Does not work on Mac or CPU.
On Windows you'll need to install torch 2.0 first via https://pytorch.org/get-started/locally/

```text
Usage: aimg videogen [OPTIONS]

  AI generate a video from an image

  Example: aimg videogen --start-image assets/rocket-wide.png

Options:
  --start-image TEXT       Input path for image file.
  --num-frames INTEGER     Number of frames.
  --num-steps INTEGER      Number of steps.
  --model TEXT             Model to use. One of: svd, svd_xt, svd_image_decoder, svd_xt_image_decoder
  --fps INTEGER            FPS for the AI to target when generating video
  --output-fps INTEGER     FPS for the output video
  --motion-amount INTEGER  How much motion to generate. value between 0 and 255.
  -r, --repeats INTEGER    How many times to repeat the renders.  [default: 1]
  --cond-aug FLOAT         Conditional augmentation.
  --seed INTEGER           Seed for random number generator.
  --decoding_t INTEGER     Number of frames decoded at a time.
  --output_folder TEXT     Output folder.
  --help                   Show this message and exit.
```
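A fuller invocation might look like this (the flags all come from the help text above; the values are illustrative):

```bash
>> aimg videogen --start-image rocket.png --model svd_xt \
   --num-frames 25 --fps 7 --output-fps 30 --motion-amount 127 --seed 42
```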
### Images

### What's New
[See the full changelog here](./docs/changelog.md)

**14.1.1**
- tests: add installation tests for windows, mac, and conda
- fix: dependency issues

**14.1.0**
- 🎉 feature: make video generation smooth by adding frame interpolation
- feature: SDXL weights in the compvis format can now be used
- feature: allow video generation at any size specified by user
- feature: video generations output in "bounce" format
- feature: choose video output format: mp4, webp, or gif
- feature: fix random seed handling in video generation
- docs: auto-publish docs on push to master
- build: remove imageio dependency
- build: vendorize facexlib so we don't install its unneeded dependencies

**14.0.4**
- docs: add a documentation website at https://brycedrennan.github.io/imaginAIry/
- build: remove fairscale dependency
- fix: video generation was broken

**14.0.3**
- fix: several critical bugs with package
- tests: add a wheel smoketest to detect these issues in the future

**14.0.0**
- 🎉 video generation using [Stable Video Diffusion](https://github.com/Stability-AI/generative-models)
  - add `--videogen` to any image generation to create a short video from the generated image
  - or use `aimg videogen` to generate a video from an image
- 🎉 SDXL (Stable Diffusion Extra Large) models are now supported
  - try `--model opendalle` or `--model sdxl`
  - inpainting and controlnets are not yet supported for SDXL
- 🎉 imaginairy is now backed by the [refiners library](https://github.com/finegrain-ai/refiners)
  - This was a huge rewrite, which is why some features are not yet supported. On the plus side, refiners supports cutting-edge features (SDXL, image prompts, etc.) which will be added to imaginairy soon.
  - [self-attention guidance](https://github.com/SusungHong/Self-Attention-Guidance), which makes details of images more accurate
- 🎉 feature: larger image generations now work MUCH better and stay faithful to the same image as it looks at a smaller size. For example `--size 720p --seed 1` and `--size 1080p --seed 1` will produce the same image for SD15
- 🎉 feature: loading diffusers-based models is now supported. Example: `--model https://huggingface.co/ainz/diseny-pixar --model-architecture sd15`
- 🎉 feature: qrcode controlnet!

### Run API server and StableStudio web interface (alpha)

Generate images via API or web interface. Much smaller feature set compared to the command line tool.

```bash
>> aimg server
```

Visit http://localhost:8000/ and http://localhost:8000/docs

### Image Structure Control [by ControlNet](https://github.com/lllyasviel/ControlNet)
#### (Not supported for SDXL yet)

Generate images guided by body poses, depth maps, canny edges, hed boundaries, or normal maps.

**Openpose Control**

```bash
imagine --control-image assets/indiana.jpg --control-mode openpose --caption-text openpose "photo of a polar bear"
```
#### Canny Edge Control

```bash
imagine --control-image assets/lena.png --control-mode canny "photo of a woman with a hat looking at the camera"
```
#### HED Boundary Control

```bash
imagine --control-image dog.jpg --control-mode hed "photo of a dalmatian"
```
#### Depth Map Control

```bash
imagine --control-image fancy-living.jpg --control-mode depth "a modern living room"
```
#### Normal Map Control

```bash
imagine --control-image bird.jpg --control-mode normal "a bird"
```
#### Image Shuffle Control

Generates the image based on elements of the control image. Kind of similar to style transfer.

```bash
imagine --control-image pearl-girl.jpg --control-mode shuffle "a clown"
```

The middle image is the "shuffled" input image.
#### Editing Instructions Control

Similar to InstructPix2Pix (below) but works with any SD 1.5 based model.

```bash
imagine --control-image pearl-girl.jpg --control-mode edit --init-image-strength 0.01 --steps 30 --negative-prompt "" --model openjourney-v2 "make it anime" "make it at the beach"
```
#### Add Details Control (upscaling/super-resolution)

Replaces existing details in an image. Good to use with `--init-image-strength 0.2`.

```bash
imagine --control-image "assets/wishbone.jpg" --control-mode details "sharp focus, high-resolution" --init-image-strength 0.2 --steps 30 -w 2048 -h 2048
```
### Image (re)Colorization (using brightness control)

Colorize black and white images or re-color existing images. The generated colors will be applied back to the original image. You can either provide a caption or allow the tool to generate one for you.

```bash
aimg colorize pearl-girl.jpg --caption "photo of a woman"
```
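If you omit `--caption`, the tool generates one for you automatically (the filename here is illustrative):

```bash
aimg colorize old-family-photo.jpg
```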
### Instruction based image edits [by InstructPix2Pix](https://github.com/timothybrooks/instruct-pix2pix)
#### (Broken as of 14.0.0)

Just tell imaginairy how to edit the image and it will do it for you!
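For example, edits are expressed as plain-language instructions passed to `aimg edit` (the prompts here are illustrative):

```bash
>> aimg edit scenic_landscape.jpg "make it winter"
>> aimg edit pearl_girl.jpg "make her wear clothes from the 70s"
```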
### Prompt Based Masking [by clipseg](https://github.com/timojl/clipseg)

Specify advanced text based masks using boolean logic and strength modifiers.

Mask syntax:
- mask descriptions must be lowercase
- keywords (`AND`, `OR`, `NOT`) must be uppercase
- parentheses are supported
- mask modifiers may be appended to any mask or group of masks. Example: `(dog OR cat){+5}` means that we'll select any dog or cat and then expand the size of the mask area by 5 pixels.

Valid mask modifiers:
- `{+n}` - expand mask by n pixels
- `{-n}` - shrink mask by n pixels
- `{*n}` - multiply mask strength. will expand mask to areas that weakly matched the mask description
- `{/n}` - divide mask strength. will reduce mask to areas that most strongly matched the mask description. probably not useful

When writing strength modifiers, keep in mind that pixel values are between 0 and 1.

```bash
>> imagine \
    --init-image pearl_earring.jpg \
    --mask-prompt "face AND NOT (bandana OR hair OR blue fabric){*6}" \
    --mask-mode keep \
    --init-image-strength .2 \
    --fix-faces \
    "a modern female president" "a female robot" "a female doctor" "a female firefighter"
```

```bash
>> imagine \
    --init-image fruit-bowl.jpg \
    --mask-prompt "fruit OR fruit stem{*6}" \
    --mask-mode replace \
    --mask-modify-original \
    --init-image-strength .1 \
    "a bowl of kittens" "a bowl of gold coins" "a bowl of popcorn" "a bowl of spaghetti"
```

### Face Enhancement [by CodeFormer](https://github.com/sczhou/CodeFormer)

```bash
>> imagine "a couple smiling" --steps 40 --seed 1 --fix-faces
```

### Upscaling [by RealESRGAN](https://github.com/xinntao/Real-ESRGAN)

```bash
>> imagine "colorful smoke" --steps 40 --upscale

# upscale an existing image
>> aimg upscale my-image.jpg
```
### Outpainting

Given a starting image, one can generate its "surroundings". The `--outpaint` value is a comma-separated list of directions (`all`, `up`, `down`, `left`, `right`), each followed by a pixel amount.

Example: `imagine --init-image pearl-earring.jpg --init-image-strength 0 --outpaint all250,up0,down600 "woman standing"`

### Work with different generation models