mirror of https://github.com/brycedrennan/imaginAIry synced 2024-11-05 12:00:15 +00:00


Bryce 7a33ee2480 feature: cleaned up logging - cleans up all the logging. hide most of it - create better readme. show example images - save metadata into image		2022-09-10 23:27:22 -07:00


ImaginAIry

AI imagined images.

>> pip install imaginairy

>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman"

Features

It makes images from text descriptions!
Generate images either in code or from command line.
It just works (if you have the right hardware)
Noisy logs are gone (which was surprisingly hard to accomplish)
WeightedPrompts let you smash together separate prompts.
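
To give a feel for what "smashing together" weighted prompts could mean, here is a minimal sketch, assuming each prompt has already been encoded into a numeric embedding and that the embeddings are merged by a weight-normalized average. The helper `blend_embeddings` and the toy vectors are hypothetical and purely illustrative; this is not necessarily how imaginairy combines prompts internally.

```python
def blend_embeddings(weighted):
    """Blend (vector, weight) pairs into one vector by weighted average.

    Hypothetical helper for illustration only; not part of imaginairy.
    """
    total = sum(weight for _, weight in weighted)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    dim = len(weighted[0][0])
    # Each output component is the weight-normalized sum of the inputs.
    return [
        sum(vec[i] * weight for vec, weight in weighted) / total
        for i in range(dim)
    ]


# Toy stand-ins for the text embeddings of "cat" and "dog".
cat = [1.0, 0.0, 2.0]
dog = [0.0, 1.0, 4.0]

# Equal weights land halfway between the two vectors.
blended = blend_embeddings([(cat, 1), (dog, 1)])
```

Under this assumption, raising one prompt's weight pulls the blended vector toward that prompt.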

How To

from imaginairy import imagine_images, imagine_image_files, ImaginePrompt, WeightedPrompt

prompts = [
    ImaginePrompt("a scenic landscape", seed=1),
    ImaginePrompt("a bowl of fruit"),
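    # a single prompt built from two equally weighted sub-prompts
    ImaginePrompt([
        WeightedPrompt("cat", weight=1),
        WeightedPrompt("dog", weight=1),
    ])
]
# generate in code and handle each result yourself
for i, result in enumerate(imagine_images(prompts)):
    # use a distinct filename per result so images don't overwrite each other
    result.save(f"my_image_{i}.jpg")

# or write the generated images straight to a directory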

imagine_image_files(prompts, outdir="./my-art")

Requirements

A computer with a CUDA-capable graphics card and ~10 GB of video RAM, OR
an Apple M1 computer
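
A quick way to see which of these your machine offers is to ask PyTorch. This is a minimal sketch, assuming PyTorch is installed in the same environment; `pick_device` is a hypothetical helper name, not something imaginairy provides. It falls back to "cpu" when neither backend is available or PyTorch cannot be imported.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can see.

    Hypothetical helper for illustration only; not part of imaginairy.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Best available device: {device}")
```

If this prints "cpu", image generation will likely be impractically slow or fail, given the requirements above.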

Improvements from CompVis

img2img actually runs the number of steps you specify
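
For context, here is a small arithmetic sketch of the upstream behavior, assuming the CompVis img2img script derives the number of denoising steps it actually runs as int(strength * steps). The function names are hypothetical, and the compensation shown is just one possible approach, not necessarily how imaginairy does it.

```python
import math


def upstream_steps_run(requested_steps, strength):
    """Steps actually run if the count is scaled by strength (assumed upstream rule)."""
    return int(strength * requested_steps)


def schedule_length_for(requested_steps, strength):
    """Schedule length that makes the scaled count equal the requested steps.

    One possible compensation, shown for illustration only.
    """
    return math.ceil(requested_steps / strength)


# Asking for 50 steps at strength 0.75 would run fewer steps upstream.
ran = upstream_steps_run(50, 0.75)

# Lengthening the schedule brings the executed count back to 50.
schedule = schedule_length_for(50, 0.75)
compensated = upstream_steps_run(schedule, 0.75)
```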

Models Used

CLIP
LDM - Latent Diffusion
Stable Diffusion - https://github.com/CompVis/stable-diffusion

Todo

add safety model - https://github.com/CompVis/stable-diffusion/blob/main/scripts/txt2img.py#L21-L28
add docs
deploy to pypi
add tests
set up ci (test/lint/format)
remove yaml config
performance optimizations https://github.com/huggingface/diffusers/blob/main/docs/source/optimization/fp16.mdx
Interface improvements
- init-image at command line
- prompt expansion?
- webserver interface (low priority, this is a library)
Image Generation Features
- image describe feature
- outpainting
- inpainting
- face improvements
- upscaling
- cross-attention control:
  - https://github.com/bloc97/CrossAttentionControl/blob/main/CrossAttention_Release_NoImages.ipynb
- tiling
- output show-work videos
- zooming videos? a la disco diffusion