- feature: finetuning your own image models
- feature: image prep command. crops to face or other interesting parts of photo
- fix: back-compat for hf_hub_download
- feature: add prune-ckpt command
- feature: allow specification of model config file
- feature: negative prompting. `--negative-prompt` or `ImaginePrompt(..., negative_prompt="ugly, deformed, extra arms, etc")`
- feature: a default negative prompt is added to all generations. Images in SD-2.0 don't look bad anymore. Images in 1.5 look improved as well.
- feature: use different default steps and image sizes depending on sampler and model selceted
- fix: #110 use proper version in image metadata
- refactor: samplers all have their own class that inherits from ImageSampler
You can use `{}` to randomly pull values from lists. A list of values separated by `|` and enclosed in `{ }` will be randomly drawn from in a non-repeating fashion. Values that are surrounded by `_ _` will pull from a phrase list of the same name. Folders containing .txt phraselist files may be specified via
`--prompt_library_path`. The option may be specified multiple times. Built-in categories:
3d-term, adj-architecture, adj-beauty, adj-detailed, adj-emotion, adj-general, adj-horror, animal, art-movement,
art-site, artist, artist-botanical, artist-surreal, aspect-ratio, bird, body-of-water, body-pose, camera-brand,
camera-model, color, cosmic-galaxy, cosmic-nebula, cosmic-star, cosmic-term, dinosaur, eyecolor, f-stop,
fantasy-creature, fantasy-setting, fish, flower, focal-length, food, fruit, games, gen-modifier, hair, hd,
iso-stop, landscape-type, national-park, nationality, neg-weight, noun-beauty, noun-fantasy, noun-general,
noun-horror, occupation, photo-term, pop-culture, pop-location, punk-style, quantity, rpg-item, scenario-desc,
skin-color, spaceship, style, tree-species, trippy, world-heritage-site
Examples:
`imagine "a {red|black} dog" -r 2 --seed 0` will generate both "a red dog" and "a black dog"
`imagine "a {_color_} dog" -r 4 --seed 0` will generate four, different colored dogs. The colors will eb pulled from an included
phraselist of colors.
`imagine "a {_spaceship_|_fruit_|hot air balloon}. low-poly" -r 4 --seed 0` will generate images of spaceships or fruits or a hot air balloon
Credit to [noodle-soup-prompts](https://github.com/WASasquatch/noodle-soup-prompts/) where most, but not all, of the wordlists originate.
while the previous version did produce much better blending it also makes images that lack detail for some reason.
tests: Added more tests to help catch this sort of thing earlies
fix: found that median blur is really slow, so I made sure we only do it on downsampled masks. Was taking like 3 minutes to run on the large pearl girl picture on M1
- docs: update examples
Specify advanced text based masks using boolean logic and strength modifiers. Mask descriptions must be lowercase. Keywords uppercase.
Valid symbols: `AND`, `OR`, `NOT`, `()`, and mask strength modifier `{*1.5}` where `+` can be any of `+ - * /`. Single-character boolean
operators also work. When writing strength modifies know that pixel values are between 0 and 1.
- feature: apply mask edits to original files
- feature: auto-rotate images if exif data specifies to do so
- fix: accept mask images in command line
Seems to be caused by incompatible types in group_norm when we use autocast.
Patch group_norm to cast the weights to the same type as the inputs
From what I can understand all the other repos just switch to full precision instead
of addressing this. I think this would make things slower but I'm not sure. So maybe
the patching solution I'm doing is better?
https://github.com/pytorch/pytorch/pull/81852
- improved image logging functionality. can just stick log_latent wherever you want
- improved some variable naming
- moved all the samplers together
- vendored k-diffusion library