feature: dilation and erosion of masks

Previously the `+` and `-` characters in a mask (example: `face{+0.1}`) added to the grayscale value of any masked areas. This wasn't very useful. The new behavior is that the mask will expand or contract by the number of pixels specified. The technical terms for this are dilation and erosion. This allows much greater control over the masked area.
pull/68/head
Bryce 2 years ago committed by Bryce Drennan
parent 6f1455e912
commit 8332593fed

@ -36,9 +36,19 @@ Generating 🖼 : "portrait photo of a freckled woman" 512x512px seed:500686645
<img src="https://raw.githubusercontent.com/brycedrennan/imaginAIry/master/assets/000056_293284644_PLMS40_PS7.5_photo_of_a_bowl_of_fruit.jpg" height="256"><img src="https://raw.githubusercontent.com/brycedrennan/imaginAIry/master/assets/000078_260972468_PLMS40_PS7.5_portrait_photo_of_a_freckled_woman.jpg" height="256">
### Prompt Based Editing [by clipseg](https://github.com/timojl/clipseg)
Specify advanced text based masks using boolean logic and strength modifiers. Mask descriptions must be lowercase. Keywords uppercase.
Valid symbols: `AND`, `OR`, `NOT`, `()`, and mask strength modifier `{*1.5}` where `+` can be any of `+ - * /`. Single-character boolean
operators also work. When writing strength modifies know that pixel values are between 0 and 1.
Specify advanced text based masks using boolean logic and strength modifiers.
Mask syntax:
- mask descriptions must be lowercase
- keywords (`AND`, `OR`, `NOT`) must be uppercase
- parentheses are supported
- mask modifiers may be appended to any mask or group of masks. Example: `(dog OR cat){+5}` means that we'll
  select any dog or cat and then expand the size of the mask area by 5 pixels. Valid mask modifiers:
  - `{+n}` - expand mask by n pixels
  - `{-n}` - shrink mask by n pixels
  - `{*n}` - multiply mask strength; expands the mask to areas that weakly matched the mask description
  - `{/n}` - divide mask strength; reduces the mask to areas that most strongly matched the mask description (probably not useful)
When writing strength modifiers keep in mind that pixel values are between 0 and 1.
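
As a rough illustration of the strength modifiers described above, the sketch below parses a modifier string and applies `*` or `/` to a few grayscale values, clamping to the 0 to 1 range. The names `OPS`, `parse_modifier`, and `apply_strength` are hypothetical and not part of the library; the operator mapping is an assumption based on the four symbols listed in the README.

```python
import operator

# Assumed mapping from modifier symbol to arithmetic function (illustrative only).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def parse_modifier(modifier):
    """Split a modifier such as '{*1.5}' into its operator character and numeric value."""
    body = modifier.strip("{}")
    op_char, value = body[0], float(body[1:])
    if op_char not in OPS:
        raise ValueError(f"unknown mask modifier: {modifier!r}")
    return op_char, value


def apply_strength(pixels, modifier):
    """Apply a '*' or '/' modifier to grayscale values and clamp each result to [0, 1]."""
    op_char, value = parse_modifier(modifier)
    if op_char not in {"*", "/"}:
        raise ValueError("only '*' and '/' change mask strength")
    op = OPS[op_char]
    return [min(1.0, max(0.0, op(p, value))) for p in pixels]


weak_match = [0.1, 0.3, 0.6]
boosted = apply_strength(weak_match, "{*2}")  # 0.3 rises above a 0.5 threshold
reduced = apply_strength(weak_match, "{/2}")  # every value now falls below 0.5
```

With a binarization threshold of 0.5, doubling pulls a weakly matched pixel (0.3) into the mask, while halving pushes even the strongest pixel here (0.6) out of it.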
```bash
>> imagine \
@ -213,7 +223,8 @@ docker run -it --gpus all -v $HOME/.cache/huggingface:/root/.cache/huggingface -
[Example Colab](https://colab.research.google.com/drive/1rOvQNs0Cmn_yU1bKWjCOHzGVDgZkaTtO?usp=sharing)
## ChangeLog
- feature: dilation and erosion of masks
Previously the `+` and `-` characters in a mask (example: `face{+0.1}`) added to the grayscale value of any masked areas. This wasn't very useful. The new behavior is that the mask will expand or contract by the number of pixels specified. The technical terms for this are dilation and erosion. This allows much greater control over the masked area.
- feature: update k-diffusion samplers. add k_dpm_adaptive and k_dpm_fast
**3.1.0**
@ -359,6 +370,9 @@ would be uncorrelated to the rest of the surrounding image. It created terrible
- ✅ text based image masking
- ✅ ClipSeg - https://github.com/timojl/clipseg
- https://github.com/facebookresearch/detectron2
- Attention Control Methods
- https://github.com/bloc97/CrossAttentionControl
- https://github.com/ChenWu98/cycle-diffusion
- Image Enhancement
- Photo Restoration - https://github.com/microsoft/Bringing-Old-Photos-Back-to-Life
- Upscaling
@ -392,8 +406,6 @@ would be uncorrelated to the rest of the surrounding image. It created terrible
- https://github.com/francislabountyjr/stable-diffusion/blob/main/inferencing_notebook.ipynb
- https://www.youtube.com/watch?v=E7aAFEhdngI
- https://github.com/pytti-tools/frame-interpolation
- cross-attention control:
- https://github.com/bloc97/CrossAttentionControl/blob/main/CrossAttention_Release_NoImages.ipynb
- guided generation
- https://colab.research.google.com/drive/1dlgggNa5Mz8sEAGU0wFCHhGLFooW_pf1#scrollTo=UDeXQKbPTdZI
- https://colab.research.google.com/github/aicrumb/doohickey/blob/main/Doohickey_Diffusion.ipynb#scrollTo=PytCwKXCmPid

@ -21,6 +21,7 @@ from abc import ABC
import pyparsing as pp
import torch
from kornia.morphology import dilation, erosion
from pyparsing import ParserElement
ParserElement.enablePackrat()
@ -70,7 +71,8 @@ class ModifiedMask(Mask):
modifier = modifier.strip("{}")
self.mask = mask
self.modifier = modifier
self.operand = self.ops[modifier[0]]
self.operand_str = modifier[0]
self.operand = self.ops[self.operand_str]
self.value = float(modifier[1:])
@classmethod
@ -85,6 +87,15 @@ class ModifiedMask(Mask):
    def apply_masks(self, mask_cache):
        mask = self.mask.apply_masks(mask_cache)
        if self.operand_str in {"+", "-"}:
            # kernel must be odd
            kernel_size = int(round(self.value))
            kernel_size = kernel_size if kernel_size % 2 else kernel_size + 1
            morph_method = dilation if self.operand_str == "+" else erosion
            mask = mask.unsqueeze_(0).unsqueeze_(0)
            mask = morph_method(mask, torch.ones(kernel_size, kernel_size))
            mask = mask.squeeze()
            return mask
        return torch.clamp(self.operand(mask, self.value), 0, 1)
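
To make the kernel-size rounding and the two morphological operations concrete, here is a minimal pure-Python sketch that mirrors the logic of the hunk above on nested lists instead of tensors. It is not the library's implementation: `odd_kernel_size`, `morph`, `dilate`, and `erode` are hypothetical names, and the border handling (pixels outside the image are simply ignored) is an assumption that may differ from what kornia does.

```python
def odd_kernel_size(value):
    """Round to the nearest integer, then bump even sizes up to the next odd one."""
    size = int(round(value))
    return size if size % 2 else size + 1


def morph(mask, kernel_size, reduce_fn):
    """Reduce each square neighborhood of a 2D list with max (dilate) or min (erode)."""
    radius = kernel_size // 2
    height, width = len(mask), len(mask[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the in-bounds pixels covered by the kernel centered at (x, y).
            window = [
                mask[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def dilate(mask, value):
    """Grow bright regions: each pixel becomes the maximum of its neighborhood."""
    return morph(mask, odd_kernel_size(value), max)


def erode(mask, value):
    """Shrink bright regions: each pixel becomes the minimum of its neighborhood."""
    return morph(mask, odd_kernel_size(value), min)


# A 5x5 mask with a single bright pixel in the center.
mask = [[0.0] * 5 for _ in range(5)]
mask[2][2] = 1.0

grown = dilate(mask, 3)      # the single pixel becomes a 3x3 block
restored = erode(grown, 3)   # eroding the block recovers the single pixel
```

In this sketch a kernel of size k reaches k // 2 pixels to each side of its center, so a `{+3}` modifier widens the mask by one pixel per side, and an even value such as 2 is rounded up to the same 3x3 kernel.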

15 binary image files not shown (sizes from 14 KiB to 319 KiB).

@ -19,13 +19,13 @@ def test_imagine(sampler_type, filename_base_for_outputs):
)
result = next(imagine(prompt))
threshold_lookup = {
"k_dpm_2_a": 26000
}
threshold_lookup = {"k_dpm_2_a": 26000}
threshold = threshold_lookup.get(sampler_type, 10000)
img_path = f"{filename_base_for_outputs}.png"
assert_image_similar_to_expectation(result.img, img_path=img_path, threshold=threshold)
assert_image_similar_to_expectation(
result.img, img_path=img_path, threshold=threshold
)
def test_img2img_beach_to_sunset(
@ -115,13 +115,16 @@ def test_img_to_img_fruit_2_gold(
result = next(imagine(prompt))
threshold_lookup = {
"k_dpm_2_a": 26000
"k_dpm_2_a": 26000,
"k_dpm_adaptive": 11000,
}
threshold = threshold_lookup.get(sampler_type, 10000)
pillow_fit_image_within(img).save(f"{filename_base_for_orig_outputs}__orig.jpg")
img_path = f"{filename_base_for_outputs}.png"
assert_image_similar_to_expectation(result.img, img_path=img_path, threshold=threshold)
assert_image_similar_to_expectation(
result.img, img_path=img_path, threshold=threshold
)
@pytest.mark.skipif(get_device() == "cpu", reason="Too slow to run on CPU")

@ -27,26 +27,23 @@ def test_fix_faces(filename_base_for_orig_outputs, filename_base_for_outputs):
@pytest.mark.skipif(get_device() == "cpu", reason="Too slow to run on CPU")
def test_clip_masking():
def test_clip_masking(filename_base_for_outputs):
img = Image.open(f"{TESTS_FOLDER}/data/girl_with_a_pearl_earring_large.jpg")
for mask_modifier in [
"*0.5",
"*1",
"*6",
]:
for mask_modifier in ["*0.5", "*6", "+1", "+11", "+101", "-25"]:
pred_bin, pred_grayscale = get_img_mask(
img,
f"face AND NOT (bandana OR hair OR blue fabric){{{mask_modifier}}}",
threshold=0.5,
)
pred_grayscale.save(
f"{TESTS_FOLDER}/test_output/earring_mask_{mask_modifier}_g.png"
)
pred_bin.save(
f"{TESTS_FOLDER}/test_output/earring_mask_{mask_modifier}_bin.png"
img_path = f"{filename_base_for_outputs}_mask{mask_modifier}_g.png"
assert_image_similar_to_expectation(
pred_grayscale, img_path=img_path, threshold=0
)
img_path = f"{filename_base_for_outputs}_mask{mask_modifier}_bin.png"
assert_image_similar_to_expectation(pred_bin, img_path=img_path, threshold=10)
prompt = ImaginePrompt(
"",
init_image=img,
@ -60,10 +57,8 @@ def test_clip_masking():
)
result = next(imagine(prompt))
result.save(
f"{TESTS_FOLDER}/test_output/earring_mask_photo.png",
image_type="generated",
)
img_path = f"{filename_base_for_outputs}.png"
assert_image_similar_to_expectation(result.img, img_path=img_path, threshold=10)
boolean_mask_test_cases = [
