code update

GIMP3-ML
DESKTOP-F04AGRR\Kritik Soman 3 years ago
parent fc91dc1e7e
commit a18423d67a

@@ -1,3 +1,58 @@
# GIMP3-ML development
<img src="https://github.com/kritiksoman/tmp/blob/master/cover.png" width="1280" height="180"> <br>
# Semantics for GNU Image Manipulation Program
### [<img src="https://github.com/kritiksoman/tmp/blob/master/yt.png" width="70" height="50">](https://www.youtube.com/channel/UCzZn99R6Zh0ttGqvZieT4zw) [<img src="https://github.com/kritiksoman/tmp/blob/master/inst.png" width="50" height="50">](https://www.instagram.com/explore/tags/gimpml/) [<img src="https://github.com/kritiksoman/tmp/blob/master/arxiv.png" width="100" height="50">](https://arxiv.org/abs/2004.13060) [<img src="https://github.com/kritiksoman/tmp/blob/master/manual.png" width="100" height="50">](https://github.com/kritiksoman/GIMP-ML/wiki/User-Manual) [<img src="https://github.com/kritiksoman/tmp/blob/master/ref.png" width="100" height="50">](https://github.com/kritiksoman/GIMP-ML/wiki/References) [<img src="https://github.com/kritiksoman/tmp/blob/master/wiki.png" width="100" height="30">](https://en.wikipedia.org/wiki/GIMP#Extensions)<br>
:star: :star: :star: :star: are welcome. New tools will be added and existing ones will be improved over time.<br>
-> Work in progress.
Updates: <br>
[October 31] Use super-resolution as a filter for medium/large images. (Existing users should be able to update.)<br>
[October 17] Added image enlightening.<br>
[September 27] Added Force CPU use button and minor bug fixes. <br>
[August 28] Added deep learning based dehazing and denoising. <br>
[August 25] Simplified installation and updating method. <br>
[August 2] Added deep matting and k-means. <br>
[July 17] MonoDepth and Colorization models have been updated. <br>
# Screenshot of Menu
![image1](https://github.com/kritiksoman/tmp/blob/master/screenshot.png)
# Installation Steps
[1] Install [GIMP](https://www.gimp.org/downloads/) 2.10.<br>
[2] Clone this repository: `git clone https://github.com/kritiksoman/GIMP-ML.git` <br>
[3] Open a terminal, go to GIMP-ML/gimp-plugins and run: <br>
```bash installGimpML.sh```<br>
[4] Open GIMP and go to Preferences -> Folders -> Plug-ins, add the gimp-plugins folder, and restart GIMP. <br>
[5] Go to Layer->GIMP-ML->update, click OK with "update weights" set to yes, and restart GIMP. (~1.5 GB of weights will be downloaded.)<br>
If the above does not work, follow the manual installation guide: [Link](https://github.com/kritiksoman/GIMP-ML/blob/master/INSTALLATION.md) <br>
# Update Steps
[1] Go to Layer->GIMP-ML->update, click OK with "update weights" set to NO, and restart GIMP. <br>
[2] Go to Layer->GIMP-ML->update, click OK with "update weights" set to YES, and restart GIMP. <br>
# Citation
Please cite using the following BibTeX entry:
```
@article{soman2020GIMPML,
title={GIMP-ML: Python Plugins for using Computer Vision Models in GIMP},
author={Soman, Kritik},
journal={arXiv preprint arXiv:2004.13060},
year={2020}
}
```
# Tools
| Name | License | Dataset |
| ------------- |:-------------:| :-------------:|
| facegen | [CC BY-NC-SA 4.0](https://github.com/switchablenorms/CelebAMask-HQ#dataset-agreement) | CelebAMask-HQ |
| deblur | [BSD 3-clause](https://github.com/VITA-Group/DeblurGANv2/blob/master/LICENSE) | GoPro |
| faceparse | [MIT](https://github.com/zllrunning/face-parsing.PyTorch/blob/master/LICENSE) | CelebAMask-HQ |
| deepcolor | [MIT](https://github.com/junyanz/interactive-deep-colorization/blob/master/LICENSE) | ImageNet |
| monodepth | [MIT](https://github.com/intel-isl/MiDaS/blob/master/LICENSE) | [Multiple](https://arxiv.org/pdf/1907.01341v3.pdf) |
| super-resolution | [MIT](https://github.com/twtygqyy/pytorch-SRResNet/blob/master/LICENSE) | ImageNet |
| deepmatting | [Non-commercial purposes](https://github.com/poppinace/indexnet_matting/blob/master/Adobe%20Deep%20Image%20Mattng%20Dataset%20License%20Agreement.pdf) | Adobe Deep Image Matting |
| semantic-segmentation | MIT | COCO |
| kmeans | [BSD](https://github.com/scipy/scipy/blob/master/LICENSE.txt) | - |
| deep-dehazing | [MIT](https://github.com/MayankSingal/PyTorch-Image-Dehazing/blob/master/LICENSE) | [Custom](https://sites.google.com/site/boyilics/website-builder/project-page) |
| deep-denoising | [GPL3](https://github.com/SaoYan/DnCNN-PyTorch/blob/master/LICENSE) | BSD68 |
| enlighten | [BSD](https://github.com/VITA-Group/EnlightenGAN/blob/master/License) | [Custom](https://arxiv.org/pdf/1906.06972.pdf) |

@@ -1,17 +1,15 @@
# from .kmeans import get_kmeans as kmeans
from .tools.kmeans import get_kmeans as kmeans
from .tools.deblur import get_deblur as deblur
# from .deepcolor import get_deepcolor as deepcolor
from .tools.coloring import get_deepcolor as deepcolor
from .tools.dehaze import get_dehaze as dehaze
# from .deepdenoise import get_denoise as denoise
# from .deepmatting import get_newalpha as matting
from .tools.denoise import get_denoise as denoise
from .tools.matting import get_matting as matting
from .tools.enlighten import get_enlighten as enlighten
# from .facegen import get_newface as newface
from .tools.faceparse import get_face as parseface
# from .interpolateframes import get_inter as interpolateframe
from .tools.interpolation import get_inter as interpolateframe
from .tools.monodepth import get_mono_depth as depth
from .tools.complete_install import setup_python_weights
# from .semseg import get_sem_seg as semseg
from .tools.semseg import get_seg as semseg
from .tools.superresolution import get_super as super  # note: shadows Python's built-in super()
# from .inpainting import get_inpaint as inpaint
# from .syncWeights import sync as sync
from .tools.inpainting import get_inpaint as inpaint
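
Taken together, this `__init__.py` flattens the tool modules into a `gimpml.<tool>()` API. A minimal usage sketch (assuming the package is importable as `gimpml` and that the tool functions take and return NumPy image arrays, as the `tools/` scripts suggest; exact signatures are not shown in this diff):

```python
# Hypothetical use of the re-exported tool API; the signature is an assumption.
import cv2
import gimpml

image = cv2.imread("sample.png")[:, :, ::-1]    # BGR -> RGB
depth_map = gimpml.depth(image)                 # tools.monodepth.get_mono_depth
cv2.imwrite("depth.png", depth_map[:, :, ::-1])
```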

@@ -0,0 +1,240 @@
#!/usr/bin/env python3
# coding: utf-8
"""
.d8888b. 8888888 888b d888 8888888b. 888b d888 888
d88P Y88b 888 8888b d8888 888 Y88b 8888b d8888 888
888 888 888 88888b.d88888 888 888 88888b.d88888 888
888 888 888Y88888P888 888 d88P 888Y88888P888 888
888 88888 888 888 Y888P 888 8888888P" 888 Y888P 888 888
888 888 888 888 Y8P 888 888 888 Y8P 888 888
Y88b d88P 888 888 " 888 888 888 " 888 888
"Y8888P88 8888888 888 888 888 888 888 88888888
Colorizes the current layer.
"""
import sys
import gi
gi.require_version('Gimp', '3.0')
from gi.repository import Gimp
gi.require_version('GimpUi', '3.0')
from gi.repository import GimpUi
from gi.repository import GObject
from gi.repository import GLib
from gi.repository import Gio
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk
import gettext
_ = gettext.gettext
def N_(message): return message
import subprocess
import pickle
import os
def coloring(procedure, image, n_drawables, drawables, force_cpu, progress_bar):
# layers = Gimp.Image.get_selected_layers(image)
# Gimp.get_pdb().run_procedure('gimp-message', [GObject.Value(GObject.TYPE_STRING, "Error")])
config_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "..", "..", "tools")
with open(os.path.join(config_path, 'gimp_ml_config.pkl'), 'rb') as file:
data_output = pickle.load(file)
weight_path = data_output["weight_path"]
python_path = data_output["python_path"]
plugin_path = os.path.join(config_path, 'coloring.py')
Gimp.context_push()
image.undo_group_start()
for index, drawable in enumerate(drawables):
interlace, compression = 0, 2
Gimp.get_pdb().run_procedure('file-png-save', [
GObject.Value(Gimp.RunMode, Gimp.RunMode.NONINTERACTIVE),
GObject.Value(Gimp.Image, image),
GObject.Value(GObject.TYPE_INT, 1),
GObject.Value(Gimp.ObjectArray, Gimp.ObjectArray.new(Gimp.Drawable, [drawable], 1)),
GObject.Value(Gio.File,
Gio.File.new_for_path(os.path.join(weight_path, '..', 'cache' + str(index) + '.png'))),
GObject.Value(GObject.TYPE_BOOLEAN, interlace),
GObject.Value(GObject.TYPE_INT, compression),
# write all PNG chunks except oFFs(ets)
GObject.Value(GObject.TYPE_BOOLEAN, True),
GObject.Value(GObject.TYPE_BOOLEAN, True),
GObject.Value(GObject.TYPE_BOOLEAN, False),
GObject.Value(GObject.TYPE_BOOLEAN, True),
])
with open(os.path.join(weight_path, '..', 'gimp_ml_run.pkl'), 'wb') as file:
pickle.dump({"force_cpu": bool(force_cpu)}, file)
subprocess.call([python_path, plugin_path])
result = Gimp.file_load(Gimp.RunMode.NONINTERACTIVE,
Gio.file_new_for_path(os.path.join(weight_path, '..', 'cache.png')))
result_layer = result.get_active_layer()
copy = Gimp.Layer.new_from_drawable(result_layer, image)
copy.set_name("Coloring")
copy.set_mode(Gimp.LayerMode.NORMAL_LEGACY) # DIFFERENCE_LEGACY
image.insert_layer(copy, None, -1)
image.undo_group_end()
Gimp.context_pop()
return procedure.new_return_values(Gimp.PDBStatusType.SUCCESS, GLib.Error())
def run(procedure, run_mode, image, n_drawables, layer, args, data):
# gio_file = args.index(0)
# bucket_size = args.index(0)
force_cpu = args.index(1)
# output_format = args.index(2)
progress_bar = None
config = None
if run_mode == Gimp.RunMode.INTERACTIVE:
config = procedure.create_config()
# Set properties from arguments. These properties will be changed by the UI.
# config.set_property("file", gio_file)
# config.set_property("bucket_size", bucket_size)
config.set_property("force_cpu", force_cpu)
# config.set_property("output_format", output_format)
config.begin_run(image, run_mode, args)
GimpUi.init("coloring.py")
use_header_bar = Gtk.Settings.get_default().get_property("gtk-dialogs-use-header")
dialog = GimpUi.Dialog(use_header_bar=use_header_bar,
title=_("Coloring..."))
dialog.add_button("_Cancel", Gtk.ResponseType.CANCEL)
dialog.add_button("_OK", Gtk.ResponseType.OK)
vbox = Gtk.Box(orientation=Gtk.Orientation.VERTICAL,
homogeneous=False, spacing=10)
dialog.get_content_area().add(vbox)
vbox.show()
# Create grid to set all the properties inside.
grid = Gtk.Grid()
grid.set_column_homogeneous(False)
grid.set_border_width(10)
grid.set_column_spacing(10)
grid.set_row_spacing(10)
vbox.add(grid)
grid.show()
# # Bucket size parameter
# label = Gtk.Label.new_with_mnemonic(_("_Bucket Size"))
# grid.attach(label, 0, 1, 1, 1)
# label.show()
# spin = GimpUi.prop_spin_button_new(config, "bucket_size", step_increment=0.001, page_increment=0.1, digits=3)
# grid.attach(spin, 1, 1, 1, 1)
# spin.show()
# Force CPU parameter
spin = GimpUi.prop_check_button_new(config, "force_cpu", _("Force _CPU"))
spin.set_tooltip_text(_("If checked, CPU is used for model inference."
" Otherwise, GPU will be used if available."))
grid.attach(spin, 1, 2, 1, 1)
spin.show()
# # Output format parameter
# label = Gtk.Label.new_with_mnemonic(_("_Output Format"))
# grid.attach(label, 0, 3, 1, 1)
# label.show()
# combo = GimpUi.prop_string_combo_box_new(config, "output_format", output_format_enum.get_tree_model(), 0, 1)
# grid.attach(combo, 1, 3, 1, 1)
# combo.show()
progress_bar = Gtk.ProgressBar()
vbox.add(progress_bar)
progress_bar.show()
dialog.show()
if dialog.run() != Gtk.ResponseType.OK:
return procedure.new_return_values(Gimp.PDBStatusType.CANCEL,
GLib.Error())
result = coloring(procedure, image, n_drawables, layer, force_cpu, progress_bar)
# If the execution was successful, save parameters so they will be restored next time we show dialog.
if result.index(0) == Gimp.PDBStatusType.SUCCESS and config is not None:
config.end_run(Gimp.PDBStatusType.SUCCESS)
return result
class Coloring(Gimp.PlugIn):
## Parameters ##
__gproperties__ = {
# "filename": (str,
# # TODO: I wanted this property to be a path (and not just str) , so I could use
# # prop_file_chooser_button_new to open a file dialog. However, it fails without an error message.
# # Gimp.ConfigPath,
# _("Histogram _File"),
# _("Histogram _File"),
# "coloring.csv",
# # Gimp.ConfigPathType.FILE,
# GObject.ParamFlags.READWRITE),
# "file": (Gio.File,
# _("Histogram _File"),
# "Histogram export file",
# GObject.ParamFlags.READWRITE),
# "bucket_size": (float,
# _("_Bucket Size"),
# "Bucket Size",
# 0.001, 1.0, 0.01,
# GObject.ParamFlags.READWRITE),
"force_cpu": (bool,
_("Force _CPU"),
"Force CPU",
False,
GObject.ParamFlags.READWRITE),
# "output_format": (str,
# _("Output format"),
# "Output format: 'pixel count', 'normalized', 'percent'",
# "pixel count",
# GObject.ParamFlags.READWRITE),
}
## GimpPlugIn virtual methods ##
def do_query_procedures(self):
self.set_translation_domain("gimp30-python",
Gio.file_new_for_path(Gimp.locale_directory()))
return ['coloring']
def do_create_procedure(self, name):
procedure = None
if name == 'coloring':
procedure = Gimp.ImageProcedure.new(self, name, Gimp.PDBProcType.PLUGIN, run, None)
procedure.set_image_types("*")
procedure.set_sensitivity_mask(
Gimp.ProcedureSensitivityMask.DRAWABLE | Gimp.ProcedureSensitivityMask.DRAWABLES)
procedure.set_documentation(
N_("Extracts the monocular depth of the current layer."),
globals()["__doc__"], # This includes the docstring, on the top of the file
name)
procedure.set_menu_label(N_("_Coloring..."))
procedure.set_attribution("Kritik Soman",
"GIMP-ML",
"2021")
procedure.add_menu_path("<Image>/Layer/GIMP-ML/")
# procedure.add_argument_from_property(self, "file")
# procedure.add_argument_from_property(self, "bucket_size")
procedure.add_argument_from_property(self, "force_cpu")
# procedure.add_argument_from_property(self, "output_format")
return procedure
Gimp.main(Coloring.__gtype__, sys.argv)
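
All three plugins added in this commit share one pattern: the GIMP 3 process saves the selected drawables to `cache<N>.png`, pickles the run parameters to `gimp_ml_run.pkl`, invokes the weights-bundled Python interpreter (recorded in `gimp_ml_config.pkl`) on the matching `tools/` script, and loads the resulting `cache.png` back as a new layer. A sketch of the tool-process side of that contract, mirroring `tools/coloring.py` further down in this commit (`run_model` is a placeholder):

```python
# Tool-process side of the cache/pickle handshake (sketch; run_model is hypothetical).
import os
import pickle
import cv2

tools_dir = os.path.dirname(os.path.realpath(__file__))
with open(os.path.join(tools_dir, 'gimp_ml_config.pkl'), 'rb') as f:
    weight_path = pickle.load(f)["weight_path"]
base = os.path.join(weight_path, '..')
with open(os.path.join(base, 'gimp_ml_run.pkl'), 'rb') as f:
    cfg = pickle.load(f)                              # e.g. {"force_cpu": False}
img = cv2.imread(os.path.join(base, 'cache0.png'), cv2.IMREAD_UNCHANGED)
out = run_model(img, cpu_flag=cfg["force_cpu"])       # placeholder for the model call
cv2.imwrite(os.path.join(base, 'cache.png'), out)
```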

@@ -0,0 +1,273 @@
#!/usr/bin/env python3
# coding: utf-8
"""
.d8888b. 8888888 888b d888 8888888b. 888b d888 888
d88P Y88b 888 8888b d8888 888 Y88b 8888b d8888 888
888 888 888 88888b.d88888 888 888 88888b.d88888 888
888 888 888Y88888P888 888 d88P 888Y88888P888 888
888 88888 888 888 Y888P 888 8888888P" 888 Y888P 888 888
888 888 888 888 Y8P 888 888 888 Y8P 888 888
Y88b d88P 888 888 " 888 888 888 " 888 888
"Y8888P88 8888888 888 888 888 888 888 88888888
Interpolates frames between the current image layers.
"""
import sys
import gi
gi.require_version('Gimp', '3.0')
from gi.repository import Gimp
gi.require_version('GimpUi', '3.0')
from gi.repository import GimpUi
from gi.repository import GObject
from gi.repository import GLib
from gi.repository import Gio
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk
import gettext
_ = gettext.gettext
def N_(message): return message
import subprocess
import pickle
import os
def interpolation(procedure, image, n_drawables, drawables, force_cpu, progress_bar, gio_file):
# layers = Gimp.Image.get_selected_layers(image)
# Gimp.get_pdb().run_procedure('gimp-message', [GObject.Value(GObject.TYPE_STRING, "Error")])
config_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "..", "..", "tools")
with open(os.path.join(config_path, 'gimp_ml_config.pkl'), 'rb') as file:
data_output = pickle.load(file)
weight_path = data_output["weight_path"]
python_path = data_output["python_path"]
plugin_path = os.path.join(config_path, 'interpolation.py')
Gimp.context_push()
image.undo_group_start()
for index, drawable in enumerate(drawables):
interlace, compression = 0, 2
Gimp.get_pdb().run_procedure('file-png-save', [
GObject.Value(Gimp.RunMode, Gimp.RunMode.NONINTERACTIVE),
GObject.Value(Gimp.Image, image),
GObject.Value(GObject.TYPE_INT, 1),
GObject.Value(Gimp.ObjectArray, Gimp.ObjectArray.new(Gimp.Drawable, [drawable], 1)),
GObject.Value(Gio.File,
Gio.File.new_for_path(os.path.join(weight_path, '..', 'cache' + str(index) + '.png'))),
GObject.Value(GObject.TYPE_BOOLEAN, interlace),
GObject.Value(GObject.TYPE_INT, compression),
# write all PNG chunks except oFFs(ets)
GObject.Value(GObject.TYPE_BOOLEAN, True),
GObject.Value(GObject.TYPE_BOOLEAN, True),
GObject.Value(GObject.TYPE_BOOLEAN, False),
GObject.Value(GObject.TYPE_BOOLEAN, True),
])
with open(os.path.join(weight_path, '..', 'gimp_ml_run.pkl'), 'wb') as file:
pickle.dump({"force_cpu": bool(force_cpu), "gio_file": str(gio_file)}, file)
subprocess.call([python_path, plugin_path])
result = Gimp.file_load(Gimp.RunMode.NONINTERACTIVE,
Gio.file_new_for_path(os.path.join(weight_path, '..', 'cache.png')))
result_layer = result.get_active_layer()
copy = Gimp.Layer.new_from_drawable(result_layer, image)
copy.set_name("interpolation")
copy.set_mode(Gimp.LayerMode.NORMAL_LEGACY) # DIFFERENCE_LEGACY
image.insert_layer(copy, None, -1)
image.undo_group_end()
Gimp.context_pop()
return procedure.new_return_values(Gimp.PDBStatusType.SUCCESS, GLib.Error())
def run(procedure, run_mode, image, n_drawables, layer, args, data):
gio_file = args.index(0)
# bucket_size = args.index(0)
force_cpu = args.index(1)
# output_format = args.index(2)
progress_bar = None
config = None
if run_mode == Gimp.RunMode.INTERACTIVE:
config = procedure.create_config()
# Set properties from arguments. These properties will be changed by the UI.
# config.set_property("file", gio_file)
# config.set_property("bucket_size", bucket_size)
config.set_property("force_cpu", force_cpu)
# config.set_property("output_format", output_format)
config.begin_run(image, run_mode, args)
GimpUi.init("interpolation.py")
use_header_bar = Gtk.Settings.get_default().get_property("gtk-dialogs-use-header")
dialog = GimpUi.Dialog(use_header_bar=use_header_bar,
title=_("interpolation..."))
dialog.add_button("_Cancel", Gtk.ResponseType.CANCEL)
dialog.add_button("_OK", Gtk.ResponseType.OK)
vbox = Gtk.Box(orientation=Gtk.Orientation.VERTICAL,
homogeneous=False, spacing=10)
dialog.get_content_area().add(vbox)
vbox.show()
# Create grid to set all the properties inside.
grid = Gtk.Grid()
grid.set_column_homogeneous(False)
grid.set_border_width(10)
grid.set_column_spacing(10)
grid.set_row_spacing(10)
vbox.add(grid)
grid.show()
# UI for the file parameter
def choose_file(widget):
if file_chooser_dialog.run() == Gtk.ResponseType.OK:
if file_chooser_dialog.get_file() is not None:
config.set_property("file", file_chooser_dialog.get_file())
file_entry.set_text(file_chooser_dialog.get_file().get_path())
file_chooser_dialog.hide()
file_chooser_button = Gtk.Button.new_with_mnemonic(label=_("_Folder..."))
grid.attach(file_chooser_button, 0, 0, 1, 1)
file_chooser_button.show()
file_chooser_button.connect("clicked", choose_file)
file_entry = Gtk.Entry.new()
grid.attach(file_entry, 1, 0, 1, 1)
file_entry.set_width_chars(40)
file_entry.set_placeholder_text(_("Choose export folder..."))
if gio_file is not None:
file_entry.set_text(gio_file.get_path())
file_entry.show()
file_chooser_dialog = Gtk.FileChooserDialog(use_header_bar=use_header_bar,
title=_("Frame Export folder..."),
action=Gtk.FileChooserAction.SELECT_FOLDER)
file_chooser_dialog.add_button("_Cancel", Gtk.ResponseType.CANCEL)
file_chooser_dialog.add_button("_OK", Gtk.ResponseType.OK)
# # Bucket size parameter
# label = Gtk.Label.new_with_mnemonic(_("_Bucket Size"))
# grid.attach(label, 0, 1, 1, 1)
# label.show()
# spin = GimpUi.prop_spin_button_new(config, "bucket_size", step_increment=0.001, page_increment=0.1, digits=3)
# grid.attach(spin, 1, 1, 1, 1)
# spin.show()
# Force CPU parameter
spin = GimpUi.prop_check_button_new(config, "force_cpu", _("Force _CPU"))
spin.set_tooltip_text(_("If checked, CPU is used for model inference."
" Otherwise, GPU will be used if available."))
grid.attach(spin, 1, 2, 1, 1)
spin.show()
# # Output format parameter
# label = Gtk.Label.new_with_mnemonic(_("_Output Format"))
# grid.attach(label, 0, 3, 1, 1)
# label.show()
# combo = GimpUi.prop_string_combo_box_new(config, "output_format", output_format_enum.get_tree_model(), 0, 1)
# grid.attach(combo, 1, 3, 1, 1)
# combo.show()
progress_bar = Gtk.ProgressBar()
vbox.add(progress_bar)
progress_bar.show()
dialog.show()
if dialog.run() != Gtk.ResponseType.OK:
return procedure.new_return_values(Gimp.PDBStatusType.CANCEL,
GLib.Error())
gio_file = file_entry.get_text()
result = interpolation(procedure, image, n_drawables, layer, force_cpu, progress_bar, gio_file)
# If the execution was successful, save parameters so they will be restored next time we show dialog.
if result.index(0) == Gimp.PDBStatusType.SUCCESS and config is not None:
config.end_run(Gimp.PDBStatusType.SUCCESS)
return result
class Interpolation(Gimp.PlugIn):
## Parameters ##
__gproperties__ = {
# "filename": (str,
# # TODO: I wanted this property to be a path (and not just str) , so I could use
# # prop_file_chooser_button_new to open a file dialog. However, it fails without an error message.
# # Gimp.ConfigPath,
# _("Histogram _File"),
# _("Histogram _File"),
# "interpolation.csv",
# # Gimp.ConfigPathType.FILE,
# GObject.ParamFlags.READWRITE),
"file": (Gio.File,
_("Histogram _File"),
"Histogram export file",
GObject.ParamFlags.READWRITE),
# "bucket_size": (float,
# _("_Bucket Size"),
# "Bucket Size",
# 0.001, 1.0, 0.01,
# GObject.ParamFlags.READWRITE),
"force_cpu": (bool,
_("Force _CPU"),
"Force CPU",
False,
GObject.ParamFlags.READWRITE),
# "output_format": (str,
# _("Output format"),
# "Output format: 'pixel count', 'normalized', 'percent'",
# "pixel count",
# GObject.ParamFlags.READWRITE),
}
## GimpPlugIn virtual methods ##
def do_query_procedures(self):
self.set_translation_domain("gimp30-python",
Gio.file_new_for_path(Gimp.locale_directory()))
return ['interpolation']
def do_create_procedure(self, name):
procedure = None
if name == 'interpolation':
procedure = Gimp.ImageProcedure.new(self, name, Gimp.PDBProcType.PLUGIN, run, None)
procedure.set_image_types("*")
procedure.set_sensitivity_mask(
Gimp.ProcedureSensitivityMask.DRAWABLE | Gimp.ProcedureSensitivityMask.DRAWABLES)
procedure.set_documentation(
N_("Extracts the monocular depth of the current layer."),
globals()["__doc__"], # This includes the docstring, on the top of the file
name)
procedure.set_menu_label(N_("_Interpolation..."))
procedure.set_attribution("Kritik Soman",
"GIMP-ML",
"2021")
procedure.add_menu_path("<Image>/Layer/GIMP-ML/")
procedure.add_argument_from_property(self, "file")
# procedure.add_argument_from_property(self, "bucket_size")
procedure.add_argument_from_property(self, "force_cpu")
# procedure.add_argument_from_property(self, "output_format")
return procedure
Gimp.main(Interpolation.__gtype__, sys.argv)
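
The interpolation run config carries one extra field: the export folder chosen in the dialog is pickled alongside the CPU flag (see the `pickle.dump` above), so the tool process knows where to write the generated frames. A sketch of what the tool side would read (key names taken from the dump; paths assumed as in the plugin):

```python
# Reading interpolation's run parameters (sketch).
import os
import pickle

with open(os.path.join(weight_path, '..', 'gimp_ml_run.pkl'), 'rb') as f:  # weight_path from gimp_ml_config.pkl
    cfg = pickle.load(f)
force_cpu = cfg["force_cpu"]  # bool from the Force CPU checkbox
export_dir = cfg["gio_file"]  # folder path chosen via the _Folder... button
```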

@@ -0,0 +1,240 @@
#!/usr/bin/env python3
# coding: utf-8
"""
.d8888b. 8888888 888b d888 8888888b. 888b d888 888
d88P Y88b 888 8888b d8888 888 Y88b 8888b d8888 888
888 888 888 88888b.d88888 888 888 88888b.d88888 888
888 888 888Y88888P888 888 d88P 888Y88888P888 888
888 88888 888 888 Y888P 888 8888888P" 888 Y888P 888 888
888 888 888 888 Y8P 888 888 888 Y8P 888 888
Y88b d88P 888 888 " 888 888 888 " 888 888
"Y8888P88 8888888 888 888 888 888 888 88888888
Performs deep image matting on the current layers.
"""
import sys
import gi
gi.require_version('Gimp', '3.0')
from gi.repository import Gimp
gi.require_version('GimpUi', '3.0')
from gi.repository import GimpUi
from gi.repository import GObject
from gi.repository import GLib
from gi.repository import Gio
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk
import gettext
_ = gettext.gettext
def N_(message): return message
import subprocess
import pickle
import os
def matting(procedure, image, n_drawables, drawables, force_cpu, progress_bar):
# layers = Gimp.Image.get_selected_layers(image)
# Gimp.get_pdb().run_procedure('gimp-message', [GObject.Value(GObject.TYPE_STRING, "Error")])
config_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "..", "..", "tools")
with open(os.path.join(config_path, 'gimp_ml_config.pkl'), 'rb') as file:
data_output = pickle.load(file)
weight_path = data_output["weight_path"]
python_path = data_output["python_path"]
plugin_path = os.path.join(config_path, 'matting.py')
Gimp.context_push()
image.undo_group_start()
for index, drawable in enumerate(drawables):
interlace, compression = 0, 2
Gimp.get_pdb().run_procedure('file-png-save', [
GObject.Value(Gimp.RunMode, Gimp.RunMode.NONINTERACTIVE),
GObject.Value(Gimp.Image, image),
GObject.Value(GObject.TYPE_INT, 1),
GObject.Value(Gimp.ObjectArray, Gimp.ObjectArray.new(Gimp.Drawable, [drawable], 1)),
GObject.Value(Gio.File,
Gio.File.new_for_path(os.path.join(weight_path, '..', 'cache' + str(index) + '.png'))),
GObject.Value(GObject.TYPE_BOOLEAN, interlace),
GObject.Value(GObject.TYPE_INT, compression),
# write all PNG chunks except oFFs(ets)
GObject.Value(GObject.TYPE_BOOLEAN, True),
GObject.Value(GObject.TYPE_BOOLEAN, True),
GObject.Value(GObject.TYPE_BOOLEAN, False),
GObject.Value(GObject.TYPE_BOOLEAN, True),
])
with open(os.path.join(weight_path, '..', 'gimp_ml_run.pkl'), 'wb') as file:
pickle.dump({"force_cpu": bool(force_cpu)}, file)
subprocess.call([python_path, plugin_path])
result = Gimp.file_load(Gimp.RunMode.NONINTERACTIVE,
Gio.file_new_for_path(os.path.join(weight_path, '..', 'cache.png')))
result_layer = result.get_active_layer()
copy = Gimp.Layer.new_from_drawable(result_layer, image)
copy.set_name("Matting")
copy.set_mode(Gimp.LayerMode.NORMAL_LEGACY) # DIFFERENCE_LEGACY
image.insert_layer(copy, None, -1)
image.undo_group_end()
Gimp.context_pop()
return procedure.new_return_values(Gimp.PDBStatusType.SUCCESS, GLib.Error())
def run(procedure, run_mode, image, n_drawables, layer, args, data):
# gio_file = args.index(0)
# bucket_size = args.index(0)
force_cpu = args.index(1)
# output_format = args.index(2)
progress_bar = None
config = None
if run_mode == Gimp.RunMode.INTERACTIVE:
config = procedure.create_config()
# Set properties from arguments. These properties will be changed by the UI.
# config.set_property("file", gio_file)
# config.set_property("bucket_size", bucket_size)
config.set_property("force_cpu", force_cpu)
# config.set_property("output_format", output_format)
config.begin_run(image, run_mode, args)
GimpUi.init("matting.py")
use_header_bar = Gtk.Settings.get_default().get_property("gtk-dialogs-use-header")
dialog = GimpUi.Dialog(use_header_bar=use_header_bar,
title=_("matting..."))
dialog.add_button("_Cancel", Gtk.ResponseType.CANCEL)
dialog.add_button("_OK", Gtk.ResponseType.OK)
vbox = Gtk.Box(orientation=Gtk.Orientation.VERTICAL,
homogeneous=False, spacing=10)
dialog.get_content_area().add(vbox)
vbox.show()
# Create grid to set all the properties inside.
grid = Gtk.Grid()
grid.set_column_homogeneous(False)
grid.set_border_width(10)
grid.set_column_spacing(10)
grid.set_row_spacing(10)
vbox.add(grid)
grid.show()
# # Bucket size parameter
# label = Gtk.Label.new_with_mnemonic(_("_Bucket Size"))
# grid.attach(label, 0, 1, 1, 1)
# label.show()
# spin = GimpUi.prop_spin_button_new(config, "bucket_size", step_increment=0.001, page_increment=0.1, digits=3)
# grid.attach(spin, 1, 1, 1, 1)
# spin.show()
# Force CPU parameter
spin = GimpUi.prop_check_button_new(config, "force_cpu", _("Force _CPU"))
spin.set_tooltip_text(_("If checked, CPU is used for model inference."
" Otherwise, GPU will be used if available."))
grid.attach(spin, 1, 2, 1, 1)
spin.show()
# # Output format parameter
# label = Gtk.Label.new_with_mnemonic(_("_Output Format"))
# grid.attach(label, 0, 3, 1, 1)
# label.show()
# combo = GimpUi.prop_string_combo_box_new(config, "output_format", output_format_enum.get_tree_model(), 0, 1)
# grid.attach(combo, 1, 3, 1, 1)
# combo.show()
progress_bar = Gtk.ProgressBar()
vbox.add(progress_bar)
progress_bar.show()
dialog.show()
if dialog.run() != Gtk.ResponseType.OK:
return procedure.new_return_values(Gimp.PDBStatusType.CANCEL,
GLib.Error())
result = matting(procedure, image, n_drawables, layer, force_cpu, progress_bar)
# If the execution was successful, save parameters so they will be restored next time we show dialog.
if result.index(0) == Gimp.PDBStatusType.SUCCESS and config is not None:
config.end_run(Gimp.PDBStatusType.SUCCESS)
return result
class Matting(Gimp.PlugIn):
## Parameters ##
__gproperties__ = {
# "filename": (str,
# # TODO: I wanted this property to be a path (and not just str) , so I could use
# # prop_file_chooser_button_new to open a file dialog. However, it fails without an error message.
# # Gimp.ConfigPath,
# _("Histogram _File"),
# _("Histogram _File"),
# "matting.csv",
# # Gimp.ConfigPathType.FILE,
# GObject.ParamFlags.READWRITE),
# "file": (Gio.File,
# _("Histogram _File"),
# "Histogram export file",
# GObject.ParamFlags.READWRITE),
# "bucket_size": (float,
# _("_Bucket Size"),
# "Bucket Size",
# 0.001, 1.0, 0.01,
# GObject.ParamFlags.READWRITE),
"force_cpu": (bool,
_("Force _CPU"),
"Force CPU",
False,
GObject.ParamFlags.READWRITE),
# "output_format": (str,
# _("Output format"),
# "Output format: 'pixel count', 'normalized', 'percent'",
# "pixel count",
# GObject.ParamFlags.READWRITE),
}
## GimpPlugIn virtual methods ##
def do_query_procedures(self):
self.set_translation_domain("gimp30-python",
Gio.file_new_for_path(Gimp.locale_directory()))
return ['matting']
def do_create_procedure(self, name):
procedure = None
if name == 'matting':
procedure = Gimp.ImageProcedure.new(self, name, Gimp.PDBProcType.PLUGIN, run, None)
procedure.set_image_types("*")
procedure.set_sensitivity_mask(
Gimp.ProcedureSensitivityMask.DRAWABLE | Gimp.ProcedureSensitivityMask.DRAWABLES)
procedure.set_documentation(
N_("Extracts the monocular depth of the current layer."),
globals()["__doc__"], # This includes the docstring, on the top of the file
name)
procedure.set_menu_label(N_("_Matting..."))
procedure.set_attribution("Kritik Soman",
"GIMP-ML",
"2021")
procedure.add_menu_path("<Image>/Layer/GIMP-ML/")
# procedure.add_argument_from_property(self, "file")
# procedure.add_argument_from_property(self, "bucket_size")
procedure.add_argument_from_property(self, "force_cpu")
# procedure.add_argument_from_property(self, "output_format")
return procedure
Gimp.main(Matting.__gtype__, sys.argv)

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2020 hzwer
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

@@ -0,0 +1,117 @@
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from rife_model.warplayer import warp
# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
def conv_wo_act(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=False),
nn.BatchNorm2d(out_planes),
)
def conv(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=False),
nn.BatchNorm2d(out_planes),
nn.PReLU(out_planes)
)
class ResBlock(nn.Module):
def __init__(self, in_planes, out_planes, stride=1):
super(ResBlock, self).__init__()
if in_planes == out_planes and stride == 1:
self.conv0 = nn.Identity()
else:
self.conv0 = nn.Conv2d(in_planes, out_planes,
3, stride, 1, bias=False)
self.conv1 = conv(in_planes, out_planes, 3, stride, 1)
self.conv2 = conv_wo_act(out_planes, out_planes, 3, 1, 1)
self.relu1 = nn.PReLU(1)
self.relu2 = nn.PReLU(out_planes)
self.fc1 = nn.Conv2d(out_planes, 16, kernel_size=1, bias=False)
self.fc2 = nn.Conv2d(16, out_planes, kernel_size=1, bias=False)
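    # forward() below gates the main path with squeeze-and-excitation style
    # channel attention: global average pooling -> fc1 -> PReLU -> fc2 ->
    # sigmoid yields per-channel weights w that scale the features before the
    # shortcut y is added and the final PReLU is applied.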
def forward(self, x):
y = self.conv0(x)
x = self.conv1(x)
x = self.conv2(x)
w = x.mean(3, True).mean(2, True)
w = self.relu1(self.fc1(w))
w = torch.sigmoid(self.fc2(w))
x = self.relu2(x * w + y)
return x
class IFBlock(nn.Module):
def __init__(self, in_planes, scale=1, c=64):
super(IFBlock, self).__init__()
self.scale = scale
self.conv0 = conv(in_planes, c, 3, 2, 1)
self.res0 = ResBlock(c, c)
self.res1 = ResBlock(c, c)
self.res2 = ResBlock(c, c)
self.res3 = ResBlock(c, c)
self.res4 = ResBlock(c, c)
self.res5 = ResBlock(c, c)
self.conv1 = nn.Conv2d(c, 8, 3, 1, 1)
self.up = nn.PixelShuffle(2)
def forward(self, x):
if self.scale != 1:
x = F.interpolate(x, scale_factor=1. / self.scale, mode="bilinear",
align_corners=False)
x = self.conv0(x)
x = self.res0(x)
x = self.res1(x)
x = self.res2(x)
x = self.res3(x)
x = self.res4(x)
x = self.res5(x)
x = self.conv1(x)
flow = self.up(x)
if self.scale != 1:
flow = F.interpolate(flow, scale_factor=self.scale, mode="bilinear",
align_corners=False)
return flow
class IFNet(nn.Module):
def __init__(self, cFlag):
super(IFNet, self).__init__()
self.block0 = IFBlock(6, scale=4, c=192)
self.block1 = IFBlock(8, scale=2, c=128)
self.block2 = IFBlock(8, scale=1, c=64)
self.cFlag = cFlag
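    # forward() runs a coarse-to-fine cascade: block0 estimates an initial flow
    # on heavily downscaled input, both halves of the concatenated image pair
    # are backward-warped with it, and block1/block2 predict residual
    # corrections; the returned flow is the accumulated sum flow0 + flow1 + flow2.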
def forward(self, x):
x = F.interpolate(x, scale_factor=0.5, mode="bilinear",
align_corners=False)
flow0 = self.block0(x)
F1 = flow0
warped_img0 = warp(x[:, :3], F1, self.cFlag)
warped_img1 = warp(x[:, 3:], -F1, self.cFlag)
flow1 = self.block1(torch.cat((warped_img0, warped_img1, F1), 1))
F2 = (flow0 + flow1)
warped_img0 = warp(x[:, :3], F2, self.cFlag)
warped_img1 = warp(x[:, 3:], -F2, self.cFlag)
flow2 = self.block2(torch.cat((warped_img0, warped_img1, F2), 1))
F3 = (flow0 + flow1 + flow2)
return F3, [F1, F2, F3]
if __name__ == '__main__':
    # Smoke test: the module-level device line above is commented out, so
    # define it here; IFNet in this file also requires the force-CPU flag.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    img0 = torch.zeros(3, 3, 256, 256).float().to(device)
    img1 = torch.tensor(np.random.normal(
        0, 1, (3, 3, 256, 256))).float().to(device)
    imgs = torch.cat((img0, img1), 1)
    flownet = IFNet(cFlag=False).to(device)
    flow, _ = flownet(imgs)
    print(flow.shape)

@@ -0,0 +1,115 @@
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from model.warplayer import warp
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
def conv_wo_act(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=False),
nn.BatchNorm2d(out_planes),
)
def conv(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=False),
nn.BatchNorm2d(out_planes),
nn.PReLU(out_planes)
)
class ResBlock(nn.Module):
def __init__(self, in_planes, out_planes, stride=1):
super(ResBlock, self).__init__()
if in_planes == out_planes and stride == 1:
self.conv0 = nn.Identity()
else:
self.conv0 = nn.Conv2d(in_planes, out_planes,
3, stride, 1, bias=False)
self.conv1 = conv(in_planes, out_planes, 3, stride, 1)
self.conv2 = conv_wo_act(out_planes, out_planes, 3, 1, 1)
self.relu1 = nn.PReLU(1)
self.relu2 = nn.PReLU(out_planes)
self.fc1 = nn.Conv2d(out_planes, 16, kernel_size=1, bias=False)
self.fc2 = nn.Conv2d(16, out_planes, kernel_size=1, bias=False)
def forward(self, x):
y = self.conv0(x)
x = self.conv1(x)
x = self.conv2(x)
w = x.mean(3, True).mean(2, True)
w = self.relu1(self.fc1(w))
w = torch.sigmoid(self.fc2(w))
x = self.relu2(x * w + y)
return x
class IFBlock(nn.Module):
def __init__(self, in_planes, scale=1, c=64):
super(IFBlock, self).__init__()
self.scale = scale
self.conv0 = conv(in_planes, c, 3, 1, 1)
self.res0 = ResBlock(c, c)
self.res1 = ResBlock(c, c)
self.res2 = ResBlock(c, c)
self.res3 = ResBlock(c, c)
self.res4 = ResBlock(c, c)
self.res5 = ResBlock(c, c)
self.conv1 = nn.Conv2d(c, 2, 3, 1, 1)
self.up = nn.PixelShuffle(2)
def forward(self, x):
if self.scale != 1:
x = F.interpolate(x, scale_factor=1. / self.scale, mode="bilinear",
align_corners=False)
x = self.conv0(x)
x = self.res0(x)
x = self.res1(x)
x = self.res2(x)
x = self.res3(x)
x = self.res4(x)
x = self.res5(x)
x = self.conv1(x)
flow = x # self.up(x)
if self.scale != 1:
flow = F.interpolate(flow, scale_factor=self.scale, mode="bilinear",
align_corners=False)
return flow
class IFNet(nn.Module):
def __init__(self):
super(IFNet, self).__init__()
self.block0 = IFBlock(6, scale=4, c=192)
self.block1 = IFBlock(8, scale=2, c=128)
self.block2 = IFBlock(8, scale=1, c=64)
def forward(self, x):
x = F.interpolate(x, scale_factor=0.5, mode="bilinear",
align_corners=False)
flow0 = self.block0(x)
F1 = flow0
warped_img0 = warp(x[:, :3], F1)
warped_img1 = warp(x[:, 3:], -F1)
flow1 = self.block1(torch.cat((warped_img0, warped_img1, F1), 1))
F2 = (flow0 + flow1)
warped_img0 = warp(x[:, :3], F2)
warped_img1 = warp(x[:, 3:], -F2)
flow2 = self.block2(torch.cat((warped_img0, warped_img1, F2), 1))
F3 = (flow0 + flow1 + flow2)
return F3, [F1, F2, F3]
if __name__ == '__main__':
img0 = torch.zeros(3, 3, 256, 256).float().to(device)
img1 = torch.tensor(np.random.normal(
0, 1, (3, 3, 256, 256))).float().to(device)
imgs = torch.cat((img0, img1), 1)
    flownet = IFNet().to(device)  # move the network onto the same device as the inputs
flow, _ = flownet(imgs)
print(flow.shape)

@@ -0,0 +1,262 @@
import torch
import torch.nn as nn
import numpy as np
from torch.optim import AdamW
import torch.optim as optim
import itertools
from rife_model.warplayer import warp
from torch.nn.parallel import DistributedDataParallel as DDP
from rife_model.IFNet import *
import torch.nn.functional as F
from rife_model.loss import *
def conv(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=True),
nn.PReLU(out_planes)
)
def deconv(in_planes, out_planes, kernel_size=4, stride=2, padding=1):
return nn.Sequential(
torch.nn.ConvTranspose2d(in_channels=in_planes, out_channels=out_planes,
kernel_size=4, stride=2, padding=1, bias=True),
nn.PReLU(out_planes)
)
def conv_woact(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=True),
)
class ResBlock(nn.Module):
def __init__(self, in_planes, out_planes, stride=2):
super(ResBlock, self).__init__()
if in_planes == out_planes and stride == 1:
self.conv0 = nn.Identity()
else:
self.conv0 = nn.Conv2d(in_planes, out_planes,
3, stride, 1, bias=False)
self.conv1 = conv(in_planes, out_planes, 3, stride, 1)
self.conv2 = conv_woact(out_planes, out_planes, 3, 1, 1)
self.relu1 = nn.PReLU(1)
self.relu2 = nn.PReLU(out_planes)
self.fc1 = nn.Conv2d(out_planes, 16, kernel_size=1, bias=False)
self.fc2 = nn.Conv2d(16, out_planes, kernel_size=1, bias=False)
def forward(self, x):
y = self.conv0(x)
x = self.conv1(x)
x = self.conv2(x)
w = x.mean(3, True).mean(2, True)
w = self.relu1(self.fc1(w))
w = torch.sigmoid(self.fc2(w))
x = self.relu2(x * w + y)
return x
c = 16
class ContextNet(nn.Module):
def __init__(self, cFlag):
super(ContextNet, self).__init__()
self.conv1 = ResBlock(3, c)
self.conv2 = ResBlock(c, 2 * c)
self.conv3 = ResBlock(2 * c, 4 * c)
self.conv4 = ResBlock(4 * c, 8 * c)
self.cFlag = cFlag
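    # forward() builds a warped feature pyramid: at each subsequent level the
    # flow is bilinearly downscaled and multiplied by 0.5 so it stays in pixel
    # units of the current feature resolution.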
def forward(self, x, flow):
x = self.conv1(x)
f1 = warp(x, flow, self.cFlag)
x = self.conv2(x)
flow = F.interpolate(flow, scale_factor=0.5, mode="bilinear",
align_corners=False) * 0.5
f2 = warp(x, flow, self.cFlag)
x = self.conv3(x)
flow = F.interpolate(flow, scale_factor=0.5, mode="bilinear",
align_corners=False) * 0.5
f3 = warp(x, flow, self.cFlag)
x = self.conv4(x)
flow = F.interpolate(flow, scale_factor=0.5, mode="bilinear",
align_corners=False) * 0.5
f4 = warp(x, flow, self.cFlag)
return [f1, f2, f3, f4]
class FusionNet(nn.Module):
def __init__(self, cFlag):
super(FusionNet, self).__init__()
self.down0 = ResBlock(8, 2 * c)
self.down1 = ResBlock(4 * c, 4 * c)
self.down2 = ResBlock(8 * c, 8 * c)
self.down3 = ResBlock(16 * c, 16 * c)
self.up0 = deconv(32 * c, 8 * c)
self.up1 = deconv(16 * c, 4 * c)
self.up2 = deconv(8 * c, 2 * c)
self.up3 = deconv(4 * c, c)
self.conv = nn.Conv2d(c, 4, 3, 1, 1)
self.cFlag = cFlag
def forward(self, img0, img1, flow, c0, c1, flow_gt):
warped_img0 = warp(img0, flow, self.cFlag)
warped_img1 = warp(img1, -flow, self.cFlag)
        if flow_gt is None:
            warped_img0_gt, warped_img1_gt = None, None
        else:
            # warp() in rife_model requires the force-CPU flag
            warped_img0_gt = warp(img0, flow_gt[:, :2], self.cFlag)
            warped_img1_gt = warp(img1, flow_gt[:, 2:4], self.cFlag)
s0 = self.down0(torch.cat((warped_img0, warped_img1, flow), 1))
s1 = self.down1(torch.cat((s0, c0[0], c1[0]), 1))
s2 = self.down2(torch.cat((s1, c0[1], c1[1]), 1))
s3 = self.down3(torch.cat((s2, c0[2], c1[2]), 1))
x = self.up0(torch.cat((s3, c0[3], c1[3]), 1))
x = self.up1(torch.cat((x, s2), 1))
x = self.up2(torch.cat((x, s1), 1))
x = self.up3(torch.cat((x, s0), 1))
x = self.conv(x)
return x, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt
class Model:
def __init__(self, c_flag, local_rank=-1):
self.flownet = IFNet(c_flag)
self.contextnet = ContextNet(c_flag)
self.fusionnet = FusionNet(c_flag)
self.device(c_flag)
self.optimG = AdamW(itertools.chain(
self.flownet.parameters(),
self.contextnet.parameters(),
self.fusionnet.parameters()), lr=1e-6, weight_decay=1e-5)
self.schedulerG = optim.lr_scheduler.CyclicLR(
self.optimG, base_lr=1e-6, max_lr=1e-3, step_size_up=8000, cycle_momentum=False)
self.epe = EPE()
self.ter = Ternary(c_flag)
self.sobel = SOBEL(c_flag)
if local_rank != -1:
self.flownet = DDP(self.flownet, device_ids=[
local_rank], output_device=local_rank)
self.contextnet = DDP(self.contextnet, device_ids=[
local_rank], output_device=local_rank)
self.fusionnet = DDP(self.fusionnet, device_ids=[
local_rank], output_device=local_rank)
def train(self):
self.flownet.train()
self.contextnet.train()
self.fusionnet.train()
def eval(self):
self.flownet.eval()
self.contextnet.eval()
self.fusionnet.eval()
def device(self, c_flag):
if torch.cuda.is_available() and not c_flag:
device = torch.device("cuda")
else:
device = torch.device("cpu")
self.flownet.to(device)
self.contextnet.to(device)
self.fusionnet.to(device)
def load_model(self, path, rank=0):
def convert(param):
return {
k.replace("module.", ""): v
for k, v in param.items()
if "module." in k
}
if rank == 0:
self.flownet.load_state_dict(
convert(torch.load('{}/flownet.pkl'.format(path), map_location=torch.device("cpu"))))
self.contextnet.load_state_dict(
convert(torch.load('{}/contextnet.pkl'.format(path), map_location=torch.device("cpu"))))
self.fusionnet.load_state_dict(
convert(torch.load('{}/unet.pkl'.format(path), map_location=torch.device("cpu"))))
def save_model(self, path, rank=0):
if rank == 0:
torch.save(self.flownet.state_dict(),
'{}/flownet.pkl'.format(path))
torch.save(self.contextnet.state_dict(),
'{}/contextnet.pkl'.format(path))
torch.save(self.fusionnet.state_dict(), '{}/unet.pkl'.format(path))
def predict(self, imgs, flow, training=True, flow_gt=None):
img0 = imgs[:, :3]
img1 = imgs[:, 3:]
c0 = self.contextnet(img0, flow)
c1 = self.contextnet(img1, -flow)
flow = F.interpolate(flow, scale_factor=2.0, mode="bilinear",
align_corners=False) * 2.0
refine_output, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt = self.fusionnet(
img0, img1, flow, c0, c1, flow_gt)
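        # refine_output has 4 channels: 0-2 form a residual image mapped to
        # [-1, 1] below, and channel 3 is a blending mask; the prediction is
        # the mask-weighted blend of the two warped frames plus the residual.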
res = torch.sigmoid(refine_output[:, :3]) * 2 - 1
mask = torch.sigmoid(refine_output[:, 3:4])
merged_img = warped_img0 * mask + warped_img1 * (1 - mask)
pred = merged_img + res
pred = torch.clamp(pred, 0, 1)
if training:
return pred, mask, merged_img, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt
else:
return pred
def inference(self, img0, img1):
imgs = torch.cat((img0, img1), 1)
flow, _ = self.flownet(imgs)
return self.predict(imgs, flow, training=False).detach()
def update(self, imgs, gt, learning_rate=0, mul=1, training=True, flow_gt=None):
for param_group in self.optimG.param_groups:
param_group['lr'] = learning_rate
if training:
self.train()
else:
self.eval()
flow, flow_list = self.flownet(imgs)
pred, mask, merged_img, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt = self.predict(
imgs, flow, flow_gt=flow_gt)
loss_ter = self.ter(pred, gt).mean()
if training:
with torch.no_grad():
loss_flow = torch.abs(warped_img0_gt - gt).mean()
loss_mask = torch.abs(
merged_img - gt).sum(1, True).float().detach()
loss_mask = F.interpolate(loss_mask, scale_factor=0.5, mode="bilinear",
align_corners=False).detach()
flow_gt = (F.interpolate(flow_gt, scale_factor=0.5, mode="bilinear",
align_corners=False) * 0.5).detach()
loss_cons = 0
for i in range(3):
loss_cons += self.epe(flow_list[i], flow_gt[:, :2], 1)
loss_cons += self.epe(-flow_list[i], flow_gt[:, 2:4], 1)
loss_cons = loss_cons.mean() * 0.01
else:
loss_cons = torch.tensor([0])
loss_flow = torch.abs(warped_img0 - gt).mean()
loss_mask = 1
loss_l1 = (((pred - gt) ** 2 + 1e-6) ** 0.5).mean()
if training:
self.optimG.zero_grad()
loss_G = loss_l1 + loss_cons + loss_ter
loss_G.backward()
self.optimG.step()
return pred, merged_img, flow, loss_l1, loss_flow, loss_cons, loss_ter, loss_mask
if __name__ == '__main__':
    # Smoke test: Model in this file requires the force-CPU flag, and
    # inference() takes the two frames as separate arguments.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    img0 = torch.zeros(3, 3, 256, 256).float().to(device)
    img1 = torch.tensor(np.random.normal(
        0, 1, (3, 3, 256, 256))).float().to(device)
    model = Model(c_flag=False)
    model.eval()
    print(model.inference(img0, img1).shape)

@@ -0,0 +1,250 @@
import torch
import torch.nn as nn
import numpy as np
from torch.optim import AdamW
import torch.optim as optim
import itertools
from model.warplayer import warp
from torch.nn.parallel import DistributedDataParallel as DDP
from model.IFNet2F import *
import torch.nn.functional as F
from model.loss import *
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
def conv(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=True),
nn.PReLU(out_planes)
)
def deconv(in_planes, out_planes, kernel_size=4, stride=2, padding=1):
return nn.Sequential(
torch.nn.ConvTranspose2d(in_channels=in_planes, out_channels=out_planes,
kernel_size=4, stride=2, padding=1, bias=True),
nn.PReLU(out_planes)
)
def conv_woact(in_planes, out_planes, kernel_size=3, stride=1, padding=1, dilation=1):
return nn.Sequential(
nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride,
padding=padding, dilation=dilation, bias=True),
)
class ResBlock(nn.Module):
def __init__(self, in_planes, out_planes, stride=2):
super(ResBlock, self).__init__()
if in_planes == out_planes and stride == 1:
self.conv0 = nn.Identity()
else:
self.conv0 = nn.Conv2d(in_planes, out_planes,
3, stride, 1, bias=False)
self.conv1 = conv(in_planes, out_planes, 3, stride, 1)
self.conv2 = conv_woact(out_planes, out_planes, 3, 1, 1)
self.relu1 = nn.PReLU(1)
self.relu2 = nn.PReLU(out_planes)
self.fc1 = nn.Conv2d(out_planes, 16, kernel_size=1, bias=False)
self.fc2 = nn.Conv2d(16, out_planes, kernel_size=1, bias=False)
def forward(self, x):
y = self.conv0(x)
x = self.conv1(x)
x = self.conv2(x)
w = x.mean(3, True).mean(2, True)
w = self.relu1(self.fc1(w))
w = torch.sigmoid(self.fc2(w))
x = self.relu2(x * w + y)
return x
c = 16
class ContextNet(nn.Module):
def __init__(self):
super(ContextNet, self).__init__()
self.conv1 = ResBlock(3, c, 1)
self.conv2 = ResBlock(c, 2*c)
self.conv3 = ResBlock(2*c, 4*c)
self.conv4 = ResBlock(4*c, 8*c)
def forward(self, x, flow):
x = self.conv1(x)
f1 = warp(x, flow)
x = self.conv2(x)
flow = F.interpolate(flow, scale_factor=0.5, mode="bilinear",
align_corners=False) * 0.5
f2 = warp(x, flow)
x = self.conv3(x)
flow = F.interpolate(flow, scale_factor=0.5, mode="bilinear",
align_corners=False) * 0.5
f3 = warp(x, flow)
x = self.conv4(x)
flow = F.interpolate(flow, scale_factor=0.5, mode="bilinear",
align_corners=False) * 0.5
f4 = warp(x, flow)
return [f1, f2, f3, f4]
class FusionNet(nn.Module):
def __init__(self):
super(FusionNet, self).__init__()
self.down0 = ResBlock(8, 2*c, 1)
self.down1 = ResBlock(4*c, 4*c)
self.down2 = ResBlock(8*c, 8*c)
self.down3 = ResBlock(16*c, 16*c)
self.up0 = deconv(32*c, 8*c)
self.up1 = deconv(16*c, 4*c)
self.up2 = deconv(8*c, 2*c)
self.up3 = deconv(4*c, c)
self.conv = nn.Conv2d(c, 4, 3, 2, 1)
def forward(self, img0, img1, flow, c0, c1, flow_gt):
warped_img0 = warp(img0, flow)
warped_img1 = warp(img1, -flow)
        if flow_gt is None:
warped_img0_gt, warped_img1_gt = None, None
else:
warped_img0_gt = warp(img0, flow_gt[:, :2])
warped_img1_gt = warp(img1, flow_gt[:, 2:4])
s0 = self.down0(torch.cat((warped_img0, warped_img1, flow), 1))
s1 = self.down1(torch.cat((s0, c0[0], c1[0]), 1))
s2 = self.down2(torch.cat((s1, c0[1], c1[1]), 1))
s3 = self.down3(torch.cat((s2, c0[2], c1[2]), 1))
x = self.up0(torch.cat((s3, c0[3], c1[3]), 1))
x = self.up1(torch.cat((x, s2), 1))
x = self.up2(torch.cat((x, s1), 1))
x = self.up3(torch.cat((x, s0), 1))
x = self.conv(x)
return x, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt
class Model:
def __init__(self, local_rank=-1):
self.flownet = IFNet()
self.contextnet = ContextNet()
self.fusionnet = FusionNet()
self.device()
self.optimG = AdamW(itertools.chain(
self.flownet.parameters(),
self.contextnet.parameters(),
self.fusionnet.parameters()), lr=1e-6, weight_decay=1e-5)
self.schedulerG = optim.lr_scheduler.CyclicLR(
self.optimG, base_lr=1e-6, max_lr=1e-3, step_size_up=8000, cycle_momentum=False)
self.epe = EPE()
self.ter = Ternary()
self.sobel = SOBEL()
if local_rank != -1:
self.flownet = DDP(self.flownet, device_ids=[
local_rank], output_device=local_rank)
self.contextnet = DDP(self.contextnet, device_ids=[
local_rank], output_device=local_rank)
self.fusionnet = DDP(self.fusionnet, device_ids=[
local_rank], output_device=local_rank)
def train(self):
self.flownet.train()
self.contextnet.train()
self.fusionnet.train()
def eval(self):
self.flownet.eval()
self.contextnet.eval()
self.fusionnet.eval()
def device(self):
self.flownet.to(device)
self.contextnet.to(device)
self.fusionnet.to(device)
def load_model(self, path, rank=0):
def convert(param):
return {
k.replace("module.", ""): v
for k, v in param.items()
if "module." in k
}
if rank == 0:
self.flownet.load_state_dict(
convert(torch.load('{}/flownet.pkl'.format(path), map_location=device)))
self.contextnet.load_state_dict(
convert(torch.load('{}/contextnet.pkl'.format(path), map_location=device)))
self.fusionnet.load_state_dict(
convert(torch.load('{}/unet.pkl'.format(path), map_location=device)))
def save_model(self, path, rank=0):
if rank == 0:
torch.save(self.flownet.state_dict(),
'{}/flownet.pkl'.format(path))
torch.save(self.contextnet.state_dict(),
'{}/contextnet.pkl'.format(path))
torch.save(self.fusionnet.state_dict(), '{}/unet.pkl'.format(path))
def predict(self, imgs, flow, training=True, flow_gt=None):
img0 = imgs[:, :3]
img1 = imgs[:, 3:]
flow = F.interpolate(flow, scale_factor=2.0, mode="bilinear",
align_corners=False) * 2.0
c0 = self.contextnet(img0, flow)
c1 = self.contextnet(img1, -flow)
refine_output, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt = self.fusionnet(
img0, img1, flow, c0, c1, flow_gt)
res = torch.sigmoid(refine_output[:, :3]) * 2 - 1
mask = torch.sigmoid(refine_output[:, 3:4])
merged_img = warped_img0 * mask + warped_img1 * (1 - mask)
pred = merged_img + res
pred = torch.clamp(pred, 0, 1)
if training:
return pred, mask, merged_img, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt
else:
return pred
def inference(self, img0, img1):
with torch.no_grad():
imgs = torch.cat((img0, img1), 1)
flow, _ = self.flownet(imgs)
return self.predict(imgs, flow, training=False).detach()
def update(self, imgs, gt, learning_rate=0, mul=1, training=True, flow_gt=None):
for param_group in self.optimG.param_groups:
param_group['lr'] = learning_rate
if training:
self.train()
else:
self.eval()
flow, flow_list = self.flownet(imgs)
pred, mask, merged_img, warped_img0, warped_img1, warped_img0_gt, warped_img1_gt = self.predict(
imgs, flow, flow_gt=flow_gt)
loss_ter = self.ter(pred, gt).mean()
if training:
with torch.no_grad():
loss_flow = torch.abs(warped_img0_gt - gt).mean()
loss_mask = torch.abs(
merged_img - gt).sum(1, True).float().detach()
loss_cons = 0
for i in range(3):
loss_cons += self.epe(flow_list[i], flow_gt[:, :2], 1)
loss_cons += self.epe(-flow_list[i], flow_gt[:, 2:4], 1)
loss_cons = loss_cons.mean() * 0.01
else:
loss_cons = torch.tensor([0])
loss_flow = torch.abs(warped_img0 - gt).mean()
loss_mask = 1
loss_l1 = (((pred - gt) ** 2 + 1e-6) ** 0.5).mean()
if training:
self.optimG.zero_grad()
loss_G = loss_l1 + loss_cons + loss_ter
loss_G.backward()
self.optimG.step()
return pred, merged_img, flow, loss_l1, loss_flow, loss_cons, loss_ter, loss_mask
if __name__ == '__main__':
    img0 = torch.zeros(3, 3, 256, 256).float().to(device)
    img1 = torch.tensor(np.random.normal(
        0, 1, (3, 3, 256, 256))).float().to(device)
    model = Model()
    model.eval()
    # inference() takes the two frames as separate arguments
    print(model.inference(img0, img1).shape)

@@ -0,0 +1,90 @@
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
class EPE(nn.Module):
def __init__(self):
super(EPE, self).__init__()
def forward(self, flow, gt, loss_mask):
loss_map = (flow - gt.detach()) ** 2
loss_map = (loss_map.sum(1, True) + 1e-6) ** 0.5
return (loss_map * loss_mask)
class Ternary(nn.Module):
def __init__(self, cFlag):
super(Ternary, self).__init__()
patch_size = 7
out_channels = patch_size * patch_size
self.w = np.eye(out_channels).reshape(
(patch_size, patch_size, 1, out_channels))
self.w = np.transpose(self.w, (3, 2, 0, 1))
self.device = torch.device("cuda" if torch.cuda.is_available() and not cFlag else "cpu")
self.w = torch.tensor(self.w).float().to(self.device)
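    # transform() computes a soft census transform: the identity-matrix kernel
    # extracts every pixel of the 7x7 neighborhood, the center pixel is
    # subtracted, and the result is normalized; hamming() then compares the
    # two transformed images.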
def transform(self, img):
patches = F.conv2d(img, self.w, padding=3, bias=None)
transf = patches - img
transf_norm = transf / torch.sqrt(0.81 + transf**2)
return transf_norm
def rgb2gray(self, rgb):
r, g, b = rgb[:, 0:1, :, :], rgb[:, 1:2, :, :], rgb[:, 2:3, :, :]
gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
return gray
def hamming(self, t1, t2):
dist = (t1 - t2) ** 2
dist_norm = torch.mean(dist / (0.1 + dist), 1, True)
return dist_norm
def valid_mask(self, t, padding):
n, _, h, w = t.size()
inner = torch.ones(n, 1, h - 2 * padding, w - 2 * padding).type_as(t)
mask = F.pad(inner, [padding] * 4)
return mask
def forward(self, img0, img1):
img0 = self.transform(self.rgb2gray(img0))
img1 = self.transform(self.rgb2gray(img1))
return self.hamming(img0, img1) * self.valid_mask(img0, 1)
class SOBEL(nn.Module):
def __init__(self, cFlag):
super(SOBEL, self).__init__()
self.kernelX = torch.tensor([
[1, 0, -1],
[2, 0, -2],
[1, 0, -1],
]).float()
self.kernelY = self.kernelX.clone().T
self.device = torch.device("cuda" if torch.cuda.is_available() and not cFlag else "cpu")
self.kernelX = self.kernelX.unsqueeze(0).unsqueeze(0).to(self.device)
self.kernelY = self.kernelY.unsqueeze(0).unsqueeze(0).to(self.device)
def forward(self, pred, gt):
N, C, H, W = pred.shape[0], pred.shape[1], pred.shape[2], pred.shape[3]
img_stack = torch.cat(
[pred.reshape(N*C, 1, H, W), gt.reshape(N*C, 1, H, W)], 0)
sobel_stack_x = F.conv2d(img_stack, self.kernelX, padding=1)
sobel_stack_y = F.conv2d(img_stack, self.kernelY, padding=1)
pred_X, gt_X = sobel_stack_x[:N*C], sobel_stack_x[N*C:]
pred_Y, gt_Y = sobel_stack_y[:N*C], sobel_stack_y[N*C:]
L1X, L1Y = torch.abs(pred_X-gt_X), torch.abs(pred_Y-gt_Y)
loss = (L1X+L1Y)
return loss
if __name__ == '__main__':
    # Smoke test: define the device locally (the module-level line above is
    # commented out) and pass the force-CPU flag that Ternary now requires.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    img0 = torch.zeros(3, 3, 256, 256).float().to(device)
    img1 = torch.tensor(np.random.normal(
        0, 1, (3, 3, 256, 256))).float().to(device)
    ternary_loss = Ternary(cFlag=False)
    print(ternary_loss(img0, img1).shape)

@@ -0,0 +1,23 @@
import torch
import torch.nn as nn
backwarp_tenGrid = {}
def warp(tenInput, tenFlow, cFlag):
device = torch.device("cuda" if torch.cuda.is_available() and not cFlag else "cpu")
k = (str(tenFlow.device), str(tenFlow.size()))
if k not in backwarp_tenGrid:
tenHorizontal = torch.linspace(-1.0, 1.0, tenFlow.shape[3]).view(
1, 1, 1, tenFlow.shape[3]).expand(tenFlow.shape[0], -1, tenFlow.shape[2], -1)
tenVertical = torch.linspace(-1.0, 1.0, tenFlow.shape[2]).view(
1, 1, tenFlow.shape[2], 1).expand(tenFlow.shape[0], -1, -1, tenFlow.shape[3])
backwarp_tenGrid[k] = torch.cat(
[tenHorizontal, tenVertical], 1).to(device)
tenFlow = torch.cat([tenFlow[:, 0:1, :, :] / ((tenInput.shape[3] - 1.0) / 2.0),
tenFlow[:, 1:2, :, :] / ((tenInput.shape[2] - 1.0) / 2.0)], 1)
g = (backwarp_tenGrid[k] + tenFlow).permute(0, 2, 3, 1)
return torch.nn.functional.grid_sample(input=tenInput, grid=torch.clamp(g, -1, 1), mode='bilinear',
padding_mode='zeros', align_corners=True)
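
A quick sanity check of `warp` (illustrative, not part of the commit): with zero flow and `align_corners=True`, the sampling grid hits exact pixel centers, so the output should reproduce the input.

```python
# Zero flow should be an identity warp; cFlag=True keeps everything on CPU.
import torch

x = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)
assert torch.allclose(warp(x, flow, cFlag=True), x, atol=1e-6)
```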

@@ -0,0 +1,61 @@
import pickle
import os
import sys
plugin_loc = os.path.dirname(os.path.realpath(__file__)) + '/'
sys.path.extend([plugin_loc + 'ideepcolor'])
import numpy as np
import torch
import cv2
from data import colorize_image as CI
def get_deepcolor(layerimg, layerc=None, cpu_flag=False):
if layerc is not None:
input_ab = cv2.cvtColor(layerc[:, :, 0:3].astype(np.float32) / 255, cv2.COLOR_RGB2LAB)
mask = layerc[:, :, 3] > 0
input_ab = cv2.resize(input_ab, (256, 256))
mask = mask.astype(np.uint8)
mask = cv2.resize(mask, (256, 256))
mask = mask[np.newaxis, :, :]
input_ab = input_ab[:, :, 1:3].transpose((2, 0, 1))
else:
mask = np.zeros((1, 256, 256)) # giving no user points, so mask is all 0's
input_ab = np.zeros((2, 256, 256))
gpu_id = 0 if torch.cuda.is_available() and not cpu_flag else None
if layerimg.shape[2] == 4: # remove alpha channel in image if present
layerimg = layerimg[:, :, 0:3]
colorModel = CI.ColorizeImageTorch(Xd=256)
    colorModel.prep_net(gpu_id, os.path.join(weight_path, 'colorize', 'caffemodel.pth'))  # weight_path is set in __main__ below
colorModel.load_image(layerimg) # load an image
img_out = colorModel.net_forward(input_ab, mask, f=cpu_flag) # run model, returns 256x256 image
img_out_fullres = colorModel.get_img_fullres() # get image at full resolution
return img_out_fullres
if __name__ == "__main__":
config_path = os.path.dirname(os.path.realpath(__file__))
with open(os.path.join(config_path, 'gimp_ml_config.pkl'), 'rb') as file:
data_output = pickle.load(file)
weight_path = data_output["weight_path"]
image1 = cv2.imread(os.path.join(weight_path, '..', "cache0.png"), cv2.IMREAD_UNCHANGED)
image2 = cv2.imread(os.path.join(weight_path, '..', "cache1.png"), cv2.IMREAD_UNCHANGED)
with open(os.path.join(weight_path, '..', 'gimp_ml_run.pkl'), 'rb') as file:
data_output = pickle.load(file)
force_cpu = data_output["force_cpu"]
    # Heuristic: the layer that is mostly fully transparent is taken to be the color-hints layer.
    if image1.shape[2] == 4 and (np.sum(image1 == [0, 0, 0, 0])) / (
            image1.shape[0] * image1.shape[1] * 4) > 0.8:
image2 = image2[:, :, [2, 1, 0]]
image1 = image1[:, :, [2, 1, 0, 3]]
output = get_deepcolor(image2, image1, cpu_flag=force_cpu)
else:
image1 = image1[:, :, [2, 1, 0]]
image2 = image2[:, :, [2, 1, 0, 3]]
output = get_deepcolor(image1, image2, cpu_flag=force_cpu)
cv2.imwrite(os.path.join(weight_path, '..', 'cache.png'), output[:, :, ::-1])
# with open(os.path.join(weight_path, 'gimp_ml_run.pkl'), 'wb') as file:
# pickle.dump({"run_success": True}, file)

@ -27,12 +27,13 @@ def setup_python_weights(install_location=None):
os.mkdir(weight_path)
if os.name == 'nt': # windows
print("Automatic downloading of weights not supported on Windows.")
print("Please downloads weights folder from: \n"
print("\n##########\n1>> Automatic downloading of weights not supported on Windows.")
print("2>> Please downloads weights folder from: \n"
"https://drive.google.com/drive/folders/10IiBO4fuMiGQ-spBStnObbk9R-pGp6u8?usp=sharing")
print("and place in: " + weight_path)
else: # linux
file_path = os.path.dirname(os.path.realpath(__file__))
with open(os.path.join(file_path, 'model_info.csv')) as csv_file:
csv_reader = csv.reader(csv_file, delimiter=',')
headings = next(csv_reader)
line_count = 0
@ -62,9 +63,9 @@ def setup_python_weights(install_location=None):
with open(os.path.join(plugin_loc, 'gimp_ml_config.pkl'), 'wb') as file:
pickle.dump({"python_path": python_path, "weight_path": weight_path}, file)
print("Please add this path to Preferences-->Plug-ins : ",
print("3>> Please add this path to Preferences-->Plug-ins : ",
os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "plugins"))
print("##########\n")
if __name__ == "__main__":
setup_python_weights()

Binary file not shown.

@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2017 Jun-Yan Zhu and Richard Zhang
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

@ -0,0 +1,565 @@
import numpy as np
# import matplotlib.pyplot as plt
# from skimage import color
# from sklearn.cluster import KMeans
import os
import cv2
from scipy.ndimage import zoom  # the scipy.ndimage.interpolation path is deprecated
def create_temp_directory(path_template, N=1e8):
print(path_template)
cur_path = path_template % np.random.randint(0, N)
while(os.path.exists(cur_path)):
cur_path = path_template % np.random.randint(0, N)
print('Creating directory: %s' % cur_path)
os.mkdir(cur_path)
return cur_path
def lab2rgb_transpose(img_l, img_ab):
''' INPUTS
img_l 1xXxX [0,100]
img_ab 2xXxX [-100,100]
OUTPUTS
returned value is XxXx3 '''
pred_lab = np.concatenate((img_l, img_ab), axis=0).transpose((1, 2, 0))
# im = color.lab2rgb(pred_lab)
im = cv2.cvtColor(pred_lab.astype('float32'),cv2.COLOR_LAB2RGB)
pred_rgb = (np.clip(im, 0, 1) * 255).astype('uint8')
return pred_rgb
def rgb2lab_transpose(img_rgb):
''' INPUTS
img_rgb XxXx3
OUTPUTS
returned value is 3xXxX '''
# im=color.rgb2lab(img_rgb)
im = cv2.cvtColor(img_rgb.astype(np.float32)/255, cv2.COLOR_RGB2LAB)
return im.transpose((2, 0, 1))
class ColorizeImageBase():
def __init__(self, Xd=256, Xfullres_max=10000):
self.Xd = Xd
self.img_l_set = False
self.net_set = False
self.Xfullres_max = Xfullres_max # maximum size of maximum dimension
self.img_just_set = False # this will be true whenever image is just loaded
# net_forward can set this to False if they want
def prep_net(self):
        raise Exception("Should be implemented by subclass")
# ***** Image prepping *****
def load_image(self, im):
# rgb image [CxXdxXd]
self.img_rgb_fullres = im.copy()
self._set_img_lab_fullres_()
im = cv2.resize(im, (self.Xd, self.Xd))
self.img_rgb = im.copy()
# self.img_rgb = sp.misc.imresize(plt.imread(input_path),(self.Xd,self.Xd)).transpose((2,0,1))
self.img_l_set = True
# convert into lab space
self._set_img_lab_()
self._set_img_lab_mc_()
def set_image(self, input_image):
self.img_rgb_fullres = input_image.copy()
self._set_img_lab_fullres_()
self.img_l_set = True
self.img_rgb = input_image
# convert into lab space
self._set_img_lab_()
self._set_img_lab_mc_()
def net_forward(self, input_ab, input_mask):
# INPUTS
# ab 2xXxX input color patches (non-normalized)
# mask 1xXxX input mask, indicating which points have been provided
# assumes self.img_l_mc has been set
if(not self.img_l_set):
print('I need to have an image!')
return -1
if(not self.net_set):
print('I need to have a net!')
return -1
self.input_ab = input_ab
self.input_ab_mc = (input_ab - self.ab_mean) / self.ab_norm
self.input_mask = input_mask
self.input_mask_mult = input_mask * self.mask_mult
return 0
def get_result_PSNR(self, result=-1, return_SE_map=False):
if np.array((result)).flatten()[0] == -1:
cur_result = self.get_img_forward()
else:
cur_result = result.copy()
SE_map = (1. * self.img_rgb - cur_result)**2
cur_MSE = np.mean(SE_map)
cur_PSNR = 20 * np.log10(255. / np.sqrt(cur_MSE))
if return_SE_map:
return(cur_PSNR, SE_map)
else:
return cur_PSNR
def get_img_forward(self):
# get image with point estimate
return self.output_rgb
def get_img_gray(self):
# Get black and white image
return lab2rgb_transpose(self.img_l, np.zeros((2, self.Xd, self.Xd)))
def get_img_gray_fullres(self):
# Get black and white image
return lab2rgb_transpose(self.img_l_fullres, np.zeros((2, self.img_l_fullres.shape[1], self.img_l_fullres.shape[2])))
def get_img_fullres(self):
# This assumes self.img_l_fullres, self.output_ab are set.
# Typically, this means that set_image() and net_forward()
# have been called.
# bilinear upsample
zoom_factor = (1, 1. * self.img_l_fullres.shape[1] / self.output_ab.shape[1], 1. * self.img_l_fullres.shape[2] / self.output_ab.shape[2])
output_ab_fullres = zoom(self.output_ab, zoom_factor, order=1)
return lab2rgb_transpose(self.img_l_fullres, output_ab_fullres)
def get_input_img_fullres(self):
zoom_factor = (1, 1. * self.img_l_fullres.shape[1] / self.input_ab.shape[1], 1. * self.img_l_fullres.shape[2] / self.input_ab.shape[2])
input_ab_fullres = zoom(self.input_ab, zoom_factor, order=1)
return lab2rgb_transpose(self.img_l_fullres, input_ab_fullres)
def get_input_img(self):
return lab2rgb_transpose(self.img_l, self.input_ab)
def get_img_mask(self):
# Get black and white image
return lab2rgb_transpose(100. * (1 - self.input_mask), np.zeros((2, self.Xd, self.Xd)))
def get_img_mask_fullres(self):
# Get black and white image
zoom_factor = (1, 1. * self.img_l_fullres.shape[1] / self.input_ab.shape[1], 1. * self.img_l_fullres.shape[2] / self.input_ab.shape[2])
input_mask_fullres = zoom(self.input_mask, zoom_factor, order=0)
return lab2rgb_transpose(100. * (1 - input_mask_fullres), np.zeros((2, input_mask_fullres.shape[1], input_mask_fullres.shape[2])))
def get_sup_img(self):
return lab2rgb_transpose(50 * self.input_mask, self.input_ab)
def get_sup_fullres(self):
zoom_factor = (1, 1. * self.img_l_fullres.shape[1] / self.output_ab.shape[1], 1. * self.img_l_fullres.shape[2] / self.output_ab.shape[2])
input_mask_fullres = zoom(self.input_mask, zoom_factor, order=0)
input_ab_fullres = zoom(self.input_ab, zoom_factor, order=0)
return lab2rgb_transpose(50 * input_mask_fullres, input_ab_fullres)
# ***** Private functions *****
def _set_img_lab_fullres_(self):
        # resize the full-resolution image so that its maximum dimension is within Xfullres_max
Xfullres = self.img_rgb_fullres.shape[0]
Yfullres = self.img_rgb_fullres.shape[1]
if Xfullres > self.Xfullres_max or Yfullres > self.Xfullres_max:
if Xfullres > Yfullres:
zoom_factor = 1. * self.Xfullres_max / Xfullres
else:
zoom_factor = 1. * self.Xfullres_max / Yfullres
self.img_rgb_fullres = zoom(self.img_rgb_fullres, (zoom_factor, zoom_factor, 1), order=1)
self.img_lab_fullres = cv2.cvtColor(self.img_rgb_fullres.astype(np.float32) / 255, cv2.COLOR_RGB2LAB).transpose((2, 0, 1))
# self.img_lab_fullres = color.rgb2lab(self.img_rgb_fullres).transpose((2, 0, 1))
self.img_l_fullres = self.img_lab_fullres[[0], :, :]
self.img_ab_fullres = self.img_lab_fullres[1:, :, :]
def _set_img_lab_(self):
# set self.img_lab from self.im_rgb
self.img_lab = cv2.cvtColor(self.img_rgb.astype(np.float32) / 255, cv2.COLOR_RGB2LAB).transpose((2, 0, 1))
# self.img_lab = color.rgb2lab(self.img_rgb).transpose((2, 0, 1))
self.img_l = self.img_lab[[0], :, :]
self.img_ab = self.img_lab[1:, :, :]
def _set_img_lab_mc_(self):
# set self.img_lab_mc from self.img_lab
# lab image, mean centered [XxYxX]
self.img_lab_mc = self.img_lab / np.array((self.l_norm, self.ab_norm, self.ab_norm))[:, np.newaxis, np.newaxis] - np.array(
(self.l_mean / self.l_norm, self.ab_mean / self.ab_norm, self.ab_mean / self.ab_norm))[:, np.newaxis, np.newaxis]
self._set_img_l_()
def _set_img_l_(self):
self.img_l_mc = self.img_lab_mc[[0], :, :]
self.img_l_set = True
def _set_img_ab_(self):
self.img_ab_mc = self.img_lab_mc[[1, 2], :, :]
def _set_out_ab_(self):
self.output_lab = rgb2lab_transpose(self.output_rgb)
self.output_ab = self.output_lab[1:, :, :]
class ColorizeImageTorch(ColorizeImageBase):
def __init__(self, Xd=256, maskcent=False):
print('ColorizeImageTorch instantiated')
ColorizeImageBase.__init__(self, Xd)
self.l_norm = 1.
self.ab_norm = 1.
self.l_mean = 50.
self.ab_mean = 0.
self.mask_mult = 1.
self.mask_cent = .5 if maskcent else 0
# Load grid properties
self.pts_in_hull = np.array(np.meshgrid(np.arange(-110, 120, 10), np.arange(-110, 120, 10))).reshape((2, 529)).T
# ***** Net preparation *****
def prep_net(self, gpu_id=None, path='', dist=False):
import torch
import pytorch.model as model
print('path = %s' % path)
print('Model set! dist mode? ', dist)
self.net = model.SIGGRAPHGenerator(dist=dist)
state_dict = torch.load(path)
if hasattr(state_dict, '_metadata'):
del state_dict._metadata
# patch InstanceNorm checkpoints prior to 0.4
for key in list(state_dict.keys()): # need to copy keys here because we mutate in loop
self.__patch_instance_norm_state_dict(state_dict, self.net, key.split('.'))
self.net.load_state_dict(state_dict)
        if gpu_id is not None:
self.net.cuda()
self.net.eval()
self.net_set = True
def __patch_instance_norm_state_dict(self, state_dict, module, keys, i=0):
key = keys[i]
if i + 1 == len(keys): # at the end, pointing to a parameter/buffer
if module.__class__.__name__.startswith('InstanceNorm') and \
(key == 'running_mean' or key == 'running_var'):
if getattr(module, key) is None:
state_dict.pop('.'.join(keys))
if module.__class__.__name__.startswith('InstanceNorm') and \
(key == 'num_batches_tracked'):
state_dict.pop('.'.join(keys))
else:
self.__patch_instance_norm_state_dict(state_dict, getattr(module, key), keys, i + 1)
# ***** Call forward *****
def net_forward(self, input_ab, input_mask, f):
# INPUTS
# ab 2xXxX input color patches (non-normalized)
# mask 1xXxX input mask, indicating which points have been provided
# assumes self.img_l_mc has been set
if ColorizeImageBase.net_forward(self, input_ab, input_mask) == -1:
return -1
# net_input_prepped = np.concatenate((self.img_l_mc, self.input_ab_mc, self.input_mask_mult), axis=0)
# return prediction
# self.net.blobs['data_l_ab_mask'].data[...] = net_input_prepped
# embed()
output_ab = self.net.forward(self.img_l_mc, self.input_ab_mc, self.input_mask_mult, self.mask_cent,f)[0, :, :, :].cpu().data.numpy()
self.output_rgb = lab2rgb_transpose(self.img_l, output_ab)
# self.output_rgb = lab2rgb_transpose(self.img_l, self.net.blobs[self.pred_ab_layer].data[0, :, :, :])
self._set_out_ab_()
return self.output_rgb
def get_img_forward(self):
# get image with point estimate
return self.output_rgb
def get_img_gray(self):
# Get black and white image
return lab2rgb_transpose(self.img_l, np.zeros((2, self.Xd, self.Xd)))
class ColorizeImageTorchDist(ColorizeImageTorch):
def __init__(self, Xd=256, maskcent=False):
ColorizeImageTorch.__init__(self, Xd)
self.dist_ab_set = False
self.pts_grid = np.array(np.meshgrid(np.arange(-110, 120, 10), np.arange(-110, 120, 10))).reshape((2, 529)).T
self.in_hull = np.ones(529, dtype=bool)
self.AB = self.pts_grid.shape[0] # 529
self.A = int(np.sqrt(self.AB)) # 23
self.B = int(np.sqrt(self.AB)) # 23
self.dist_ab_full = np.zeros((self.AB, self.Xd, self.Xd))
self.dist_ab_grid = np.zeros((self.A, self.B, self.Xd, self.Xd))
self.dist_entropy = np.zeros((self.Xd, self.Xd))
self.mask_cent = .5 if maskcent else 0
def prep_net(self, gpu_id=None, path='', dist=True, S=.2):
ColorizeImageTorch.prep_net(self, gpu_id=gpu_id, path=path, dist=dist)
# set S somehow
def net_forward(self, input_ab, input_mask):
# INPUTS
# ab 2xXxX input color patches (non-normalized)
# mask 1xXxX input mask, indicating which points have been provided
# assumes self.img_l_mc has been set
# embed()
if ColorizeImageBase.net_forward(self, input_ab, input_mask) == -1:
return -1
# set distribution
(function_return, self.dist_ab) = self.net.forward(self.img_l_mc, self.input_ab_mc, self.input_mask_mult, self.mask_cent)
function_return = function_return[0, :, :, :].cpu().data.numpy()
self.dist_ab = self.dist_ab[0, :, :, :].cpu().data.numpy()
self.dist_ab_set = True
# full grid, ABxXxX, AB = 529
self.dist_ab_full[self.in_hull, :, :] = self.dist_ab
# gridded, AxBxXxX, A = 23
self.dist_ab_grid = self.dist_ab_full.reshape((self.A, self.B, self.Xd, self.Xd))
# return
return function_return
# def get_ab_reccs(self, h, w, K=5, N=25000, return_conf=False):
# ''' Recommended colors at point (h,w)
# Call this after calling net_forward
# '''
# if not self.dist_ab_set:
# print('Need to set prediction first')
# return 0
#
# # randomly sample from pdf
# cmf = np.cumsum(self.dist_ab[:, h, w]) # CMF
# cmf = cmf / cmf[-1]
# cmf_bins = cmf
#
# # randomly sample N points
# rnd_pts = np.random.uniform(low=0, high=1.0, size=N)
# inds = np.digitize(rnd_pts, bins=cmf_bins)
# rnd_pts_ab = self.pts_in_hull[inds, :]
#
# # run k-means
# kmeans = KMeans(n_clusters=K).fit(rnd_pts_ab)
#
# # sort by cluster occupancy
# k_label_cnt = np.histogram(kmeans.labels_, np.arange(0, K + 1))[0]
# k_inds = np.argsort(k_label_cnt, axis=0)[::-1]
#
# cluster_per = 1. * k_label_cnt[k_inds] / N # percentage of points within cluster
# cluster_centers = kmeans.cluster_centers_[k_inds, :] # cluster centers
#
# # cluster_centers = np.random.uniform(low=-100,high=100,size=(N,2))
# if return_conf:
# return cluster_centers, cluster_per
# else:
# return cluster_centers
def compute_entropy(self):
# compute the distribution entropy (really slow right now)
self.dist_entropy = np.sum(self.dist_ab * np.log(self.dist_ab), axis=0)
# def plot_dist_grid(self, h, w):
# # Plots distribution at a given point
# plt.figure()
# plt.imshow(self.dist_ab_grid[:, :, h, w], extent=[-110, 110, 110, -110], interpolation='nearest')
# plt.colorbar()
# plt.ylabel('a')
# plt.xlabel('b')
# def plot_dist_entropy(self):
# # Plots distribution at a given point
# plt.figure()
# plt.imshow(-self.dist_entropy, interpolation='nearest')
# plt.colorbar()
class ColorizeImageCaffe(ColorizeImageBase):
def __init__(self, Xd=256):
print('ColorizeImageCaffe instantiated')
ColorizeImageBase.__init__(self, Xd)
self.l_norm = 1.
self.ab_norm = 1.
self.l_mean = 50.
self.ab_mean = 0.
self.mask_mult = 110.
self.pred_ab_layer = 'pred_ab' # predicted ab layer
# Load grid properties
self.pts_in_hull_path = './data/color_bins/pts_in_hull.npy'
self.pts_in_hull = np.load(self.pts_in_hull_path) # 313x2, in-gamut
# ***** Net preparation *****
def prep_net(self, gpu_id, prototxt_path='', caffemodel_path=''):
import caffe
print('gpu_id = %d, net_path = %s, model_path = %s' % (gpu_id, prototxt_path, caffemodel_path))
if gpu_id == -1:
caffe.set_mode_cpu()
else:
caffe.set_device(gpu_id)
caffe.set_mode_gpu()
self.gpu_id = gpu_id
self.net = caffe.Net(prototxt_path, caffemodel_path, caffe.TEST)
self.net_set = True
# automatically set cluster centers
if len(self.net.params[self.pred_ab_layer][0].data[...].shape) == 4 and self.net.params[self.pred_ab_layer][0].data[...].shape[1] == 313:
print('Setting ab cluster centers in layer: %s' % self.pred_ab_layer)
self.net.params[self.pred_ab_layer][0].data[:, :, 0, 0] = self.pts_in_hull.T
# automatically set upsampling kernel
for layer in self.net._layer_names:
if layer[-3:] == '_us':
print('Setting upsampling layer kernel: %s' % layer)
self.net.params[layer][0].data[:, 0, :, :] = np.array(((.25, .5, .25, 0), (.5, 1., .5, 0), (.25, .5, .25, 0), (0, 0, 0, 0)))[np.newaxis, :, :]
# ***** Call forward *****
def net_forward(self, input_ab, input_mask):
# INPUTS
# ab 2xXxX input color patches (non-normalized)
# mask 1xXxX input mask, indicating which points have been provided
# assumes self.img_l_mc has been set
if ColorizeImageBase.net_forward(self, input_ab, input_mask) == -1:
return -1
net_input_prepped = np.concatenate((self.img_l_mc, self.input_ab_mc, self.input_mask_mult), axis=0)
self.net.blobs['data_l_ab_mask'].data[...] = net_input_prepped
self.net.forward()
# return prediction
self.output_rgb = lab2rgb_transpose(self.img_l, self.net.blobs[self.pred_ab_layer].data[0, :, :, :])
self._set_out_ab_()
return self.output_rgb
def get_img_forward(self):
# get image with point estimate
return self.output_rgb
def get_img_gray(self):
# Get black and white image
return lab2rgb_transpose(self.img_l, np.zeros((2, self.Xd, self.Xd)))
class ColorizeImageCaffeGlobDist(ColorizeImageCaffe):
# Caffe colorization, with additional global histogram as input
def __init__(self, Xd=256):
ColorizeImageCaffe.__init__(self, Xd)
self.glob_mask_mult = 1.
self.glob_layer = 'glob_ab_313_mask'
def net_forward(self, input_ab, input_mask, glob_dist=-1):
# glob_dist is 313 array, or -1
if np.array(glob_dist).flatten()[0] == -1: # run without this, zero it out
self.net.blobs[self.glob_layer].data[0, :-1, 0, 0] = 0.
self.net.blobs[self.glob_layer].data[0, -1, 0, 0] = 0.
else: # run conditioned on global histogram
self.net.blobs[self.glob_layer].data[0, :-1, 0, 0] = glob_dist
self.net.blobs[self.glob_layer].data[0, -1, 0, 0] = self.glob_mask_mult
self.output_rgb = ColorizeImageCaffe.net_forward(self, input_ab, input_mask)
self._set_out_ab_()
return self.output_rgb
class ColorizeImageCaffeDist(ColorizeImageCaffe):
# caffe model which includes distribution prediction
def __init__(self, Xd=256):
ColorizeImageCaffe.__init__(self, Xd)
self.dist_ab_set = False
self.scale_S_layer = 'scale_S'
self.dist_ab_S_layer = 'dist_ab_S' # softened distribution layer
self.pts_grid = np.load('./data/color_bins/pts_grid.npy') # 529x2, all points
self.in_hull = np.load('./data/color_bins/in_hull.npy') # 529 bool
self.AB = self.pts_grid.shape[0] # 529
self.A = int(np.sqrt(self.AB)) # 23
self.B = int(np.sqrt(self.AB)) # 23
self.dist_ab_full = np.zeros((self.AB, self.Xd, self.Xd))
self.dist_ab_grid = np.zeros((self.A, self.B, self.Xd, self.Xd))
self.dist_entropy = np.zeros((self.Xd, self.Xd))
def prep_net(self, gpu_id, prototxt_path='', caffemodel_path='', S=.2):
ColorizeImageCaffe.prep_net(self, gpu_id, prototxt_path=prototxt_path, caffemodel_path=caffemodel_path)
self.S = S
self.net.params[self.scale_S_layer][0].data[...] = S
def net_forward(self, input_ab, input_mask):
# INPUTS
# ab 2xXxX input color patches (non-normalized)
# mask 1xXxX input mask, indicating which points have been provided
# assumes self.img_l_mc has been set
function_return = ColorizeImageCaffe.net_forward(self, input_ab, input_mask)
if np.array(function_return).flatten()[0] == -1: # errored out
return -1
# set distribution
# in-gamut, CxXxX, C = 313
self.dist_ab = self.net.blobs[self.dist_ab_S_layer].data[0, :, :, :]
self.dist_ab_set = True
# full grid, ABxXxX, AB = 529
self.dist_ab_full[self.in_hull, :, :] = self.dist_ab
# gridded, AxBxXxX, A = 23
self.dist_ab_grid = self.dist_ab_full.reshape((self.A, self.B, self.Xd, self.Xd))
# return
return function_return
# def get_ab_reccs(self, h, w, K=5, N=25000, return_conf=False):
# ''' Recommended colors at point (h,w)
# Call this after calling net_forward
# '''
# if not self.dist_ab_set:
# print('Need to set prediction first')
# return 0
#
# # randomly sample from pdf
# cmf = np.cumsum(self.dist_ab[:, h, w]) # CMF
# cmf = cmf / cmf[-1]
# cmf_bins = cmf
#
# # randomly sample N points
# rnd_pts = np.random.uniform(low=0, high=1.0, size=N)
# inds = np.digitize(rnd_pts, bins=cmf_bins)
# rnd_pts_ab = self.pts_in_hull[inds, :]
#
# # run k-means
# kmeans = KMeans(n_clusters=K).fit(rnd_pts_ab)
#
# # sort by cluster occupancy
# k_label_cnt = np.histogram(kmeans.labels_, np.arange(0, K + 1))[0]
# k_inds = np.argsort(k_label_cnt, axis=0)[::-1]
#
# cluster_per = 1. * k_label_cnt[k_inds] / N # percentage of points within cluster
# cluster_centers = kmeans.cluster_centers_[k_inds, :] # cluster centers
#
# # cluster_centers = np.random.uniform(low=-100,high=100,size=(N,2))
# if return_conf:
# return cluster_centers, cluster_per
# else:
# return cluster_centers
def compute_entropy(self):
# compute the distribution entropy (really slow right now)
self.dist_entropy = np.sum(self.dist_ab * np.log(self.dist_ab), axis=0)
# def plot_dist_grid(self, h, w):
# Plots distribution at a given point
# plt.figure()
# plt.imshow(self.dist_ab_grid[:, :, h, w], extent=[-110, 110, 110, -110], interpolation='nearest')
# plt.colorbar()
# plt.ylabel('a')
# plt.xlabel('b')
# def plot_dist_entropy(self):
# Plots distribution at a given point
# plt.figure()
# plt.imshow(-self.dist_entropy, interpolation='nearest')
# plt.colorbar()
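# Hedged end-to-end sketch (assumptions: the weight path and input file below
# are hypothetical, and 'pytorch.model' must be importable). Mirrors how the
# GIMP plugin drives this module with no user color hints.
if __name__ == "__main__":
    cm = ColorizeImageTorch(Xd=256)
    cm.prep_net(None, path='models/colorize/caffemodel.pth')  # hypothetical path
    cm.load_image(cv2.imread('input.png')[:, :, ::-1])        # hypothetical file, BGR -> RGB
    cm.net_forward(np.zeros((2, 256, 256)), np.zeros((1, 256, 256)), f=True)  # f=True forces CPU
    cv2.imwrite('colorized.png', cm.get_img_fullres()[:, :, ::-1])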

@ -0,0 +1,90 @@
import numpy as np
from skimage import color
import warnings
def qcolor2lab_1d(qc):
# take 1d numpy array and do color conversion
c = np.array([qc.red(), qc.green(), qc.blue()], np.uint8)
return rgb2lab_1d(c)
def rgb2lab_1d(in_rgb):
# take 1d numpy array and do color conversion
# print('in_rgb', in_rgb)
return color.rgb2lab(in_rgb[np.newaxis, np.newaxis, :]).flatten()
def lab2rgb_1d(in_lab, clip=True, dtype='uint8'):
warnings.filterwarnings("ignore")
tmp_rgb = color.lab2rgb(in_lab[np.newaxis, np.newaxis, :]).flatten()
if clip:
tmp_rgb = np.clip(tmp_rgb, 0, 1)
if dtype == 'uint8':
tmp_rgb = np.round(tmp_rgb * 255).astype('uint8')
return tmp_rgb
def snap_ab(input_l, input_rgb, return_type='rgb'):
''' given an input lightness and rgb, snap the color into a region where l,a,b is in-gamut
'''
T = 20
warnings.filterwarnings("ignore")
input_lab = rgb2lab_1d(np.array(input_rgb)) # convert input to lab
conv_lab = input_lab.copy() # keep ab from input
for t in range(T):
        conv_lab[0] = input_l  # overwrite the lightness channel, keeping ab from the input
old_lab = conv_lab
tmp_rgb = color.lab2rgb(conv_lab[np.newaxis, np.newaxis, :]).flatten()
tmp_rgb = np.clip(tmp_rgb, 0, 1)
conv_lab = color.rgb2lab(tmp_rgb[np.newaxis, np.newaxis, :]).flatten()
dif_lab = np.sum(np.abs(conv_lab - old_lab))
if dif_lab < 1:
break
# print(conv_lab)
conv_rgb_ingamut = lab2rgb_1d(conv_lab, clip=True, dtype='uint8')
if (return_type == 'rgb'):
return conv_rgb_ingamut
elif(return_type == 'lab'):
conv_lab_ingamut = rgb2lab_1d(conv_rgb_ingamut)
return conv_lab_ingamut
class abGrid():
def __init__(self, gamut_size=110, D=1):
self.D = D
self.vals_b, self.vals_a = np.meshgrid(np.arange(-gamut_size, gamut_size + D, D),
np.arange(-gamut_size, gamut_size + D, D))
self.pts_full_grid = np.concatenate((self.vals_a[:, :, np.newaxis], self.vals_b[:, :, np.newaxis]), axis=2)
self.A = self.pts_full_grid.shape[0]
self.B = self.pts_full_grid.shape[1]
self.AB = self.A * self.B
self.gamut_size = gamut_size
def update_gamut(self, l_in):
warnings.filterwarnings("ignore")
thresh = 1.0
pts_lab = np.concatenate((l_in + np.zeros((self.A, self.B, 1)), self.pts_full_grid), axis=2)
self.pts_rgb = (255 * np.clip(color.lab2rgb(pts_lab), 0, 1)).astype('uint8')
pts_lab_back = color.rgb2lab(self.pts_rgb)
pts_lab_diff = np.linalg.norm(pts_lab - pts_lab_back, axis=2)
self.mask = pts_lab_diff < thresh
mask3 = np.tile(self.mask[..., np.newaxis], [1, 1, 3])
self.masked_rgb = self.pts_rgb.copy()
self.masked_rgb[np.invert(mask3)] = 255
return self.masked_rgb, self.mask
def ab2xy(self, a, b):
y = self.gamut_size + a
x = self.gamut_size + b
# print('ab2xy (%d, %d) -> (%d, %d)' % (a, b, x, y))
return x, y
def xy2ab(self, x, y):
a = y - self.gamut_size
b = x - self.gamut_size
# print('xy2ab (%d, %d) -> (%d, %d)' % (x, y, a, b))
return a, b
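# Hedged usage sketch (an assumption): snap a saturated color into gamut at a
# given lightness, then build the ab gamut mask used by the color-picker UI.
if __name__ == "__main__":
    print(snap_ab(80, np.array([255, 0, 0], np.uint8)))  # in-gamut RGB at L = 80
    grid = abGrid(gamut_size=110, D=10)
    masked_rgb, mask = grid.update_gamut(l_in=50)
    print(masked_rgb.shape, int(mask.sum()))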

@ -0,0 +1,177 @@
import torch
import torch.nn as nn
class SIGGRAPHGenerator(nn.Module):
def __init__(self, dist=False):
super(SIGGRAPHGenerator, self).__init__()
self.dist = dist
use_bias = True
norm_layer = nn.BatchNorm2d
# Conv1
model1 = [nn.Conv2d(4, 64, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model1 += [nn.ReLU(True), ]
model1 += [nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model1 += [nn.ReLU(True), ]
model1 += [norm_layer(64), ]
# add a subsampling operation
# Conv2
model2 = [nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model2 += [nn.ReLU(True), ]
model2 += [nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model2 += [nn.ReLU(True), ]
model2 += [norm_layer(128), ]
# add a subsampling layer operation
# Conv3
model3 = [nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model3 += [nn.ReLU(True), ]
model3 += [nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model3 += [nn.ReLU(True), ]
model3 += [nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model3 += [nn.ReLU(True), ]
model3 += [norm_layer(256), ]
# add a subsampling layer operation
# Conv4
model4 = [nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model4 += [nn.ReLU(True), ]
model4 += [nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model4 += [nn.ReLU(True), ]
model4 += [nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model4 += [nn.ReLU(True), ]
model4 += [norm_layer(512), ]
# Conv5
model5 = [nn.Conv2d(512, 512, kernel_size=3, dilation=2, stride=1, padding=2, bias=use_bias), ]
model5 += [nn.ReLU(True), ]
model5 += [nn.Conv2d(512, 512, kernel_size=3, dilation=2, stride=1, padding=2, bias=use_bias), ]
model5 += [nn.ReLU(True), ]
model5 += [nn.Conv2d(512, 512, kernel_size=3, dilation=2, stride=1, padding=2, bias=use_bias), ]
model5 += [nn.ReLU(True), ]
model5 += [norm_layer(512), ]
# Conv6
model6 = [nn.Conv2d(512, 512, kernel_size=3, dilation=2, stride=1, padding=2, bias=use_bias), ]
model6 += [nn.ReLU(True), ]
model6 += [nn.Conv2d(512, 512, kernel_size=3, dilation=2, stride=1, padding=2, bias=use_bias), ]
model6 += [nn.ReLU(True), ]
model6 += [nn.Conv2d(512, 512, kernel_size=3, dilation=2, stride=1, padding=2, bias=use_bias), ]
model6 += [nn.ReLU(True), ]
model6 += [norm_layer(512), ]
# Conv7
model7 = [nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model7 += [nn.ReLU(True), ]
model7 += [nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model7 += [nn.ReLU(True), ]
model7 += [nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model7 += [nn.ReLU(True), ]
model7 += [norm_layer(512), ]
        # Conv8
model8up = [nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1, bias=use_bias)]
model3short8 = [nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model8 = [nn.ReLU(True), ]
model8 += [nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model8 += [nn.ReLU(True), ]
model8 += [nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model8 += [nn.ReLU(True), ]
model8 += [norm_layer(256), ]
# Conv9
model9up = [nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1, bias=use_bias), ]
model2short9 = [nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
# add the two feature maps above
model9 = [nn.ReLU(True), ]
model9 += [nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
model9 += [nn.ReLU(True), ]
model9 += [norm_layer(128), ]
# Conv10
model10up = [nn.ConvTranspose2d(128, 128, kernel_size=4, stride=2, padding=1, bias=use_bias), ]
model1short10 = [nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1, bias=use_bias), ]
# add the two feature maps above
model10 = [nn.ReLU(True), ]
model10 += [nn.Conv2d(128, 128, kernel_size=3, dilation=1, stride=1, padding=1, bias=use_bias), ]
model10 += [nn.LeakyReLU(negative_slope=.2), ]
# classification output
model_class = [nn.Conv2d(256, 529, kernel_size=1, padding=0, dilation=1, stride=1, bias=use_bias), ]
# regression output
model_out = [nn.Conv2d(128, 2, kernel_size=1, padding=0, dilation=1, stride=1, bias=use_bias), ]
model_out += [nn.Tanh()]
self.model1 = nn.Sequential(*model1)
self.model2 = nn.Sequential(*model2)
self.model3 = nn.Sequential(*model3)
self.model4 = nn.Sequential(*model4)
self.model5 = nn.Sequential(*model5)
self.model6 = nn.Sequential(*model6)
self.model7 = nn.Sequential(*model7)
self.model8up = nn.Sequential(*model8up)
self.model8 = nn.Sequential(*model8)
self.model9up = nn.Sequential(*model9up)
self.model9 = nn.Sequential(*model9)
self.model10up = nn.Sequential(*model10up)
self.model10 = nn.Sequential(*model10)
self.model3short8 = nn.Sequential(*model3short8)
self.model2short9 = nn.Sequential(*model2short9)
self.model1short10 = nn.Sequential(*model1short10)
self.model_class = nn.Sequential(*model_class)
self.model_out = nn.Sequential(*model_out)
self.upsample4 = nn.Sequential(*[nn.Upsample(scale_factor=4, mode='nearest'), ])
self.softmax = nn.Sequential(*[nn.Softmax(dim=1), ])
def forward(self, input_A, input_B, mask_B, maskcent=0,f=False):
# input_A \in [-50,+50]
# input_B \in [-110, +110]
# mask_B \in [0, +1.0]
input_A = torch.Tensor(input_A)[None, :, :, :]
input_B = torch.Tensor(input_B)[None, :, :, :]
mask_B = torch.Tensor(mask_B)[None, :, :, :]
if torch.cuda.is_available() and not f:
input_A = input_A.cuda()
input_B = input_B.cuda()
mask_B = mask_B.cuda()
mask_B = mask_B - maskcent
conv1_2 = self.model1(torch.cat((input_A / 100., input_B / 110., mask_B), dim=1))
conv2_2 = self.model2(conv1_2[:, :, ::2, ::2])
conv3_3 = self.model3(conv2_2[:, :, ::2, ::2])
conv4_3 = self.model4(conv3_3[:, :, ::2, ::2])
conv5_3 = self.model5(conv4_3)
conv6_3 = self.model6(conv5_3)
conv7_3 = self.model7(conv6_3)
conv8_up = self.model8up(conv7_3) + self.model3short8(conv3_3)
conv8_3 = self.model8(conv8_up)
if(self.dist):
out_cl = self.upsample4(self.softmax(self.model_class(conv8_3) * .2))
conv9_up = self.model9up(conv8_3) + self.model2short9(conv2_2)
conv9_3 = self.model9(conv9_up)
conv10_up = self.model10up(conv9_3) + self.model1short10(conv1_2)
conv10_2 = self.model10(conv10_up)
            out_reg = self.model_out(conv10_2)
            # the tanh output in [-1, 1] is scaled once to the ab range [-110, 110]
            return (out_reg * 110, out_cl)
else:
conv9_up = self.model9up(conv8_3) + self.model2short9(conv2_2)
conv9_3 = self.model9(conv9_up)
conv10_up = self.model10up(conv9_3) + self.model1short10(conv1_2)
conv10_2 = self.model10(conv10_up)
out_reg = self.model_out(conv10_2)
return out_reg * 110
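# Minimal CPU smoke test (an assumption, not part of the plugin): run the
# generator on a blank 256x256 L channel with no user hints.
if __name__ == "__main__":
    import numpy as np
    net_g = SIGGRAPHGenerator(dist=False)
    net_g.eval()
    l_chan = np.zeros((1, 256, 256), np.float32)    # L channel
    ab_hints = np.zeros((2, 256, 256), np.float32)  # no color points
    hint_mask = np.zeros((1, 256, 256), np.float32)
    with torch.no_grad():
        out_ab = net_g(l_chan, ab_hints, hint_mask, maskcent=0, f=True)  # f=True forces CPU
    print(out_ab.shape)  # torch.Size([1, 2, 256, 256])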

@ -0,0 +1,85 @@
import pickle
import os
import sys
plugin_loc = os.path.dirname(os.path.realpath(__file__)) + '/'
sys.path.extend([plugin_loc + 'RIFE'])
import cv2
import torch
from torch.nn import functional as F
from rife_model import RIFE
import numpy as np
def get_inter(img_s, img_e, string_path, cpu_flag=False):
exp = 4
out_path = string_path
model = RIFE.Model(cpu_flag)
model.load_model(os.path.join(weight_path, 'interpolateframes'))
model.eval()
model.device(cpu_flag)
img0 = img_s
img1 = img_e
img0 = (torch.tensor(img0.transpose(2, 0, 1).copy()) / 255.).unsqueeze(0)
img1 = (torch.tensor(img1.transpose(2, 0, 1).copy()) / 255.).unsqueeze(0)
if torch.cuda.is_available() and not cpu_flag:
device = torch.device("cuda")
else:
device = torch.device("cpu")
img0 = img0.to(device)
img1 = img1.to(device)
n, c, h, w = img0.shape
ph = ((h - 1) // 32 + 1) * 32
pw = ((w - 1) // 32 + 1) * 32
padding = (0, pw - w, 0, ph - h)
img0 = F.pad(img0, padding)
img1 = F.pad(img1, padding)
img_list = [img0, img1]
idx = 0
    # each pass inserts a midpoint between every consecutive pair, so exp
    # passes generate (2 ** exp - 1) frames per original pair
    t = (2 ** exp - 1) * (len(img_list) - 1)
for i in range(exp):
tmp = []
for j in range(len(img_list) - 1):
mid = model.inference(img_list[j], img_list[j + 1])
tmp.append(img_list[j])
tmp.append(mid)
idx = idx + 1
            try:
                # gimp is only defined when run inside the GIMP Python runtime;
                # standalone runs skip the progress updates silently
                gimp.progress_update(float(idx) / float(t))
                gimp.displays_flush()
            except:
                pass
tmp.append(img1)
img_list = tmp
if not os.path.exists(out_path):
os.makedirs(out_path)
for i in range(len(img_list)):
cv2.imwrite(os.path.join(out_path, 'img{}.png'.format(i)),
(img_list[i][0] * 255).byte().cpu().numpy().transpose(1, 2, 0)[:h, :w, ::-1])
if __name__ == "__main__":
config_path = os.path.dirname(os.path.realpath(__file__))
with open(os.path.join(config_path, 'gimp_ml_config.pkl'), 'rb') as file:
data_output = pickle.load(file)
weight_path = data_output["weight_path"]
image1 = cv2.imread(os.path.join(weight_path, '..', "cache0.png"))[:, :, ::-1]
image2 = cv2.imread(os.path.join(weight_path, '..', "cache1.png"))[:, :, ::-1]
with open(os.path.join(weight_path, '..', 'gimp_ml_run.pkl'), 'rb') as file:
data_output = pickle.load(file)
force_cpu = data_output["force_cpu"]
gio_file = data_output["gio_file"]
get_inter(image1, image2, gio_file, cpu_flag=force_cpu)
# cv2.imwrite(os.path.join(weight_path, '..', 'cache.png'), output[:, :, [2, 1, 0, 3]])
# with open(os.path.join(weight_path, 'gimp_ml_run.pkl'), 'wb') as file:
# pickle.dump({"run_success": True}, file)

@ -0,0 +1,76 @@
import pickle
import os
import sys
plugin_loc = os.path.dirname(os.path.realpath(__file__)) + '/'
sys.path.extend([plugin_loc + 'pytorch-deep-image-matting'])
import torch
from argparse import Namespace
import deepmatting_net
import cv2
import os
import numpy as np
from deploy import inference_img_whole
def get_matting(image, mask, cpu_flag=False):
if image.shape[2] == 4: # get rid of alpha channel
image = image[:, :, 0:3]
if mask.shape[2] == 4: # get rid of alpha channel
mask = mask[:, :, 0:3]
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
trimap = mask[:, :, 0]
cudaFlag = False
if torch.cuda.is_available() and not cpu_flag:
cudaFlag = True
args = Namespace(crop_or_resize='whole', cuda=cudaFlag, max_size=1600,
resume=os.path.join(weight_path, 'deepmatting', 'stage1_sad_57.1.pth'), stage=1)
model = deepmatting_net.VGG16(args)
if cudaFlag:
ckpt = torch.load(args.resume)
else:
ckpt = torch.load(args.resume, map_location=torch.device("cpu"))
model.load_state_dict(ckpt['state_dict'], strict=True)
if cudaFlag:
model = model.cuda()
# ckpt = torch.load(args.resume)
# model.load_state_dict(ckpt['state_dict'], strict=True)
# model = model.cuda()
torch.cuda.empty_cache()
with torch.no_grad():
pred_mattes = inference_img_whole(args, model, image, trimap)
pred_mattes = (pred_mattes * 255).astype(np.uint8)
pred_mattes[trimap == 255] = 255
pred_mattes[trimap == 0] = 0
# pred_mattes = np.repeat(pred_mattes[:, :, np.newaxis], 3, axis=2)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
pred_mattes = np.dstack((image, pred_mattes))
return pred_mattes
if __name__ == "__main__":
config_path = os.path.dirname(os.path.realpath(__file__))
with open(os.path.join(config_path, 'gimp_ml_config.pkl'), 'rb') as file:
data_output = pickle.load(file)
weight_path = data_output["weight_path"]
image1 = cv2.imread(os.path.join(weight_path, '..', "cache0.png"))[:, :, ::-1]
image2 = cv2.imread(os.path.join(weight_path, '..', "cache1.png"))[:, :, ::-1]
with open(os.path.join(weight_path, '..', 'gimp_ml_run.pkl'), 'rb') as file:
data_output = pickle.load(file)
force_cpu = data_output["force_cpu"]
    # Heuristic: if image1 is mostly pure black/white/gray pixels, treat it as the trimap.
    if (np.sum(image1 == [0, 0, 0]) + np.sum(image1 == [255, 255, 255]) + np.sum(image1 == [128, 128, 128])) / (
            image1.shape[0] * image1.shape[1] * 3) > 0.8:
output = get_matting(image2, image1, cpu_flag=force_cpu)
else:
output = get_matting(image1, image2, cpu_flag=force_cpu)
cv2.imwrite(os.path.join(weight_path, '..', 'cache.png'), output[:, :, [2, 1, 0, 3]])
# with open(os.path.join(weight_path, 'gimp_ml_run.pkl'), 'wb') as file:
# pickle.dump({"run_success": True}, file)

@ -0,0 +1,194 @@
import torch
import cv2
import os
import random
import numpy as np
from torchvision import transforms
import logging
def gen_trimap(alpha):
k_size = random.choice(range(2, 5))
iterations = np.random.randint(5, 15)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (k_size, k_size))
dilated = cv2.dilate(alpha, kernel, iterations=iterations)
eroded = cv2.erode(alpha, kernel, iterations=iterations)
trimap = np.zeros(alpha.shape)
trimap.fill(128)
#trimap[alpha >= 255] = 255
trimap[eroded >= 255] = 255
trimap[dilated <= 0] = 0
'''
alpha_unknown = alpha[trimap == 128]
num_all = alpha_unknown.size
num_0 = (alpha_unknown == 0).sum()
num_1 = (alpha_unknown == 255).sum()
print("Debug: 0 : {}/{} {:.3f}".format(num_0, num_all, float(num_0)/num_all))
print("Debug: 255: {}/{} {:.3f}".format(num_1, num_all, float(num_1)/num_all))
'''
return trimap
def compute_gradient(img):
x = cv2.Sobel(img, cv2.CV_16S, 1, 0)
y = cv2.Sobel(img, cv2.CV_16S, 0, 1)
absX = cv2.convertScaleAbs(x)
absY = cv2.convertScaleAbs(y)
grad = cv2.addWeighted(absX, 0.5, absY, 0.5, 0)
grad=cv2.cvtColor(grad, cv2.COLOR_BGR2GRAY)
return grad
class MatTransform(object):
def __init__(self, flip=False):
self.flip = flip
def __call__(self, img, alpha, fg, bg, crop_h, crop_w):
h, w = alpha.shape
# trimap is dilated maybe choose some bg region(0)
# random crop in the unknown region center
target = np.where((alpha > 0) & (alpha < 255))
delta_h = center_h = crop_h / 2
delta_w = center_w = crop_w / 2
if len(target[0]) > 0:
rand_ind = np.random.randint(len(target[0]))
center_h = min(max(target[0][rand_ind], delta_h), h - delta_h)
center_w = min(max(target[1][rand_ind], delta_w), w - delta_w)
# choose unknown point as center not as left-top
start_h = int(center_h - delta_h)
start_w = int(center_w - delta_w)
end_h = int(center_h + delta_h)
end_w = int(center_w + delta_w)
#print("Debug: center({},{}) start({},{}) end({},{}) alpha:{} alpha-len:{} unknown-len:{}".format(center_h, center_w, start_h, start_w, end_h, end_w, alpha[int(center_h), int(center_w)], alpha.size, len(target[0])))
img = img [start_h : end_h, start_w : end_w]
fg = fg [start_h : end_h, start_w : end_w]
bg = bg [start_h : end_h, start_w : end_w]
alpha = alpha [start_h : end_h, start_w : end_w]
# random flip
if self.flip and random.random() < 0.5:
img = cv2.flip(img, 1)
alpha = cv2.flip(alpha, 1)
fg = cv2.flip(fg, 1)
bg = cv2.flip(bg, 1)
return img, alpha, fg, bg
def get_files(mydir):
res = []
for root, dirs, files in os.walk(mydir, followlinks=True):
for f in files:
if f.endswith(".jpg") or f.endswith(".png") or f.endswith(".jpeg") or f.endswith(".JPG"):
res.append(os.path.join(root, f))
return res
# Dataset not composite online
class MatDatasetOffline(torch.utils.data.Dataset):
def __init__(self, args, transform=None, normalize=None):
self.samples=[]
self.transform = transform
self.normalize = normalize
self.args = args
self.size_h = args.size_h
self.size_w = args.size_w
self.crop_h = args.crop_h
self.crop_w = args.crop_w
self.logger = logging.getLogger("DeepImageMatting")
assert(len(self.crop_h) == len(self.crop_w))
fg_paths = get_files(self.args.fgDir)
self.cnt = len(fg_paths)
for fg_path in fg_paths:
alpha_path = fg_path.replace(self.args.fgDir, self.args.alphaDir)
img_path = fg_path.replace(self.args.fgDir, self.args.imgDir)
bg_path = fg_path.replace(self.args.fgDir, self.args.bgDir)
assert(os.path.exists(alpha_path))
assert(os.path.exists(fg_path))
assert(os.path.exists(bg_path))
assert(os.path.exists(img_path))
self.samples.append((alpha_path, fg_path, bg_path, img_path))
self.logger.info("MatDatasetOffline Samples: {}".format(self.cnt))
assert(self.cnt > 0)
def __getitem__(self,index):
alpha_path, fg_path, bg_path, img_path = self.samples[index]
img_info = [fg_path, alpha_path, bg_path, img_path]
# read fg, alpha
fg = cv2.imread(fg_path)[:, :, :3]
bg = cv2.imread(bg_path)[:, :, :3]
img = cv2.imread(img_path)[:, :, :3]
alpha = cv2.imread(alpha_path)[:, :, 0]
assert(bg.shape == fg.shape and bg.shape == img.shape)
img_info.append(fg.shape)
        bh, bw, bc = fg.shape
rand_ind = random.randint(0, len(self.crop_h) - 1)
cur_crop_h = self.crop_h[rand_ind]
cur_crop_w = self.crop_w[rand_ind]
        # if needed, upscale so (h == crop_h and w >= crop_w) or (w == crop_w and h >= crop_h)
wratio = float(cur_crop_w) / bw
hratio = float(cur_crop_h) / bh
ratio = wratio if wratio > hratio else hratio
if ratio > 1:
nbw = int(bw * ratio + 1.0)
nbh = int(bh * ratio + 1.0)
fg = cv2.resize(fg, (nbw, nbh), interpolation=cv2.INTER_LINEAR)
bg = cv2.resize(bg, (nbw, nbh), interpolation=cv2.INTER_LINEAR)
img = cv2.resize(img, (nbw, nbh), interpolation=cv2.INTER_LINEAR)
alpha = cv2.resize(alpha, (nbw, nbh), interpolation=cv2.INTER_LINEAR)
# random crop(crop_h, crop_w) and flip
if self.transform:
img, alpha, fg, bg = self.transform(img, alpha, fg, bg, cur_crop_h, cur_crop_w)
# resize to (size_h, size_w)
if self.size_h != img.shape[0] or self.size_w != img.shape[1]:
# resize
img =cv2.resize(img, (self.size_w, self.size_h), interpolation=cv2.INTER_LINEAR)
fg =cv2.resize(fg, (self.size_w, self.size_h), interpolation=cv2.INTER_LINEAR)
bg =cv2.resize(bg, (self.size_w, self.size_h), interpolation=cv2.INTER_LINEAR)
alpha =cv2.resize(alpha, (self.size_w, self.size_h), interpolation=cv2.INTER_LINEAR)
trimap = gen_trimap(alpha)
grad = compute_gradient(img)
if self.normalize:
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# first, 0-255 to 0-1
# second, x-mean/std and HWC to CHW
img_norm = self.normalize(img_rgb)
else:
img_norm = None
#img_id = img_info[0].split('/')[-1]
#cv2.imwrite("result/debug/{}_img.png".format(img_id), img)
#cv2.imwrite("result/debug/{}_alpha.png".format(img_id), alpha)
#cv2.imwrite("result/debug/{}_fg.png".format(img_id), fg)
#cv2.imwrite("result/debug/{}_bg.png".format(img_id), bg)
#cv2.imwrite("result/debug/{}_trimap.png".format(img_id), trimap)
#cv2.imwrite("result/debug/{}_grad.png".format(img_id), grad)
alpha = torch.from_numpy(alpha.astype(np.float32)[np.newaxis, :, :])
trimap = torch.from_numpy(trimap.astype(np.float32)[np.newaxis, :, :])
grad = torch.from_numpy(grad.astype(np.float32)[np.newaxis, :, :])
img = torch.from_numpy(img.astype(np.float32)).permute(2, 0, 1)
fg = torch.from_numpy(fg.astype(np.float32)).permute(2, 0, 1)
bg = torch.from_numpy(bg.astype(np.float32)).permute(2, 0, 1)
return img, alpha, fg, bg, trimap, grad, img_norm, img_info
def __len__(self):
return len(self.samples)
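# Hedged usage sketch (an assumption): generate a trimap from a synthetic
# alpha matte; 0/128/255 mark background / unknown band / foreground.
if __name__ == "__main__":
    alpha = np.zeros((64, 64), np.uint8)
    alpha[8:56, 8:56] = 255
    print(np.unique(gen_trimap(alpha)))  # typically [0., 128., 255.]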

@ -0,0 +1,126 @@
import torch
import torch.nn as nn
import math
import cv2
import torch.nn.functional as F
class VGG16(nn.Module):
def __init__(self, args):
super(VGG16, self).__init__()
self.stage = args.stage
self.conv1_1 = nn.Conv2d(4, 64, kernel_size=3,stride = 1, padding=1,bias=True)
self.conv1_2 = nn.Conv2d(64, 64, kernel_size=3,stride = 1, padding=1,bias=True)
self.conv2_1 = nn.Conv2d(64, 128, kernel_size=3, padding=1,bias=True)
self.conv2_2 = nn.Conv2d(128, 128, kernel_size=3, padding=1,bias=True)
self.conv3_1 = nn.Conv2d(128, 256, kernel_size=3, padding=1,bias=True)
self.conv3_2 = nn.Conv2d(256, 256, kernel_size=3, padding=1,bias=True)
self.conv3_3 = nn.Conv2d(256, 256, kernel_size=3, padding=1,bias=True)
self.conv4_1 = nn.Conv2d(256, 512, kernel_size=3, padding=1,bias=True)
self.conv4_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1,bias=True)
self.conv4_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1,bias=True)
self.conv5_1 = nn.Conv2d(512, 512, kernel_size=3, padding=1,bias=True)
self.conv5_2 = nn.Conv2d(512, 512, kernel_size=3, padding=1,bias=True)
self.conv5_3 = nn.Conv2d(512, 512, kernel_size=3, padding=1,bias=True)
# model released before 2019.09.09 should use kernel_size=1 & padding=0
#self.conv6_1 = nn.Conv2d(512, 512, kernel_size=1, padding=0,bias=True)
self.conv6_1 = nn.Conv2d(512, 512, kernel_size=3, padding=1,bias=True)
self.deconv6_1 = nn.Conv2d(512, 512, kernel_size=1,bias=True)
self.deconv5_1 = nn.Conv2d(512, 512, kernel_size=5, padding=2,bias=True)
self.deconv4_1 = nn.Conv2d(512, 256, kernel_size=5, padding=2,bias=True)
self.deconv3_1 = nn.Conv2d(256, 128, kernel_size=5, padding=2,bias=True)
self.deconv2_1 = nn.Conv2d(128, 64, kernel_size=5, padding=2,bias=True)
self.deconv1_1 = nn.Conv2d(64, 64, kernel_size=5, padding=2,bias=True)
self.deconv1 = nn.Conv2d(64, 1, kernel_size=5, padding=2,bias=True)
if args.stage == 2:
# for stage2 training
for p in self.parameters():
p.requires_grad=False
if self.stage == 2 or self.stage == 3:
self.refine_conv1 = nn.Conv2d(4, 64, kernel_size=3, padding=1, bias=True)
self.refine_conv2 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=True)
self.refine_conv3 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=True)
self.refine_pred = nn.Conv2d(64, 1, kernel_size=3, padding=1, bias=True)
def forward(self, x):
# Stage 1
x11 = F.relu(self.conv1_1(x))
x12 = F.relu(self.conv1_2(x11))
x1p, id1 = F.max_pool2d(x12,kernel_size=(2,2), stride=(2,2),return_indices=True)
# Stage 2
x21 = F.relu(self.conv2_1(x1p))
x22 = F.relu(self.conv2_2(x21))
x2p, id2 = F.max_pool2d(x22,kernel_size=(2,2), stride=(2,2),return_indices=True)
# Stage 3
x31 = F.relu(self.conv3_1(x2p))
x32 = F.relu(self.conv3_2(x31))
x33 = F.relu(self.conv3_3(x32))
x3p, id3 = F.max_pool2d(x33,kernel_size=(2,2), stride=(2,2),return_indices=True)
# Stage 4
x41 = F.relu(self.conv4_1(x3p))
x42 = F.relu(self.conv4_2(x41))
x43 = F.relu(self.conv4_3(x42))
x4p, id4 = F.max_pool2d(x43,kernel_size=(2,2), stride=(2,2),return_indices=True)
# Stage 5
x51 = F.relu(self.conv5_1(x4p))
x52 = F.relu(self.conv5_2(x51))
x53 = F.relu(self.conv5_3(x52))
x5p, id5 = F.max_pool2d(x53,kernel_size=(2,2), stride=(2,2),return_indices=True)
# Stage 6
x61 = F.relu(self.conv6_1(x5p))
# Stage 6d
x61d = F.relu(self.deconv6_1(x61))
# Stage 5d
x5d = F.max_unpool2d(x61d,id5, kernel_size=2, stride=2)
x51d = F.relu(self.deconv5_1(x5d))
# Stage 4d
x4d = F.max_unpool2d(x51d, id4, kernel_size=2, stride=2)
x41d = F.relu(self.deconv4_1(x4d))
# Stage 3d
x3d = F.max_unpool2d(x41d, id3, kernel_size=2, stride=2)
x31d = F.relu(self.deconv3_1(x3d))
# Stage 2d
x2d = F.max_unpool2d(x31d, id2, kernel_size=2, stride=2)
x21d = F.relu(self.deconv2_1(x2d))
# Stage 1d
x1d = F.max_unpool2d(x21d, id1, kernel_size=2, stride=2)
x12d = F.relu(self.deconv1_1(x1d))
        # the reference GitHub repo applies a sigmoid here, so we do too
raw_alpha = self.deconv1(x12d)
pred_mattes = F.sigmoid(raw_alpha)
if self.stage <= 1:
return pred_mattes, 0
# Stage2 refine conv1
refine0 = torch.cat((x[:, :3, :, :], pred_mattes), 1)
refine1 = F.relu(self.refine_conv1(refine0))
refine2 = F.relu(self.refine_conv2(refine1))
refine3 = F.relu(self.refine_conv3(refine2))
        # no sigmoid here: applying one made the refined result collapse to 0
        # pred_refine = F.sigmoid(self.refine_pred(refine3))
pred_refine = self.refine_pred(refine3)
pred_alpha = F.sigmoid(raw_alpha + pred_refine)
#print(pred_mattes.mean(), pred_alpha.mean(), pred_refine.sum())
return pred_mattes, pred_alpha
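# Minimal shape check (an assumption, not part of the plugin): stage-1 model on
# a 4-channel (RGB + trimap) input whose sides are multiples of 32, as the five
# pool/unpool stages require.
if __name__ == "__main__":
    from argparse import Namespace
    m = VGG16(Namespace(stage=1))
    with torch.no_grad():
        mattes, _ = m(torch.rand(1, 4, 64, 64))
    print(mattes.shape)  # torch.Size([1, 1, 64, 64])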

@ -0,0 +1,36 @@
import torch
from argparse import Namespace
import net
import cv2
import os
import numpy as np
from deploy import inference_img_whole
# input file list
image_path = "boy-1518482_1920_12_img.png"
trimap_path = "boy-1518482_1920_12.png"
image = cv2.imread(image_path)
trimap = cv2.imread(trimap_path)
# print(trimap.shape)
trimap = trimap[:, :, 0]
# init model
args = Namespace(crop_or_resize='whole', cuda=True, max_size=1600, resume='model/stage1_sad_57.1.pth', stage=1)
model = net.VGG16(args)
ckpt = torch.load(args.resume)
model.load_state_dict(ckpt['state_dict'], strict=True)
model = model.cuda()
torch.cuda.empty_cache()
with torch.no_grad():
pred_mattes = inference_img_whole(args, model, image, trimap)
pred_mattes = (pred_mattes * 255).astype(np.uint8)
pred_mattes[trimap == 255] = 255
pred_mattes[trimap == 0] = 0
# print(pred_mattes)
# cv2.imwrite('out.png', pred_mattes)
# import matplotlib.pyplot as plt
# plt.imshow(image)
# plt.show()

@ -0,0 +1,281 @@
import torch
import argparse
import torch.nn as nn
import deepmatting_net
import cv2
import os
from torchvision import transforms
import torch.nn.functional as F
import numpy as np
import time
def get_args():
# Training settings
parser = argparse.ArgumentParser(description='PyTorch Super Res Example')
parser.add_argument('--size_h', type=int, default=320, help="height size of input image")
parser.add_argument('--size_w', type=int, default=320, help="width size of input image")
parser.add_argument('--imgDir', type=str, required=True, help="directory of image")
parser.add_argument('--trimapDir', type=str, required=True, help="directory of trimap")
parser.add_argument('--cuda', action='store_true', help='use cuda?')
parser.add_argument('--resume', type=str, required=True, help="checkpoint that model resume from")
parser.add_argument('--saveDir', type=str, required=True, help="where prediction result save to")
parser.add_argument('--alphaDir', type=str, default='', help="directory of gt")
parser.add_argument('--stage', type=int, required=True, choices=[0,1,2,3], help="backbone stage")
parser.add_argument('--not_strict', action='store_true', help='not copy ckpt strict?')
parser.add_argument('--crop_or_resize', type=str, default="whole", choices=["resize", "crop", "whole"], help="how manipulate image before test")
parser.add_argument('--max_size', type=int, default=1600, help="max size of test image")
args = parser.parse_args()
print(args)
return args
def gen_dataset(imgdir, trimapdir):
sample_set = []
img_ids = os.listdir(imgdir)
img_ids.sort()
cnt = len(img_ids)
cur = 1
for img_id in img_ids:
img_name = os.path.join(imgdir, img_id)
trimap_name = os.path.join(trimapdir, img_id)
assert(os.path.exists(img_name))
assert(os.path.exists(trimap_name))
sample_set.append((img_name, trimap_name))
return sample_set
def compute_gradient(img):
x = cv2.Sobel(img, cv2.CV_16S, 1, 0)
y = cv2.Sobel(img, cv2.CV_16S, 0, 1)
absX = cv2.convertScaleAbs(x)
absY = cv2.convertScaleAbs(y)
grad = cv2.addWeighted(absX, 0.5, absY, 0.5, 0)
grad=cv2.cvtColor(grad, cv2.COLOR_BGR2GRAY)
return grad
# inference once for image, return numpy
def inference_once(args, model, scale_img, scale_trimap, aligned=True):
if aligned:
assert(scale_img.shape[0] == args.size_h)
assert(scale_img.shape[1] == args.size_w)
normalize = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(mean = [0.485, 0.456, 0.406],std = [0.229, 0.224, 0.225])
])
scale_img_rgb = cv2.cvtColor(scale_img, cv2.COLOR_BGR2RGB)
# first, 0-255 to 0-1
# second, x-mean/std and HWC to CHW
tensor_img = normalize(scale_img_rgb).unsqueeze(0)
scale_grad = compute_gradient(scale_img)
#tensor_img = torch.from_numpy(scale_img.astype(np.float32)[np.newaxis, :, :, :]).permute(0, 3, 1, 2)
tensor_trimap = torch.from_numpy(scale_trimap.astype(np.float32)[np.newaxis, np.newaxis, :, :])
tensor_grad = torch.from_numpy(scale_grad.astype(np.float32)[np.newaxis, np.newaxis, :, :])
if args.cuda:
tensor_img = tensor_img.cuda()
tensor_trimap = tensor_trimap.cuda()
tensor_grad = tensor_grad.cuda()
#print('Img Shape:{} Trimap Shape:{}'.format(img.shape, trimap.shape))
input_t = torch.cat((tensor_img, tensor_trimap / 255.), 1)
# forward
if args.stage <= 1:
# stage 1
pred_mattes, _ = model(input_t)
else:
# stage 2, 3
_, pred_mattes = model(input_t)
pred_mattes = pred_mattes.data
if args.cuda:
pred_mattes = pred_mattes.cpu()
pred_mattes = pred_mattes.numpy()[0, 0, :, :]
return pred_mattes
# forward for a full image by crop method
def inference_img_by_crop(args, model, img, trimap):
# crop the pictures, and forward one by one
h, w, c = img.shape
origin_pred_mattes = np.zeros((h, w), dtype=np.float32)
marks = np.zeros((h, w), dtype=np.float32)
for start_h in range(0, h, args.size_h):
end_h = start_h + args.size_h
for start_w in range(0, w, args.size_w):
end_w = start_w + args.size_w
crop_img = img[start_h: end_h, start_w: end_w, :]
crop_trimap = trimap[start_h: end_h, start_w: end_w]
crop_origin_h = crop_img.shape[0]
crop_origin_w = crop_img.shape[1]
#print("startH:{} startW:{} H:{} W:{}".format(start_h, start_w, crop_origin_h, crop_origin_w))
if len(np.where(crop_trimap == 128)[0]) <= 0:
continue
            # edge patches at the right or bottom border may be smaller than the crop size
if crop_origin_h != args.size_h or crop_origin_w != args.size_w:
crop_img = cv2.resize(crop_img, (args.size_w, args.size_h), interpolation=cv2.INTER_LINEAR)
crop_trimap = cv2.resize(crop_trimap, (args.size_w, args.size_h), interpolation=cv2.INTER_LINEAR)
# inference for each crop image patch
pred_mattes = inference_once(args, model, crop_img, crop_trimap)
if crop_origin_h != args.size_h or crop_origin_w != args.size_w:
pred_mattes = cv2.resize(pred_mattes, (crop_origin_w, crop_origin_h), interpolation=cv2.INTER_LINEAR)
origin_pred_mattes[start_h: end_h, start_w: end_w] += pred_mattes
marks[start_h: end_h, start_w: end_w] += 1
# smooth for overlap part
marks[marks <= 0] = 1.
origin_pred_mattes /= marks
return origin_pred_mattes
# forward for a full image by resize method
def inference_img_by_resize(args, model, img, trimap):
h, w, c = img.shape
# resize for network input, to Tensor
scale_img = cv2.resize(img, (args.size_w, args.size_h), interpolation=cv2.INTER_LINEAR)
scale_trimap = cv2.resize(trimap, (args.size_w, args.size_h), interpolation=cv2.INTER_LINEAR)
pred_mattes = inference_once(args, model, scale_img, scale_trimap)
# resize to origin size
origin_pred_mattes = cv2.resize(pred_mattes, (w, h), interpolation = cv2.INTER_LINEAR)
assert(origin_pred_mattes.shape == trimap.shape)
return origin_pred_mattes
# forward a whole image
def inference_img_whole(args, model, img, trimap):
h, w, c = img.shape
new_h = min(args.max_size, h - (h % 32))
new_w = min(args.max_size, w - (w % 32))
# resize for network input, to Tensor
scale_img = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
scale_trimap = cv2.resize(trimap, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
pred_mattes = inference_once(args, model, scale_img, scale_trimap, aligned=False)
# resize to origin size
origin_pred_mattes = cv2.resize(pred_mattes, (w, h), interpolation = cv2.INTER_LINEAR)
assert(origin_pred_mattes.shape == trimap.shape)
return origin_pred_mattes
def main():
print("===> Loading args")
args = get_args()
print("===> Environment init")
#os.environ["CUDA_VISIBLE_DEVICES"] = "0"
if args.cuda and not torch.cuda.is_available():
raise Exception("No GPU found, please run without --cuda")
    model = deepmatting_net.VGG16(args)
ckpt = torch.load(args.resume)
if args.not_strict:
model.load_state_dict(ckpt['state_dict'], strict=False)
else:
model.load_state_dict(ckpt['state_dict'], strict=True)
if args.cuda:
model = model.cuda()
print("===> Load dataset")
dataset = gen_dataset(args.imgDir, args.trimapDir)
mse_diffs = 0.
sad_diffs = 0.
cnt = len(dataset)
cur = 0
t0 = time.time()
for img_path, trimap_path in dataset:
img = cv2.imread(img_path)
trimap = cv2.imread(trimap_path)[:, :, 0]
assert(img.shape[:2] == trimap.shape[:2])
img_info = (img_path.split('/')[-1], img.shape[0], img.shape[1])
cur += 1
print('[{}/{}] {}'.format(cur, cnt, img_info[0]))
with torch.no_grad():
torch.cuda.empty_cache()
if args.crop_or_resize == "whole":
origin_pred_mattes = inference_img_whole(args, model, img, trimap)
elif args.crop_or_resize == "crop":
origin_pred_mattes = inference_img_by_crop(args, model, img, trimap)
else:
origin_pred_mattes = inference_img_by_resize(args, model, img, trimap)
# the network predicts only the unknown region; clamp known regions to the trimap
origin_pred_mattes[trimap == 255] = 1.
origin_pred_mattes[trimap == 0 ] = 0.
# number of unknown pixels in the original trimap (normalizes the MSE)
pixel = float((trimap == 128).sum())
# evaluate when a ground-truth alpha is given
if args.alphaDir != '':
alpha_name = os.path.join(args.alphaDir, img_info[0])
assert(os.path.exists(alpha_name))
alpha = cv2.imread(alpha_name)[:, :, 0] / 255.
assert(alpha.shape == origin_pred_mattes.shape)
#x1 = (alpha[trimap == 255] == 1.0).sum() # x3
#x2 = (alpha[trimap == 0] == 0.0).sum() # x5
#x3 = (trimap == 255).sum()
#x4 = (trimap == 128).sum()
#x5 = (trimap == 0).sum()
#x6 = trimap.size # sum(x3,x4,x5)
#x7 = (alpha[trimap == 255] < 1.0).sum() # 0
#x8 = (alpha[trimap == 0] > 0).sum() #
#print(x1, x2, x3, x4, x5, x6, x7, x8)
#assert(x1 == x3)
#assert(x2 == x5)
#assert(x6 == x3 + x4 + x5)
#assert(x7 == 0)
#assert(x8 == 0)
mse_diff = ((origin_pred_mattes - alpha) ** 2).sum() / pixel
sad_diff = np.abs(origin_pred_mattes - alpha).sum()
mse_diffs += mse_diff
sad_diffs += sad_diff
print("sad:{} mse:{}".format(sad_diff, mse_diff))
origin_pred_mattes = (origin_pred_mattes * 255).astype(np.uint8)
res = origin_pred_mattes.copy()
# clamp known regions to the trimap before saving
res[trimap == 255] = 255
res[trimap == 0 ] = 0
if not os.path.exists(args.saveDir):
os.makedirs(args.saveDir)
cv2.imwrite(os.path.join(args.saveDir, img_info[0]), res)
print("Avg-Cost: {} s/image".format((time.time() - t0) / cnt))
if args.alphaDir != '':
print("Eval-MSE: {}".format(mse_diffs / cur))
print("Eval-SAD: {}".format(sad_diffs / cur))
if __name__ == "__main__":
main()
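A note on the metrics printed above: the prediction is clamped to the trimap in the known regions, and in a well-formed trimap the ground truth is 1 where trimap == 255 and 0 where trimap == 0, so only unknown pixels contribute to the error. Per image, mse_diff = sum((pred - alpha)^2) / #{trimap == 128} and sad_diff = sum(|pred - alpha|); both are averaged over the dataset after the loop.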

@ -0,0 +1,63 @@
import torch
import torchvision
import collections
import os
HOME = os.environ['HOME']
model_path = "{}/.torch/models/vgg16-397923af.pth".format(HOME)
#model_path = "/data/liuliang/deep_image_matting/train/vgg16-397923af.pth"
# trigger torchvision's download; older torchvision versions cache
# pretrained weights under ~/.torch/models, which the assert checks
if not os.path.exists(model_path):
model = torchvision.models.vgg16(pretrained=True)
assert(os.path.exists(model_path))
x = torch.load(model_path)
val = collections.OrderedDict()
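# The matting encoder's conv1_1 expects a 4-channel input (RGB + trimap),
# while the ImageNet VGG16 weights cover only 3 channels, so a
# zero-initialized fourth input channel is concatenated below.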
val['conv1_1.weight'] = torch.cat((x['features.0.weight'], torch.zeros(64, 1, 3, 3)), 1)
replace = { u'features.0.bias' : 'conv1_1.bias',
u'features.2.weight' : 'conv1_2.weight',
u'features.2.bias' : 'conv1_2.bias',
u'features.5.weight' : 'conv2_1.weight',
u'features.5.bias' : 'conv2_1.bias',
u'features.7.weight' : 'conv2_2.weight',
u'features.7.bias' : 'conv2_2.bias',
u'features.10.weight': 'conv3_1.weight',
u'features.10.bias' : 'conv3_1.bias',
u'features.12.weight': 'conv3_2.weight',
u'features.12.bias' : 'conv3_2.bias',
u'features.14.weight': 'conv3_3.weight',
u'features.14.bias' : 'conv3_3.bias',
u'features.17.weight': 'conv4_1.weight',
u'features.17.bias' : 'conv4_1.bias',
u'features.19.weight': 'conv4_2.weight',
u'features.19.bias' : 'conv4_2.bias',
u'features.21.weight': 'conv4_3.weight',
u'features.21.bias' : 'conv4_3.bias',
u'features.24.weight': 'conv5_1.weight',
u'features.24.bias' : 'conv5_1.bias',
u'features.26.weight': 'conv5_2.weight',
u'features.26.bias' : 'conv5_2.bias',
u'features.28.weight': 'conv5_3.weight',
u'features.28.bias' : 'conv5_3.bias'
}
#print(x['classifier.0.weight'].shape)
#print(x['classifier.0.bias'].shape)
#tmp1 = x['classifier.0.weight'].reshape(4096, 512, 7, 7)
#print(tmp1.shape)
#val['conv6_1.weight'] = tmp1[:512, :, :, :]
#val['conv6_1.bias'] = x['classifier.0.bias']
for key in replace.keys():
print(key, replace[key])
val[replace[key]] = x[key]
y = {}
y['state_dict'] = val
y['epoch'] = 0
if not os.path.exists('./model'):
os.makedirs('./model')
torch.save(y, './model/vgg_state_dict.pth')
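The saved file mirrors the {'state_dict', 'epoch'} layout that the deploy script's --resume path loads. A small self-contained sketch for sanity-checking the converted weights:

```python
import torch

# Inspect the checkpoint written above.
ckpt = torch.load('./model/vgg_state_dict.pth')
print(ckpt['epoch'])                      # 0
for name, tensor in ckpt['state_dict'].items():
    print(name, tuple(tensor.shape))      # e.g. conv1_1.weight (64, 4, 3, 3)
```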

@ -0,0 +1,137 @@
# composite images with the dataset from "Deep Image Matting"
import os
import cv2
import math
import time
import shutil
root_dir = "/home/liuliang/Downloads/Combined_Dataset"
test_bg_dir = '/home/liuliang/Desktop/dataset/matting/VOCdevkit/VOC2012/JPEGImages'
train_bg_dir = '/home/liuliang/Desktop/dataset/matting/mscoco/train2017'
def my_composite(fg_names, bg_names, fg_dir, alpha_dir, bg_dir, num_bg, comp_dir):
fg_ids = open(fg_names).readlines()
bg_ids = open(bg_names).readlines()
fg_cnt = len(fg_ids)
bg_cnt = len(bg_ids)
print(fg_cnt, bg_cnt)
assert(fg_cnt * num_bg == bg_cnt)
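# each foreground i is paired with the num_bg consecutive backgrounds in
# rows [i * num_bg, (i + 1) * num_bg) of the bg name list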
for i in range(fg_cnt):
im_name = fg_ids[i].strip("\n").strip("\r")
fg_path = os.path.join(fg_dir, im_name)
alpha_path = os.path.join(alpha_dir, im_name)
#print(fg_path, alpha_path)
assert(os.path.exists(fg_path))
assert(os.path.exists(alpha_path))
fg = cv2.imread(fg_path)
alpha = cv2.imread(alpha_path)
#print("alpha shape:", alpha.shape, "image shape:", fg.shape)
assert(alpha.shape == fg.shape)
h, w, c = fg.shape
base = i * num_bg
for bcount in range(num_bg):
bg_path = os.path.join(bg_dir, bg_ids[base + bcount].strip("\n").strip("\r"))
print(base + bcount, fg_path, bg_path)
assert(os.path.exists(bg_path))
bg = cv2.imread(bg_path)
bh, bw, bc = bg.shape
wratio = float(w) / bw
hratio = float(h) / bh
ratio = wratio if wratio > hratio else hratio
if ratio > 1:
new_bw = int(bw * ratio + 1.0)
new_bh = int(bh * ratio + 1.0)
bg = cv2.resize(bg, (new_bw, new_bh), interpolation=cv2.INTER_LINEAR)
bg = bg[0 : h, 0 : w, :]
#print(bg.shape)
assert(bg.shape == fg.shape)
alpha_f = alpha / 255.
comp = fg * alpha_f + bg * (1. - alpha_f)
img_save_id = im_name[:len(im_name)-4] + '_' + str(bcount) + '.png'
comp_save_path = os.path.join(comp_dir, "image/" + img_save_id)
fg_save_path = os.path.join(comp_dir, "fg/" + img_save_id)
bg_save_path = os.path.join(comp_dir, "bg/" + img_save_id)
alpha_save_path = os.path.join(comp_dir, "alpha/" + img_save_id)
cv2.imwrite(comp_save_path, comp)
cv2.imwrite(fg_save_path, fg)
cv2.imwrite(bg_save_path, bg)
cv2.imwrite(alpha_save_path, alpha)
def copy_dir2dir(src_dir, des_dir):
for img_id in os.listdir(src_dir):
shutil.copyfile(os.path.join(src_dir, img_id), os.path.join(des_dir, img_id))
def main():
test_num_bg = 20
test_fg_names = os.path.join(root_dir, "Test_set/test_fg_names.txt")
test_bg_names = os.path.join(root_dir, "Test_set/test_bg_names.txt")
test_fg_dir = os.path.join(root_dir, "Test_set/Adobe-licensed images/fg")
test_alpha_dir = os.path.join(root_dir, "Test_set/Adobe-licensed images/alpha")
test_trimap_dir = os.path.join(root_dir, "Test_set/Adobe-licensed images/trimaps")
test_comp_dir = os.path.join(root_dir, "Test_set/comp")
train_num_bg = 100
train_fg_names = os.path.join(root_dir, "Training_set/training_fg_names.txt")
train_bg_names_coco2014 = os.path.join(root_dir, "Training_set/training_bg_names.txt")
train_bg_names_coco2017 = os.path.join(root_dir, "Training_set/training_bg_names_coco2017.txt")
train_fg_dir = os.path.join(root_dir, "Training_set/all/fg")
train_alpha_dir = os.path.join(root_dir, "Training_set/all/alpha")
train_comp_dir = os.path.join(root_dir, "Training_set/comp")
# convert the bg name list from COCO 2014 to COCO 2017 format by
# stripping the 15-character 'COCO_train2014_' prefix
fin = open(train_bg_names_coco2014, 'r')
fout = open(train_bg_names_coco2017, 'w')
lls = fin.readlines()
for l in lls:
fout.write(l[15:])
fin.close()
fout.close()
if not os.path.exists(test_comp_dir):
os.makedirs(test_comp_dir + '/image')
os.makedirs(test_comp_dir + '/fg')
os.makedirs(test_comp_dir + '/bg')
os.makedirs(test_comp_dir + '/alpha')
os.makedirs(test_comp_dir + '/trimap')
if not os.path.exists(train_comp_dir):
os.makedirs(train_comp_dir + '/image')
os.makedirs(train_comp_dir + '/fg')
os.makedirs(train_comp_dir + '/bg')
os.makedirs(train_comp_dir + '/alpha')
if not os.path.exists(train_alpha_dir):
os.makedirs(train_alpha_dir)
if not os.path.exists(train_fg_dir):
os.makedirs(train_fg_dir)
# copy test trimaps
copy_dir2dir(test_trimap_dir, test_comp_dir + '/trimap')
# copy train images together
copy_dir2dir(os.path.join(root_dir, "Training_set/Adobe-licensed images/alpha"), train_alpha_dir)
copy_dir2dir(os.path.join(root_dir, "Training_set/Adobe-licensed images/fg"), train_fg_dir)
copy_dir2dir(os.path.join(root_dir, "Training_set/Other/alpha"), train_alpha_dir)
copy_dir2dir(os.path.join(root_dir, "Training_set/Other/fg"), train_fg_dir)
# composite test image
my_composite(test_fg_names, test_bg_names, test_fg_dir, test_alpha_dir, test_bg_dir, test_num_bg, test_comp_dir)
# composite train image
my_composite(train_fg_names, train_bg_names_coco2017, train_fg_dir, train_alpha_dir, train_bg_dir, train_num_bg, train_comp_dir)
if __name__ == "__main__":
main()
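For reference, the heart of my_composite() is standard alpha compositing with the 0-255 alpha map rescaled to [0, 1]; a toy, self-contained example of the same arithmetic:

```python
import numpy as np

# comp = alpha * fg + (1 - alpha) * bg, with alpha rescaled to [0, 1]
fg = np.full((2, 2, 3), 255.0)       # white foreground
bg = np.zeros((2, 2, 3))             # black background
alpha = np.full((2, 2, 3), 128.0)    # uniform half-transparency, 0-255 scale
alpha_f = alpha / 255.0
comp = fg * alpha_f + bg * (1.0 - alpha_f)
print(comp[0, 0])                    # -> [128. 128. 128.]
```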

@ -0,0 +1,50 @@
import re
import numpy as np
import matplotlib.pyplot as plt
from pylab import *
# args: log_name, match_rule, self_log_interval, smooth_log_iteration
loss_file_name = "simple_loss"
title = "{}_Loss".format(loss_file_name)
f = open("../log/{}.log".format(loss_file_name))
pattern = re.compile(r'Loss:[ ]*\d+\.\d+')
self_inter = 10
smooth = 20
# read log file
lines = f.readlines()
print("Line: {}".format(len(lines)))
ys = []
k = 0
cnt = 0
sum_y = 0.
# read one by one
for line in lines:
obj = re.search(pattern, line)
if obj:
val = float(obj.group().split(':')[-1])
sum_y += val
k += 1
if k >= smooth:
ys.append(sum_y / k)
sum_y = 0.
k = 0
cnt += 1
if cnt % 10 == 0:
print("ys cnt: {}".format(cnt))
if k > 0:
ys.append(sum_y / k)
ys = np.array(ys)
xs = np.arange(len(ys)) * self_inter * smooth
print(xs)
print(ys)
plt.plot(xs, ys)
plt.title(title)
plt.xlabel("Iter")
plt.ylabel("Loss")
plt.savefig("../log/{}.png".format(title))
plt.show()
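The pattern above matches the first `Loss: <float>` occurrence in each line; a hypothetical log line in the expected format, and the value the script would extract:

```python
import re

pattern = re.compile(r'Loss:[ ]*\d+\.\d+')
line = "Epoch[3](120/1000) Lr: 0.00001 Loss: 0.1234 Time: 0.52s"  # hypothetical format
match = re.search(pattern, line)
if match:
    print(float(match.group().split(':')[-1]))   # -> 0.1234
```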

@ -0,0 +1,27 @@
:<<BATCH
@echo off
echo **** GIMP-ML Setup started ****
python -m pip install virtualenv
python -m virtualenv gimpenv3
if "%1"=="gpu" (gimpenv3\Scripts\python.exe -m pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 -f https://download.pytorch.org/whl/torch_stable.html) else (gimpenv3\Scripts\python.exe -m pip install torch==1.8.1+cpu torchvision==0.9.1+cpu -f https://download.pytorch.org/whl/torch_stable.html)
gimpenv3\Scripts\python.exe -m pip install GIMP3-ML-pip\.
gimpenv3\Scripts\python.exe -c "import gimpml; gimpml.setup_python_weights()"
echo **** GIMP-ML Setup Ended ****
exit /b
BATCH
echo '**** GIMP-ML Setup started ****'
if python --version 2>&1 | grep -q '^Python 3\.'; then #
echo 'Python 3 found.' #
else #
if python3 --version 2>&1 | grep -q '^Python 3\.'; then #
echo 'Python 3 found.' #
alias python='python3' #
fi #
fi #
python -m pip install virtualenv
python -m virtualenv gimpenv3 #
source gimpenv3/bin/activate #
python -m pip install GIMP3-ML-pip/.
python -c "import gimpml; gimpml.setup_python_weights()"
deactivate #
echo '**** GIMP-ML Setup Ended ****'
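Implementation note: this installer is a batch/shell polyglot. A POSIX shell feeds the Windows section to the no-op `:` command as a here-document (`:<<BATCH ... BATCH`) and resumes at the first shell `echo`; Windows cmd instead treats `:<<BATCH` as a label, runs the batch section, and `exit /b` returns before the shell lines are reached. The trailing `#` on the shell lines most likely comments out a stray carriage return when the file is saved with CRLF line endings.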

@ -1,6 +1,5 @@
from setuptools import setup, find_packages
import os
here = os.path.dirname(os.path.realpath(__file__))
with open(os.path.join(here, "README.md"), "r") as fh:
long_description = fh.read()
@ -69,12 +68,12 @@ setup(
#
# For an analysis of "install_requires" vs pip's requirements files see:
# https://packaging.python.org/en/latest/requirements.html
install_requires=['torchvision==0.5.0; python_version <= "2.7"', 'torchvision; python_version > "2.7"',
'numpy', 'future; python_version <= "2.7"',
'torch==1.4.0; python_version <= "2.7"', 'torch; python_version > "2.7"',
'scipy', 'gdown',
'typing', 'enum; python_version <= "2.7"', 'requests', 'opencv-python<=4.3',
'pretrainedmodels'], # Optional
install_requires=['numpy', 'future; python_version <= "2.7"',
'scipy', 'gdown', 'typing', 'enum; python_version <= "2.7"', 'requests', 'opencv-python<=4.3',
'pretrainedmodels', "torch", "torchvision"],
# Optional
# List additional groups of dependencies here (e.g. development
# dependencies). Users will be able to install these using the "extras"
@ -115,7 +114,7 @@ setup(
# ],
# },
# List additional URLs that are relevant to your project as a dict.
#
# This field corresponds to the "Project-URL" metadata fields:
# https://packaging.python.org/specifications/core-metadata/#project-url-multiple-use
