Coloring-greyscale-images

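License: MIT License
Language: Python
Category: Neural Networks/Artificial Intelligence, Machine Learning/Deep Learning
Software type: Open source
Region: Unknown
Submitted by: 金高轩
Operating system: Cross-platform
Open-source organization:
Audience: Unknown
Software Overview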


A detailed tutorial covering the code in this repository: Coloring Black and White photos with Neural Networks

Plug: I write about learning machine learning online and independent research. Enjoy!

The network is built in four parts and gradually becomes more complex. The first part is the bare minimum to understand the core parts of the network. It's built to color one image. Once I have something to experiment with, I find it easier to add the remaining 80% of the network.

For the second stage, the Beta version, I start automating the training flow. In the full version, I add features from a pre-trained classifier. The GAN version is not covered in the tutorial. It's an experimental version using some of the emerging best practices in image colorization.

Featured by Google

Note: The display images below are cherry-picked. A large majority of the images are mostly black and white or are lightly colored in brown. A narrow and simple dataset often creates better results.

Installation

pip install keras tensorflow pillow h5py jupyter scikit-image
git clone https://github.com/emilwallner/Coloring-greyscale-images
cd Coloring-greyscale-images/
jupyter notebook

Go to the desired notebook (the files that end with '.ipynb'). To run the model, open the menu and click Cell > Run all.

For the GAN version, enter the GAN-version folder, and run:

python3 colorize_base.py

Pre-trained weights: Download the pre-trained weights for the GAN-version here. Create a folder called 'resources' and put it inside of Coloring-greyscale-images/GAN-version/. It's trained on contemporary photography with different objects but not a lot of people.

Alpha Version

This is a great starting point to get the hang of the moving pieces: how an image is transformed into RGB pixel values and then translated into LAB pixel values, changing the color space. It also builds a core intuition for how the network learns: how it compares its prediction with the expected output and adjusts its weights.
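To make the color space change concrete, here is a minimal NumPy sketch of the standard sRGB to CIELAB conversion (D65 white point). It is an illustration, not the notebook's code: scikit-image, which is in the install list, ships its own `rgb2lab`. The helper names `rgb_to_lab` and `split_for_training`, and the scaling by 100 and 128, are assumptions made for this example.

```python
import numpy as np

# Standard sRGB -> XYZ matrix and D65 reference white.
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])


def rgb_to_lab(rgb):
    """Convert an (H, W, 3) array of sRGB values in [0, 1] to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma curve to get linear light.
    linear = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # Linear RGB -> XYZ, normalized by the reference white.
    xyz = linear @ _RGB_TO_XYZ.T / _WHITE_D65
    # Nonlinear compression defined by CIELAB.
    delta = 6 / 29
    f = np.where(xyz > delta ** 3, np.cbrt(xyz), xyz / (3 * delta ** 2) + 4 / 29)
    lightness = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def split_for_training(rgb):
    """Return (input, target): lightness as input, the two color channels as target."""
    lab = rgb_to_lab(rgb)
    x = lab[..., :1] / 100.0   # L is in [0, 100]; assumed scaling to [0, 1]
    y = lab[..., 1:] / 128.0   # a and b scaled to roughly [-1, 1] (assumed)
    return x, y


image = np.random.rand(8, 8, 3)   # stand-in for a loaded RGB image
x, y = split_for_training(image)
print(x.shape, y.shape)
```

The point of LAB is that the greyscale photo is already one of the three channels (L), so the network only has to predict the remaining two.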

In this version, you will see a result in a few minutes. Once you have trained the network, try coloring an image it was not trained on. This will build an intuition for the purpose of the later versions.
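The "compare and adjust" loop can be shown with a toy example. The sketch below is not the repository's network: it swaps the convolutional model for a single linear layer and uses made-up data, so that the three steps (predict, measure the error, nudge the weights) fit in a few lines. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 64 "pixels" with one lightness value each, and two color
# targets per pixel that depend linearly on lightness.
lightness = rng.uniform(0.0, 1.0, size=(64, 1))
true_weights = np.array([[0.5, -0.3]])
target_ab = lightness @ true_weights

weights = np.zeros((1, 2))   # the "network": one linear layer
learning_rate = 0.5
losses = []

for step in range(200):
    predicted_ab = lightness @ weights            # forward pass
    error = predicted_ab - target_ab              # compare with the target
    losses.append(float(np.mean(error ** 2)))     # mean squared error
    gradient = 2 * lightness.T @ error / error.size
    weights -= learning_rate * gradient           # adjust the weights

print(losses[0], losses[-1])
print(weights)
```

Because this toy model sees only one small dataset, it fits that data almost perfectly, which mirrors why a network trained on a single image does poorly on images it has never seen.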

Beta Version

The network in the beta version is very similar to the alpha version. The difference is that we use more than one image to train the network. I'd recommend running top/htop and nvidia-smi to see how different batch sizes affect your computer's memory.
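As a rough illustration of why batch size matters for memory, the sketch below slices a hypothetical array of greyscale inputs into batches and prints the size of one batch. The `iterate_batches` helper and the array shape are assumptions for this example. The numbers cover only the input tensor; the network's activations and gradients take far more memory, which is why watching nvidia-smi is still the practical check.

```python
import numpy as np


def iterate_batches(images, batch_size):
    """Yield consecutive slices of `images` with at most `batch_size` items."""
    for start in range(0, len(images), batch_size):
        yield images[start:start + batch_size]


# Hypothetical data: 10 greyscale inputs of 256x256, stored as float32.
images = np.zeros((10, 256, 256, 1), dtype=np.float32)

for batch_size in (1, 4, 8):
    first_batch = next(iterate_batches(images, batch_size))
    print(batch_size, first_batch.shape, first_batch.nbytes // 1024, "KiB")
```

The last batch is smaller when the dataset size is not a multiple of the batch size, which this helper allows rather than dropping the remainder.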

For this model, I'd go with this cropped celebrity dataset or Nvidia's StyleGAN dataset. Because the images are very similar, the network can learn basic colorization even though it is a trivial network. To get a feel for the limits of this network, you can try it on this dataset of diverse images from Unsplash. If you are on a laptop, I'd run it for a day. If you are using a GPU, train it for at least 6 to 12 hours.

Full Version

The full version adds information from a pre-trained classifier. You can think of the information as 20% nature, 30% humans, 30% sky, and 20% brick buildings. It then learns to combine that information with the black and white photo. It gives the network more confidence to color the image. Otherwise, it tends to default to the safest color, brown.
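One common way to combine a classifier's output with image features is to copy the classifier vector to every spatial position and concatenate it along the channel axis. The NumPy sketch below shows only that shape manipulation. The `fuse` helper and the shapes (a 32x32x256 feature map and a 1000-value classifier vector) are assumptions chosen for illustration, not values confirmed by this README.

```python
import numpy as np


def fuse(encoder_features, class_embedding):
    """Tile a 1-D embedding over the spatial grid and concatenate it depth-wise."""
    height, width, _ = encoder_features.shape
    tiled = np.broadcast_to(
        class_embedding, (height, width, class_embedding.shape[0])
    )
    return np.concatenate([encoder_features, tiled], axis=-1)


# Assumed shapes: features from the greyscale image and a classifier vector.
encoder_features = np.random.rand(32, 32, 256).astype(np.float32)
class_embedding = np.random.rand(1000).astype(np.float32)

fused = fuse(encoder_features, class_embedding)
print(fused.shape)  # (32, 32, 1256)
```

After this step every spatial position carries both local image features and the same global "what is in this photo" summary, so later layers can learn how much weight to give each.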

The model comes from an elegant insight by Baldassarre and his team.

In the article, I use the Unsplash dataset, but in retrospect, I'd choose five to ten categories in the ImageNet dataset. You can also go with Nvidia's StyleGAN dataset or create a dataset from Pixabay categories. You'll start getting some results after about 12 to 24 hours on a GPU.

GAN Version

The GAN version uses Generative Adversarial Networks to make the coloring more consistent and vibrant. However, the network is an order of magnitude more complex and requires more computing power to work with. Many of the techniques in this network are inspired by the brilliant work of Jason Antic and his DeOldify coloring network.

In brief, the generator comes from the pix2pix model, the discriminators and loss function from the pix2pixHD model, and a few optimizations from the Self-Attention GAN. If you want to experiment with this approach, I'd recommend starting with Erik Linder-Norén's excellent pix2pix implementation.

Implementation details:

  • With a 16GB GPU, you can fit a batch of 150 images at 128x128 or 25 images at 256x256.
  • Learning improved an order of magnitude faster on the 128x128 images than on the 256x256 images.
  • I'd recommend experimenting with pre-trained U-Nets (one of the secrets in Jason Antic's model).
  • Test different normalizations. I prefer spectral normalization, but I've also added instance normalization.
  • The network uses 3 discriminators for different image resolutions, based on the pix2pixHD paper. However, this might be overkill, so I'd try it with one.
  • Nvidia's StyleGAN model has shown some incredible images. It might be worth experimenting with some of the best practices they developed. The same goes for the Large Scale GAN paper.
  • I've added the pix2pixHD generator, but it requires more compute to converge.
  • The image generator has some memory problems. Perhaps go with the original generator in Keras or find something equivalent.
  • If you want to build your own dataset, I've included a few scraping and cleaning scripts in 'download_and_clean_data_scripts'. You can build the datasets based on keywords from Yahoo's 100M images or Pixabay.
  • I've implemented it for multi-GPU; however, all the models are copied to each GPU. This increases the batch size, which improves the result, but it only marginally increases images/sec. I'd recommend specifying which GPU each model is loaded on, to avoid merging the weights for each batch.
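The multi-resolution discriminator setup mentioned in the list above feeds each discriminator a differently scaled copy of the same image. The sketch below builds such a set of copies with plain 2x2 average pooling in NumPy. It is a simplification for illustration: the function names are hypothetical, and the pooling used in the repository or in pix2pixHD may differ.

```python
import numpy as np


def downsample_2x(image):
    """Halve height and width by averaging each 2x2 block of pixels."""
    height, width, channels = image.shape
    half_h, half_w = height // 2, width // 2
    trimmed = image[:half_h * 2, :half_w * 2]   # drop an odd last row/column
    blocks = trimmed.reshape(half_h, 2, half_w, 2, channels)
    return blocks.mean(axis=(1, 3))


def image_pyramid(image, levels=3):
    """Return `levels` copies of the image, each half the size of the previous."""
    scales = [image]
    for _ in range(levels - 1):
        scales.append(downsample_2x(scales[-1]))
    return scales


image = np.random.rand(256, 256, 3)
for scale in image_pyramid(image):
    print(scale.shape)
```

The full-resolution copy lets one discriminator judge fine texture, while the smaller copies let the others judge overall color consistency. Setting `levels=1` gives the single-discriminator variant suggested above.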

Run the code on FloydHub

Click this button to open a Workspace on FloydHub where you will find the same environment and dataset used for the Full version.

Acknowledgments

  • Thanks to IBM for donating computing power through their PowerAI platform
  • The full model is a reproduction of Baldassarre et al., 2017 (code and paper).
  • The advanced model is inspired by the pix2pix, pix2pixHD, SA-GAN, and DeOldify models.