TriColVid – Colorvision – Issue 1 (Nov/Dec 2024)

Uncertainty of Colorization

Charles UMESI

Introduction

When film director Peter Jackson colorized footage from World War I for his 2018 film "They Shall Not Grow Old", the result met with critical acclaim. The quality of the colorized images brought a subjective visual realism to the scenes, so that, emotionally, you felt you were there. Accomplishing such a feat was formidable, but with the rapidly increasing pervasiveness of machine learning (now commonly called "artificial intelligence" or "AI", even though machine learning is only one part of AI), colorization has become easier than it was for Jackson. However, machine learning is, at its heart, statistical modelling and hence probabilistic, and with that comes uncertainty in the accuracy of the colors produced.

Main

Developing a mechanism for accurate colorization is difficult. As McCarty (2021) alludes to, color perception is a 3D process. Traditionally, we think of a color as either one of the primaries (red, green, blue) or a mixture of two or three primaries, but our perception of color occurs along three dimensions: hue, saturation, and lightness. The problem with grayscale perception is that it occurs in one dimension only: lightness. It is difficult to accurately determine the volume of a cuboid by seeing only its height, so why should determining color from a grayscale image be any different?
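The collapse from three dimensions to one can be illustrated with a minimal Python sketch. It assumes that "grayscale" means the lightness channel of the HLS model in Python's standard colorsys module (real grayscale conversions usually use a weighted sum of the RGB channels instead), and the three candidate colors are arbitrary choices for illustration, not data from any of the cited works.

```python
import colorsys


def to_gray(rgb):
    """Keep only the lightness of an RGB triple (values in 0..1)."""
    _hue, lightness, _saturation = colorsys.rgb_to_hls(*rgb)
    # Round to absorb floating-point noise from the conversion.
    return round(lightness, 6)


# Three colors that differ in hue and saturation but share one lightness.
# The hue/saturation values are arbitrary, chosen only for illustration.
shared_lightness = 0.5
candidates = {
    "red-ish": colorsys.hls_to_rgb(0.0, shared_lightness, 0.8),
    "green-ish": colorsys.hls_to_rgb(1 / 3, shared_lightness, 0.8),
    "blue-ish": colorsys.hls_to_rgb(2 / 3, shared_lightness, 0.4),
}

# Every candidate reduces to the same single gray value.
grays = {name: to_gray(rgb) for name, rgb in candidates.items()}

for name, rgb in candidates.items():
    print(name, tuple(round(c, 2) for c in rgb), "-> gray", grays[name])
```

Under this assumption, the three distinct colors produce one and the same gray value, so the gray value alone cannot tell a colorizer which of them was the original.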

Various approaches and algorithms have been developed to overcome these obstacles. One approach, described by Zhang et al. (2016), makes use of the L*a*b* color space during training. A different approach works directly with the RGB color space; this is the approach used for TriColVid (Elshazly et al., 2023; see References for TriColVid data). Deep learning is heavily used in AI colorization, and a range of neural network architectures and training regimens have been developed for it; reviews are given by Dalal et al. (2021) and Liang et al. (2024). The accuracy of the results has been variable (Dalal et al., 2021; Liang et al., 2024). Indeed, determining the original colors of grayscale images from lightness alone has proved challenging, even when working in L*a*b* space, which is supposed to be a more discriminatory color space than RGB (Print Peppermint, 2022). Added to that, even when a particular lightness of gray is correctly mapped to the right RGB color in one image, the same lightness might not correspond to the same RGB color in a different image. By analogy, if from one dimension of a cuboid, say 5 cm, we correctly identify its volume as, say, 300 cm³ in one image, that same 5 cm might not translate to a volume of 300 cm³ in a different image (the volume could be significantly different). That sums up the problem with AI colorization.
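The cross-image problem can be made concrete with a toy Python sketch. This is not the TriColVid method or any of the cited models. It assumes gray levels are computed with the Rec. 601 luma weights, and the two pixels and the lookup-table "colorizer" are hypothetical, invented purely for illustration.

```python
def luma(rgb):
    """Rec. 601 weighted gray level, rounded to an integer in 0..255."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# Hypothetical pixels, chosen for illustration only.
coat_in_image_a = (200, 60, 60)   # a red coat in one image
sky_in_image_b = (60, 110, 170)   # a blue sky in another image

gray_a = luma(coat_in_image_a)
gray_b = luma(sky_in_image_b)

# A naive "colorizer" that memorizes gray -> color from image A alone.
learned = {gray_a: coat_in_image_a}

# The lookup succeeds because both pixels share the same gray level,
# but the color it returns belongs to image A, not image B.
prediction_for_b = learned[gray_b]
channel_error = [abs(p - t) for p, t in zip(prediction_for_b, sky_in_image_b)]

print("gray levels:", gray_a, gray_b)
print("predicted:", prediction_for_b, "actual:", sky_in_image_b)
print("per-channel error:", channel_error)
```

Both pixels reduce to the same gray level, so a mapping that is correct for the first image returns the wrong color for the second. Real models condition on surrounding context rather than a single gray value, which reduces this ambiguity but does not remove it.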

Despite these obstacles, interest in colorization is intense, not just because of the aesthetic qualities of the process, but also because colorizing historic grayscale images and videos is not its only application; there are also surveillance, infrared, and night-vision colorization, to mention but a few. During colorization, knowledge of the background and context of the image is also important (McCarty, 2021).

Conclusion

Colorization of grayscale images brings a subjective visual realism that has won critical acclaim, but the process is difficult to accomplish. The use of AI has recently made the task easier, but because its results are probabilistic, accuracy remains a challenge. In addition, color perception is a 3D process while grayscale perception is 1D, so trying to determine the correct color from a grayscale image is akin to attempting to determine the volume of a cuboid by seeing only its height. Despite the challenges, there is intense interest in colorization, not least because it has applications beyond restoring historic footage. During colorization, understanding the background and context of the image is also important.

References

Dalal et al. (2021) “Image Colorization Progress: A Review of Deep Learning Techniques for Automation of Colorization”, IJATCSE, 10: 2908–15, DOI: https://doi.org/10.30534/ijatcse/2021/401042021.

Elshazly et al. (2023) “Image Colorization Using GANs”. Available from: https://www.kaggle.com/code/ziyadelshazly/image-colorization-using-gans/notebook [Accessed: June 24, 2024].

Liang et al. (2024) “Grayscale Image Colorization with GAN and CycleGAN in Different Image Domains”, arXiv:2401.11425v1 [cs.CV], DOI: https://doi.org/10.48550/arXiv.2401.11425.

McCarty (2021) “AI can’t color old photos accurately. Here’s why”, Scienceline. Available from: https://scienceline.org/2021/01/ai-cant-color-old-photos [Accessed: November 12, 2024].

Print Peppermint (2022) “What Is Lab Color Space? And what should you know about it?”. Available from: https://printpeppermint.com/blogs/graphic-design/what-is-lab-color-space-and-what-should-you-know-about-it [Accessed: November 13, 2024].

Zhang et al. (2016) “Colorful Image Colorization”, arXiv:1603.08511 [cs.CV], DOI: https://doi.org/10.48550/arXiv.1603.08511.

(Details of the neural network currently being used for TriColVid, training regimen and subsequent results, including losses, can be accessed here.)