That recent 2D NeRF-like model "LIIF" works pretty nicely on emojis 😎 pic.twitter.com/hDyaxj82sM
— Louis Maddox (@permutans) January 2, 2021
By popular demand: 🔎 more 🔎 emoji 🔎 magnification 🔎 pic.twitter.com/5dYN2omaPP
— Louis Maddox (@permutans) January 10, 2021
One of the nicest results with 2 different textures: 🥌 pic.twitter.com/uLfz9hUjEP
— Louis Maddox (@permutans) January 10, 2021
- If tweets aren't displaying, read the thread here
This was a lot of fun to try over the new year into early 2021. After checking that it was legal to do so, and figuring out how to extract glyphs from a font, I made a database of emojis from the font file on macOS Catalina, composited each glyph onto a background colour chosen to be as far as possible from any colour in the image, and ran the results through the LIIF super-resolution model.
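The background-colour trick can be sketched as a small search: over a coarse grid of candidate RGB values, pick the one whose nearest colour in the glyph image is furthest away. This is a minimal illustration with a hypothetical helper name (`distant_background`), not the actual code from the project:

```python
import numpy as np

def distant_background(rgb: np.ndarray, steps: int = 5) -> tuple:
    """Pick an RGB background colour whose minimum Euclidean distance
    to every colour present in the image is as large as possible.

    `rgb` is an (H, W, 3) uint8 array; candidates come from a coarse
    `steps`-per-channel grid to keep the search cheap.
    """
    # Unique colours present in the glyph image, as floats
    colours = np.unique(rgb.reshape(-1, 3), axis=0).astype(float)
    # Candidate grid, e.g. steps=5 -> {0, 63.75, 127.5, 191.25, 255}
    axis = np.linspace(0, 255, steps)
    grid = np.stack(np.meshgrid(axis, axis, axis), axis=-1).reshape(-1, 3)
    # For each candidate, the distance to its nearest image colour
    dists = np.linalg.norm(grid[:, None, :] - colours[None, :, :], axis=-1)
    # Keep the candidate that maximises that nearest-colour distance
    best = grid[dists.min(axis=1).argmax()]
    return tuple(int(round(c)) for c in best)
```

For an all-black glyph this picks pure white, and vice versa; for a colourful emoji it tends to land on an unused corner of the RGB cube, which makes the compositing easy to undo visually.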
I'd previously begun some image processing on emojis in a project called emo, but had been 'scooped' by the (hugely popular) Emoji Mashup Bot, and subsequently lost interest in developing the idea.
I expected some bright idea of what these large emojis would be useful 'feedstock' for to arrive after creating them, but whether from burning through my start-of-year energy or some other reason, I just ended up appreciating them as visual oddities and not much more.
Decompositing was an unforeseen problem here: the LIIF model works on 3-channel (RGB) images, and the authors said it would be "straight-forward" to modify the code to handle a fourth (alpha) channel.
- Alas, I couldn't find a readily available dataset of images with an alpha channel, and I didn't think creating 'stickers' (binary alpha: every pixel either fully transparent or fully opaque) would succeed, so I began looking into scraping one from Wikipedia: Wikitransp, akin to the DIV2K dataset used to train LIIF and most other super-resolution networks.
- This was before Google Research released WIT, a large-scale dataset of images on Wikipedia.
I expect to revisit this project once I've assembled such a dataset, and perhaps new models will come out (surprisingly few have built on LIIF in the months following its publication, but I'm checking each new citation as it appears).
The most promising I've seen is JIIF ("Joint Implicit Image Function for Guided Depth Super-Resolution"), published in July 2021 for the ACM Multimedia 2021 conference. The task of upsampling low-resolution or noisy depth maps using RGB 'guides' is isomorphic to upsampling the alpha channel of an RGBA image.
That said, the paper explains how its method differs from such a 'dense regression' approach, and I agree with the distinction: the alpha channel should be supervised by the other three.
TBC...