During the COVID-19 crisis, we’re committed to supporting the community with complimentary access to Unity Learn Premium for three months (March 19 through June 20). Get exclusive access to Unity experts, live interactive sessions, on-demand learning resources, and more.
“This is certainly the coolest thing I’ve ever worked on, and it might be one of the coolest things I’ve ever seen.”
My Google Research colleague Jon Barron routinely makes amazing stuff, so when he gets a little breathless about a project, you know it’s something special. I’ll pass the mic to him to explain their new work around capturing multiple photos, then synthesizing a 3D model:
I’ve been collaborating with Berkeley for the last few months and we seem to have cracked neural rendering. You just train a boring (non-convolutional) neural network with five inputs (xyz position and viewing angle) and four outputs (RGB+alpha), combine it with the fundamentals of volume rendering, and get an absurdly simple algorithm that beats the state of the art in neural rendering / view synthesis by *miles*.
You can change the camera angle, change the lighting, insert objects, extract depth maps — pretty much anything you would do with a CGI model, and the renderings are basically photorealistic. It’s so simple that you can implement the entire algorithm in a few dozen lines of TensorFlow.
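For the curious, here's a minimal sketch of the two pieces Jon describes — my own illustration, not the Berkeley/Google code: a plain (non-convolutional) MLP mapping a 5-D input (xyz position plus viewing direction) to RGB plus a density that plays the role of the quote's alpha, and the textbook volume-rendering step that composites samples along a ray. Function names, layer sizes, and activations are assumptions; positional encoding, hierarchical sampling, and the training loop are omitted.

```python
import tensorflow as tf

def make_nerf_mlp(width=256, depth=8):
    """A 'boring' fully connected network: 5 inputs -> 4 outputs (RGB + density)."""
    layers = [tf.keras.layers.Dense(width, activation="relu", input_shape=(5,))]
    layers += [tf.keras.layers.Dense(width, activation="relu") for _ in range(depth - 1)]
    layers += [tf.keras.layers.Dense(4)]  # 3 color channels + 1 density
    return tf.keras.Sequential(layers)

def render_ray(model, points, view_dirs, deltas):
    """Composite N samples along one camera ray into a single pixel color.

    points:    (N, 3) xyz positions sampled along the ray
    view_dirs: (N, 2) viewing direction (two angles) per sample
    deltas:    (N,)   spacing between adjacent samples
    """
    raw = model(tf.concat([points, view_dirs], axis=-1))            # (N, 4)
    rgb = tf.sigmoid(raw[..., :3])                                   # colors in [0, 1]
    sigma = tf.nn.relu(raw[..., 3])                                  # non-negative density
    alpha = 1.0 - tf.exp(-sigma * deltas)                            # opacity of each sample
    trans = tf.math.cumprod(1.0 - alpha + 1e-10, exclusive=True)     # light surviving to each sample
    weights = alpha * trans
    return tf.reduce_sum(weights[:, None] * rgb, axis=0)             # expected color along the ray
```

Render one such ray per pixel and you have a view of the scene from an arbitrary camera, which is what makes the "change the camera angle, extract depth maps" tricks fall out almost for free.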
If you’re on an iPhone or compatible Android device, try watching this video in the YouTube app to see what my team has cooked up around virtual makeup try-on (complementing the lipstick try-ons we launched a while back):
Tapping on a given item shows the dish, along with photos taken by other customers. You can then choose to search for that dish to see what it is, or use Google Translate to display it in your native language.
He first appears as a crude collection of 3-D pixels—or voxels. Soon, he looks like a conglomeration of blocks morphing into the shape of an animal. Gradually, his image evolves until he becomes a sharp representation of a northern white rhino, grunting and squealing as he might in a grassy African field. There comes a moment—just a moment—when the viewer’s eyes meet his. Then, the 3-D creature vanishes, just like his subspecies, which due to human poaching is disappearing into extinction.
The Mill, which has studios in London, New York, Los Angeles, Chicago, Bangalore and Berlin, provided animation for this project, and Dr. Andrea Banino at DeepMind, an international company that develops useful forms of artificial intelligence, provided the experimental data to set the rhino’s paths. After each two-minute episode, the rhino reappears and follows another of the three programmed paths.
The Looking Glass is powered by our proprietary 45-element light field technology, generating 45 distinct and simultaneous perspectives of three-dimensional content of any sort.
This means multiple people around a Looking Glass are shown different perspectives of that three-dimensional content (whether that’s a 3D animation, DICOM medical imaging data, or a Unity project) in super-stereoscopic 3D, in the real world without any VR or AR headgear.
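To make "45 distinct and simultaneous perspectives" a bit more concrete, here's a toy sketch — mine, not anything from the Looking Glass SDK — of what generating that many views amounts to: sweep a virtual camera across a small horizontal viewing cone and render the scene once per offset, so each eye position around the display gets its own perspective. The function name and the cone/distance parameters are made up for illustration.

```python
import numpy as np

def view_offsets(n_views=45, view_cone_deg=40.0, focal_dist=1.0):
    """Horizontal camera offsets for n_views spread evenly across a viewing cone.

    view_cone_deg and focal_dist are illustrative parameters, not real device specs.
    """
    half_cone = np.radians(view_cone_deg) / 2.0
    angles = np.linspace(-half_cone, half_cone, n_views)
    return focal_dist * np.tan(angles)  # x-offset of each virtual camera from center

offsets = view_offsets()
print(f"{len(offsets)} views, offsets from {offsets[0]:.3f} to {offsets[-1]:.3f}")
```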
Back in 2014, Action Movie Dad posted a delightful vid of his niño evading the hot foot:
But now instead of needing an hour-long tutorial on how to create this effect, you can do it in realtime, with zero effort, on your friggin’ telephone. (Old Man Nack does wonder just how much this cheapens the VFX coin—but on charges progress.)