While we’re all still getting our heads around the 2D image-generation magic of DALL•E, Imagen, MidJourney, and more, Google researchers are stepping into a new dimension as well with Dream Fields—synthesizing geometry simply from words.
“Not a single keyframe of animation was set in the making of the title, created by tweaking and bending the alignment knobs of a vintage TV,” writes Anthony Vitagliano. “Instead, I shot it using a vintage Montgomery Ward ‘Airline’ Portable Television, an iPhone, and a patchwork of cables and converters in my basement.”
Check out the results:
See Anthony’s site for high-res captures of the frames.
I’ve long considered augmented reality apps to be “realtime Photoshop”—or perhaps more precisely, “realtime After Effects.” I think that’s true & wonderful, but most consumer AR tends toward ultra-confined filters, each built to produce roughly one outcome well.
Walking around San Francisco today, it struck me that DALL•E & other emerging generative-art tools could—if made available via a simple mobile UI—offer a new kind of (almost) realtime Photoshop, with radically greater creative flexibility.
Here I captured a nearby sculpture, dropped out the background in Photoshop, uploaded it to DALL•E, and requested “a low-polygon metallic tree surrounded by big dancing robots and small dancing robots.” I like the results!
I’m suddenly craving a mobile #dalle app that lets me photograph things, select them/backgrounds, and then inpaint with prompts. Here’s a quick experiment based on a “tree” I just saw 🤖: pic.twitter.com/Sx3LAACOVs
Hard on the heels of OpenAI revealing DALL•E 2 last month, Google has announced Imagen, promising “unprecedented photorealism × deep level of language understanding.” Unlike DALL•E, it’s not yet available via a demo, but the sample images (below) are impressive.
I’m slightly amused to see Google flexing on DALL•E by highlighting Imagen’s strengths in figuring out spatial arrangements & coherent text (places where DALL•E sometimes currently struggles). The site claims that human evaluators rate Imagen output more highly than what comes from competitors (e.g. MidJourney).
I couldn’t be more excited about these developments—most particularly to figure out how such systems can enable amazing things in concert with Adobe tools & users.
With reporting from 250 locations around the world, AP is a key addition to the CAI’s mission to help consumers everywhere better understand the provenance and attribution of images and video.
“We are pleased to join the CAI in its efforts to combat misinformation and disinformation around photojournalism,” said AP Director of Photography David Ake. “AP has worked to advance factual reporting for over 175 years. Teaming up to help ensure the authenticity of images aligns with that mission.”
We are building some rad stuff (seriously, I wish I could show you already) and would love to have you join us:
Some key responsibilities:
Architect efficient and reusable full-stack systems that can support several different deep-learning models
Build simple, robust, and scalable platforms used by many external users
Work closely with UX designers, product managers, and machine-learning engineers to develop compelling experiences
Take a project from scoping requirements through actual launch