Say it -> Select it: Runway ML promises semantic video segmentation

I find myself recalling something that Twitter co-founder Evan Williams wrote about “value moving up the stack”:

As industries evolve, core infrastructure gets built and commoditized, and differentiation moves up the hierarchy of needs from basic functionality to non-basic functionality, to design, and even to fashion.

For example, there was a time when chief buying concerns included how well a watch might tell time and how durable a pair of jeans was.

Now apps like FaceTune deliver what used to be Photoshop-only levels of power to millions of people, and Runway ML promises to let you simply type words to select & track objects in video, right in a Web browser. 👀
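Runway hasn’t said exactly how this works under the hood, but one plausible recipe is to score candidate object regions against the user’s text using a joint image-text model like OpenAI’s CLIP. Here’s a minimal sketch of that idea; the upstream segmenter that proposes the crops is a hypothetical stand-in, and this is a guess at the general approach, not Runway’s actual method:

```python
# Hedged sketch: rank candidate object crops against a text prompt with CLIP.
# How the crops get proposed (an upstream segmenter) is left hypothetical.
import clip  # https://github.com/openai/CLIP
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def pick_crop(crops, prompt):
    """crops: list of PIL images cropped around candidate objects.
    Returns the index of the crop whose embedding best matches the prompt."""
    images = torch.stack([preprocess(c) for c in crops]).to(device)
    text = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(images)
        txt_feat = model.encode_text(text)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
        sims = (img_feat @ txt_feat.T).squeeze(1)  # cosine similarity per crop
    return int(sims.argmax())
```

Run something like that per keyframe and you get rudimentary text-driven selection; tracking the chosen object across frames is its own (hard) problem.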

ML + MIDI = trippy facial fun

“Hijacking Brains: The Why I’m Here Story” 😌

As I wrote many years ago, it was the chance to work with alpha geeks that drew me to Adobe:

When I first encountered the LiveMotion team, I heard that engineer Chris Prosser had built himself a car MP3 player (this was a couple of years before the iPod). Evidently he’d disassembled an old Pentium 90, stuck it in his trunk, connected it to the glovebox with some Ethernet cable, added a little LCD track readout, and written a Java Telnet app for synching the machine with his laptop. Okay, I thought, I don’t want to do that, but I’d like to hijack the brains of someone who could.

Now my new teammate Cameron Smith has spent a weekend wiring MIDI hardware to StyleGAN to control facial synthesis & modification:
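If you’re curious how such a rig might hang together, here’s a minimal sketch, assuming the mido library plus a pretrained StyleGAN generator; the edit directions and the generator call are stand-ins of mine, not Cameron’s actual code:

```python
# Hedged sketch: map MIDI CC knobs to StyleGAN latent-space edit directions.
import mido
import numpy as np

LATENT_DIM = 512  # StyleGAN's w-space dimensionality
NUM_KNOBS = 8     # hypothetical: number of CC knobs on the controller

# Hypothetical edit directions; in practice these might come from PCA or
# attribute classifiers (age, smile, pose) rather than random vectors.
directions = np.random.randn(NUM_KNOBS, LATENT_DIM).astype(np.float32)
base_w = np.random.randn(LATENT_DIM).astype(np.float32)
coeffs = np.zeros(NUM_KNOBS, dtype=np.float32)

def knob_to_coeff(cc_value: int) -> float:
    """Map a 0-127 MIDI CC value to a signed coefficient in [-1, 1]."""
    return (cc_value - 64) / 64.0

with mido.open_input() as port:  # opens the default MIDI input port
    for msg in port:
        if msg.type == "control_change" and msg.control < NUM_KNOBS:
            coeffs[msg.control] = knob_to_coeff(msg.value)
            w = base_w + coeffs @ directions  # blend the active edits
            # frame = generator.synthesize(w)  # hypothetical StyleGAN call
```

Each knob becomes a fader for one semantic attribute, so twisting hardware controls in real time sweeps the face through latent space.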

VFX & photography: Fireside chat tonight with Paul Debevec

If you liked yesterday’s news about Total Relighting, or pretty much anything else related to HDR capture over the last 20 years, you might dig this SIGGRAPH LA session, happening tonight at 7pm Pacific:

Paul Debevec is one of the most recognized researchers in the field of CG today. LA ACM SIGGRAPH’s “fireside chat” with Paul and Carolyn Giardina of The Hollywood Reporter will allow us a glimpse at the person behind all the innovative scientific work. This event promises to be one of our most popular, as Paul always draws a crowd and is constantly in demand to speak at conferences around the world.

“Total Relighting” promises to teleport(rait) you into new vistas

This stuff makes my head spin around—and not just because the demo depicts heads spinning around!

You might remember the portrait relighting features that launched on Google Pixel devices last year, leveraging some earlier research. Now a number of my former Google colleagues have created a new method for figuring out how a portrait is lit, then imposing new light sources in order to help it blend into new environments. Check it out:
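The paper trains networks to estimate a portrait’s geometry, albedo, and HDR illumination; the toy sketch below shows only the core intuition behind “imposing new light sources,” assuming you already have per-pixel normals and albedo (it is not the paper’s actual pipeline):

```python
# Toy Lambertian relighting: given estimated albedo and surface normals,
# impose a new directional light on the portrait.
import numpy as np

def relight(albedo: np.ndarray, normals: np.ndarray,
            light_dir: np.ndarray, ambient: float = 0.2) -> np.ndarray:
    """albedo: HxWx3 in [0,1]; normals: HxWx3 unit vectors;
    light_dir: 3-vector pointing toward the new light source."""
    l = light_dir / np.linalg.norm(light_dir)
    shading = np.clip(normals @ l, 0.0, None)  # n·l, clamped at zero
    return np.clip(albedo * (ambient + shading[..., None]), 0.0, 1.0)
```

The real system goes much further, learning specular and soft-shadow behavior from light-stage captures, but the estimate-then-reshade structure is the same.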

“No One Is Coming. It’s Up To Us.”

“Everyone sweeps the floor around here.”

As I’ve noted many times, that core ethos from Adobe’s founders has really stuck with me over the years. In a similar, if superficially darker, vein, I keep meditating on the phrase “No One Is Coming. It’s Up To Us,” which appears on a sticker I put on the back of my car:

It’s reeeeeally easy to sit around and complain that we don’t have enough XYZ support (design cycles, eng bodies, etc.), and it’s all true/fair—but F that ’cause it doesn’t move the ball. I keep thinking of DMX, with regard to myself & other comfortable folks:

I put in work, and it’s all for the kids (uh)
But these cats done forgot what work is (uh-huh)

Some brief & bracing wisdom:

Happy Monday. Go get some.

A thoughtful conversation about race

I know it’s not a subject that draws folks to this blog, but I wanted to share a really interesting talk I got to attend recently at Google. Broadcaster & former NFL player Emmanuel Acho hosts “Uncomfortable Conversations With A Black Man,” and I was glad that he shared his time and perspective with us. If you stick around to the end, I pop in with a question. The conversation is also available in podcast form.

This episode is with Emmanuel Acho, who discusses his book and YouTube series of the same name, “Uncomfortable Conversations with a Black Man,” which offers conversations about race in an effort to drive open dialogue.

Emmanuel is a Fox Sports analyst and co-host of “Speak for Yourself”. After earning his undergraduate degree in sports management in 2012, Emmanuel was drafted by the Cleveland Browns. He was then traded to the Philadelphia Eagles in 2013, where he spent most of his career. While in the NFL, Emmanuel spent offseasons at the University of Texas to earn his master’s degree in sports psychology. Emmanuel left the football field and picked up the microphone to begin his broadcast career. He served as the youngest national football analyst and was named a 2019 Forbes 30 Under 30 selection. Due to the success of his web series, with over 70 million views across social media platforms, he wrote the book “Uncomfortable Conversations with a Black Man,” and it became an instant New York Times best seller.

Interesting, interactive mash-ups powered by AI

Check out how StyleMapGAN (paper, PDF, code) enables combinations of human & animal faces, vehicles, buildings, and more. Unlike simple copy-paste-blend, this technique permits interactive morphing between source & target pixels:

From the authors, a bit about what’s going on here:

Generative adversarial networks (GANs) synthesize realistic images from random latent vectors. Although manipulating the latent vectors controls the synthesized outputs, editing real images with GANs suffers from i) time-consuming optimization for projecting real images to the latent vectors, or ii) inaccurate embedding through an encoder. We propose StyleMapGAN: the intermediate latent space has spatial dimensions, and a spatially variant modulation replaces AdaIN. It makes the embedding through an encoder more accurate than existing optimization-based methods while maintaining the properties of GANs. Experimental results demonstrate that our method significantly outperforms state-of-the-art models in various image manipulation tasks such as local editing and image interpolation. Last but not least, conventional editing methods on GANs are still valid on our StyleMapGAN. Source code is available at https://github.com/naver-ai/StyleMapGAN.
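To make the AdaIN-vs-spatially-variant-modulation distinction concrete, here’s a rough PyTorch sketch (my own illustration, not the official code): AdaIN scales each feature channel by a single scalar, whereas a stylemap with spatial dimensions lets every location carry its own style.

```python
# Contrast: per-channel AdaIN vs. per-location ("spatially variant") modulation.
import torch
import torch.nn.functional as F

def adain(x, gamma, beta):
    """Classic AdaIN: gamma and beta are (N, C) per-channel scalars."""
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True) + 1e-8
    return gamma[..., None, None] * (x - mu) / sigma + beta[..., None, None]

def spatial_modulation(x, gamma_map, beta_map):
    """Spatially variant modulation: gamma_map and beta_map are (N, C, h, w)
    stylemaps, resized to the feature resolution so each location gets its
    own scale and shift."""
    g = F.interpolate(gamma_map, size=x.shape[2:], mode="bilinear",
                      align_corners=False)
    b = F.interpolate(beta_map, size=x.shape[2:], mode="bilinear",
                      align_corners=False)
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True) + 1e-8
    return g * (x - mu) / sigma + b
```

That per-location control is what makes the transplanting shown above possible: local editing amounts to copying a region of the source image’s stylemap into the target’s before synthesis.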