All posts by jnack

Kling AI promises virtual try-ons

Accurately rendering clothing on humans, and especially estimating body dimensions to enable proper fit (and thus reduce costly returns), has remained a seductive yet stubbornly difficult problem. I’ve written previously about challenges I observed at Google, plus possible steps forward.

Now Kling is promising to use generative video to pair real people & real outfits for convincing visualization (but not fit estimation). Check it out:

Celebrating Saul Bass

It’s a real joy to see my 15yo son Henry’s interest in design & photography blossom, and last night he fell asleep perusing the giant book of vintage logos we scored at the Art Institute of Chicago. I’m looking forward to acquainting him with the groundbreaking work of Saul Bass & figured we’d start here:

FlipSketch promises text-to-animation

We present FlipSketch, a system that brings back the magic of flip-book animation — just draw your idea and describe how you want it to move! …

Unlike constrained vector animations, our raster frames support dynamic sketch transformations, capturing the expressive freedom of traditional animation. The result is an intuitive system that makes sketch animation as simple as doodling and describing, while maintaining the artistic essence of hand-drawn animation.

BlendBox AI promises fast, interactive compositing

I’m finding the app (free to try for a couple of moves before it runs out of credits) to be pretty wacky, as it continuously regenerates elements & thus struggles with identity preservation. The hero vid looks cool, though:

AI fixes (?) The Polar Express

Hmm—“fix” is a strong word for reinterpreting the creative choices & outcomes of an earlier generation of artists, but it’s certainly interesting to see the divisive Christmas movie re-rendered via emerging AI tech (Midjourney Retexturing + Hailuo Minimax). Do you think the results escape the original’s deep uncanny valley? See more discussion here.

Incisive points on AI & filmmaking from Ben Affleck

Ignoring the misguided (IMHO) contents of the surrounding tweet, I found these four minutes of commentary to be extremely sharp & well informed:

New Google ReCapture tech enables post-capture camera control

Man, I miss working with these guys & gals…

We present ReCapture, a method for generating new videos with novel camera trajectories from a single user-provided video. Our method allows us to re-generate the source video, with all its existing scene motion, from vastly different angles and with cinematic camera motion.

They note that ReCapture is substantially different from prior work: existing methods can control the camera only on images or on generated videos, not on arbitrary user-provided videos. Check it out:

A love letter to splats

Paul Trillo relentlessly redefines what’s possible in VFX—in this case scanning his backyard to tour a magical tiny world:

Here he gives a peek behind the scenes: 

And here’s the After Effects plugin he used:

Thunder & The Deep Blue Sea

Everybody needs a good wingman, and when it comes to celebrating the beauty of aviation, I’ve got a great one in my son Henry. Much as we’ve done the last couple of years, this month we first took in the air show in Salinas, featuring the USAF Thunderbirds…

…followed by the Blue Angels buzzing Alcatraz & the Golden Gate at Fleet Week in San Francisco.

In both cases we were treated to some jaw-dropping performances—from a hovering F-35 to choreographed walls of fire—from some of the best aviators in the world. Check ’em out:

And thanks for the nice shootin’, MiniMe!

Relighting via Midjourney

Check out this impressive use of the new “retexture” feature, which enables image-to-image transformations:

Here’s a bit more on how the new editing features work:

Ideogram Canvas arrives

I’ve become an Ideogram superfan, using it to create imagery daily, so I’m excited to kick the tires on this new interactive tool—especially around its ability to synthesize new text in the style of a visual reference.

You can upload your own images or generate new ones within Canvas, then seamlessly edit, extend, or combine them using industry-leading Magic Fill (inpainting) and Extend (outpainting) tools. Use Magic Fill and Extend to bring your face or brand visuals to Ideogram Canvas and blend them with creative, AI-generated elements. Perfect for graphic design, Ideogram Canvas offers advanced text rendering and precise prompt adherence, allowing you to bring your vision to life through a flexible, iterative process.

Project Perfect Blend promises game-changing compositing in Photoshop

Oh man, for years we wanted to build this feature into Photoshop—years! We tried many times (e.g. I wanted this + scribble selection to be the marquee features in Photoshop Touch back in 2011), but the tech just wasn’t ready. But now, maybe, the magic is real—or at least tantalizingly close!

Being a huge nerd, I wonder how the tech works, and whether it’s substantially the same as what Magnific has been offering (including via a Photoshop panel) for the last several months. Here’s how I used that on my pooch:

But even if it’s all the same, who cares?

Being useful to people right where they live & work, with zero friction, is tremendous. Generative Fill is a perfect example: similar (if lower-quality) inpainting was available from DALL•E for a year+ before we shipped GenFill in Photoshop, but the latter has quietly become an indispensable, game-changing piece of the imaging puzzle for millions of people. I’d love to see compositing improvements go the same way.

The ceiling can’t hold us stuffed animals

As I drove the Micronaxx to preschool back in 2013, Macklemore’s “Can’t Hold Us” hit the radio & the boys flipped out, making their stuffed buddies Leo & Ollie go nuts dancing to the tune. I remember musing with Dave Werner (a fellow dad to young kids) about being able to animate said buddies.

Fast forward a decade+, and now Dave is using Adobe’s recently unveiled Firefly Video model to do what we could only dimly imagine back then:

Time to unearth Leo & get him on stage at last. :->

Flair AI promises brand-consistent video creation

Ever since Google dropped DreamBooth back in 2022, people have been trying—generally without much success—to train generative models that can incorporate the fine details of specific products. Thus far it just hasn’t been possible to meet most brands’ demanding requirements for fidelity.

Now tiny startup Flair AI promises to do just that—and to pair the object definitions with custom styling and even video. Check it out:

In search of The Something Else

Late last night my wife & I found ourselves in the depths of the Sunday Evening Blues—staring out towards the expanse of yet another week of work & school, without much differentiation from most of those before & after it. I’m keenly aware of the following fact, of course:

And yet, oof… it’s okay to acknowledge the petty creeping of tomorrow & tomorrow & tomorrow. The ennui will pass—as everything always does—but it’s real.

This reminded me of the penguin heroine of one of our favorite books to read to the Micronaxx back when they were actually micro, A Penguin Story by Antoinette Portis. Ol’ Edna is always searching for The Something Else—and she finds it! I came across this charming little narration of the story, and just in case you too might need a little avian encouragement—well, enjoy:

Meta AI introduces conversational editing

I was super hyped last year when Meta announced “Emu Edit” tech for selectively editing images using just language:

Now you can try the tech via Meta.ai and in various apps:

In my limited experience so far, it’s cool but highly unpredictable. I’ll test it further, and I’d love to know how it works for you. Meanwhile you can try similar techniques via https://playground.com/:

RIP Dikembe Mutombo

[I know this note seems supremely off topic, but bear with me.]

I’m sorry to hear of the passing of larger-than-life NBA star Dikembe Mutombo. He inspired the name of a “Project Mutombo” at Google, which was meant to block unintended sharing of content outside of one’s company. Unrelated (AFAIK he never knew of the project), back in 2015 I happened to see him biking around campus—dwarfing a hapless Google Bike & making its back tire cartoonishly flat.

RIP, big guy. Thanks for the memories, GIFs, and inspiration.

Zuck talks AR wearables & much more

I quite enjoyed The Verge’s interview with Mark Zuckerberg, discussing how they think about building a whole range of reality-augmenting devices, from no-display Wayfarers to big-ass goggles, and especially to “glasses that look like glasses”—the Holy Grail in between.

Links to some of the wide-ranging topics they covered:

00:00 Orion AR smart glasses
00:27 Platform shift from mobile to AR
02:15 The vision for Orion & AR glasses
03:55 Why people will upgrade to AR glasses
05:20 A range of options for smart glasses
07:32 Consumer ambitions for Orion
11:40 Reality Labs spending & the cost of AR
12:44 Ray-Ban partnership
17:11 Ray-Ban Meta sales & success
18:59 Bringing AI to the Ray-Ban Meta
21:54 Replacing phones with AR glasses
25:18 Influx of AI content on social media
28:32 The vision for AI-filled social media
34:04 Will AI lead to less human interaction?
35:24 Success of Threads
36:41 Competing with X & the role of news
40:04 Why politics can hurt social platforms
41:52 Mark’s shift away from politics
46:00 Cambridge Analytica, in hindsight
49:09 Link between teen mental health and social media
53:52 Disagreeing with EU regulation
56:06 Debate around AI training data & copyright
1:00:07 Responsibility around AR as a platform

Tangentially, I gave myself an unintended chuckle with this:

iPhone goes on safari

Austin Mann puts the new gear through its paces in Kenya:

Last week at the Apple keynote event, the iPhone camera features that stood out the most to me were the new Camera Control button, upgraded 48-megapixel Ultra Wide sensor, improved audio recording features (wind reduction and Audio Mix), and Photographic Styles. […]

Over the past week we’ve traveled over a thousand kilometers across Kenya, capturing more than 10,000 photos and logging over 3TB of ProRes footage with the new iPhone 16 Pro and iPhone 16 Pro Max cameras. Along the way, we’ve gained valuable insights into these camera systems and their features.

“Jurassic Park – 1950’s Super Panavision 70”

Chaos reigns!

I have no idea what AI and other tools were used here, but it’d be fun to get a peek behind the curtain. As a commenter notes,

The meandering strings in the soundtrack. The hard studio lighting of the close-ups. The midtone-heavy Technicolor grading. The macro-lens DOF for animation sequences. This is spot-on 50’s film aesthetic, bravo.

[Via Andy Russell]

Flux goes realtime with Krea

And if that headline makes no sense, it probably just means you’re not terminally AI-pilled, and I’m caught flipping a grunt. 😉 Anyway, the tiny but mighty crew at Krea has brought the new Flux text-to-image model—including its ability to spell—to their realtime creation tool: