Nearly a decade ago now (good grief), my entree to working with the Google AI team was in collaborating with Peyman Milanfar & team to ship a cool upsampling algorithm in Google+ (double good grief) and related apps. Since then they’ve continued to redefine what’s possible, and on the latest Pixel devices, zoom now extends to an eye-popping 100x. Check out this 7-second demo:
I’d seen some eye-popping snippets of the Google XR team’s TED talk a few months back, but until now I hadn’t watched the whole thing. It’s well worth doing so, and I truly can’t process the step change in realtime perceptual capabilities that has recently arrived in Gemini:
A recent Time Magazine cover featuring Zohran Mamdani made me recall a super interesting customer visit I did years ago with photographer Gregory Heisler. Politics aside, this is a pretty cool peek behind the curtains on the making of an epic image:
As for the Mamdani shoot, it sounds quite memorable unto itself—for incredibly different reasons:
was reading the photogs substack and ive seen a lot of tricks on set but ive got to say i did not see this one coming lmfao pic.twitter.com/9pHbkIe9z0
“Coming first to Pixel 10 in the U.S., you can simply describe the edits you want to make by text or voice in Photos’ editor, and watch the changes appear. And to further improve transparency around AI edits, we’re adding support for C2PA Content Credentials in Google Photos.”
Because this is an open-ended, conversational experience, you don’t have to indicate which tools you want to use. For example, you could ask for a specific edit, like “remove the cars in the background” or something more general like “restore this old photo” and Photos will understand the changes you’re trying to make. You can even make multiple requests in a single prompt like “remove the reflections and fix the washed out colors.”
Turntable is now available in the Adobe #Illustrator Public Beta Build 29.9.14!!!
A feature that lets you “turn” your 2D artwork to view it from different angles. With just a few steps, you can generate multiple views without redrawing from scratch.
Given that I’m thinking ahead to photographing air shows this fall, here’s a short, sweet, and relevant little tutorial on creating realistic motion blur on backgrounds:
A couple of weeks ago I saw Photoshop trainer Rob de Winter experimenting with integrating ChatGPT’s image model into Photoshop, much as I’d been quietly colluding with Christian Cantrell to do three years ago using DALL•E (long before Firefly existed, when Adobe was afraid to do anything in the generative space).
I suggested that Rob try using Flux Kontext, and he promptly whipped up this free plugin. Check out the results:
From Rob’s site:
This custom-made Flux Kontext JSX-plugin lets you create context-aware AI edits directly inside Photoshop, based on your selection and a short prompt. Your selection is sent to Replicate’s Flux Kontext models (Pro or Max), and the result is placed back as a new layer with a mask, keeping lighting, shadows, and materials consistent.
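For anyone curious what that round trip looks like outside of a JSX plugin, here's a minimal Python sketch of the same idea, assuming you have a Replicate API token and have saved the pixels under your selection to disk. The model name matches Replicate's Flux Kontext listing, but treat the exact input parameter names as assumptions to verify against Replicate's docs—this is an illustration, not Rob's plugin code:

```python
# Minimal sketch (not Rob's plugin): send a cropped selection + a short prompt to
# Flux Kontext on Replicate, then take the resulting image URL and composite it
# back into your document as a new, masked layer.
# Assumes REPLICATE_API_TOKEN is set in the environment; verify model/input names
# against Replicate's documentation.
import replicate  # pip install replicate

def kontext_edit(selection_path: str, prompt: str,
                 model: str = "black-forest-labs/flux-kontext-pro") -> str:
    """Return a URL for the edited image produced from the selection + prompt."""
    with open(selection_path, "rb") as image_file:
        output = replicate.run(
            model,
            input={
                "prompt": prompt,           # e.g. "replace the mug with a glass of water"
                "input_image": image_file,  # the pixels under your Photoshop selection
            },
        )
    return str(output)  # download this and place it back as a masked layer

# Example (hypothetical file and prompt):
# url = kontext_edit("selection.png", "make the jacket bright red")
```

Swapping in the Max model is just a matter of changing the model string; the masking and layer placement are the parts the plugin handles inside Photoshop itself.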
Watching the face-swapping portion of Jesús’s otherwise excellent demo above made me wince: this part of Photoshop’s toolbox just hasn’t evolved in years and years. It’s especially painful for me, as I returned to Adobe in 2021 to make things like this better. Despite building some really solid tech, however, we were blocked by concerns about ethics (“What if a war criminal got access to this?”; yes, seriously). So it goes.
Maybe someday PS will update its face-related features (heck, for all I know they’re integrating a new API now!). In the meantime, here’s a nice 4-minute tour of how to do this (for free!) in Ideogram:
Wow—well, you sure can’t fault these guys for beating around the bush: video creator Higgsfield has introduced a browser extension that lets you click any image, then convert it to video & create related images. For better or worse, here’s how it works (additional details in thread):
this should be banned..
AI now can clone any ad, change the actor, keep the brand and make it yours
Jesús Ramirez is a master Photoshop compositor, so it’s especially helpful to see his exploration of some of the new tool’s strengths & weaknesses (e.g. limited resolution)—including ways to work around them.
The AI generator—of which I’ve been a longtime fan—has introduced the ability to upload a single image of a person (or cat!), then use it in creating images. It’s hard to overstate just how long people have wanted this kind of control & simplicity.
For a deeper look, here’s a quick demo from the team:
The app promises to let you turn static images into short videos and transform them into fun art styles, plus explore a new creation hub.
I’m excited to try it out, but despite the iOS app having been just updated, it’s not yet available—at least for me. Meanwhile, although I just bit the bullet & signed up for the $20/mo. plan, the three video attempts that Gemini allowed me today all failed. ¯\_(ツ)_/¯
To be honest, I’ve never taken more than a passing interest in most birds, and certainly not in photographing them, but the insane diversity of those in southern Africa was too much to resist. Here are some of my favorites we spied on our journey through Zimbabwe & Botswana:
Meanwhile we enjoyed visiting Painted Dog Conservation and learning about their tireless efforts to preserve & rehabilitate some of the 6,000 or so of these unique animals that remain in the wild—and that often fall prey to poachers’ snares. Tap/click to see a rather charming little vid:
Even though I got absolutely wrecked for having the temerity to use one of my son’s cute old drawings in an AI project last year (no point in now digging up the hundreds of flames it drew), I still enjoy seeing this kind of creative interpretation:
My mom sent me a 30-year-old drawing…
That I made for her when I was a kid.
Naturally, I animated with Midjourney..
+ It’s not perfect
+ But it’s awesome
+ And it captures the innocent chaos
+ Of my childhood imagination.
Man, am I now gonna splash out for another monthly subscription? I haven’t done so yet, but these results are pretty darn impressive:
To turn your photos into videos, select ‘Videos’ from the tool menu in the prompt box and upload a photo. … The photo-to-video capability is starting to roll out today to Google AI Pro and Ultra subscribers in select countries around the world. Try it out at gemini.google.com. These same capabilities are also available in Flow, Google’s AI filmmaking tool.
You wouldn’t think a hyena might require one of those “Do Not Pet” badges sported by service dogs—but you haven’t met all of our travel companions! :->
Hey friends—we’ve made it home to Cali after a whirlwind trip to Zimbabwe & Botswana. I’ll try to post some observations about the state of photo editing these days, and I’d love to hear yours. Meanwhile, while my body still tries to clue into where & when the heck I am, here are a few small galleries I’ve shared so far:
D’oh—before heading to Zimbabwe & Botswana with my wife to celebrate our 20th anniversary, I neglected to mention that things will be a bit quieter around here than normal. We plan to return to the States next week, and I might share a few posts between now & then. Meanwhile, check out some new friends we made this morning!
My family, having seen so many of my AI-powered image generations over the last 3 years, is just utterly inured to them. So, for my MiniMe’s 16th, I sketched up the patriotic little HO-scale engine we’re getting him, along with a cute large ground squirrel (to quote the Dude, “Nice marmot”).
I feel like this is my micro version of when the world revolted against too-perfect Instagram culture, swinging towards Snapchat & stories, where “rough is real,” and flaws are a feature. In any case, my dude was happy as a clam—and that’s all that matters to me.
For my son’s birthday, I ditched AI and broke out my pen. Felt good to work without a net. pic.twitter.com/TJedYCPWMj
Okay, so this isn’t precisely what I thought it was at first (video inpainting), but rather a creation → inpainting → animation flow. Still, the results look impressive:
How it works:
→ Generate an image in Higgsfield Soul
→ Inpaint directly with a mask and a prompt
→ Combine with Camera moves, VFX, and Avatars to turn static edits into living, speaking visuals pic.twitter.com/ENHqdA3WHm
“If you’re into weird cars, forgotten history, and stories that don’t end well, hit that subscribe button.”
I found this piece really interesting, not least because my wife & I are headed to Africa for the first time next week, and I’m eager to learn what kinds of vehicles & roads we’ll experience. Seems like something like the Africar would make a ton of sense in many places:
As I’ve noted previously, Google has been trying to crack the try-on game for a long time. Back in the day (c. 2017), we really wanted to create AR-enabled mirrors that could do this kind of thing. The tech wasn’t quite ready, and for the realtime mirror use case it likely still isn’t, but check out the new free iOS & Android app Doppl:
In May, Google Shopping announced the ability to virtually try billions of clothing items on yourself, just by uploading a photo. Doppl builds on these capabilities, bringing additional experimental features, including the ability to use photos or screenshots to “try on” outfits whenever inspiration strikes.
Doppl also brings your looks to life with AI-generated videos — converting static images into dynamic visuals that give you an even better sense for how an outfit might feel. Just upload a picture of an outfit, and Doppl does the rest.
Several years ago, MyHeritage saw a huge (albeit short-lived) spike in interest from their Deep Nostalgia feature that animated one’s old photos. Everything old is new again, in many senses. Check out Reddit founder Alexis Ohanian talking about how touching he found the tech—as well as tons of blowback from people who find it dystopian.
Damn, I wasn’t ready for how this would feel. We didn’t have a camcorder, so there’s no video of me with my mom. I dropped one of my favorite photos of us in midjourney as ‘starting frame for an AI video’ and wow… This is how she hugged me. I’ve rewatched it 50 times. pic.twitter.com/n2jNwdCkxF
I’ve heard people referring to the recent release of Google’s Veo 3 as the ChatGPT moment for video generation—that is, a true inflection point at which a mere curiosity becomes something of real value. The spatial & character coherence of its output, and especially its ability to generate speech & other audio, turn it into a genuine storytelling tool.
You’ve probably seen some of the myriad vlogger-genre creations making the rounds. Here’s one of my faves:
I’ll note the fact of AI having been involved only because at this point who cares whether AI was involved? We’re happily reaching a plane of maturity where the particular mix of tooling is much less interesting than the vision & vibe.
John Gruber recently linked back to this clip in which designer Neven Mrgan highlights what feels like an important consideration in the age of mass-generated AI “designs”:
I think that was what mattered is that they looked rich, they looked like a lot of work had been put into them. That’s what people latch onto. It seems it’s something that, yes, they should have spent money on, and they should be spending time on right now.
Regardless of what tools were used in the making of a piece, does it feel rich, crafted, thoughtfully made? Does it have a point, and a point of view? As production gets faster, those qualities will become all the more critical for anything—and anyone—wishing to stand out.
This could be an awesome opportunity for the right person, who’d get to work on things I’ve wanted the team to do for 15+ years!
We’re looking for an expert technical product manager to lead Photoshop’s foundational architecture and performance strategy. This is a pivotal role responsible for evolving the core technologies that power Photoshop’s speed, stability, and future scalability across platforms.
You’ll drive major efforts to modernize our rendering and compute architecture, migrate legacy systems to more scalable platforms, and accelerate performance through GPU and hardware optimization. This work touches nearly every part of Photoshop, from canvas rendering to feature responsiveness to long-term cross-platform consistency.
This is a principal-level individual contributor role with the potential to grow a team in the future.
I interviewed many hundreds of PM candidates at Google, and if things were going well, I’d ask, “Tell me about a product you hate that you use regularly. Why do you hate it?”
This proved to be a great bozo detector. Does this person have curiosity, conviction, passion, unreasonableness? Were they forced into coding & now just want to escape life in the damn debugger, or do they have a semi-pathological need to build stuff they’re proud of? Would I want them in the proverbial foxhole with me? Are they willing to sweep the floor?
Unsurprisingly, most candidates offer shallow, banal answers (“Uh, wow… I mean, I guess the ESPN app is kinda slow…?”), whereas great ones explain not just what sucks, but why it sucks. Like, why—systemically—is every car infotainment system such crap? Those are the PMs I want asking the questions, then questioning the answers.
——-
Specifically on the car front, as Tolstoy might say, “each one is unhappy in its own way.” The most interesting thing, I think, isn’t just to talk about the crappy, mismatched & competing experiences, but rather about why every system I’ve ever used sucks. The answer can’t be “Every person at every company is a moron”—so what is it?
So much comes down to the structure of the industry, with hardware & software being made by a mishmash of corporate frenemies, all contending with a soup of regulations, risk aversion (one recall can destroy the profitability of a whole product line), and surprisingly bargain-bin electronics.
Check out this short vid for some great insights from Ford CEO Jim Farley:
Ford CEO Jim Farley on why it’s so difficult for legacy car companies to get software right & why @Tesla’s vertically integrated approach is the right one:
“We farmed out all the modules that control the vehicles to our suppliers because we could bid them against each other, so… pic.twitter.com/kWsIaiOJlI
A while back, Sam Harris & Ricky Gervais discussed the impossibility of translating a joke discovered during a dream (“What noise does a monster make?”) back into our consensus waking reality. Like… what?
I get the same vibes watching ChatGPT try to dredge up some model of me and of… humor?… in creating a comic strip based on our interactions. I find it uncanny, inscrutable, and yet consequently charming all at once.
“Hey ChatGPT, based on what you know about me, please create a four-panel comic you think I’d like…” https://t.co/U7WRfShGRh
Spline (2D/3D design in your browser) has added support for progressive blur & gradients, and the results look awesome.
I haven’t seen an advance like this in Adobe’s core apps in maybe 20 years—maybe 25, since Illustrator & Acrobat added support for transparency.
We are adding Progressive Blur + Gradients to Hana! All interactive, all real-time.
On an aesthetically similar note, check out the launch video for the new version of Sketch (still very much alive & kicking in an age of Figma, it seems):
Remember when we said auto layout was coming to Sketch? It’s here. It’s called Stacks, and it’s part of our biggest release ever — out now.
There’s a lot to cover, so buckle up and we’ll give you a tour.
Also, stick around for a surprise at the end of the thread
Opt in to get started: Head over to Search Labs and opt into the “try on” experiment.
Browse your style: When you’re shopping for shirts, pants or dresses on Google, simply tap the “try it on” icon on product listings.
Strike a pose: Upload a full-length photo of yourself. For best results, ensure it’s a full-body shot with good lighting and fitted clothing. Within moments, you can see how the garment will look on you.
Several years ago, my old teammates shared some promising research on how to facilitate more interesting typesetting. Check out this 1-minute overview:
Ever since the work landed in Adobe Express a while back, I’ve wondered why it hasn’t yet made its way to Photoshop or Illustrator. Now, at least, it looks like it’s on its way to PS:
The feature looks cool, and I’m eager to try it out, but I hope that Adobe will keep trying to offer something more semantically grounded (i.e. where word size is tied to actual semantic importance, not just rectangular shape bounds)—like what we shipped last year: