Everybody needs a good wingman, and when it comes to celebrating the beauty of aviation, I’ve got a great one in my son Henry. Much as we’ve done the last couple of years, this month we first took in the air show in Salinas, featuring the USAF Thunderbirds…
…followed by the Blue Angels buzzing Alcatraz & the Golden Gate at Fleet Week in San Francisco.
In both cases we were treated to some jaw-dropping performances—from a hovering F-35 to choreographed walls of fire—from some of the best aviators in the world. Check ’em out:
Here’s a bit more on how the new editing features work:
We’re testing two new features today: our image editor for uploaded images and image re-texturing for exploring materials, surfacing, and lighting. Everything works with all our advanced features, such as style references, character references, and personalized models pic.twitter.com/jl3a1ZDKNg
I’ve become an Ideogram superfan, using it to create imagery daily, so I’m excited to kick the tires on this new interactive tool—especially around its ability to synthesize new text in the style of a visual reference.
Today, we’re introducing Ideogram Canvas, an infinite creative board for organizing, generating, editing, and combining images.
Bring your face or brand visuals to Ideogram Canvas and use industry-leading Magic Fill and Extend to blend them with creative, AI-generated content. pic.twitter.com/m2yjulvmE2
You can upload your own images or generate new ones within Canvas, then seamlessly edit, extend, or combine them using industry-leading Magic Fill (inpainting) and Extend (outpainting) tools. Use Magic Fill and Extend to bring your face or brand visuals to Ideogram Canvas and blend them with creative, AI-generated elements. Perfect for graphic design, Ideogram Canvas offers advanced text rendering and precise prompt adherence, allowing you to bring your vision to life through a flexible, iterative process.
Filmmaker & Pika Labs creative director Matan Cohen Grumi makes this town look way more dynamic than usual (than ever?) through the power of his team’s tech:
Adobe’s new generative 3D/vector tech is a real head-turner. I’m impressed that the results look like clean, handmade paths, with colors that match the original—and not like automatic tracing of crummy text-to-3D output. I can’t wait to take it for a… oh man, don’t say it don’t say it… spin.
Oh man, for years we wanted to build this feature into Photoshop—years! We tried many times (e.g. I wanted this + scribble selection to be the marquee features in Photoshop Touch back in 2011), but the tech just wasn’t ready. But now, maybe, the magic is real—or at least tantalizingly close!
Being a huge nerd, I wonder about how the tech works, and whether it’s substantially the same as what Magnific has been offering (including via a Photoshop panel) for the last several months. Here’s how I used that on my pooch:
But even if it’s all the same, who cares?
Being useful to people right where they live & work, with zero friction, is tremendous. Generative Fill is a perfect example: similar (if lower quality) inpainting was available from DALL•E for a year+ before we shipped GenFill in Photoshop, but the latter has quietly become an indispensable, game-changing piece of the imaging puzzle for millions of people. I’d love to see compositing improvements go the same way.
As I drove the Micronaxx to preschool back in 2013, Macklemore’s “Can’t Hold Us” hit the radio & the boys flipped out, making their stuffed buddies Leo & Ollie go nuts dancing to the tune. I remember musing with Dave Werner (a fellow dad to young kids) about being able to animate said buddies.
Fast forward a decade+, and now Dave is using Adobe’s recently unveiled Firefly Video model to do what we could only dimly imagine back then:
Amazing, and literally immersive, work by artists at The Weather Channel. Yikes—stay safe out there, everybody.
The 3D artists at the weather channel deserve a raise for this insane visual
Now watch this, and then realize forecasts are now predicting up to 15 ft of storm surge in certain areas on the western coast of Florida pic.twitter.com/HHrCVWNgpg
Ever since Google dropped DreamBooth back in 2022, people have been trying—generally without much success—to train generative models that can incorporate the fine details of specific products. Thus far it just hasn’t been possible to meet most brands’ demanding requirements for fidelity.
Now tiny startup Flair AI promises to do just that—and to pair the object definitions with custom styling and even video. Check it out:
You can now generate brand-consistent video advertisements for your products on @flairAI_
1. Train a model on your brand’s aesthetic
2. Train a model on your clothing or product
3. Combine both models in one prompt
4. Animate✨
Late last night my wife & I found ourselves in the depths of the Sunday Evening Blues—staring out towards the expanse of yet another week of work & school, without much differentiation from most of those before & after it. I’m keenly aware of the following fact, of course:
And yet, oof… it’s okay to acknowledge the petty creeping of tomorrow & tomorrow & tomorrow. The ennui will pass—as everything always does—but it’s real.
This reminded me of the penguin heroine in what was one of our favorite books to read to the Micronaxx back when they were actually micro, A Penguin Story by Antoinette Portis. Ol’ Edna is always searching for The Something Else—and she finds it! I came across this charming little narration of the story, and just in case you too might need a little avian encouragement—well, enjoy:
In my limited experience so far, it’s cool but highly unpredictable. I’ll test it further, and I’d love to know how it works for you. Meanwhile you can try similar techniques via https://playground.com/:
Welcome to the new Playground
Use AI to design logos, t-shirts, social media posts, and more by just texting it like a person.
[I know this note seems supremely off topic, but bear with me.]
I’m sorry to hear of the passing of larger-than-life NBA star Dikembe Mutombo. He inspired the name of a “Project Mutombo” at Google, which was meant to block unintended sharing of content outside of one’s company. Unrelated (AFAIK he never knew of the project), back in 2015 I happened to see him biking around campus—dwarfing a hapless Google Bike & making its back tire cartoonishly flat.
RIP, big guy. Thanks for the memories, GIFs, and inspiration.
Wow @runwayml just dropped an updated Gen-3 Alpha Turbo Video-to-Video mode & it’s awesome! It’s super fast & lets you do 9:16 portrait video. Anything is possible! pic.twitter.com/AxeFaJwAPR
I quite enjoyed the Verge’s interview with Mark Zuckerberg, discussing how they think about building a whole range of reality-augmenting devices, from no-display Wayfarers to big-ass goggles, and especially to “glasses that look like glasses”—the Holy Grail in between.
Links to some of the wide-ranging topics they covered:
00:00 Orion AR smart glasses
00:27 Platform shift from mobile to AR
02:15 The vision for Orion & AR glasses
03:55 Why people will upgrade to AR glasses
05:20 A range of options for smart glasses
07:32 Consumer ambitions for Orion
11:40 Reality Labs spending & the cost of AR
12:44 Ray-Ban partnership
17:11 Ray-Ban Meta sales & success
18:59 Bringing AI to the Ray-Ban Meta
21:54 Replacing phones with AR glasses
25:18 Influx of AI content on social media
28:32 The vision for AI-filled social media
34:04 Will AI lead to less human interaction?
35:24 Success of Threads
36:41 Competing with X & the role of news
40:04 Why politics can hurt social platforms
41:52 Mark’s shift away from politics
46:00 Cambridge Analytica, in hindsight
49:09 Link between teen mental health and social media
53:52 Disagreeing with EU regulation
56:06 Debate around AI training data & copyright
1:00:07 Responsibility around AR as a platform
Tangentially, I gave myself an unintended chuckle with this:
Last week at the Apple keynote event, the iPhone camera features that stood out the most to me were the new Camera Control button, upgraded 48-megapixel Ultra Wide sensor, improved audio recording features (wind reduction and Audio Mix), and Photographic Styles. […]
Over the past week we’ve traveled over a thousand kilometers across Kenya, capturing more than 10,000 photos and logging over 3TB of ProRes footage with the new iPhone 16 Pro and iPhone 16 Pro Max cameras. Along the way, we’ve gained valuable insights into these camera systems and their features.
Fernando Livschitz, whose amazing work I’ve featured many times over the years, is back with some delightfully pillowy interactions in & over the Big Apple:
I have no idea what AI and other tools were used here, but it’d be fun to get a peek behind the curtain. As a commenter notes,
The meandering strings in the soundtrack. The hard studio lighting of the close-ups. The midtone-heavy Technicolor grading. The macro-lens DOF for animation sequences. This is spot-on 50’s film aesthetic, bravo.
And if that headline makes no sense, it probably just means you’re not terminally AI-pilled, and I’m caught flipping a grunt. 😉 Anyway, the tiny but mighty crew at Krea have brought the new Flux text-to-image model—including its ability to spell—to their realtime creation tool:
Flux now in Realtime.
available in Krea with hundreds of styles included.
What a fun little project & great NYC vibe-catcher: the folks at Runway captured street scenes with a disposable film camera, then used their model to put the images in motion. Check it out:
Check out my friend Bilawal’s summary thread, which pairs quick demos from Apple with bits of useful context:
Caught the Apple keynote? I’ve distilled down the most intriguing highlights for AI and spatial computing creators and builders—no need to sift through it yourself. Thread: pic.twitter.com/hiLM7iMzi4
There are some great additional details in this thread from Halide Camera as well:
There’s a lot of info to digest from the keynote, so here’s our summary of all the changes and new features of iPhone 16 and 16 Pro cameras in this quick thread pic.twitter.com/z7xB0aekLi
Somehow, despite my wife being a huge fan of the show over the last couple of years, I hadn’t previously seen the delightful titles for Only Murders In The Building:
“The brief was this idea of a love letter to New York in a way and true crime and true crime podcasts,” Lisa Bolan, a creative director at Elastic, told Salon. “John really wanted to capture this romantic illustrative approach to New York, building on the magic of Hirschfeld and The New Yorker – illustrators who have abstracted New York in a way that’s beautiful and also speaks to these little glimpses of magic in the urban landscape.”
I love seeing how scrappy creators combine tools in new ways, blazing trails that we may come to see as commonplace soon enough. Here Eric Solorio (enigmatic_e) shows how he used Viggle & other tools to create his viral Deadpool animation:
As promised, here is a breakdown of how I did the Deadpool animation I recently posted. pic.twitter.com/F130Skq17U
If you never see the use of After Effects in this delightfully madcap vid—well, that’s exactly as it should be. Apparently the filmmakers were featured in an Adobe trade show booth after it was released.
I’ve been having a ball using the new Ideogram app for iOS to import photos & remix them into new creations. This is possible via their web UI as well, but there’s something extra magical about the immediacy of capture & remix. Check out a couple quick explorations I did while out with the kids, starting from a ballcap & the fuel tank of an old motorcycle:
I love this level of transparency from the folks behind Photo AI. Developer @levelsio reports,
[Flux] made Photo AI finally good enough overnight to be actually used by people and be satisfied with the results… it’s more expensive [than SD] but worth it because the photos are way way better… Not sure about profitability but with SD it was about 85% profit. With Flux def less maybe 65%… Very unplanned and grateful the foundational models got better.
We’re arguably in something of a trough of disillusionment in the AI-art hype cycle, but this kind of progress gives reason for hope: more quality & more utility do translate into more sustainable value—and there’s every reason to think that things will only improve from here.
Flux, the new AI model, changes businesses (and lives)
It made https://t.co/1vEawpI5vb finally good enough overnight to be actually used by people and be satisfied with the results
All my improvements before helped but now it’s accelerating with Flux’s photo quality pic.twitter.com/BiAqi5BgnY
Listen, I know that it’s a lot more seductive & cathartic to say “I f*cking hate generative AI,” and you can get 90,000+ likes for doing so, but—believe it or not—thoughtfulness & nuance actually matter. That is, how one uses generative tech can have very different implications for the creative community.
It’s therefore important to evaluate a range of risk/reward scenarios: What’s unambiguously useful & low-risk, vs. what’s an inducement to ripping people off, and what lies in the middle?
I see a continuum like this (click/tap to see larger):
None of this will draw any attention or generate much conversation—at least if my attempts to engage people on Twitter are any indication—but it’s the kind of thing actual toolmakers must engage with if we’re to make progress together. And so, back to work.
“Tell me about a product you hate that you use regularly.” I asked this question of hundreds of Google PM candidates I interviewed, and it was always a great bozo detector. Most people don’t have much of an answer—no real passion or perspective. I want to know not just what sucks, but why it sucks.
If I were asked the same question, I’d immediately say “Every car infotainment system ever made.” As Tolstoy might say, “Each one is unhappy in its own way.” The most interesting thing, I think, isn’t just to talk about the crappy mismatched & competing experiences, but rather about why every system I’ve ever used sucks. The answer can’t be “Every person at every company is a moron”—so what is it?
So much comes down to the structure of the industry, with hardware & software being made by a mishmash of corporate frenemies, all contending with a soup of regulations, risk aversion (one recall can destroy the profitability of a whole product line), and surprisingly bargain-bin electronics.
Despite all that, talented folks continue to fight the good fight, and I enjoyed John LePore’s speculative designs that reinterpret the instrument clusters of classic cars (from Corvettes to DeLoreans) through Apple’s latest CarPlay framework:
My friend Nathan has fed a mix of Schwarzenegger photos & drawings from Aesop’s Fables into the new open-source Flux model, creating a rad woodcut style. That’s interesting enough on its own—but it’s so 24 hours ago, and thus he’s now taken to animating the results. Check out the thread below for details:
Animating yesterday’s #FLUX woodcut Arnold using one of my favorite clips from the old soundboards
This uses Follow-Your-Emoji / Reference UNet in ComfyUI, which did a better job than LivePortrait.
It’s wild that capabilities that blew our minds two years ago—for which I & others spent months on a waiting list for DALL•E, which demanded beefy servers to run—are now available (only better) running in your pocket, on your telephone. Check out the latest from Google:
Pixel Studio is a first-of-its-kind image generator. So now you can bring all ideas to life from scratch, right on your phone — a true creative canvas.
It’s powered by combining an on-device diffusion model running on Tensor G4 and our Imagen 3 text-to-image model in the cloud. With a UI optimized for easy prompting, style changes and editing, you can quickly bring your ideas to conversations with friends and family.
3. Pixel Studio
Create anything you imagine with PixelStudio, a groundbreaking image generator powered by an on-device diffusion model. It’s your AI canvas. pic.twitter.com/oDBqkUfqOR
Back when I worked on Google Photos, and especially later when I worked in Research, I really wanted to ship a camera mode that would help ensure great group photos. Prior to the user pressing the capture button, it would observe the incoming video stream, notice when it had at least one instance of each face smiling with their eyes open, and then knit together a single image in which everyone looked good.
Of course, the idea was hardly new: I’d done the same thing manually with my own wedding photos back in 2005, and in 2013 Google+ introduced “AutoAwesome Smile” to select good expressions across images & merge them into a single shot. It was a great feature, though sadly the only time people noticed its existence was when it failed in often hilarious “AutoAwful” ways (turning your baby or dog into, say, a two-nosed Picasso). My idea was meant to improve on this by not requiring multiple photos, and of course by suppressing unwanted hilarity.
Anyway, Googlers gonna Google, and now the Pixel team has introduced an interactive mode that helps you capture & merge two shots—the first one of a group, and the second of the photographer who took the first. Check out Marques Brownlee’s 1-minute demo:
The most interesting AI feature on the new Pixels IMO: “Add Me”