Category Archives: AI/ML

AI in Ai: Illustrator adds Vector GenFill

As I’ve probably mentioned already, when I first surveyed Adobe customers a couple of years ago (right after DALL•E & Midjourney first shipped), it was clear that they wanted selective synthesis—adding things to compositions, and especially removing them—much more strongly than whole-image synthesis.

Thus it’s no surprise that Generative Fill in Photoshop has so clearly delivered Firefly’s strongest product-market fit, and I’m excited to see Illustrator following the same path—but for vectors:

Generative Shape Fill will help you improve your workflow, including:

  • Create detailed, scalable vectors: After you draw or select your shape, silhouette, or outline in your artboard, use a text prompt to ideate on vector options to fill it.
  • Style Reference for brand consistency: Create a wide variety of options that match the color, style, and shape of your artwork to ensure a consistent look and feel.
  • Add effects to your creations: Enhance your vector options further by adding styles like 3D, geometric, pixel art, and more.

They’re also adding the ability to create vector patterns simply via prompting:

Photoshop’s new Selection Brush helps control GenFill

Soon after Generative Fill shipped last year, people discovered that using a semi-opaque selection could help blend results into an environment (e.g. putting fish under water). The new Selection Brush in Photoshop takes functionality that’s been around for 30+ years (via Quick Mask mode) and brings it more to the surface, which in turn makes it easier to control GenFill behavior:
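Under the hood, the semi-opaque-selection trick is just per-pixel alpha blending: the selection’s opacity decides how much of each generated pixel replaces the original. A toy sketch of the idea (plain Python on grayscale values, not Adobe’s actual implementation):

```python
def blend_genfill(original, generated, selection):
    """Blend generated pixels into the original, weighted by selection opacity.

    All three arguments are equal-length lists of grayscale values in 0..255;
    selection values act as per-pixel alpha (255 = fully selected).
    """
    out = []
    for orig, gen, sel in zip(original, generated, selection):
        alpha = sel / 255.0
        out.append(round(alpha * gen + (1 - alpha) * orig))
    return out

# A half-opaque selection yields a mix of the original and generated pixels:
print(blend_genfill([0, 100], [200, 200], [128, 255]))
```

A fully opaque selection hands the pixel entirely to the generated result; a 50% selection splits the difference, which is exactly why results "sink into" the scene.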

Magnific magic comes to Photoshop

I’m delighted to see that Magnific is now available as a free Photoshop panel!

For now the functionality is limited to upscaling, but I have to think that they’ll soon turn on the super cool relighting & restyling tech that enables fun like transforming my dog using just different prompts (click to see larger):

Realtime face editing with LivePortrait

I wish Adobe hadn’t given up (at least for the last couple of years and foreseeable future) on the Smart Portrait tech we were developing. It’s been stuck at 1.0 since 2020 and could be so much better. Maybe someday!

In the meantime, check out LivePortrait:

And now you can try it out for yourself:

tyFlow: Stable Diffusion-based rendering in 3ds Max

Being able to declare what you want, instead of having to painstakingly set up parameters for materials, lighting, etc., may prove to be an incredible unlock for visual expressivity, particularly around the generally intimidating realm of 3D. Check out what tyFlow is bringing to the table:

You can see a bit more about how it works in this vid…

…or a lot more in this one:

How I wish Photoshop would embrace AI

Years ago Adobe experimented with a real-time prototype of Photoshop’s Landscape Mixer Neural Filter, and the resulting responsiveness made one feel like a deity—fluidly changing summer to winter & back again. I was reminded of using Google Earth VR, where grabbing & dragging the sun to change the time of day felt similarly magical.

Nothing came of it, but in the time since then, realtime diffusion rendering (see amazing examples from Krea & others) and image-to-image restyling have opened some amazing new doors. I wish I could attach filters to any layer in Photoshop (text, 3D, shape, image) and have it reinterpreted like this:

Magic Insert promises stylistically harmonized compositing

New tech from my old Google teammates makes some exciting claims:

Using Magic Insert we are, for the first time, able to drag-and-drop a subject from an image with an arbitrary style onto another target image with a vastly different style and achieve a style-aware and realistic insertion of the subject into the target image.

Of course, much of the challenge here—where art meets science—is around identity preservation: to what extent can & should the output resemble the input? Here it’s subject to some interpretation. In other applications one wants an exact copy of a given person or thing, but optionally transformed in just certain ways (e.g. pose & lighting).
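For what it’s worth, identity preservation is typically measured by comparing subject embeddings of the input & output, e.g. via cosine similarity. A generic sketch (real systems use learned face/subject embedding models; this just shows the metric):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Identical embeddings score 1.0; orthogonal (unrelated) ones score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

The art-meets-science tension is picking the threshold: how far can that score drop before the output no longer reads as "the same" person or thing?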

When we launched Firefly last year, we showed off some of Adobe’s then-new ObjectStitch tech for making realistic composites. It didn’t ship while I was there due to challenges around identity preservation. As far as I know those challenges remain only partially solved, so I’ll continue holding out hope—as I have for probably 30 years now!—for future tech breakthroughs that get us all the way across that line.

Day & Night, Magnific + Luma Edition

Check out this striking application of AI-powered relighting: a single rendering is deeply & realistically transformed via one AI tool, and the results are then animated & extended by another.

Meanwhile Krea has just jumped into the game with similar-looking relighting tech. I’m off to check it out!

Can you use Photoshop GenFill on video?

Well, it doesn’t create animated results, but it can work perhaps surprisingly well on regions in static shots:

It can also be used to expand the canvas of similar shots:


Lego Pixelbot 3000 creates pixel art via DALL•E 3

Much amaze, wowo wowo:

This Lego machine can easily create a beautiful pixelart of anything you want! It is programmed in Python, and, with help of OpenAI’s DALL-E 3, it can make anything!

DesignBoom writes,

Sten of the YouTube channel Creative Mindstorms demonstrates his very own robot printer named Pixelbot 3000, made of LEGO bricks, that can produce pixel art with the help of OpenAI’s DALL-E 3 and AI images. Using a 32 x 32 plate and numerous round LEGO bricks, the robot printer automatically pins the pieces onto their designated positions until it forms the pixel art version of the image. He uses Python as his main programming language, and to create pixel art of anything, he employs AI, specifically OpenAI’s DALL-E 3.
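The core software step described above—reducing a generated image to a 32 x 32 grid of available brick colors—amounts to downscaling plus nearest-color quantization. A rough standalone sketch (the palette here is hypothetical; the real project’s code differs):

```python
# Map each pixel of a low-res image to the nearest available brick color.
PALETTE = {            # hypothetical subset of round-brick colors (RGB)
    "red": (200, 30, 30),
    "yellow": (240, 200, 40),
    "blue": (30, 60, 180),
    "white": (245, 245, 245),
    "black": (20, 20, 20),
}

def nearest_brick(rgb):
    """Return the palette color name closest to rgb (squared Euclidean distance)."""
    return min(
        PALETTE,
        key=lambda name: sum((c - p) ** 2 for c, p in zip(rgb, PALETTE[name])),
    )

def quantize(image):
    """image: 2D list of RGB tuples (e.g. 32x32) -> 2D list of brick color names."""
    return [[nearest_brick(px) for px in row] for row in image]

print(quantize([[(255, 0, 0), (0, 0, 255)]]))  # [['red', 'blue']]
```

The resulting grid of color names maps directly to placement instructions for the robot: one round brick per cell.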

Glif enables SD-powered image remixing via right click

Fun! You can grab the free browser extension here.

  • right-click-remix any image w/ tons of amazing AI presets: Style Transfer, Controlnets…
  • build & remix your own workflows with full comfyUI support
  • local + cloud!

Besides some really great default presets using all sorts of amazing ComfyUI workflows (which you can inspect and remix), the extension will now also pull your own compatible glifs into it!

MimicBrush promises prompt-free regional adjustment

The tech, a demo of which you can try here, promises “‘imitative editing,’ allowing users to edit images using reference images without the need for detailed text descriptions.”

Here it is in action:

Runway introduces Gen-3 video

Good grief, the pace of change makes “AI vertigo” such a real thing. Just last week we were seeing “skeleton underwater” memes with Runway submerged in a rusty chair. :-p I’m especially excited to see how it handles text (which remains a struggle for text-to-image models including DALL•E):

Google introduces a super fun GenType tool

I’m really digging the simple joy in this little experiment, powered by Imagen:

Here’s a bit of fun enabled by “weedy seadragons on PVC pipes in a magical undersea kingdom” (click to see at full res):

Luma unveils Dream Machine video generator

I’m super eager to try this one out!

It is a highly scalable and efficient transformer model trained directly on videos, making it capable of generating physically accurate, consistent, and eventful shots. Dream Machine is our first step towards building a universal imagination engine and it is available to everyone now!

Adobe TOS = POS? Not so much.

There’s been a firestorm this week about the terms of service that my old home team put forward, based (as such things have been since time immemorial) on a lot of misunderstanding & fear. Fortunately the company has been working to clarify what’s really going on.

I did at least find this bit of parody amusing:

HyperDreamBooth, explained in 5 minutes

My former Google teammates have been cranking out some amazing AI personalization tech, with HyperDreamBooth far surpassing the performance of their original DreamBooth (y’know, from 2022—such a simpler ancient time!). Here they offer a short & pretty accessible overview of how it works:

Using only a single input image, HyperDreamBooth is able to personalize a text-to-image diffusion model 25x faster than DreamBooth, by using (1) a HyperNetwork to generate an initial prediction of a subset of network weights that are then (2) refined using fast finetuning for high fidelity to subject detail. Our method both conserves model integrity and style diversity while closely approximating the subject’s essence and details.
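The two-stage recipe in that quote (predict low-rank weight deltas, then briefly refine them) can be caricatured in a few lines. This is an illustrative toy, not the paper’s actual architecture; in particular, the "refinement" here is just a scalar stand-in for fast finetuning:

```python
def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
        for row in A
    ]

def personalize(W_base, A, B, refine_scale=1.0):
    """Stage 1: a hypernetwork would predict low-rank factors A (n x r), B (r x m).
    Stage 2: fast finetuning nudges the delta (modeled here as a scalar).
    Personalized weights = W_base + refine_scale * (A @ B)."""
    delta = matmul(A, B)
    return [
        [w + refine_scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W_base, delta)
    ]

W = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 base weights
A = [[1.0], [0.0]]             # rank-1 factors (would come from the hypernetwork)
B = [[0.5, 0.5]]
print(personalize(W, A, B))    # [[1.5, 0.5], [0.0, 1.0]]
```

The speedup intuition: predicting a good low-rank delta up front means finetuning starts near the answer instead of from scratch.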

Check out The TED AI Show

“Maybe the real treasure was the friends we made along the way” is, generally, ironic shorthand for “worthless treasure”—but I’ve also found it to be true. That’s particularly the case for the time I spent at Google, where I met excellent folks like Bilawal Sidhu (a fellow PM veteran of the augmented reality group). I’m delighted that he’s now crushing it as the new host of the TED AI Show podcast.

Check out their episodes so far, including an interview with former OpenAI board member Helen Toner, who discusses the circumstances of firing Sam Altman last year before losing her board position.

Microsoft Paint (Paint!) does generative AI

Who’d a thunk it? But now everyone is getting into the game:

“Combine your ink strokes with text prompts to generate new images in nearly real time with Cocreator,” Microsoft explains. “As you iterate, so does the artwork, helping you more easily refine, edit and evolve your ideas. Powerful diffusion-based algorithms optimize for the highest quality output over minimum steps to make it feel like you are creating alongside AI.”

GenFill comes to Lightroom!

When I surveyed thousands of Photoshop customers waaaaaay back in the Before Times—y’know, summer 2022—I was struck by the fact that beyond wanting to insert things into images, and far beyond wanting to create images from scratch, just about everyone wanted better ways to remove things.

Happily, that capability has now come to Lightroom. It’s a deceptively simple change that, I believe, required a lot of work to evolve Lr’s non-destructive editing pipeline. Traditionally all edits were expressed as simple parameters, and then masks got added—but as far as I know, this is the first time Lr has ventured into transforming pixels in an additive way (that is, modify one bunch, then make subsequent edits that depend on the previous edits). That’s a big deal, and a big step forward for the team.
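To make that pipeline shift concrete, here’s a toy model of an ordered edit stack in which a generative edit consumes the pixels produced by the edits beneath it, unlike purely parametric edits, which are simply re-evaluated from the source. (This is my sketch, not Lightroom’s architecture; grayscale values stand in for real pixels.)

```python
class EditStack:
    """Toy non-destructive pipeline: every render replays the stack from the
    source, and a 'generative' edit depends on the pixels already produced
    by the edits beneath it."""

    def __init__(self, source):
        self.source = source      # list of grayscale values
        self.edits = []           # ordered edits, each a function pixels -> pixels

    def add_exposure(self, offset):
        """Parametric edit: a simple brightness offset, clamped to 0..255."""
        self.edits.append(lambda px: [min(255, max(0, v + offset)) for v in px])

    def add_generative_fill(self, region, fill_fn):
        """Pixel edit: fill_fn sees the *current* pixels, so its result
        depends on every edit applied before it."""
        def edit(px):
            out = list(px)
            for i in region:
                out[i] = fill_fn(px, i)
            return out
        self.edits.append(edit)

    def render(self):
        px = list(self.source)
        for edit in self.edits:
            px = edit(px)
        return px

stack = EditStack([10, 20, 30, 40])
stack.add_exposure(5)                                # parametric edit
stack.add_generative_fill([1], lambda px, i: px[0])  # "fill" pixel 1 from a neighbor
print(stack.render())                                # [15, 15, 35, 45]
```

Notice that the fill result (15) already includes the exposure boost; reorder or change the exposure edit and the fill output changes too, which is exactly the dependency the Lr team had to engineer around.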

A few more examples courtesy of Howard Pinsky:

Podcast: Shantanu on The Verge

Adobe’s CEO (duh :-)) sat down with Nilay Patel for an in-depth interview. Here are some of the key points, as summarized by ChatGPT:


  1. AI as a Paradigm Shift: Narayen views AI as a fundamental shift, similar to the transitions to mobile and cloud technologies. He emphasizes that AI, especially generative AI, can automate tasks, enhance creative processes, and democratize access to creative tools. This allows users who might not have traditional artistic skills to create compelling content.
  2. Generative AI in Adobe Products: Adobe’s Firefly, a family of generative AI models, has been integrated into various Adobe products. Firefly enhances creative workflows by enabling users to generate images, text effects, and video content with simple text prompts. This integration aims to accelerate ideation, exploration, and production, making it easier for creators to bring their visions to life.
  3. Empowering Creativity: Narayen highlights that Adobe’s approach to AI is centered around augmenting human creativity rather than replacing it. Tools like Generative Fill in Photoshop and new generative AI features in Premiere Pro are designed to streamline tedious tasks, allowing creators to focus on the more creative aspects of their work. This not only improves productivity but also expands creative possibilities.
  4. Business Model and Innovation: Narayen discusses how Adobe is adapting its business model to leverage AI. By integrating AI across Creative Cloud, Document Cloud, and Experience Cloud, Adobe aims to enhance its products and deliver more value to users. This includes experimenting with new business models and monetizing AI-driven features to stay at the forefront of digital creativity.
  5. Content Authenticity and Ethics: Adobe emphasizes transparency and ethical use of AI. Initiatives like Content Credentials help ensure that AI-generated content is properly attributed and distinguishable from human-created content. This approach aims to maintain trust and authenticity in digital media.

Google’s CAT3D makes eye-popping worlds

I still can’t believe I was allowed in the building with these giant throbbing brains. 🙂

This kind of evolution should make a lot of people rethink what it means to be an image editor going forward—or even an image.

Conversational search is coming to Google Photos

I’ve gotta say, this one touches a kinda painful nerve with me.

10 years ago I walked into the Google Photos team expecting normal humans to do things like say, “Show me the best pictures of my grandkids.” I immediately felt like a fool: something like 97% of daily users don’t search, preferring to simply launch the app and scroll scroll scroll forever.

A decade later, the Photos team is talking about using large language models to enable uses like the following:

With Ask Photos, you can ask for what you’re looking for in a natural way, like: “Show me the best photo from each national park I’ve visited.” Google Photos can show you what you need, saving you from all that scrolling.

For example, you can ask: “What themes have we had for Lena’s birthday parties?” Ask Photos will understand details, like what decorations are in the background or on the birthday cake, to give you the answer.

Will anyone actually do this? It’s really hard for me to imagine, at least as it’s been framed above.

Now, what I can imagine working—in pretty great ways—is a real Assistant experience that suggests a bunch of useful tasks with which it can assist, such as gathering up photos to make birthday or holiday cards. (The latter task falls to me every year, and I wish I could do it better.) Assistant could easily ask whose birthday it is & on what date, then scan one’s library and suggest a nice range of images as well as presentation options (cards, short animations, etc.). That kind of agent could be a joy to interact with.

Conversational editing & ControlNet arrive in Photoshop (via plugin)

Never doubt the power of a motivated person or two to do what needs to be done. Stick around to the last section of this short vid to see Stable Diffusion-powered “Find & Replace” (maskless inpainting powered by prompts) in action:

Some eye-popping AI/3D demos

Martin Evening combines Adobe Substance 3D modeler and Krea to go from 3D sketch to burning rubber:

Jon Finger combines a whole slew of tools for sketch->AR:

Krea video arrives

Unlike Runway, Pika, Sora, and other generative video models, this approach from Krea (well-known for their realtime, multimodal AI composition tools) is simply keyframing states of image generation—which is a pretty powerful approach unto itself.
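Conceptually, keyframing generation states boils down to interpolating between fixed points (latents, prompt embeddings, etc.) across frames; in the simplest case that’s plain linear interpolation. A generic sketch, not Krea’s implementation:

```python
def lerp(a, b, t):
    """Linear interpolation between vectors a and b at t in [0, 1]."""
    return [x + t * (y - x) for x, y in zip(a, b)]

def keyframe_sequence(start, end, n_frames):
    """Interpolated states for n_frames, inclusive of both keyframes;
    each intermediate state would be decoded into one video frame."""
    return [lerp(start, end, i / (n_frames - 1)) for i in range(n_frames)]

frames = keyframe_sequence([0.0, 1.0], [1.0, 0.0], 3)
print(frames)  # [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
```

Decode each interpolated state with the same image model and you get smooth motion between two "looks," no video model required.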

Here’s a lovely use of it in action:

Drawing-based magic with Firefly & Magnific

Man, who knew that posting the tweet below would get me absolutely dragged by AI haters (“Worst. Dad. Ever.”) who briefly turned me into the Bean Dad of AI art? I should say more about that eye-opening experience, but for now, enjoy (unlike apparently thousands of others!) this innocuous mixing of AI & kid art:

Elsewhere, here’s a cool thread showing how even simple sketches can be interpreted in the style of 3D renderings via Magnific:

AI mashups of Star Wars x classic art

Check out Min Choi’s crossbreeding of Star Wars characters with iconic paintings (click tweet below to see the thread):

Here’s a look at his process (also a thread):

Finger-lickin’ body horror!

KFC is making a characteristic AI bug into a feature:

KFC celebrates the launch of their most finger-lickin’ product yet, with even more extrAI fingers.

With help from Meta’s new AI experience, KFC is encouraging people to use the new feature and generate images with more than five fingers. This AI idea builds on KFC’s new Saucy Nuggets campaign promoting their new saucy nuggets. To reward their participation, users will unlock a saucy nuggets coupon on the restaurant’s app.

Clever, though I’m reminded of Wint’s remark that “you do not, under any circumstances, ‘gotta hand it to them.'”

Tomorrow & tomorrow & tomorrow…

I told filmmaker Paul Trillo that I’ve apparently blogged his work here more than a dozen times over the past 10 years—long before AI generation became a thing. That’s because he’s always been eager to explore the boundaries of what’s possible with any given set of tools. In “Notes To My Future Self,” he combines new & traditional methods to make a haunting, melancholy meditation:

And here he provides an illuminating 1-minute peek into the processes that helped him create all this in just over a week’s time:

GenFill: Eternal Sunshine Edition

I get that it’s all in good fun, but hoo boy, the “Ex-Terminator” feature from PhotoRoom makes me melancholy. Meet me in Montauk…

Leonardo AI generates images with transparency

I keep meaning to try out this new capability, but there are so many tools, so few hours! In any case, it promises to be an exciting breakthrough. If you take it for a spin, I’d love to hear what you think of the results.

Highlighted use cases:

  • Image and Video Compositions: Quickly generate and incorporate assets into graphic designs or videos.
  • 2D Game Assets: Create game icons and illustrations with ease.
  • Stickers and Prints: Design stickers for apps or printable designs for merchandise like t-shirts and mugs.
  • Editorial: Seamlessly integrate images into articles, creating engaging banners without background concerns.

Tutorial: Video memes with Viggle

Sure, all this stuff—including what’s now my career’s work—will likely make it semi-impossible to reason together about any shared conception of reality, thereby calling into question the viability of democracy… but on the upside, moar dank memes!

Here’s how to create a dancing character using just an image + an existing video clip:

The new plugin brings ControlNet & DALL•E 3 to Photoshop

Check out the latest work (downloadable for free here) from longtime Adobe veteran (and former VP of product at Stability AI) Christian Cantrell:

The new version of the Concept Art #photoshop plugin is here! Create your own AI-powered workflows by combining hundreds of different imaging models from @replicate — as well as DALL•E 2 and 3 — without leaving @Photoshop. This is a complete rewrite with tons of new features coming (including local inference).

Google enables AI-powered generative fill

Not content to let Adobe & ChatGPT have all the fun, Google is now making its Imagen model available to developers for image synthesis, including inserting items & expanding images:

Imagen, Google’s text-to-image model, can now create live images from text, in preview. Just imagine generating animated images such as GIFs from a simple text prompt… Imagen also gets advanced photo editing features, including inpainting and outpainting, and a digital watermarking feature powered by Google DeepMind’s SynthID.

I’m eager to learn more about the last bit re: content provenance. Adobe has talked a bunch about image watermarking, but has not (as far as I know) shipped any support.

Meanwhile Google is also challenging Runway, Pika, & others in the creation of short video clips:

Filmmaker Paul Trillo talks AI on “Hard Fork”

For 10 years or so I’ve been posting admiringly about the work of Paul Trillo (16 times so far; 17 now, good Lord), so I was excited to hear his conversation with the NYT Hard Fork crew—especially as he’s recently been pushing the limits with OpenAI’s Sora model. I think you’ll really enjoy this thoughtful, candid, and in-depth discussion about the possibilities & pitfalls of our new AI-infused creative world:

Krea adds multi-image prompt guidance

Some companies spend three months just on wringing their hands about whether to let you load a style reference image; others spend three people and go way beyond that, in realtime ¯\_(ツ)_/¯ :