Conversational editing & ControlNet arrive in Photoshop (via plugin)

Never doubt the power of a motivated person or two to do what needs to be done. Stick around to the last section of this short vid to see Stable Diffusion-powered “Find & Replace” (maskless inpainting powered by prompts) in action:

Some eye-popping AI/3D demos

Martin Evening combines Adobe Substance 3D Modeler and Krea to go from 3D sketch to burning rubber:

Jon Finger combines a whole slew of tools for sketch->AR:

Throwback: “Packed with vectors, Physics Pak really satisfies”

I came across this post (originally from 2017) just now while looking for other work from Paul Asente. Here’s hoping it can finally see the light of day in Illustrator! —J.

———–

Paul Asente is an OG of the graphics world, having been responsible for (if I recall correctly) everything from Illustrator’s vector meshes & art brushes to variable-width strokes. Now he’s back with new Adobe illustration tech to drop some millefleurs science:

PhysicsPak automatically fills a shape with copies of elements, growing, stretching, and distorting them to fill the space. It uses a physics simulation to do this and to control the amount of distortion.
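To get a feel for the physics-packing idea, here's a toy sketch (entirely hypothetical, and certainly not Adobe's actual algorithm): seed a container with small circles, then repeatedly grow them and push overlapping pairs apart, trading a little shrinkage for fit—a stand-in for PhysicsPak's controlled distortion.

```python
import math
import random

def pack_circles(width, height, n, steps=200, growth=0.3, seed=0):
    """Toy physics-style packing: scatter n circles in a rectangle, then
    alternately grow them and push overlapping pairs apart. A hypothetical
    sketch of the general idea only."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, width) for _ in range(n)]
    ys = [rng.uniform(0, height) for _ in range(n)]
    rs = [1.0] * n
    for _ in range(steps):
        # Grow every circle a little each iteration.
        for i in range(n):
            rs[i] += growth
        # Resolve pairwise overlaps by pushing circles apart.
        for i in range(n):
            for j in range(i + 1, n):
                dx, dy = xs[j] - xs[i], ys[j] - ys[i]
                d = math.hypot(dx, dy) or 1e-9
                overlap = rs[i] + rs[j] - d
                if overlap > 0:
                    push = overlap / (2 * d)
                    xs[i] -= dx * push; ys[i] -= dy * push
                    xs[j] += dx * push; ys[j] += dy * push
                    # Shrink slightly, standing in for bounded distortion.
                    rs[i] = max(rs[i] - growth / 2, 0.5)
                    rs[j] = max(rs[j] - growth / 2, 0.5)
        # Keep circles inside the container.
        for i in range(n):
            xs[i] = min(max(xs[i], rs[i]), width - rs[i])
            ys[i] = min(max(ys[i], rs[i]), height - rs[i])
    return list(zip(xs, ys, rs))

circles = pack_circles(100, 60, 12)
```

The real system packs arbitrary vector elements inside arbitrary shapes and lets them stretch non-uniformly; this sketch only grows rigid circles in a box.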


[YouTube]

Krea video arrives

Unlike Runway, Pika, Sora, and other generative video models, this approach from Krea (well-known for their realtime, multimodal AI composition tools) is simply keyframing states of image generation—which is a pretty powerful approach unto itself.
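The keyframing approach is conceptually simple: treat each keyframe as a generation state and interpolate between states frame by frame, re-rendering each in-between state. Here's a minimal sketch (all names are hypothetical, and a real system would interpolate latents or prompt embeddings rather than plain lists):

```python
def lerp(a, b, t):
    """Linearly interpolate elementwise between two equal-length vectors."""
    return [x + (y - x) * t for x, y in zip(a, b)]

def keyframe_frames(keyframes, frames_per_segment):
    """Given a list of 'generation state' vectors (stand-ins for whatever
    the model re-renders), emit interpolated per-frame states, including
    the final keyframe. A hypothetical sketch of the keyframing idea."""
    out = []
    for a, b in zip(keyframes, keyframes[1:]):
        for i in range(frames_per_segment):
            out.append(lerp(a, b, i / frames_per_segment))
    out.append(keyframes[-1])
    return out

frames = keyframe_frames([[0.0], [1.0]], 4)
```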

Here’s a lovely use of it in action:

Drawing-based magic with Firefly & Magnific

Man, who knew that posting the tweet below would get me absolutely dragged by AI haters (“Worst. Dad. Ever.”) who briefly turned me into the Bean Dad of AI art? I should say more about that eye-opening experience, but for now, enjoy (unlike apparently thousands of others!) this innocuous mixing of AI & kid art:


Elsewhere, here’s a cool thread showing how even simple sketches can be interpreted in the style of 3D renderings via Magnific:

AI mashups of Star Wars x classic art

Check out Min Choi’s crossbreeding of Star Wars characters with iconic paintings (click tweet below to see the thread):

Here’s a look at his process (also a thread):

Finger-lickin’ body horror!


KFC is making a characteristic AI bug into a feature:

KFC celebrates the launch of their most finger-lickin’ product yet, with even more extrAI fingers.

With help from Meta’s new AI experience, KFC is encouraging people to use the new feature and generate images with more than five fingers. This AI idea builds on KFC’s new Saucy Nuggets campaign promoting their new saucy nuggets. To reward their participation, users will unlock a saucy nuggets coupon on the restaurant’s app.

Clever, though I’m reminded of Wint’s remark that “you do not, under any circumstances, ‘gotta hand it to them.'”

Tomorrow & tomorrow & tomorrow…

I told filmmaker Paul Trillo that I’ve apparently blogged his work here more than a dozen times over the past 10 years—long before AI generation became a thing. That’s because he’s always been eager to explore the boundaries of what’s possible with any given set of tools. In “Notes To My Future Self,” he combines new & traditional methods to make a haunting, melancholy meditation:

And here he provides an illuminating 1-minute peek into the processes that helped him create all this in just over a week’s time:

GenFill: Eternal Sunshine Edition

I get that it’s all in good fun, but hoo boy, the “Ex-Terminator” feature from PhotoRoom makes me melancholy. Meet me in Montauk…

Tiny Glade: “Wholesome” 3D sculpting—and more?

This app looks like a delightful little creation tool that’s just meant for doodling, but I’d love to see this kind of physical creation paired with the world of generative AI rendering. I’m reminded of how “Little Big Planet” years ago made me yearn for Photoshop tools that felt like Sackboy’s particle-emitting jetpack. Someday, maybe…?

Leonardo AI generates images with transparency

I keep meaning to try out this new capability, but there are so many tools, so few hours! In any case, it promises to be an exciting breakthrough. If you take it for a spin, I’d love to hear what you think of the results.

Highlighted use cases:

  • Image and Video Compositions: Quickly generate and incorporate assets into graphic designs or videos.
  • 2D Game Assets: Create game icons and illustrations with ease.
  • Stickers and Prints: Design stickers for apps or printable designs for merchandise like t-shirts and mugs.
  • Editorial: Seamlessly integrate images into articles, creating engaging banners without background concerns.

Tutorial: Video memes with Viggle

Sure, all this stuff—including what’s now my career’s work—will likely make it semi-impossible to reason together about any shared conception of reality, thereby calling into question the viability of democracy… but on the upside, moar dank memes!

Here’s how to create a dancing character using just an image + an existing video clip:

The new Concept.art plugin brings ControlNet & DALL•E 3 to Photoshop

Check out the latest work (downloadable for free here) from longtime Adobe veteran (and former VP of product at Stability AI) Christian Cantrell:

The new version of the Concept Art #photoshop plugin is here! Create your own AI-powered workflows by combining hundreds of different imaging models from @replicate — as well as DALL•E 2 and 3 — without leaving @Photoshop. This is a complete rewrite with tons of new features coming (including local inference).

Google enables AI-powered generative fill

Not content to let Adobe & ChatGPT have all the fun, Google is now making its Imagen model available to developers for image synthesis, including inserting items & expanding images:

Imagen, Google’s text-to-image model, can now create live images from text, in preview. Just imagine generating animated images such as GIFs from a simple text prompt… Imagen also gets advanced photo editing features, including inpainting and outpainting, and a digital watermarking feature powered by Google DeepMind’s SynthID.

I’m eager to learn more about the last bit re: content provenance. Adobe has talked a bunch about image watermarking, but has not (as far as I know) shipped any support.

Meanwhile Google is also challenging Runway, Pika, & others in the creation of short video clips:

Filmmaker Paul Trillo talks AI on “Hard Fork”

For 10 years or so I’ve been posting admiringly about the work of Paul Trillo (16 times so far; 17 now, good Lord), so I was excited to hear his conversation with the NYT Hard Fork crew—especially as he’s recently been pushing the limits with OpenAI’s Sora model. I think you’ll really enjoy this thoughtful, candid, and in-depth discussion about the possibilities & pitfalls of our new AI-infused creative world:

Krea adds multi-image prompt guidance

Some companies spend three months just on wringing their hands about whether to let you load a style reference image; others spend three people and go way beyond that, in realtime ¯\_(ツ)_/¯ :

ChatGPT adds image editing

When DALL•E first dropped, it wasn’t full-image creation that captured my attention so much as inpainting, i.e. creating/removing objects in designated regions. Over the years (all two of ’em ;-)) I’ve lost track of whether DALL•E’s Web interface has remained available (’cause who’s needed it after Generative Fill?), but I’m very happy to see this sort of selective synthesis emerge in the ChatGPT-DALL•E environment:

It’s also nice to see more visual suggestions appearing there:

Lego + GenFill = Yosemite Magic

Or… something like that. Whatever the case, I had fun popping our little Lego family photo (captured this weekend at Yosemite Valley’s iconic Tunnel View viewpoint) into Photoshop, selecting part of the excessively large rock wall, and letting Generative Fill give me some more nature. Click or tap (if needed) to see the before/after animation:

Infographic magic via Firefly?

Hey, I know what you know (or quite possibly less :-)), but this demo (which for some reason includes Shaq) looks pretty cool:

From the description:

Elevate your data storytelling with #ProjectInfographIt, a game-changing solution leveraging Adobe Firefly generative AI. Simplify the infographic creation process by instantly generating design elements tailored to your key messages and data. With intuitive features for color palettes, chart types, graphics, and animations, effortlessly transform complex insights into visually stunning infographics.

Fun uses of Firefly’s Structure Reference

Man, I can’t tell you how long I wanted folks to get this tech into their hands, and I’m excited that you can finally take it for a spin. Here are some great examples (from a thread by Min Choi, which contains more) showing how people are putting it into action:

Reinterpreted kids’ drawings:

More demanding sketch-to-image:

Stylized Bitmoji:

Google Research promises better image compositing

Speaking of folks with whom I’ve somehow had the honor of working, some of my old teammates from Google have unveiled ObjectDrop. Check out this video & thread:

A bit more detail, from the project site:

Diffusion models have revolutionized image editing but often generate images that violate physical laws, particularly the effects of objects on the scene, e.g., occlusions, shadows, and reflections. By analyzing the limitations of self-supervised approaches, we propose a practical solution centered on a counterfactual dataset.

Our method involves capturing a scene before and after removing a single object, while minimizing other changes. By fine-tuning a diffusion model on this dataset, we are able to not only remove objects but also their effects on the scene. However, we find that applying this approach for photorealistic object insertion requires an impractically large dataset. To tackle this challenge, we propose bootstrap supervision; leveraging our object removal model trained on a small counterfactual dataset, we synthetically expand this dataset considerably.

Our approach significantly outperforms prior methods in photorealistic object removal and insertion, particularly at modeling the effects of objects on the scene.
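The bootstrap-supervision step can be sketched schematically: once a removal model exists, running it over ordinary photos manufactures (object-free scene, object) → (scene with object) training pairs for the harder insertion direction. Here's a toy stand-in (strings in place of images; every name here is hypothetical, not the paper's code):

```python
from dataclasses import dataclass

@dataclass
class Example:
    before: str   # scene with the object (stand-in for an image)
    after: str    # same scene with the object removed
    obj: str      # the object itself

def remove(scene, obj):
    """Stand-in for the object-removal model fine-tuned on the small
    counterfactual dataset: predicts the object-free scene."""
    return scene.replace(obj, "").replace("  ", " ").strip()

def bootstrap(scenes_with_objects):
    """Bootstrap supervision: apply the removal model to many ordinary
    photos, synthesizing (after, obj) -> before training pairs for the
    insertion direction and thereby expanding the tiny real dataset."""
    return [Example(before=scene, after=remove(scene, obj), obj=obj)
            for scene, obj in scenes_with_objects]

data = bootstrap([("cat on mat", "cat")])
```

The actual work fine-tunes a diffusion model on real before/after photo pairs; this sketch only mirrors the data-expansion logic.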

DesignEdit: AI-powered image editing from Microsoft Research

“Why would you go work at Microsoft? What do they know or care about creative imaging…?” 🙂

I’m delighted to say that my new teammates have been busy working on some promising techniques for performing a range of image edits, from erasing to swapping, zooming, and more:

Firefly adds Structure Reference

I’m delighted to see that the longstanding #1 user request for Firefly—namely the ability to upload an image to guide the structure of a generated image—has now arrived:

This nicely complements the extremely popular style-matching capability we enabled back in October. You can check out details of how it works, as well as a look at the UI (below)—plus my first creation made using the new tech ;-).

Magnific style transfer is amazing

It’s amazing to see what two people (?!) are able to do. Check out this video & the linked thread, as well as the tool itself.

I’m gonna have a ball going down this rabbit hole, especially for type:

A lovely Guinness ad from… Jason Momoa?

It’s somehow true!

I think the spirit of maximally inclusive “Irishness” has special resonance for millions of people around the world, like me, who can trace a portion (but not all) of their ancestry to the Emerald Isle. (For me it’s 75%, surname notwithstanding.) I’m reminded of Notre Dame’s “What Would You Fight For?” campaign, which features scientists, engineers, and humanitarians from around the world who conclude with “We are the Fighting Irish.” I dunno—it’s hard to explain, but it really warms my heart—as did the Irish & Chinese Railroad Workers float we saw in SF’s St. Paddy’s parade on Saturday.

Anyway, I found this bit starring & directed by Jason Momoa to be pretty charming. Enjoy:

Irish blessings

Hey gang—I hope you’ve had a safe & festive St. Patrick’s Day. To mark the occasion, I figured I’d reshare a couple of the videos I captured in the old country with my dad back in August.

Here’s Co. Clare’s wild Burren (“rocky district,” hence the choice of Chieftains/Stones banger)…

…my dad’s grandparents’ medieval town in Galway…

…and my mom’s mother’s farm in Mayo:

Amazing: Realtime AI rendering of Photoshop

I cannot tell you how deeply I hope that the Photoshop team is paying attention to developments like this…

Celebrating “Subpar Parks”

During our recent road trip to Death Valley, my 15yo son rolled his eyes at nature’s majesty:

This made me chuckle & remember “Subpar Parks,” a visual celebration of the most dismissive reviews of our natural treasures. My wife & I have long decorated our workspaces with these unintentional gems, and I think you’ll dig the Insta feed & book (now complemented by “Subpar Planet”).

Creating the creepy infrared world of Dune

I really enjoyed Dolby’s recent podcast on Greig Fraser and the Cinematography of Dune: Part Two, as well as this deep dive with Denis Villeneuve on how they modified an ARRI Alexa LF IMAX camera to create the Harkonnens’ alienating home world.

I love this idea and I tried, for Giedi Prime, the home world of Harkonnen, there’s less information in the book and it’s a world that is disconnected from nature. It’s a plastic world. So, I thought that it could be interesting if the light, the sunlight could give us some insight on their psyche. What if instead of revealing colors, the sunlight was killing them and creating a very eerie black and white world, that will give us information about how these people perceive reality, about their political system, about how that primitive brutalist culture and it was in the screenplay.

Fun little AI->3D->AR experiments with Vision Pro

I love watching people connect the emerging creative dots, right in front of our eyes:

AI Mortal Kombat

Heh—these are obviously silly but well done, and they speak to the creative importance of being specific—i.e. representing particular famous faces. I sometimes note that a joke about a singer & a football player is one thing, whereas a joke about Taylor Swift & Travis Kelce is a whole other thing, all due to it being specific. Thus, for an AI toolmaker, knowing exactly where to draw the line (e.g. disallowing celebrity likenesses) isn’t always so clear.

So… what am I actually doing at Microsoft?

It’s a great question, and I think it’s really thoughtful that the day before I joined, the company was generous enough to run a Superb Owl—er, Super Bowl—commercial, just to help me explain the mission to my parents. 😀

But seriously, this ad provides a brief peek into the world of how Copilot can already generate beautiful, interesting things based on your needs—and that’s a core part of the mission I’ve come here to tackle.

A few salient screenshots: