Category Archives: AI/ML

Magic Insert promises stylistically harmonized compositing

July 6, 2024AI/MLjnack

New tech from my old Google teammates makes some exciting claims:

Using Magic Insert we are, for the first time, able to drag-and-drop a subject from an image with an arbitrary style onto another target image with a vastly different style and achieve a style-aware and realistic insertion of the subject into the target image.

Here is a demo that you can access on the desktop version of the website. We’re excited by the options Magic Insert opens up for artistic creation, content creation and for the overall expansion of GenAI controllability. pic.twitter.com/HhbfrEfXZH

— Nataniel Ruiz (@natanielruizg) July 3, 2024

Of course, much of the challenge here—where art meets science—is around identity preservation: to what extent can & should the output resemble the input? Here it’s subject to some interpretation. In other applications one wants an exact copy of a given person or thing, but optionally transformed in just certain ways (e.g. pose & lighting).

When we launched Firefly last year, we showed off some of Adobe’s then-new ObjectStitch tech for making realistic composites. It didn’t ship while I was there due to challenges around identity preservation. As far as I know those challenges remain only partially solved, so I’ll continue holding out hope—as I have for probably 30 years now!—for future tech breakthroughs that get us all the way across that line.

Day & Night, Magnific + Luma Edition

July 5, 2024AI/ML, Relightingjnack

Check out this striking application of AI-powered relighting: a single rendering is deeply & realistically transformed via one AI tool, and the results are then animated & extended by another.

Style Transfer + Relight + Upscale + Luma (key frames) = pic.twitter.com/i7FujiZ5P1

— Javi Lopez (@javilopen) June 29, 2024

Meanwhile Krea has just jumped into the game with similar-looking relighting tech. I’m off to check it out!

announcing Scene Transfer.

create new scenes in seconds with perfect light and color consistency.

free for everyone. pic.twitter.com/JxYff4NZrP

— KREA AI (@krea_ai) July 5, 2024

“Biblically accurate gymnastics”

July 2, 2024AI/MLjnack

Honestly I’ll be kinda sad when this kind of madness gets “fixed”:

Gymnastics is the Turing test of video generation models pic.twitter.com/cOhmUJjI2m

— Deedy (@deedydas) July 2, 2024

May we live in interesting times…

Dall-E on biblically accurate gymnastics — pic.twitter.com/mKKxdS0HGv

— Deedy (@deedydas) July 2, 2024

Check out the fun “Conceptual Camera”

June 28, 2024AI/MLjnack

Heh—we’re way beyond Not Hotdog now. Alexander Reben writes,

“Silly AI Label Maker” [is] a mode of the “Conceptual Camera” developed as part of my artist residency at @openai

View this post on Instagram

A post shared by Alexander Reben (@artboffin)

Destination wedding in the Uncanny Valley

June 24, 2024AI/MLjnack

Tired: Using generative video to animate memes.
Wired: Using it to insert phantom “friends” into your wedding memories!

I threw our 15-year old wedding photo into Luma’s Dream Machine just to see what happens. I’m thoroughly amused. pic.twitter.com/TIEVK6gCuf

— Howard Pinsky (@Pinsky) June 15, 2024

Can you use Photoshop GenFill on video?

June 23, 2024AI/ML, Generative Filljnack

Well, it doesn’t create animated results, but it can work perhaps surprisingly well on regions in static shots:

Generative Fill isn’t available for moving videos just yet, but Photoshop can handle stationary clips quite well pic.twitter.com/e8GGGdomrC

— Howard Pinsky (@Pinsky) June 19, 2024

It can also be used to expand the canvas of similar shots:

“But can videos be EXPANDED in @Photoshop?!”

Sound up! pic.twitter.com/hyeoJs9Bse

— Howard Pinsky (@Pinsky) June 20, 2024

OMG: DALL•E -> LEGO

June 22, 2024AI/ML, DALL•Ejnack

Much amaze, wowo wowo:

This Lego machine can easily create a beautiful pixelart of anything you want! It is programmed in Python, and, with help of OpenAI’s DALL-E 3, it can make anything!

DesignBoom writes,

Sten of the YouTube channel Creative Mindstorms demonstrates his very own robot printer named Pixelbot 3000, made of LEGO bricks, that can produce pixel art with the help of OpenAI’s DALL-E 3 and AI images. Using a 32 x 32 plate and numerous round LEGO bricks, the robot printer automatically pins the pieces onto their designated positions until it forms the pixel art version of the image. He uses Python as his main programming language, and to create pixel art of anything, he employs AI, specifically OpenAI’s DALL-E 3.

You know you’re getting AI-pilled…

June 21, 2024AI/ML, DALL•E, Typographyjnack

…when you start seeing correctly spelled letting in the real world & thinking that it’s a DALL•E spelling fail. :-p

Glif enables SD-powered image remixing via right click

June 20, 2024AI/MLjnack

Fun! You can grab the free browser extension here.

* right-click-remix any image w/ tons of amazing AI presets: Style Transfer, Controlnets… * build & remix your own workflows with full comfyUI support * local + cloud!

besides some really great default presets using all sorts of amazing ComfyUI workflows (which you can inspect and remix on http://glif.app), the extension will now also pull your own compatible glifs into it!

MimicBrush promises prompt-free regional adjustment

June 18, 2024AI/MLjnack

The tech, a demo of which you can try here, promises “‘imitative editing,’ allowing users to edit images using reference images without the need for detailed text descriptions.”

Here it is in action:

兄弟们，这个牛P

MimicBrush：通过模仿参考图像对目标图像选定区域自动进行局部编辑

也就是你可以选择图片中的某个部分（比如一个人的衣服、背景等），然后选择一个你喜欢的参考图片。

MimicBrush会根据参考图片的样子自动修改你选择的部分，让它看起来像参考图片中的那样。… pic.twitter.com/8wVy2hPgW3

— 小互 (@imxiaohu) June 18, 2024

Runway introduces Gen-3 video

June 17, 2024AI/MLjnack

Good grief, the pace of change makes “AI vertigo” such a real thing. Just last week we were seeing “skeleton underwater” memes with Runway submerged in a rusty chair. :-p I’m especially excited to see how it handles text (which remains a struggle for text-to-image models including DALL•E):

Being able to render text has also been incredibly fun to play with pic.twitter.com/Y12ZvLm6I8

— Cristóbal Valenzuela (@c_valenzuelab) June 17, 2024

Google introduces a super fun GenType tool

June 13, 2024AI/ML, Typographyjnack

I’m really digging the simple joy in this little experiment, powered by Imagen:

1 Prompt. 26 letters. Any kind of alphabet you can imagine. #GenType empowers you to craft, refine, and download one-of-a-kind AI generated type, building from A-Z with just your imagination.

Watch our Creative Lab teammates @trudypainter and @soybean_gx demo this latest… pic.twitter.com/rr6FIoEg2f

— labs.google (@labsdotgoogle) June 12, 2024

Here’s a bit of fun enabled by “weedy seadragons on PVC pipes in a magical undersea kingdom” (click to see at full res):

Luma unveils Dream Machine video generator

June 12, 2024AI/MLjnack

I’m super eager to try this one out!

It is a highly scalable and efficient transformer model trained directly on videos making it capable of generating physically accurate, consistent and eventful shots. Dream Machine is our first step towards building a universal imagination engine and it is available to everyone now!

Unleash your inner monster maker: Monster Camp by @monster_library pic.twitter.com/HyH56WyvAr

— Luma AI (@LumaLabsAI) June 12, 2024

Adobe TOS = POS? Not so much.

June 7, 2024Adobe Firefly, AI/MLjnack

There’s been a firestorm this week about the terms of service that my old home team put forward, based (as such things have been since time immemorial) on a lot of misunderstanding & fear. Fortunately the company has been working to clarify what’s really going on.

Sorry for delay on this. Info here, including what actually changed in the TOS (not much), as well as what Adobe can / cannot do with your content. https://t.co/LZFkDXrmep

— Mike Chambers (@mesh) June 6, 2024

I did at least find this bit of parody amusing:

Huge if true. https://t.co/AFK8nyhrDg

— John Nack (@jnack) June 6, 2024

HyperDreamBooth, explained in 5 minutes

June 5, 2024AI/MLjnack

My former Google teammates have been cranking out some amazing AI personalization tech, with HyperDreamBooth far surpassing the performance of their original DreamBooth (y’know, from 2022—such a simpler ancient time!). Here they offer a short & pretty accessible overview of how it works:

Using only a single input image, HyperDreamBooth is able to personalize a text-to-image diffusion model 25x faster than DreamBooth, by using (1) a HyperNetwork to generate an initial prediction of a subset of network weights that are then (2) refined using fast finetuning for high fidelity to subject detail. Our method both conserves model integrity and style diversity while closely approximating the subject’s essence and details.

Check out The TED AI Show

June 4, 2024AI/MLjnack

“Maybe the real treasure was the friends we made along the way” is, generally, ironic shorthand for “worthless treasure”—but I’ve also found it to be true. That’s particularly the case for the time I spent at Google, where I met excellent folks like Bilawal Sidhu (a fellow PM veteran of the augmented reality group). I’m delighted that he’s now crushing it as the new host of the TED AI Show podcast.

Check out their episodes so far, including an interview with former OpenAI board member Helen Toner, who discusses the circumstances of firing Sam Altman last year before losing her board position.

Microsoft Designer enables DALL•E-infused greeting card creation

May 31, 2024AI/ML, Microsoftjnack

My new teammates continue to roll out good stuff. (I can’t yet take credit for anything.) Come take it for a spin!

New feature! Introducing Greeting Cards in Designer!

Transform a simple prompt into a beautiful card for any occasion in 4 easy steps. #MicrosoftDesigner pic.twitter.com/JYfpafuKH8

— Microsoft Designer (@MSFT365Designer) May 29, 2024

Microsoft Paint (Paint!) does generative AI

May 27, 2024AI/ML, Illustration, Microsoftjnack

Who’d a thunk it? But now everyone is getting into the game:

“Combine your ink strokes with text prompts to generate new images in nearly real time with Cocreator,” Microsoft explains. “As you iterate, so does the artwork, helping you more easily refine, edit and evolve your ideas. Powerful diffusion-based algorithms optimize for the highest quality output over minimum steps to make it feel like you are creating alongside AI.”

My new team brings DALL•E to Teams

May 22, 2024AI/ML, DALL•E, Microsoftjnack

The Designer team at Microsoft is working to enable AI-powered creation & editing experiences across a wide range of tools, and I’m delighted that my new teammates are rolling out a new set of integrations. Check out how you can now create images right inside Microsoft Teams:

GenFill comes to Lightroom!

May 21, 2024AI/ML, Generative Fill, Lightroom, Photographyjnack

When I surveyed thousands of Photoshop customers waaaaaay back in the Before Times—y’know, summer 2022—I was struck by the fact that beyond wanting to insert things into images, and far beyond wanting to create images from scratch, just about everyone wanted better ways to remove things.

Happily, that capability has now come to Lightroom. It’s a deceptively simple change that, I believe, required a lot of work to evolve Lr’s non-destructive editing pipeline. Traditionally all edits were expressed as simple parameters, and then masks got added—but as far as I know, this is the first time Lr has ventured into transforming pixels in an additive way (that is, modify one bunch, then make subsequent edits that depend on the previous edits). That’s a big deal, and a big step forward for the team.

A few more examples courtesy of Howard Pinsky:

Removing distracting objects just got that much more powerful in @Lightroom. Generative Remove has arrived! pic.twitter.com/CrZ6A3AKOF

— Howard Pinsky (@Pinsky) May 21, 2024

Podcast: Shantanu on The Verge

May 20, 2024Adobe Firefly, AI/MLjnack

Adobe’s CEO (duh :-)) sat down with Nilay Patel for an in-depth interview. Here are some of the key points, as summarized by ChatGPT:

———-

AI as a Paradigm Shift: Narayen views AI as a fundamental shift, similar to the transitions to mobile and cloud technologies. He emphasizes that AI, especially generative AI, can automate tasks, enhance creative processes, and democratize access to creative tools. This allows users who might not have traditional artistic skills to create compelling content (GIGAZINE) (Stanford Graduate School of Business).
Generative AI in Adobe Products: Adobe’s Firefly, a family of generative AI models, has been integrated into various Adobe products. Firefly enhances creative workflows by enabling users to generate images, text effects, and video content with simple text prompts. This integration aims to accelerate ideation, exploration, and production, making it easier for creators to bring their visions to life (Adobe News) (Welcome to the Adobe Blog).
Empowering Creativity: Narayen highlights that Adobe’s approach to AI is centered around augmenting human creativity rather than replacing it. Tools like Generative Fill in Photoshop and new generative AI features in Premiere Pro are designed to streamline tedious tasks, allowing creators to focus on the more creative aspects of their work. This not only improves productivity but also expands creative possibilities (The Print) (Adobe News).
Business Model and Innovation: Narayen discusses how Adobe is adapting its business model to leverage AI. By integrating AI across Creative Cloud, Document Cloud, and Experience Cloud, Adobe aims to enhance its products and deliver more value to users. This includes experimenting with new business models and monetizing AI-driven features to stay at the forefront of digital creativity (Stanford Graduate School of Business) (The Print).
Content Authenticity and Ethics: Adobe emphasizes transparency and ethical use of AI. Initiatives like Content Credentials help ensure that AI-generated content is properly attributed and distinguishable from human-created content. This approach aims to maintain trust and authenticity in digital media (Adobe News) (Welcome to the Adobe Blog).

Google’s CAT3D makes eye-popping worlds

May 17, 20243D, AI/MLjnack

I still can’t believe I was allowed in the building with these giant throbbing brains. 🙂

Create a 3D model from a single image, set of images or a text prompt in < 1 minute

This new AI paper called CAT3D shows us that it’ll keep getting easier to produce 3D models from 2D images — whether it’s a sparser real world 3D scan (a few photos instead of hundreds) or… pic.twitter.com/sOsOBsjC8Q

— Bilawal Sidhu (@bilawalsidhu) May 17, 2024

This kind of evolution should make a lot of people rethink what it means to be an image editor going forward—or even an image.

Amazing work by @RuiqiGao @holynski_ @philipphenzler @rmbrualla @_pratul_ @jon_barron @poolio

The Google crew strike again! Looks better than ReconFusion too. Hope there’s a code release.pic.twitter.com/RArpAZfJJB

— Bilawal Sidhu (@bilawalsidhu) May 17, 2024

Conversational search is coming to Google Photos

May 16, 2024AI/ML, Google Photosjnack

I’ve gotta say, this one touches a kinda painful nerve with me.

10 years ago I walked into the Google Photos team expecting normal humans to do things like say, “Show me the best pictures of my grandkids.” I immediately felt like a fool: something like 97% of daily users don’t search, preferring to simply launch the app and scroll scroll scroll forever.

A decade later, the Photos team is talking about using large language models to enable uses like the following:

With Ask Photos, you can ask for what you’re looking for in a natural way, like: “Show me the best photo from each national park I’ve visited.” Google Photos can show you what you need, saving you from all that scrolling.

For example, you can ask: “What themes have we had for Lena’s birthday parties?”. Ask Photos will understand details, like what decorations are in the background or on the birthday cake, to give you the answer.

Will anyone actually do this? It’s really hard for me to imagine, at least as it’s been framed above.

Now, what I can imagine working—in pretty great ways—is a real Assistant experience that suggests a bunch of useful tasks with which it can assist, such as gathering up photos to make birthday or holiday cards. (The latter task always falls to me every year, and I wish I could more confidently do it better.) Assistant could easily ask whose birthday it is & on what date, then scan one’s library and suggest a nice range of images as well as presentation options (cards, short animations, etc.). That kind of agent could be a joy to interact with.

Conversational editing & ControlNet arrive in Photoshop (via plugin)

May 16, 2024AI/MLjnack

Never doubt the power of a motivated person or two to do what needs to be done. Stick around to the last section of this short vid to see Stable Diffusion-powered “Find & Replace” (maskless inpainting powered by prompts) in action:

The newest version of the https://t.co/zMFye0YPsP #Photoshop plugin adds support for v2 of the @StabilityAI APIs. You now get all @replicate models, @OpenAI‘s DALL•E 3, and all of Stability’s models (including SD3). And it’s still free!https://t.co/exnJVygz4m pic.twitter.com/t03GQR0Do7

— Christian Cantrell (@cantrell) May 16, 2024

Some eye-popping AI/3D demos

May 13, 2024AI/ML, Illustrationjnack

Martin Evening combines Adobe Substance 3D modeler and Krea to go from 3D sketch to burning rubber:

I don’t know about you, but I think this is pretty incredible..

Adobe Substance 3D modeler and Krea mainly here.

Quick, gestural 3d sculpting plus ai =(I think I’ve said that a few times haven’t I?) #ai #art pic.twitter.com/w0ZhBvQ0Pr

— Martin Nebelong (@MartinNebelong) May 13, 2024

Jon Finger combines a whole slew of tools for sketch->AR:

More playing with a Procreate to Simulon pipeline. It’s still a rough pipeline but fun to play with.

Sketched in @Procreate
Style transfer in @Magnific_AI
3d conversion in @fondantai
Motion capture in @MoveAI_
Rigged in @MaxonVFX c4d
Final capture on @Simulon pic.twitter.com/GiI1wJ2IWR

— Jon Finger (@mrjonfinger) May 13, 2024

Krea video arrives

May 9, 2024AI/MLjnack

Unlike Runway, Pika, Sora, and other generative video models, this approach from Krea (well-known for their realtime, multimodal AI composition tools) is simply keyframing states of image generation—which is a pretty powerful approach unto itself.

Krea Video is here!

this is how it works

–
(sound on) pic.twitter.com/eld5RAoHdO

— KREA AI (@krea_ai) May 9, 2024

Here’s a lovely uses of it in action:

Cities, made on @krea_ai‘s new video product pic.twitter.com/gLFFnK5AkX

— Justine Moore (@venturetwins) May 9, 2024

Drawing-based magic with Firefly & Magnific

May 9, 20243D, Adobe Firefly, AI/ML, ControlNet, Illustrationjnack

Man, who knew that posting the tweet below would get me absolutely dragged by AI haters (“Worst. Dad. Ever.”) who briefly turned me into the Bean Dad of AI art? I should say more about that eye-opening experience, but for now, enjoy (unlike apparently thousands of others!) this innocuous mixing of AI & kid art:

Having fun reinterpreting my son’s old drawings via #AdobeFirefly Structure Reference: pic.twitter.com/ALLBqdyPEc

— John Nack (@jnack) April 16, 2024

Elsewhere, here’s a cool thread showing how even simple sketches can be interpreted in the style of 3D renderings via Magnific:

THIS IS NOT 3D

Did you know you can use AI as kind of pseudo 3d renderer?

In the future, every pixel in a video game will not be RENDERED but GENERATED in real time. But people are already creating insane “AI renders” today.

Here are 18 mind blowing examples + a tutorial: pic.twitter.com/MujuYpJcO3

— Javi Lopez (@javilopen) April 16, 2024

AI mashups of Star Wars x classic art

May 8, 2024AI/ML, Illustrationjnack

Check out Min Choi’s crossbreeding of Star Wars characters with iconic paintings (click tweet below to see the thread):

3. “The Scream” by Edvard Munch pic.twitter.com/a3FchLr4B4

— Min Choi (@minchoi) May 4, 2024

Here’s a look at his process (also a thread):

Yesterday, I showed wild AI results of famous art in Star Wars style using Midjourney v6.

Here are the comparison to original art.

How did AI do?

1. “The Last Supper” by Leonardo da Vinci pic.twitter.com/7Y4r08TQPJ

— Min Choi (@minchoi) May 5, 2024

Finger-lickin’ body horror!

May 6, 2024AI/MLjnack

KFC is making a characteristic AI bug into a feature:

KFC celebrates the launch of their most finger-lickin’ product yet, with even more extrAI fingers.

With help from Meta’s new AI experience, KFC is encouraging people to use the new feature and generate images with more than five fingers. This AI idea builds on KFC’s new Saucy Nuggets campaign promoting their new saucy nuggets. To reward their participation, users will unlock a saucy nuggets coupon on the restaurant’s app.

Clever, though I’m reminded of Wint’s remark that “you do not, under any circumstances, ‘gotta hand it to them.'”

Tomorrow & tomorrow & tomorrow…

May 1, 2024AI/ML, Generative Filljnack

I told filmmaker Paul Trillo that I’ve apparently blogged his work here more than a dozen times over the past 10 years—long before AI generation became a thing. That’s because he’s always been eager to explore the boundaries of what’s possible with any given set of tools. In “Notes To My Future Self,” he combines new & traditional methods to make a haunting, melancholy meditation:

And here he provides an illuminating 1-minute peek into the processes that helped him create all this in just over a week’s time:

Can AI create better VFX? Lots of VFX don’t look great because they don’t know what they’re lighting to on set. Using a variety of AI tools, we can now move fluidly between pre and post. BGs made with stable diffusion, Photoshop Gen Fill, Magnific, Krea and Topaz and Runway Gen-2 pic.twitter.com/DLyA60XaUB

— Paul Trillo (@paultrillo) April 26, 2024

GenFill: Eternal Sunshine Edition

April 30, 2024AI/ML, Generative Filljnack

I get that it’s all in good fun, but hoo boy, the “Ex-Terminator” feature from PhotoRoom makes me melancholy. Meet me in Montauk…

We are excited to launch our latest AI partnership campaign with @okcupid ! This one was so much fun to build.

More than half of singles want to erase their exes from their photos, so @photoroom_app and @okcupid teamed up to help singles ditch the ex and keep the selfies… pic.twitter.com/n0WRN7zICH

— Matthieu Rouif (@matthieurouif) April 29, 2024

Making a hyperlapse using GenFill on video

April 26, 2024Adobe Firefly, AI/ML, Generative Filljnack

Super quick & fun—and hey, temporal stability is overrated anyway! 🙂

View this post on Instagram

A post shared by Matthew Vandeputte (@matjoez)

Distraction removal headed to Photoshop, Lightroom

April 25, 2024AI/ML, Photographyjnack

Adobe friends like Eli Shechtman have been publishing research for several years, and Creative Bloq reports that the functionality is due to make its way to the flagship imaging apps in the near future. Check out their post for details.

Automatic selection:

Cleaned-up results:

Object removal in Lightroom:

AI + Ai: Illustrator mockup magic

April 24, 20243D, Adobe Firefly, AI/MLjnack

Check out this nice little tutorial from Howard Pinsky:

Firefly v3 comes to Photoshop

April 23, 2024Adobe Firefly, AI/MLjnack

You love to see it—available now via the beta (which you can download via that little “CC” icon you generally ignore in your menubar :-)):

Just released! Don’t just edit images in #Photoshop. Now Ps can make them with #adobefirefly integrated! #adobexcommunity pic.twitter.com/VL33b58QY0

— Paul Trani (@paultrani) April 23, 2024

Also, props to Paul on his HELVETICA shirt, which reminds me of my old METADATA beauty.

Meta.ai introduces realtime image generation

April 23, 2024AI/ML, Illustrationjnack

You can try this now at Meta.ai. I’m very curious to see how much people favor speed vs. output quality.

Meta’s new AI does crazy realtime image generation…feels like improv or a tool for karaoke pic.twitter.com/uzzJ3g4lxD

— Scott Stein (@jetscott) April 19, 2024

Leonardo AI generates images with transparency

April 22, 2024AI/MLjnack

I keep meaning to try out this new capability, but there are so many tools, so few hours! In any case, it promises to be an exciting breakthrough. If you take it for a spin, I’d love to hear what you think of the results.

We’re thrilled to unveil #Transparency— another new https://t.co/LlErGl3jwe feature that enables true native transparent PNG generation!

https://t.co/Qh8wsnpKIg

Transparency is more than background removal—this is native image diffusion with clean edges.

Read on! pic.twitter.com/JCEGas8H8z

— Leonardo.Ai (@LeonardoAi_) March 19, 2024

Highlighted use cases:

Image and Video Compositions: Quickly generate and incorporate assets into graphic designs or videos.
2D Game Assets: Create game icons and illustrations with ease.
Stickers and Prints: Design stickers for apps or printable designs for merchandise like t-shirts and mugs.
Editorial: Seamlessly integrate images into articles, creating engaging banners without background concerns.

Tutorial: Video memes with Viggle

April 16, 2024AI/MLjnack

Sure, all this stuff—including what’s now my career’s work—will likely make it semi-impossible to reason together about any shared conception of reality, thereby calling into question the viability of democracy… but on the upside, moar dank memes!

Here’s how to create a dancing character using just an image + an existing video clip:

Viggle is the new hottest AI Creative Tool That is forever changing Memes and the future of AI Video.@aiwarper created a meme with the joker and Lil Yachty that caused a hilarious explosion.

In this video I’ll show you:

1. What Viggle is and how it works
2. Why its more… pic.twitter.com/dl2XSyQ0oT

— Riley Brown (@rileybrown_ai) April 16, 2024

Adobe previews GenFill for video, Sora/Pika integration, & more

April 15, 2024Adobe Firefly, AI/ML, Generative Filljnack

Removing objects will be huge, and Generative Extend—which can add a couple of seconds to clips to ease transitions—seems handy. Check out what’s in the works:

The new Concept.art plugin brings ControlNet & DALL•E 3 to Photoshop

April 14, 2024AI/ML, ControlNet, DALL•Ejnack

Check out the latest work (downloadable for free here) from longtime Adobe veteran (and former VP of product at Stability AI) Christian Cantrell:

The new version of the Concept Art #photoshop plugin is here! Create your own AI-powered workflows by combining hundreds of different imaging models from @replicate — as well as DALL•E 2 and 3 — without leaving @Photoshop. This is a complete rewrite with tons of new features coming (including local inference).

Google enables AI-powered generative fill

April 10, 2024AI/MLjnack

Not content to let Adobe & ChatGPT have all the fun, Google is now making its Imagen available to developers for image synthesis, including inserting items & expanding images:

We’re also adding advanced photo editing features, including inpainting and outpainting.

These features make it easy to remove unwanted elements, include new ones, and expand the borders of images to create a wider field of view. → https://t.co/Cbz4Pajkch #GoogleCloudNext pic.twitter.com/SRsEQjHWD5

— Google DeepMind (@GoogleDeepMind) April 9, 2024

Imagen, Google’s text-to-image mode, can now create live images from text, in preview. Just imagine generating animated images such as GIFs from a simple text prompt… Imagen also gets advanced photo editing features, including inpainting and outpainting, and a digital watermarking feature powered by Google DeepMind’s SynthID.

I’m eager to learn more about the last bit re: content provenance. Adobe has talked a bunch about image watermarking, but has not (as far as I know) shipped any support.

Meanwhile Google is also challenging Runway, Pika, & others in the creation of short video clips:

Our generative technology Imagen 2 can now create short, 4-second live images from a single prompt.

It’s available to use in @GoogleCloud’s #VertexAI platform. → https://t.co/CLMN3wNmeP #GoogleCloudNext pic.twitter.com/B4RocdDXrk

— Google DeepMind (@GoogleDeepMind) April 9, 2024

Filmmaker Paul Trillo talks AI on “Hard Fork”

April 7, 2024AI/MLjnack

For 10 years or so I’ve been posting admiringly about the work of Paul Trillo (16 times so far; 17 now, good Lord), so I was excited to hear his conversation with the NYT Hard Fork crew—especially as he’s recently been pushing the limits with OpenAI’s Sora model. I think you’ll really enjoy this thoughtful, candid, and in-depth discussion about the possibilities & pitfalls of our new AI-infused creative world:

Krea adds multi-image prompt guidance

April 5, 2024AI/MLjnack

Some companies spend three months just on wringing their hands about whether to let you load a style reference image; others spend three people and go way beyond that, in realtime ¯\_(ツ)_/¯ :

These guys are doing such a good job creating intuitive visual interfaces for prompting

This is the new real-time image blending interface from @krea_ai

Such a smart designpic.twitter.com/qgzC86DNm7

— Nick St. Pierre (@nickfloats) April 4, 2024

ChatGPT adds image editing

April 4, 2024AI/ML, DALL•E, Generative Filljnack

When DALL•E first dropped, it wasn’t full-image creation that captured my attention so much as inpainting, i.e. creating/removing objects in designated regions. Over the years (all two of ’em ;-)) I’ve lost track of whether DALL•E’s Web interface has remained available (’cause who’s needed it after Generative Fill?), but I’m very happy to see this sort of selective synthesis emerge in the ChatGPT-DALL•E environment:

You can now edit DALL·E images in ChatGPT across web, iOS, and Android. pic.twitter.com/AJvHh5ftKB

— OpenAI (@OpenAI) April 3, 2024

It’s also nice to see more visual suggestions appearing there:

You can also get inspiration on styles when creating images in the DALL·E GPT. pic.twitter.com/mRrkwJKHyq

— OpenAI (@OpenAI) April 3, 2024

Lego + GenFill = Yosemite Magic

April 3, 2024Adobe Firefly, Generative Fill, Photographyjnack

Or… something like that. Whatever the case, I had fun popping our little Lego family photo (captured this weekend at Yosemite Valley’s iconic Tunnel View viewpoint) into Photoshop, selecting part of the excessively large rock wall, and letting Generative Fill give me some more nature. Click or tap (if needed) to see the before/after animation:

Generative Fill, remaining awesome for family photos. From Yosemite yesterday: pic.twitter.com/GtRP0UCaV6

— John Nack (@jnack) April 1, 2024

Infographic magic via Firefly?

April 2, 2024Adobe Firefly, AI/ML, Infographicsjnack

Hey, I know what you know (or quite possibly less :-)), but this demo (which for some reason includes Shaq) looks pretty cool:

From the description:

Elevate your data storytelling with #ProjectInfographIt, a game-changing solution leveraging Adobe Firefly generative AI. Simplify the infographic creation process by instantly generating design elements tailored to your key messages and data. With intuitive features for color palettes, chart types, graphics, and animations, effortlessly transform complex insights into visually stunning infographics.

Fun uses of Firefly’s Structure Reference

April 1, 2024Adobe Firefly, AI/MLjnack

Man, I can’t tell you how long I wanted folks to get this tech into their hands, and I’m excited that you can finally take it for a spin. Here are some great examples (from a thread by Min Choi, which contains more) showing how people are putting it into action:

Reinterpreted kids’ drawings:

Adobe Firefly structure reference:

I created these images using my kid’s art as reference + text prompts like these:

– red aeroplane toy made with felt, appliqué stitch, clouds, blue background
– broken ship, flowing paint from a palette of yellow and green colors

Kept the… https://t.co/TMofxYx8E8 pic.twitter.com/nZpG3MnnZg

— Anu Aakash (@anukaakash) March 30, 2024

More demanding sketch-to-image:

Honestly, #AdobeFirefly ‘s new structure reference feature is super useful for going from a sketch to a realistic rendering. pic.twitter.com/v0HCCsTmZY

— Pierrick Chevallier | IA (@CharaspowerAI) March 29, 2024

Stylized Bitmoji:

Teachers!
You can also customize your @Bitmoji with @Adobe Firefly! #ai #aiforeducation #AdobeFirefly pic.twitter.com/WGV6oNvrwS

— Andrew Davies, M.Ed. (@EduTechWizard) March 30, 2024

Google Research promises better image compositing

March 31, 2024AI/ML, Photographyjnack

Speaking of folks with whom I’ve somehow had the honor of working, some of my old teammates from Google have unveiled ObjectDrop. Check out this video & thread:

Google presents ObjectDrop

Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Diffusion models have revolutionized image editing but often generate images that violate physical laws, particularly the effects of objects on the scene, e.g., pic.twitter.com/j7TMadRhxo

— AK (@_akhaliq) March 28, 2024

A bit more detail, from the project site:

Diffusion models have revolutionized image editing but often generate images that violate physical laws, particularly the effects of objects on the scene, e.g., occlusions, shadows, and reflections. By analyzing the limitations of self-supervised approaches, we propose a practical solution centered on a counterfactual dataset.

Our method involves capturing a scene before and after removing a single object, while minimizing other changes. By fine-tuning a diffusion model on this dataset, we are able to not only remove objects but also their effects on the scene. However, we find that applying this approach for photorealistic object insertion requires an impractically large dataset. To tackle this challenge, we propose bootstrap supervision; leveraging our object removal model trained on a small counterfactual dataset, we synthetically expand this dataset considerably.

Our approach significantly outperforms prior methods in photorealistic object removal and insertion, particularly at modeling the effects of objects on the scene.

DesignEdit: AI-powered image editing from Microsoft Research

March 29, 2024AI/MLjnack

“Why would you go work at Microsoft? What do they know or care about creative imaging…?” 🙂

I’m delighted to say that my new teammates have been busy working on some promising techniques for performing a range of image edits, from erasing to swapping, zooming, and more:

Microsoft presents DesignEdit!

It’s a image editing method that can remove objects, edit typography, swap, relocate, resize, add and flip multiple objects, pan and zoom images, remove decorations from images, and edit posters.https://t.co/1DGNiNAFw1 pic.twitter.com/2N5n6MNkqf

— Dreaming Tulpa (@dreamingtulpa) March 28, 2024

Firefly adds Structure Reference

March 26, 2024Adobe Firefly, AI/MLjnack

I’m delighted to see that the longstanding #1 user request for Firefly—namely the ability to upload an image to guide the structure of a generated image—has now arrived:

Good morning!
I’m excited to share with you a new tool on Adobe Firefly website called Structure Reference. I spent whole weekend creating art with it and find this new feature the most inspiring for my art.

You can draw a form (or use a photo or your sketch) and reach… pic.twitter.com/9icx1iJoVJ

— Kris Kashtanova (@icreatelife) March 26, 2024

This nicely complements the extremely popular style-matching capability we enabled back in October. You can check out details of how it works, as well a look at the UI (below)—plus my first creation made using the new tech ;-).