Category Archives: AI/ML

Roblox lets players create 3D objects simply by describing them

March 18, 20253D, AI/MLjnack

I guarantee you, the TTP for the feature is less than the length of this 45-second promo. :-p

Putting Gemini editing to the test

March 17, 2025AI/MLjnack

Here’s a little holiday-appropriate experiment featuring a shot of my dad & me (in Lego form, naturally) at my grandmother’s family farm in County Mayo. Sláinte!

A little St. Paddy’s fun testing Google @GeminiApp‘s conversational editing abilities on Lego pics from Ireland: pic.twitter.com/LPCD0D3igi

— John Nack (@jnack) March 17, 2025

Wardrobe upgrades courtesy of Gemini

March 16, 2025AI/ML, Try-onjnack

Speaking of reskinning imagery (see last several posts), check out what’s now possible via Google’s Gemini model, below. I’ve been putting it to the test & will share results shortly.

Alright, Google really killed it here.

You can easily swap your garment just by uploading the pieces to Gemini Flash 2.0 and telling it what to do. pic.twitter.com/pNPBkIdRqy

— Halim Alrasihi (@HalimAlrasihi) March 14, 2025

Photoshop gets new background-removal skills

March 14, 2025AI/MLjnack

This enhanced capability, which apparently now uses a cloud-hosted model, looks really promising. See before & after:

The Photoshop Beta also has some pretty wild improvements to Remove Background pic.twitter.com/yu7u8ISbMW

— Howard Pinsky (@Pinsky) March 13, 2025

Another example:

https://t.co/VuXQVHMkN1 pic.twitter.com/mcy0nQ3b6m

— Howard Pinsky (@Pinsky) March 14, 2025

Runway reskins rock

March 13, 2025AI/ML, Illustrationjnack

Another day, another set of amazing reinterpretations of reality. Take it away Nathan…

3 tests of Runway’s first frame feature. It’s very impressive and temporally coherent. Input is a video and stylized first frame. ✨

First example here is a city aerial to: circuit board, frost, fire, Swiss cheese, Tokyo. #aivideo #VFX pic.twitter.com/Y7HST74uBy

— Nathan Shipley (@CitizenPlain) March 6, 2025

…and Bilawal:

Playing guitar, reskinned with Runway’s restyle feature — pretty epic for digital character replacement.

I’m genuinely impressed by how well the fretting & strumming hands hold up.

Not perfect yet, but pulling this off would basically be impossible with Viggle or even Wonder… pic.twitter.com/UJBS9c8U1a

— Bilawal Sidhu (@bilawalsidhu) March 7, 2025

PikaSwaps nails virtual try-on

March 12, 2025AI/ML, AR/VR, Try-onjnack

This temporally coherent inpainting is utterly bonkers. It’s just the latest—and perhaps the most promising—in myriad virtual try-on techniques I’ve seen & written about over the years.

This is effortless fashion

Made with @pika_labs Pikaswaps feature pic.twitter.com/BE9LDP8eAR

— Jessie_Ma (@ytjessie_) March 12, 2025

Mystic structure reference: Dracarys!

March 11, 20253D, AI/ML, Illustrationjnack

I love seeing the Magnific team’s continued rapid march in delivering identity-preserving reskinning

IT’S FINALLY HERE!

Mystic Structure Reference!

Generate any image controlling structural integrity Infinite use cases! Films, 3D, video games, art, interiors, architecture… From cartoon to real, the opposite, or ANYTHING in between!

Details & 12 tutorials pic.twitter.com/brw4Dx39gz

— Javi Lopez (@javilopen) February 27, 2025

This example makes me wish my boys were, just for a moment, 10 years younger and still up for this kind of father/son play. 🙂

Storyboarding? No clue! But with some toy blocks, my daughter’s wild imagination, and a little help from Magnific Structure Reference, we built a castle attacked by dragons. Her idea coming to life powered up with AI magic.
Just a normal Saturday Morning.
Behold, my daughter’s… pic.twitter.com/52tDZokmIT

— Jesus Plaza (@JesusPlazaX) March 8, 2025

Behind the scenes: AI-augmented animation

March 10, 2025AI/ML, Illustrationjnack

“Rather than removing them from the process, it actually allowed [the artists] to do a lot more—so a small team can dream a lot bigger.”

Paul Trillo’s been killing it for years (see innumerable previous posts), and now he’s given a peek into how his team has been pushing 2D & 3D forward with the help of custom-trained generative AI:”

Traditional 2d animation meets the bleeding edge of experimental techniques. This is a behind the scenes look at how we at Asteria brought the old and the new together in this throwback animation “A Love Letter to Los Angeles” and collaboration with music artist Cuco and visual… pic.twitter.com/3eWSdgckXn

— Paul Trillo (@paultrillo) March 7, 2025

“Fast food, but make it Lego”

March 7, 2025AI/MLjnack

Here’s a fun use of Flux->Minimax (see workflow details):

Fast food, but make it Lego.
byu/Sad-Ambassador-9040 incomfyui

Charmingly terrible AI-made infographics

March 6, 2025AI/ML, Illustration, Infographicsjnack

A passing YouTube vid made me wonder about the relative strengths of World War II-era bombers, and ChatGPT quickly obliged by making me a great little summary, including a useful table. I figured, however, that it would totally fail at making me a useful infographic from the data—and that it did!

Just for the lulz, I then ran the prompt (“An infographic comparing the Avro Lancaster, Boeing B-17, and Consolidated B-24 Liberator bombers”) through a variety of apps (Ideogram, Flux, Midjourney, and even ol’ Firefly), creating a rogue’s gallery of gibberish & Franken-planes. Check ’em out.

Currently amusing myself with how charmingly bad every AI image generator is at making infographics—each uniquely bizarre! pic.twitter.com/U3cs8ySoVa

— John Nack (@jnack) March 6, 2025

Surrealism blooms through Pika

March 5, 2025AI/MLjnack

Check out this delightful demo:

By combining @pika_labs Pikaframes and @freepik, I now have the magical ability to jump through space and time and in this example, music becomes a transformative element teleporting this woman to a new location. This is how it’s done. 1/6

The videos below are fully narrated… pic.twitter.com/06WtgI50ZV

— Travis Davids (@MrDavids1) March 3, 2025

Individual steps, as I understand them:

Generate image (in this example, using Google Imagen).
Apply background segmentation.
Synthesize a new background, and run what I think is a fine-tuned version of IC-Light (using Stable Diffusion) to relight the entire image, harmonizing foreground/background. Note that identity preservation (face shape, hair color, dress pattern, etc.) is very good but not perfect; see changes in the woman’s hair color, expression, and dress pattern.
Put the original & modified images into Pika, then describe the desired transformation (smooth transition, flowers growing, clouds moving, etc.).

VSCO introduces Canvas

March 4, 2025AI/ML, Photographyjnack

Another day, another ~infinite canvas for ideation & synthesis. This time, somewhat to my surprise, the surface comes from VSCO—a company whose users I’d have expected to be precious & doctrinaire in their opposition to any kind of AI-powered image generation. But who knows, “you can just do things.” ¯\_(ツ)_/¯

View this post on Instagram

A post shared by VSCO | Photo & Video Editor (@vsco)

NeRFtastic BAFTAs

February 28, 20253D, AI/MLjnack

The British Academy Film Awards have jumped into a whole new dimension to commemorate the winners of this year’s awards:

The capturing work was led by Harry Nelder and Amity Studio. Nelder used his 16-camera rig to capture the recent winners. The reconstruction software was a combination of a cloud-based platform created by Nelder, which is expected to be released later this year, along with Postshot. Nelder further utilized the Radiance Field method known as Gaussian Splatting for the reconstruction. A compilation video of all the captures, recently posted by BAFTA, was edited by Amity Studio

[Via Dan Goldman]

Lego together creative AI blocks in Flora

February 26, 2025AI/MLjnack

Looks promising:

Introducing FLORA, Your Intelligent Canvas.

Every creative AI tool, thoughtfully connected. pic.twitter.com/SUHrHtrQmn

— weber (@weberwongwong) February 26, 2025

Their pitch:

Create workflows, not just outputs. Connect Blocks to shape, refine, and scale your creative process.
Collaborate in real time. Work like you would in Figma, but for AI-powered media creation.
Discover & clone workflows. Learn from top creatives, build on proven systems and share generative workflows inside FLORA’s Community.

Perhaps image-to-3D was a mistake…

February 22, 20253D, AI/MLjnack

Behold the majesty (? :-)) of CapCut’s new “Microwave” filter (whose name makes more sense if you listen with sound on):

https://youtube.com/shorts/bshQXczbZdw?si=aFwvtgs-fKf2wl8x

As I asked Bilawal, who posted the compilation, “What is this, and how can I know less about it?”

Impressive product insertion in Ideogram

February 21, 2025AI/ML, Ideogramjnack

Slightly funky UI (I’d never have figured this out on my own), but amazing identity preservation! (Why can’t I do anything like this in Photoshop…?)

EditIQ edits single long shots into multiples virtual shots

February 19, 2025AI/MLjnack

Check it out (probably easier to grok by watching vs. reading a description):

From the static camera feed, EditIQ initially generates multiple virtual feeds, emulating a team of cameramen. These virtual camera shots termed rushes are subsequently assembled using an automated editing algorithm, whose objective is to present the viewer with the most vivid scene content.

Controlling video generation with simple props

February 18, 2025AI/MLjnack

Tired: Random “slot machine”-style video generation
Inspired: Placing & moving simple guidance objects to control results:
Check out VideoNoiseWarp:

Every now and then something comes along that feels like it could change everything… NoiseWarp + CogVideoX lets you animate live action scenes with rough mockups!

ComfyUI nodes by @Kijaidesign https://t.co/AziU049jbg pic.twitter.com/eZsXJ38lxv

— Ingi Erlingsson (@ingi_erlingsson) January 21, 2025

Analog meets AI in the papercraft world of Karen X Cheng

February 14, 2025Adobe Firefly, AI/ML, Generative Filljnack

Check out this fun mixed-media romp, commissioned by Adobe:

This video combines AI-generated elements (balloon, kite, surfboard, and backgrounds) with my own real-world practical effects and stop motion.

I made this for @Adobe Firefly and I’ll share tutorial tomorrow!

Thanks @Adobe for sponsoring my art #AdobePartner #AdobeFirefly pic.twitter.com/dPLrzCchH9

— Karen X. Cheng (@karenxcheng) February 12, 2025

And here’s a look behind the scenes:

Here’s the tutorial! This video combines AI-generated elements (balloon, kite, surfboard, and backgrounds) with my own real-world practical effects and stop motion.

I made this for #AdobeFirefly
Thanks @Adobe for sponsoring my art #AdobePartner pic.twitter.com/yUZtMlwk2r

— Karen X. Cheng (@karenxcheng) February 13, 2025

YouTube + Veo

February 13, 2025AI/MLjnack

The YouTube mobile app can now tap into Google’s Veo model to generate video, as shown below. Hmm—this feels pretty niche at the moment, but it may suggest the shape of things to come (ubiquitous media synthesis, anywhere & anytime it’s wanted).

View this post on Instagram

A post shared by Bilawal Sidhu (@bilawal.ai)

A cool Firefly image->video flow

February 12, 2025Adobe Firefly, AI/MLjnack

For the longest time, Firefly users’ #1 request was to use images to guide composition of new images. Now that Firefly Video has arrived, you can use a reference image to guide the creation of video. Here’s a slick little demo from Paul Trani:

Firefly Video (beta) is now available to everyone! Give it a whirl and share your results!https://t.co/sOeN1pwXcV #adobefirefly #communityxadobe pic.twitter.com/ZOvkqKSq9T

— Paul Trani (@paultrani) February 12, 2025

Google Photos will flag AI-manipulated images

February 10, 2025AI/ML, Google Photosjnack

These changes, reported by Forbes, sound like reasonable steps in the right direction:

Starting now, Google will be adding invisible watermarks to images that have been edited on a Pixel using Magic Editor’s Reimagine feature that lets users change any element in an image by issuing text prompts.

The new information will show up in the AI Info section that appears when swiping up on an image in Google Photos.

The feature should make it easier for users to distinguish real photos from AI-powered manipulations, which will be especially useful as Reimagined photos continue to become more realistic.

DeepSeek meets Flux in Krea Chat

February 9, 2025AI/MLjnack

Conversational creation & iteration is such a promising pattern, as shown through people making ChatGPT take images to greater & greater extremes:

pic.twitter.com/7SQwBMyrlv

— No Context Shitposting (@NoContextCrap) February 8, 2025

But how do we go from ironic laughs to actual usefulness? Krea is taking a swing by integrating (I think) the Flux imaging model with the DeepSeek LLM:

Krea Chat is here.

a brand new way of creating images and videos with AI.

open beta out now. pic.twitter.com/dbHX31l92A

— KREA AI (@krea_ai) February 7, 2025

It doesn’t yet offer the kind of localized refinements people want (e.g. “show me a dog on the beach,” then “put a hat on the dog” and don’t change anything outside the hat area). Even so, it’s great to be able to create an image, add a photo reference to refine it, and then create a video. Here’s my cute, if not exactly accurate, first attempt. 🙂

A mind-blowing Gemini + Illustrator demo

February 6, 2025AI/MLjnack

Wow—check out this genuinely amazing demo from my old friend (and former Illustrator PM) Mordy:

In this video, I show how you can use Gemini in the free Google AI Studio as your own personal tutor to help you get your work done. After you watch me using it to learn how to take a sketch I made on paper to recreating a logo in Illustrator, I promise you’ll be running to do the same.

MatAnyone promises incredible video segmentation

February 3, 2025AI/MLjnack

What the what?

this looks insane, MatAnyone

Stable Video Matting with Consistent Memory Propagation pic.twitter.com/tt1k23raYv

— AK (@_akhaliq) February 3, 2025

Per the paper,

We propose MatAnyone, a robust framework tailored for target-assigned video matting. Specifically, building on a memory-based paradigm, we introduce a consistent memory propagation module via region-adaptive memory fusion, which adaptively integrates memory from the previous frame. This ensures semantic stability in core regions while preserving fine-grained details along object boundaries.

Premiere Pro now lets you find video clips by describing them

February 2, 2025AI/MLjnack

I love it: nothing too fancy, nothing controversial, just a solid productivity boost:

Users can enter search terms like “a person skating with a lens flare” to find corresponding clips within their media library. Adobe says the media intelligence AI can automatically recognize “objects, locations, camera angles, and more,” alongside spoken words — providing there’s a transcript attached to the video. The feature doesn’t detect audio or identify specific people, but it can scrub through any metadata attached to video files, which allows it to fetch clips based on shoot dates, locations, and camera types. The media analysis runs on-device, so doesn’t require an internet connection, and Adobe reiterates that users’ video content isn’t used to train any AI models.

Goodbye, endless scrolling. Hello, AI-powered search panel. With the all-new Media Intelligence in #PremierePro (beta), the content of your clips is automatically recognized, including objects, locations, camera angles & more. Just input your search to find exactly what you need. pic.twitter.com/cOYXDKKaFI

— Adobe Video & Motion (@AdobeVideo) January 22, 2025

Gemini turns photos into interactive simulations (!)

January 28, 2025AI/MLjnack

Check out this wild proof of concept from Trudy Painter at Google, and click into the thread for details.

Photos → Creative Code using Gemini

I built an experiment that turns photos into interactive @p5xjs sketches using Gemini 2.0 Flash.

Unlike UI generators, this creates code that mimics the *behavior* of what’s in the image – like smoke swirling or ripples spreading.

Check… pic.twitter.com/BbhYqUmZxA

— Trudy Painter (@trudypainter) January 23, 2025

Quick fun with Krea, Flux, custom training, and 3D

January 27, 20253D, AI/MLjnack

Putting the proverbial chocolate in the peanut butter, those fast-moving kids at Krea have combined custom model training with 3D-guided image generation. Generation is amazingly fast, and the results are some combo of delightful & grotesque (aka “…The JNack Story”). Check it out:

God help you, though, if you import your photo & convert it to 3D for use with the realtime mode. (Who knew I was Cletus the Slack-Jawed Yokel?) pic.twitter.com/nuesUOZ1Db

— John Nack (@jnack) January 27, 2025

FreePik introduces Flux-powered editing

January 26, 2025AI/MLjnack

Check out this impressive demo that includes face swapping, selective editing, and conversion to video:

wow..

Freepik just launched AI Suite, it lets you edit images with FLUX PRO and..

you can even swap face to your own characters and animate them with top AI video generators.

step by step tutorial: pic.twitter.com/vLWHQAfHM1

— el.cine (@EHuanglu) January 25, 2025

“The Heist,” conjured entirely in Google Veo

January 22, 2025AI/MLjnack

Here’s another interesting snapshot of progress in our collective speedrun towards generative storytelling. It’s easy to pick on the shortcomings, but can you imagine what you’d say upon seeing this in, say, the olden times of 2023?

The creator writes,

Introducing The Heist – Directed by Jason Zada. Every shot of this film was done via text-to video with Google Veo 2. It took thousands of generations to get the final film, but I am absolutely blown away by the quality, the consistency, and adherence to the original prompt. When I described “gritty NYC in the 80s” it delivered in spades – CONSISTENTLY. While this is still not perfect, it is, hands down, the best video generation model out there, by a long shot. Additionally, it’s important to add that no VFX, no clean up, no color correction has been added. Everything is straight out of Veo 2. Google DeepMind

SynthLight promises state-of-the-art relighting

January 21, 2025AI/ML, Photography, Relightingjnack

Here’s a nice write-up covering this paper. It’ll be interesting to dig into the details of how it compares to previous work (see category). [Update: The work comes in part from Adobe Research—I knew those names looked familiar :-)—so here’s hoping we see it in Photoshop & other tools soon.]

this is wild..

this new AI relighting tool can detect the light source in the 3D environment of your image and relight your character, the shadows look so realistic..

it’s especially helpful for AI images

10 examples: pic.twitter.com/sxNR39YTeT

— el.cine (@EHuanglu) January 18, 2025

Krea introduces realtime 3D-guided image generation

January 16, 20253D, AI/MLjnack

Part 9,201 of me never getting over the fact we were working on stuff like this 2 years ago at Adobe (modulo the realtime aspect, which is rad) & couldn’t manage to ship it. It’ll be interesting to see whether the Krea guys (and/or others) pair this kind of interactive-quality rendering with a really high-quality pass, as NVIDIA demonstrated last week using Flux.

3D arrived to Krea.

this new feature lets you turn images into 3D objects and use them in our Real-time tool.

free for everyone. pic.twitter.com/b8gQMhUCN9

— KREA AI (@krea_ai) January 16, 2025

Creating a 3D scene from text

January 13, 20253D, AI/MLjnack

…featuring a dose of Microsoft Trellis!

Here’s how to create this cool 3D scene from a single image!

Midjourney (isometric image generation)
Trellis (Image to 3D Gaussian Splat)
Browser Lab (3D Editor Splat Import) pic.twitter.com/O1vJdaQRbc

— IAN CURTIS (@XRarchitect) January 9, 2025

More about Trellis:

Powered by advanced AI, TRELLIS enables users to create high-quality, customizable 3D objects effortlessly using simple text or image prompts. This innovation promises to improve 3D design workflows, making it accessible to professionals and beginner alike. Here are some examples:

Adobe demos generation of video with transparency

January 8, 2025AI/MLjnack

Exciting!

adobe just released a method that can generate transparent videos from text and images pic.twitter.com/zWKGvDxPxk

— Dreaming Tulpa (@dreamingtulpa) January 8, 2025

From the project page:

Alpha channels are crucial for visual effects (VFX), allowing transparent elements like smoke and reflections to blend seamlessly into scenes. We introduce TransPixar, a method to extend pretrained video models for RGBA generation while retaining the original RGB capabilities. […] Our approach effectively generates diverse and consistent RGBA videos, advancing the possibilities for VFX and interactive content creation.

NVIDIA + Flux = 3D magic

January 7, 20253D, AI/MLjnack

I may never stop being pissed that that the Firefly-3D integration we previewed nearly two years ago didn’t yield more fruit, at least on my watch:

Check out a sneak peek of #AdobeFirefly‘s forthcoming 3D module: https://t.co/3OHUqD4ZmI pic.twitter.com/E2ylawcPC1

— John Nack (@jnack) April 22, 2023

The world moves on, and now NVIDIA has teamed up with Black Forest Labs to enable 3D-conditioned image generation. Check out this demo (starting around 1:31:48):

Details:

For users interested in integrating the FLUX NIM microservice into their workflows, we have collaborated with NVIDIA to launch the NVIDIA AI Blueprint for 3D-guided generative AI. This packaged workflow allows users to guide image generation by laying out a scene in 3D applications like Blender, and using that composition with the FLUX NIM microservice to generate images that adhere to the scene. This integration simplifies image generation control and showcases what’s possible with FLUX models.

Skillful Lovecraftian horror

January 6, 2025AI/MLjnack

The Former Bird App™ is of course awash in mediocre AI-generated video creations, so it’s refreshing to see what a gifted filmmaker (in this case Ruairi Robinson) can do with emerging tools (in this case Google Veo)—even if that’s some slithering horror I’d frankly rather not behold!

AI vids get sublime & ridiculous

January 5, 2025AI/MLjnack

Matan Cohen-Grumi (see previous) asks, “What if music icons walked into their art?”

View this post on Instagram

A post shared by Matan Cohen Grumi (@matancohengrumi)

And on a much more ridiculous tip, there’s the Belt Squared & beyond!

pic.twitter.com/BcblyzAChF

— Awful Taste But Great Execution (@AwfulButGreat) January 4, 2025

New AI-powered upscalers arrive

December 30, 2024AI/ML, Photographyjnack

Check out the latest from Topaz:

Topaz really cooked with their new upscaling model called “redefine” — basically every CSI “enhance” meme you’ve seen IRL.

Settings:
– 4x Upscale
– Creativity: 2
– Texture: 3
– No prompt

It’s basically the Topaz take on the magnific style of “creative upscaling” where you use… pic.twitter.com/T7dLoAjFJt

— Bilawal Sidhu (@bilawalsidhu) December 17, 2024

Alternately, you can run InvSR via Gradio:

Image super-resolution model just dropped! Superior results even with a single sampling step.

InvSR: Arbitrary-steps Image Super-resolution via Diffusion Inversion. pic.twitter.com/gS7uoGwnQ8

— Gradio (@Gradio) December 16, 2024

Strolling through the latent space in Runway

December 24, 2024AI/ML, User Interfacejnack

I’ve long wanted—and advocated for building—this kind of flexible, spatial way to compose & blend among ideas. Here’s to new ideas for using new tools.

Supporting Non-Linear Exploration

Creative exploration rarely follows a straight line. The graph structure naturally affords exploration by allowing users to diverge at various points, creating new forks of possible alternatives. As more exploration occurs, the graph grows… pic.twitter.com/Yq18Caj94T

— Runway (@runwayml) December 2, 2024

Instagram previews AI-generated clothing & environments

December 21, 2024AI/MLjnack

It’s a touch odd to me that Meta is investing here while also shutting down the Meta Spark AR lens platform, but I guess interest in lenses has broadly faded, and AI interpretation of images may prove to be more accessible & scalable. (I wonder what’ll be its Dancing Hot Dog moment.)

View this post on Instagram

A post shared by Adam Mosseri (@mosseri)

Cute creatures emerge from Sora

December 20, 2024AI/MLjnack

I love seeing exactly how Chad Nelson was able to construct a Little Big Planet-inspired game world through some creative prompting & tweening in Open AI’s new Sora video creation model. Check out his exploratory process:

View this post on Instagram

A post shared by Chad Nelson (@dailydall.e)

A rather incredible demo of Pika Scene Ingredients

December 19, 2024AI/MLjnack

Director Matan Cohen-Grumi shows off the radical acceleration in VFX-heavy storytelling that’s possible through emerging tools—including Pika’s new Scene Ingredients:

For 10 years, I directed TV commercials, where storytelling was intuitive—casting characters, choosing locations, and directing scenes effortlessly. When I shifted to AI over a year ago, the process felt clunky—hacking together solutions, spending hours generating images, and… pic.twitter.com/pJUamLFgWI

— Matan Cohen-Grumi (@MatanCohenGrumi) December 18, 2024

Google introduces “Whisk,” a fun image remixer

December 19, 2024AI/MLjnack

Check out this fun little toy:

Instead of generating images with long, detailed text prompts, Whisk lets you prompt with images. Simply drag in images, and start creating.

Whisk lets you input images for the subject, one for the scene and another image for the style. Then, you can remix them to create something uniquely your own, from a digital plushie to an enamel pin or sticker.

Meet Whisk! Our new experiment that lets you use images as prompts to visualize your ideas and tell your story. Try it now: https://t.co/BR1z7gmDs6 pic.twitter.com/2zrPLQZlga

— labs.google (@labsdotgoogle) December 16, 2024

The blog post gives a bit more of a peek behind the scenes & sets some expectations:

Since Whisk extracts only a few key characteristics from your image, it might generate images that differ from your expectations. For example, the generated subject might have a different height, weight, hairstyle or skin tone. We understand these features may be crucial for your project and Whisk may miss the mark, so we let you view and edit the underlying prompts at any time.

In our early testing with artists and creatives, people have been describing Whisk as a new type of creative tool — not a traditional image editor. We built it for rapid visual exploration, not pixel-perfect edits. It’s about exploring ideas in new and creative ways, allowing you to work through dozens of options and download the ones you love.

And yes, uploading a 19th-century dog illustration to generate a plushie dancing an Irish jig is definitely the most JNack way to ~~squander precious work time~~ do vital market research. 🙂

Ideogram AI enables batch creation

December 18, 2024AI/ML, Ideogramjnack

I’m a near-daily user of Ideogram to create all manner of images—mainly goofy dad jokes to (ostensibly) entertain my family. Now they’re enabling batch creation to facilitate creation of lots of variations (e.g. versions of a logo):

Sora + scissors = … crazy bird puppetry?

December 17, 2024AI/MLjnack

Check out this wild video-to-video demo from Nathan Shipley:

Sora Remix test: Scissors to crane

Prompt was “Close up of a curious crane bird looking around a beautiful nature scene by a pond. The birds head pops into the shot and then out.” pic.twitter.com/CvAkdkmFBQ

— Nathan Shipley (@CitizenPlain) December 10, 2024

The cool generative 3D hits keep coming

December 15, 20243D, AI/MLjnack

Just a taste of the torrent the blows past daily on The Former Bird App:

Rodin 3D: “Rodin 3D AI can create stunning, high-quality 3D models from just text or image inputs.”
Trellis 3D: “Iterative prompting/mesh editing. You can now prompt ‘remove X, add Y, Move Z, etc.’… Allows decoding to different output formats: Radiance Fields, 3D Gaussians, and meshes.”
Blender GPT: “Generating 3D assets has never been easier. Here’s me putting together an entire 3D scene in just over a minute.”

Google demos amazing image editing done purely through voice

December 13, 2024AI/MLjnack

This might be the world’s lowest-key demo of what promises to be truly game-changing technology!

I’ve tried a number of other attempts at unlocking this capability (e.g. Meta.ai (see previous), Playground.com, and what Adobe sneak-peeked at the Firefly launch in early 2023), but so far I’ve found them all more unpredictable & frustrating than useful. Could Gemini now have turned the corner? Only hands-on testing (not yet broadly available) will tell!

Photoshop’s Object Selection tool gets upgraded

December 11, 2024AI/MLjnack

Nice to see this progress. (FWIW Microsoft Designer features similar tech; just putting that out there. :-))

New in the @Photoshop Beta! The Object Selection tool just got supercharged! pic.twitter.com/VrfxCQa84W

— Howard Pinsky (@Pinsky) December 9, 2024

Oil painting in Photoshop with AI

December 10, 2024AI/ML, Illustrationjnack

Karen X, back doing crafty Karen X things:

AI painting tutorial

Edited on my Intel AI PC – the DELL XPS 13 powered by Intel Core Ultra #ad pic.twitter.com/wcqpR3RhFk

— Karen X. Cheng (@karenxcheng) December 10, 2024

Shedding new light with LumiNet

December 9, 2024AI/ML, Relightingjnack

Diffusion models are ushering in what feels like a golden(-hour) age in relighting (see previous). Among the latest offerings is LumiNet:

[6/7] Here are a few more random relighting!

How accurate are these results? That’s very hard to answer at the moment But our tests on the MIT dataset, our user study, plus qualitative results all point to us being on the right track.

It’s like we’ve cracked open a… pic.twitter.com/1FNlz8S9Fk

— Anand Bhattad (@anand_bhattad) December 5, 2024