All posts by jnack

PetPortrait.ai promises bespoke images of animals

We’re at just the start of what I expect to be an explosion of hyper-specific offerings powered by AI.

For $24, PetPortrait.ai offers “40 high resolution, beautiful, one-of-a-kind portraits of your pets in a variety of styles.” They say it takes 4-6 hours and requires the following input:

  • ~10 portrait photos of their face
  • ~5 photos from different angles of their head and chest
  • ~5 full-body photos

It’ll be interesting to see what kind of traction this gets. The service Turn Me Royal offers human-made portraits in a similar vein, and we delighted our son by commissioning this dog-as-Venetian-doge portrait (via an artist on Etsy) a couple of years ago:

Podcast: “Why Figma is selling to Adobe for $20 billion, with CEO Dylan Field”

I had the chance to grab breakfast with Figma founder & CEO Dylan Field a couple of weeks ago, and I found him to be incredibly modest and down to earth. He reminded me of certain fellow Brown CS majors—the brilliant & gracious founding team of Adobe After Effects. I can’t wait for them all to meet someday soon.

In any case, I really enjoyed the hour-long interview Dylan did with Nilay Patel of The Verge. Here’s hoping that the Adobe deal goes through as planned & that we get to do great things together!

Midjourney can produce stunning type

At Adobe MAX a couple of weeks ago, Adobe offered a sneak peek of editable type in Adobe Express being rendered via a generative model:

https://twitter.com/jnack/status/1582818166698217472?s=20&t=yI2t5EpbhqVNWb7Ws9DWxQ

That sort of approach could pair amazingly with this sort of Midjourney output:

I’m not working on such efforts & am not making an explicit link between the two—but broadly speaking, I find the intersection of such primitives/techniques to be really promising.

Adobe 3D Design is looking for 2023 interns

These sound like great gigs!

The 3D and Immersive Design Team at Adobe is looking for a design intern who will help envision and build the future of Adobe’s 3D and MR creative tools.

With the Adobe Substance 3D Collection and Adobe Aero, we’re making big moves in 3D, but it is still early days! This is a huge opportunity space to shape the future of 3D and AR at Adobe. We believe that tools shape our world, and by building the tools that power 3D creativity we can have an outsized impact on our world.

Runway “Infinite Canvas” enables outpainting

I’ve tried it & it’s pretty slick. These guys are cooking with gas! (Also, how utterly insane would this have been to see even six months ago?! What a year, what a world.)

“Mundane Halloween” win: “Person whose skeleton is being estimated by machine learning” 

Happy day to all who celebrate. 😌

The whole thread is hilarious & well worth a look:

A fistful of generative imaging news

Man, I can’t keep up with this stuff—and that’s a great problem to have. Here are some interesting finds from just the last few days:

Adobe “Made In The Shade” sneak is 😎

OMG—interactive 3D shadow casting in 2D photos FTW! 🔥

In this sneak, we re-imagine what image editing would look like if we used Adobe Sensei-powered technologies to understand the 3D space of a scene – the geometry of a road and the car on the road, and the trees surrounding, the lighting coming from the sun and the sky, the interactions between all these objects leading to occlusions and shadows – from a single 2D photograph.

Check out AI backdrop generation, right in the Photoshop beta today

One of the sleeper features that debuted at Adobe MAX is the new Create Background, found under Neural Filters. (Note that you need to be running the current public beta release of Photoshop, available via the Creative Cloud app—y’know, that little “Cc” icon dealio you ignore in your menu bar. 🙃)

As this quick vid demonstrates, the filter not only generates backgrounds based on text; it also links to a Behance gallery containing images and popular prompts. You can use these visuals as inspiration, then use the prompts to produce artwork within the plugin:

https://youtu.be/oMVfxyQbO5c?t=74

Here’s the Behance browser:

New Lightroom features: A 1-minute tour, plus a glimpse of the future

The Lightroom team has rolled out a ton of new functionality, from smarter selections to adaptive presets to performance improvements. You should read up on the whole shebang—but for a top-level look, spend a minute with Ben Warde:

And looking a bit more to the future, here’s a glimpse at how generative imaging (in the style of DALL•E, Stable Diffusion, et al) might come into LR. Feedback & ideas welcome!

Stable Diffusion + Adobe Fonts = 🧙‍♂️🔥

Check out my teammates’ new explorations, demoed here on Adobe Express:

Per the blog post:

Generative AI incorporated into Adobe Express will help less experienced creators achieve their unique goals. Rather than having to find a pre-made template to start a project with, Express users could generate a template through a prompt, and use Generative AI to add an object to the scene, or create a unique font based on their description. But they still will have full control — they can use all of the Adobe Express tools for editing images, changing colors, and adding fonts to create the flyer, poster, or social media post they imagine.

Turn images into usable Stable Diffusion prompts

LatentSpace.dev promises to turn your images into text prompts that can be used in Stable Diffusion to create new artwork. Watch it work:

It interpreted a pic of my old whip as being, among other things, a “5. 1975 pontiac firebird shooting brake wagon estate.” Not entirely bad! 😌
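
LatentSpace.dev is a hosted service, but the same image-to-prompt trick can be approximated locally with the open-source clip-interrogator package, which pairs CLIP with a captioning model to reverse-engineer a descriptive prompt. A minimal sketch, assuming the package is pip-installed; the file name is just a placeholder:

```python
# Rough local equivalent of image-to-prompt, using the open-source
# clip-interrogator package (pip install clip-interrogator).
from PIL import Image
from clip_interrogator import Config, Interrogator

# ViT-L/14 is the CLIP encoder that Stable Diffusion 1.x uses.
ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))

# "old_whip.jpg" is a placeholder for whatever photo you want to reverse.
image = Image.open("old_whip.jpg").convert("RGB")

# Produces a text prompt you can paste straight into Stable Diffusion.
prompt = ci.interrogate(image)
print(prompt)
```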

“Imagic”: Text-based editing of photos

It seems almost too good to be true, but Google Researchers & their university collaborators have unveiled a way to edit images using just text:

In this paper we demonstrate, for the very first time, the ability to apply complex (e.g., non-rigid) text-guided semantic edits to a single real image. For example, we can change the posture and composition of one or multiple objects inside an image, while preserving its original characteristics. Our method can make a standing dog sit down or jump, cause a bird to spread its wings, etc. — each within its single high-resolution natural image provided by the user.

Contrary to previous work, our proposed method requires only a single input image and a target text (the desired edit). It operates on real images, and does not require any additional inputs (such as image masks or additional views of the object).

I can’t wait to see it in action!

Stable Diffusion meets WebAR

Back at the start of my DALL•E journey, I wished aloud for a diffusion-powered mobile app:

https://twitter.com/jnack/status/1529977613623496704?s=20&t=dlYc1z2m-Cxb61G0KCaiIw

Now, thanks to the openness of Stable Diffusion & WebAR, creators are bringing that vision closer to reality:

https://twitter.com/stspanho/status/1581707753747537920?s=20&t=JPLmD_bV0U4Gkv2-2bJX-g

I can’t wait to see what’s next!

Blender + Stable Diffusion = 🪄

Easy placement/movement of 3D primitives -> realistic/illustrative rendering has long struck me as extremely promising. Using tech like StyleGAN to render from 3D can produce interesting results, but it’s been difficult to bring the level of quality & consistency up to what Adobe users demand.

Now with Stable Diffusion (and, one hopes, other diffusion models in the future) attached to Blender (and, one hopes, other object manipulation tools), the vision is getting closer to reality:
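
I don’t know exactly how the demo above wires things together, but the core step is almost certainly img2img: render simple primitives from the 3D viewport, then let the diffusion model restyle the frame while keeping its composition. Here’s a hedged sketch of just that step using Hugging Face’s diffusers library (the render path and prompt are my own placeholders, not the demo’s actual integration):

```python
# Sketch of the img2img step: a plain Blender viewport render goes in,
# a stylized frame comes out. Uses Hugging Face diffusers.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# "viewport_render.png" is a placeholder for a frame exported from Blender.
init_image = Image.open("viewport_render.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a cozy stone cottage in a misty forest, concept art",
    image=init_image,
    strength=0.6,        # how far the model may drift from the render
    guidance_scale=7.5,  # how strongly to follow the text prompt
).images[0]
result.save("stylized_render.png")
```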

Wayback machine: When “AI” was “Adobe Illustrator”

Check out a fun historical find from Adobe evangelist Paul Trani:

https://twitter.com/paultrani/status/1581008882541133824?s=46&t=XjcRX5DdV1OKyzGKVimjTA

The video below shipped on VHS with the very first version of Adobe Illustrator. Adobe CEO & Illustrator developer John Warnock demonstrated the new product in a single one-hour take. He was certainly qualified, being one of the four developers whose names were listed on the splash screen!

How lucky it was for the world that a brilliant graphics engineer (John) married a graphic designer (Marva Warnock) who could provide constant input as this groundbreaking app took shape. 

If you’re interested in more of the app’s rich history, check out The Adobe Illustrator Story:

Check out NeRF Studio & some eye-popping results

The power & immersiveness of rendering 3D from images is growing at an extraordinary rate. NeRF Studio promises to make creation much more approachable:

https://twitter.com/akanazawa/status/1577686321119645696?s=20&t=OA61aUUy3A6P1aMQiUIzbA

The kind of results one can generate from just a series of photos or video frames is truly bonkers:

Here’s a tutorial on how to use it:

Zooming around the world through Google Street View

“My whole life has been one long ultraviolent hyperkinetic nightmare,” wrote Mark Leyner in “Et Tu, Babe?” That thought comes to mind when glimpsing this short film by Adam Chitayat, stitched together from thousands of Street View images (see Vimeo page for a list of locations).

I love the idea—indeed, back in 2014 I tried to get Google Photos to stitch together visual segues that could interconnect one’s photos—but the pacing here has my old-man brain pulling the e-brake after just a short exposure. YMMV, so here ya go:

[Via]

Google Photos redesigns Memories

Nice work from my old crew:

With the update that starts rolling out today, you’ll see more videos — including the best snippets from your longer videos that Photos will automatically select and trim so you can relive the most meaningful moments. Even your still photos will feel more dynamic thanks to a subtle zoom that brings movement to your memories. And to bring it all together, next month we’ll start adding instrumental music to some Memories.

Happily, they’ve finally built a subset of the collage editor I spec’d out eight years ago (🧂🤷🏼).

Also,

Soon, you’ll begin to see full Cinematic Memories that transform multiple still photos into an end-to-end cinematic experience, taking you back to that moment in time. Cinematic Memories will also have music, making your photos feel a little more like a movie.

Snapchat: Even simple AR is effective AR

A quarter billion people engage with AR content every day, the company says.

And interestingly, one need not create a complex lens in order to have it pay off:

“The research found that simple AR can be just as performant as a sophisticated, custom Lens in driving both upper and lower-funnel metrics like brand awareness and purchase intent. Brands with the resources to execute a more sophisticated Lens will see additional benefits in mid-funnel brand metrics, including favorability and consideration.”

Are high-res shots from the Insta360 X3 any good?

Well… kinda? I’m feeling somewhat hoodwinked, though. The new cam promises 72-megapixel captures, compared to 18 from its predecessor. This happens via some kind of 4x upsampling, it appears, and at least right now that’s incompatible with shooting HDR images.

Thus, as you can see via the comparisons below & via these original images, I was able to capture somewhat better detail (e.g. look at text) at the cost of getting worse tonal range (e.g. see the X2 lying on top of the book).

I need to carve out time to watch the tutorial below on how to wring the best out of this new cam.

Meta introduces text to video 👀

OMG, what is even happening?!

Per the site,

The system uses images with descriptions to learn what the world looks like and how it is often described. It also uses unlabeled videos to learn how the world moves. With this data, Make-A-Video lets you bring your imagination to life by generating whimsical, one-of-a-kind videos with just a few words or lines of text.

Completely insane. DesireToKnowMoreIntensifies.gif!

DALL•E is now available to everyone

Whew—no more wheedling my “grand-mentee” Joanne on behalf of colleagues wanting access. 😅

Starting today, we are removing the waitlist for the DALL·E beta so users can sign up and start using it immediately. More than 1.5M users are now actively creating over 2M images a day with DALL·E—from artists and creative directors to authors and architects—with over 100K users sharing their creations and feedback in our Discord community.

You can sign up here. Also exciting:

We are currently testing a DALL·E API with several customers and are excited to soon offer it more broadly to developers and businesses so they can build apps on this powerful system.

It’s hard to overstate just how much this groundbreaking technology has rocked our whole industry—all since publicly debuting less than 6 months ago! Congrats to the whole team. I can’t wait to see what they’re cooking up next.

NVIDIA’s GET3D promises text-to-model generation

Depending on how well it works, tech like this could be the greatest unlock in 3D creation the world has ever known.

The company blog post features interesting, promising details:

Though quicker than manual methods, prior 3D generative AI models were limited in the level of detail they could produce. Even recent inverse rendering methods can only generate 3D objects based on 2D images taken from various angles, requiring developers to build one 3D shape at a time.

GET3D can instead churn out some 20 shapes a second when running inference on a single NVIDIA GPU — working like a generative adversarial network for 2D images, while generating 3D objects. […]

GET3D gets its name from its ability to Generate Explicit Textured 3D meshes — meaning that the shapes it creates are in the form of a triangle mesh, like a papier-mâché model, covered with a textured material. This lets users easily import the objects into game engines, 3D modelers and film renderers — and edit them.

See also Dream Fields (mentioned previously) from Google:

Photoshop-Stable Diffusion plugin adds inpainting with masks, layer-based img2img

Christian Cantrell + the Stability devs remain a house on fire:

Here’s a more detailed (3-minute) walk-through of this free plugin:
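
The plugin itself is the easiest way to try this inside Photoshop, but for a sense of what’s happening under the hood, here’s a rough equivalent of mask-based inpainting using Hugging Face’s diffusers library (the file names and prompt are my own placeholders, not the plugin’s code):

```python
# Sketch of mask-based inpainting: regenerate only the white region of
# the mask (e.g. a layer mask exported from Photoshop).
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Placeholders: the source photo and a black-and-white mask image.
image = Image.open("photo.png").convert("RGB").resize((512, 512))
mask = Image.open("mask.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a red vintage convertible parked on the street",
    image=image,
    mask_image=mask,
).images[0]
result.save("inpainted.png")
```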

Demo: Generating an illustrated narrative with DreamBooth

The Corridor Crew has been banging on Stable Diffusion & Google’s new DreamBooth tech (see previous) that enables training the model to understand a specific concept—e.g. one person’s face. Here they’ve trained it using a few photos of team member Sam Gorski, then inserted him into various genres:

From there they trained up models for various guys at the shop, then created an illustrated fantasy narrative. Just totally incredible, and their sheer exuberance makes the making-of pretty entertaining:
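
For anyone curious what using such a fine-tuned model looks like once training is done: DreamBooth binds the new concept to a rare placeholder token (often “sks”), and from then on you simply prompt with that token. A hedged sketch with diffusers, assuming a locally saved fine-tuned checkpoint; the path, token, and prompts are placeholders, not Corridor’s actual setup:

```python
# Sketch of generating genre shots from a DreamBooth-fine-tuned model.
# "./dreambooth-sam" is a placeholder path to a checkpoint trained on a
# handful of photos, with "sks person" as the learned placeholder token.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "./dreambooth-sam", torch_dtype=torch.float16
).to("cuda")

for genre in ["epic fantasy knight", "1980s action hero", "space opera captain"]:
    image = pipe(
        prompt=f"portrait of sks person as a {genre}, detailed illustration",
        guidance_scale=7.5,
    ).images[0]
    image.save(f"sam_{genre.replace(' ', '_')}.png")
```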

Generative dancing about architecture

Paul Trillo is back at it, extending a Chinese restaurant via Stable Diffusion, After Effects, and Runway:

Elsewhere, check out this mutating structure. (Next up: Fallingwater made of actual falling water?)

Lexica adds reverse-image search

The Stable Diffusion-centered search engine (see a few posts back) now makes it easy to turn a real-world concept into a Stable Diffusion prompt:

This seems like precisely what I pined for publicly, albeit then about DALL•E:

Honoring creators’ wishes: Source+ & “Have I Been Trained”

I’m really excited to see this work from artists Holly Herndon & Mat Dryhurst. From Input Mag:

Dryhurst and Herndon are developing a standard they’re calling Source+, which is designed as a way of allowing artists to opt into — or out of — allowing their work being used as training data for AI. (The standard will cover not just visual artists, but musicians and writers, too.) They hope that AI generator developers will recognize and respect the wishes of artists whose work could be used to train such generative tools.

Source+ (now in beta) is a product of the organization Spawning… [It] also developed Have I Been Trained, a site that lets artists see if their work is among the 5.8 billion images in the Laion-5b dataset, which is used to train the Stable Diffusion and MidJourney AI generators. The team plans to add more training datasets to pore through in the future.

The creators also draw a distinction between the rights of living vs. dead creators:

The project isn’t aimed at stopping people putting, say, “A McDonalds restaurant in the style of Rembrandt” into DALL-E and gazing on the wonder produced. “Rembrandt is dead,” Dryhurst says, “and Rembrandt, you could argue, is so canonized that his work has surpassed the threshold of extreme consequence in generating in their image.” He’s more concerned about AI image generators impinging on the rights of living, mid-career artists who have developed a distinctive style of their own.

And lastly,

“We’re not looking to build tools for DMCA takedowns and copyright hell,” he says. “That’s not what we’re going for, and I don’t even think that would work.”

On a personal note, I’m amused to see what the system thinks constitutes “John Nack”—apparently chubby German-ish old chaps…? 🙃

Google & NASA bring 3D to search

Great to see my old teammates (with whom I was working to enable cloud-rendered as well as locally rendered 3D experiences) continuing their work.

NASA and Google Arts & Culture have partnered to bring more than 60 3D models of planets, moons and NASA spacecraft to Google Search. When you use Google Search to learn about these topics, just click on the View in 3D button to understand the different elements of what you’re looking at even better. These 3D annotations will also be available for cells, biological concepts (like skeletal systems), and other educational models on Search.