Category Archives: Illustration

AI-made avatars for LinkedIn, Tinder, and more

As I say, another day, another specialized application of algorithmic fine-tuning. Per Vice:

For $19, a service called PhotoAI will use 12-20 of your mediocre, poorly-lit selfies to generate a batch of fake photos specially tailored to the style or platform of your choosing. The results speak to an AI trend that seems to regularly jump the shark: A “LinkedIn” package will generate photos of you wearing a suit or business attire…

…while the “Tinder” setting promises to make you “the best you’ve ever looked”—which apparently means making you into an algorithmically beefed-up dudebro with sunglasses. 

Meanwhile, the quality of generated faces continues to improve at a blistering pace:

Crowdsourced AI Snoop Doggs (is a real headline you can now read)

The Doggfather recently shared a picture of himself (rendered presumably via some Stable Diffusion/DreamBooth personalization instance)…

…thus inducing fans to reply with their own variations (click tweet above to see the thread). Among the many fun Snoop Doggs (or is it Snoops Dogg?), I’m partial to Cyberpunk…

…and Yodogg:

MyHeritage introduces “AI Time Machine”

Another day, another special-purpose variant of AI image generation.

A couple of years ago, MyHeritage struck a chord with the world via Deep Nostalgia, an online app that could animate the faces of one’s long-lost ancestors. In reality it could animate just about any face in a photo, but I give them tons of credit for framing the tech in a really emotionally resonant way. It offered not a random capability, but rather a magical window into one’s roots.

Now the company is licensing tech from Astria, which itself builds on Stable Diffusion & Google Research’s DreamBooth paper. Check it out:

Interestingly (perhaps only to me), it’s been hard for MyHeritage to sustain the kind of buzz generated by Deep Nostalgia. They later introduced the much more ambitious DeepStory, which lets you literally put words in your ancestors’ mouths. That seems not to have moved the needle on awareness, at least not in the way that the earlier offering did. Let’s see how portrait generation fares.

PetPortrait.ai promises bespoke images of animals

We’re at just the start of what I expect to be an explosion of hyper-specific offerings powered by AI.

For $24, PetPortrait.ai offers “40 high resolution, beautiful, one-of-a-kind portraits of your pets in a variety of styles.” They say it takes 4-6 hours and requires the following input:

  • ~10 portrait photos of their face
  • ~5 photos from different angles of their head and chest
  • ~5 full-body photos

It’ll be interesting to see what kind of traction this gets. The service Turn Me Royal sells human-made portraits in a similar vein, and we delighted our son by commissioning this doge-as-Venetian-doge portrait (via an artist on Etsy) a couple of years ago:

Runway “Infinite Canvas” enables outpainting

I’ve tried it & it’s pretty slick. These guys are cooking with gas! (Also, how utterly insane would this have been to see even six months ago?! What a year, what a world.)

A fistful of generative imaging news

Man, I can’t keep up with this stuff—and that’s a great problem to have. Here are some interesting finds from just the last few days:

Wayback machine: When “AI” was “Adobe Illustrator”

Check out a fun historical find from Adobe evangelist Paul Trani:

https://twitter.com/paultrani/status/1581008882541133824?s=46&t=XjcRX5DdV1OKyzGKVimjTA

The video below shipped on VHS with the very first version of Adobe Illustrator. Adobe CEO & Illustrator developer John Warnock demonstrated the new product in a single one-hour take. He was certainly qualified, being one of the four developers whose names were listed on the splash screen!

How lucky it was for the world that a brilliant graphics engineer (John) married a graphic designer (Marva Warnock) who could provide constant input as this groundbreaking app took shape. 

If you’re interested in more of the app’s rich history, check out The Adobe Illustrator Story:

Demo: Generating an illustrated narrative with DreamBooth

The Corridor Crew has been banging on Stable Diffusion & Google’s new DreamBooth tech (see previous) that enables training the model to understand a specific concept—e.g. one person’s face. Here they’ve trained it using a few photos of team member Sam Gorski, then inserted him into various genres:

From there they trained up models for various guys at the shop, then created an illustrated fantasy narrative. Just totally incredible, and their sheer exuberance makes the making-of pretty entertaining:

AI art -> “Bullet Hell” & Sirenhead

“Shoon is a recently released side scrolling shmup,” says Vice, “that is fairly unremarkable, except for one quirk: it’s made entirely with art created by Midjourney, an AI system that generates images from text prompts written by users.” Check out the results:

Meanwhile my friend Bilawal is putting generative imaging to work in creating viral VFX:

Alpaca brings Stable Diffusion to Photoshop 🔥

I don’t know much about these folks, but I’m excited to see that they’re working to integrate Stable Diffusion into Photoshop:

You can add your name to the waitlist via their site. Meanwhile here’s another exploration of SD + Photoshop:

🤘Death Metal Furby!🤘

See, isn’t that a more seductive title than “Personalizing Text-to-Image Generation using Textual Inversion”? 😌 But the so-titled paper seems really important in helping generative models like DALL•E to become much more precise. The team writes:

We ask: how can we use language-guided models to turn our cat into a painting, or imagine a new product based on our favorite toy? Here we present a simple approach that allows such creative freedom.

Using only 3-5 images of a user-provided concept, like an object or a style, we learn to represent it through new “words” in the embedding space of a frozen text-to-image model. These “words” can be composed into natural language sentences, guiding personalized creation in an intuitive way.
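To make the “new words in a frozen model” idea a bit more concrete, here’s a toy sketch in Python. It is not the paper’s code: the “model” is a hypothetical fixed linear map (`WEIGHTS`) standing in for a real text-to-image network, and the four “images” are made-up feature vectors. The one thing it does share with the approach described above is that gradient descent only ever updates the new embedding vector, never the model.

```python
import random

def frozen_model(embedding, weights):
    """Stand-in for a frozen generator: a fixed linear map from embedding to features."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in weights]

def learn_concept_embedding(targets, weights, steps=500, lr=0.1):
    """Fit one new 'word' vector by gradient descent; `weights` is read but never changed."""
    dim = len(weights[0])
    embedding = [0.0] * dim
    for _ in range(steps):
        output = frozen_model(embedding, weights)
        grad = [0.0] * dim
        for target in targets:
            residual = [o - t for o, t in zip(output, target)]
            # Gradient of ||W e - y||^2 with respect to e is 2 * W^T (W e - y).
            for j in range(dim):
                grad[j] += 2 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
        embedding = [e - lr * g / len(targets) for e, g in zip(embedding, grad)]
    return embedding

# Hypothetical setup: a tiny frozen model and four noisy examples of one concept.
rng = random.Random(0)
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
true_concept = [0.5, -1.0]
examples = [
    [v + rng.uniform(-0.01, 0.01) for v in frozen_model(true_concept, WEIGHTS)]
    for _ in range(4)
]

learned = learn_concept_embedding(examples, WEIGHTS)
print(learned)  # lands close to true_concept
```

In a real system the learned vector would then be dropped into a prompt like any other word; here it simply ends up near the vector that generated the examples.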

Check out the kind of thing it yields:

AI art: “…Y’know, for kids!”

Many years ago (nearly 10!), when I was in the thick of making up bedtime stories every night, I wished aloud for an app that would help do the following:

  • Record you telling your kids bedtime stories (maybe after prompting you just before bedtime)
  • Transcribe the text
  • Organize the sound & text files (into a book, journal, and/or timeline layout)
  • Add photos, illustrations, and links.
  • Share from the journal to a blog, Tumblr, etc.

I was never in a position to build it, but seeing this fusion of kid art + AI makes me hope again:

So here’s my tweet-length PRD:

  • Record parents’/kids’ voices.
  • Transcribe as a journal.
  • Enable scribbling.
  • Synthesize images on demand.

On behalf of parents & caregivers everywhere, come on, world—LFG! 😛

“Hyperlapse vs. AI,” + AR fashion

Malick Lombion & friends used “more than 1,200 AI-generated art pieces combined with around 1,400 photographs” to create this trippy tour:

Elsewhere, After Effects ninja Paul Trillo is back at it with some amazing video-meets-DALL•E-inpainting work:

I’m eager to see all the ways people might combine generation & fashion—e.g. pre-rendering fabric for this kind of use in AR:

https://twitter.com/XRarchitect/status/1492269937829707776

“Make-A-Scene” promises generative imaging cued via sketching

This new tech from Meta (formerly Facebook) one-ups DALL•E et al by offering more localized control over where elements are placed:

The team writes,

We found that the image generated from both text and sketch was almost always (99.54 percent of the time) rated as better aligned with the original sketch. It was often (66.3 percent of the time) more aligned with the text prompt too. This demonstrates that Make-A-Scene generations are indeed faithful to a person’s vision communicated via the sketch.

“Content-Aware Fill… cubed”: DALL•E inpainting is nuts

The technology’s ability not only to synthesize new content, but to match it to context, blows my mind. Check out this thread showing the results of filling in the gap in a simple cat drawing via various prompts. Some of my favorites are below:

Also, look at what it can build out around just a small sample image plus a text prompt (a chef in a sushi restaurant); just look at it!

Meet “Imagen,” Google’s new AI image synthesizer

What a time to be alive…

Hard on the heels of OpenAI revealing DALL•E 2 last month, Google has announced Imagen, promising “unprecedented photorealism × deep level of language understanding.” Unlike DALL•E, it’s not yet available via a demo, but the sample images (below) are impressive.

I’m slightly amused to see Google flexing on DALL•E by highlighting Imagen’s strengths in figuring out spatial arrangements & coherent text (places where DALL•E sometimes currently struggles). The site claims that human evaluators rate Imagen output more highly than what comes from competitors (e.g. Midjourney).

I couldn’t be more excited about these developments—most particularly to figure out how such systems can enable amazing things in concert with Adobe tools & users.


A charming Route 66 doodle from Google

Last year I took my then-11yo son Henry (aka my astromech droid) on a 2000-mile “Miodyssey” down Route 66 in my dad’s vintage Miata. It was a great way to see the country (see more pics & posts than you might ever want), and despite the tight quarters we managed not to kill one another—or to get slain by Anton Chigurh in an especially murdery Texas town (but that’s another story!).

In any event, we were especially charmed to see the Goog celebrate the Mother Road in this doodle:

DALL•E 2 looks too amazing to be true

There’s no way this is real, is there?! I think it must use NFW technology (No F’ing Way), augmented with a side of LOL WTAF. 😛

Here’s an NYT video showing the system in action:

The NYT article offers a concise, approachable description of how the approach works:

A neural network learns skills by analyzing large amounts of data. By pinpointing patterns in thousands of avocado photos, for example, it can learn to recognize an avocado. DALL-E looks for patterns as it analyzes millions of digital images as well as text captions that describe what each image depicts. In this way, it learns to recognize the links between the images and the words.

When someone describes an image for DALL-E, it generates a set of key features that this image might include. One feature might be the line at the edge of a trumpet. Another might be the curve at the top of a teddy bear’s ear.

Then, a second neural network, called a diffusion model, creates the image and generates the pixels needed to realize these features. The latest version of DALL-E, unveiled on Wednesday with a new research paper describing the system, generates high-resolution images that in many cases look like photos.

Though DALL-E often fails to understand what someone has described and sometimes mangles the image it produces, OpenAI continues to improve the technology. Researchers can often refine the skills of a neural network by feeding it even larger amounts of data.
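For the curious, here’s a very small Python sketch of the arithmetic that diffusion models in general are built around. It is an illustration under my own assumptions, not DALL-E’s implementation: the three-number “image,” the noise schedule (`betas`), and the function names are all hypothetical. It blends a clean signal with Gaussian noise, then shows that if you know the noise exactly, you can undo the blend. A real diffusion model trains a neural network to predict that noise; this sketch cheats and hands the true noise back in.

```python
import math
import random

def alpha_bar(t, betas):
    """Cumulative product of (1 - beta) up to and including step t."""
    product = 1.0
    for beta in betas[: t + 1]:
        product *= 1.0 - beta
    return product

def add_noise(x0, noise, t, betas):
    """Forward process: blend the clean signal with noise for timestep t."""
    a = alpha_bar(t, betas)
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]

def remove_noise(xt, predicted_noise, t, betas):
    """Invert the blend, given a prediction of the noise that was added."""
    a = alpha_bar(t, betas)
    return [(x - math.sqrt(1.0 - a) * n) / math.sqrt(a) for x, n in zip(xt, predicted_noise)]

# Assumed linear noise schedule over 100 steps (an illustrative choice).
betas = [1e-4 + (0.02 - 1e-4) * i / 99 for i in range(100)]

rng = random.Random(1)
x0 = [0.2, -0.7, 1.0]                 # stand-in for pixel values
noise = [rng.gauss(0.0, 1.0) for _ in x0]

noisy = add_noise(x0, noise, 99, betas)
recovered = remove_noise(noisy, noise, 99, betas)  # a trained network would supply the noise
print(noisy, recovered)
```

The hard part in practice is the noise prediction itself, which is where all that training data comes in.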

I can’t wait to try it out.

Fantastic Shadow Beasts (and Where To Find Them)

I’m not sure who captured this image (conservationist Beverly Joubert, maybe?), or whether it’s indeed the National Geographic Picture of The Year, but it’s stunning no matter what. Take a close look:

Elsewhere I love this compilation of work from “Shadowologist & filmmaker” Vincent Bal:

A post shared by WELCOME TO THE UNIVERSE OF ART (@artistsuniversum)

“Why are NFTs so ugly?”

I swear to God, stuff like this makes me legitimately feel like I’m having a stroke:

https://twitter.com/hapebeastgang/status/1450431456216588290?s=21

And that example, curiously, seems way more technically & aesthetically sophisticated than the bulk of what I see coming from the “NFT art” world. I really enjoyed this explication of why so much of such content seems like cynical horseshit—sometimes even literally:

Different Strokes: 3D surface analysis helps computers identify painters

Researchers at NVIDIA & Case Western Reserve University have developed an algorithm that can distinguish different painters’ brush strokes “at the bristle level”:

Extracting topographical data from a surface with an optical profiler, the researchers scanned 12 paintings of the same scene, painted with identical materials, but by four different artists. Sampling small square patches of the art, approximately 5 to 15 mm, the optical profiler detects and logs minute changes on a surface, which can be attributed to how someone holds and uses a paintbrush. 

They then trained an ensemble of convolutional neural networks to find patterns in the small patches, sampling between 160 to 1,440 patches for each of the artists. Using NVIDIA GPUs with cuDNN-accelerated deep learning frameworks, the algorithm matches the samples back to a single painter.

The team tested the algorithm against 180 patches of an artist’s painting, matching the samples back to a painter at about 95% accuracy.
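The overall shape of that pipeline (sample small patches, let several models vote on each one, then tally the votes per painting) is easy to sketch in Python. To be clear about what’s assumed here: the researchers used convolutional neural networks on real profilometry data, while this toy swaps in hypothetical threshold classifiers on a made-up “roughness” score, with random height maps and only two pretend artists. Only the patch sampling and majority voting mirror the description above.

```python
import random
from collections import Counter

def sample_patches(height_map, patch_size, count, rng):
    """Cut `count` random square patches out of a 2D grid of surface heights."""
    rows, cols = len(height_map), len(height_map[0])
    patches = []
    for _ in range(count):
        r = rng.randrange(rows - patch_size + 1)
        c = rng.randrange(cols - patch_size + 1)
        patches.append([row[c:c + patch_size] for row in height_map[r:r + patch_size]])
    return patches

def roughness(patch):
    """Mean absolute height difference between horizontally adjacent samples."""
    diffs = [abs(row[i + 1] - row[i]) for row in patch for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def make_classifier(threshold):
    """Hypothetical stand-in for one trained CNN: thresholds patch roughness."""
    return lambda patch: "artist_A" if roughness(patch) < threshold else "artist_B"

def ensemble_vote(patch, classifiers):
    """Majority vote of the ensemble on a single patch."""
    return Counter(clf(patch) for clf in classifiers).most_common(1)[0][0]

def attribute_painting(height_map, classifiers, rng, patch_size=8, count=25):
    """Label a whole painting by majority vote over its sampled patches."""
    patches = sample_patches(height_map, patch_size, count, rng)
    return Counter(ensemble_vote(p, classifiers) for p in patches).most_common(1)[0][0]

# Made-up data: artist A leaves a smooth surface, artist B a rough one.
rng = random.Random(42)
smooth = [[rng.uniform(0.0, 0.1) for _ in range(32)] for _ in range(32)]
rough = [[rng.uniform(0.0, 1.0) for _ in range(32)] for _ in range(32)]
classifiers = [make_classifier(t) for t in (0.10, 0.15, 0.20)]

smooth_label = attribute_painting(smooth, classifiers, rng)
rough_label = attribute_painting(rough, classifiers, rng)
print(smooth_label, rough_label)
```

Using an odd number of classifiers and an odd number of patches keeps both votes free of ties in this two-artist toy.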