All posts by jnack

Generative dancing about architecture

Paul Trillo is back at it, extending a Chinese restaurant via Stable Diffusion, After Effects, and Runway:

Elsewhere, check out this mutating structure. (Next up: Fallingwater made of actual falling water?)

Lexica adds reverse-image search

The Stable Diffusion-centered search engine (see a few posts back) now makes it easy to turn a real-world concept into a Stable Diffusion prompt:

This seems like precisely what I pined for publicly, albeit then about DALL•E:

Honoring creators’ wishes: Source+ & “Have I Been Trained”

I’m really excited to see this work from artists Holly Herndon & Mat Dryhurst. From Input Mag:

Dryhurst and Herndon are developing a standard they’re calling Source+, which is designed as a way of allowing artists to opt into — or out of — having their work used as training data for AI. (The standard will cover not just visual artists, but musicians and writers, too.) They hope that AI generator developers will recognize and respect the wishes of artists whose work could be used to train such generative tools.

Source+ (now in beta) is a product of the organization Spawning… [It] also developed Have I Been Trained, a site that lets artists see if their work is among the 5.8 billion images in the LAION-5B dataset, which is used to train the Stable Diffusion and Midjourney AI generators. The team plans to add more training datasets to pore through in the future.

The creators also draw a distinction between the rights of living vs. dead creators:

The project isn’t aimed at stopping people putting, say, “A McDonalds restaurant in the style of Rembrandt” into DALL-E and gazing on the wonder produced. “Rembrandt is dead,” Dryhurst says, “and Rembrandt, you could argue, is so canonized that his work has surpassed the threshold of extreme consequence in generating in their image.” He’s more concerned about AI image generators impinging on the rights of living, mid-career artists who have developed a distinctive style of their own.

And lastly,

“We’re not looking to build tools for DMCA takedowns and copyright hell,” he says. “That’s not what we’re going for, and I don’t even think that would work.”

On a personal note, I’m amused to see what the system thinks constitutes “John Nack”—apparently chubby German-ish old chaps…? 🙃

Google & NASA bring 3D to search

Great to see my old teammates (with whom I was working to enable cloud-rendered as well as locally rendered 3D experiences) continuing their work.

NASA and Google Arts & Culture have partnered to bring more than 60 3D models of planets, moons and NASA spacecraft to Google Search. When you use Google Search to learn about these topics, just click on the View in 3D button to understand the different elements of what you’re looking at even better. These 3D annotations will also be available for cells, biological concepts (like skeletal systems), and other educational models on Search.

Insta360 announces the X3

Who’s got two thumbs & just pulled the trigger? This guuuuuy. 😌

Now, will it be worth it? I sure hope so.

Fortunately, I got to try out the much larger & more expensive One RS 1″ Edition back in July & concluded that it’s not for me (heavier, lacking Bullet Time, and not producing appreciably better quality results—at least for the kind of things I shoot).

I’m of course hoping the X3 (successor to my much-beloved One X2) will be more up my alley. Here’s some third-party perspective:

Relight faces via a slick little web app

Check out ClipDrop’s relighting app, demoed here:

Fellow nerds might enjoy reading about the implementation details.

AI art -> “Bullet Hell” & Sirenhead

“Shoon is a recently released side-scrolling shmup,” says Vice, “that is fairly unremarkable, except for one quirk: it’s made entirely with art created by Midjourney, an AI system that generates images from text prompts written by users.” Check out the results:

Meanwhile my friend Bilawal is putting generative imaging to work in creating viral VFX:

DALL•E outpainting arrives

Let the canvases extend in every direction! The thoughtfully designed new tiling UI makes it easy to synthesize adjacent chunks in sequence, partly overcoming current resolution limits in generative imaging:

Here’s a nice little demo from our designer Davis Brown, who takes his dad Russell’s surreal desert explorations to totally new levels:

Using DALL•E for generative fashion design

Amazing work from the always clever Karen X. Cheng, collaborating with Paul Trillo & others:

A post shared by Karen X (@karenxcheng)

Speaking of Paul, here’s a fun new little VFX creation made using DALL•E:

Photoshop previews new AI-powered photo restoration

Eng manager Barry Young writes,

The latest beta build of Photoshop contains a new feature called Photo Restoration. Whenever I have seen new updates in AI photo restoration over the last few years, I have tried the technology on an old family photo that I have of my great-great-great-grandfather, a Scotsman who lived from 1845 to 1919. I applied the neural filter plus colorize technique to update the image in Photoshop. The restored photo is on the left, the original on the right. It is really astonishing how advanced AI is becoming.

Learn more about accessing the feature in Photoshop here.

Alpaca brings Stable Diffusion to Photoshop 🔥

I don’t know much about these folks, but I’m excited to see that they’re working to integrate Stable Diffusion into Photoshop:

You can add your name to the waitlist via their site. Meanwhile here’s another exploration of SD + Photoshop:

🤘Death Metal Furby!🤘

See, isn’t that a more seductive title than “Personalizing Text-to-Image Generation using Textual Inversion“? 😌 But the so-titled paper seems really important in helping generative models like DALL•E to become much more precise. The team writes:

We ask: how can we use language-guided models to turn our cat into a painting, or imagine a new product based on our favorite toy? Here we present a simple approach that allows such creative freedom.

Using only 3-5 images of a user-provided concept, like an object or a style, we learn to represent it through new “words” in the embedding space of a frozen text-to-image model. These “words” can be composed into natural language sentences, guiding personalized creation in an intuitive way.
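
To make that a bit more concrete, here’s a rough sketch of how such a learned “word” can get used downstream. This assumes the Hugging Face diffusers library and a community-trained concept embedding (my illustration, not the authors’ own release):

```python
# Illustrative sketch only (not the paper's code): load a textual-inversion
# embedding into Stable Diffusion via Hugging Face diffusers, then compose the
# learned pseudo-word into an ordinary prompt.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The embedding was learned from a handful of example images; <cat-toy> is the
# new "word" it adds to the vocabulary while the base model stays frozen.
pipe.load_textual_inversion("sd-concepts-library/cat-toy")

image = pipe("a death metal album cover featuring a <cat-toy>").images[0]
image.save("personalized.png")
```

The key idea is exactly what the quote above describes: the text-to-image model itself stays frozen, and only a new token embedding is learned from the 3-5 example images.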

Check out the kind of thing it yields:

“Curt Skelton,” homebrew AI influencer

[Update: Seems that much of this may be fake. :-\ Still, the fact that it’s remotely plausible is nuts!]

Good lord (and poor Conan!). This creator used:

  • DALL•E to create hundreds of similar-looking images of a face
  • Create Skeleton to convert them into a 3D model
  • DeepMotion.com to generate 3D body animation
  • Deepfake Lab to generate facial animation
  • Audio tools to deepen & distort her voice, creating a new one
@curt.skelton

Stunning landscape photos conjured by robots

The new open-source Stable Diffusion model is pretty darn compelling. Per PetaPixel:

“Just telling the AI something like ‘landscape photography by Marc Adamus, Glacial lake, sunset, dramatic lighting, mountains, clouds, beautiful’ gives instant pleasant looking photography-like images. It is incredible that technology has got to this point where mere words produce such wonderful images (please check the Facebook group for more).” — photographer Aurel Manea
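
For anyone who wants to reproduce this sort of thing locally, here’s a minimal sketch. It assumes the Hugging Face diffusers library and the public v1-4 checkpoint (my choices for illustration, not necessarily Manea’s setup), using the exact prompt quoted above:

```python
# Minimal local sketch: run the quoted prompt through the open-source
# Stable Diffusion v1-4 checkpoint via Hugging Face diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

prompt = ("landscape photography by Marc Adamus, Glacial lake, sunset, "
          "dramatic lighting, mountains, clouds, beautiful")
image = pipe(prompt, guidance_scale=7.5, num_inference_steps=50).images[0]
image.save("glacial_lake.png")
```

The guidance scale and the number of inference steps are the main knobs worth twiddling if the first results look off.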

AI art: “…Y’know, for kids!”

Many years ago (nearly 10!), when I was in the thick of making up bedtime stories every night, I wished aloud for an app that would help do the following:

  • Record you telling your kids bedtime stories (maybe after prompting you just before bedtime)
  • Transcribe the text
  • Organize the sound & text files (into a book, journal, and/or timeline layout)
  • Add photos, illustrations, and links.
  • Share from the journal to a blog, Tumblr, etc.

I was never in a position to build it, but seeing this fusion of kid art + AI makes me hope again:

So here’s my tweet-length PRD:

  • Record parents’/kids’ voices.
  • Transcribe as a journal.
  • Enable scribbling.
  • Synthesize images on demand.

On behalf of parents & caregivers everywhere, come on, world—LFG! 😛

“Hyperlapse vs. AI,” + AR fashion

Malick Lombion & friends used “more than 1,200 AI-generated art pieces combined with around 1,400 photographs” to create this trippy tour:

Elsewhere, After Effects ninja Paul Trillo is back at it with some amazing video-meets-DALL•E-inpainting work:

I’m eager to see all the ways people might combine generation & fashion—e.g. pre-rendering fabric for this kind of use in AR:

https://twitter.com/XRarchitect/status/1492269937829707776

Adobe VP feat… Wu-Tang Clan?!

What the what? From Sébastien Deguy, founder of Allegorithmic & now VP 3D & Immersive at Adobe:

I don’t communicate often about that other part of my activities, but I am so glad I could work with one of my all time favorites that I have to 🙂 My latest track, featuring Raekwon (Wu-Tang Clan) is available on all streaming platforms! Like here on Spotify.

Perhaps even more surprisingly, it slaps! 👏

Video lovers: Adobe’s hiring a Community Relationship Manager for Pro Video

If this sounds like your kind of jam, read on! From the job description:


What you’ll do:

  • Share your knowledge, passion and experience of Adobe Premiere Pro and After Effects with video makers.
  • Engage daily with communities around professional video wherever they are (Reddit, Twitter, Facebook, Instagram, etc.) in two-way conversations, representing Adobe.
  • Be active and visible within the community, build long term relationships and trust, and demonstrate that knowledge and understanding of the users to help Adobe internally.
  • Build relationships with leaders in the identified communities.
  • Establish yourself as a leader through your work and participation in time-sensitive topics and conversations.
  • Answer questions and engage in discussion about Adobe products and policies with a heavy focus on newcomers to the ecosystem.
  • Encourage others through sharing your personal use of and experimentation with professional video tools.
  • Enable conversation, build content and speak about Adobe tools to address the specific audience needs.
  • Understand the competitive landscape and promptly report accordingly.
  • Coordinate with other community, product, marketing and campaign teams to develop mini-engagements and activities for the community (i.e. AMAs with the product team on Reddit, or community activities and discussions via live streams, etc.)
  • Work closely with the broader community team, evangelism team, and product teams to provide insight and feedback to advocate for the pro video community within Adobe and to help drive product development direction.

Stable Diffusion + Artbreeder = creative composition

We are teetering on the cusp of a Cambrian explosion in UI creativity, with hundreds of developers competing to put amazing controls atop a phalanx of ever-improving generative models. These next couple of months & years are gonna be wiiiiiiild.

A really amazing spin on AR furniture shopping

After seeing years & years of AR demos featuring the placement of furniture, I once heard someone say in exasperation, “Bro… how much furniture do you think I buy?”

Happily here’s a decidedly fresh approach, surrounding the user & some real-world furniture with a projection of the person’s 3D-scanned home. Wild!

Now, how easy can 3D home scanning be made—and how much do people care about this kind of scenario? I don’t know, but I love what the tech can enable already.

Old Photoshop ambitions & anxieties, back again with DALL•E & friends

Watching this clip of Photoshop’s 1990 introduction on the Today Show, it’s amazing to hear people raising, 32 years ago, the same ethical questions we contend with now around AI-generated imagery. Also amazing: I now work with Russell’s son Davis (our designer) to explore AI imaging + Photoshop and beyond.

CLIP interrogator reveals what your robo-artist assistant sees

Ever since DALL•E hit the scene, I’ve been wanting to know what words its model for language-image pairing would use to describe images:

Now the somewhat scarily named CLIP Interrogator promises exactly that kind of insight:

What do the different OpenAI CLIP models see in an image? What might be a good text prompt to create similar images using CLIP guided diffusion or another text to image model? The CLIP Interrogator is here to get you answers!

Here’s hoping it helps us get some interesting image -> text -> image flywheels spinning.
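
If you’d rather script it than poke at a demo, here’s a quick sketch assuming the open-source clip-interrogator Python package (an assumption on my part; the project also ships as a notebook/web demo):

```python
# Rough sketch, assuming the open-source clip-interrogator package:
# hand it an image, get back a candidate prompt you can round-trip
# into a text-to-image model.
from PIL import Image
from clip_interrogator import Config, Interrogator

ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
image = Image.open("mystery_artwork.jpg").convert("RGB")
prompt = ci.interrogate(image)
print(prompt)  # feed this back into Stable Diffusion / DALL·E and compare
```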

Snap Research promises 3D creation from photo collections

Hmm—this is no doubt brilliant tech, and I’d like to learn more, but I wonder about the Venn diagram between “Objects that people want in 3D,” “Objects for which a sufficiently large number of good images exist,” and “Objects for which good human-made 3D models don’t already exist.” In my experience photogrammetry is most relevant for making models from extremely specific subjects (e.g. a particular apartment) rather than from common objects that are likely to exist on Sketchfab et al. It’s entirely possible I’m missing a nuanced application here, though. As I say, cool tech!

Insta360 Sphere promises epic aerial shots

Somehow I totally missed this announcement a few months back—perhaps because the device apparently isn’t compatible with my Mavic 2 Pro. I previously bought an Insta360 One R (which can split in half) with a drone-mounting cage, but I found the cam so flaky overall that I never took the step of affixing it to a cage that was said to interfere with GPS signals. In any event, this little guy looks fun:

Ketchup goes AI…? Heinz puts DALL•E to work

Interesting, and of course inevitable:

“This emerging tech isn’t perfect yet, so we got some weird results along with ones that looked like Heinz—but that was part of the fun. We then started plugging in ketchup combination phrases like ‘impressionist painting of a ketchup bottle’ or ‘ketchup tarot card’ and the results still largely resembled Heinz. We ultimately found that no matter how we were asking, we were still seeing results that looked like Heinz.”

Pass the Kemp!

[Via Aaron Hertzmann]

More DALL•E + After Effects magic

Creator Paul Trillo (see previous) is back at it. Here’s new work + a peek into how it’s made:

Head to head: Insta360 One RS 1″ vs. X2

At nearly twice the price while lacking features like Bullet Time, the Insta360 One RS 1″ had better produce way better photos and videos than what comes out of my trusty One X2. I therefore really appreciate this detailed side-by-side comparison. Having used both together, I don’t see a dramatic difference, but this vid certainly makes a good case that the gains are appreciable.

Designers: Come design Photoshop!

Two roles (listed as being based in NYC & Denver) are now open. Check out the descriptions from the team:

—————————

As a key member and thought-leader on this team, you’ll be an integral part of exploring and influencing the next generation of Adobe’s creative tools. Together, we are forging a new class of experience standards for desktop, mobile, and web products for years to come.

You will (among other things):

  • Seek/design the “simple” experiences and interactions that influence our growing portfolio of creative tools. Empower users to delight themselves.
  • Partner closely with fellow senior designers, product managers, senior engineers, and other leaders across different teams to bring new products to life.

What you’ll bring to the team

Must-Haves

  • A minimum of 5 years of industry experience in product design with a proven track record of success
  • Experience (and a love of!) solving complex design and technology problems using systems thinking
  • Excellent communication skills, with the ability to clearly articulate a multi-level problem space and strategy behind design decisions
  • Creative and analytical skills to advocate for and support research, synthesize, and communicate insights that encourage design opportunities and product strategy
  • Passion for understanding how creative people do what they do and how technology plays a role in the creative process
  • Experience establishing user experience patterns across mobile, web, and desktop products within connected platforms