Marques Brownlee talks DALL•E

This may be the most accessible overall intro & discussion I’ve seen, and it’s chock full of fun example output.

Even the system’s frequent text “fails” are often charmingly bizarre—like a snapshot of a dream that makes sense only while dreaming. Some faves from the vid above:

Bird scooters use Google AR to curb illegal parking

Following up on yesterday’s post about Google’s new Geospatial API: developers can now embed a live view that pairs a camera feed with augmentations, and companies like Bird are wasting no time in putting it to use. TNW writes,

When parking a scooter, the app prompts a rider to quickly scan the QR code on the vehicle and its surrounding area using their smartphone camera… [T]his results in precise, centimeter-level geolocation that enables the system to detect and prevent improper parking with extreme accuracy — all while helping monitor user behavior.
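
To make the “centimeter-level” point concrete, here’s a toy sketch of my own (not Bird’s or Google’s actual logic): once you have a precise position fix, deciding whether a scooter sits inside a small designated parking corral becomes a simple distance check. The coordinates and 2-meter radius below are hypothetical.

```python
import math

# Toy illustration of why centimeter-level geolocation matters for parking
# enforcement (my own sketch, not Bird's or Google's code): with a precise fix,
# "is this scooter inside its corral?" reduces to a distance test.
def meters_between(lat1, lon1, lat2, lon2):
    """Approximate ground distance in meters between two nearby lat/lng points."""
    m_per_deg_lat = 111_320.0
    m_per_deg_lon = 111_320.0 * math.cos(math.radians(lat1))
    return math.hypot((lat2 - lat1) * m_per_deg_lat, (lon2 - lon1) * m_per_deg_lon)

def is_parked_properly(scooter, zone_center, zone_radius_m=2.0):
    return meters_between(*scooter, *zone_center) <= zone_radius_m

zone = (51.5079, -0.0877)  # hypothetical parking corral
print(is_parked_properly((51.50791, -0.08771), zone))  # ~1.3 m away -> True
print(is_parked_properly((51.50810, -0.08770), zone))  # ~22 m away  -> False
```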

TechCrunch adds,

Out of the over 200 cities that Lime serves, its VPS is live now in six: London, Paris, Tel Aviv, Madrid, San Diego and Bordeaux. Similar to Bird, Lime’s pilot involves testing the tech with a portion of riders. The company said results from its pilots have been promising, with those who used the new tool seeing a 26% decrease in parking errors compared to riders who didn’t have the tool enabled. 

A deep dive on Google’s new Geospatial API

My friend Bilawal & I collaborated on AR at Google, including our efforts to build a super compact 3D engine for driving spatial annotation & navigation. We’d often talk excitedly about location-based AR experiences, especially the Landmarker functionality arriving in Snapchat. All the while he’s been busy pushing the limits of photogrammetry (including putting me in space!) to scan 3D objects.

Now I’m delighted to see him & his team unveiling the Geospatial API (see blog post, docs, and code), which enables cross-platform (iOS, Android) deployment of experiences that present both close-up & far-off augmentations. Here’s the 1-minute sizzle reel:

For a closer look, check out this interesting deep dive into what it offers & how it works:

A 3lb. wind turbine promises power on the go

Hmm—dunno whether I’d prefer carrying this little dude over just pocketing a battery pack or two—but I dig the idea & message:

Once set up on its tripod, the 3-pound, 40-watt device automatically rotates towards the wind and starts charging its 5V, 12,000 mAh battery. (Alternatively it can charge your device directly via USB.) The company says that in peak conditions, the Shine Turbine can generate enough juice to charge a smartphone in just 20 minutes.
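
Running the numbers as a sanity check (my arithmetic, not Shine’s): the internal pack holds about 60 Wh, so filling it at the full 40 W takes roughly an hour and a half, while 20 minutes at peak output delivers around 13 Wh, which is in the ballpark of a typical smartphone battery.

```python
# Back-of-the-envelope check of the quoted Shine Turbine specs (my arithmetic,
# not the manufacturer's; the ~12 Wh phone battery figure is an assumption).
battery_wh = 5 * 12_000 / 1000                    # 5 V x 12,000 mAh ≈ 60 Wh internal pack
peak_output_w = 40                                # claimed peak output
hours_to_fill_pack = battery_wh / peak_output_w   # ≈ 1.5 h to top off the turbine's pack
phone_wh = 12                                     # rough capacity of a typical phone battery
minutes_to_charge_phone = phone_wh / peak_output_w * 60   # ≈ 18 min at peak output

print(f"Pack: {battery_wh:.0f} Wh, ~{hours_to_fill_pack:.1f} h to fill at {peak_output_w} W")
print(f"Phone: ~{minutes_to_charge_phone:.0f} min at peak output")
```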

DALL•E vs. Reality: Flavawagon Edition

Heh—I got a kick out of seeing how AI would go about hallucinating its idea of what my flamed-out ’84 Volvo wagon looked like. See below for a comparison. And in retrospect, how did I not adorn mine with a tail light made from a traffic cone (or is it giant candy corn?) and “VOOFO NACK”? 😅

https://twitter.com/jnack/status/1523789615039016960?s=20&t=49EvZvax6TnDd6co-_-muw

ApARtment hunting

Among the Google teams working on augmented reality, there was a low-key religious war about the importance of “metric scale” (i.e. matching real-world proportions 1:1). The ARCore team believed it was essential (no surprise, given their particular tech stack), while my team (Research) believed that simply placing things in the world with a best guess as to size, then letting users adjust an object if needed, was often the better path.

I thought of this upon seeing StreetEasy’s new AR tech for apartment-hunting in NYC. At the moment it lets you scan a building to see its inventory. That’s very cool, but my mind jumped to the idea of seeing 3D representations of actual apartments (something the company already offers, albeit not in AR), and I’m amused to think of my old Manhattan place represented in AR: drawing it as a tiny box at one’s feet would be metric scale. 😅 My God that place sucked. Anyway, we’ll see how useful this tech proves & where it can go from here.

“A StreetEasy Instagram poll found that 95% of people have walked past an apartment building and wondered if it has an available unit that meets their criteria. At the same time, 77% have had trouble identifying a building’s address to search for later.”

Google acquires wearable AR display tech

I got a rude awakening a couple of years ago while working in Google’s AR group: the kind of displays that could fit into “glasses that look like glasses” (i.e. not Glass-style unicorn protuberances) had really tiny fields of view, crummy resolution, short battery life, and more. I knew that my efforts to enable cloud-raytraced Volvos & Stormtroopers & whatnot wouldn’t last long in a world that prioritized Asteroids-quality vector graphics on a display the size of a 3″x5″ index card held at arm’s length.

Having been out of that world for a year+ now, I have no inside info on how Google’s hardware efforts have been evolving, but I’m glad to see that they’re making a serious (billion-dollar+) investment in buying more compelling display tech. Per The Verge,

According to Raxium’s website, a Super AMOLED screen on your phone has a pixel pitch (the distance between the center of one pixel, and the center of another pixel next to it) of about 50 microns, while its MicroLED could manage around 3.5 microns. It also boasts of “unprecedented efficiency” that’s more than five times better than any world record.
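
For a sense of scale (my arithmetic from the quoted figures, not Raxium’s): going from a ~50-micron pitch to ~3.5 microns is about 14× finer in each dimension, or roughly 200× more pixels packed into the same area.

```python
# Pixel-density comparison implied by the quoted pixel pitches (my arithmetic
# based on the article's numbers, not Raxium's own figures).
amoled_pitch_um = 50.0    # typical Super AMOLED pitch, per the quote
microled_pitch_um = 3.5   # Raxium MicroLED pitch, per the quote

linear_ratio = amoled_pitch_um / microled_pitch_um   # ~14x finer per dimension
area_ratio = linear_ratio ** 2                       # ~200x more pixels per unit area

def pixels_per_mm2(pitch_um):
    """Pixels packed into one square millimeter at a given pitch."""
    return (1000.0 / pitch_um) ** 2

print(f"{linear_ratio:.1f}x finer pitch, ~{area_ratio:.0f}x the pixel density")
print(f"AMOLED: {pixels_per_mm2(amoled_pitch_um):,.0f} px/mm², "
      f"MicroLED: {pixels_per_mm2(microled_pitch_um):,.0f} px/mm²")
```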

How does any of this compare to what we’ll see out of Apple, Meta, Snap, etc.? I have no idea, but at least parts of the future promise to be fun.

New courses: Machine learning for typography & art

These Cooper Union courses sound fun:

Each week we’ll cover a different aspect of machine learning. A short lecture covering theories and practices will be followed by demos using open source web tools and a web-browser tool called Google Colab. The last 3 weeks of class you’ll be given the chance to create your own project using the skills you’ve learned. Topics will include selecting the right model for your use case, gathering and manipulating datasets, and connecting your models to data sources such as audio, text, or numerical data. We’ll also talk a little ethics, because we can’t teach machine learning without a little ethics.

AI: Sam Harris talks with Eric Schmidt

I really enjoyed this conversation—touching, as it does, on my latest fascination (AI-generated art via DALL•E) and myriad other topics. In fact, I plan to listen to it again—hopefully this time near a surface on which to jot down & share some of the most resonant observations. Meanwhile, I think you’ll find it thoughtful & stimulating.

In this episode of the podcast, Sam Harris speaks with Eric Schmidt about the ways artificial intelligence is shifting the foundations of human knowledge and posing questions of existential risk.

A charming Route 66 doodle from Google

Last year I took my then-11yo son Henry (aka my astromech droid) on a 2000-mile “Miodyssey” down Route 66 in my dad’s vintage Miata. It was a great way to see the country (see more pics & posts than you might ever want), and despite the tight quarters we managed not to kill one another—or to get slain by Anton Chigurh in an especially murdery Texas town (but that’s another story!).

In any event, we were especially charmed to see the Goog celebrate the Mother Road in this doodle:

Outdoor Photographer reviews Adobe Super Resolution

Great to see Adobe AI getting some love:

Adobe Super Resolution technology is the best solution I’ve yet found for increasing the resolution of digital images. It doubles the linear resolution of your file, quadrupling the total pixel count while preserving fine detail. Super Resolution is available in both Adobe Camera Raw (ACR) and Lightroom and is accessed via the Enhance command. And because it’s built-in, it’s free for subscribers to the Creative Cloud Photography Plan.

Check out the whole article for details.

Snapchat rolls out Landmarker creation tools; Disney deploys them

Despite Pokemon Go’s continuing (and to me, slightly baffling) success, I’ve long been much more bullish on Snap than Niantic for location-based AR. That’s in part because of their very cool world lens tech, which they’ve been rolling out more widely. Now they’re opening up the creation flow:

“In 2019, we started with templates of 30 beloved sites around the world which creators could build upon called Landmarkers… Today, we’re launching Custom Landmarkers in Lens Studio, letting creators anchor Lenses to local places they care about to tell richer stories about their communities through AR.”

Interesting stats:

At its Lens Fest event, the company announced that 250,000 lens creators from more than 200 countries have made 2.5 million lenses that have been viewed more than 3.5 trillion times. Meanwhile, on Snapchat’s TikTok clone Spotlight, the app awarded 12,000 creators a total of $250 million for their posts. The company says that more than 65% of Spotlight submissions use one of Snapchat’s creative tools or lenses.

On a related note, Disney is now using the same core tech to enable group AR annotation of the Cinderella Castle. Seems a touch elaborate:

  • Park photographer takes your pic
  • That pic ends up in your Disney app
  • You point that app at the castle
  • You see your pic on the castle
  • You then take a pic of your pic on the castle… #YoDawg 🙃

NASA celebrates Hubble’s 32nd birthday with a lovely photo of five clustered galaxies

Honestly, from DALL•E innovations to classic mind-blowers like this, I feel like my brain is cooking in my head. 🙃 Take ‘er away, science:

Bonus madness (see thread for details):

A free online face-swapping tool

My old boss on Photoshop, Kevin Connor, used to talk about the inexorable progression of imaging tools from the very general (e.g. the Clone Stamp) to the more specific (e.g. the Healing Brush). In the process, high-complexity, high-skill operations were rendered far more accessible—arguably to a fault. (I used to joke that, believe it or not, drop shadows were cool before Photoshop made them easy. ¯\_(ツ)_/¯)

I think of that observation when seeing things like the Face Swap tool from Icons8. What once took considerable time & talent in an app like Photoshop is now trivially fast (and free!) to do. “Days of Miracles & Wonder,” though we hardly even wonder now. (How long will it take DALL•E to go from blown minds to shrugged shoulders? But that’s a subject for another day.)

Substance for Unreal Engine 5

I’m no 3D artist (had I but world enough and time…), but I sure love 3D artists’ work & anything that makes it faster and easier. Perhaps my most obscure point of pride from my Photoshop years is that we added per-layer timestamps to PSD files, so that Pixar could render content more efficiently by noticing which layers had actually been modified.

Anyway, now that Adobe has made a much bigger bet on 3D tooling, it’s great to see new Substance 3D support coming to Unreal Engine:

The Substance 3D plugin (BETA) enables the use of Substance materials directly in Unreal Engine 5 and Unreal Engine 4. Whether you are working on games or visualization, or deploying across mobile, desktop, or XR, Substance delivers a unique experience with optimized features for enhanced productivity.

Work faster, be more productive: Substance parameters allow for real-time material changes and texture updates.

Substance 3D for Unreal Engine 5 contains the plugin for Substance Engine.

Access over 1000 high-quality tweakable and export-ready 4K materials with presets on the Substance 3D Asset library. You can explore community-contributed assets in the community assets library.

The Substance Assets platform is a vast library containing high-quality PBR-ready Substance materials and is accessible directly in Unreal through the Substance plugin. These customizable Substance files can easily be adapted to a wide range of projects.

Frame.io is now available in Premiere & AE

To quote this really cool Adobe video PM who also lives in my house 😌, and who just happens to have helped bring Frame.io into Adobe,

Super excited to announce that Frame.io is now included with your Creative Cloud subscription. Frame panels are now included in After Effects and Premiere Pro. Check it out!

From the integration FAQ:

Frame.io for Creative Cloud includes real-time review and approval tools with commenting and frame-accurate annotations, accelerated file transfers for fast uploading and downloading of media, 100GB of dedicated Frame.io cloud storage, the ability to work on up to 5 different projects with another user, free sharing with an unlimited number of reviewers, and Camera to Cloud.


Behind Peacemaker’s joyously bizarre title sequence

I generally enjoyed HBO Max’s Peacemaker series, albeit with the caveat I gave the kids: even I found the profanity excessive, and “too much salt spoils the soup.” I especially enjoyed the whacked-out intro music & choreography:

Here the creators give a peek into how it was made:

And here a dance troupe in Bangladesh puts their spin on it:

https://twitter.com/JamesGunn/status/1507405773252575234

DALL•E 2 looks too amazing to be true

There’s no way this is real, is there?! I think it must use NFW technology (No F’ing Way), augmented with a side of LOL WTAF. 😛

Here’s an NYT video showing the system in action:

The NYT article offers a concise, approachable description of how the approach works:

A neural network learns skills by analyzing large amounts of data. By pinpointing patterns in thousands of avocado photos, for example, it can learn to recognize an avocado. DALL-E looks for patterns as it analyzes millions of digital images as well as text captions that describe what each image depicts. In this way, it learns to recognize the links between the images and the words.

When someone describes an image for DALL-E, it generates a set of key features that this image might include. One feature might be the line at the edge of a trumpet. Another might be the curve at the top of a teddy bear’s ear.

Then, a second neural network, called a diffusion model, creates the image and generates the pixels needed to realize these features. The latest version of DALL-E, unveiled on Wednesday with a new research paper describing the system, generates high-resolution images that in many cases look like photos.

Though DALL-E often fails to understand what someone has described and sometimes mangles the image it produces, OpenAI continues to improve the technology. Researchers can often refine the skills of a neural network by feeding it even larger amounts of data.
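
If it helps to see that two-stage structure spelled out, here’s a deliberately toy sketch of the flow the article describes: one model turns the prompt into a set of target image features, and a second “diffusion” model starts from noise and nudges its output toward those features step by step. Everything below is an illustrative stand-in, not OpenAI’s actual models or API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 stand-in: map a text prompt to a vector of "image features".
# (A real system uses a learned text encoder and prior; this is just a toy.)
def text_to_image_features(prompt, dim=64):
    seed = sum(ord(c) for c in prompt) % (2**32)   # deterministic toy "encoder"
    return np.random.default_rng(seed).normal(size=dim)

# Stage 2 stand-in: an iterative "diffusion-style" loop that starts from pure
# noise and moves a little closer to the target features at every step.
def generate(features, steps=50):
    sample = rng.normal(size=features.shape)       # start from noise
    for _ in range(steps):
        error = sample - features                  # toy "denoiser": error vs. target
        sample = sample - (1.0 / steps) * error    # small refinement each step
    return sample

features = text_to_image_features("an astronaut riding a horse")
sample = generate(features)
print("distance to target features:", round(float(np.linalg.norm(sample - features)), 3))
```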

I can’t wait to try it out.

MyStyle promises smarter facial editing based on knowing you well

A big part of my rationale in going to Google eight (!) years ago was that a lot of creativity & expressivity hinge on having broad, even mind-of-God knowledge of one’s world (everywhere you’ve been, who’s most important to you, etc.). Given access to one’s whole photo corpus, a robot assistant could thus do amazing things on one’s behalf.

In that vein, MyStyle proposes to do smarter face editing (adjusting expressions, filling in gaps, upscaling) by being trained on 100+ images of an individual face. Check it out:
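
For the general flavor of that personalization idea, here’s a rough sketch (a toy stand-in of my own, not the MyStyle authors’ code): start from a pretrained face generator and fine-tune it, along with per-photo latent codes, on a small set of one person’s photos, so that its latent space becomes a personalized prior for later edits.

```python
import torch
from torch import nn

# Toy stand-in for the personalization idea (not the paper's actual method or
# models): fine-tune a "generator" plus per-photo latent codes on ~100 photos
# of one person, then perform edits by moving within that tuned latent space.
generator = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 3 * 64 * 64))
personal_photos = torch.rand(100, 3 * 64 * 64)        # placeholder for the user's ~100 images
latents = torch.randn(100, 128, requires_grad=True)   # one latent code per photo

opt = torch.optim.Adam(list(generator.parameters()) + [latents], lr=1e-3)
for step in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(generator(latents), personal_photos)
    loss.backward()
    opt.step()

# After tuning, expression tweaks, inpainting, or upscaling would be done by
# optimizing/editing codes in this personalized space rather than a generic one.
```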

Adobe is acquiring BRIO XR

Exciting news!

Once the deal closes, BRIO XR will be joining an unparalleled community of engineers and product experts at Adobe – visionaries who are pushing the boundaries of what’s possible in 3D and immersive creation. Our BRIO XR team will contribute to Adobe’s Creative Cloud 3D authoring and experience design teams. Simply put, Adobe is the place to be, and in fact, it’s a place I’ve long set my sights on joining.  

AI: Live panel discussion Wednesday about machine learning & art

Two of my Adobe colleagues (Aaron Hertzmann & Ryan Murdock) are among those set to discuss these new frontiers on Wednesday:

Can machines generate art like a human would? They already are. 

Join us on March 30th, at 9AM Pacific for a live chat about what’s on the frontier of machine learning and art. Our team of panelists will break down how text prompts in machine learning models can create artwork like a human might, and what it all means for the future of artistic expression. 

Bios:

Aaron Hertzmann

Aaron Hertzmann is a Principal Scientist at Adobe, Inc., and an Affiliate Professor at University of Washington. He received a BA in Computer Science and Art & Art History from Rice University in 1996, and a PhD in Computer Science from New York University in 2001. He was a professor at the University of Toronto for 10 years, and has worked at Pixar Animation Studios and Microsoft Research. He has published over 100 papers in computer graphics, computer vision, machine learning, robotics, human-computer interaction, perception, and art. He is an ACM Fellow and an IEEE Fellow.

Ryan Murdock

Ryan is a Machine Learning Engineer/Researcher at Adobe with a focus on multimodal image editing. He has been creating generative art using machine learning for years, but is most known for his recent work with CLIP for text-to-image systems. With a Bachelor’s in Psychology from the University of Utah, he is largely self-taught.

Fantastic Shadow Beasts (and Where To Find Them)

I’m not sure who captured this image (conservationist Beverly Joubert, maybe?), or whether it’s indeed the National Geographic Picture of The Year, but it’s stunning no matter what. Take a close look:

Elsewhere I love this compilation of work from “Shadowologist & filmmaker” Vincent Bal:

[Instagram embed: a post shared by WELCOME TO THE UNIVERSE OF ART (@artistsuniversum)]

Photoshop adds full support for WebP

Huzzah! Here’s the scoop:

Full support for WebP

We are excited to announce that Photoshop now has full support for the WebP file format! WebP files can now be opened, created, edited, and saved in Photoshop without the need for a plug-in or preference setting.

To open a WebP file, simply select and open the file in the same manner as you would any other supported file or document. In addition to open capabilities, you can now create, edit, and save WebP files. Once you are done editing your document, open Save As or Save a Copy and select WebP from the options provided in the file format drop-down menu to save your WebP file.

To learn more, check out Work with WebP files in Photoshop.

Adobe demos new screen-to-AR shopping tech

Cool idea:

[Adobe] announced a tool that allows consumers to point their phone at a product image on an ecommerce site—and then see the item rendered three-dimensionally in their living space. Adobe says the true-to-life size precision—and the ability to pull multiple products into the same view—set its AR service apart from others on the market. […]

Adobe Unveils New Augmented Reality Shopping Tool Prototype | Adobe AR-2022

Chang Xiao, the Adobe research scientist who created the tool, said many of the AR services currently on the market provide only rough estimations of the size of the product. Adobe is able to encode dimensions information in its invisible marker code embedded in the photos, which its computer vision algorithms can translate into more precisely sized projections.