Category Archives: AR/VR

Fun little AI->3D->AR experiments with Vision Pro

I love watching people connect the emerging creative dots, right in front of our eyes:

Firefly image creation & Lightroom come to Apple Vision Pro

Not having a spare $3500 burning a hole in my pocket, I’ve yet to take this for a spin myself, but I’m happy to see it. Per the Verge:

The interface of the Firefly visionOS app should be familiar to anyone who’s already used the web-based version of the tool — users just need to enter a text description within the prompt box at the bottom and hit “generate.” This will then spit out four different images that can be dragged out of the main app window and placed around the home like virtual posters or prints. […]

Meanwhile, we also now have a better look at the native Adobe Lightroom photo editing app that was mentioned back when the Apple Vision Pro was announced last June. The visionOS Lightroom experience is similar to that of the iPad version, with a cleaner, simplified interface that should be easier to navigate with hand gestures than the more feature-laden desktop software.

Some great Firefly reels

Hey, remember when we launched Adobe Firefly what feels like 63 years ago? 😅 OMG, what a week. I am so tired & busy trying to get folks access (thanks for your patience!), answer questions, and more that I’ve barely had time to catch up on all the great content folks are making. I’ll work on that soon, and in the meantime, here are three quick clips that caught my eye.

First, OG author Deke McClelland shows off type effects:

Next, Kyle Nutt does some light painting, compositing himself into Firefly images:

And here Don Allen Stevenson puts Firefly creations into augmented reality with the help of Adobe Aero:

Stable Diffusion meets WebAR

Back at the start of my DALL•E journey, I wished aloud for a diffusion-powered mobile app:

Now, thanks to the openness of Stable Diffusion & WebAR, creators are bringing that vision closer to reality:

I can’t wait to see what’s next!

Snapchat: Even simple AR is effective AR

A quarter billion people engage with AR content every day, the company says.

And interestingly, one need not create a complex lens in order to have it pay off:

“The research found that simple AR can be just as performant as a sophisticated, custom Lens in driving both upper and lower-funnel metrics like brand awareness and purchase intent. Brands with the resources to execute a more sophisticated Lens will see additional benefits in mid-funnel brand metrics, including favorability and consideration.”

Google & NASA bring 3D to search

Great to see my old teammates (with whom I was working to enable cloud-rendered as well as locally rendered 3D experiences) continuing their work.

NASA and Google Arts & Culture have partnered to bring more than 60 3D models of planets, moons and NASA spacecraft to Google Search. When you use Google Search to learn about these topics, just click on the View in 3D button to understand the different elements of what you’re looking at even better. These 3D annotations will also be available for cells, biological concepts (like skeletal systems), and other educational models on Search.

“Hyperlapse vs. AI,” + AR fashion

Malick Lombion & friends used “more than 1,200 AI-generated art pieces combined with around 1,400 photographs” to create this trippy tour:

Elsewhere, After Effects ninja Paul Trillo is back at it with some amazing video-meets-DALL•E-inpainting work:

I’m eager to see all the ways people might combine generation & fashion—e.g. pre-rendering fabric for this kind of use in AR:

A really amazing spin on AR furniture shopping

After seeing years & years of AR demos featuring the placement of furniture, I once heard someone say in exasperation, “Bro… how much furniture do you think I buy?”

Happily, here’s a decidedly fresh approach: surrounding the user & some real-world furniture with a projection of the person’s 3D-scanned home. Wild!

Now, how easy can 3D home scanning be made—and how much do people care about this kind of scenario? I don’t know, but I love what the tech can enable already.

Snap Research promises 3D creation from photo collections

Hmm—this is no doubt brilliant tech, and I’d like to learn more, but I wonder about the Venn diagram between “Objects that people want in 3D,” “Objects for which a sufficiently large number of good images exist,” and “Objects for which good human-made 3D models don’t already exist.” In my experience photogrammetry is most relevant for making models from extremely specific subjects (e.g. a particular apartment) rather than from common objects that are likely to exist on Sketchfab et al. It’s entirely possible I’m missing a nuanced application here, though. As I say, cool tech!

DALL•E + Snapchat = Clothing synthesis + try-on

Though we don’t (yet?) have the ability to use 3D meshes (e.g. those generated from a photo of a person) to guide text-based synthesis through systems like DALL•E, here’s a pretty compelling example of making 2D art, then wrapping it onto a body in real time:

Niantic cancels Transformers AR game, lays off scores of people

I’ve long been bewildered & bearish regarding Niantic, and about location-based AR games in general, even when they’re paired with AAA franchises (RIP Minecraft Earth). Now pile Transformers onto the dead-wagon:

Niantic has been unable to replicate that success [of Pokemon Go]. In 2019 it launched Harry Potter: Wizards Unite, which failed to find an audience and shut down earlier this year. Games based on the board game Catan and the Nintendo series Pikmin were also unsuccessful.

Ugh. Do people want experiences like this? Somehow they’ve continued to pay a billion+ dollars per year for Pokemon Go (!!), which seemingly hasn’t changed in its nearly six years of life—but so far it’s the exception that proves the rule.

But who knows: maybe AR wearables will change the game—and in the meantime Niantic & the NBA have just announced NBA All-World, which will “place NBA fans into the real-world metaverse.”


Mobile DALL•E = My kind of location-based AR

I’ve long considered augmented reality apps to be “realtime Photoshop”—or perhaps more precisely, “realtime After Effects.” I think that’s true & wonderful, but most consumer AR tends to be ultra-confined filters that produce ~1 outcome well.

Walking around San Francisco today, I was struck that DALL•E & other emerging generative-art tools could—if made available via a simple mobile UI—offer a new kind of (almost) realtime Photoshop, with radically greater creative flexibility.

Here I captured a nearby sculpture, dropped out the background in Photoshop, uploaded it to DALL•E, and requested “a low-polygon metallic tree surrounded by big dancing robots and small dancing robots.” I like the results!

Bird scooters use Google AR to curb illegal parking

Building on yesterday’s post about Google’s new Geospatial API, developers can now embed a live view featuring a camera feed + augmentations, and developers like Bird are wasting no time in putting it to use. TNW writes,

When parking a scooter, the app prompts a rider to quickly scan the QR code on the vehicle and its surrounding area using their smartphone camera… [T]his results in precise, centimeter-level geolocation that enables the system to detect and prevent improper parking with extreme accuracy — all while helping monitor user behavior.

TechCrunch adds,

Out of the over 200 cities that Lime serves, its VPS is live now in six: London, Paris, Tel Aviv, Madrid, San Diego and Bordeaux. Similar to Bird, Lime’s pilot involves testing the tech with a portion of riders. The company said results from its pilots have been promising, with those who used the new tool seeing a 26% decrease in parking errors compared to riders who didn’t have the tool enabled. 
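For a rough sense of what “detect and prevent improper parking” could look like once you have a precise position, here’s a minimal sketch. It is not Bird’s or Lime’s actual logic: it assumes the localized scooter position has already been converted to metres in a local ground frame, and the zone shape, `point_in_polygon`, and `parking_allowed` are all hypothetical names of my own.

```python
from typing import List, Tuple

# Assumption: positions are (x, y) in metres in a local ground frame,
# already derived from the visual-positioning result.
Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a ray cast in +x from p crosses."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def parking_allowed(position: Point, zones: List[List[Point]]) -> bool:
    """True if the position falls inside at least one permitted zone."""
    return any(point_in_polygon(position, zone) for zone in zones)


# Hypothetical 5 m x 2 m parking bay.
bay = [(0.0, 0.0), (5.0, 0.0), (5.0, 2.0), (0.0, 2.0)]
print(parking_allowed((2.5, 1.0), [bay]))  # True: middle of the bay
print(parking_allowed((2.5, 2.3), [bay]))  # False: 30 cm outside the bay
```

The second call is the interesting one: a 30 cm miss only matters if the position estimate is good to centimeters, which is exactly what plain GPS can’t promise.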

A deep dive on Google’s new Geospatial API

My friend Bilawal & I collaborated on AR at Google, including our efforts to build a super compact 3D engine for driving spatial annotation & navigation. We’d often talk excitedly about location-based AR experiences, especially the Landmarker functionality arriving in Snapchat. All the while he’s been busy pushing the limits of photogrammetry (including putting me in space!) to scan 3D objects.

Now I’m delighted to see him & his team unveiling the Geospatial API (see blog post, docs, and code), which enables cross-platform (iOS, Android) deployment of experiences that present both close-up & far-off augmentations. Here’s the 1-minute sizzle reel:

For a closer look, check out this interesting deep dive into what it offers & how it works:

ApARtment hunting

Among the Google teams working on augmented reality, there was a low-key religious war about the importance of “metric scale” (i.e. matching real-world proportions 1:1). The ARCore team believed it was essential (no surprise, given their particular tech stack), while my team (Research) believed that simply placing things in the world with a best guess as to size, then letting users adjust an object if needed, was often the better path.
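To make the two positions concrete, here’s a small sketch using a simple pinhole-camera model. Everything in it is an assumption for illustration (the function names, the “fill half the screen” heuristic, the camera numbers); it isn’t how either team’s code worked.

```python
def projected_height_px(height_m: float, distance_m: float, focal_px: float) -> float:
    """Pinhole model: on-screen height of an object of height_m seen at distance_m."""
    return focal_px * height_m / distance_m


def metric_scale(real_height_m: float, model_height_units: float) -> float:
    """Scale factor that renders the model at its true real-world size (1:1)."""
    return real_height_m / model_height_units


def best_guess_scale(model_height_units: float, distance_m: float, focal_px: float,
                     image_height_px: float, screen_fraction: float = 0.5) -> float:
    """Hypothetical heuristic: scale the model so it fills screen_fraction of the
    image height at the placement distance; the user can adjust afterwards."""
    target_px = screen_fraction * image_height_px
    target_height_m = target_px * distance_m / focal_px
    return target_height_m / model_height_units


# Assumed numbers: a unit-height model of a 30 m building, placed 2 m from a
# phone camera with a 1500 px focal length and a 1920 px tall image.
m = metric_scale(30.0, 1.0)
g = best_guess_scale(1.0, 2.0, 1500.0, 1920.0)
print(projected_height_px(m, 2.0, 1500.0))  # tens of thousands of px: way off-screen
print(projected_height_px(g, 2.0, 1500.0))  # about half the image height
```

Under these made-up numbers, the metric-scale building is unviewable from two meters away, while the best-guess version lands at a comfortable size (and is physically “wrong”). Which failure you’d rather have depends on the use case.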

I thought of this upon seeing StreetEasy’s new AR tech for apartment-hunting in NYC. At the moment it lets you scan a building to see its inventory. That’s very cool, but my mind jumped to the idea of seeing 3D representations of actual apartments (something the company already offers, albeit not in AR), and I’m amused to think of my old Manhattan place represented in AR: drawing it as a tiny box at one’s feet would be metric scale. 😅 My God that place sucked. Anyway, we’ll see how useful this tech proves & where it can go from here.

“A StreetEasy Instagram poll found that 95% of people have walked past an apartment building and wondered if it has an available unit that meets their criteria. At the same time, 77% have had trouble identifying a building’s address to search for later.”

Google acquires wearable AR display tech

I got a rude awakening a couple of years ago while working in Google’s AR group: the kind of displays that could fit into “glasses that look like glasses” (i.e. not Glass-style unicorn protuberances) had really tiny fields of view, crummy resolution, short battery life, and more. I knew that my efforts to enable cloud-raytraced Volvos & Stormtroopers & whatnot wouldn’t last long in a world that prioritized Asteroids-quality vector graphics on a display the size of a 3″x5″ index card held at arm’s length.
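To put a number on that index-card comparison, here’s a quick back-of-the-envelope calculation. The 25-inch arm’s length is my assumption (yours may vary), and `angular_size_deg` is just a helper name I made up.

```python
import math


def angular_size_deg(extent: float, distance: float) -> float:
    """Angle subtended by a flat object of the given extent, viewed head-on
    from the given distance (same units for both)."""
    return math.degrees(2 * math.atan(extent / (2 * distance)))


ARM_LENGTH_IN = 25.0  # assumption: a typical adult arm's length in inches

horizontal = angular_size_deg(5.0, ARM_LENGTH_IN)
vertical = angular_size_deg(3.0, ARM_LENGTH_IN)
print(f"{horizontal:.1f} x {vertical:.1f} degrees")
```

That works out to roughly 11° by 7°, a sliver of what your eyes actually take in.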

Having been out of that world for a year+ now, I have no inside info on how Google’s hardware efforts have been evolving, but I’m glad to see that they’re making a serious (billion-dollar+) investment in buying more compelling display tech. Per The Verge,

According to Raxium’s website, a Super AMOLED screen on your phone has a pixel pitch (the distance between the center of one pixel, and the center of another pixel next to it) of about 50 microns, while its MicroLED could manage around 3.5 microns. It also boasts of “unprecedented efficiency” that’s more than five times better than any world record.
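Pixel pitch is easier to appreciate as pixel density. Here’s the conversion for the two figures quoted above; the only inputs are those two pitches and the fact that an inch is 25,400 microns. The function name is mine.

```python
MICRONS_PER_INCH = 25_400.0


def pixels_per_inch(pitch_um: float) -> float:
    """Linear pixel density implied by a center-to-center pixel pitch in microns."""
    return MICRONS_PER_INCH / pitch_um


amoled_ppi = pixels_per_inch(50.0)    # quoted Super AMOLED pitch
microled_ppi = pixels_per_inch(3.5)   # quoted Raxium MicroLED pitch

linear_ratio = microled_ppi / amoled_ppi
areal_ratio = linear_ratio ** 2       # pixels per unit area scale with the square

print(f"{amoled_ppi:.0f} ppi vs {microled_ppi:.0f} ppi")
print(f"{linear_ratio:.1f}x per side, {areal_ratio:.0f}x per unit area")
```

So the quoted pitches imply about 508 ppi versus about 7,257 ppi: roughly 14x denser along each axis, or about 200x more pixels in the same area.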

How does any of this compare to what we’ll see out of Apple, Meta, Snap, etc.? I have no idea, but at least parts of the future promise to be fun.

Snapchat rolls out Landmarker creation tools; Disney deploys them

Despite Pokemon Go’s continuing (and to me, slightly baffling) success, I’ve long been much more bullish on Snap than Niantic for location-based AR. That’s in part because of their very cool world lens tech, which they’ve been rolling out more widely. Now they’re opening up the creation flow:

“In 2019, we started with templates of 30 beloved sites around the world which creators could build upon called Landmarkers… Today, we’re launching Custom Landmarkers in Lens Studio, letting creators anchor Lenses to local places they care about to tell richer stories about their communities through AR.”

Interesting stats:

At its Lens Fest event, the company announced that 250,000 lens creators from more than 200 countries have made 2.5 million lenses that have been viewed more than 3.5 trillion times. Meanwhile, on Snapchat’s TikTok clone Spotlight, the app awarded 12,000 creators a total of $250 million for their posts. The company says that more than 65% of Spotlight submissions use one of Snapchat’s creative tools or lenses.

On a related note, Disney is now using the same core tech to enable group AR annotation of the Cinderella Castle. Seems a touch elaborate:

  • Park photographer takes your pic
  • That pic ends up in your Disney app
  • You point that app at the castle
  • You see your pic on the castle
  • You then take a pic of your pic on the castle… #YoDawg 🙃

Adobe is acquiring BRIO XR

Exciting news!

Once the deal closes, BRIO XR will be joining an unparalleled community of engineers and product experts at Adobe – visionaries who are pushing the boundaries of what’s possible in 3D and immersive creation. Our BRIO XR team will contribute to Adobe’s Creative Cloud 3D authoring and experience design teams. Simply put, Adobe is the place to be, and in fact, it’s a place I’ve long set my sights on joining.  

Adobe demos new screen-to-AR shopping tech

Cool idea:

[Adobe] announced a tool that allows consumers to point their phone at a product image on an ecommerce site—and then see the item rendered three-dimensionally in their living space. Adobe says the true-to-life size precision—and the ability to pull multiple products into the same view—set its AR service apart from others on the market. […]

Chang Xiao, the Adobe research scientist who created the tool, said many of the AR services currently on the market provide only rough estimations of the size of the product. Adobe is able to encode dimensions information in its invisible marker code embedded in the photos, which its computer vision algorithms can translate into more precisely sized projections.

Snapchat introduces new AR lenses to celebrate the lunar new year

As I’ve noted previously, I’m (oddly?) much more bullish on Snap than on Niantic to figure out location-based augmentation of the world. That’s in part because of their very cool world lens tech, which can pair specific experiences with specific spots. It’s cool to see it rolling out more widely:

The first Lens is a new AR experience that takes users through the story of Asian-American businesswoman Lucy Yu, the owner of ‘Yu & Me Books’ in NYC, which is an independent bookshop that’s dedicated to showcasing stories from underrepresented authors.

And for one that’s more widely accessible,

Snap’s also added a new Year of the Tiger Lens, which uses Sky Segmentation technology to add an animated watercolor tiger jumping through the clouds.

Pinterest adds AR shopping

“I’m like, ‘Bro, how much furniture do you think I buy??’”

I forget who said this while I was working on AR at Google, but it’s always made me laugh, because nearly every demo inevitably gets into the territory of, “Don’t you wish you could see whether this sofa fits in your space?”

Still, though, it’s a useful capability—especially if one can offer a large enough corpus of 3D models (something we found challenging, at least a few years back). Now, per the Verge:

Pinterest is adding a “Try On for Home Decor” feature to its app, letting you see furniture from stores like Crate & Barrel, CB2, Walmart, West Elm, and Wayfair in your house… According to the company’s announcement post, you’ll be able to use its Lens camera to try out over 80,000 pieces of furniture from “shoppable Pins.”

As the article notes,

Of course, this isn’t a new idea — Ikea’s app let you drop virtual furniture into your house in 2013, and in the almost decade since, companies like Target, Amazon, Shopify, and even Etsy have introduced ways to let you see how certain products will work in your house. Other companies, like Walmart, have gone even further, imagining (and trademarking ideas for) entire virtual shopping experiences.

To me the progress here is access & ubiquity, making it commonly possible for shoppers to try these experiences. I’m glad to see it.

Notre-Dame goes VR

(No, not that Notre Dame—the cathedral undergoing restoration.) This VR tour looks compelling:

Equipped with an immersive device (VR headset and backpack), visitors will be able to move freely in a 500 sqm space in Virtual Reality. Guided by a “Compagnon du Devoir” they will travel through different centuries and will explore several eras of Notre Dame de Paris and its environment, recreated in 3D.

Thanks to scientific surveys, and precise historical data, the cathedral and its surroundings have been precisely reproduced to enhance the visitor’s immersion and engagement in the experience.

Check out the short trailer below:

Niantic introduces the Lightship AR dev platform

Hmm—I want to get excited here, but as I’ve previously detailed, I’m finding it tough.

Pokemon Go remains the one-hit wonder of the location-based content/gaming space. In the 5+ years since its launch, Niantic has launched & killed Harry Potter: Wizards Unite; Microsoft has done the same with Minecraft Earth; and Google has (AFAIK) followed suit with their location-based gaming API. Given all that, I’m not sure that we’ll turn a corner until real AR glasses arrive.

Still & all, here it is:

The Niantic Lightship Augmented Reality Developer Kit, or ARDK, is now available for all AR developers around the world. To celebrate the launch, we’re sharing a glimpse of the earliest AR applications and demo experiences from global brand partners and developer studios from across the world.

We’re also announcing the formation of Niantic Ventures to invest in and partner with companies building the future of AR. With an initial $20 million fund, Niantic Ventures will invest in companies building applications that share our vision for the Real-World Metaverse and contribute to the global ecosystem we are building. To learn more about Niantic Ventures, go to

It’s cool that “The Multiplayer API is free for apps with fewer than 50,000 monthly active users,” and even above that number, it’s free to everyone for the first six months.

Google enables Pixel -> Snap in two taps

I was so excited to build an AR stack for Google Lens, aiming to bring realtime magic to billions of phones’ default camera. Sadly, after AR Playground went out the door three years ago & the world shrugged, Google lost interest.

At least they’re letting others like Snap grab the mic.

Dubbed “Quick Tap to Snap,” the new feature will enable users to tap the back of the device twice to open the Snapchat camera directly from the lock screen. Users will have to authenticate before sending photos or videos to a friend or their personal Stories page. 

Snapchat’s Pixel service will also include extra augmented-reality lenses and integrate some Google features, like live translation in the chat feature, according to the company.

I wish Apple would offer similar access to third-party camera apps like Halide Camera, etc. Its absence has entirely killed my use of those apps, no matter how nice they may be.

AR: How the giant Carolina Panther was made

By now you’ve probably seen this big gato bounding around:

I’ve been wondering how it was done (e.g. was it something from Snap, using the landmarker tech that’s enabled things like Game of Thrones dragons to scale the Flatiron Building?). Fortunately the Verge provides some insights:

In short, what’s going on is that an animation of the virtual panther, which was made in Unreal Engine, is being rendered within a live feed of the real world. That means camera operators have to track and follow the animations of the panther in real time as it moves around the stadium, like camera operators would with an actual living animal. To give the panther virtual objects to climb on and interact with, the stadium is also modeled virtually but is invisible.

This tech isn’t baked into an app, meaning you won’t be pointing your phone’s camera in the stadium to get another angle on the panther if you’re attending a game. The animations are intended to air live. In Sunday’s case, the video was broadcast live on the big screens at the stadium.

I look forward to the day when this post is quaint, given how frequently we’re all able to glimpse things like this via AR glasses. I give it 5 years, or maybe closer to 10—but let’s see.

Behind the scenes with Olympians & Google’s AR “Scan Van”

I swear I spent half of last summer staring at tiny 3D Naomi Osaka volleying shots on my desktop. I remain jealous of my former teammates who got to work with these athletes (and before them, folks like Donald Glover as Childish Gambino), even though doing so meant dealing with a million Covid safety protocols. Here’s a quick look at how they captured folks flexing & flying through space:

You can play with the content just by searching:

[Via Chikezie Ejiasi]

AR: Olympians come to Google search

Last summer my former teammates got all kinds of clever in working around Covid restrictions—and the constraints of physics and 3D capture—to digitize top Olympic athletes performing their signature moves. I wish they’d share the behind-the-scenes footage, as it’s legit fascinating. (Also great: seeing Donald Glover, covered in mocap ping pong balls for the making of Pixel Childish Gambino AR content, sneaking up behind my colleague like some weird-ass phantom. 😝)

Anyway, after so much delay and uncertainty, I’m happy to see those efforts now paying off in the form of 3D/AR search results. Check it out:

PixARface: Scarface goes Pixar

One, it’s insane what AR can do in realtime.
Two, this kind of creative misuse of tech is right up my alley.

Update/bonus: Nobody effs with the AR Jesus:

“Supernatural” offers home workouts in VR

Hmm—this looks slick, but I’m not sure that I want to have a big plastic box swinging around my face while I’m trying to get fit. As a commenter notes, “That’s just Beat Saber with someone saying ‘good job’ once in a while”—but a friend of mine says it’s great. ¯\_(ツ)_/¯

This vid (same poster frame but different content) shows more of the actual gameplay:

Body Movin’: Adobe Character Animator introduces body tracking (beta)

You’ll scream, you’ll cry, promises designer Dave Werner—and maybe not just due to “my questionable dance moves.”

Live-perform 2D character animation using your body. Powered by Adobe Sensei, Body Tracker automatically detects human body movement using a web cam and applies it to your character in real time to create animation. For example, you can track your arms, torso, and legs automatically. View the full release notes.

Check out the demo below & the site for full details.

Vid2Actor: Turning video of humans into posable 3D models

As I’m on a kick sharing recent work from Ira Kemelmacher-Shlizerman & team, here’s another banger:

Given an “in-the-wild” video, we train a deep network with the video frames to produce an animatable human representation.

This can be rendered from any camera view in any body pose, enabling applications such as motion re-targeting and bullet-time rendering without the need for rigged 3D meshes.

I look forward (?) to the not-so-distant day when a 3D-extracted Trevor Lawrence hucks a touchdown to Cleatus the Fox Sports Robot. Grand slam!!

Check out the Spark AR Master Class

I remain fascinated by what Snap & Facebook are doing with their respective AR platforms, putting highly programmable camera stacks into the hands of hundreds of millions of consumers & hundreds of thousands of creators. If you have thoughts on the subject & want to nerd out some time, drop me a note.

A few months back I wanted to dive into the engine that’s inside Instagram, and I came across the Spark AR masterclass put together & presented by filter creator Eddy Adams. I found it engaging & informative, if a bit fast for my aging brain 🙃. If you’re tempted to get your feet wet in this emerging space, I recommend giving it a shot.

Niantic sneaks 5G AR “Urban Legends”; what does it all mean?

“‘Augmented Reality: A Land Of Contrasts.’ In this essay, I will…”

Okay, no, not really, but let me highlight some interesting mixed signals. (It’s worth noting that these are strictly my opinions, not those of any current or past employer.)

Pokémon Go debuted almost exactly 5 years ago, and last year, even amidst a global pandemic that largely immobilized people, it generated its best revenue ever—more than a billion dollars in just the first 10 months of the year, bringing its then-total to more than $4 billion.

Having said that…

  • In the five years since its launch, what other location-based AR games (or AR games, period) have you seen really take off? Even with triple-A characters & brands, Niantic’s own Harry Potter title made a far smaller splash, and Minecraft Earth (hyped extensively at an Apple keynote event) is being shut down.
  • When I launched Pokémon Go last year (for the first time in years), I noticed that the only apparent change since launch was that AR now defaults to off. That is, Niantic apparently decided that monster-catching was easier, more fun, and/or less resource-intensive when done in isolation, with no camera overlay.
  • The gameplay remains extremely rudimentary—no use (at least that I could see) of fancy SLAM tracking, depth processing, etc., despite Niantic having acquired startups to enable just this sort of thing, showing demos three years ago.
  • Network providers & handset makers really, really want you to want 5G—but I’ve yet to see it prove to be transformative (even for the cloud-rendered streaming AR that my Google team delivered last year). Even when “real” 5G is available beyond a couple of urban areas, it’s hard to imagine a popular title being 5G-exclusive.

So does this mean I think location-based AR games are doomed? Well, no, as I claim zero prognostication-fu here. I didn’t see Pokémon Go coming, despite my roommate in Nepal (who casually mentioned that he’d helped found Google Earth—as one does) describing it ahead of launch; and given the way public interest in the app dropped after launch (see above), I’d never have guessed that it would be generating record revenue now—much less during a pandemic!

So, who knows: maybe Niantic & its numerous partners will figure out how to recapture lightning in a bottle. Here’s a taste of how they expect that to look:

If I had to bet on someone, though, it’d be Snap: they’ve been doing amazing site-specific AR for the last couple of years, and they’ve prototyped collaborative experiences built on the AR engine that hundreds of millions of people use every day; see below. Game on!

AR: Google is working on indoor walking nav

I spent my last couple of years at Google working on a 3D & AR engine that could power experiences across Maps, YouTube, Search, and other surfaces. Meanwhile my colleagues have been working on data-gathering that’ll use this system to help people navigate via augmented reality. As TechCrunch writes:

Indoor Live View is the flashiest of these. Google’s existing AR Live View walking directions currently only work outdoors, but thanks to some advances in its technology to recognize where exactly you are (even without a good GPS signal), the company is now able to bring this indoors.

This feature is already live in some malls in the U.S. in Chicago, Long Island, Los Angeles, Newark, San Francisco, San Jose and Seattle, but in the coming months, it’ll come to select airports, malls and transit stations in Tokyo and Zurich as well (just in time for vaccines to arrive and travel to — maybe — rebound). Because Google is able to locate you by comparing the images around you to its database, it can also tell which floor you are on and hence guide you to your gate at the Zurich airport, for example.

“You Look Like A Thing, And I Love You”

I really enjoyed listening to the podcast version of this funny, accessible talk from AI Weirdness writer Janelle Shane, and think you’d get a kick out of it, too.

On her blog, Janelle writes about AI and the weird, funny results it can produce. She has trained AIs to produce things like cat names, paint colors, and candy heart messages. In this talk she explains how AIs learn, fail, adapt, and reflect the best and worst of humanity.