Category Archives: AR/VR

NYT explains the Notre Dame fire in 3D

The Times staff put together this impressive piece in less than a day, using scrolling to control a glTF file rendered via Three.js (see 10s recording below). Creator Graham Roberts tweets, “This approach/style stems from our AR efforts over the past 18 months. Originally built as a fallback to camera mode, but now also a great way to make the web more dimensional towards a spatial future.”
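
If you're curious how that scroll-driven approach works in practice, here's a rough sketch of the general technique (my own toy TypeScript/Three.js example, not the Times' code; the asset path and the assumption of a single baked animation are placeholders):

```ts
// Sketch: scroll position scrubs a glTF animation rendered with Three.js.
// The asset path and the "one baked animation" assumption are placeholders.
import * as THREE from 'three';
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader';

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(45, window.innerWidth / window.innerHeight, 0.1, 1000);
camera.position.set(0, 2, 8);
scene.add(new THREE.AmbientLight(0xffffff, 1));

let mixer: THREE.AnimationMixer | undefined;
let duration = 0;

new GLTFLoader().load('models/cathedral.glb', (gltf) => {  // hypothetical asset
  scene.add(gltf.scene);
  const clip = gltf.animations[0];                         // assume one baked "tour" clip
  if (!clip) return;
  mixer = new THREE.AnimationMixer(gltf.scene);
  duration = clip.duration;
  mixer.clipAction(clip).play();
});

function render() {
  // Map scroll progress (0..1) onto the animation timeline.
  const maxScroll = document.body.scrollHeight - window.innerHeight;
  const progress = maxScroll > 0 ? window.scrollY / maxScroll : 0;
  if (mixer) mixer.setTime(progress * duration);
  renderer.render(scene, camera);
  requestAnimationFrame(render);
}
render();
```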

Tangentially related: Firefighters used a half-ton robot named Colossus to help battle the blaze:

[YouTube]

Tilt Brush is coming to the Oculus Quest

Man, at $399 & sporting this creation tool, the forthcoming standalone device is starting to seriously tempt me. As for Tilt Brush specifically, the Verge writes,

After its original release on the HTC Vive back in 2016, Tilt Brush quickly became a mainstay of headset demos. It’s easy enough to start painting basic 3D structures, but in the years since, artists have painted some pretty stunning pieces in the app. The Quest version of Tilt Brush will continue to support uploads to Poly, Google’s online 3D object library, if you want to share your work, or just gawk at what others have made.

Check out lots more details on the Oculus site.

AR turns a Jack Daniel’s label into a slick popup

I love the execution here. I’ve long wanted to automatically put people into LovePop-style 3D cards, but no one listens & I can just say these things here because lol nothing matters. 😌

VR Scout writes,

The first augmented scene converts the bottle into a model of the actual Jack Daniel’s distillery located in Lynchburg, Tennessee. The second experience takes you step-by-step through their whiskey distillation process, while the third tells the story of Jack Daniel himself.

[YouTube] [Via Asad Ullah Naweed]

Snapchat’s latest GoT filter freezes the Flatiron Building

Really damn impressive.

Snapchat users in NYC will be able to experience the Landmarker lens starting Sunday, April 14 — the premiere date for “Game of Thrones” Season 8 — and the week thereafter, to see the fantasy creature encase the Flatiron in ice… To access the “GOT” Landmarker Lens, users in the vicinity of the Flatiron Building must tap the Snapchat camera and select the “Game of Thrones” Lens.

Mill Mascot promises rich puppet animation

This thing looks quite impressive:

Mill Mascot combines real-time game engine technology with motion sensors, so characters can be puppeteered through hand and facial gestures. This gives directors, creatives and artists the ability to ‘direct’ or ‘perform’ CG character animation live and make creative decisions on the fly.

The system bypasses the lengthy timelines involved with traditional animation and pre-rendered content, reducing animation and rendering time down to 42 milliseconds per frame.
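
(For scale, 42 milliseconds per frame works out to 1000 ÷ 42 ≈ 24 frames per second, i.e. right around standard film frame rate.)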

[Vimeo]

Augmented dance: Projection + power

I’d love to see this in person:

Colossal explains,

Pixel is an innovative dance performance conceived by French performance artists Adrien Mondot and Claire Bardainne, known collectively as the Adrien M / Claire B Company, in collaboration with the hip-hop dance company Cie Kafig. The hour-long performance incorporates a host of digital projection mapping techniques, 11 dancers, and bills itself as “a work on illusion combining energy and poetry, fiction and technical achievement, hip hop and circus.”

[Vimeo]

Inside Google’s realtime face ML tech

I’m delighted that my teammates are getting to share the details of how the awesome face-tracking tech they built works:

We employ machine learning (ML) to infer approximate 3D surface geometry to enable visual effects, requiring only a single camera input without the need for a dedicated depth sensor. This approach provides the use of AR effects at realtime speeds, using TensorFlow Lite for mobile CPU inference or its new mobile GPU functionality where available. This technology is the same as what powers YouTube Stories’ new creator effects, and is also available to the broader developer community via the latest ARCore SDK release and the ML Kit Face Contour Detection API.

We’ve been hard at work ensuring that the tech works well for really demanding applications like realistic makeup try-on:

If you’re a developer, dig into the links above to see how you can use the tech—and everyone else, stay tuned for more fun, useful applications of it across Google products.

Googlers win VFX Oscars

Congrats to Paul Debevec, Xueming Yu, Wan-Chun Alex Ma, and their former colleague Timothy Hawkins for the recognition of their groundbreaking Light Stage work!

[YouTube]

Now they’re working with my extended team:

“We try to bring our knowledge and background to try to make better Google products,” Ma says. “We’re working on improving the realism of VR and AR experiences.”

I go full SNL Sue thinking about what might be possible.

Oh, and they worked on Ready Player One (nominated for Best Visual Effects this year) and won for Blade Runner 2049 last year:

Just prior to heading to Google, they worked on “Blade Runner 2049,” which took home the Oscar for Best Visual Effects last year and brought back the character Rachael from the original “Blade Runner” movie. The new Rachael was constructed with facial features from the original actress, Sean Young, and another actress, Loren Peta, to make the character appear to be the same age she was in the first film.

Check out their work in action:

[YouTube 1 & 2]

Machine learning in your browser tracks your sweet bod

A number of our partner teams have been working on both the foundation for browser-based ML & on cool models that can run there efficiently:

We are excited to announce the release of BodyPix, an open-source machine learning model which allows for person and body-part segmentation in the browser with TensorFlow.js. With default settings, it estimates and renders person and body-part segmentation at 25 fps on a 2018 15-inch MacBook Pro, and 21 fps on an iPhone X. […]

This might all make more sense if you try a live demo here.
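
If you'd rather poke at it in code than in the demo, here's a bare-bones TypeScript sketch using the @tensorflow-models/body-pix package (method names are per the package docs as I recall them; double-check the current release, since the API has shifted between versions):

```ts
// Bare-bones person segmentation in the browser with BodyPix.
// Assumes a <video id="webcam"> that is already streaming and a <canvas id="output">.
import '@tensorflow/tfjs';
import * as bodyPix from '@tensorflow-models/body-pix';

async function main() {
  const video = document.getElementById('webcam') as HTMLVideoElement;
  const canvas = document.getElementById('output') as HTMLCanvasElement;
  const net = await bodyPix.load();                        // downloads the model weights once

  async function frame() {
    const segmentation = await net.segmentPerson(video);   // per-pixel person mask
    const mask = bodyPix.toMask(segmentation);
    // Composite: person left as-is, background darkened; the 3 px blur softens the mask edge.
    bodyPix.drawMask(canvas, video, mask, 0.7, 3);
    requestAnimationFrame(frame);
  }
  frame();
}
main();
```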

Check out this post for more details.

It’s Friday: Let’s melt some faces!

I’m so pleased to say that my team’s face-tracking tech (which you may have seen powering AR effects in YouTube Stories and elsewhere) is now available for developers to build upon:

ARCore’s new Augmented Faces API (available on the front-facing camera) offers a high quality, 468-point 3D mesh that lets users attach fun effects to their faces. From animated masks, glasses, and virtual hats to skin retouching, the mesh provides coordinates and region specific anchors that make it possible to add these delightful effects.
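
The Augmented Faces API itself is an Android/Unity SDK, so rather than misquote it, here's a browser-side taste of working with a comparable 468-point mesh via the @tensorflow-models/facemesh package (a TypeScript sketch of mine, with names per that package's docs as I recall them):

```ts
// Not the ARCore Augmented Faces API (that's an Android/Unity SDK); this uses the
// @tensorflow-models/facemesh package, which exposes a comparable 468-point mesh in the browser.
import '@tensorflow/tfjs';
import * as facemesh from '@tensorflow-models/facemesh';

async function main() {
  const video = document.getElementById('selfie') as HTMLVideoElement;
  const model = await facemesh.load();

  async function frame() {
    const faces = await model.estimateFaces(video);
    for (const face of faces) {
      // scaledMesh is an array of 468 [x, y, z] points; these are the coordinates
      // you'd anchor glasses, hats, or makeup effects to.
      const mesh = face.scaledMesh as Array<[number, number, number]>;
      console.log('landmark 1 (near the nose tip):', mesh[1]);  // indices are model-specific
    }
    requestAnimationFrame(frame);
  }
  frame();
}
main();
```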


“Why do you keep looking at King Midas’s wife?” my son Finn asked as I was making this GIF the other day. :-p

Check out details & grab the SDKs:

We can’t wait to see what folks build with this tech, and we’ll share more details soon!

AR walking nav is starting to arrive in Google Maps

I’m really pleased to see that augmented reality navigation has gone into testing with Google Maps users:

On the Google AI Blog, the team gives some insights into the cool tech at work:

We’re experimenting with a way to solve this problem using a technique we call global localization, which combines Visual Positioning Service (VPS), Street View, and machine learning to more accurately identify position and orientation. […]

VPS determines the location of a device based on imagery rather than GPS signals. VPS first creates a map by taking a series of images which have a known location and analyzing them for key visual features, such as the outline of buildings or bridges, to create a large scale and fast searchable index of those visual features. To localize the device, VPS compares the features in imagery from the phone to those in the VPS index. However, the accuracy of localization through VPS is greatly affected by the quality of both the imagery and the location associated with it. And that poses another question—where does one find an extensive source of high-quality global imagery?
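
To make the matching step concrete, here's a toy sketch of mine (emphatically not Google's implementation): descriptors of visual features are indexed against the known pose of the imagery they came from, and a query descriptor from the phone is matched to its nearest indexed neighbor.

```ts
// Toy VPS-style lookup: match a query feature descriptor against an index of
// descriptors extracted from imagery with known poses. A real system matches
// many features at once and adds robust geometric verification on top.

type Descriptor = number[];                      // e.g. a 128-D feature vector
interface Pose { lat: number; lng: number; headingDeg: number }
interface IndexedFeature { descriptor: Descriptor; pose: Pose }

const featureIndex: IndexedFeature[] = [];       // built offline from imagery with known poses

function l2(a: Descriptor, b: Descriptor): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

// Return the pose attached to the indexed feature that best matches the query.
function matchPose(query: Descriptor): Pose | null {
  let best: IndexedFeature | null = null;
  let bestDist = Infinity;
  for (const f of featureIndex) {
    const d = l2(query, f.descriptor);
    if (d < bestDist) { bestDist = d; best = f; }
  }
  return best ? best.pose : null;
}
```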

Read on for the full story.

AR: Gambeezy lands on Pixel!

This is America… augmented by Childish Gambino on Pixel:

The Childish Gambino Playmoji pack features unique moves that map to three different songs: “Redbone,” “Summertime Magic,” and “This is America.” Pixel users can start playing with them today using the camera on their Pixel, Pixel XL, Pixel 2, Pixel 2 XL, Pixel 3 and Pixel 3 XL.

And with some help from my team:

He even reacts to your facial expressions in real time thanks to machine learning—try smiling or frowning in selfie mode and see how he responds.

Enjoy!

Wacom’s teaming up with Magic Leap for collaborative creation

Hmm—it’s a little hard to gauge just from a written description, but I’m excited to see new AR blood teaming up with an OG of graphics to try defining a new collaborative environment:

Wearing a Magic Leap One headset connected to a Wacom Intuos Pro pen tablet, designers can use the separate three-button Pro Pen 3D stylus to control their content on a platform called Spacebridge, which streams 3D data into a spatial computing environment. The program allows multiple people in a room to interact with the content, with the ability to view, scale, move, and sketch in the same environment.

Check out the rest of the Verge article for details. I very much look forward to seeing how this develops.

AR & AI help blind users navigate space & perceive emotions

I love assistive superpowers like this work from Caltech:

VR Scout shares numerous details:

[T]he team used the Microsoft HoloLens’s capability to create a digital mesh over a “scene” of the real-world. Using unique software called Cognitive Augmented Reality Assistant (CARA), they were able to convert information into audio messages, giving each object a “voice” that you would hear while wearing the headset. […]

If the object is at the left, the voice will come from the left side of the AR headset, while any object on the right will speak out to you from the right side of the headset. The pitch of the voice will change depending on how far you are from the object.
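
To get a feel for that mapping, here's a tiny Web Audio sketch of mine (not CARA itself, which runs on HoloLens, and with a tone standing in for the spoken object name): direction drives stereo pan, distance drives pitch.

```ts
// Toy version of the idea: play a short tone from the object's direction,
// pitched higher as the object gets closer. The numbers are arbitrary choices of mine.
const ctx = new AudioContext();

function pingObject(x: number, distanceMeters: number) {
  // x in [-1, 1]: -1 = hard left of the wearer's view, +1 = hard right.
  const osc = ctx.createOscillator();
  const panner = ctx.createStereoPanner();
  const gain = ctx.createGain();

  panner.pan.value = Math.max(-1, Math.min(1, x));               // direction -> stereo pan
  osc.frequency.value = 220 + 880 / Math.max(1, distanceMeters); // closer -> higher pitch
  gain.gain.value = 0.2;

  osc.connect(gain).connect(panner).connect(ctx.destination);
  osc.start();
  osc.stop(ctx.currentTime + 0.15);                              // short ping
}
```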

Meanwhile Huawei is using AI to help visually impaired users “hear” facial expressions:

Facing Emotions taps the Mate 20 Pro’s back cameras to scan the faces of conversation partners, identifying facial features like eyes, nose, brows, and mouth, and their positions in relation to each other. An offline, on-device machine learning algorithm interprets the detected emotions as sounds, which the app plays on the handset’s loudspeaker.

[YouTube] [Via Helen Papagiannis]

Want more friends? Build a clone army in PicsArt.

Heh—kinda fun & interesting:

This new camera feature, dubbed “Live Stickers,” allows users to produce multiple animated stickers of themselves or friends, and place them in “live” environments before sharing them with the world through social media platforms.

Is it useful? I’m not sure, but I’d welcome your thoughts. You can grab the app for iOS and take it for a spin.

[YouTube] [Via]

Walk through walls in this eye-popping 3D reconstruction & visualization

Wow—check out this amazing fly-through from Oddviz:

Orphanages are dense and harmonious living spaces, housing hundreds of children under the same roof simultaneously. The Abandoned Jewish Orphanage Building in Ortaköy (OHR-tah-keuy), Istanbul (also known as El Orfelinato) has been home to thousands of lives during its century-old history. It holds the memory of the past in worn stairs and layers of paint.

El Orfelinato means ‘The Orphanage’ in Spanish. The name has been used by the Sephardi Jewish (Jews from Spain) community in Istanbul for decades. Sephardi Jews have a 500-year history in Istanbul, since they were forced to migrate by the mass conversions and executions carried out by the Catholic Monarchs in Iberia in the 15th century.

oddviz sheds light upon the visual and spatial memory of El Orfelinato, documenting it as it is with photogrammetry and presenting it in a dollhouse view.

And if that’s up your alley, check out their similar Hotel:

[Vimeo] [Via]

What might be next for Facebook 3D photos?

Facebook’s 3D photos (generated from portrait-mode images) have quickly proven to be my favorite feature added to that platform in years. Hover or drag over this example:

My crazy three! 😝😍 #007 #HappyHalloween

Posted by John Nack on Wednesday, October 31, 2018

The academic research they’ve shared, however, promises to go farther, enabling VR-friendly panoramas with parallax. The promise is basically “Take 30 seconds to shoot a series of images, then allow another 30 seconds for processing.” The first portion might well be automated, enabling the user to simply pan slowly across a scene.

This teaser vid shows how scenes are preserved in 3D, enabling post-capture effects like submerging them in water:

Will we see this ship in FB, and if so when? Your guess is as good as mine, but I find the progress exciting.

[YouTube]

Costume-Aware Fill? Disney shows off neat AR clothing tech

Pretty cool stuff, though at the moment it seems to require using a pre-captured background:

When overlaying a digital costume onto a body using pose matching, several parts of the person’s cloth or skin remain visible due to differences in shape and proportions. In this paper, we present a practical solution to these artifacts which requires minimal costume parameterization work, and a straightforward inpainting approach.

[YouTube] [Via Steve Toh]

AR: A virtual desktop on your actual desktop?

Here’s a pretty darn clever idea for navigating among apps by treating your phone as a magic window into physical space.

You use the phone’s spatial awareness to ‘pin’ applications in a certain point in space, much like placing your notebook in one corner of your desk, and your calendar at another… You can create a literal landscape of apps that you can switch between by simply switching the location of your phone.

[Via]

New open-source Google AI experiments help people make art

One’s differing physical abilities shouldn’t stand in the way of drawing & making music. Body-tracking tech from my teammates George & Tyler (see previous) is just one of the new Web-based experiments in Creatability. Check it out:

Creatability is a set of experiments made in collaboration with creators and allies in the accessibility community. They explore how creative tools – drawing, music, and more – can be made more accessible using web and AI technology. They’re just a start. We’re sharing open-source code and tutorials for others to make their own projects.

[YouTube]

BumpTop is back… now in AR!

I wish I could find the joke video someone made when Google acquired the beloved email client Sparrow back in 2012. The vid presented itself as a cheerful tutorial on “How to prepare for the Sparrow Google migration” that basically went like this: 

  1. Open your Applications folder.
  2. Locate the Sparrow app icon.
  3. Now, carefully just drag it to the trash.
  4. Empty the trash.
  5. And voila, now you’re ready for Google’s ownership of this great app!

I’ve thought of it countless times over the years, and I think of it now recalling BumpTop, a 3D desktop GUI that I first wrote about 13 years ago (!):

Google acquired the tech & team back in 2010, and then… nothing, as far as I know. (All this predated my time at the company, so I have zero inside info.) Empty the trash, and voila!

But evidently not content to let things die, the BumpTop team has reemerged with Spatial, a new approach to team collaboration done via virtual & augmented reality. Check it out:

Will Spatial have more lasting impact than BumpTop? I have no idea! But I look forward to trying it out, and I’m sure glad that teams like this are busy trying to make the world more delightful.

[YouTube 1 & 2]

Pixel 3 Playground is here!

I’m so pleased to reveal what we’ve been working on for quite some time—the new Playground augmented reality mode on Google Pixel devices!

Playground brings you more powerful AR experiences and uses AI to recommend content for expressing yourself in the moment. You can make your photos and videos come to life with Playmoji—characters that react to each other and to you—and tell a richer story by adding fun captions or animated stickers.

Playground makes real-time suggestions to recommend content based on the scene you’re in. Are you walking your dog? Cooking in the kitchen? Gardening in the backyard? Playground uses advanced computer vision and machine learning to recommend relevant Playmoji, stickers and captions to populate the scene.

My team contributed tech that enables selfie stickers (using realtime segmentation to let characters stand behind you), reactive stickers (those that respond to humans in the frame), object tracking (so that you can attach stickers to moving elements like pets & hands), and glue that helps the pieces communicate. Happily, too, we’re just getting warmed up.

Oh, and stay tuned for Gambeezy!

[YouTube]

Absolute witchcraft: AI synthesizes dance moves, entire street scenes

This 💩 is 🍌🍌🍌, B-A-N-A-N-A-S: This Video-to-Video Synthesis tech apparently can take in one dance performance & apply it to a recording of another person to make her match the moves:

It can even semantically replace entire sections of a scene—e.g. backgrounds in a street scene: 

Now please excuse me while I lie down for a bit, as my brain is broken.

[YouTube 1 & 2] [Via Tyler Zhu]

VR creators: Apply for Google’s Creator Lab in London

Sounds pretty cool:

Today we’re announcing that VR Creator Lab is coming to London. Participants will receive between $30,000 and $40,000 USD in funding towards their VR project, attend a three day “boot camp” September 18-20, 2018, and receive three months of training from leading VR instructors and filmmakers.

Applications are open through 5pm British Summer Time on August 6, 2018. YouTube creators with a minimum of 10,000 subscribers and independent filmmakers are eligible.

[YouTube]

Match your body pose to Hollywood imagery via Kinemetagraph

Apropos of Google’s Move Mirror project (mentioned last week), here’s a similar idea:

Kinemetagraph reflects the bodily movement of the visitor in real time with a matching pose from the history of Hollywood cinema. To achieve this, it correlates live motion capture data using Kinect-based “skeleton tracking” to an open-source computer vision research dataset of 20,000 Hollywood film stills with included character pose metadata for each image.

The notable thing, I think, is that what required a dedicated hardware sensor a couple of years ago can now be done plug-in-free using just a browser and webcam. Progress!

[Via Paul Chang]

Body Movin’: Drive image search with your body movements

Unleash the dank emotes! My teammates George & Tyler (see previous) are back at it running machine learning in your browser, this time to get you off the couch with the playful Move Mirror:

Move Mirror takes the input from your camera feed and maps it to a database of more than 80,000 images to find the best match. It’s powered by Tensorflow.js—a library that runs machine learning models on-device, in your browser—which means the pose estimation happens directly in the browser, and your images are not being stored or sent to a server. For a deep dive into how we built this experiment, check out this Medium post.
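
Under the hood that's the PoseNet model plus a pose-similarity search; here's a rough TypeScript sketch of the pairing. The database is a stand-in, and the plain cosine match is a simplification of mine (the real system also normalizes for the person's position in frame, weights keypoints by confidence, and uses a fast index):

```ts
// Sketch: estimate a pose with PoseNet, then find the closest pose in a
// pre-built database via cosine similarity on normalized keypoint vectors.
import '@tensorflow/tfjs';
import * as posenet from '@tensorflow-models/posenet';

interface DbEntry { imageUrl: string; vector: number[] }   // flattened, normalized keypoints
const database: DbEntry[] = [];                             // imagine ~80,000 of these

// Flatten keypoints to [x0, y0, x1, y1, ...] and L2-normalize them.
function toVector(keypoints: Array<{ position: { x: number; y: number } }>): number[] {
  const v = keypoints.flatMap((k) => [k.position.x, k.position.y]);
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0)) || 1;
  return v.map((x) => x / norm);
}

function cosine(a: number[], b: number[]): number {
  return a.reduce((s, x, i) => s + x * b[i], 0);            // vectors are already unit length
}

async function main() {
  const video = document.getElementById('webcam') as HTMLVideoElement;
  const net = await posenet.load();

  async function frame() {
    const pose = await net.estimateSinglePose(video, { flipHorizontal: true });
    if (database.length > 0) {
      const query = toVector(pose.keypoints);
      const best = database.reduce((a, b) =>
        cosine(query, a.vector) > cosine(query, b.vector) ? a : b);
      console.log('closest match:', best.imageUrl);
    }
    requestAnimationFrame(frame);
  }
  frame();
}
main();
```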

 Enjoy!

[YouTube]

AR soccer on your table? Google & Facebook researchers make it happen

Can low-res YouTube footage be used to generate a 3D model of a ballgame—one that can then be visualized from different angles & mixed into the environment in front of you? Kinda, yeah!

Per TechCrunch,

The “Soccer On Your Tabletop” system takes as its input a video of a match and watches it carefully, tracking each player and their movements individually. The images of the players are then mapped onto 3D models “extracted from soccer video games,” and placed on a 3D representation of the field. Basically they cross FIFA 18 with real life and produce a sort of miniature hybrid.

[YouTube]

VR180 Creator, a new video tool from Google

Sounds handy for storytellers embracing new perspectives:

VR180 Creator currently offers two features for VR videos. “Convert for Publishing” takes raw fisheye footage from VR180 cameras like the Lenovo Mirage Camera and converts it into a standardized equirect projection. This can be edited with the video editing software creators already use, like Adobe Premiere and Final Cut Pro. “Prepare for Publishing” re-injects the VR180 metadata after editing so that the footage is viewable on YouTube or Google Photos in 2D or VR.
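
The tool is a black box, but the underlying remap is standard projection math. Here's a rough sketch of mine of the per-pixel mapping from an equirectangular output pixel back into an idealized equidistant ("f-theta") fisheye source; real VR180 cameras rely on calibrated lens profiles, which is exactly the fiddly part tools like this handle for you:

```ts
// Idealized fisheye -> equirectangular mapping for a single 180° equidistant lens.
// For each output equirect pixel, find where to sample in the fisheye image.

interface Point { x: number; y: number }

// (u, v): output pixel in a W x H equirect image covering 180° x 180°.
// (cx, cy, R): center and radius of the fisheye image circle.
function equirectToFisheye(
  u: number, v: number, W: number, H: number,
  cx: number, cy: number, R: number
): Point | null {
  // Longitude/latitude of this output pixel, each spanning [-90°, +90°].
  const lon = (u / W - 0.5) * Math.PI;
  const lat = (0.5 - v / H) * Math.PI;

  // Unit view direction (camera looks along +z, x right, y up).
  const dx = Math.cos(lat) * Math.sin(lon);
  const dy = Math.sin(lat);
  const dz = Math.cos(lat) * Math.cos(lon);

  // Equidistant fisheye: radius grows linearly with the angle off the optical axis.
  const theta = Math.acos(dz);
  if (theta > Math.PI / 2) return null;                 // outside the 180° field of view
  const r = (theta / (Math.PI / 2)) * R;

  // Direction around the image circle (image y grows downward).
  const phi = Math.atan2(-dy, dx);
  return { x: cx + r * Math.cos(phi), y: cy + r * Math.sin(phi) };
}
```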

You can learn more about how to use VR180 Creator here and you can download it here.

[Via]

“Bumping the Lamp”: AR storytelling insights from Roger Rabbit (for real!)

If you’re interested in making augmented reality characters feel natural in the real world, it’s well worth spending a few minutes with this tour of some key insights. I’ve heard once-skeptical Google AR artists praising it, saying, “This video is a treasure trove and every artist, designer or anyone working on front-end AR should watch it.” Enjoy, and remember to bump that lamp. 🙂 

[YouTube] [Via Jeremy Cowles]

Augmenting (?) Lego reality

Hmm—I’m intrigued by the filmmaking-for-kids possibilities here, but deeply ambivalent about introducing screen time into one of the great (and threatened) pure-imagination oases in my kids’ lives:

Up to four friends can play in the same set on four different iOS devices, and notably all of the virtual aspects of the LEGO AR app will be connected to physical LEGO sets. “We can save our entire world back into our physical set, and pick up where we left off later,” Sanders said.
