Monthly Archives: May 2019

AI: Object detection & tracking comes to ML Kit

In addition to moving augmented images (see previous), my team’s tracking tech enables object detection & tracking on iOS & Android:

The Object Detection and Tracking API identifies the prominent object in an image and then tracks it in real time. Developers can use this API to create a real-time visual search experience through integration with a product search backend such as Cloud Product Search.
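ML Kit's actual interface is a Kotlin/Swift mobile SDK, but the detect-then-track pattern the quote describes can be sketched language-agnostically. In this toy Python version, `iou` and `track` are my own illustrative helpers (boxes as (x, y, w, h)), not ML Kit calls:

```python
# A minimal, hypothetical sketch of the "detect once, then track" pattern --
# not ML Kit's real API. A detector proposes boxes each frame; the tracker
# follows the previously chosen object by best bounding-box overlap.

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def track(prev_box, candidate_boxes, threshold=0.3):
    """Pick the candidate in the new frame that best overlaps the previous
    box; return None if the object appears lost."""
    best = max(candidate_boxes, key=lambda b: iou(prev_box, b), default=None)
    if best is None or iou(prev_box, best) < threshold:
        return None
    return best

# Tracking a box that drifts slightly between frames:
box = (10, 10, 50, 50)
frame2 = [(12, 11, 50, 50), (200, 200, 30, 30)]
print(track(box, frame2))  # -> (12, 11, 50, 50)
```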

I hope you’ll build some rad stuff with it (e.g. the new Adidas app)!


Flaming Lips + Google AI + … fruit?

What happens when you use machine learning & the capacitive-sensing properties of fruit to make music? The Flaming Lips teamed up with Google to find out:

During their performance that night, Steven Drozd from The Flaming Lips, who usually plays a variety of instruments, played a “magical bowl of fruit” for the first time. He tapped each fruit in the bowl, which then played different musical tones, “singing” the fruit’s own name. With help from Magenta, the band broke into a brand-new song, “Strawberry Orange.”

The Flaming Lips also got help from the audience: At one point, they tossed giant, blow-up “fruits” into the crowd, and each fruit was also set up as a sensor, so any audience member who got their hands on one played music, too. The end result was a cacophonous, joyous moment when a crowd truly contributed to the band’s sound.



Google Glass resurrected, this time for enterprise

The original Glass will be to AR wearables as the Apple Newton was to smartphones—ambitious, groundbreaking, unfocused, premature. After that first… well, learning experience… Google didn’t give up, and folks have cranked away quietly to find product-market fit. Check out the new device—dramatically faster, more extensible, and focused on specific professionals in medicine, manufacturing, and more:



Check out “Dance Like,” a fun ML-driven app from Google

My team has been collaborating with TensorFlow Lite & researchers working on human-pose estimation (see many previous posts) to accelerate on-device machine learning & enable things like the fun “Dance Like” app on iOS & Android:


Here’s brave Aussie PM Tim Davis busting the coldest emotes [note to my sons: am I saying this right?] while demoing on stage at Google I/O (starts at 42:55, in case the embed gets wonky):



AR: Nike aims for visual foot-size estimation, jersey try-on

Hmm… am I a size 10.5 or 11 in this brand? These questions are notoriously tough to answer without trying on physical goods, and cracking the code for reliable size estimation promises to enable more online shoe buying with fewer returns.

Now Nike seems to have cracked said code. The Verge writes,

With this new AR feature, Nike says it can measure each foot individually — the size, shape, and volume — with accuracy within 2 millimeters and then suggest the specific size of Nike shoe for the style that you’re looking at. It does this by matching your measurements to the internal volume already known for each of its shoes, and the purchase data of people with similar-sized feet.
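To make that matching step concrete, here's a hypothetical sketch: the size table, internal volumes, and fit margin below are all invented for illustration, since Nike hasn't published its data or algorithm.

```python
# Hypothetical illustration of the matching step The Verge describes: given a
# measured foot volume, suggest the smallest size whose internal volume fits.
# All numbers below are made up -- Nike's real tables aren't public.

SHOE_INTERNAL_VOLUME_CM3 = {  # size -> internal volume (invented)
    9.0: 870.0,
    9.5: 895.0,
    10.0: 920.0,
    10.5: 945.0,
    11.0: 970.0,
}

def suggest_size(foot_volume_cm3, fit_margin_cm3=15.0):
    """Smallest listed size whose internal volume exceeds the measured
    foot volume by at least the fit margin; None if nothing fits."""
    for size in sorted(SHOE_INTERNAL_VOLUME_CM3):
        if SHOE_INTERNAL_VOLUME_CM3[size] >= foot_volume_cm3 + fit_margin_cm3:
            return size
    return None

print(suggest_size(915.0))  # with these invented volumes -> 10.5
```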

Seems like size estimation could be easily paired with visualization à la Wanna Kicks.


On a semi-related note, Nike has also partnered with Snapchat to enable virtual try-on of soccer jerseys:


AR: Google’s “Weird Cuts” lets you make collages in space

Weird indeed, but nifty:

TechCrunch notes,

The app consists of two modes — a cutout mode and a collage mode.

The idea is that you should walk around and collect a bunch of different materials from the world in front of your camera’s viewfinder while in the cutout mode. These images are cut into shapes that you then assemble when you switch to collage mode. To do so, you’ll arrange your cutouts in the 3D space by moving and tapping on the phone’s screen.

You can also adjust the shapes while holding down your finger and moving up, down, left and right — for example, if you want to rotate and scale your “weird cuts” collage shapes.

Unrelated (AFAIK), this little app lets you sketch in 2D, then put the results into AR space. (Adobe Capture should do this!)


Check out Environmental HDR lighting, new in ARCore


Environmental HDR uses machine learning with a single camera frame to understand high dynamic range illumination in 360°. It takes in available light data, and extends the light into a scene with accurate shadows, highlights, reflections and more. When Environmental HDR is activated, digital objects are lit just like physical objects, so the two blend seamlessly, even when light sources are moving.
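For the curious, here's a toy example of how an estimated main-light direction and intensity (the kind of output this style of light estimation produces) can drive simple Lambertian shading of a virtual object. This is an illustration I wrote, not ARCore's rendering pipeline:

```python
# Toy Lambertian (diffuse) shading driven by an estimated directional light.
# Illustration only -- not ARCore's actual renderer, which also handles
# ambient spherical harmonics, reflections, and shadows.

def normalize(v):
    n = sum(c * c for c in v) ** 0.5
    return tuple(c / n for c in v)

def shade(albedo, normal, light_dir, light_intensity):
    """Per-channel diffuse shading: albedo * intensity * max(0, n . l)."""
    n = normalize(normal)
    l = normalize(light_dir)
    ndotl = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(a * i * ndotl for a, i in zip(albedo, light_intensity))

# A white surface facing straight up, lit from directly above:
print(shade((1.0, 1.0, 1.0), (0, 0, 1), (0, 0, 1), (2.0, 1.9, 1.7)))
# -> (2.0, 1.9, 1.7); grazing light (n . l -> 0) fades it to black.
```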

Check out the results on a digital mannequin (left) and physical mannequin (right):

Remove.bg plugin comes to Photoshop

I haven’t yet tried it, but sample results look impressive:


It’s free to download, but usage carries a somewhat funky pricing structure. PetaPixel explains,

You’ll need to sign up for an API key through the website and be connected to the Internet while using it. You’ll be able to do 50 background removals in a small size (625×400, or 0.25 megapixels) through the plugin every month for free (and unlimited removals through the website at that size). If you work with larger volumes or higher resolutions (up to 4000×2500, or 10 megapixels), you’ll need to buy credits.
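Those size tiers are just pixel counts, by the way; a quick check of the arithmetic:

```python
# Verifying the megapixel figures in the quote: width * height / 1,000,000.
def megapixels(width, height):
    return width * height / 1_000_000

print(megapixels(625, 400))    # -> 0.25 (the free small-size tier)
print(megapixels(4000, 2500))  # -> 10.0 (the paid full-resolution tier)
```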

Awesome new portrait lighting tech from Google

The rockstar crew behind Night Sight has created a neural network that takes a standard RGB image from a cellphone & produces a relit image, showing the subject as though they were illuminated by a different environment map. Check out the results:

I spent years wanting & trying to get capabilities like this into Photoshop—and now it’s close to running in realtime on your telephone (!). Days of miracles and… well, you know.

Our method is trained on a small database of 18 individuals captured under different directional light sources in a controlled light stage setup consisting of a densely sampled sphere of lights. Our proposed technique produces quantitatively superior results on our dataset’s validation set compared to prior works, and produces convincing qualitative relighting results on a dataset of hundreds of real-world cellphone portraits. Because our technique can produce a 640 × 640 image in only 160 milliseconds, it may enable interactive user-facing photographic applications in the future.
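For a sense of scale, the quoted 160 ms per 640×640 frame works out to:

```python
# Quick sanity check on the quoted speed: 160 ms per 640x640 portrait.
ms_per_frame = 160
fps = 1000 / ms_per_frame
print(fps)  # -> 6.25 frames per second

pixels_per_second = 640 * 640 * 1000 // ms_per_frame
print(pixels_per_second)  # -> 2560000 pixels relit per second
```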


“Westerosworld”: GoT titles reimagined in the Westworld style

Visual + musical mashup FTW:

The Verge writes,

It took Belgian designer Gilles Augustijnen about eight months of on-and-off work, using After Effects, C4D, Photoshop, Illustrator, Substance Painter, ZBrush, Fusion360, and DAZ3D to bring the sequence to life, aided by Pieterjan Djufri Futra and Loris Ayné (who provided feedback, support, and help with hard surface modeling.)

Oh, and since you’re here already, fancy a little Song Of Vanilla Ice & Fire? Sure ya do!



[YouTube 1 & 2]