Category Archives: Body Tracking

Google’s pose detection is now available on iOS & Android

Awesome work by the team. Come grab a copy & build something great!

The ML Kit Pose Detection API is a lightweight, versatile solution for app developers to detect the pose of a subject’s body in real time from a continuous video or static image. A pose describes the body’s position at one moment in time with a set of x,y skeletal landmark points. The landmarks correspond to different body parts such as the shoulders and hips. The relative positions of landmarks can be used to distinguish one pose from another.
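To make the “relative positions of landmarks” idea concrete, here’s a minimal TypeScript sketch of that kind of check. The landmark names and fields are illustrative only (ML Kit’s actual SDKs are Kotlin/Java and Swift/Objective-C, and their types differ); the point is that a pose like “arms raised” falls out of simple comparisons between keypoints:

```typescript
// Illustrative landmark shape; not ML Kit's actual SDK types.
// Coordinates are image-space pixels, with y increasing downward.
interface Landmark {
  name: string; // e.g. "left_wrist", "left_shoulder"
  x: number;
  y: number;
}

// Crude pose check using only relative positions: arms count as "raised"
// when both wrists sit above their shoulders (smaller y = higher in the image).
function armsRaised(landmarks: Landmark[]): boolean {
  const byName = new Map(landmarks.map((l): [string, Landmark] => [l.name, l]));
  const pairs: Array<[string, string]> = [
    ["left_wrist", "left_shoulder"],
    ["right_wrist", "right_shoulder"],
  ];
  return pairs.every(([wrist, shoulder]) => {
    const w = byName.get(wrist);
    const s = byName.get(shoulder);
    return !!w && !!s && w.y < s.y;
  });
}
```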

Body Movin’: Google AI releases new tech for body tracking, eye measurement

My old teammates keep slapping out the bangers, releasing machine-learning tech to help build apps that key off the human form.

First up is MediaPipe Iris, enabling depth estimation for faces without fancy (iPhone X-/Pixel 4-style) hardware, and that in turn opens up access to accurate virtual try-on for glasses, hats, etc.:

https://twitter.com/GoogleAI/status/1291430839088103424
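The reason iris tracking yields depth without special hardware is that the human iris has a nearly constant physical diameter (roughly 11.7 mm), so once the model localizes it, pinhole-camera geometry gives the distance. A tiny sketch of that relationship; the function and example numbers below are illustrative, not MediaPipe’s API:

```typescript
// Human iris diameter is ~11.7 mm and varies little across people,
// which is what makes metric depth recoverable from a plain RGB camera.
const IRIS_DIAMETER_MM = 11.7;

// Pinhole-camera model: distance = focalLengthPx * realSizeMm / sizeInImagePx.
function estimateDepthMm(irisDiameterPx: number, focalLengthPx: number): number {
  if (irisDiameterPx <= 0) throw new Error("iris diameter must be positive");
  return (focalLengthPx * IRIS_DIAMETER_MM) / irisDiameterPx;
}

// Example: a 40 px iris seen through a 1400 px focal length ≈ 410 mm away.
console.log(`${estimateDepthMm(40, 1400).toFixed(0)} mm`);
```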

The model enables cool tricks like realtime eye recoloring:

I always find it interesting to glimpse the work that goes in behind the scenes. For example:

To train the model from the cropped eye region, we manually annotated ~50k images, representing a variety of illumination conditions and head poses from geographically diverse regions, as shown below.

The team has followed up this release with MediaPipe BlazePose, which is in testing now & planned for release via the cross-platform ML Kit soon:

Our approach provides human pose tracking by employing machine learning (ML) to infer 33 2D landmarks of a body from a single frame. In contrast to current pose models based on the standard COCO topology, BlazePose accurately localizes more keypoints, making it uniquely suited for fitness applications…

If one leverages GPU inference, BlazePose achieves super-real-time performance, enabling it to run subsequent ML models, like face or hand tracking.
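To picture the fitness use case: once you have hip, knee, and ankle keypoints per frame, something like a squat-rep counter reduces to hysteresis on the knee angle. A rough sketch, assuming BlazePose-style landmark indices (23 = left hip, 25 = left knee, 27 = left ankle) and per-frame (x, y) keypoints; the thresholds are arbitrary, not anything the team has published:

```typescript
type Point = { x: number; y: number };

// Interior angle (degrees) at point b, formed by the segments b→a and b→c.
function angleDeg(a: Point, b: Point, c: Point): number {
  const ab = Math.atan2(a.y - b.y, a.x - b.x);
  const cb = Math.atan2(c.y - b.y, c.x - b.x);
  const deg = Math.abs((ab - cb) * (180 / Math.PI));
  return deg > 180 ? 360 - deg : deg;
}

// Count squat reps with simple hysteresis: "down" below 100°, a rep completes
// once the knee straightens back past 160°.
class SquatCounter {
  private down = false;
  reps = 0;

  // keypoints: one frame of 33 BlazePose-style landmarks (indices assumed).
  update(keypoints: Point[]): void {
    const hip = keypoints[23];
    const knee = keypoints[25];
    const ankle = keypoints[27];
    if (!hip || !knee || !ankle) return;
    const kneeAngle = angleDeg(hip, knee, ankle);
    if (kneeAngle < 100) {
      this.down = true;
    } else if (kneeAngle > 160 && this.down) {
      this.down = false;
      this.reps += 1;
    }
  }
}
```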

Now I can’t wait for apps to help my long-suffering CrossFit coaches actually quantify the crappiness of my form. Thanks, team! 😛

Virtual backgrounds & blurs are coming to Google Meet

It may seem like a small thing, but I’m happy to say that my previous team’s work on realtime human segmentation + realtime browser-based machine learning will be coming to Google Meet soon, powering virtual backgrounds:

Since making Google Meet premium video meetings free and available to everyone, we’ve continued to accelerate the development of new features… In the coming months, we’ll make it easy to blur out your background, or replace it with an image of your choosing so you can keep your team’s focus solely on you. 
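Google hasn’t shipped Meet’s own segmentation model as a library, but the same background-blur technique is easy to prototype in the browser with the open-source BodyPix model from TensorFlow.js. A minimal sketch (single frame; in practice you’d run it per frame via requestAnimationFrame):

```typescript
import '@tensorflow/tfjs'; // registers the WebGL backend for in-browser inference
import * as bodyPix from '@tensorflow-models/body-pix';

// Blur everything except the person in the current video frame.
async function blurBackground(video: HTMLVideoElement, canvas: HTMLCanvasElement) {
  // Load the open-source BodyPix segmentation model (a stand-in here for
  // Meet's own, unreleased model).
  const net = await bodyPix.load();

  // Per-pixel person/background segmentation for the current frame.
  const segmentation = await net.segmentPerson(video);

  // Draw the frame with the background blurred and the person kept sharp.
  bodyPix.drawBokehEffect(
    canvas,
    video,
    segmentation,
    9,    // backgroundBlurAmount
    3,    // edgeBlurAmount
    false // flipHorizontal
  );
}
```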


ML Kit gets pose detection

This is kinda inside-baseball, but I’m really happy that friends from my previous team will now have their work distributed on hundreds of millions, if not billions, of devices:

[A] face contours model — which can detect over 100 points in and around a user’s face and overlay masks and beautification elements atop them — has been added to the list of APIs shipped through Google Play Services…

Lastly, two new APIs are now available as part of the ML Kit early access program: entity extraction and pose detection… Pose detection supports 33 skeletal points like hands and feet tracking.

Let’s see what rad stuff the world can build with these foundational components. Here’s an example of folks putting an earlier version to use, and you can find a ton more in my Body Tracking category:

[Via]

Garments strut on their own in a 3D fashion show

No models, no problem: Congolese designer Anifa Mvuemba used software to show off her designs swaying in virtual space:

Cool context:

Inspired by her hometown in Congo, Anifa was intentional about shedding light on issues facing the Central African country with a short documentary at the start of the show. From mineral site conditions to the women and children who suffer as a result of these issues, Anifa’s mission was to educate before debuting any clothes. “Serving was a big part of who I am, and what I want to do,” she said in the short documentary.

Cloaking device engaged: Going invisible via Google’s browser-based ML

Heh, here’s a super fun application of body-tracking tech (see the whole category here for previous news) that shows off how folks have been working to redefine what’s possible with realtime machine learning on the Web (!):

Google open-sources PoseNet 2.0 for Web-based body tracking

My teammates Tyler & George have released numerous projects made with their body-tracking library PoseNet, and now v2 has been open-sourced for you to use via TensorFlow.js. You can try it out here.
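For anyone wanting to kick the tires, here’s roughly what using PoseNet 2.x from TensorFlow.js looks like; this follows the @tensorflow-models/posenet README as I understand it, so treat it as a sketch and check the current docs for option names:

```typescript
import '@tensorflow/tfjs'; // registers the WebGL backend in the browser
import * as posenet from '@tensorflow-models/posenet';

async function logPose(image: HTMLImageElement | HTMLVideoElement) {
  // Load the model with its defaults (a MobileNet backbone, light enough for
  // realtime use; a ResNet50 variant trades speed for accuracy).
  const net = await posenet.load();

  // Estimate a single pose; each keypoint carries a part name, (x, y), and score.
  const pose = await net.estimateSinglePose(image, { flipHorizontal: false });
  for (const kp of pose.keypoints) {
    console.log(
      `${kp.part}: (${kp.position.x.toFixed(0)}, ${kp.position.y.toFixed(0)}) ` +
        `score=${kp.score.toFixed(2)}`
    );
  }
}
```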

From last year (post), here’s an example of the kind of fun stuff you can make using it:

[YouTube]

New open-source Google AI experiments help people make art

One’s differing physical abilities shouldn’t stand in the way of drawing & making music. Body-tracking tech from my teammates George & Tyler (see previous) is just one of the new Web-based experiments in Creatability. Check it out:

Creatability is a set of experiments made in collaboration with creators and allies in the accessibility community. They explore how creative tools – drawing, music, and more – can be made more accessible using web and AI technology. They’re just a start. We’re sharing open-source code and tutorials for others to make their own projects.


[YouTube]

Absolute witchcraft: AI synthesizes dance moves, entire street scenes

This 💩 is 🍌🍌🍌, B-A-N-A-N-A-S: This Video-to-Video Synthesis tech apparently can take in one dance performance & apply it to a recording of another person to make her match the moves:

It can even semantically replace entire sections of a scene—e.g. backgrounds in a street scene: 

Now please excuse me while I lie down for a bit, as my brain is broken.


[YouTube]

[YouTube 1 & 2] [Via Tyler Zhu]