My old teammates keep slapping out the bangers, releasing machine-learning tech to help build apps that key off the human form.
First up is Media Pipe Iris, enabling depth estimation for faces without fancy (iPhone X-/Pixel 4-style) hardware, and that in turn opens up access to accurate virtual try-on for glasses, hats, etc.:
The model enables cool tricks like realtime eye recoloring:
I always find it interesting to glimpse the work that goes in behind the scenes. For example:
To train the model from the cropped eye region, we manually annotated ~50k images, representing a variety of illumination conditions and head poses from geographically diverse regions, as shown below.
The team has followed up this release with MediaPipe BlazePose, which is in testing now & planned for release via the cross-platform ML Kit soon:
Our approach provides human pose tracking by employing machine learning (ML) to infer 33, 2D landmarks of a body from a single frame. In contrast to current pose models based on the standard COCO topology, BlazePose accurately localizes more keypoints, making it uniquely suited for fitness applications…
If one leverages GPU inference, BlazePose achieves super-real-time performance, enabling it to run subsequent ML models, like face or hand tracking.
Now I can’t wait for apps to help my long-suffering CrossFit coaches actually quantify the crappiness of my form. Thanks, team! 😛