Thanks as always to the guys at Luma Labs for making it so ridiculously easy to generate 3D scenes from simple orbits:
Hey gang—here’s to having a great 2024 of making the world more beautiful & fun. Here’s a little 3D creation (with processing courtesy of Luma Labs) made from some New Year’s Eve drone footage I captured at Gaviota State Beach. (If it’s not loading for some reason, you can see a video version in this tweet).
Here’s a great look at how the scrappy team behind Luma.ai has helped enable beautiful volumetric captures of Phoenix Suns players soaring through the air:
Go behind the scenes of the innovative collaboration between Profectum Media and the Phoenix Suns to discover how we overcame technological and creative challenges to produce the first 3D bullet time neural radiance field NeRF effect in a major sports NBA arena video. This involved not just custom-building a 48 GoPro multi-cam volumetric rig but also integrating advanced AI tools from Luma AI to capture athletes in stunning, frozen-in-time 3D visual sequences. This venture is more than just a glimpse behind the scenes – it’s a peek into the evolving world of sports entertainment and the future of spatial capture.
“Get cinematic and professional-looking drone Flythroughs in minutes from shaky amateur recorded videos.” The results are slick:
Tangentially, here’s another impressive application of Luma tech—turning drone footage into a dramatically manipulable 3D scene:
I’m so pleased & even proud (having at least offered my encouragement to him over the years) to see my buddy Bilawal spreading his wings and spreading the good word about AI-powered creativity.
Check out his quick thoughts on “Channel-surfing realities layered on top of the real world,” “3D screenshots for the real world,” and more:
Favorite quote 😉:
“All they need to do is have a creative vision, and a Nack for working in concert with these AI models”—beautifully said, my friend! 🙏😜. pic.twitter.com/f6oUNSQXul
— John Nack (@jnack) September 1, 2023
Great visual storytelling trickery, as always, from Karen X. Cheng:
I’m still digging out (of email, Slack, and photos, but thankfully no longer of literal snow) following last weekend’s amazing photo adventure in Ely, NV. I need to try processing more footage via the amazing Luma app, but for now here’s a cool 3D version of the Nevada Northern Railway’s water tower, made simply by orbiting it with my drone & uploading the footage:
Last month Paul Trillo shared some wild visualizations he made by walking around Michelangelo’s David, then synthesizing 3D NeRF data. Now he’s upped the ante with captures from the Louvre:
Over in Japan, Tommy Oshima used the tech to fly around, through, and somehow under a playground, recording footage via a DJI Osmo + iPhone:
Here’s an example made from a quick capture I did of my friend (nothing special, but amazing what one can get simply by walking in a circle while recording video):
Karen X. Cheng, back with another 3D/AI banger:
As luck (?) would have it, the commercial dropped on the third anniversary of my former teammate Jon Barron & collaborators bringing NeRFs into existence:
Heh—before the holidays get past us entirely, check out this novel approach to 3D motion capture from the always entertaining Kevin Parry:
[Via Victoria Nece]
This stuff—creating 3D neural models from simple video captures—continues to blow my mind. First up is Paul Trillo visiting the David:
Then here’s AJ from the NYT doing a neat day-to-night transition:
And lastly, Hugues Bruyère used a 360° camera to capture this scene, then animated it in post (see thread for interesting details):
It’s insane to me how much these emerging tools democratize storytelling idioms—and then take them far beyond previous limits. Recently Karen X. Cheng & co. created some wild “drone” footage simply by capturing handheld footage with a smartphone:
Now they’re creating an amazing dolly zoom effect, again using just a phone. (Click through to the thread if you’d like details on how the footage was (very simply) captured.)
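For anyone curious about the geometry behind a dolly zoom: the subject stays the same size in frame because the field of view narrows as the camera backs away. Here’s a minimal sketch of that relationship. The function name and the sample numbers are my own illustrative assumptions, not anything taken from the thread or from a particular app.

```python
import math


def dolly_zoom_fov(subject_width, distance):
    """Field of view (in degrees) at which a subject of the given width
    exactly spans the frame when the camera is `distance` away.

    Hypothetical helper for illustration; units just need to match.
    """
    return math.degrees(2.0 * math.atan(subject_width / (2.0 * distance)))


# Pull the (virtual) camera back while narrowing the field of view:
# the subject keeps filling the frame while the background appears to compress.
for distance in (1.0, 2.0, 4.0, 8.0):
    fov = dolly_zoom_fov(subject_width=2.0, distance=distance)
    print(f"distance {distance:>4.1f} -> fov {fov:6.2f} degrees")
```

With a rendered 3D capture, both the camera position and the field of view are free parameters, which is presumably part of why the effect becomes practical with phone footage.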
Meanwhile, here’s a deeper dive on NeRF and how it’s different from “traditional” photogrammetry (e.g. in capturing reflective surfaces):
Last year my friend Bilawal Singh Sidhu, a PM driving 3D experiences for Google Maps/Earth, created an amazing 3D render (also available in galactic core form) of me sitting atop the Trona Pinnacles. At that time he used “traditional” photogrammetry techniques (kind of a funny thing to say about an emerging field that remains new to the world), and this year he tried processing the same footage (composed of a couple of simple orbits from my drone) using new Neural Radiance Field (“NeRF”) tech:
For comparison, here’s the 3D model generated via the photogrammetry approach:
The file is big enough that I’ve had some trouble loading it on my iPhone. If that affects you as well, check out this quick screen recording:
The power & immersiveness of rendering 3D from images is growing at an extraordinary rate. NeRF Studio promises to make creation much more approachable:
The kinds of results one can generate from just a series of photos or video frames are truly bonkers:
Here’s a tutorial on how to use it:
Check out this high-speed overview of recent magic courtesy of my friend Bilawal:
Photogrammetry is an art form that has been around for decades, but it’s never looked better thanks to ML techniques like Neural Radiance Fields (NeRF). This video shows a wide range of 3D captures made using this technique. And I gotta say, NeRF really breathes new life into my old photo scans! All these datasets were posed in COLMAP and trained + rendered with NVIDIA’s free Instant NGP tools.
The visualizations for StyleNeRF tech are more than a little trippy, but the fundamental idea—that generative adversarial networks (GANs) can enable 3D control over 2D faces and other objects—is exciting. Here’s an oddly soundtracked peek:
And here’s a look at the realtime editing experience:
“This is certainly the coolest thing I’ve ever worked on, and it might be one of the coolest things I’ve ever seen.”
My Google Research colleague Jon Barron routinely makes amazing stuff, so when he gets a little breathless about a project, you know it’s something special. I’ll pass the mic to him to explain their new work around capturing multiple photos, then synthesizing a 3D model:
I’ve been collaborating with Berkeley for the last few months and we seem to have cracked neural rendering. You just train a boring (non-convolutional) neural network with five inputs (xyz position and viewing angle) and four outputs (RGB+alpha), combine it with the fundamentals of volume rendering, and get an absurdly simple algorithm that beats the state of the art in neural rendering / view synthesis by *miles*.
You can change the camera angle, change the lighting, insert objects, extract depth maps — pretty much anything you would do with a CGI model, and the renderings are basically photorealistic. It’s so simple that you can implement the entire algorithm in a few dozen lines of TensorFlow.
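To make the “five inputs, four outputs, plus volume rendering” idea a bit more concrete, here’s a rough sketch of the rendering half in plain Python. This is not Jon’s code or the team’s implementation: `toy_field` is a hypothetical stand-in for the trained network (a hard-coded fuzzy sphere), and I’m assuming the fourth output is treated as a density that gets turned into per-sample alpha along each ray.

```python
import math


def toy_field(x, y, z, theta, phi):
    """Hypothetical stand-in for the trained network.

    Five inputs (position plus viewing angles) -> four outputs (r, g, b, density).
    Here: a sphere of radius 1 at the origin whose green channel shifts
    slightly with the viewing angle.
    """
    inside = x * x + y * y + z * z < 1.0
    density = 5.0 if inside else 0.0
    return 0.8, 0.3 + 0.1 * math.cos(theta), 0.2, density


def render_ray(origin, direction, field, near=0.0, far=4.0, n_samples=64):
    """Composite color along one ray by sampling the field front to back.

    `direction` is assumed to be a unit vector.
    Returns ((r, g, b), opacity).
    """
    theta = math.acos(max(-1.0, min(1.0, direction[2])))  # polar angle
    phi = math.atan2(direction[1], direction[0])          # azimuth
    delta = (far - near) / n_samples                      # spacing between samples
    transmittance = 1.0                                   # light not yet absorbed
    color = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = near + (i + 0.5) * delta
        x, y, z = (origin[k] + t * direction[k] for k in range(3))
        r, g, b, sigma = field(x, y, z, theta, phi)
        alpha = 1.0 - math.exp(-sigma * delta)            # opacity of this segment
        weight = transmittance * alpha
        color[0] += weight * r
        color[1] += weight * g
        color[2] += weight * b
        transmittance *= 1.0 - alpha
    return tuple(color), 1.0 - transmittance


# One ray through the middle of the sphere, and one that misses it entirely.
hit_color, hit_opacity = render_ray((0.0, 0.0, -2.0), (0.0, 0.0, 1.0), toy_field)
miss_color, miss_opacity = render_ray((3.0, 0.0, -2.0), (0.0, 0.0, 1.0), toy_field)
print(hit_color, hit_opacity)
print(miss_color, miss_opacity)
```

In the real thing, the network’s weights are fit so that rays rendered this way match the pixels of the captured photos; the sketch above skips training entirely and only shows how per-point outputs become a pixel color.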
Check it out in action: