It’s ludicrous to think that these folks formed the company just six months ago, and even more ludicrous to see what the model can already do—from video synthesis, to image animation, to inpainting/outpainting:
Our vision for Pika is to enable everyone to be the director of their own stories and to bring out the creator in each of us. Today, we reached a milestone that brings us closer to our vision. We are thrilled to unveil Pika 1.0, a major product upgrade that includes a new AI model capable of generating and editing videos in diverse styles such as 3D animation, anime, cartoon and cinematic, and a new web experience that makes it easier to use. You can join the waitlist for Pika 1.0 at https://pika.art.
This tech—or something much like it—is going to be a very BFD. Imagine simply describing the change you’d like to see in your image—and then seeing it.
[Generative models] still face limitations when it comes to offering precise control. That’s why we’re introducing Emu Edit, a novel approach that aims to streamline various image manipulation tasks and bring enhanced capabilities and precision to image editing.
Emu Edit is capable of free-form editing through instructions, encompassing tasks such as local and global editing, removing and adding a background, color and geometry transformations, detection and segmentation, and more. […]
Emu Edit precisely follows instructions, ensuring that pixels in the input image unrelated to the instructions remain untouched. For instance, when adding the text “Aloha!” to a baseball cap, the cap itself should remain unchanged.
Here’s a great look at how the scrappy team behind Luma.ai has helped enable beautiful volumetric captures of Phoenix Suns players soaring through the air:
Go behind the scenes of the innovative collaboration between Profectum Media and the Phoenix Suns to discover how we overcame technological and creative challenges to produce the first 3D bullet time neural radiance field NeRF effect in a major sports NBA arena video. This involved not just custom-building a 48 GoPro multi-cam volumetric rig but also integrating advanced AI tools from Luma AI to capture athletes in stunning, frozen-in-time 3D visual sequences. This venture is more than just a glimpse behind the scenes – it’s a peek into the evolving world of sports entertainment and the future of spatial capture.
I’m really digging the experience of (optionally) taking a photo, feeding it into ChatGPT, and then riffing my way towards an interesting visual outcome. Here’s a gallery in which you can see some of the journeys I’ve undertaken recently.
Image->description->image quality is often pretty hit-or-miss. Even so, it’s such a compelling possibility that I keep wanting to try it (e.g. seeing a leaf on the ground, wanting to try turning it into a stingray).
The system attempts to maintain various image properties (e.g. pose, color, style) while varying others (e.g. turning the attached vehicle from a box truck to a tanker while maintaining its general orientation plus specifics like featuring three Holstein cows).
Overall text creation is vastly improved vs. previous models, though it can still derail. It’s striking that one can iteratively improve a particular line of text (e.g. “Make sure that the second line says ‘TRAIN’”).
Man, I’m inspired—and TBH a little jealous—seeing 14yo creator Preston Mutanga creating amazing 3D animations, as he’s apparently been doing for fully half his life. I think you’ll enjoy the short talk he gave covering his passions:
The presentation will take the audience on a journey, a journey across the Spider-Verse where a self-taught, young, talented 14-year-old kid used Blender, to create high-quality LEGO animations of movie trailers. Through the use of social media, this young artist’s passion and skill caught the attention of Hollywood producers, leading to a life-changing invitation to animate in a new Hollywood movie.
Speaking of increasing resolution, check out this sneak peek from Adobe MAX:
It’s a video upscaling tool that uses diffusion-based technology and artificial intelligence to convert low-resolution videos to high-resolution videos for applications. Users can directly upscale low-resolution videos to high resolution. They can also zoom-in and crop videos and upscale them to full resolution with high-fidelity visual details and temporal consistency. This is great for those looking to bring new life into older videos or to prevent blurry videos when playing scaled versions on HD screens.
Google Zoom Enhance. “Using generative AI, Zoom Enhance intelligently fills in the gaps between pixels and predicts fine details, opening up more possibilities when it comes to framing and flexibility to focus on the most important part of your photo.”
Nick St. Pierre writes, “I just upscaled an image in MJ by 4x, then used Topaz Photo AI to upscale that by another 6x. The final image is 682MP and 32000×21333 pixels large.”
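For a quick sanity check on those numbers, here's a minimal Python sketch of the arithmetic. The only inputs taken from the quote are the final dimensions and the two upscale factors; the `megapixels` helper is a name I made up, and I'm assuming (the quote doesn't say) that 4x and 6x each apply per side rather than to the total pixel count.

```python
def megapixels(width: int, height: int) -> float:
    """Return the pixel count of a width x height image, in millions of pixels."""
    return width * height / 1_000_000


# Figures from the quote: final image dimensions and the two upscale steps.
final_w, final_h = 32000, 21333

# Assumption: each factor scales width and height, so the two steps
# multiply into one combined per-side factor.
combined_factor = 4 * 6

print(f"final size: {megapixels(final_w, final_h):.3f} MP")
print(f"combined per-side factor: {combined_factor}x")
print(
    "implied starting size: "
    f"{final_w / combined_factor:.0f} x {final_h / combined_factor:.0f} px"
)
```

The product works out to 682.656 million pixels, which squares with the quoted 682MP. Under the per-side assumption, the image that went into the first upscale would have been only around 1.2MP.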