The tech itself relies on not one, but two neural networks: one to remove “foreign” shadows cast by unwanted objects like a hat or a hand held up to block the sun from your eyes, and the other to soften natural facial shadows and add “a synthetic fill light” to improve the lighting ratio once the unwanted shadows have been removed.
There’s often a lot of work to go from tech demo to robust, shipping feature (especially when targeting Photoshop’s rigorous level of quality & flexibility), and I’m sure the team has been working hard on that. In any event, I’m looking forward to trying it myself.
The company’s Photosynth technology has been public since 2006, and while it’s been cool (placing photos into 3D space), I haven’t seen it gain traction in its original form or as a free panorama maker. That could now change.
The new version stitches photos into smooth fly-throughs. Per TechCrunch:
[U]sers upload a set of photos to Microsoft’s cloud service, then the technology begins looking for points (“features”) in the successive photos that appear to be the same object. It then determines where each photo was taken from, where in 3D space each of these objects was, and how the camera was oriented. Next, it generates the 3D shapes on a per-photo basis. And finally, the technology calculates a smooth path – like a Steadicam – through the locations for each photo, and then slices the images into multi-resolution pyramids for efficiency.
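The first step of that pipeline, matching “features” across successive photos, can be sketched in a few lines. Here’s a toy nearest-neighbor descriptor matcher with a ratio test; everything below is my own illustration, not Photosynth’s actual code:

```python
import numpy as np

def match_features(desc_a, desc_b, ratio=0.75):
    """Nearest-neighbor descriptor matching with a ratio test: keep a
    match only when the best candidate in B is clearly closer than
    the runner-up, which rejects ambiguous correspondences."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # L2 distance to every B descriptor
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches

# Toy data: B starts with noisy copies of A's first two descriptors
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(4, 8))
desc_b = np.vstack([desc_a[:2] + 0.01 * rng.normal(size=(2, 8)),
                    rng.normal(size=(3, 8))])
matches = match_features(desc_a, desc_b)
```

Real systems use invariant descriptors (e.g. SIFT) and approximate nearest-neighbor search, but the ratio-test idea is the same.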
Check this out:
Once you’ve clicked it, try hitting “C” to reveal & interact with the 3D camera path. Here’s an example from photographer David Brashears, who captured Mt. Everest during one of the highest-elevation helicopter flights ever attempted:
So, will we see this become more common? It’s the first presentation I’ve seen that makes me want to don a wearable, lifelogging camera on vacation.
Ever since Photoshop introduced the Content-Aware Fill tool, a number of artists have used it to explore different concepts. Everything I’d seen until now operated on static images, but in 2012 Zach Nader made “optional features shown,” a 2:10 video that applies the tool to car commercials, replacing the text, cars, and people with content-aware background. I find the glitchy movement over a constant, quiet background very interesting.
Sylvain Paris (creator of previous eye-popping tech demos) presents a sneak peek of a way to transfer the appearance of lighting in one image or video to another. And yes, you’ll find Rainn Wilson’s interjections annoying. I recommend skipping the first 1:30. (Can anyone tell me exactly how to do that with embedded YouTube content, by the way?)
If I may echo Rainn Wilson, “Oh my God, that’s ridiculous.”
Note: This is a technology demo, not a feature that’s quite ready to go in Photoshop CC. With the move to subscriptions, however, Photoshop and other teams are moving away from “big bang” releases & towards more continuous deployment of improvements. [Update: I know that a number of people aren’t digging Wilson’s schtick. Hats off to Sarah for being such a pro under pressure.]
Researchers at the Max Planck Institute for Informatics (MPII) have developed video inpainting software that can effectively delete people or objects from high-definition footage. The software analyzes each video frame and calculates what pixels should replace a moving area that has been marked for removal.
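A heavily simplified version of that idea: for each pixel marked for removal, borrow the value from the nearest frame in time where the background behind the object is actually visible. This toy sketch is my own illustration, not MPII’s algorithm (which also handles camera motion and synthesizes never-seen pixels), and it assumes a static camera:

```python
import numpy as np

def temporal_fill(frames, masks):
    """For every pixel marked for removal, borrow the value from the
    nearest-in-time frame where that pixel is NOT masked, i.e. where
    the background is visible.
    frames: (T, H, W) grayscale video; masks: (T, H, W) bool, True = remove."""
    T = frames.shape[0]
    out = frames.copy()
    for t in range(T):
        remaining = masks[t].copy()          # pixels still needing a value
        for dt in range(1, T):
            for s in (t - dt, t + dt):       # search outward in time
                if 0 <= s < T and remaining.any():
                    fill = remaining & ~masks[s]
                    out[t][fill] = frames[s][fill]
                    remaining &= ~fill
    return out

# A bright square slides across a ramp background; mark it for removal
T, H, W = 3, 8, 8
background = np.tile(np.arange(W, dtype=float), (H, 1))
frames = np.stack([background.copy() for _ in range(T)])
masks = np.zeros((T, H, W), dtype=bool)
for t in range(T):
    masks[t, 2:4, 2 + t:4 + t] = True        # the object moves right each frame
    frames[t][masks[t]] = 255.0
restored = temporal_fill(frames, masks)
```

Because the object keeps moving, every masked pixel is eventually uncovered in some other frame, and the square vanishes cleanly.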
allow animators to perform and capture motions continuously instead of breaking them into small increments… More importantly, it permits direct hand manipulation without resorting to rigs, achieving more natural object control for beginners.
The first half of the demo is pretty dry, but jump ahead to about 2:20 to start seeing the big wow:
I can’t claim to have known his name, but like you I know his work: Petro Vlahos pioneered blue- and green-screen techniques & founded Ultimatte before passing away this past week at the age of 96. The BBC writes,
Mr Vlahos’s breakthrough was to create a complicated laboratory process which involved separating the blue, green and red parts of each frame before combining them back together in a certain order.
He racked up more than 35 movie-related patents and numerous Academy commendations.
By coincidence, I came across the following peek behind the scenes of The Hobbit. It bears out what Robin Shenfield from compositing firm The Mill says of Mr. Vlahos’s work: “It’s the absolute building block of all the visual effects that you see in television and movies.”
Color me deeply skeptical, but intrigued: The BBC reports on an app that modifies the paper version of The Tokyo Shimbun in ways kids might appreciate:
“What it’s really about is something that’s been talked about for a long time, about content being presented in different ways depending on who the user is,” he said.
“It means two versions of the content – a grown-up one and the kids one. That has enormous potential. It also tackles a big gap in young readership.
This makes me oddly wistful: I’m Proust-ing out, almost smelling the newsprint & listening to the “funny papers” rattle as my dad read me cartoons, or as he’d read news & obits with a drink after work. The real obit, of course, is for the paper newspaper: I’m afraid all this will show up as a quaintly hilarious discovery that flits by on some future adult’s in-optic-nerve newsfeed. But whatever; I’m suddenly, and surprisingly, all choked up.
Adobe publishes some of its best work (e.g. tech behind Content-Aware Fill) in the academic community, rather than keeping it a trade secret, as some other big software companies do. Dan Goldman, one of the brains behind CAF, writes,
First, by encouraging publication, we make it attractive for the best minds in the business to come work in our labs – we count several former and current University professors among our ranks. Second, our researchers draw on the wealth of knowledge in the academic community as well – a great deal of our research is done in collaboration with graduate students like Connelly. And third, the rigorous demands of peer review keep us motivated to try truly new things – rather than being content to simply do all the old things better.
Remember that Wayne’s World “Camera one, camera two!” scene where he opens & closes one eye at a time? (No, you probably weren’t born when that came out; but I digress.) Lytro’s “perspective shift” feature works a bit like that, letting you switch between two subtly different points of view on the same scene:
It’s cool, though my big hope here remains that such technology offers a better way to select elements in a photo by detecting their varying depths. [Via]
It’s far from the flashiest task, but placing cuts & transitions in interview footage can be crucial to telling a story. Adobe’s Wil Li, together with UC Berkeley collaborators Maneesh Agrawala and Floraine Berthouzoz, has unveiled “a one-click method for seamlessly removing ’ums’ and repeated words, as well as inserting natural-looking pauses to emphasize semantic content”:
To help place cuts in interview video, our interface links a text transcript of the video to the corresponding locations in the raw footage. It also visualizes the suitability of cut locations… Editors can directly highlight segments of text, check if the endpoints are suitable cut locations and if so, simply delete the text to make the edit. For each cut our system generates visible (e.g. jump-cut, fade, etc.) and seamless, hidden transitions.
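The core bookkeeping, mapping deleted transcript words back to time ranges in the raw footage, is simple to sketch. The word timings and helper below are hypothetical, not the researchers’ actual system:

```python
# Each transcript word carries its start/end time (seconds) in the raw footage
words = [("so", 0.0, 0.2), ("um", 0.2, 0.5), ("we", 0.5, 0.7),
         ("we", 0.7, 0.9), ("started", 0.9, 1.4), ("in", 1.4, 1.5),
         ("2006", 1.5, 2.1)]

def cuts_for_deleted(words, deleted_indices):
    """Turn deleted transcript words into merged time ranges to cut."""
    ranges = []
    for i in sorted(deleted_indices):
        start, end = words[i][1], words[i][2]
        if ranges and abs(ranges[-1][1] - start) < 1e-9:
            ranges[-1] = (ranges[-1][0], end)   # merge back-to-back cuts
        else:
            ranges.append((start, end))
    return ranges

# Delete the "um" (index 1) and the repeated "we" (index 3)
print(cuts_for_deleted(words, [1, 3]))  # → [(0.2, 0.5), (0.7, 0.9)]
```

The hard parts the paper actually addresses, aligning the transcript to the audio and hiding each cut with a seamless transition, sit on top of this simple mapping.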
I’m excited to announce that the company founded by my old boss & friend Kevin Connor, working together with image authenticity pioneer Dr. Hany Farid, has released its first product, FourMatch—an extension for Photoshop CS5/CS6 that “instantly distinguishes unmodified digital camera files from those that may have been edited.” From the press release:
FourMatch… appears as a floating panel that automatically and instantly provides an assessment of any open JPEG image. A green light in the panel indicates that the file matches a verified original signature in FourMatch software’s extensive and growing database of more than 70,000 signatures. If a match is not found, the panel displays any relevant information that can aid the investigator in further assessing the photo’s reliability.
Fourandsix will donate 2 percent of their proceeds from the sale of this software to the National Center for Missing & Exploited Children (NCMEC). The donation will support NCMEC efforts to find missing children and prevent the abduction and sexual exploitation of children.
KinÊtre is a research project from Microsoft Research Cambridge that allows novice users to scan physical objects and bring them to life in seconds by using their own bodies to animate them. This system has a multitude of potential uses for interactive storytelling, physical gaming, or more immersive communications.
“When we started this,” says creator Jiawen Chen, “we were thinking of using it as a more effective way of doing set dressing and prop placement in movies for a preview. Studios have large collections of shapes, and it’s pretty tedious to move them into place exactly. We wanted to be able to quickly walk around and grab things and twist them around. Then we realized we can do many more fun things.” I’ll bet.
Pretty darn cool, though if that Kinect dodgeball demo isn’t Centrifugal Bumble-Puppy come to life, I don’t know what is.
Here’s more info on using a Kinect as a 3D scanner:
Last month I broke the somewhat sad news that Adobe’s Pixel Bender language is being retired, but for a good cause: we can now redirect effort & try other ways to achieve similar goals. To that end, Adobe researchers have teamed up with staff at the Massachusetts Institute of Technology to define Halide, a new programming language for imaging. It promises faster, more compact, and more portable code.
In tests, the MIT researchers used Halide to rewrite several common image-processing algorithms whose performance had already been optimized by seasoned programmers. The Halide versions were typically about one-third as long but offered significant performance gains — two-, three-, or even six-fold speedups. In one instance, the Halide program was actually longer than the original — but the speedup was 70-fold.
Normally we work so hard to reduce motion in video (e.g. bringing the awesome Warp Stabilizer from After Effects to Premiere Pro CS6). There are cases, though (e.g. monitoring a heartbeat, or the breathing of a baby) where one wants to do just the opposite. Here’s an interesting demo:
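The core trick, as I understand it, is to amplify small temporal variations at each pixel. Here’s a one-dimensional toy sketch of that idea (my own illustration, not MIT’s actual method, which also decomposes the video spatially):

```python
import numpy as np

def magnify(signal, alpha=10.0, win=5):
    """Split one pixel's time series into a slow moving average and a
    fast residual, then boost the residual — the faint pulse or breath."""
    kernel = np.ones(win) / win
    lowpass = np.convolve(signal, kernel, mode="same")
    return lowpass + alpha * (signal - lowpass)

# A faint "pulse": a 0.5-unit flicker riding on a bright, constant pixel
t = np.arange(200)
signal = 100.0 + 0.5 * np.sin(2 * np.pi * t / 8)
boosted = magnify(signal)
```

After boosting, the invisible half-unit flicker swings several units, which is exactly what makes a heartbeat or a baby’s breathing pop out on screen.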
You know cinemagraphs, “still photographs in which a minor and repeated movement occurs”? They can be extremely cool, but creating them is tricky.
Now Adobe researcher Aseem Agarwala & colleagues at UC Berkeley have devised “a semi-automated technique for selectively de-animating video to remove the large-scale motions of one or more objects so that other motions are easier to see.” It’s easier seen than described:
The user draws strokes to indicate the regions of the video that should be immobilized, and our algorithm warps the video to remove the large-scale motion of these regions while leaving finer-scale, relative motions intact. However, such warps may introduce unnatural motions in previously motionless areas, such as background regions. We therefore use a graph-cut-based optimization to composite the warped video regions with still frames from the input video; we also optionally loop the output in a seamless manner.
Our technique enables a number of applications such as clearer motion visualization, simpler creation of artistic cinemagraphs (photos that include looping motions in some regions), and new ways to edit appearance and complicated motion paths in video by manipulating a de-animated representation. We demonstrate the success of our technique with a number of motion visualizations, cinemagraphs and video editing examples created from a variety of short input videos, as well as visual and numerical comparison to previous techniques.
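Stripped way down, the “immobilize a region” step amounts to estimating each frame’s motion relative to a reference and warping to cancel it. The toy below assumes a pure integer translation over the whole frame, with none of the stroke-based masking, general warps, or graph-cut compositing the paper describes:

```python
import numpy as np

def estimate_shift(ref, frame, max_shift=3):
    """Brute-force search for the integer (dy, dx) that best aligns
    `frame` to `ref` (sum of squared differences)."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = np.sum((np.roll(frame, (dy, dx), axis=(0, 1)) - ref) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def deanimate(frames):
    """Shift every frame so its large-scale motion (here: a pure
    translation) relative to frame 0 is cancelled."""
    ref = frames[0]
    return np.stack([np.roll(f, estimate_shift(ref, f), axis=(0, 1))
                     for f in frames])

# Demo: frame 1 is frame 0 drifted by (1, 2); de-animation undoes the drift
p = np.random.default_rng(1).normal(size=(8, 8))
frames = np.stack([p, np.roll(p, (-1, -2), axis=(0, 1))])
stabilized = deanimate(frames)
```

The interesting finer-scale motion survives because only the region’s dominant motion is subtracted.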
About five years ago we gave Photoshop the ability to stack multiple images together, then eliminate moving or unwanted details. Similar techniques have appeared in other tools, and now it appears you’ll be able to do all the capture & processing with just your phone. Here’s a quick preview:
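The classic stacking trick is a per-pixel median: once the frames are aligned, anything present in only a minority of them simply disappears. A minimal, purely illustrative sketch:

```python
import numpy as np

def remove_transients(frames):
    """Median across an aligned stack: detail that appears in only a
    minority of frames (a passing car, a stray pedestrian) falls out."""
    return np.median(np.stack(frames), axis=0)

# Five aligned shots of a plain scene; a "tourist" blob appears in two
scene = np.full((6, 6), 10.0)
shots = [scene.copy() for _ in range(5)]
shots[1][2:4, 2:4] = 255.0
shots[3][1:3, 0:2] = 255.0
clean = remove_transients(shots)
```

The hard part on a phone is the alignment and the capture UX, which is presumably where the preview below comes in.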
My longtime boss Kevin Connor left Adobe earlier this year to launch a startup, Fourandsix, aimed at “revealing the truth behind every photograph.” Now his co-founder (and Adobe collaborator) Hany Farid has published some interesting research:
Dr. Farid and Eric Kee, a Ph.D. student in computer science at Dartmouth, are proposing a software tool for measuring how much fashion and beauty photos have been altered, a 1-to-5 scale that distinguishes the infinitesimal from the fantastic. Their research is being published this week in a scholarly journal, The Proceedings of the National Academy of Sciences.
The video at the top of this post is a Polar Rose demo of an app called “Recognizr”, which recognizes people’s faces and provides you with links to their social media accounts.
Imagine a world where every person on the street can be identified by simply pointing your phone at their face. Curious about a stranger? Point your camera at them to pull up their Facebook profile. People who had concerns over facial recognition in Facebook photos are going to have a fit about this one…
I remain eager to see what developers can do in terms of building photography & design apps. If you see anything cool, give a shout.
A couple of years ago, Esquire shot a magazine cover using not a still camera but a high-res RED video camera. What was groundbreaking becomes commonplace, and as video capture resolution increases, so does the possibility of pulling usable still photos from video.
To make that easier, Adobe engineers & University of Washington researchers are collaborating on a method of automatically finding the best candid shots in a video clip. Check it out:
Very cool–though I continue to suspect there’s a market for auto-selecting the most ridiculous, unflattering images of one’s friends…
“With a single image and a small amount of annotation,” writes researcher Kevin Karsch, “our method creates a physical model of the scene that is suitable for realistically rendering synthetic objects.” Fascinating:
Last week over a million people (!) watched a handheld recording of this demo. Here’s a far clearer version*:
And here’s a before/after image (click for higher resolution):
Now, here’s the thing: This is just a technology demo, not a pre-announced feature. It’s very exciting, but much hard work remains to be done. Check out details right from the researchers via the Photoshop.com team blog. [Update: Yes, it’s real. See the researchers’ update at the bottom of the post.] * Downside of this version: Bachman Turner Overdrive. Upside: Rainn Wilson.
The Throwable Panoramic Ball Camera captures a full spherical panorama when thrown into the air. At the peak of its flight, which is determined using an accelerometer, a full panoramic image is captured by 36 mobile phone camera modules.
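Detecting the peak of flight from the accelerometer is a nice little trick: while the ball is airborne it’s in free fall, so the sensor reads near zero g, and the apex is roughly the midpoint of that interval. A sketch of that logic (the threshold, sample rate, and data below are made up):

```python
import numpy as np

def apex_index(accel_mag, thresh=2.0):
    """The camera is in free fall while |acceleration| is near zero;
    the apex of the throw is the midpoint of that interval."""
    in_flight = np.flatnonzero(accel_mag < thresh)
    return (in_flight[0] + in_flight[-1]) // 2

# Simulated 100 Hz log: held (≈9.8 m/s²), free flight (≈0), caught again
accel = np.concatenate([np.full(50, 9.8), np.full(80, 0.1), np.full(50, 9.8)])
print(apex_index(accel))  # → 89, the middle of samples 50..129
```

Firing all 36 cameras at that sample minimizes motion blur and maximizes the distance to the thrower’s hands.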
You can build a business manipulating photos; how about building one by detecting those manipulations?
My longtime boss Kevin Connor was instrumental in building Photoshop, Lightroom, and PS Elements into the successes they are today, and he taught me the ropes of product management. After 15 years he was ready to try starting his own company, so this spring he teamed up with Dr. Hany Farid (“the father of digital image forensics,” said NOVA). Together they’ve started forensics company Fourandsix (get the pun?), aimed at “revealing the truth behind every photograph.”
Now they’ve put up Photo Tampering Throughout History, an interesting collection of famous (and infamous) forgeries & manipulations from Abraham Lincoln’s day to the present. Numerous examples include before & after images plus brief histories of what happened.
I wish Kevin & Hany great success in this new endeavor, and I can’t wait to see the tools & services they introduce.
Ah–I’d been wondering what that little camera icon in the Google Images search field meant. As the company explains,
You might have an old vacation photo, but forgot the name of that beautiful beach. Typing [guy on a rocky path on a cliff with an island behind him] isn’t exactly specific enough to help find your answer. So when words aren’t as descriptive as the image, you can now search using the image itself.
Or, “Any sufficiently advanced technology is indistinguishable from magic.” — Arthur C. Clarke
At Adobe MAX last month, digital imaging researcher Sylvain Paris showed off some tech he & colleagues are cooking up in Adobe’s Boston office. Here he touches on color/tone matching between photos; more sophisticated auto-correction of color and tone (based on analyzing thousands of adjustments made by pro photographers); and image de-blurring:
Lots of other really interesting MAX sneaks are collected here.
Did you know that the Photoshop team has a resident theoretical physicist? If you’d like to meet him, check out next Thursday’s Silicon Valley ACM SIGGRAPH talk:
Recently we and others have gained deeper understanding of the fundamentals of the plenoptic camera and Lippmann sensor. As a result, we have developed new rendering approaches to improve resolution, remove artifacts, and render in real time. By capturing multiple modalities simultaneously, our camera captures images that are focusable after the fact and which can be displayed in multi view stereo. The camera can also be configured to capture HDR, polarization, multispectral color and other modalities. With superresolution techniques we can even render results that approach full sensor resolution. During our presentation we will demonstrate interactive real time rendering of 3D views with after the fact focusing.
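The basic rendering operation behind after-the-fact focusing is shift-and-add: translate each sub-aperture view in proportion to its position on the lens, then average; varying the scale factor moves the synthetic focal plane. A toy version with integer shifts and a made-up sign convention (my sketch, not Adobe’s renderer):

```python
import numpy as np

def refocus(views, offsets, alpha):
    """Shift-and-add refocusing: shift each sub-aperture view by
    alpha times its (du, dv) lens offset, then average the stack."""
    acc = np.zeros_like(views[0], dtype=float)
    for view, (du, dv) in zip(views, offsets):
        acc += np.roll(view, (int(round(alpha * dv)), int(round(alpha * du))),
                       axis=(0, 1))
    return acc / len(views)

# A single point with 1 px of disparity per unit of lens offset
offsets = [(-1, 0), (0, 0), (1, 0)]
views = []
for du, dv in offsets:
    v = np.zeros((5, 9))
    v[2, 4 + du] = 1.0
    views.append(v)

in_focus = refocus(views, offsets, alpha=-1.0)  # shifts align the point
defocused = refocus(views, offsets, alpha=0.0)  # point smears across columns
```

The superresolution and real-time GPU work described above goes far beyond this, but every variant starts from this same integral over the aperture.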
Built at last, built at last, thank God almighty, I’ll be built at last… According to Popular Science:
Developers at the Max Planck Institute for Informatics in Saarbrücken, Germany compiled 3D scans of 120 men and women of varying sizes, merging them into a single model that can be morphed to any shape and overlaid atop original footage.
The software, called MovieReshape, builds on existing programs that track an actor’s silhouette through a scene, mapping the body into a morphable model. Using the compiled 3D scans, the program can create realistic-looking and moving body parts to the programmer’s specifications.
At NVIDIA’s technology conference this week, Adobe researcher Todor Georgiev demonstrated GPU-accelerated processing of plenoptic images. As Engadget puts it, “Basically, a plenoptic lens is composed of a litany of tiny ‘sub-lenses,’ which allow those precious photons you’re capturing to be recorded from multiple perspectives.” Plenoptic image capture could open the door to easier object matting/removal (as the scene can be segmented by depth), variable perspective after capture, and more.
This brief demo takes a little while to get going, but I still think it’s interesting enough to share.
Jim McCann is a graphics researcher (you might remember his interesting work with gradient-domain painting), and I’m happy to say he’s joining the Adobe advanced technology staff. He has some ideas about dealing with the limitations of traditional graphical layering models (as seen in Photoshop, After Effects, Flash, etc.):
For more videos & papers on the subject, check out the project page. [Via Jerry Harris]