19 thoughts on “Feedback, please: Voice-driven photo editing”

  1. Well, I waited for this since 1982…
    Enhance 224 to 176. Enhance, stop. Move in, stop. Pull out, track right, stop. Center in, pull back. Stop. Track 45 right. Stop. ;))

  2. I love it – in concept! It reminds me of Harrison Ford’s voice interface in Blade Runner.
    However, I think there are a few things that would limit its usefulness. First, background noise – how sensitive is the voice recognition? And second, how many new gestures and commands would the user need to learn? I think having default gestures but allowing the user to override with something custom that feels more natural to them might be really wonderful.
    Last, I wonder how modifiable the automatic features are – if you outline the shirt and for some reason it grabs some of the background too, are you able to exclude that or does the program always know best? Especially important with touch interfaces, where the finger covers up much of what you’re trying to select, so accuracy can be low.
    I think the voice-based interface is wonderful for people who aren’t expert users, and who know how to describe something, but don’t know what combination of menus or keystrokes that functionality might be behind. That’s the thing that excites me the most about this.
    Fun! I’d love to play with it!

  3. Quickly: Will there be a “Zoom and enhance” Easter egg?
    Seems interesting, but also expects users to know the syntax of their editing steps (push mid tones).

  4. Good idea – but can it cope with an accent? I’ve tried for years to train voice activated stuff to my Liverpool accent.

  5. Interesting technology, but appears to be a solution to a problem that doesn’t exist.
    Why would you want users to have to learn a whole new list of voice commands to talk to a computer so that it could do what you could also do with a mouse or tool palette? What happens when there is a loud ambient surrounding (or quiet area)?
    May be useful for the disabled community, but probably more trouble than it is worth for anyone able to work a mouse on a computer or other hand gestures in the case of a tablet.

    1. The justification is made right at the beginning of the video: “Everyone takes pictures, but photo-editing can be hard” (as a statement over some video of Photoshop …). Which then, for me, makes the decision to work with this voice-commanding project in a completely new app very strange. Why not work with voice commands for Photoshop itself (or Elements maybe)? As a supplement, as you write, “for anyone able to work a mouse on a computer or other hand gestures in the case of a tablet”.

  6. I guess I’m the only one here who thinks it’s kinda annoying.
    I’m someone who would rather hang up on a support call to anywhere if I have to talk to the robot rather than press the touch tone keys on a phone.
    If I work on pictures after the kids’ bedtime, will I get in trouble for gabbing at the iPad?
    Me: “No no! I said Lighter, not Brighter!”
    Kid: “Mommy, has Daddy gone crazy?”
    Seems like a fun thing for the devs though.

  7. Kinda simplistic, just like the Wright flyer and PS V1.
    In the short run, it seems to me it could provide an increased level of control for users with no PS or similar experience. It would probably suffer the same challenges as all voice recognition tools, which improve with every year.
    In the long run, the ability to use both hands and voice control could be an incredible productivity tool, particularly if the voice recognition could be tailored by the user to understand his or her custom commands. (E.g. police dogs that obey commands only used by their trainers or the Blue Angels in-flight signals that are half word, half grunt).
    I draft documents using Dragon Voice Recognition. Over time, I’ve developed a “3 handed” workflow. I move between windows/monitors with my mouse/tablet; use my keyboard for shortcuts; and I have two hands free while I dictate, which allows me to “type” without interruption while I flip through paper notes and other hard copy source material.

  8. Very cool concept. I’m wondering how the accuracy is ultimately going to compare with the use of a mouse, keyboard, and/or pen tablet. I can see myself becoming frustrated by saying, “Brighten Sara and John” and it only brightens their faces when I want the whole body to be brighter. Or “Change the color of the skirt.” and the system misunderstands me and says, “OK, I’ll change the color of the shirt.”
    Voice recognition is hard enough by itself (anyone used a Nuance product lately?). Adding commands to edit photos, which currently require complex keyboard shortcuts and gestures (mouse or finger-based), raises the difficulty that much more.

  9. How do you have Photoshop for both pros and amateurs? I can’t see it for pros except maybe in a high flow shop for quick fixes. It’s like the difference between some of the Nik or Snapseed methods and Photoshop. They’re great for quick and dirty but if you want precise masks and specifically controlled effects, you can’t get there from here.
    Do you just keep adding everything to Photoshop or try to differentiate? I’m glad it’s not my problem.
    Probably the answer is yes, please all the people, and if the cost goes up? Well, the pros have to have it and they will pay.

  10. This sounds like a fun feature to play with on a mobile device, but I wonder if this is really useful. Years from now will wedding photographers use a voice controlled version of LR to quickly edit their photos?
