Wow.
If you were in my shoes, how would you leverage this tech within Google Photos? Knowing that the system can automatically back up your lifetime’s worth of moments, then automatically synthesize things like movies & stories, what would you have it create on your behalf?
PetaPixel notes,
The system is far more intelligent than a simple tagging system. It not only picks up on the details, like a color or object, but also understands the scene in context. In other words: it can not only understand that a photo has ‘snow’ and ‘trees’ in it, but the program could tell you that “the snow is falling in front of the line of trees.”
The program automatically captioned this: “Two pizzas sitting on top of a stove top oven”


Coming from a somewhat forensic background, I’d love to see emergency capabilities. People always forget details when accidents or other emergencies happen. A special mode that can be enabled quickly, with enhanced tagging/recognition, would be great for those situations. For example, when you snap a pic of an auto accident, it could prompt you with other shots you might need, such as street signs, the surrounding area, driver’s licenses, license plates, witnesses, etc.
Some legal implications, to be sure, but there are lots of possibilities here.
Alternatively, I’d love to use this for basic tag development in a photoshoot. Let it generate some tags based on a template from one image, then have it sweep those across the rest of the shoot. Let’s say you’re wandering in a given area, but the subjects keep changing, something we all do. Refining tags is always a pain, but you could just pick a small handful of images with representative elements you care about, ID those (such as people’s names, key background structures, characteristics like major color, etc.), and let it spend more processor time there.
A more advanced version of that would be phenomenal for building a stock library. Something I’ve wished for in Bridge is the ability to localize tags on an image. That is, on a given image isolate a region and apply the tag there. When you search on that specialized tag, you only get the selected region. Think about building up a library of stock elements such as hair, eyes, fabric, etc. without having to build massive swatch libraries.
I would like to train it to recognize specific criteria that I find important. This would be in addition to the Google algorithms.
Recognizing brand names and logos could be very helpful to marketers. Face recognition is of course helpful in many ways, especially for personal photos. If there were a way to characterize the overall mood, environment, or type of situation (playful, industrial, outdoor, retail, suburban streets, etc.), that could help in many ways as well.