“VOGUE: Try-On by StyleGAN,” from my former Google colleague Ira Kemelmacher-Shlizerman & her team, promises to synthesize photorealistic clothing & automatically apply it to a range of body shapes (leveraging the same StyleGAN foundation that my new teammates are using to build images via text):
Artbreeder is a trippy project that lets you “simply keep selecting the most interesting image to discover totally new images. Infinitely new random ‘children’ are made from each image. Artbreeder turns the simple act of exploration into creativity.” Check out interactive remixing:
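Under the hood, each image on Artbreeder corresponds to a latent vector in a generative model, and "children" are (as I understand it) small random mutations of the parent's vector. Here's a toy sketch of that idea; the function names and numbers are mine, not Artbreeder's:

```python
import random

def children(parent, n=4, mutation=0.1, seed=0):
    """Make n 'child' latents by jittering each gene of the parent."""
    rng = random.Random(seed)
    return [[g + rng.gauss(0, mutation) for g in parent] for _ in range(n)]

parent = [0.0, 1.0, -0.5]   # a stand-in latent vector
kids = children(parent)
print(len(kids))            # four slightly mutated latents
```

Pick the most interesting child, make it the new parent, repeat: exploration becomes a guided random walk through the generator's latent space.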
I’ve long loved the weird mechanical purring of those flappy-letter signs one sees (or at least used to see) in train stations & similar venues, but I haven’t felt like throwing down the better part of three grand to own a Vestaboard. Now maker Scott Bezek is working on an open-source project for making such signs at home, combining simple materials and code. In case you’d never peeked inside such a mechanism (and really, why would you have?) and are curious, here’s how they work:
And here, for some reason, are six oddly satisfying minutes of a sign spelling out four-letter words:
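The mechanics above boil down to something simple: each module is a drum of flaps driven by a motor that turns in only one direction, so reaching an "earlier" character means going nearly all the way around. A toy model (the character set and step counts here are made up, not Bezek's):

```python
# A 40-flap drum that can only rotate forward; real modules (like Bezek's)
# find their zero position with a sensor, then count steps from there.
FLAPS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,'"

def steps_to(current: str, target: str, steps_per_flap: int = 50) -> int:
    """Motor steps needed to advance from one flap to another."""
    i, j = FLAPS.index(current), FLAPS.index(target)
    # The drum can't reverse, so going "backward" means a full lap forward.
    flaps_forward = (j - i) % len(FLAPS)
    return flaps_forward * steps_per_flap

print(steps_to("A", "C"))  # two flaps forward
print(steps_to("C", "A"))  # nearly a full revolution
```

That full-lap-to-go-backward quirk is part of why a board of these things purrs so satisfyingly when it updates.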
I remain fascinated by what Snap & Facebook are doing with their respective AR platforms, putting highly programmable camera stacks into the hands of hundreds of millions of consumers & hundreds of thousands of creators. If you have thoughts on the subject & want to nerd out some time, drop me a note.
A few months back I wanted to dive into the engine that’s inside Instagram, and I came across the Spark AR masterclass put together & presented by filter creator Eddy Adams. I found it engaging & informative, even if a bit fast for my aging brain 🙃. If you’re tempted to get your feet wet in this emerging space, I recommend giving it a shot.
“Boys,” I DM’d the lads (because somehow that’s a thing now), “I hope you someday find spouses (speece?) cool enough to send you things like Mom just sent me.” And Crom-willing, they will. 😌 Happy Friday.
I find this emerging space so fascinating. Check out how Toonify.photos (which you can use for free, or at high quality for a very modest fee) can turn one’s image into a cartoon character. It leverages training data based on iconic illustration styles:
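Toonify's creator has described the trick as blending two StyleGAN models: a base model trained on photos and a copy fine-tuned on cartoons, with the cartoon model supplying the coarse (low-resolution) layers that set structure and the photo model supplying the fine layers that keep realistic texture. A toy sketch of that layer-swapping idea (layer names and resolutions are mine, and the "weights" are just labels):

```python
# Map each stand-in layer name to the resolution it operates at.
LAYER_RES = {"conv4": 4, "conv8": 8, "conv16": 16, "conv32": 32,
             "conv64": 64, "conv128": 128}

def blend(base, toon, swap_below=32):
    """Use toon weights for coarse layers, base weights for fine ones."""
    return {layer: (toon[layer] if res <= swap_below else base[layer])
            for layer, res in LAYER_RES.items()}

base = {k: f"photo_{k}" for k in LAYER_RES}
toon = {k: f"toon_{k}" for k in LAYER_RES}
hybrid = blend(base, toon)
print(hybrid["conv8"], hybrid["conv128"])  # cartoon structure, photo texture
```

The result is a generator that lays out a face like a cartoon but renders it with photographic detail, which is exactly the uncanny charm of the output.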
I also chuckled at this illustration from the video above, as it endeavors to show how the two networks (the “adversaries” in “Generative Adversarial Network”) attempt, respectively, to fool the other with generated output & to avoid being fooled. Check out more details in the accompanying article.
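That tug-of-war can be captured in two tiny loss functions, shown here with made-up probabilities standing in for the networks themselves (this is my illustration of the standard GAN objective, not code from the video):

```python
import math

# D scores how "real" an image looks (a probability between 0 and 1).
# D trains to score real images high and fakes low; G trains to fool D.

def d_loss(d_real: float, d_fake: float) -> float:
    # Discriminator wants d_real -> 1 and d_fake -> 0.
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def g_loss(d_fake: float) -> float:
    # Generator wants the discriminator fooled: d_fake -> 1.
    return -math.log(d_fake)

# An unfooled discriminator punishes the generator; a fooled one rewards it.
print(g_loss(0.05) > g_loss(0.95))
```

Training alternates between the two objectives until the generator's fakes are good enough that the discriminator can't do better than guessing.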
It’s really cool to see the Goog leveraging its immense corpus of not just 2D or 3D, but actually 4D (time-based), data to depict our planetary home.
In the biggest update to Google Earth since 2017, you can now see our planet in an entirely new dimension — time. With Timelapse in Google Earth, 24 million satellite photos from the past 37 years have been compiled into an interactive 4D experience. Now anyone can watch time unfold and witness nearly four decades of planetary change. […]
Having a train-obsessed 11yo son who enjoys exclaiming things like, “Hey, that’s Cooper Black!,” this tour of railroad typography is 💯 up our family’s alley. (Tangential, but as it’s already on my clipboard: we’re keeping a running album of our train-related explorations along Route 66, and Henry’s been adding things like an atomic train tour to his YouTube channel.)
From the typesetting video description:
Ever since the first train services, a wide variety of guides have helped passengers understand the railways; supplementing the text with timetables, maps, views, and diagrams. Typographically speaking, the linear nature of railways and the modular nature of trains meant that successful diagrams could be designed economically by using typographic sorts. Various typographic trains and railways from the 1830s to present-day will be evaluated in terms of data visualization, decoration, and the economics of reproduction. Bringing things up to date, techniques for typesetting emoji and CSS trains are explored, and a railway-inspired layout model will be proposed for wider application in the typography of data visualization and ornamentation.
“Imagine what you can create. Create what you can imagine.”
So said the first Adobe video I ever saw, back in 1993 when I’d just started college & attended the Notre Dame Mad Macs user group. I saw it just that once, 20+ years ago, but the memory is vivid: an unfolding hand with an eye in the palm encircled by the words “Imagine what you can create. Create what you can imagine.” I was instantly hooked.
I got to mention this memory to Adobe founders Chuck Geschke & John Warnock at a dinner some 15 years later. Over that whole time—through my college, Web agency, and ultimately Adobe roles—the company they started had fully bent the arc of my career, as it continues to do today. I wish I’d had the chance to talk more with Chuck, who passed away on Friday. Outside of presenting to him & John at occasional board meetings, however, that’s all the time we had. Still, I’m glad I had the chance to share that one core memory.
I’ll always envy my wife Margot for getting to spend what she says was a terrific afternoon with him & various Adobe women leaders a few years back:
“Everyone sweeps the floor around here”
I can’t tell you how many times I’ve cited this story (source) from Adobe’s early history, as it’s such a beautiful distillation of the key cultural duality that Chuck & John instilled from the start:
The hands-on nature of the startup was communicated to everyone the company brought onboard. For years, Warnock and Geschke hand-delivered a bottle of champagne or cognac and a dozen roses to a new hire’s house. The employee arrived at work to find hammer, ruler, and screwdriver on a desk, which were to be used for hanging up shelves, pictures, and so on.
“From the start we wanted them to have the mentality that everyone sweeps the floor around here,” says Geschke, adding that while the hand tools may be gone, the ethic persists today.
I have one very special moment that meant a tremendous amount to me. Both my grandfather and my father were letterpress photoengravers — the people who made color plates to go into high-quality, high-volume publications such as Time magazine and all the other kinds of publishing that was done back then.
As we were trying to take that very mechanical chemical process and convert it into something digital, I would bring home samples of halftones and show them to my father. He’d say, “Hmm, let me look at that with my loupe,” because engravers always had loupes. He’d say, “You know, Charles, that doesn’t look very good.” Now, when my dad said, “Charles,” it was bad news.
About six months later, I brought him home something that I knew was spot on. All the rosettes were perfect. It was a gorgeous halftone. I showed it to my dad and he took his loupe out and he looked at it, and he smiled and said, “Charlie, you finally did it.” And, to me, that was probably one of the biggest high points of the early part of my career here.
And a final word, which I’ll share with my profound thanks:
“An engineer lives to have his idea embodied in a product that impacts the world,” Mr. Geschke said. “I consider myself the luckiest man on Earth.”
Elsewhere I put my pal Seamus (who’s presently sawing logs on the couch next to me) through NVIDIA’s somewhat wacky GANimal prototype app, attempting to mutate him into various breeds—with semi-Brundlefly results. 👀
You say “work with an AI to make art, purely from a text prompt,” I hear “monkey with a revolver”—which reminds me, I should plug “monkey with a revolver” into this system to see what comes out. Meanwhile, example weirdness:
“Same Energy is a visual search engine. You can use it to find beautiful art, photography, decoration ideas, or anything else.” I recommend simply clicking it & exploring a bit, but you can also get a quick taste here (vid not in English, but that doesn’t really matter):
I’m using it to find all kinds of interesting image sets, like this:
The default feeds available on the home page are algorithmically curated: a seed of 5-20 images are selected by hand, then our system builds the feed by scanning millions of images in our index to find good matches for the seed images. You can create feeds in just the same way: save images to create a collection of seed images, then look at the recommended images.
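The seed-then-match recipe they describe amounts to nearest-neighbor search over image embeddings. A minimal sketch, assuming every image has already been embedded as a feature vector (the vectors, names, and scoring rule here are all invented for illustration):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def build_feed(seeds, index, top_k=3):
    """Rank indexed images by their best similarity to any seed image."""
    scored = [(max(cosine(vec, s) for s in seeds), name)
              for name, vec in index.items()]
    return [name for _, name in sorted(scored, reverse=True)[:top_k]]

index = {
    "forest": (0.9, 0.1), "meadow": (0.8, 0.3),
    "circuit": (0.1, 0.9), "skyline": (0.2, 0.8),
}
seeds = [(1.0, 0.0)]  # a hand-picked "green landscape" seed
print(build_feed(seeds, index))
```

Saving images to a collection just grows the seed set, which is why a feed you curate keeps drifting toward whatever "energy" you've been saving.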
“‘Augmented Reality: A Land Of Contrasts.’ In this essay, I will…”
Okay, no, not really, but let me highlight some interesting mixed signals. (It’s worth noting that these are strictly my opinions, not those of any current or past employer.)
Pokémon Go debuted almost exactly 5 years ago, and last year, even amidst a global pandemic that largely immobilized people, it generated its best revenue ever—more than a billion dollars in just the first 10 months of the year, bringing its then-total to more than $4 billion.
Having said that…
In the five years since its launch, what other location-based AR games (or AR games, period) have you seen really take off? Even with triple-A characters & brands, Niantic’s own Harry Potter title made a far smaller splash, and Minecraft Earth (hyped extensively at an Apple keynote event) is being shut down.
When I launched Pokémon Go last year (for the first time in years), I noticed that the only apparent change since launch was that AR now defaults to off. That is, Niantic apparently decided that monster-catching was easier, more fun, and/or less resource-intensive when done in isolation, with no camera overlay.
The gameplay remains extremely rudimentary—no use (at least that I could see) of fancy SLAM tracking, depth processing, etc., despite Niantic having acquired startups to enable just this sort of thing, showing demos three years ago.
Network providers & handset makers really, really want you to want 5G—but I’ve yet to see it prove to be transformative (even for the cloud-rendered streaming AR that my Google team delivered last year). Even when “real” 5G is available beyond a couple of urban areas, it’s hard to imagine a popular title being 5G-exclusive.
So does this mean I think location-based AR games are doomed? Well, no, as I claim zero prognostication-fu here. I didn’t see Pokémon Go coming, despite my roommate in Nepal (who casually mentioned that he’d helped found Google Earth—as one does) describing it ahead of launch; and given the way public interest in the app dropped after launch (see above), I’d never have guessed that it would be generating record revenue now—much less during a pandemic!
So, who knows: maybe Niantic & its numerous partners will figure out how to recapture lightning in a bottle. Here’s a taste of how they expect that to look:
If I had to bet on someone, though, it’d be Snap: they’ve been doing amazing site-specific AR for the last couple of years, and they’ve prototyped collaborative experiences built on the AR engine that hundreds of millions of people use every day; see below. Game on!
On Monday I mentioned my new team’s mind-blowing work to enable image synthesis through typing, and I noted that it builds on NVIDIA’s StyleGAN research. If you’re interested in the latter, check out this two-minute demo of how it enables amazing interactive generation of stylized imagery:
This new project called StyleGAN2, developed by NVIDIA Research, and presented at CVPR 2020, uses transfer learning to produce seemingly infinite numbers of portraits in an infinite variety of painting styles. The work builds on the team’s previously published StyleGAN project. Learn more here.
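The "seemingly infinite" part comes from the fact that StyleGAN renders an image from a latent vector, and interpolating between two latents yields a smooth morph between two portraits. A minimal sketch of that interpolation (the latents here are tiny stand-ins, not vectors from NVIDIA's actual network):

```python
def lerp(z1, z2, t):
    """Linear interpolation between two latent vectors at fraction t."""
    return [a + t * (b - a) for a, b in zip(z1, z2)]

z_portrait_a = [0.0, 1.0, -0.5]  # stand-in latent for one portrait
z_portrait_b = [1.0, -1.0, 0.5]  # stand-in latent for another

# Stepping t from 0 to 1 yields in-between latents, each of which the real
# generator would render as a plausible in-between face.
frames = [lerp(z_portrait_a, z_portrait_b, t / 4) for t in range(5)]
print(frames[0], frames[-1])
```

String enough of those frames together and you get the mesmerizing face-to-face morphs that demos like this one are full of.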
What if instead of pushing pixels, you could simply tell your tools what changes you’d like to see? (Cue Kramer voice: “Why don’t you just tell me the movie…??”) This new StyleCLIP technology (code) builds on NVIDIA’s StyleGAN foundation to enable image editing simply by applying various terms. Check out some examples (“before” images in the top row; “after” below along with editing terms).
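As I understand the paper, one mode of StyleCLIP works by nudging a StyleGAN latent, step by step, to shrink the CLIP distance between the generated image and the text prompt. This toy stand-in keeps that optimization loop but replaces the generator and CLIP with a made-up quadratic score (every name and number here is mine):

```python
def clip_distance(latent, target):
    """Stand-in for 'how far the rendered image is from the prompt'."""
    return sum((l - t) ** 2 for l, t in zip(latent, target))

def edit(latent, target, lr=0.1, steps=100):
    latent = list(latent)
    for _ in range(steps):
        # Gradient of the squared distance, computed analytically here;
        # the real system backpropagates through the generator and CLIP.
        grad = [2 * (l - t) for l, t in zip(latent, target)]
        latent = [l - lr * g for l, g in zip(latent, grad)]
    return latent

start = [0.0, 0.0]    # latent for the "before" face
prompt = [1.0, -0.5]  # stand-in embedding of, say, "curly hair"
edited = edit(start, prompt)
print(clip_distance(edited, prompt) < clip_distance(start, prompt))
```

The magic of the real thing is that "closer to the prompt" in CLIP space translates into semantically sensible pixel changes: hairstyles, expressions, even car body styles.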
Here’s a demo of editing human & animal faces, and even of transforming cars:
By no means have I been around here long enough (five whole days!) to grok everything that’s going on here, but as I come up to speed, I’ll do my best to share what I’m learning. Meanwhile I’d love to hear your thoughts on how we might thoughtfully bring techniques like this to life.
After driving 2,000+ miles down Route 66 and beyond in six days—the last of which also included getting onboarded at Adobe!—I’ve only just begun to breathe & go through the titanic number of photos and videos my son & I captured. I’ll try to share more good stuff soon, but in the meantime you might get a kick (heh) out of this little vid, captured via my Insta360 One X2:
Now one of these days I just need to dust off my After Effects skills enough to nuke the telltale pole shadows. Someday…!
<Old Man Nack voice> In my day, it cost $2,500 to buy the Adobe Font Folio—but Kids These Days™ (and the rest of us) get fonts on demand, right through the air. I enjoyed the type & illustrations in this little promo piece:
I spent my last couple of years at Google working on a 3D & AR engine that could power experiences across Maps, YouTube, Search, and other surfaces. Meanwhile my colleagues have been working on data-gathering that’ll use this system to help people navigate via augmented reality. As TechCrunch writes:
Indoor Live View is the flashiest of these. Google’s existing AR Live View walking directions currently only work outdoors, but thanks to some advances in its technology to recognize where exactly you are (even without a good GPS signal), the company is now able to bring this indoors.
This feature is already live in some malls in the U.S. in Chicago, Long Island, Los Angeles, Newark, San Francisco, San Jose and Seattle, but in the coming months, it’ll come to select airports, malls and transit stations in Tokyo and Zurich as well (just in time for vaccines to arrive and travel to — maybe — rebound). Because Google is able to locate you by comparing the images around you to its database, it can also tell which floor you are on and hence guide you to your gate at the Zurich airport, for example.
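The "comparing the images around you to its database" bit boils down to nearest-neighbor matching of image descriptors against geo-tagged entries, which, unlike GPS, can encode a floor level. A toy sketch of that lookup (the database, descriptors, and place names are all invented):

```python
import math

DATABASE = [
    {"desc": (0.9, 0.1), "place": "Gate A22", "floor": 2},
    {"desc": (0.2, 0.8), "place": "Check-in hall", "floor": 1},
    {"desc": (0.6, 0.6), "place": "Food court", "floor": 2},
]

def locate(query_desc):
    """Return the database entry whose descriptor is nearest the query."""
    return min(DATABASE, key=lambda e: math.dist(e["desc"], query_desc))

# A camera frame whose descriptor looks a lot like the Gate A22 imagery:
hit = locate((0.85, 0.15))
print(hit["place"], hit["floor"])
```

The real system matches against billions of Street View-style images with far richer descriptors, but the shape of the problem (find the best visual match, inherit its location) is the same.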