As I’ve noted previously, this essay from Slack founder Stewart Butterfield is a banger. You should read the whole thing if you haven’t—or re-read it if you have—and care about building great products. In my new role exploring the crazy, sometimes scary world of AI-first creativity tools, I find myself meditating on this line:
Who Do We Want Our Customers to Become?… We want them to become relaxed, productive workers… masters of their own information and not slaves… who communicate purposively.
I want customers to be fearless explorers—to F Around & Find Out, in the spirit of Walt Whitman:
Yes, this is way outside Adobe’s comfort zone—but I didn’t come back here to be comfortable. Game on.
Although it’s just one piece of a large puzzle, the Content Authenticity Initiative is working to help toolmakers add content credentials that help establish the origin of digital media & disclose what edits have been made to it.
If you make imaging-related tools, check out this in-depth workshop exploring Adobe’s three open-source products for adding CAI support:
I love seeing how Anthony Schmidt, a 13yo photographer with autism, treats his neuroatypicality & resulting hyperfocus as a blessing. It’s a point I try to gently impress upon my own obsessive son about our unusual brains. Check out Anthony’s story & his pretty damn impressive model-car photography!
I’ve gathered links to some of the topics we discussed:
Don’t Give Your Users Shit Work. Seriously. But knowing just where to draw the line between objectively wasteful crap (e.g. tedious file format conversion) and possibly welcome labor (e.g. laborious but meditative etching) isn’t always easy. What happens when you skip the proverbial 10,000 hours of practice required to master a craft? What happens when everyone in the gym is now using a mech suit that lifts 10,000 lbs.?
“Vemödalen: The Fear That Everything Has Already Been Done,” is demonstrated with painful hilarity via accounts like Insta Repeat. (And to make it meta, there’s my repetition of the term.) “So we beat on, boats against the current, borne back ceaselessly into the past…” Or as Marshawn Lynch might describe running through one’s face, “Over & over, and over & over & over…”
The disruption always makes me think of The Onion’s classic “Dolphins Evolve Opposable Thumbs”: “Holy f*ck, that’s it for us monkeys.” My new friend August replied with the armed dolphin below. 💪👀
A group of thoughtful creators recently mused on “What AI art means for human artists.” Like me, many of them likened this revolution to the arrival of photography in the 19th century. It immediately devalued much of what artists had labored for years to master—yet in doing so it freed them up to interpret the world more freely (think Impressionism, Cubism, etc.).
Content-Aware Fill was born from the amazing PatchMatch technology (see video). We got it into Photoshop by stripping it down to just one piece (inpainting), and I foresee similar streamlined applications of the many things DALL•E-type tech can do (layout creation, style transfer, and more).
Longtime generative artist Mario Klingemann used GPT-3 to coin a name for the craft of writing prompts: “Promptomancy.” I wonder how long these incantations & koans will remain central, and how quickly we’ll supplement or even supplant them with visual affordances (presets, sliders, grids, etc.).
O.C.-actor-turned-author Ben McKenzie wrote a book on crypto that promises to be sharp & entertaining, based on the interviews with him I’ve heard.
The same edit controls that you already use to make your photography shine can now be used with your videos as well! Not only can you use Lightroom’s editing capabilities to make your video clips look their best, you can also copy and paste edit settings between photos and videos, allowing you to achieve a consistent aesthetic across both your photos and videos. Presets, including Premium Presets and Lightroom’s AI-powered Recommended Presets, can also be used with videos. Lightroom also allows you to trim off the beginning or end of a video clip to highlight the part of the video that is most important.
And here’s a fun detail:
Video: Creative — to go along with Lightroom’s fantastic new video features, these stylish and creative presets, created by Stu Maschwitz, are specially optimized to work well with videos.
I’ll share more details as I see tutorials, etc. arrive.
Obviously I’m almost criminally obsessed with DALL•E et al. (sorry if you wanted to see my normal filler here 😌). Here’s an accessible overview of how we got here & how it all works:
The vid below gathers a lot of emerging thoughts from sharp folks like my teammate Ryan Murdock & my friend Mario Klingemann. “Maybe the currency is ideas [vs. execution]. This is a future where everyone is an art director,” says Rob Sheridan. Check it out:
The technology’s ability not only to synthesize new content, but to match it to context, blows my mind. Check out this thread showing the results of filling in the gap in a simple cat drawing via various prompts. Some of my favorites are below:
While we’re all still getting our heads around the 2D image-generation magic of DALL•E, Imagen, MidJourney, and more, Google researchers are stepping into a new dimension as well with Dream Fields—synthesizing geometry simply from words.
“Not a single keyframe of animation was set in the making of the title, created by tweaking and bending the alignment knobs of a vintage TV,” writes Anthony Vitagliano. “Instead, I shot it using a vintage Montgomery Ward ‘Airline’ Portable Television, an iPhone, and a patchwork of cables and converters in my basement.”
Check out the results:
See Anthony’s site for high-res captures of the frames.
I’ve long considered augmented reality apps to be “realtime Photoshop”—or perhaps more precisely, “realtime After Effects.” I think that’s true & wonderful, but most consumer AR tends to be ultra-confined filters that produce ~1 outcome well.
Walking around San Francisco today, it struck me that DALL•E & other emerging generative-art tools could, if made available via a simple mobile UI, offer a new kind of (almost) realtime Photoshop, with radically greater creative flexibility.
Here I captured a nearby sculpture, dropped out the background in Photoshop, uploaded it to DALL•E, and requested “a low-polygon metallic tree surrounded by big dancing robots and small dancing robots.” I like the results!
Hard on the heels of OpenAI revealing DALL•E 2 last month, Google has announced Imagen, promising “unprecedented photorealism × deep level of language understanding.” Unlike DALL•E, it’s not yet available via a demo, but the sample images (below) are impressive.
I’m slightly amused to see Google flexing on DALL•E by highlighting Imagen’s strengths in figuring out spatial arrangements & coherent text (places where DALL•E sometimes currently struggles). The site claims that human evaluators rate Imagen output more highly than what comes from competitors (e.g. MidJourney).
I couldn’t be more excited about these developments—most particularly to figure out how such systems can enable amazing things in concert with Adobe tools & users.
With reporting from 250 locations around the world, AP is a key addition to the CAI’s mission to help consumers everywhere better understand the provenance and attribution of images and video.
“We are pleased to join the CAI in its efforts to combat misinformation and disinformation around photojournalism,” said AP Director of Photography David Ake. “AP has worked to advance factual reporting for over 175 years. Teaming up to help ensure the authenticity of images aligns with that mission.”
We are building some rad stuff (seriously, I wish I could show you already) and would love to have you join us:
Some key responsibilities:
Architect efficient and reusable full-stack systems that can support several different deep learned models
Build simple, robust, and scalable platforms used by many external users
Work closely with UX designers, product managers, and machine learning engineers to develop compelling experiences
Take a project from scoping requirements through the actual launch
Following up on yesterday’s post about Google’s new Geospatial API: developers can now embed a live view featuring a camera feed + augmentations, and companies like Bird are wasting no time in putting it to use. TNW writes,
When parking a scooter, the app prompts a rider to quickly scan the QR code on the vehicle and its surrounding area using their smartphone camera… [T]his results in precise, centimeter-level geolocation that enables the system to detect and prevent improper parking with extreme accuracy — all while helping monitor user behavior.
Out of the over 200 cities that Lime serves, its VPS is live now in six: London, Paris, Tel Aviv, Madrid, San Diego and Bordeaux. Similar to Bird, Lime’s pilot involves testing the tech with a portion of riders. The company said results from its pilots have been promising, with those who used the new tool seeing a 26% decrease in parking errors compared to riders who didn’t have the tool enabled.
My friend Bilawal & I collaborated on AR at Google, including our efforts to build a super compact 3D engine for driving spatial annotation & navigation. We’d often talk excitedly about location-based AR experiences, especially the Landmarker functionality arriving in Snapchat. All the while he’s been busy pushing the limits of photogrammetry (including putting me in space!) to scan 3D objects.
Now I’m delighted to see him & his team unveiling the Geospatial API (see blog post, docs, and code), which enables cross-platform (iOS, Android) deployment of experiences that present both close-up & far-off augmentations. Here’s the 1-minute sizzle reel:
For a closer look, check out this interesting deep dive into what it offers & how it works:
Hmm—dunno whether I’d prefer carrying this little dude over just pocketing a battery pack or two—but I dig the idea & message:
Once set up on its tripod, the 3-pound, 40-watt device automatically rotates towards the wind and starts charging its 5V, 12,000 mAh battery. (Alternatively it can charge your device directly via USB.) The company says that in peak conditions, the Shine Turbine can generate enough juice to charge a smartphone in just 20 minutes.
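A quick back-of-envelope check on those numbers, sketched in Python. The assumptions are mine, not the company's: the 12,000 mAh rating is taken at the quoted 5V, the turbine delivers its full 40 watts the whole time, and conversion losses are ignored, so real-world figures would be worse. The function names are made up for illustration.

```python
def battery_energy_wh(volts, milliamp_hours):
    """Energy stored in a battery, in watt-hours (assumes the mAh rating is at `volts`)."""
    return volts * milliamp_hours / 1000


def hours_to_fill(energy_wh, watts):
    """Ideal time to store `energy_wh` at a constant `watts`, ignoring losses."""
    return energy_wh / watts


def energy_delivered_wh(watts, minutes):
    """Energy delivered at a constant `watts` over `minutes`, in watt-hours."""
    return watts * minutes / 60


pack_wh = battery_energy_wh(5, 12_000)
print(f"Internal battery: {pack_wh:.0f} Wh")
print(f"Ideal time to fill it at 40 W: {hours_to_fill(pack_wh, 40):.1f} h")
print(f"Energy from 20 min at 40 W: {energy_delivered_wh(40, 20):.1f} Wh")
```

Around 13 watt-hours in 20 minutes is in the neighborhood of a typical smartphone battery, so the claim looks plausible for peak conditions, at least on paper.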
Heh—I got a kick out of seeing how AI would go about hallucinating its idea of what my flamed-out ’84 Volvo wagon looked like. See below for a comparison. And in retrospect, how did I not adorn mine with a tail light made from a traffic cone (or is it giant candy corn?) and “VOOFO NACK”? 😅
Not yet having access to this system [taps mic impatiently], I’m just checking out its simple but effective interface from afar. Here’s how artists can designate specific regions in order to repopulate them:
Among the Google teams working on augmented reality, there was a low-key religious war about the importance of “metric scale” (i.e. matching real-world proportions 1:1). The ARCore team believed it was essential (no surprise, given their particular tech stack), while my team (Research) believed that simply placing things in the world with a best guess as to size, then letting users adjust an object if needed, was often the better path.
I thought of this upon seeing StreetEasy’s new AR tech for apartment-hunting in NYC. At the moment it lets you scan a building to see its inventory. That’s very cool, but my mind jumped to the idea of seeing 3D representations of actual apartments (something the company already offers, albeit not in AR), and I’m amused to think of my old Manhattan place represented in AR: drawing it as a tiny box at one’s feet would be metric scale. 😅 My God that place sucked. Anyway, we’ll see how useful this tech proves & where it can go from here.
“A StreetEasy Instagram poll found that 95% of people have walked past an apartment building and wondered if it has an available unit that meets their criteria. At the same time, 77% have had trouble identifying a building’s address to search for later.”
I got a rude awakening a couple of years ago while working in Google’s AR group: the kind of displays that could fit into “glasses that look like glasses” (i.e. not Glass-style unicorn protuberances) had really tiny fields of view, crummy resolution, short battery life, and more. I knew that my efforts to enable cloud-raytraced Volvos & Stormtroopers & whatnot wouldn’t last long in a world that prioritized Asteroids-quality vector graphics on a display the size of a 3″x5″ index card held at arm’s length.
Having been out of that world for a year+ now, I have no inside info on how Google’s hardware efforts have been evolving, but I’m glad to see that they’re making a serious (billion-dollar+) investment in buying more compelling display tech. Per The Verge,
According to Raxium’s website, a Super AMOLED screen on your phone has a pixel pitch (the distance between the center of one pixel, and the center of another pixel next to it) of about 50 microns, while its MicroLED could manage around 3.5 microns. It also boasts of “unprecedented efficiency” that’s more than five times better than any world record.
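To put those pitch numbers in more familiar terms, here’s a rough conversion to pixels per inch. This is just arithmetic on the two figures quoted above; it assumes a uniform square pixel grid and says nothing about how real panels are actually laid out.

```python
MICRONS_PER_INCH = 25_400  # 1 inch = 25.4 mm = 25,400 microns


def pixels_per_inch(pitch_microns):
    """Linear pixel density implied by a center-to-center pixel pitch."""
    return MICRONS_PER_INCH / pitch_microns


amoled_ppi = pixels_per_inch(50)     # pitch quoted for a Super AMOLED phone screen
microled_ppi = pixels_per_inch(3.5)  # pitch quoted for Raxium's MicroLED

linear_gain = microled_ppi / amoled_ppi  # how many more pixels per unit length
area_gain = linear_gain ** 2             # how many more pixels per unit area

print(f"Super AMOLED: ~{amoled_ppi:.0f} ppi")
print(f"MicroLED:     ~{microled_ppi:.0f} ppi")
print(f"Linear density gain: ~{linear_gain:.1f}x, per-area gain: ~{area_gain:.0f}x")
```

By these figures that’s roughly 14 times as many pixels along each axis, or about 200 times as many in the same area, which is the kind of density a tiny near-eye display would need.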
How does any of this compare to what we’ll see out of Apple, Meta, Snap, etc.? I have no idea, but at least parts of the future promise to be fun.
Each week we’ll cover a different aspect of machine learning. A short lecture covering theories and practices will be followed by demos using open source web tools and a web-browser tool called Google Colab. In the last 3 weeks of class you’ll be given the chance to create your own project using the skills you’ve learned. Topics will include selecting the right model for your use case, gathering and manipulating datasets, and connecting your models to data sources such as audio, text, or numerical data. We’ll also talk a little ethics, because we can’t teach machine learning without a little ethics.
I really enjoyed this conversation, touching as it does on my latest fascination (AI-generated art via DALL•E) and myriad other topics. In fact, I plan to listen to it again, hopefully this time near a surface on which to jot down & share some of the most resonant observations. Meanwhile, I think you’ll find it thoughtful & stimulating.
In this episode of the podcast, Sam Harris speaks with Eric Schmidt about the ways artificial intelligence is shifting the foundations of human knowledge and posing questions of existential risk.
Last year I took my then-11yo son Henry (aka my astromech droid) on a 2000-mile “Miodyssey” down Route 66 in my dad’s vintage Miata. It was a great way to see the country (see more pics & posts than you might ever want), and despite the tight quarters we managed not to kill one another—or to get slain by Anton Chigurh in an especially murdery Texas town (but that’s another story!).
Well, they do call themselves a camera company… ¯\_(ツ)_/¯ This little contraption looks incredibly lightweight (pocketable, even) and easy to use. Visual quality (particularly stabilization) seems a little borderline, but I dig its person-centric nature, including tracking & AR effects (segmentation, cloning, etc.). Check out a great review—including a man-machine “romantic montage” (!):
Adobe Super Resolution technology is the best solution I’ve yet found for increasing the resolution of digital images. It doubles the linear resolution of your file, quadrupling the total pixel count while preserving fine detail. Super Resolution is available in both Adobe Camera Raw (ACR) and Lightroom and is accessed via the Enhance command. And because it’s built-in, it’s free for subscribers to the Creative Cloud Photography Plan.
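The doubling-vs-quadrupling arithmetic is easy to see with a concrete frame size. Here’s a minimal sketch; the 6000×4000 (24-megapixel) input is just an example I picked, and the function name is made up for illustration. It only computes output dimensions and doesn’t touch any Adobe API.

```python
def super_resolution_size(width, height, linear_factor=2):
    """Return (new_width, new_height, pixel_count_ratio) after scaling each side."""
    new_width = width * linear_factor
    new_height = height * linear_factor
    ratio = (new_width * new_height) / (width * height)
    return new_width, new_height, ratio


w, h, ratio = super_resolution_size(6000, 4000)
print(f"{w} x {h} pixels, {w * h / 1e6:.0f} MP, {ratio:.0f}x the original pixel count")
```

Doubling each side takes a 24 MP frame to 96 MP, because the pixel count scales with the square of the linear factor.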
“In 2019, we started with templates of 30 beloved sites around the world which creators could build upon called Landmarkers… Today, we’re launching Custom Landmarkers in Lens Studio, letting creators anchor Lenses to local places they care about to tell richer stories about their communities through AR.”
At its Lens Fest event, the company announced that 250,000 lens creators from more than 200 countries have made 2.5 million lenses that have been viewed more than 3.5 trillion times. Meanwhile, on Snapchat’s TikTok clone Spotlight, the app awarded 12,000 creators a total of $250 million for their posts. The company says that more than 65% of Spotlight submissions use one of Snapchat’s creative tools or lenses.