First, sometimes we’re unbundling not just components but apps, and especially pieces of apps. We take an input or an output from an app on a phone and move it to a new context. So where a GoPro is an alternative to the smartphone camera, an Amazon Echo takes a piece of the Amazon app and puts it next to you as you do the laundry. In doing so, it changes the context, but it also changes the friction. You could put down the laundry, find your phone, tap on the Amazon app and search for Tide, but then you’re doing the computer’s work for it – you’re going through a bunch of intermediate steps that have nothing to do with your need. Using Alexa, you effectively have a deep link directly to the task you want, with none of the friction or busywork of getting there.
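To make that ‘deep link’ concrete: a voice front-end is, in effect, a dispatch table from utterance straight to task, with no navigation in between. A minimal sketch in Python (the intent phrases and the reorder function here are hypothetical stand-ins, not any real Alexa or Google API):

```python
# Hypothetical sketch: a voice assistant collapses app navigation into a
# single utterance-to-task hop. Nothing here is a real assistant API.

def reorder(product: str) -> str:
    # Stand-in for the retailer's ordering back-end.
    return f"Ordered {product}."

# The 'deep link': each recognised phrase maps directly to a task,
# skipping the find-phone / open-app / search steps entirely.
INTENTS = {
    "buy tide": lambda: reorder("Tide"),
    "reorder soap powder": lambda: reorder("soap powder"),
}

def handle_utterance(utterance: str) -> str:
    action = INTENTS.get(utterance.strip().lower())
    if action is None:
        # The flip side, discussed below: you can only ask certain
        # things, and there's no menu telling you which.
        return "Sorry, I can't do that."
    return action()

print(handle_utterance("Buy Tide"))  # -> Ordered Tide.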
Next, and again removing friction, we’re removing or changing how we use power switches, buttons and batteries. You don’t turn an Echo or Google Home on or off, nor AirPods, a Chromecast or an Apple Watch. Most of these devices don’t have a power switch, and if they do you don’t normally use it. You don’t have to do anything to wake them up. They’re always just there, present and waiting for you. You say ‘Hey Google’, or you look at your Apple Watch, or you put the AirPods in your ears, and that’s it. You don’t have to tell them you want them. (Part of this is ‘ambient computing’, but that doesn’t capture a watch or earphones very well.)
Meanwhile, charging, for those devices that do have batteries, feels quite different. We go from devices with big batteries that last hours or at best a day and take a meaningful amount of time to charge, to devices with very small batteries that charge very quickly and last a long time – days or weeks. The ratio of use time to charging time is different. Even the Apple Watch, mocked as ‘a watch that needs to be charged!’, is now good for two days of normal use, which in practice means that, presuming you take it off at night, you never think about the battery at all. Again, this is all about friction, or perhaps mental load. You don’t have to think about cables and power management and switches and starting up – you don’t have to do the routine of managing your computer. (This is also some of the point of using an iPad instead of a laptop.)
A nicely polarising example of this is Apple’s AirPods, where the friction is being moved rather than removed. You can complain that you have to charge your headphones, but you can also say that instead of plugging them in every time you listen (and muttering swearwords to yourself as you untangle the cable), you can just put them in your ears, and with 30 hours of battery life between the case and the pods themselves, you have a week or two of use. You fiddle with a cable and plug them in twice a month instead of every single time you use them. Apple hopes that’s less friction – we’ll see, but it’s certainly different.
A common thread linking all these little devices is this attempt to get rid of management, or friction, or, one could say, clerical work. That links the Apple Watch, Chromecast, Echo and Home, Snapchat’s Spectacles, AirPods and even perhaps the Apple Pencil. They try to reduce the clerical work that a computer or digital device or service makes you do before you can use it – charging, turning on, restarting, waking up, plugging in, choosing an app and so on. A smartphone interface reduces the management you do within the software (file management, settings and so on), but these devices change how much you have to manage the hardware itself. There’s a shift towards direct manipulation and interaction – less abstraction of buttons between you and the thing you want. They don’t ask you questions that only matter to the computer (‘do you want me to wake up now? Am I charged enough?’). The device is transparent to the task.
Of course, questions can be friction, but they’re also choice. So, if an Echo is not just a microphone but an end-point for a cloud service, it’s an end-point only for Amazon’s cloud service, just as a Google Home is only an end-point for Google’s (and if I tell Alexa to ‘buy soap powder’, what brand does it pick, and why?). Where a GoPro is just a camera, Snapchat’s Spectacles are an end-point for Snapchat, and only for Snapchat. As platforms, Alexa or Google Home look a little like a feature-phone with a carrier deck, or a cable box – a sealed, subsidised device with centrally controlled services (Amazon probably wants to give Prime customers Echoes for free, or almost). Your choice of voice assistant is made when you choose to buy an Echo or a Home, and not afterwards (assuming you don’t buy both, and assuming they don’t argue with each other in your kitchen).
That means this is about reducing friction, yes, but it’s also about the reach of cloud and web service companies: how they think about a world in which the PC web is increasingly left behind, the smartphone OS is the platform and that platform is often controlled by their competitors, and how else they can build services beyond fitting into a smartphone API model that someone else defined. There’s an element of push from big companies with a strategic agenda, at least as much as there is consumer pull. Google Home is an end-point for the Google Assistant, but so is the Allo messaging app, an Android watch or indeed an Android phone. Facebook hasn’t tried to make a device so far (beyond Oculus, which is a very different conversation, and a feature-phone partnership a long time ago), but like Google it has been circling around what the right run-time or touch-point might be beyond apps, most recently with the Messenger Bot platform.
The final thing to think about here is how many of these devices are driven by some form of AI. The obvious manifestation of that is in voice assistants, which don’t need a UI beyond a microphone and speaker and so can, in theory, be completely ‘transparent’. But since we do not in fact have HAL 9000 – we do not have general, human-level AI – voice assistants can sometimes feel more like IVRs, or a command line: you can only ask certain things, but there’s no indication of what. The mental load could be higher rather than lower. This was where Apple went wrong with Siri – it led people to think that Siri could answer anything, when it couldn’t. Conversely, part of Amazon’s success with Alexa, I think, is in communicating how narrow the scope really is. Hence the paradox: voice looks like the ultimate unlimited, general-purpose UI, but it actually only works if you can narrow the domain.

Of course, this will get better, but one shouldn’t think of sound and voice as the only AI-based UI – we really haven’t tried to see how such an appliance-type model could apply to images, for example. That gets especially interesting when one considers that, say, a face-recognition engine (or a voice/language engine) could be embedded in a small and very cheap device, with the data itself never leaving the device. So an alarm sensor could be a people sensor instead of an IR sensor, one that just sends out a binary ‘yes/no, there are people’ signal. It might be battery-powered and last for years.
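As a sketch of that last idea, assuming a small on-device model: the recognition runs locally, each frame is discarded after it’s looked at, and the only thing that ever leaves the device is one bit. Everything here (capture_frame, detect_people, send_bit) is a hypothetical stand-in, not any real product’s API:

```python
import random
import time

def capture_frame() -> bytes:
    # Stand-in for the on-board camera.
    return b""

def detect_people(frame: bytes) -> bool:
    # Stand-in for a small, locally embedded detection model. The frame
    # is processed here and thrown away; it never leaves the device.
    return random.random() < 0.1  # fake detection, for the sketch

def send_bit(people_present: bool) -> None:
    # The only output: a binary 'yes/no there are people' signal.
    print("people" if people_present else "no people")

def run_sensor(poll_seconds: float = 5.0) -> None:
    last_state = None
    while True:
        state = detect_people(capture_frame())
        if state != last_state:  # only wake the radio on a change,
            send_bit(state)      # which is part of what would make
            last_state = state   # years of battery life plausible
        time.sleep(poll_seconds)

if __name__ == "__main__":
    run_sensor(poll_seconds=1.0)
```

Sending only on state changes, rather than streaming video to a server, is what would let the radio (usually the biggest power drain) sleep almost all of the time.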
Of course, Amazon already sells a small, battery-powered sensor that sends only a very simple signal – the Amazon Dash button. Is it easier to put an Echo in your laundry room, or a Dash? There’s a neat contrast here – these devices are either very smart or very dumb. They represent either the cutting edge of AI research (perhaps locally, perhaps as the end-point to the cloud), or the simplest device possible, and sometimes both at the same time, and both getting you more Tide.