Next Friday, Apple releases its ‘Vision Pro’, a device that looks very much like every other virtual reality headset created over the last decade – but costs a lot more. Hypervigilant about the ‘failure’ of virtual reality to break out of a niche market of mostly-young-male gamers, Apple has forbidden its developer community to use the words ‘virtual reality’, ‘VR’, ‘mixed reality’, ‘MR’, ‘augmented reality’, ‘AR’ or even ‘metaverse’ with respect to its device.
Instead, Apple insists, this is an entirely new device opening into an entirely new era christened by Apple as ‘spatial computing’. That’s all balderdash, of course. The Vision Pro is simply the latest in a long series of ‘immersive’ devices (another word Apple doesn’t want used). While a three-trillion-dollar firm can always do its best to control the narrative, a brief wander through the history of computing will situate this new bit of kit in its context.
Like geometry, computing has a quality of dimensionality. In geometry, the line composes the one-dimensional figure, with an origin and an extent. Its equivalent in computing, the ‘command line’, provides an interface to the computer both linear – all instructions for the computer get laid down within a single line – and interactive. This means that the computer responds as soon as the user has submitted the command line – generally by tapping the ‘Enter’ key.
Many people using computers today have never encountered a command line; yet buried deep within Windows, macOS, Linux and many other operating systems you can find a program that’s most often named ‘Terminal’. Launching Terminal opens a box-like interface on the screen, revealing a blinking ‘cursor’ ready to take a user’s commands. In the 1960s, this was the only way to interact with a computer, and in some ways it remains the most powerful interface, capable of allowing the user to ‘touch the metal’, manipulating the guts of the machine. I still use a command line daily – although I hail from a time when that was the only interface to a computer. I’m used to it. But the command line is cryptic and potentially dangerous, so most people steer well clear of it.
Sixty-one years ago, a postgraduate student at MIT named Ivan Sutherland added a second dimension to the computer interface with a program known as SKETCHPAD. Using a newfangled ‘light pen’ – a forerunner of the pencils that pair with today’s tablets – Sutherland wrote an interactive drawing program allowing the user to create segments by touching light pen to display, then enabled the user to ‘drag’ these segments together to create forms. SKETCHPAD didn’t require any arcane command-line knowledge; every command emerged from an activity, like drawing, or a gesture, like dragging.
With SKETCHPAD Sutherland had invented the very first ‘graphical computer interface’, and every graphical interface we use today is a direct descendant of SKETCHPAD. Poke at your phone, double-click on an icon, or type into a window that attempts to look like a page of paper – all of these represent an elaboration of Sutherland’s basic idea, that interactivity in two dimensions could encompass a vast range of human activities, rendering those activities both sensuous and intuitive.
The graphical user interface became ‘the interface’. We expect every computer we touch, everywhere we touch them, to follow the principles laid down with SKETCHPAD. Xerox tried to take those ideas to the mass market; so did Apple, with Macintosh. Neither really succeeded until Microsoft sold hundreds of millions of copies of Windows, transforming PCs, and making that interface the standard. That was more than thirty years ago; not a lot has changed in the decades since. Even our smartphones strictly obey the rules of two-dimensional interfaces as laid down by Ivan Sutherland in 1963.
But Sutherland wasn’t quite done. Although SKETCHPAD made his reputation and got him a Turing Award – the Nobel Prize of computing – he could see beyond and through his two-dimensional interface into a third dimension. Into space. Sutherland’s reasoning about displays led him to believe that eventually you’d have to surround a user in displays – literally enclosing them in a bubble – or you could fool the user, by placing the display wherever they looked. Sutherland reckoned he could do that by building a display that tracked the user’s body and head movements, and in three years from 1965 he knocked up a prototype of what he imagined in his lab at Harvard University.
In 1968 Sutherland unveiled his ‘Head Mounted Display’. Nicknamed the ‘Sword of Damocles’ because of its ceiling-mounted tracking armature, it let a user look through the display to see a mixture of the real world and computer-generated augmentations. When we talk about augmented reality (a term Apple refuses to use with its augmented reality device, Vision Pro), this is what we mean: a fully realised three-dimensional environment where interactive augmentations and the real world coexist.
In the ideal case, a user should be unaware which elements are real and which are computer generated. That was far from the case with Sutherland’s first proof-of-concept, and still far from the case almost 20 years later, with NASA’s VIEW system – built to help astronauts rehearse for EVAs while in orbit. In fact, all the early VR systems were clumsy in their own ways – because spatial computing is very hard, as it requires detailed awareness of both the body, and the space surrounding the body. Yet, as clumsy as they were, all of them treated the user as a body in space, not just a line of text on a command line, nor a click and a drag of a mouse. A body does not exist in space by itself; we are not the only and singular inhabitants of the universe. So to have a system that brings computing into the third dimension means that the system is spatially aware by definition. This is the origin of spatial computing. It goes all the way back to Ivan Sutherland’s work in 1965.
Nine years ago, Microsoft stunned the entire tech world when it unveiled its ‘HoloLens’ – a fully realised, portable augmented reality system. I own one, and even now I’m still amazed by how well it works. HoloLens uses an array of cameras to map out the space it operates within, allowing a user to place and interact with ‘holograms’ (three-dimensional forms, which have nothing at all to do with light-wave interference holograms) anywhere they choose. HoloLens was the first modern exemplar of spatial computing, but Microsoft never devoted the resources to turn its high-tech wonder into a mass-market device, and it withered on the vine.
Apple Vision Pro, with better sensors to detect the body in space, and better software to integrate the body into the computing environment, represents the next great leap forward into the third dimension of computing. It is spatial – but it’s not original. Will it succeed this time? Apple’s ‘memory hole’ approach to the history of three-dimensional computing pointedly ignores a half-century of failures and partial successes. Can Apple achieve more? Will it finally realise Sutherland’s vision of computing, where our bodies have more to do than just push pixels around a screen?