Making robots see

Remember The Jetsons – the animated sitcom that reflected popular 1960s imaginings that technology had all the answers? Life, it envisaged, would continue much the way it had, but with our every whim served by platoons of sentient robots – dubious gender stereotypes aside.

So, how did all that work out for us?

“There is a fundamental disconnect between what we roboticists say and what the public perceives,” says Ian Reid, deputy director of the Australian Centre for Robotic Vision, in Brisbane.  

Sure, we use robots for all sorts of things, but they aren’t what fiction prepared us for – although the work of the centre is dedicated to turning that fiction into reality.

“Robots used for search and rescue, for example, are little more than mechanical platforms that are remote-controlled,” says Reid, who specialises in computer science. “Even in industries like mining, the trucks may be considerably smarter, but they are still tied to a fixed environment.”

So, what is required to make the dramatic shift from the remote-controlled to the autonomous, so that a robotic butler will greet us with a chilled martini when we arrive home, just like George Jetson?

Reid’s colleague Rob Mahony answers that question. Mahony’s research focuses on robots seeing in dynamic environments with a lot of moving and unexpected parts.

“The contextual understanding of the environment is impossible without vision,” he explains. “That is the missing link in letting robots operate in an unstructured environment.”

And that is exactly what the Australian Centre for Robotic Vision is aiming to do, bringing together leading researchers from around the world for that purpose. The centre is a collaboration between researchers at Queensland University of Technology, the University of Adelaide, the Australian National University, Monash University and international partners.

“I have spent almost my entire career interested in situating robots in the environment and letting them see where they are,” says Peter Corke, the centre’s director.

The difference that will make is profound.

“Robots in factories are 60-year-old technology – simple but highly effective,” Corke says. “They have no idea what they are doing; very simple machines relying on absolute precision. But they just don’t work if you take them outside the factory.

“Say you take them to an orchard – the robot doesn’t know where it is; it doesn’t even know where, or what, an apple is.”

And that leads to the heart of the problem, and what researchers mean when they talk about “robotic vision”: using cameras to guide robots to carry out tasks in increasingly uncontrolled environments.

To do that requires a mind-bogglingly complex set of algorithms and advanced 3D geometry. And when we take into account a dynamic environment, where random objects can appear on the scene and move at speed across it, the challenges rise exponentially. And all of this must be achieved with only the computing power a robot can carry on its back. 

The biggest problem is image recognition, where a robot can understand what it “sees” and so find an answer to a question like “what is an apple?”. But this can quickly become far from simple. What happens when the robot is confronted not with an apple, but with a pear?

These sorts of problems are categorised by researchers as “semantic representations”. And meeting the challenges they present has been helped enormously by great advances in the field of machine learning in the past four or five years.

“Machine learning has a huge role to play in robotic vision,” Reid says. 

Researchers began investigating artificial intelligence in the early 1950s, but since that time computer vision has diverged from robotics.

“The two have converged again more recently,” Reid explains, “and computer vision has become much better at seeing the world through a camera and converting it to geometric information.”

The breakthrough, he explains, came in about 2012 with deep neural networks: algorithms that learn and improve on their own. In this way, robots will help each other to learn about their worlds and how they can be interpreted, Corke says.

“Only one robot needs to get the knowledge and then they can share,” he explains. “In some ways they will be like the Borg [fictional cybernetic aliens in Star Trek]. If the knowledge that one acquires is imperfect, then it can be shared and improved with other robots.”

So, how soon can we expect all this? Is this another of Ian Reid’s “disconnects” between the research world and the public’s sci-fi driven expectations? Or, are we really on course for real interaction with robots who can “see” and understand what is going on around them?

“There is a significant difference between the closed world of the factory floor and the open environment of the real world, where there can be no guarantees that the robot has actually seen everything,” Reid says. “For that we need to be able to understand uncertainty, to decide how confident we are in what the computer perceives.”

This means that “seeing” robots will slowly enter our lives, working first in semi-open environments where there are fewer variables.

Rob Mahony is prepared to make some predictions.

“The pieces will be in place within five years, and robots will become more common in semi-structured environments,” he says.

And once the technology is on the ground, you will see companies begin to exploit it. “You’ll see the big money industries move first,” he says, “with mining, already a leader, rapidly expanding the role of robots.”

Agricultural and aerial vehicles could also be common sights within the decade, especially when used for infrastructure inspection, and are expected to transform the whole supply chain from warehousing to the point of delivery.

“In rich countries like Japan where there are also demographic challenges, you will see a big increase in social robotics – in aged care, robotic companions and robotic pets,” Mahony predicts. “Probably within 15 years, robots that can move, understand and make decisions will be a major part of our lives.”


Originally published by the Australian Centre for Robotic Vision, and reprinted here with permission.
