Canadian and US scientists have kept an unmanned balloon in position in the stratosphere for weeks at a time by teaching an AI controller to respond to changing conditions, even without precise knowledge of wind patterns.
Their hope is that this successful application of machine learning – described in a paper in the journal Nature – will pave the way for fully autonomous environmental monitoring technology.
Helium-filled “superpressure” balloons are routinely used to carry out experiments and weather monitoring in the upper atmosphere. When blown off course, they need to return to hover above a fixed ground location, known as their station.
Such adjustments require changing the balloon’s height: wind direction differs between layers of the atmosphere, so the balloon moves up or down until it finds a favourable wind that pushes it back into position.
Current self-navigating balloons either seek out lighter winds, which move them more slowly, or require significant battery power to explore for the most suitable winds.
In their work, a team led by Marc Bellemare from Google Research in Canada, working with the US company Loon, used deep reinforcement learning to teach an AI controller to make an optimal series of decisions – that is, whether to move the balloon up, move it down, or do nothing.
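To make the up, down or do-nothing decision concrete, here is a minimal sketch in Python. It is not the team's controller: simple tabular Q-learning stands in for deep reinforcement learning, and the wind values, the reward and every name (`WINDS`, `MAX_OFFSET`, `step`, `train`, `policy`) are hypothetical choices made for illustration.

```python
import random

ACTIONS = (-1, 0, 1)    # descend one layer, stay put, ascend one layer
WINDS = (-2, -1, 1, 2)  # hypothetical drift per time step in each wind layer
MAX_OFFSET = 6          # distance from the station is clipped to +/- this value


def step(offset, layer, action):
    """Apply an action, then let the wind in the new layer move the balloon."""
    layer = min(max(layer + action, 0), len(WINDS) - 1)
    offset = min(max(offset + WINDS[layer], -MAX_OFFSET), MAX_OFFSET)
    reward = -abs(offset)  # the closer to the station (offset 0), the better
    return offset, layer, reward


def train(episodes=500, steps=40, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values for every (offset, layer) state by trial and error."""
    rng = random.Random(seed)
    q = {}  # (offset, layer, action) -> estimated return; missing entries are 0.0
    for _ in range(episodes):
        offset = rng.randint(-MAX_OFFSET, MAX_OFFSET)
        layer = rng.randrange(len(WINDS))
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # occasionally explore
            else:
                action = max(ACTIONS, key=lambda a: q.get((offset, layer, a), 0.0))
            new_offset, new_layer, reward = step(offset, layer, action)
            best_next = max(q.get((new_offset, new_layer, a), 0.0) for a in ACTIONS)
            old = q.get((offset, layer, action), 0.0)
            # Standard Q-learning update towards reward plus discounted future value.
            q[(offset, layer, action)] = old + alpha * (reward + gamma * best_next - old)
            offset, layer = new_offset, new_layer
    return q


def policy(q, offset, layer):
    """Pick the action with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q.get((offset, layer, a), 0.0))
```

Calling `policy(train(), 3, 1)` returns -1, 0 or 1 for a balloon three units east of its station in layer 1. Because no layer in this toy world is calm, the controller can only hold position by switching between opposing winds, which is the behaviour the article describes.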
They used data from historical wind records and local wind observations and, because wind data is sparse, created an algorithm that filled in the gaps by adding randomly generated “noise”, thus allowing the AI to better predict the range of winds that could occur.
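The gap-filling idea can be sketched in a few lines of Python. This is an illustration only: the article does not say what kind of noise the team used, so plain Gaussian noise is assumed here, and the function name `augment_wind_profile`, its parameters and the sample wind figures are all hypothetical.

```python
import random


def augment_wind_profile(observed, altitudes, noise_std=2.0, rng=None):
    """Return one plausible wind speed for every requested altitude.

    observed: dict of altitude -> measured wind speed (sparse, must not be empty).
    Gaps are filled by linear interpolation between the nearest measurements
    (or by the nearest measurement beyond the measured range), and random
    Gaussian noise is then added to every filled-in value.
    """
    rng = rng or random.Random()
    known = sorted(observed)
    profile = {}
    for alt in altitudes:
        if alt in observed:
            profile[alt] = observed[alt]  # keep real measurements unchanged
            continue
        below = [k for k in known if k < alt]
        above = [k for k in known if k > alt]
        if below and above:
            lo, hi = below[-1], above[0]
            frac = (alt - lo) / (hi - lo)
            base = observed[lo] + frac * (observed[hi] - observed[lo])
        else:
            base = observed[below[-1] if below else above[0]]
        profile[alt] = base + rng.gauss(0.0, noise_std)  # assumed noise model
    return profile


# Two real measurements, at 16 km and 20 km; the layers in between are guesses.
measurements = {16: 4.0, 20: -2.0}
samples = [augment_wind_profile(measurements, range(16, 21), rng=random.Random(s))
           for s in range(3)]
```

Each entry in `samples` is a different but plausible wind profile built from the same two measurements. Training a controller on many such variations, rather than on one best guess, exposes it to the range of winds it might actually meet.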
In a 39-day experiment over the Pacific Ocean, balloons controlled by the new technology successfully returned to position faster than balloons running Loon’s conventional controller, called StationSeeker, using comparable battery power.
“Our algorithm uses data augmentation and a self-correcting design to overcome the key technical challenge of reinforcement learning from imperfect data,” the authors write. “By reacting to its environment instead of imposing a model upon it, the reinforcement-learning controller gains a flexibility that enables it to continue to perform well over time.”
In an accompanying commentary in Nature, Scott Osprey from the University of Oxford, UK, notes that the environment in which the study was conducted – the stratosphere above the equator – is subject to particular, persisting wind conditions for several months a year, which likely played a part in the experiment’s success.
“Bellemare and colleagues’ system might therefore struggle to achieve the same success at other locations,” he adds.
Nonetheless, the research demonstrates that reinforcement learning can be an effective solution to the challenges facing the autonomous control of machines. In superpressure balloons, this technology could open up a range of commercial and scientific applications, on Earth and beyond.
“Such balloons are already used to study small and large-scale waves in the tropical stratosphere, and to detect low-frequency sounds produced by the ocean, lightning and earthquakes,” Osprey says.
“They have also been proposed for use in future explorations of Venus’s atmosphere, to search for signs of active volcanism and chemical signatures of life.”
Other uses include long-term environmental monitoring – for example, checking air quality over cities or carbon release over thawing permafrost – and tracking animal migration routes or the movement of people across borders.
“These applications will become increasingly relevant as the effects of climate change become more pronounced, as restrictions on movement are imposed by global events such as COVID-19, and as long term climate-change mitigation involving aviation prompts the search for alternative platforms for making aerial observations,” Osprey concludes.