A team of scientists at the University of Wisconsin-Madison has developed a new, faster approach to analysing the floods of data that will flow from the Square Kilometre Array (SKA), which will be the world’s largest radio telescope.
SKA, located in Africa and Australia, is expected to be fully operational in the mid-2020s. Its total collecting area of around a square kilometre will deliver data on the location and properties of stars, galaxies and giant clouds of hydrogen gas.
“There are all these discussions about what we are going to do with the data,” data scientist Robert Lindner told the University of Wisconsin-Madison. “We don’t have enough servers to store the data. We don’t even have enough electricity to power the servers. And nobody has a clear idea how to process this tidal wave of data so we can make sense out of it.”
In many respects, the hydrogen data from SKA will resemble the vastly slower stream coming from existing radio telescopes. The smallest unit, or pixel, will store every bit of information about all hydrogen directly behind a tiny square in the sky. At first, it is not clear if that pixel registers one cloud of hydrogen or many — but answering that question is the basis for knowing the actual location of all that hydrogen.
In the new study, Lindner and his colleagues present a computational approach that solves the hydrogen location problem with just a second of computer time. The system uses software trained to interpret the “how many clouds behind the pixel?” problem.
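To make the “how many clouds behind the pixel?” question concrete, here is a minimal Python sketch. It is an illustration only, not the team's trained software: it assumes each cloud contributes a Gaussian-shaped bump to a pixel's spectrum and simply counts peaks above a threshold. The function names, the channel count and the threshold are all hypothetical.

```python
import math

def gaussian(x, amp, center, width):
    """Value at channel x of a Gaussian bump (assumed shape of one cloud's signal)."""
    return amp * math.exp(-0.5 * ((x - center) / width) ** 2)

def make_spectrum(components, n_channels=200):
    """Build a noise-free synthetic spectrum from (amp, center, width) tuples."""
    return [
        sum(gaussian(ch, a, c, w) for a, c, w in components)
        for ch in range(n_channels)
    ]

def count_clouds(spectrum, threshold=0.1):
    """Count local maxima above the threshold as a naive estimate of cloud count."""
    count = 0
    for i in range(1, len(spectrum) - 1):
        is_peak = spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]
        if is_peak and spectrum[i] > threshold:
            count += 1
    return count

# Two well-separated hypothetical clouds along one line of sight.
separated = make_spectrum([(1.0, 60, 5), (0.6, 130, 8)])
# Two hypothetical clouds whose signals overlap heavily.
blended = make_spectrum([(1.0, 100, 10), (0.8, 110, 10)])

n_separated = count_clouds(separated)
n_blended = count_clouds(blended)
```

Well-separated clouds show up as distinct peaks, but heavily overlapping ones merge into a single bump, which is why a plain peak count is not enough and the decomposition is hard to automate.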
The software ran on a high-capacity computer network at UW-Madison called HTCondor. And “graduate student Claire Murray was our ‘human,’” Lindner says. “She provided the hand-analysis for comparison.”
“We’re trying to understand the initial conditions of star formation — how, where, when do they start? How do you know a star is going to form here and not there?” Lindner says.
Correlating data on hydrogen clouds in the Milky Way with ongoing star formation will let the new radio telescopes supply real numbers that can be entered into cosmological models.
“We are looking at the Milky Way, because that’s what we can study in the greatest detail,” Lindner says, “but when astronomers study extremely distant parts of the universe, they need to assume certain things about gas and star formation, and the Milky Way is the only place we can get good numbers on that.”
With automated data processing, “suddenly we are not time-limited,” Lindner says. “Let’s take the whole survey from SKA. Even if each pixel is not quite as precise, maybe, as a human calculation, we can do a thousand or a million times more pixels, and so that averages out in our favour.”
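The “averages out” argument can be sketched with a short calculation. It assumes that per-pixel errors are independent and unbiased, so the uncertainty of an average over n pixels falls as one over the square root of n. The error sizes and pixel counts below are hypothetical, chosen only to illustrate the trade-off.

```python
import math

def standard_error(per_pixel_sigma, n_pixels):
    """Uncertainty of a mean over n_pixels independent, unbiased measurements."""
    return per_pixel_sigma / math.sqrt(n_pixels)

# Hypothetical: a careful human analyses 100 pixels with unit error each.
human = standard_error(per_pixel_sigma=1.0, n_pixels=100)

# Hypothetical: software is twice as noisy per pixel but covers a million pixels.
machine = standard_error(per_pixel_sigma=2.0, n_pixels=1_000_000)

print(f"human average uncertainty:   {human:.4f}")
print(f"machine average uncertainty: {machine:.4f}")
```

Under these assumptions the automated average is far tighter despite the larger per-pixel error. The advantage would shrink if the software's errors were systematically biased, since a bias does not average away.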