This article first appeared in Cosmos Magazine on 4 June 2020, and it has just been announced as the winner of the Finkel Foundation Eureka Prize for Long-Form Science Journalism. Congratulations on your win, Dyani!
The first videoconference – like the many that followed – was late at night. From his home in suburban Melbourne, James McCaw joined fellow disease trackers from around the world in mid-January to discuss some of the early data emerging from Wuhan, the Chinese city at the epicentre of what became the COVID-19 global pandemic.
The news wasn’t good. The new virus was leaving dozens sick; a handful had already died. More worrying still, epidemiologists at Imperial College London estimated the transport and industrial hub in central China harboured many more cases of infection than had been reported. “It was clearly spreading,” says McCaw, who uses mathematical models to trace how diseases do just that, “but we didn’t really know what the consequences of it would be.”
Since January it has been McCaw’s job, along with his long-time collaborator Jodie McVernon and a small army of colleagues, to build mathematical projections of how an outbreak might play out in Australia, and relay that information to government officials.
While other scientists are in the midst of a Herculean effort to discover what they can about SARS-CoV-2, the virus responsible for COVID-19, the work of disease modellers like McCaw and McVernon – both based at the University of Melbourne’s Peter Doherty Institute for Infection and Immunity – has been playing an outsized role in upending life as we knew it pre-pandemic.
In Australia, as elsewhere, governments are making decisions not because they are written into a pandemic playbook, but because mathematical models – written on the fly, as the disaster unfolds – light the way.
“There is not a single pandemic plan globally that talks about lockdown as a control measure,” McVernon announced during a press conference in early April. And yet, living in some form of lockdown is exactly where vast swathes of the world’s populace found themselves.
Decisions to lock the borders to foreigners, cancel sporting events and concerts, shutter schools and tell people to stay in their homes have been taken, in large part, because of the mathematical models that McCaw, McVernon and their colleagues have built. They determine when restrictions begin, and when they end. So, how did this one line of evidence become so influential?
The foundations of modern epidemic modelling were laid in the early 20th century. In 1897, the British army surgeon Ronald Ross demonstrated that the malaria parasite is transmitted by mosquitoes, not through contaminated water as others had assumed. After retiring from the army – and winning the Nobel Prize for his discovery in 1902 – Ross spent much of the first decade of the new century travelling around Africa and the Mediterranean drumming up support for a fight against the mosquito. Not everyone bought the idea that reducing mosquito numbers could eradicate malaria, but maths, he decided, could provide the evidence.
Others before him had attempted to describe how diseases spread using mathematical principles, but Ross pushed to establish mathematical epidemiology – what he called “a priori pathometry” – as a new field of study. “All epidemiology, concerned as it is with the variation of disease from time to time or from place to place, must be considered mathematically, however many variables are implicated, if it is to be considered scientifically at all,” he said.
In the 1920s, two Scots took things further. Anderson McKendrick – an ex-army physician who had accompanied Ross on a malaria-fighting mission to Sierra Leone two decades earlier – teamed up with William Kermack, a young biochemist who had been blinded in a lab accident.
The duo devised a model that looks deceptively simple, yet forms the basis for transmission models to this day. It places people in a population into one of three buckets, marked S, I and R. Individuals are either susceptible to an infection (S), infected (I), or recovered or “removed” by death (R).
For a new virus, like SARS-CoV-2, the whole population is presumed to be susceptible at the start of the outbreak. If infection spreads unhindered, the number of susceptible people falls over time, while those who have recovered – and are presumed to be immune to reinfection and unable to pass the infection on – grow in number.
Meanwhile, the number of people infected etches out the now-familiar bell-shaped curve of sickness: a gentle incline followed by a deathly uptick, a levelling as the epidemic reaches its peak and a final downward slope as the outbreak runs out of susceptible people to infect.
The shape of the bell – whether it resembles an upturned champagne flute or a broader, less precipitous, upturned soup dish – depends on how rapidly the disease is spreading. This boils down to the basic reproduction number (R0): how many people, on average, a single sick person infects when everyone around them is still susceptible. Kermack and McKendrick noticed that the curve could only keep rising as long as each sick person infects, on average, more than one other. As the pool of susceptible people shrinks, this effective reproduction number falls. The turning point at the apex of the bell marks the point where it dips below one – each person infects fewer than one other and the outbreak starts to fizzle.
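The three-bucket model can be sketched in a few lines of Python. This is a minimal illustration rather than Kermack and McKendrick's own formulation: the function name, the simple Euler time step, the population size and the rates `beta` and `gamma` are all assumptions chosen for the example, with `beta / gamma` playing the role of R0.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Step the S, I and R buckets forward in time with forward Euler steps.

    beta  : assumed transmission rate per day (hypothetical value)
    gamma : assumed recovery/removal rate per day, so that R0 = beta / gamma
    Returns a list of (day, S, I, R) tuples, one per time step.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        # New infections need a susceptible person to meet an infected one.
        new_infections = beta * s * i / population * dt
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((step * dt, s, i, r))
    return history


# Hypothetical values chosen so that R0 = 0.5 / 0.2 = 2.5.
beta, gamma, n = 0.5, 0.2, 1_000_000
history = simulate_sir(n, 10, beta, gamma, days=200)
peak_day, peak_s, peak_i, _ = max(history, key=lambda row: row[2])
print(f"peak on day {peak_day:.0f} with {peak_i:,.0f} people infected")
# At the apex of the bell, R0 multiplied by the susceptible share is close to one.
print(f"effective reproduction number at the peak: {beta / gamma * peak_s / n:.2f}")
```

Raising `beta` in this sketch narrows the bell towards the champagne flute; lowering it flattens the curve towards the soup dish.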
To prove they were on the right track, Kermack and McKendrick overlaid their theoretical curve onto data from a real-world epidemic: an outbreak of plague that struck the Indian city of Bombay (now Mumbai) in 1905 and 1906. The deaths recorded each week lined up with their bell.
“The mathematics of it is not complicated,” says Raina MacIntyre, head of the Biosecurity Program at the Kirby Institute at the University of NSW. “What’s complicated is the parameters and the assumptions that go into the model.”
The simplest SIR models assume everyone in the population has an equal risk of being infected and is equally infectious once sick. That’s not true of influenza: young children long on sniffles and short on personal boundaries are the primary spreaders of the flu. And it doesn’t appear to be true of SARS-CoV-2 either: children appear less likely than adults to catch or pass on the virus.
Models today are more sophisticated. They divide the population into smaller tubs – based on age and health status, say – that try to account for different people’s risk of being infected and their differing propensity to infect others. The stages of disease are also more finely compartmentalised to reflect how likely the infection is to jump from one person to the next at different stages, from the point of exposure through to full recovery or death.
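As a rough sketch of what splitting the population into smaller tubs looks like, the Python below gives each group its own S, I and R and lets a contact matrix decide who mixes with whom. Everything in it is an assumption made for illustration: the two groups (children and adults), the contact numbers, the susceptibility values and the function name are invented, and are not taken from any of the models described in this article.

```python
def simulate_age_sir(pop, infected, contact, susceptibility, gamma, days, dt=0.1):
    """Euler-step SIR model with one set of S, I, R tubs per group.

    pop[g]            : number of people in group g
    infected[g]       : initial infections in group g
    contact[g][h]     : assumed daily contacts a person in g has with group h
    susceptibility[g] : assumed chance that one contact with an infectious
                        person infects a susceptible member of group g
    gamma             : assumed recovery rate per day
    Returns the final (S, I, R) lists, one entry per group.
    """
    groups = range(len(pop))
    s = [float(pop[g] - infected[g]) for g in groups]
    i = [float(x) for x in infected]
    r = [0.0 for _ in groups]
    for _ in range(int(round(days / dt))):
        # Force of infection on g: contacts with each group h, weighted by
        # the share of group h that is currently infectious.
        force = [
            susceptibility[g] * sum(contact[g][h] * i[h] / pop[h] for h in groups)
            for g in groups
        ]
        for g in groups:
            new_infections = force[g] * s[g] * dt
            new_removals = gamma * i[g] * dt
            s[g] -= new_infections
            i[g] += new_infections - new_removals
            r[g] += new_removals
    return s, i, r


# Hypothetical two-group example: index 0 is children, index 1 is adults.
pop = [200_000, 800_000]
contact = [[8, 4], [1, 8]]        # invented daily contact counts
susceptibility = [0.02, 0.04]     # children assumed half as susceptible
s, i, r = simulate_age_sir(pop, [0, 10], contact, susceptibility, gamma=0.2, days=400)
for name, g in (("children", 0), ("adults", 1)):
    print(f"{name}: {r[g] / pop[g]:.0%} infected over the outbreak")
```

Real models use many more groups and add stages such as exposed-but-not-yet-infectious, but the bookkeeping follows the same pattern.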
In 2003, the World Health Organisation issued a resolution urging its member states to plan for the next flu pandemic: up vaccination rates, strengthen surveillance to spot outbreaks early, and stockpile antivirals and other essential medicines. The 2008 Australian Health Management Plan for Pandemic Influenza helped us navigate the 2009 swine flu pandemic, which killed 191 people across the nation.
McCaw and McVernon have been working with the Australian government to prepare for the next influenza pandemic for the past decade and a half. It was COVID-19 that arrived, but the planning wasn’t in vain. “Flu and the coronaviruses differ in fundamentally important ways biologically,” says McCaw, but “the way that we break down the problem, unpick it, think through the possible response options, is very similar… That’s incredibly valuable.”
Through January, as the COVID-19 situation worsened, McCaw and his team used mathematical modelling to see whether the Australian healthcare system was up to the task that lay ahead. What was the likely shape of the epidemic curve in Australia if left unchecked, or if hit with disease-stopping interventions? Some of the assumptions they used were harvested from crucial – yet still uncertain – pieces of information coming out of China. Using data from the early days of an epidemic is fraught. Early reports can miss mild and asymptomatic cases, and details of when each person first detects a sore throat or sniffly nose can be unreliable.
Nevertheless, it looked like the outbreak was doubling every 6.4 days, and case records suggested an incubation period (how long it takes for symptoms to show up after infection) of just over five days, with people able to transmit the virus for two days before they showed symptoms. These values, in turn, pointed to an R0 of around 2.5, though estimates at the time ranged from as low as 1.5 to more than five depending on which cases were used for the calculation.
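The article does not spell out how the modellers turned these values into R0, but one textbook route runs from doubling time to growth rate to reproduction number. The Python sketch below shows that arithmetic under two common simplifying assumptions about the generation interval (the average gap between one person's infection and the infections they cause). The function names and the eight-day generation interval are hypothetical, chosen only to illustrate why estimates can differ so widely.

```python
import math


def growth_rate_from_doubling_time(doubling_time_days):
    """Exponential growth rate per day implied by a doubling time."""
    return math.log(2) / doubling_time_days


def reproduction_number_estimates(growth_rate, mean_generation_days):
    """Two textbook estimates of R from an early growth rate.

    The first assumes generation intervals are exponentially distributed,
    as in a simple SIR model: R = 1 + r * T.
    The second assumes every generation interval is exactly T days long:
    R = exp(r * T).
    """
    rt = growth_rate * mean_generation_days
    return 1.0 + rt, math.exp(rt)


r = growth_rate_from_doubling_time(6.4)  # doubling time quoted in the article
assumed_generation_days = 8.0            # hypothetical value for illustration
sir_like, fixed_interval = reproduction_number_estimates(r, assumed_generation_days)
print(f"growth rate: {r:.3f} per day")
print(f"R assuming exponential generation intervals: {sir_like:.2f}")
print(f"R assuming a fixed generation interval: {fixed_interval:.2f}")
```

The same growth rate gives noticeably different answers depending on the assumed interval and its spread, which is one reason early estimates covered such a wide range.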
A final assumption – based, again, on case reports – approximated how many people who got sick would end up in hospital. Few children, but up to one in five people over the age of 80, would end up in the intensive care unit (ICU). “By early February, we had produced some very early plausible scenarios, which had terrifying numbers in them,” says McCaw.
If allowed to spread uncontrolled, COVID-19 would infect 90% of the population. The healthcare system would be overwhelmed for weeks, and for every three people receiving the intensive care treatment they required, 17 others would go without. Quarantining the sick and keeping the seemingly well apart could stem transmission, reducing the onslaught.
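A figure close to 90% is what the simplest SIR model gives for an R0 of about 2.5, via the classic final-size relation z = 1 − exp(−R0·z), where z is the share of the population ever infected. The article does not say the team's scenario was computed this way, so the Python below is only a sketch of that relation, with a hypothetical function name, alongside the arithmetic behind the intensive care shortfall.

```python
import math


def final_attack_rate(r0, tolerance=1e-12, max_iterations=10_000):
    """Solve z = 1 - exp(-r0 * z) by fixed-point iteration.

    z is the fraction of the population infected by the end of an
    unmitigated outbreak in a simple SIR model. For r0 <= 1 the only
    solution is zero: the outbreak never takes off.
    """
    if r0 <= 1.0:
        return 0.0
    z = 1.0  # start from "everyone infected" and iterate downwards
    for _ in range(max_iterations):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tolerance:
            return z_next
        z = z_next
    return z


print(f"share ever infected when R0 = 2.5: {final_attack_rate(2.5):.1%}")

# Three patients treated for every 17 who miss out means only a small
# share of those needing intensive care would receive it.
treated, untreated = 3, 17
print(f"share of ICU demand met: {treated / (treated + untreated):.0%}")
```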
The federal and state governments took note of this and of subsequent models McVernon, McCaw and their team produced. On 1 February restrictions began, first on international travellers from China and then, in March, on travellers from elsewhere. On 24 March, returning Australians were required to self-isolate for two weeks and banned from travelling overseas. Police were given the authority to fine people not at home except for the most essential activities. Behind the scenes, hospitals diverted resources to their intensive care facilities to “plan for the worst and hope for the best”, says McCaw.
An alternative to dividing populations into smaller and smaller buckets is to create simulated worlds filled with virtual people. These are known as individual-based or agent-based models. Rudimentary agent-based models developed in the 1970s and ’80s created communities of roughly 1000 people.
Over the decades, advances in supercomputers and programming innovations that enabled computations to be run in parallel gave epidemiologists the grunt they needed to make much larger versions. But it wasn’t epidemiologists who first made the discovery.
In the early 2000s, Tim Germann, a materials scientist at the Los Alamos National Laboratory in New Mexico, was looking at how individual molecules in metals jostle about and alter the properties of each other as they interact. He and his colleagues assigned individual atoms within a simulated hunk of aluminium attributes such as mass, electrical charge and polarity, then simulated how the atoms smash together during a car crash. The model could handle 19 billion individual particles – a virtual Newton’s cradle on steroids.
Germann knew the capacity to compute interactions on such a large scale could be adapted to solve other problems, so he asked around for suggestions. “Some folks had suggested looking at how fish or birds flock and cluster,” he says. Others had ideas for modelling sensor networks. But when someone suggested disease modelling, the idea “struck a nerve”. In his own work, Germann had sent shockwaves through a set of atoms. “Sending a disease through the population wasn’t too far of a leap.”
Humans, though, don’t behave as atoms do. They don’t just jostle about with their immediate neighbours; they purposefully move, sometimes across vast distances. Fortuitously for Germann, Ira Longini, an epidemiologist who had worked on community-scale agent-based models since the 1970s, had a planned visit to New Mexico in the summer of 2005. “That really helped us have a better disease model, because otherwise it would have looked like something that physicists had come up with,” says Germann.
With two other collaborators, Germann, who is still at Los Alamos, and Longini, now at the University of Florida in Gainesville, took agent-based disease modelling from a few thousand to hundreds of millions. In 2006, they modelled how pandemic influenza could sweep through the entire US population – then 281 million people – and how mitigation strategies could make the bell curve more soup bowl than champagne flute and better manage such an outbreak.
Agent-based models take time to build. Mikhail Prokopenko at the University of Sydney spent more than three years building a virtual world based on Australian data. He and his team started constructing their model to make predictions about the spread of seasonal influenza. Taking data from the 2006, 2011 and 2016 censuses and studies of social networks, the model incorporates information about the age and sex of every person in the nation, the size of households and workplaces, travel patterns and social interactions. The granular details try to get at who interacts with whom, and how often – information that’s important for seeing how real-life infectious diseases spread.
On top of that, says Prokopenko, is layered the epidemiological information taken from previous studies of flu: What’s the rate of transmission from a child to an adult? From an adult to a child? What if the pair share a household? Or if they live in the same neighbourhood? “You have all possible combinations of who transmits to whom, in what social layering,” he says.
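A toy version of this layering fits in a short Python script. It is far simpler than the census-based world Prokopenko describes: there are only two layers (household and community), every household is the same size, and every probability and name in the code is a hypothetical value invented for illustration.

```python
import random


def simulate_agents(n_households, household_size, p_household, p_community,
                    community_contacts, infectious_days, days, seed=0):
    """Toy agent-based outbreak with a household layer and a community layer.

    p_household        : assumed daily chance of infecting a housemate
    p_community        : assumed chance of infecting a random contact
    community_contacts : assumed random encounters per infectious agent per day
    Returns (final state of every agent, new infections recorded each day).
    """
    rng = random.Random(seed)
    n = n_households * household_size
    state = ["S"] * n      # "S" susceptible, "I" infectious, "R" recovered
    days_left = [0] * n    # remaining infectious days for each agent
    first_case = rng.randrange(n)
    state[first_case], days_left[first_case] = "I", infectious_days
    daily_new = []
    for day in range(days):
        newly_infected = set()
        infectious = [a for a in range(n) if state[a] == "I"]
        for a in infectious:
            # Household layer: agents are numbered so housemates are adjacent.
            home_start = (a // household_size) * household_size
            for b in range(home_start, home_start + household_size):
                if state[b] == "S" and rng.random() < p_household:
                    newly_infected.add(b)
            # Community layer: a few random encounters each day.
            for contact in range(community_contacts):
                b = rng.randrange(n)
                if state[b] == "S" and rng.random() < p_community:
                    newly_infected.add(b)
        # Infectious agents move one day closer to recovery.
        for a in infectious:
            days_left[a] -= 1
            if days_left[a] == 0:
                state[a] = "R"
        for b in newly_infected:
            state[b], days_left[b] = "I", infectious_days
        daily_new.append(len(newly_infected))
    return state, daily_new


# Compare everyday mixing with fewer community contacts (hypothetical numbers).
for contacts in (10, 2):
    state, daily_new = simulate_agents(500, 4, 0.05, 0.02, contacts,
                                       infectious_days=7, days=120, seed=1)
    print(f"{contacts} community contacts a day: "
          f"{len(state) - state.count('S')} of {len(state)} ever infected")
```

Because each run depends on chance, a single simulation says little on its own. Agent-based modellers typically repeat runs many times with different random seeds and look at the spread of outcomes.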
When the virtual world was completed in 2019, they could seed it with a virus, infect simulated people and watch to see how many others were infected: how far and wide it spread. But Prokopenko’s team didn’t immediately send COVID-19 into their virtual world, because too little was known to estimate transmission rates. Too many unrefined assumptions, he says, would have been a case of “garbage in, garbage out”. But by early March, he says, “we thought, we have some semblance of truth”.
Their model illustrated that social distancing was key to suppressing the spread of coronavirus in Australia. It also showed that buy-in was crucial. If 90% of the population abided by social distancing restrictions, daily new infections would fall close to zero by July, with total cases capped at 8000–10,000. If only 70% complied, the model predicted the measures would be next to useless and the virus would continue to spread.
The model also revealed some potential missteps. If social distancing measures had been introduced three days earlier (on 21 March instead of 24 March), the model predicted, the peak would have been half as high and cases would have fallen to near zero three weeks earlier. Every day of delay at the start of the epidemic cost a week at the tail end.
By mid-March, McCaw, McVernon and colleagues weren’t just generating models, they were comparing models generated by colleagues here and overseas and assessing a deluge of mostly unvetted scientific papers to update assumptions about how the virus behaves. These papers are almost all “preprints” – not yet peer-reviewed – but in a pandemic, time is critical.
More than 2200 coronavirus papers appeared on the top two preprint sites between the start of the year and the end of April. Only a handful have been withdrawn; many have ended up peer-reviewed and published in the stable of traditional scientific publications.
Still, it means that the newest information is not the final word. Results can vary widely, tightening as more work is done. Models live and die by the quality of the assumptions that get built into them, taken from that evolving body of knowledge. “If you use the wrong kind of data or the wrong assumptions, you end up with wrong outputs from your model,” says MacIntyre.
It’s not always clear when assumptions are incorrect, even when real-world data is available to see how a model performed in retrospect. An over-egged ratio of symptomatic people could be offset by an under-egged transmission rate. “There are so many variables” in agent-based modelling, says Prokopenko, “that two mistakes will cancel each other out.”
“We’re trying to make as few assumptions as possible, and make models as lifelike as we possibly can,” says George Milne from the University of Western Australia, who has also used an agent-based model to predict which interventions will control the COVID-19 outbreak in Australia.
But no model is perfect. That’s why, in modelling, more is more. “Different people may use different modelling approaches, they may make [different] assumptions,” says Milne, “but if outcomes are pretty much the same – the relative benefit of this intervention versus another one – that says you’re on the right track. That’s really critical.”
On 16 April, McCaw and McVernon released data that showed the value of R had dipped below one in all states and territories. The number of newly infected people each day was trending down, as was the overall number infected.
Exactly when every pandemic restriction will end is still unknown, says McCaw. But models and data sharing from around the world will continue to play a role in managing restrictions out (or back in) in the months to come. “You can definitely… start lifting restrictions in a phased way,” says MacIntyre, “and you can inform that by using modelling.”
One problem to solve is that, at present, there’s no clear understanding of which measures worked best. Once COVID-19’s speed of spread and ability to kill became apparent, state and national governments threw every measure at it: quarantines, school closures, bans on gatherings large and small. Unpicking which of those things had the greatest effect on slowing the virus and which made no difference may take months or even years to work through.
Ultimately, says McCaw, the only way to learn which measures work – and which allow transmission to flare up anew – is to gather data from the real world as those restrictions are eased. “Science is empirical,” he says. “We still then need to relax the measure – whatever the measure might be – or implement a different measure, and then try to measure what its impact was. The model can never answer that question.”
As communities around the world continue to navigate the peaks and troughs of the initial epidemic curve, the spectre of COVID-19 remaining part of life in the “new normal” looms. “It’s almost implausible to imagine this virus going extinct globally, which means that it will be here to stay,” says McCaw.
As weeks of rigid restrictions turn into months of milder control measures, McCaw and epidemiologists around the world could be settling in to a new normal of their own: running models as new data emerges to track and anticipate a seasonal surge in COVID-19 cases, trying to head off the next big outbreak.