Modelling a modern day pandemic — Where do we begin?

Benji Tigg
6 min read · Jun 25, 2021

Modelling a pandemic isn’t easy. Many different techniques can be used, and each has its own reason for being used.

Source: Photo by Martin Sanchez on Unsplash

To begin to model a pandemic, the first question we have to ask is why. Since the turn of the millennium we have witnessed multiple outbreaks of the Ebola virus, an outbreak of the Zika virus, the reappearance of H1N1 (the subtype behind the Spanish flu) in 2009, and three different coronavirus outbreaks: SARS, MERS and COVID-19. In 2020 Professor Tom Koch from the University of British Columbia told iNews, “When it will come and what will it be nobody knows, not really. But we know there will be one.” As we don’t know what the pathogen will be, we can’t create a vaccine or cure for it in advance, so our best way of combatting the spread is to model it. So that’s the why out of the way. The next question is how?

We now have to ask, “Which technique do I use?” In epidemiology the two main techniques are systems of differential equations and agent-based modelling. One of them is significantly harder than the other, but that shouldn’t put you off trying it.

The first of the two techniques, a system of differential equations, is the most common. It’s what is known as a continuum model. Its output is a very common graph that you’ve probably seen before, but how did we get there?

Starting in 1927, three papers were published by William Kermack and Anderson McKendrick that built on the work of Ronald Ross, showing that equations could model the spread of a disease as a function of time. Kermack and McKendrick introduced the idea of compartmentalising a population, giving rise to the SIR (susceptible, infected, removed) model. They also suggested that the probability of being infected would increase as the number of infected individuals increases. Using this information we can create two equations.

Here K is the probability of being infected and Lambda is the probability of recovering.

Using those equations we can establish the three differential equations which make up the system.
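For reference, here is a sketch of what that system looks like, assuming the standard Kermack-McKendrick form. The symbols S, I and R (the fraction of the population in each compartment) are an assumption of this sketch, as is the use of K and Λ as per-unit-time rates:

```latex
\begin{aligned}
\frac{dS}{dt} &= -K\,S\,I \\
\frac{dI}{dt} &= K\,S\,I - \Lambda\,I \\
\frac{dR}{dt} &= \Lambda\,I
\end{aligned}
```

The three right-hand sides sum to zero, so S + I + R stays constant: people only move between compartments, they never leave the model.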

These equations can then be solved, usually numerically, to give three functions of time, one for each compartment, which produce a graph similar to the one above. The shape of the graph will depend on the input parameters.
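As a rough illustration of solving the system numerically, here is a minimal sketch in Python using the forward Euler method. It assumes the standard SIR form with S, I and R as population fractions. The function name `simulate_sir` and the parameter values (`k=0.5`, `lam=0.1`) are made up for the example and are not fitted to any real outbreak.

```python
def simulate_sir(k, lam, s0, i0, r0=0.0, dt=0.1, steps=1600):
    """Integrate a standard SIR system with the forward Euler method.

    s, i and r are fractions of the population; k (infection) and
    lam (recovery) are per-day rates. Returns a list of (t, s, i, r).
    """
    s, i, r = s0, i0, r0
    history = [(0.0, s, i, r)]
    for n in range(1, steps + 1):
        new_infections = k * s * i * dt   # flow from S to I this step
        new_removals = lam * i * dt       # flow from I to R this step
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((n * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not taken from real data).
    run = simulate_sir(k=0.5, lam=0.1, s0=0.999, i0=0.001)
    peak_t, _, peak_i, _ = max(run, key=lambda row: row[2])
    print(f"peak infected fraction {peak_i:.3f} at day {peak_t:.1f}")
```

Euler is the simplest possible integrator and is only used here to keep the sketch dependency-free; a smaller `dt` or a higher-order method would give a more accurate curve.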

A brief look at this model shows that, early on, as the number of infected individuals increases, so does the rate at which it increases. The infected curve eventually peaks once so many people have left the susceptible group that each infected individual passes the disease on to fewer than one other person, on average, before being removed. Both the susceptible and removed curves plateau once the gradient of the infected curve returns to 0 after its maximum, that is, when the infections have died out.
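Where the peak falls can be worked out from the infected equation alone. As a sketch, assuming the standard SIR form with S and I as population fractions (symbols assumed), setting the gradient of the infected curve to zero gives:

```latex
\frac{dI}{dt} = (K\,S - \Lambda)\,I = 0
\quad\Longrightarrow\quad
S = \frac{\Lambda}{K}
```

So the infected curve rises while S is above Λ/K and falls once S drops below it. With illustrative values K = 0.5 and Λ = 0.1, the peak would arrive when 20% of the population is still susceptible.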

So why should you use this technique as your model?

  • Fast
  • Computationally light
  • Produces repeatable results

However, there are reasons not to use it:

  • Produces repeatable results
  • Can get complicated very quickly
  • Crude output

Producing repeatable results is both a pro and a con: other people can confirm that your model is working, but because it’s not a stochastic model it doesn’t show any emergent phenomena. The output is also very crude, as it shows only the rate at which the disease will spread, which is not enough detail for effective decisions to be made.

Agent-based modelling is a more complicated, more advanced way of looking at the spread of a disease. It may be more complicated, but the results are worth the time and effort.

This graph shows the number of susceptible individuals in an agent-based model with 30,000 agents. As you can see, it isn’t as smooth as the graph produced by the differential equations model. This is because agent-based models can model everything from social interactions to using public transport to get to school or work.

To understand how they do this we have to look at the history of ABMs. The first use of an ABM was in the 1960s by Thomas Schelling, who used it to model segregation in America. Unlike modern ABMs, however, this was a cellular automaton (CA) model. A notable CA is Conway’s Game of Life, in which each cell determines its state from the rules of the model and the cells around it. Modern-day ABMs build on the foundations laid by CAs, as shown by an infection check in CovidSim, an ABM that I developed.

In this example, agents that find themselves within the area cast out by the central infected agent have a probability of being infected. Whilst this differs from a CA like Conway’s Game of Life, the core element of a CA is still present.
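To make the comparison concrete, here is a minimal sketch that puts a Game of Life update next to a radius-based infection check. This is not CovidSim’s actual code: the `Agent` class, the `infection_check` function and its parameters (`radius`, `p_infect`) are hypothetical names chosen for illustration. In both functions, an individual’s next state depends on a rule plus whoever is nearby.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass


def life_step(live_cells):
    """One generation of Conway's Game of Life on a set of (x, y) live cells."""
    # Count how many live neighbours every nearby cell has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live_cells)
    }


@dataclass
class Agent:  # hypothetical agent record, for illustration only
    x: float
    y: float
    infected: bool = False


def infection_check(source, agents, radius, p_infect, rng):
    """Infect each susceptible agent within `radius` of `source`
    with probability `p_infect`. Returns the newly infected agents."""
    newly_infected = []
    for agent in agents:
        if agent is source or agent.infected:
            continue
        distance = math.hypot(agent.x - source.x, agent.y - source.y)
        if distance <= radius and rng.random() < p_infect:
            agent.infected = True
            newly_infected.append(agent)
    return newly_infected


if __name__ == "__main__":
    centre = Agent(0.0, 0.0, infected=True)
    crowd = [centre, Agent(1.0, 1.0), Agent(10.0, 10.0)]
    hits = infection_check(centre, crowd, radius=2.0, p_infect=0.3,
                           rng=random.Random(0))
    print(f"{len(hits)} agent(s) newly infected")
```

The main difference is that the CA rule is deterministic and lives on a grid, while the infection check is probabilistic and works on continuous positions.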

Like the differential equations model, ABMs compartmentalise the population into subgroups depending on their infection status. However, the number of subgroups can be vastly different. This is due in part to ABMs being able to keep better track of what each agent is doing and what its current infection status is. It’s also due to realism: it wouldn’t be very realistic if an agent went straight from being susceptible to being infectious. There would be a period of time where the virus is in the body but the agent is not yet infectious, which is known as being latent infectious. Other groups exist, such as hospitalised, and larger groups such as removed can be split into recovered and dead. The limit of what an ABM can do is purely computational.
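One way to picture these extra subgroups is as a small state machine that each agent carries around. The sketch below is only an illustration: the `Status` names follow the groups mentioned above, while the `TRANSITIONS` table and its per-step probabilities are invented for the example and are not fitted to any disease. Moving from susceptible to latent is left to a separate infection check.

```python
import random
from enum import Enum, auto


class Status(Enum):
    SUSCEPTIBLE = auto()
    LATENT = auto()        # virus in the body, not yet infectious
    INFECTIOUS = auto()
    HOSPITALISED = auto()
    RECOVERED = auto()
    DEAD = auto()


# Per-step transition probabilities (illustrative values, assumed).
# States missing from the table never change on their own.
TRANSITIONS = {
    Status.LATENT: [(Status.INFECTIOUS, 0.2)],
    Status.INFECTIOUS: [(Status.HOSPITALISED, 0.02), (Status.RECOVERED, 0.1)],
    Status.HOSPITALISED: [(Status.DEAD, 0.01), (Status.RECOVERED, 0.08)],
}


def next_status(status, rng):
    """Draw one random number and take the first transition whose
    cumulative probability covers it; otherwise stay in the same state."""
    draw = rng.random()
    cumulative = 0.0
    for target, probability in TRANSITIONS.get(status, []):
        cumulative += probability
        if draw < cumulative:
            return target
    return status


if __name__ == "__main__":
    rng = random.Random(0)
    status = Status.LATENT
    days = 0
    while status not in (Status.RECOVERED, Status.DEAD):
        status = next_status(status, rng)
        days += 1
    print(f"agent ended as {status.name} after {days} steps")
```

Adding another subgroup is then a matter of adding one enum member and one row to the table, which is part of what makes ABMs so versatile.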

So why should you use an ABM?

  • They are versatile
  • Allow for emergent phenomena to appear
  • The data can tell you more about the outbreak

Everything must have its flaws, so here they are:

  • Computationally heavy
  • Slow
  • Requires a lot more setup than a differential equations model
  • Not entirely clear if the model is actually working

The benefits of ABMs are mostly due to them being stochastic models, which allows emergent phenomena to appear that tell you more about the nature of the outbreak. However, it’s not always clear whether the model is actually working, so an apparent emergent phenomenon could just be a bug. Other pitfalls are that an ABM can be computationally heavy and very slow, especially if poorly optimised, although this will depend on the size of your model.
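One possible way to start telling an emergent pattern from a bug is to seed every stochastic run so that it can be replayed exactly, then repeat the experiment over several seeds and see whether the pattern persists. The sketch below uses a tiny discrete-time stochastic SIR model as a stand-in for a full ABM. The function `stochastic_sir` and all parameter values are assumptions made for the example.

```python
import random


def stochastic_sir(n, i0, p_contact, p_recover, steps, seed):
    """Tiny discrete-time stochastic SIR model.

    p_contact is the per-step chance that one infected individual
    infects a given susceptible; p_recover is the per-step chance that
    an infected individual is removed. Returns the final (s, i, r).
    """
    rng = random.Random(seed)  # seeding makes the run repeatable
    s, i, r = n - i0, i0, 0
    for _ in range(steps):
        # Chance a susceptible is infected by at least one infected this step.
        p_infected = 1.0 - (1.0 - p_contact) ** i
        new_infections = sum(rng.random() < p_infected for _ in range(s))
        new_removals = sum(rng.random() < p_recover for _ in range(i))
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
    return s, i, r


if __name__ == "__main__":
    # Same model, different seeds: the spread of outcomes is the
    # stochastic variation; a fixed seed replays one run exactly.
    for seed in range(5):
        s, i, r = stochastic_sir(n=500, i0=5, p_contact=0.001,
                                 p_recover=0.1, steps=100, seed=seed)
        print(f"seed {seed}: {r} removed, {s} never infected")
```

A pattern that shows up across many seeds is more likely to be a property of the model, whereas something that appears for a single seed can at least be replayed and inspected step by step.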

In conclusion, there isn’t really a best technique to use when it comes to modelling a pandemic. Both models have benefits over each other and both have pitfalls. It’s better to pick the one that suits you or, even better, to use them in conjunction with each other.

In the next article I begin to discuss the process of building each model.

Benji Tigg

A student who likes to dabble in subjects outside of their learning