Epidemic Models

I have been following the remarkable COVID-19 pandemic charts posted by @janvda here and here, and I wanted to be able to make comparisons with predictions from contagious disease models and explore a few "what if" scenarios. The standard models, known as compartmental models, are described in some detail in Wikipedia. There are many implementations available on the internet, but I felt I might learn something by building flows in Node-RED. The results are in this GitHub repo. One tab shows a side-by-side comparison of two model runs, with the model type and parameters selected independently. The other tab shows the result of imposing or modifying interventions (lockdown or social distancing), defined by changes in the reproduction number (R0) at specified times.

It goes without saying that these flows should not be taken seriously as models of the current pandemic -- they are strictly educational toys. Because this is a work in progress and at an early stage, I have not posted it to the NR Flow Library. Comments, suggestions, and bug reports are welcome here or on GitHub.


Very well done!

I am just wondering how I should interpret the fatality rate in the model.
In the chart below I set it to 0.1, so I would expect that in the end 10% of the infected cases die. But as you can see, the model ends with 4% deceased and 87% recovered, which is much less than 10%.

It is a very nice implementation of the different models.
What interests me is the insights it could give with respect to the current Covid pandemic.

I think that at this stage (as most countries are past the peak) the million-dollar question is how to determine the actual reproduction number (R0) as quickly and precisely as possible, which should then allow us to estimate the impact of relaxing lockdown measures on R0.

For some countries, like Italy, you can estimate R0 reasonably well from the available data (see chart below), as it has fluctuated little around 0.7 during the last month. But for Belgium the story is very different: there it has fluctuated a lot in the last month, which doesn't make much sense.

The calculated reproduction rate is based on confirmed cases. This means it tracks the actual reproduction rate with a delay of some days, because there is a time gap between the moment of infection and the moment it is confirmed by a test. It would also be good to estimate this extra delay (is it a couple of days or more than a week?), as without it, it is difficult to link an increase in the reproduction rate to the relaxation of measures.

Thanks for asking about this. It's a rather subtle point, and I'm not sure I can explain it clearly. The fatality rate (alpha) in the model is not the true infection fatality rate. What it does is divide the population of infected individuals into two categories, one of which recovers at a rate of 1/D per day and the other dies at a rate of 1/H per day. (D is the contagious period and H is the treatment period.) If D < H, patients recover more quickly than they die, and you get fewer fatalities. If D = H, the result should be what you had expected (although the model seems slightly off, and I'm looking to see why). Another issue with the terminology in SIRD and SEIRD models is that "contagious period" is not exactly the right term. Individuals being treated remain in the infectious compartment and can still spread the disease. In principle, it might be more precise to introduce another (treatment) compartment, where the reproduction number could be adjusted independently.

I doubt that these models can be applied directly to the real world. On the other hand, it can be useful to look for real-world equivalents or correspondences to things that happen in the models. For example, in the SIR model, the effective reproduction number Re at a particular time is given by Re = s * R0, where s is the susceptible population fraction. Unfortunately, s is practically impossible to observe. On the other hand, the model gives the relationship

s * R0 = 1 + (D / i) * (di / dt).

This means that Re can be determined just by counting infections if the contagious period D is known. (I guess this is intuitively obvious.) Estimating di/dt from the daily count of new infections and i from the sum of new infections for the previous D days ought to give a good approximation. I have uploaded a new version to GitHub that includes exact and approximate calculations of Re and some simplifications to the rest of the flow.
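
For anyone who wants to try the approximation outside Node-RED, here is a minimal Python sketch. It is not the code in the flow: the function name `estimate_re`, the made-up case series, and the choice to approximate di/dt by the one-day change in the rolling D-day sum are all assumptions for illustration.

```python
def estimate_re(new_cases, D):
    """Approximate Re = 1 + (D / i) * (di/dt) from daily new-infection counts.

    Assumptions (not from the original flow): i is the sum of new cases over
    the last D days, and di/dt is the one-day change in that sum.
    """
    estimates = []
    for t in range(D, len(new_cases)):
        i_now = sum(new_cases[t - D + 1 : t + 1])  # currently infectious
        i_prev = sum(new_cases[t - D : t])         # infectious one day earlier
        if i_now == 0:
            estimates.append(None)  # Re is undefined with no active infections
        else:
            estimates.append(1 + D * (i_now - i_prev) / i_now)
    return estimates


# Made-up daily counts, for illustration only.
flat = [100] * 20
growing = [round(100 * 1.1 ** k) for k in range(20)]
declining = list(reversed(growing))

re_flat = estimate_re(flat, D=6)
re_growing = estimate_re(growing, D=6)
re_declining = estimate_re(declining, D=6)
print(re_flat[-1], re_growing[-1], re_declining[-1])
```

A flat series gives Re = 1, a growing one gives Re > 1, and a declining one gives Re < 1. Real case counts are noisy, so some smoothing would probably be needed before the estimate is usable.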

  • Within Belgium (based on antibodies in blood samples from a representative population), it was estimated that 4.3% of the total population has been infected with Covid-19.

  • Another recent study also showed that 8.4% of health care workers in Belgian hospitals had antibodies against the virus. (Given their close contact with Covid patients, it is not unexpected that this is double the figure for the general population.)

Despite the fact that Belgium is one of the hardest-hit countries (in relative numbers), only a very small fraction (less than 5%) of its population has been infected so far.

Numbers for most other countries (if not all) will even be lower.

So "herd immunity" is a dream. To get there, Belgium would need at least 10 times as many people to become infected, and consequently 10 times the number of deaths: 100,000 deaths (or almost 1% of the population), a complete horror scenario.

UK - "hold my beer"

I just found here a more precise estimate of the fatality rate, based on New York City:

  • IFR (Infection Fatality Rate) = 1.4%.

On the same site you can read that achieving "herd immunity" requires about 2/3 of the population to become infected, meaning that with IFR = 1.4% about 1% of the total population will die.

1% corresponds to:

  • 660,000 UK citizens (currently they reported 39,000 deaths)
  • 3.3 million US citizens (currently they reported 106,000 deaths)
  • 100,000 Swedish citizens (currently they reported 4,400 deaths)
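
The arithmetic behind these figures can be sketched in a few lines of Python. The populations are rounded values implied by the 1% figures in the list (66 million, 330 million, 10 million), and the variable names are made up for this example.

```python
IFR = 0.014               # infection fatality rate quoted above (New York City)
HERD_FRACTION = 2 / 3     # share of the population infected, as quoted above

# Rounded populations implied by the 1% figures above (assumed values).
populations = {"UK": 66_000_000, "US": 330_000_000, "Sweden": 10_000_000}

death_share = HERD_FRACTION * IFR   # fraction of the whole population that dies
expected_deaths = {c: round(p * death_share) for c, p in populations.items()}

print(f"{death_share:.2%} of the population")
for country, deaths in expected_deaths.items():
    print(country, f"{deaths:,}")
```

The unrounded share is about 0.93%, so the 1% figures in the list are slightly rounded up.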

I just wonder how many people in the community have actually been infected with Covid, and been asymptomatic, had mild symptoms or couldn't actually get a test (in time)?

I don't expect that figure to be anywhere near 2/3rds of the population, but I would expect it to be far in excess of the current official figures.

I guess we will get a better understanding once we get more data from antibody tests.

You can calculate backwards based on an infection fatality rate of 1.4%.

For UK

  • 39,000 deaths => about 2,800,000 infected people (10 times more than confirmed cases) => 4.2% of the total UK population.

Of course the counted deaths might underestimate the real deaths due to Covid-19, but the real number will certainly be less than double the counted deaths. So for the UK I would guess between 4 and 7% of the UK population has been infected so far.
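
The back-calculation for the UK can be written out as a short Python sketch. The population of 66 million is an assumed round figure (implied by "1% = 660,000" earlier in the thread), and the function name `infected_share` is made up for this example.

```python
IFR = 0.014                  # infection fatality rate quoted earlier
UK_POPULATION = 66_000_000   # assumed round figure, implied by "1% = 660,000"
reported_deaths = 39_000


def infected_share(deaths, ifr=IFR, population=UK_POPULATION):
    """Fraction of the population infected, inferred from deaths and IFR."""
    return deaths / ifr / population


low = infected_share(reported_deaths)        # deaths counted exactly
high = infected_share(2 * reported_deaths)   # upper bound: deaths undercounted 2x
print(f"{low:.1%} to {high:.1%}")
```

Doubling the deaths gives a hard upper bound of roughly 8.4%, so a guess of 4 to 7% sits inside the range this calculation allows.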

For Belgium:

These seem to be in keeping with other research I've seen for the UK. Mainly at the higher end.

Herd immunity is much more complex though, I think. Firstly, even if immunity happens, it doesn't last forever, so you need to keep infection levels low enough that the virus effectively dies out. And then, since the virus is bound to exist somewhere, you will get sudden reinfections and new hotspots, since immunity will have died out.

In truth what people like Cummings and crowd think of as herd immunity is really more like culling of the weak. Something that we know he and other extreme right-wing nutters believe in.


The estimates @janvda makes for the UK and Belgium (~10x the officially confirmed cases) are consistent with the results he linked to earlier for New York City. The New York results were obtained from reasonably extensive antibody testing (sample size 15K). Possible biases in that study include that the people tested were healthy enough to be out in public and therefore not likely to be recently recovered from a serious infection. On the other hand, testing was voluntary, so that people who had exhibited mild symptoms might be inclined to be tested in order to find out if they had been infected. There apparently was no effort to correct for any bias.


In the UK, several million people have taken part in questionnaire based research on a regular basis sometimes coupled with testing data and that research is also showing similar levels of likely infection.

Absolutely. Please let me add a few points, all based on the simple SIRD model run by @janvda, but still relevant.

First, the number of infections usually cited as necessary for herd immunity is just a threshold value where the reproduction number falls to one and the infection rate begins to decline. The number of infections and deaths occurring after that threshold is reached is comparable to the number that occur before.

Second, even when the number of infections becomes negligible there are still plenty of susceptible individuals left in the population (~10%), and the reproduction number (~0.25) never reaches zero. Introducing a few infectious individuals from outside will not cause another epidemic, but it will produce additional new infections.

Later, if as @TotallyInformation suggests, post-infection immunity declines or recovered individuals simply die off and are replaced by young susceptible ones, the susceptible population may increase to the point where the reproduction number again reaches one and the disease becomes endemic.

It turns out that the relation between the infection fatality rate and the model "fatality rate" (alpha) is not as simple as I might have liked. At a time late enough that every infection has resolved (recovered or deceased), the infection fatality rate (IFR) is defined by

IFR = d / (d + r) or 1 / IFR = 1 + r / d

In your calculation, d = 4.14% and r = 87% gives IFR = 0.045.

In the model, the rate of transitions from I to R is (1 - alpha) / D and from I to D is alpha / H. So,

1 / IFR = 1 + (H / D) * (1 - alpha) / alpha

With D = 6, H = 14, and alpha = 0.1, IFR = 1/22 = 0.045.
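
As a cross-check, here is a minimal Python sketch that integrates a SIRD model with the two transition rates given above and compares the resulting d / (d + r) with the closed form. It is not the code from the flow: the forward-Euler scheme, the choice beta = R0 / D for the transmission rate, R0 = 3, and the function names are assumptions for illustration.

```python
def ifr_from_alpha(alpha, D, H):
    """Closed form from the thread: 1 / IFR = 1 + (H / D) * (1 - alpha) / alpha."""
    return 1 / (1 + (H / D) * (1 - alpha) / alpha)


def simulate_sird(R0, D, H, alpha, i0=1e-4, days=365, dt=0.1):
    """Forward-Euler SIRD run; returns final (s, i, r, d) population fractions.

    Assumption: the transmission rate is beta = R0 / D. The flow may define
    it differently, but the final d / (d + r) ratio does not depend on beta.
    """
    beta = R0 / D
    s, i, r, d = 1.0 - i0, i0, 0.0, 0.0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt
        new_rec = (1 - alpha) / D * i * dt   # I -> R at rate (1 - alpha) / D
        new_dead = alpha / H * i * dt        # I -> D at rate alpha / H
        s -= new_inf
        i += new_inf - new_rec - new_dead
        r += new_rec
        d += new_dead
    return s, i, r, d


s, i, r, d = simulate_sird(R0=3.0, D=6, H=14, alpha=0.1)
simulated_ifr = d / (d + r)
predicted_ifr = ifr_from_alpha(alpha=0.1, D=6, H=14)
print(round(simulated_ifr, 4), round(predicted_ifr, 4))
```

Both values come out at 1/22, because deaths and recoveries are drawn from the same infectious pool in a fixed ratio at every time step.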

I probably could have defined the model so that the parameters had more obvious meanings, but I tried to keep the notation consistent in all four models and ended up with this. Sorry for the confusion.

Thanks for sharing (I wasn't aware of the definition for herd immunity threshold).
So the actual % of people that eventually become infected will be higher than this threshold.

E.g., in the model below the herd immunity threshold is reached at day 30, when 35% of the population is still susceptible (in other words, about 2/3 of the population has been infected). Note also that at that point about 1/4 of the total population is infectious (a shockingly large percentage). After about 90 days no new infections happen; at that time 9% of the population is still susceptible. So in the end 91% of the total population becomes infected, although the herd immunity threshold was already reached when 65% had been infected.
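
The gap between the threshold and the final number infected can also be estimated without running a simulation. The Python sketch below uses the standard final-size relation for a plain SIR model, s = exp(-R0 * (1 - s)), which is an assumption here: the chart comes from a model with a deceased compartment, so the numbers will not match exactly. R0 = 1 / 0.35 is inferred from the threshold being hit at 35% susceptible, and the function names are made up.

```python
import math


def herd_immunity_threshold(R0):
    """Infected fraction at which s * R0 drops to 1 (plain SIR)."""
    return 1 - 1 / R0


def final_susceptible(R0, tol=1e-12, max_iter=1000):
    """Solve s = exp(-R0 * (1 - s)) for the end-of-epidemic susceptible share.

    Standard SIR final-size relation, assuming R0 > 1 and an almost fully
    susceptible starting population. Solved by fixed-point iteration.
    """
    s = 0.5  # start away from the trivial solution s = 1
    for _ in range(max_iter):
        s_next = math.exp(-R0 * (1 - s))
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s


R0 = 1 / 0.35  # implied by the threshold being hit at 35% susceptible
threshold = herd_immunity_threshold(R0)
final_infected = 1 - final_susceptible(R0)
print(f"threshold {threshold:.0%}, eventually infected {final_infected:.0%}")
```

With these inputs the plain SIR relation gives a threshold of 65% and roughly 93% eventually infected, in the same range as the 91% read off the chart.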


Regarding the following you have mentioned at drmibell/node-red-contagion

I understand that R0 is impacted by the lock down measures and other things (big cities versus rural), but the other parameters should be rather constant over time and even amongst different countries (if we exclude alpha).

I would also expect alpha to be a constant for similar countries. Of course there might be differences due to population distribution (e.g. older population).

So to what extent are the above values for D, I, H and alpha applicable to the COVID-19 pandemic?

Exactly right. That's why "flattening the curve" was so important. The idea of letting the pandemic run its course was always totally insane. We may never know exactly why the UK Government changed course on this, but I like to think that some clever epidemiologists (Imperial College London?) showed them simple models like this and got the message across.

The values I listed were taken from various sources related to COVID-19. (I should have said "COVID-19 literature" in the README.) Unfortunately, I have not been systematic in comparing or recording the sources I examined. Since the models assume that everyone in the population is the same, we have to pretend that D and I depend only on the nature of the virus itself, while H and alpha may also depend on aspects of the health care system. There is some debate about I (its value or even existence), since it represents a period of time after exposure before the patient becomes infectious. Without intensive contact tracing, patient interviews and testing, this is very difficult to observe.

Maybe another intriguing question where the models could give some insight.

Here below for Belgium the graphs based on reported cases and reported tests.

We went into lockdown on the 12th of March; at that time only 8.4 confirmed cases per million per day were reported (which is even less than today). I expected to see the effect of the lockdown after about 1 week, but that didn't happen. For at least a full week the lockdown seemed to have no effect at all (in other words, confirmed cases tripled per week, two weeks in a row). Only after 2 weeks did the reproduction rate fall below 1, and then it took another 3 weeks before the reproduction rate reached its lowest point, despite no change in lockdown measures during the entire period.

I am just wondering what explains this long delay before the lockdown measures in Belgium showed any effect, and then their full effect.