Modelling Epidemics

Few of us have been unaffected by the fears of Swine Flu. Any public health organisation would need to model an epidemic so that it can be prepared and can take preventive or precautionary actions. In this article, you will see how Scipy can be used to model an epidemic using the simple SIR model.

The basic idea is that the population consists of three groups, the susceptible, the infected and those who have recovered. It is reasonable to assume that the rate at which susceptible people are infected will be proportional to the possible pairs of susceptible and infected populations, i.e. their product. It is also reasonable to assume that the rate at which people recover will be proportional to the infected population. Finally, the rate of change of the infected group will be the difference between the infection rate and the recovery rate. The SIR model, assuming that there are no births or deaths, can, thus, be represented as a set of simple ordinary differential equations:

ds/dt = - bs(t)i(t)       # Rate of change of susceptible

di/dt = bs(t)i(t) - ki(t) # Rate of change of infected

dr/dt = ki(t)             # Rate of change of recovered

If the factors, b and k, are known and initial values of susceptible, infected and recovered populations are known, the above equations can be integrated and a solution found.

An Elementary Example

Suppose ten people in a city of a million people are infected. The infectious period of a flu typically lasts for 5 days; so, an estimated 20% of the infected cases recover each day.

Usually, the values of s, i and r are normalised so that the sum of the three is 1. In this case, the ratio of b/k is indicative of the number of people infected by an infected person. Assuming that each infected person infects another 2, what will happen in the next 100 days?

b = 2*k = .4

k = .2

The scipy package includes an integrate module, which can be used to compute the susceptible, infected and recovered populations for each succeeding day for as many days as needed.

import scipy as np

from scipy import integrate

def dif_eq(V,t,b,k):


Compute the derivatives for the differential equations

V = current values of [Susceptible, Infected, Recovered]


dVdt = np.zeros(3)

dVdt[0] = -b*V[0]*V[1]

dVdt[1] = b*V[0]*V[1] - k*V[1]

dVdt[2] = k*V[1]

return dVdt

P = 1e6 # a million

I0 = 10/P # 10 people are infected

S0 = 1 - I0 # Susceptible population

R0 = 0 # Initial Recovered

k = 0.2 # Infection lasts 5 days

b = .4 # Contact ratio is 2

t_array = np.arange(0, 100, 1)

RES = integrate.odeint(dif_eq, (S0,I0,R0), t_array,args=(b,k))


You define a method dif_eq which computes the derivatives for the system of differential equation. The parameters to this method are an array containing the current values, the current time and additional arguments, b and k in this case.

The key method is odeint whose parameters are a function defining the differential equations e.g. dif_eq above, a list containing the initial values, a time array at which solutions need to be found and any additional arguments to the dif_eq function. The best way to observe the results is graphically.

import matplotlib.pyplot as plt

def graph(RES):


plt.plot(t_array, RES[:,0], '-', label='Susceptible')

plt.plot(t_array, RES[:,2], '-', label='Recovered')





plt.plot(t_array, RES[:,1], '-', label='Infected')




It is at times easier to show sub-plots. In the example above, A plot with 2 rows and 1 column is created. In the first row, a sub-plot shows the the susceptible and recovered populations. The second sub-plot shows the infected population. The result is shown in Figure 1. The infections peak about 60 days after the outbreak and the maximum people infected at one time is about 16%, which is 160,000 in a population of a million!

Impact of Vaccinations and Quarantine

Vaccination results in reduction of the susceptible population. These people directly move into the recovered category without being infected. You can modify the dif_eq method as follows:

def dif_eq_vac(V,t,b,k,vac):

dVdt = np.zeros(3)

dVdt[0] = -b*V[0]*V[1] - vac

dVdt[1] = b*V[0]*V[1] - k*V[1]

dVdt[2] = k*V[1] + vac

return dVdt

If 1000 or 5000 people are vaccinated each day, you will need to modify your code to include

t_array = np.arange(0, 200, 1)

RES1 = integrate.odeint(dif_eq, (S0,I0,R0), t_array,args=(b,k))

RES2 = integrate.odeint(dif_eq_vac, (S0,I0,R0), t_array,args=(b,k,1000/P))

RES3 = integrate.odeint(dif_eq_vac, (S0,I0,R0), t_array,args=(b,k,5000/P))

Suppose a fraction q of the infected are quarantined. Hence, only (1-q) fraction will be a part of the infected group. The remaining will be treated as recovered/removed. The code for the differential equations becomes

def dif_eq_quarantine(V,t,b,k,q):

dVdt = np.zeros(3)

dVdt[0] = -b*V[0]*V[1]

dVdt[1] = b*(1-q)*V[0]*V[1] - k*V[1]

dVdt[2] = k*V[1] + b*q*V[0]*V[1]

return dVdt

Suppose a third of the infected population is quarantined. Add the code and call the plot routine.

RES4 = integrate.odeint(dif_eq_quarantine, (S0,I0,R0), t_array,args=(b, k, .333))

graph2(RES1, RES2, RES3, RES4)

The modified graph routine shows the infected populations with and without vaccination.

def graph2(RES1, RES2, RES3, RES4):

plt.plot(t_array, RES1[:,1], '-', label='None')

plt.plot(t_array, RES2[:,1], '-', label='1000 per day')

plt.plot(t_array, RES3[:,1], '-', label='5000 per day')

plt.plot(t_array, RES4[:,1], '-', label='1/3 Quarantined')




The results are shown in Figure 2. The infections now peak at 4% about 70 days after the outbreak. The effect of quarantine is dramatic even with only a third of the infected population isolated.

The intention of the above examples is to show the ease with which various models can be created and impact analysed. Strategies can be implemented and monitored. The effort is not in coding. It is in developing models which can be verified by observations. The model can be enhanced to include births, deaths and disease induced mortalities and much more as shown in the second reference.

The concept of viruses has been applied to ideas(meme), computer malware, viral marketing etc. An interesting example is an application to the modelling of a zombie attack - and .

With the scipy integrate module, you are not just restricted to understanding the dynamics of viruses and their impact on humans. Another very important model in biological systems is the predator-prey model. You can find a tutorial about using the scipy integrate module to solve those equations here:

Furthermore, pyplot offers tremendous versatility and flexibility in presenting the results so that they can attract attention and quickly convey the desired information.

References for the SIR model: