Discrete Event Simulation

Many problems in real life are very complex. One can construct a mathematical model to understand them. However, even solving the models exactly can be virtually impossible. In such cases, numerical experiments or simulation can be very useful. SciPy includes a large number of statistical distributions.

For discrete event simulation, simpy is a very interesting and useful application. It is not yet integrated with SciPy/Numpy but, hopefully, it may be soon.

In this article, you will simulate queueing delays in a bank, building on top of the excellent tutorial included with simpy documentation(see references below). The extension is that you will use statistical distributions from the scipy package with the simulation framework of simpy.

Statistical Distributions

You can explore the Erlang distribution which seems to be a good option for modelling the arrival and servicing times. It is still widely used in tele-communications industry. Try the following code:

import matplotlib.pyplot as plt

import scipy as np

from scipy.stats import erlang

# 3 variations with mean = 2

er1=erlang(1,scale=2)

print 'mean, variance', er1.stats()

er2=erlang(2,scale=1)

print 'mean, variance', er2.stats()

er3=erlang(10,scale=.2)

print 'mean, variance', er3.stats()

# plot the probability distribution fn

x = np.linspace(0,5)

p=plt.plot(x, er1.pdf(x))

p=plt.plot(x, er2.pdf(x))

p=plt.plot(x, er3.pdf(x))

plt.show()

The plot in Figure 1 shows you what the probability distribution functions look like for the Erlang distribution with the parameters chosen. The function is like the exponential distribution when the shape parameter is one and seems to be good option for the customer arrival rate. It looks a lot like a normal distribution for larger values of the shape parameter. However, the distribution with the shape parameter 2 looks a lot like you may have experienced. No one leaves quickly and some seem to be on the counter for a long time. All three distributions have the same mean, but their variances differs. The output of this program will be

mean, variance (array(2.0), array(4.0))

mean, variance (array(2.0), array(2.0))

mean, variance (array(2.0), array(0.40))

Erlang distribution probability functions with mean 2: Shape=1(Blue), Shape=2(Green), Shape=10(Red).

A SimPy Model

Your model consists of a source which generates customers randomly. Each customer waits for a counter(resource) to become available. The customer then spends a random amount of time for getting his work done and leaves. Include the necessary packages and write the code for generating customers.

import SimPy.Simulation as simpy

from scipy.stats import erlang

class Source(simpy.Process):

""" Generate customers randomly"""

def generate(self, number, resource, pdf, pdf_counter):

for i in range(number):

c = Customer(name = "Customer%02d"%(i,))

simpy.activate(c, c.visit(pdf=pdf_counter, counter=resource))

# wait for the next customer

t = pdf.rvs()

yield simpy.hold, self, t

The class Source contains one method – generate. It takes as parameters the number of customers to be generated, counter resources and probability distribution functions for the arrival of customers and the serving time at the counter. You create a customer and activate the customer passing a method name (visit in your case) as a parameter. The simulation package will call the visit method, which will need to be defined in the Customer class. Then, you find the random time at which the next customer is expected and hold the process till then.

You may now define the Customer class.

class Customer(simpy.Process):

def visit(self, pdf, counter):

time_in = simpy.now()

yield simpy.request, self, counter

wait_time = simpy.now() - time_in

tib = pdf.rvs()

yield simpy.hold, self, tib

yield simpy.release, self, counter

time_in_bank = simpy.now() - time_in

process(self.name, time_in, wait_time, time_in_bank)

Customer class contains the method visit, which encompasses the logic for interaction of the customer with the bank. You track the time which the customer comes in. Customer waits for a counter to become available and you compute the wait time. Then, find the random time for servicing this customer and hold the process till it gets over. Release the counter. Finally, compute the time spent by the customer in the bank and process the times.

def main(maxTime, maxNumber, arrival_interval, mean_serve_time, counters=1):

print('%i counters with mean arrival and service times %4.1f %4.1f minutes'

%(counters, arrival_interval, mean_serve_time))

k = simpy.Resource(capacity=counters, name="Counter", unitName="Clerk")

simpy.initialize()

s = Source('Source')

simpy.activate(s, s.generate(number=maxNumber,

pdf=erlang(1, scale=arrival_interval),

pdf_counter=erlang(2, scale = mean_serve_time/2.0),

resource=k), at=0.0)

simpy.simulate(until=maxTime)

Your main program starts the simulation by creating the needed resources, activating the source process for generating customers using the generate method defined in the Source class. You use the Erlang distributions with shape factor 1 and 2 for arrivals and serving time.

def process(name, time_in, wait_time, time_in_bank):

print("%s %6.2f %5.2f %5.2f"%(name, time_in, wait_time, time_in_bank))



main(100, 5, 1.5, 2.0, 1)

For processing, just print the results and run the simulation with a customer arriving every 1.5 minutes and needing an average of 2 minutes to be served. You would expect the queue to build up pretty quickly as you can see from the results:

1 counters with mean arrival and service times 1.5 2.0 minutes

Customer00 0.00 0.00 1.08

Customer01 0.36 0.72 3.21

Customer02 0.41 3.17 5.52

Customer03 0.44 5.48 7.86

Customer04 1.40 6.90 9.00

Here is another run and the counter is usually free:

1 counters with mean arrival and service times 1.5 2.0 minutes

Customer00 0.00 0.00 2.62

Customer01 1.67 0.96 3.99

Customer02 4.29 1.36 2.82

Customer03 7.38 0.00 1.02

Customer04 9.42 0.00 0.51

And here is an output with 2 counters:

2 counters with mean arrival and service times 1.5 2.0 minutes

Customer00 0.00 0.00 2.54

Customer01 0.67 0.00 2.01

Customer03 2.04 0.64 0.78

Customer04 2.04 0.77 1.93

Customer02 1.79 0.75 2.46

You will notice that the customers do not necessarily exit in the same order as they entered. Also, even with 2 counters, there are times when queuing delay is present.

Obviously, you need to experiment with large number of customers and printing the results is not particularly useful or meaningful.

Statistical Analysis of the Simulation

You can collect the data in the process function and then use scipy's statistical functions. Simpy includes some simple statistical methods which will not be explored here.

def process(name, time_in, wait_time, time_in_bank):

wait_array.append(wait_time)

tib_array.append(time_in_bank)

import scipy as np

wait_array=[]

tib_array=[]

main(1000, 100, 1.5, 2.0, 2)

wt=np.array(wait_array)

tib=np.array(tib_array)

print(15*' '+'Mean Variance Median Max')

print('%s %6.2f %6.2f %6.2f %6.2f'%('Queue Time ',

wt.mean(), wt.var(), np.median(wait_array), wt.max()))

print('%s %6.2f %6.2f %6.2f %6.2f'%('Time in Bank',

tib.mean(), tib.var(), np.median(tib_array), tib.max()))

print('The percent of customers who did not have to wait %4.1f'

%(100 - 100.0*len(wt.nonzero()[0])/len(wt)))

You keep appending times to global python lists during the simulation. The lists are converted to numpy arrays and their mean, variance, median and maximum are printed. Of these, median is not a property of a numpy array and is computed using a standard numpy method. With two counters, a customer is handled every minute; hence, it also interesting to see how many customers did not have to wait. Numpy arrays have a number of useful properties, e.g. nonzero to get indices of the elements which are not zero.

The results for a run with two counters will look like:

2 counters with mean arrival and service times 1.5 2.0 minutes

Mean Variance Median Max

Queue Time   0.68  1.09    0.00  3.76

Time in Bank 2.55  2.99    2.11  9.48

The percent of customers who did not have to wait 58.0

With a sample of only a hundred, running the simulation again may give you substantially different results. You can experiment with larger sample sizes and explore the many more functions available in scipy.

With this article, we come to an end of our exploration of scipy, with the hope that you will find it useful for solving more complex research and educational problems.

References

SimPy Tutorial : http://simpy.sourceforge.net/SimPyDocs/SimPy_Tutorials.html

Erlang Distribution : http://en.wikipedia.org/wiki/Erlang_distribution

SciPy Statistics : http://docs.scipy.org/doc/scipy/reference/stats.html



Comments