Optimising Returns, Minimising Risk

Can scipy be useful for economics and social sciences? The answer, of course, is yes.

One of the best known methods of minimising risk is insurance. However, it came into existence in the 1600's. Economic theory, like insurance, is heavily based on statistics. Economic models are not perfect as the global recession in the last few years have shown. It is not that modelling is wrong. Rather, more complex models are needed. Human behaviour cannot be ignored.

As models get more complex, spreadsheets with macros will not be adequate and more complex tools will be needed. An example of the possibilities is an interesting commercial product Resolver One which blends a spreadsheet-like interface with the ability to program in python(ironpython) and includes some support for NumPy.

It is likely that scipy will find proponents among social scientists as well. You can be among the early converts. Explore a few trivial examples in order to get an idea of the tools available.

Reducing Risk by Diversification

The risk for an investment can be estimated by the standard deviation of the result. Consider the following options

Investment 1 at 10% return with a standard deviation of 10%

Investment 2 at 15% return with a standard deviation of 20%

Suppose a fraction p is invested in Investment 1 and the remaining (1-p) in Investment 2. What would be the resulting standard deviation? The textbook formula is

You can use the matplotlib to plot this equation for several values of correlation and visualise it. The plot is shown in Figure 1.

import scipy as np

def std_dev(p, c12):

vpsq = (p*v1)**2 + ((1-p)*v2)**2 + p*(1-p)*c12*v1*v2

return np.sqrt(vpsq)

def avg_return(p):

return p*r1 + (1-p)*r2

r1, v1 = 10, 10

r2, v2 = 15, 20

p = np.arange(0,1.01,.01)

# Plot the results

import matplotlib.pyplot as plt

plt.plot(avg_return(p),std_dev(p, 0),'-', label='Independent')

plt.plot(avg_return(p),std_dev(p, 0.3),'-', label='Correlation 0.3')

plt.plot(avg_return(p),std_dev(p, -0.3),'-', label='Correlation -0.3')

plt.xlabel('Expected Return')

plt.ylabel('Standard Deviation')



The safest investment has a mixture of both and gives a better return than the safer of the two.

The best proportion to use will be the one which minimises the standard deviation. It is easy to get that value. Just add the following code:

from scipy import optimize

res = optimize.fmin(std_dev, 50, args=(0,))

print "Result", res

The function fmin in the optimize module accepts a function, std_dev in your case, as a parameter and a starting guess. Additional parameters for the std_dev function are passed using the args parameter. It then finds the optimal value. This should give you the result that you should have about 80% in first investment and 20% in the second as the safest option in case the investments are independent:

Optimization terminated successfully.

Current function value: 8.944272

Iterations: 23

Function evaluations: 46

Result [ 0.80001831]

You know that the investments are usually correlated. So, you may want to examine visually how the result varies depending upon the correlation. Add the following code to compute the optimum proportion of Investment 1 for a list of correlations and plot it:

def optimum_p(corr):

res = []

starting_val = 50

for c12 in corr:


# use the previous result as a starting guess

starting_val = res[-1]

return res

corr_values = np.arange(-1, 1., 0.02)

res = optimum_p(corr_values)

fig = plt.figure()

fig.canvas.set_window_title('Figure 2')



plt.ylabel('Optimum Proportion of Investment 1')


The result is shown in Figure 2. If both investments are perfectly correlated, then you are better with everything in Investment 1. However, if they are have a perfect negative correlation, the best option is to invest only 70% in Investment 1 and remaining in Investment 2. An example of negative correlation is the frequently heard advice: “When stock markets are booming, interest rates fall.”

If only finding the correct model in economics were easier!

LP - Finding a Solution

Suppose the government can spend a 100 rupees on increasing consumption and production of food. It estimates that each rupee spent on Rural Employment will increase consumption by a rupee. This increase must be matched by an increase in production. It estimates that each rupee spent on infrastructure will increase production by 2 rupees. So, you can formulate the problem as:

c– p =0 # consumption equals production

c – r =0 # consumption equals expenditure on rural employment

p – 2*i = 0 # production equals twice the expenditure on infrastructure

r + i = 100 # sum of the expenditures must be 100

You can use the scipy's linalg module to solve this problem as follows, where the four columns are c, p, r, i.

>>> from scipy import linalg

>>> A=[[1,-1,0,0],[1,0,-1,0],[0,1,0,-.2],[0,0,1,1]]

>>> b=[0,0,0,100]

>>> linalg.solve(A,b)

array([ 66.66666667, 66.66666667, 66.66666667, 33.33333333])

Suppose government can also waste money, w. That is

r + i + w = 100

You now have 4 equations but 5 unknowns. So, you can use the lstsq method and get a solution.

>>> A=[[1,-1,0,0,0],[1,0,-1,0,0],[0,1,0,-2,0],[0,0,1,1,1]]

>>> linalg.lstsq(A,b)

(array([[ 27.27272727],

[ 27.27272727],

[ 27.27272727],

[ 13.63636364],

[ 59.09090909]]), array([], dtype=float64), 4, array([ 2.53498934, 1.87927105, 1.13653538, 0.86628904]))

The first array in the result contains the recommended values for the 5 parameters. The solution is undesirable, even if realistic. It suggests that 59% of the money be wasted! Your model will need to minimise wastage.

Optimising in the Face of Constraints

Scipy as yet does not include a module for optimising a utility function subject to constraints. One project extending scipy for this functionality is OpenOpt. OpenOpt uses python-cvxopt as one of the solvers. So, you will need to install python-cvxopt, whichis available in the Fedora or Ubuntu repository. OpenOpt can be installed from http://openopt.org/.

You define the optimisation problem using an lp example described on the OpenOpt site, as follows:

import scipy as np

from openopt import LP

# Parameters (c, p, r, i, w)

# Minimze w

f = np.array([0,0,0,0, 1])

# Equality constraints

# Same 4 eq as above

Aeq = np.array([[1,-1, 0, 0, 0],[1, 0, -1, 0, 0],

[0, 1, 0, -2, 0],[0, 0, 1, 1, 1]])

beq = (0,0,0,100)

# lower bounds – all values must be >= 0

lb = [0,0,0,0,0]

# Define and solve the problem

p = LP(f, Aeq=Aeq, beq=beq,lb=lb)

r = p.solve('cvxopt_lp')

p.debug = 1

# The optimum values

print 'x_opt:', r.xf

If everything goes well, you should see the result that wastage should be zero.


solver: cvxopt_lp problem: unnamed goal: minimum

pcost dcost gap pres dres k/t

0: 5.9091e+01 5.9091e+01 2e+02 4e-16 3e+00 1e+00

1: 3.7282e+01 3.7493e+01 3e+01 3e-14 4e-01 3e-01

2: 4.7204e-01 5.4936e-01 1e+00 1e-14 1e-02 8e-02

3: 4.7330e-03 5.5094e-03 1e-02 2e-14 1e-04 8e-04

4: 4.7331e-05 5.5096e-05 1e-04 3e-16 1e-06 8e-06

5: 4.7331e-07 5.5096e-07 1e-06 3e-14 1e-08 8e-08

6: 4.7331e-09 5.5096e-09 1e-08 3e-14 1e-10 8e-10

Optimal solution found.

istop: 1000 (optimal)

Solver: Time Elapsed = 0.1 CPU Time Elapsed = 0.01

objFunValue: 4.7331067e-09 (feasible, max constraint = 2.84217e-14)

x_opt: [ 6.66666667e+01 6.66666667e+01 6.66666667e+01 3.33333333e+01


You can explore alternate optimisations, e.g. maximize c (minimise -c) or minimise (w – c), etc. Your constraints can include inequalities as well.

OpenOpt includes non-linear optimisation tools as well.

Scipy and extensions make it easy to formulate and explore solutions of the problems. Matplotlib makes it easy to visualise the results. So, you can focus on the creative and hard part of defining the models and prevent any future crises in economics!


You can learn a lot from the courses available online. For example,