1. Objective

Would like to know how much investment should go into each stocks, in order to optimize the portfolio.
Here, optimization means

  • expected return exceeds minimum threshold
  • minimize the risk of the portfolio return

2. Components of optimization

2.1 Decision variable

Matrix X refers to a portion of each individual stock $X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$
Here, the constraint is: $X\geq0$

2.2 Constraints

  • Budget constraint
    • $e^Tx = 1.0$ where $e=[1,1,1]$
    • In other words, the investment portion of each stock should sum up to 1
  • Expected return constraint
    • We want our expected return of our portfolio to be higher than a certain threshold
    • $\mathbb E\begin{bmatrix} \sum_{i=1}^{3}\tilde{r}_i x_i\end{bmatrix} = \sum_{i=1}^{3} \mathbb E\begin{bmatrix}\tilde{r}_i\end{bmatrix}x_i = \sum_{i=1}^{3} \bar{r}_i x_i$
    • This could be the same this as $\bar{r}^T X$ which means the sum of multiplication of average return and investment portion
    • Thus, $ \sum_{i=1}^{3} \bar{r}_i x_i \geq r_{min}$
    • $r_{min}$ would be set based on our judgement

2.3 Objective function

This is the criteria for choosing the best set of decision.

  • It is to minimize the variance of the portfolio returns
    • $x^TQx$ where $Q$ is a covariance matrix

2.4 Assumption

It is assumed that the monthly stock returns have a stationary proability distribution. This means that it has a fixed distribution, and this also means that the projections done based on historical data is valid.

3. Analysis

import pandas as pd
import numpy as np
from cvxpy import *
import pandas_datareader as pdr

3.1 Read ticker data

It is easy to read ticker data using pandas-datareader

tickers = ['MSFT', 'V', 'WMT']

start_date = '2019-01-02'
end_date = '2021-12-31'

stock_price = pdr.DataReader(tickers, 'yahoo', start_date, end_date)
stock_price.head(3)
Attributes Adj Close Close High Low Open Volume
Symbols MSFT V WMT MSFT V WMT MSFT V WMT MSFT V WMT MSFT V WMT MSFT V WMT
Date
2019-01-02 97.782417 130.463150 88.576424 101.120003 132.919998 93.339996 101.750000 133.740005 93.650002 98.940002 129.600006 91.639999 99.550003 130.000000 91.639999 35329300.0 8788000.0 8152700.0
2019-01-03 94.185211 125.761711 88.120903 97.400002 128.130005 92.860001 100.190002 131.279999 94.709999 97.199997 127.879997 92.699997 100.099998 131.210007 93.209999 42579100.0 9428300.0 8277300.0
2019-01-04 98.565704 131.179657 88.671318 101.930000 133.649994 93.440002 102.510002 134.589996 93.660004 98.930000 130.130005 92.690002 99.720001 130.440002 93.209999 44060600.0 11065800.0 8029100.0

3.2 Derive monthly return

We'll just take the close price for each ticker. The monthly return will be used for simplicity.

stock_price = stock_price[[(    'Close', 'MSFT'),
                            (    'Close',    'V'),
                            (    'Close',  'WMT')]]
stock_price.columns = ['MSFT', 'V', 'WMT']
stock_price.reset_index(inplace=True)

stock_price.head(3)
Date MSFT V WMT
0 2019-01-02 101.120003 132.919998 93.339996
1 2019-01-03 97.400002 128.130005 92.860001
2 2019-01-04 101.930000 133.649994 93.440002
stock_price['year'] = stock_price.Date.dt.year
stock_price['month'] = stock_price.Date.dt.month

stock_price = stock_price.groupby(['year', 'month']).mean().reset_index(drop=True)

stock_price.head(3)
MSFT V WMT
0 104.135238 136.442381 95.809047
1 107.927894 143.581579 97.775263
2 115.133810 152.260954 98.275714

$$stockReturn = \frac{price_t-price_{t-1}}{price_{t-1}}$$

stock_price_shift = stock_price.shift().rename(columns={"MSFT":"MSFT_sft", 
                                                        "V":"V_sft", 
                                                        "WMT":"WMT_sft"})
stock_price = pd.concat([stock_price, stock_price_shift[['MSFT_sft', 'V_sft', 'WMT_sft']]], axis=1)

stock_price.head(3)
MSFT V WMT MSFT_sft V_sft WMT_sft
0 104.135238 136.442381 95.809047 NaN NaN NaN
1 107.927894 143.581579 97.775263 104.135238 136.442381 95.809047
2 115.133810 152.260954 98.275714 107.927894 143.581579 97.775263
monthly_returns = pd.DataFrame()

for ticker in ['MSFT', 'V', 'WMT']:
    returns = (stock_price[ticker] - stock_price[ticker+'_sft']) / stock_price[ticker+'_sft']
    monthly_returns = pd.concat([monthly_returns, pd.DataFrame(returns)], axis=1)
    
# rename the column names
monthly_returns.columns = ['MSFT', 'V', 'WMT']
# drop first row since it is not used when calculating stock returns
monthly_returns.drop(0, axis=0, inplace=True)
monthly_returns.head(3)
MSFT V WMT
1 0.036420 0.052324 0.020522
2 0.066766 0.060449 0.005118
3 0.066540 0.050027 0.026698

4. Specify the components for optimization

To reiterate, the objective of this analysis is to build a portfolio that has minimum risk.
And while there are many ways to represent risk, we will use the variance of the return.

print("Average monthly return over the two years.")
monthly_returns.mean().reset_index()
Average monthly return over the two years.
index 0
0 MSFT 0.034952
1 V 0.013684
2 WMT 0.011622
print("Standard deviation of each stock over two years.")
monthly_returns.std().reset_index()
Standard deviation of each stock over two years.
index 0
0 MSFT 0.049329
1 V 0.048437
2 WMT 0.035628

Now, in order to utilize cvxpy to optimize the current problem, we would have to specify the components we have identified earlier.

Objective

  • minimize $x^TQx$ where $Q$ is a covariance matrix (risk)

Constraints

  • $X>0$ where X is a portion of budget that goes into each stock
  • $\sum X=1$ Investment portion of each stock should sum up to 1
  • $ \sum_{i=1}^{3} \bar{r}_i x_i \geq r_{min}$

For $r_{min}$, lets set it to 2% since Microsoft has the highest monthly average return, and it is 3.5%.
Realistically, a portfolio which is a combination of other stocks would have lower average return.

symbols = monthly_returns.columns.to_list()
n = len(symbols)
x = Variable(n)

min_return = 0.02
r = monthly_returns.mean().values
# potfolio_return = multiply(r.T, x)
potfolio_return = r.T @ x

# objective is to minimize risk
covariance_matrix = np.asmatrix(np.cov(monthly_returns.values.T)) 
risk = quad_form(x, covariance_matrix)

# set the optimization problem
prob = Problem(Minimize(risk), 
               [sum(x) == 1, potfolio_return >= min_return, x >= 0])

5. Solve optimization using cvxpy

prob.solve()
0.0009641333660832668
print("Here is the portion of each stock in order to minimize risk while the return is bigger than 2%.\n")
for i in range(len(symbols)):
    print("{}: {:.2f}%".format(symbols[i], x.value[i]*100))
    
print()
print("The volatility of this portfolio is {:.5f}%".format(risk.value**0.5*100))
print("The return of this portfolio is {}%".format(round(potfolio_return.value*100, 0)))
Here is the portion of each stock in order to minimize risk while the return is bigger than 2%.

MSFT: 34.33%
V: 17.87%
WMT: 47.79%

The volatility of this portfolio is 3.10505%
The return of this portfolio is 2.0%

You can check out for the official example of csvpy for portfolio optimization here