Portfolio optimization using cvxpy
Using cvxpy to solve optimization problems
- 1. Objective
- 2. Components of optimization
- 3. Analysis
- 4. Specify the components for optimization
- 5. Solve optimization using cvxpy
2. Components of optimization
2.1 Decision variable
Matrix X refers to a portion of each individual stock
$X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$
Here, the constraint is: $X\geq0$
2.2 Constraints
- Budget constraint
- $e^Tx = 1.0$ where $e=[1,1,1]$
- In other words, the investment portion of each stock should sum up to 1
- Expected return constraint
- We want our expected return of our portfolio to be higher than a certain threshold
- $\mathbb E\begin{bmatrix} \sum_{i=1}^{3}\tilde{r}_i x_i\end{bmatrix} = \sum_{i=1}^{3} \mathbb E\begin{bmatrix}\tilde{r}_i\end{bmatrix}x_i = \sum_{i=1}^{3} \bar{r}_i x_i$
- This could be the same this as $\bar{r}^T X$ which means the sum of multiplication of average return and investment portion
- Thus, $ \sum_{i=1}^{3} \bar{r}_i x_i \geq r_{min}$
- $r_{min}$ would be set based on our judgement
2.3 Objective function
This is the criteria for choosing the best set of decision.
- It is to minimize the variance of the portfolio returns
- $x^TQx$ where $Q$ is a covariance matrix
2.4 Assumption
It is assumed that the monthly stock returns have a stationary proability distribution. This means that it has a fixed distribution, and this also means that the projections done based on historical data is valid.
import pandas as pd
import numpy as np
from cvxpy import *
import pandas_datareader as pdr
tickers = ['MSFT', 'V', 'WMT']
start_date = '2019-01-02'
end_date = '2021-12-31'
stock_price = pdr.DataReader(tickers, 'yahoo', start_date, end_date)
stock_price.head(3)
stock_price = stock_price[[( 'Close', 'MSFT'),
( 'Close', 'V'),
( 'Close', 'WMT')]]
stock_price.columns = ['MSFT', 'V', 'WMT']
stock_price.reset_index(inplace=True)
stock_price.head(3)
stock_price['year'] = stock_price.Date.dt.year
stock_price['month'] = stock_price.Date.dt.month
stock_price = stock_price.groupby(['year', 'month']).mean().reset_index(drop=True)
stock_price.head(3)
$$stockReturn = \frac{price_t-price_{t-1}}{price_{t-1}}$$
stock_price_shift = stock_price.shift().rename(columns={"MSFT":"MSFT_sft",
"V":"V_sft",
"WMT":"WMT_sft"})
stock_price = pd.concat([stock_price, stock_price_shift[['MSFT_sft', 'V_sft', 'WMT_sft']]], axis=1)
stock_price.head(3)
monthly_returns = pd.DataFrame()
for ticker in ['MSFT', 'V', 'WMT']:
returns = (stock_price[ticker] - stock_price[ticker+'_sft']) / stock_price[ticker+'_sft']
monthly_returns = pd.concat([monthly_returns, pd.DataFrame(returns)], axis=1)
# rename the column names
monthly_returns.columns = ['MSFT', 'V', 'WMT']
# drop first row since it is not used when calculating stock returns
monthly_returns.drop(0, axis=0, inplace=True)
monthly_returns.head(3)
print("Average monthly return over the two years.")
monthly_returns.mean().reset_index()
print("Standard deviation of each stock over two years.")
monthly_returns.std().reset_index()
Now, in order to utilize cvxpy to optimize the current problem, we would have to specify the components we have identified earlier.
Objective
- minimize $x^TQx$ where $Q$ is a covariance matrix (risk)
Constraints
- $X>0$ where X is a portion of budget that goes into each stock
- $\sum X=1$ Investment portion of each stock should sum up to 1
- $ \sum_{i=1}^{3} \bar{r}_i x_i \geq r_{min}$
For $r_{min}$, lets set it to 2% since Microsoft has the highest monthly average return, and it is 3.5%.
Realistically, a portfolio which is a combination of other stocks would have lower average return.
symbols = monthly_returns.columns.to_list()
n = len(symbols)
x = Variable(n)
min_return = 0.02
r = monthly_returns.mean().values
# potfolio_return = multiply(r.T, x)
potfolio_return = r.T @ x
# objective is to minimize risk
covariance_matrix = np.asmatrix(np.cov(monthly_returns.values.T))
risk = quad_form(x, covariance_matrix)
# set the optimization problem
prob = Problem(Minimize(risk),
[sum(x) == 1, potfolio_return >= min_return, x >= 0])
prob.solve()
print("Here is the portion of each stock in order to minimize risk while the return is bigger than 2%.\n")
for i in range(len(symbols)):
print("{}: {:.2f}%".format(symbols[i], x.value[i]*100))
print()
print("The volatility of this portfolio is {:.5f}%".format(risk.value**0.5*100))
print("The return of this portfolio is {}%".format(round(potfolio_return.value*100, 0)))
You can check out for the official example of csvpy for portfolio optimization here