Stata Optimization and Markowitz Efficient Frontier

Stata Optimization and Markowitz Efficient Frontier

In my two previous tweets, I used Stata and Python to fit Markowitz Efficient Frontier respectively: Use Stata to Fine the Markowitz Efficient Frontier & Use Stata to Fine the Markowitz Efficient Frontier. But Stata verison is imperfection, because I don’t know how to optimize in Stata. This problem was finally solved today: How to optimize in Stata.

Get stock data

First of all, review the previous code. This time, I used four stocks: Yintai Resource, Dayuecheng, China tianying, and Tonghua golden horse. The time range is from January 1, 2019 to today, August 14.

Here I use the cntrade2 command in my ‘finance’ package, about how to use this package, you can reference: FINANCE: Finance Toolkit in Stata. The cntrade2 command was mainly modified from ‘cntrade’ which written by ‘Crawler Club’.

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
clear all
cd "stata-markowits"
foreach i in "000975" "000031" "000035" "000766"{
cntrade2 `i', start(20190101) end(20190814)
save `i', replace
}

use 000975, clear
keep close date
ren close Yintai
save a000975, replace

use 000031, clear
keep close date
ren close Dayuecheng
save a000031, replace

use 000035, clear
keep close date
ren close Tianying
save a000035, replace

use 000766, clear
keep close date
ren close Tonghua
save a000766, replace

Merge these four datesets horizontally:

Stata
1
2
3
4
5
6
use a000975, clear
foreach i in "000031" "000035" "000766"{
merge 1:1 date using a`i'
drop _m
}
save rawdata, replace

Observing the trend of stock price

Standardize the stock price at the begining of the period to 100, and observe the stock price trend of four stocks:

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
use rawdata, clear
/* Standardization of stock price(Transform stock price in Juanuary 1, 2019 into 100). */
foreach i of varlist _all{
if "`i'" != "date"{
replace `i' = (`i' / `=`i'[1]') * 100
format `i' %6.2f
}
}
colorscheme 4, palette(Dark2)
tw ///
line Yintai Dayuecheng Tianying Tonghua date, ///
xla(21550(45)21775, ang(20)) leg(order(1 "Yintai" 2 "Dayuecheng" ///
3 "Tianying" 4 "Tonghua") pos(1) ring(0)) ///
lp(solid solid solid solid) lc(`r(colors)')

Next, we calculate the logarithmic return of each stock. We need to pay attention to that there are intervals between trading days. So we can’t use directly use the ‘date’ variable as date index.

Calculating the logarithmic returns

The formula of calculating the logarithmic returns:

$$R_{n} = \log{P_{n}} - \log{P_{n-1}}$$

where $P_{n}$ indicates stock price at $n$th day.

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
/* Calculating the logarithmic returns */
use rawdata, clear
gen dateid = _n
tsset dateid
foreach i of varlist _all{
if "`i'" != "date" & "`i'" != "dateid"{
gen l`i' = l.`i'
replace `i' = log(`i'/l`i')
drop l`i'
format `i' %6.4f
}
}
drop dateid
save returndata, replace

Observe the distribution of stock logarithmic returns:

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
use returndata, clear
foreach i of varlist _all{
if "`i'" != "date"{
ren `i' returns`i'
}
}

reshape long returns, i(date) j(name) string
hist returns, by(name, r note("")) bin(50) ///
xti(Daily Logarithmic Return Rate) yti(Frequency) ///
xla(, format(%6.2f)) norm

use returndata, clear
drop in 1

Portfolio construction

Maximizing Sharpe-ratio

Sharpe-ratio represents that how many excess returns can traders get when they take one unit additional risk. If it is positive, it means that the excess return rate is higher than the volatility. If it is negative, it means that the operational risk is higher than the excess return rate. In this way, each portfolio can calculate a Sharpe-ratio, that is, the ratio of return on investment to risk taking. The higher the Sharpe ratio, the better the corresponding portfolio.

$$Sharpe-ratio = \frac{R_{p} - R_{f}}{\sigma_{p}}$$

Where $R_{p}$ represents portfolio’s return. $R_{f}$ stand for risk-free interest rate which is assumed to be $4\%$, $\sigma_{p}$ is the standard deviation of portfolio returns, repersenting the volatility of portfolio returns.

Assuming that there are $n$ securities in the portfolio, the weight of security $i$ if $w_{i}$ with return rate at $r_{i}$. Then the standard deviation of the portfolio is: $\sqrt{w’Vw}$, where $w$ is the weight vector and $V$ is the variance-covariance matrix of the return rate vector. Next, we can write a program to maximize Sharpe-ratio. Now, we have four stocks, that is two say, we need to determine the weight of four stocks, Since the total weight should be 1, so we just need to determine three weight parameters.

First, calculate the logarithmic return of each stock and stored in a matrix named returns(actually a one-dimensional row vector).

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
use returndata, clear
drop in 1

local j = 1
foreach i of varlist _all{
if "`i'" != "date"{
qui sum `i'
local r`j' = r(mean)
local j = `j' + 1
}
}
di "`r1', `r2', `r3', `r4'"
mat returns = (`r1', `r2', `r3', `r4')

Then, calculate the variance-covariance matrix, stored it in matrix variance:

Stata
1
2
corr Yintai Dayuecheng Tianying Tonghua, cov
mat variance = r(C)

The next step if to use Stata to optimize. Four your understanding, Let me give you an example from Stata help document. If I want to optimize $y = e^{-x^2 + x - 3}$:

Stata
1
2
3
4
5
6
7
8
9
10
11
mata:
void myeval(todo, x, y, g, H)
{
y = exp(-x^2 + x - 3)
}
S = optimize_init()
optimize_init_evaluator(S, &myeval())
optimize_init_params(S, 0)
x = optimize(S)
x
end

mata: means to enter Mata language environment, then define a function named myeval, void means that the function has no return value. There are five parameters: $todo, x, y, g, H$. Here we just need to use $x$ and $y$.

That is to say, the parameters to be tuned are placed in the position of the second parameter, and the return value of the function is placed in the position of the third parameter.

S = optimize_init() indicates initialization, optimize_init_evaluator(S, &myeval()) set which function to be optimized, optimize_init_params(S, 0) set the initial parameter values, here is $x = 0$,The defaul optimization method is Newton–Raphson. At last, execute x = optimize(S) to optimize, $x$ is what we need.

First, we allow short selling which means the weight parameters can be negative.

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
mata:
mata clear
void sharpe_ratio(real scalar todo, real vector w123, sharpe, g, H)
{
real scalar w4
w4 = 1 - sum(w123)
real vector rmean
rmean = st_matrix("returns")
real scalar rp
real vector w
w = (w123, w4)
rp = w * rmean'
rp = rp * 252
real matrix variancep
variancep = st_matrix("variance")
variancep = w * variancep * w'
variancep = variancep * 252
real scalar std
std = sqrt(variancep[1, 1])
sharpe = (rp - 0.04) / std
}

S = optimize_init()
optimize_init_evaluator(S, &sharpe_ratio())
optimize_init_technique(S, "nr")
optimize_init_params(S, J(1, 3, 0.25))
wh = optimize(S)
wh

w = (wh, 1-sum(wh))
rp_yearly = (wh, 1-sum(wh)) * st_matrix("returns")' * 252
variancep = (w * st_matrix("variance") * w') * 252
std = sqrt(variancep[1, 1])
sharpe = (rp_yearly - 0.04) / std
"The weight vector is: "
w

"The portfolio return is:"
rp_yearly

"The portfolio standard deviation is:"
std

"The portfolio Sharpe-ratio is:"
sharpe
end

Here $w123$ is a three-dimensional row vectol. The result of above codes is:

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
Iteration 0:   f(p) =  .40753973  (not concave)
Iteration 1: f(p) = 1.7935652 (not concave)
Iteration 2: f(p) = 2.4598186
Iteration 3: f(p) = 2.7009405
Iteration 4: f(p) = 2.7899491
Iteration 5: f(p) = 2.7955035
Iteration 6: f(p) = 2.7955184
Iteration 7: f(p) = 2.7955184

The weight vector is:
1 2 3 4
+-------------------------------------------------------------+
1 | .8653939127 1.508855624 -.5683459377 -.8059035992 |
+-------------------------------------------------------------+

The portfolio return is:
1.764668396
The portfolio standard deviation is:
.6169404465
The portfolio Sharpe-ratio is:
2.795518443

This it the portfolio with maximum Sharpe-ratio when we allow short selling.

Consider the short selling is not allowed. In order to ensure that the weight is positive and the sum is 1, I use the characteristics of the $e^x$ function.

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
use returndata, clear
drop in 1

local j = 1
foreach i of varlist _all{
if "`i'" != "date"{
qui sum `i'
local r`j' = r(mean)
local j = `j' + 1
}
}
di "`r1', `r2', `r3', `r4'"
mat returns = (`r1', `r2', `r3', `r4')

corr Yintai Dayuecheng Tianying Tonghua, cov
mat variance = r(C)


mata:
mata clear
void sharpe_ratio_no_short(real scalar todo, real vector w1234, sharpe, g, H)
{

real vector rmean
rmean = st_matrix("returns")
real scalar rp
real vector w
w = (exp(w1234[1]), exp(w1234[2]), exp(w1234[3]), exp(w1234[4])) / sum((exp(w1234[1]), exp(w1234[2]), exp(w1234[3]), exp(w1234[4])))
rp = w * rmean'
rp = rp * 252
real matrix variancep
variancep = st_matrix("variance")
variancep = w * variancep * w'
variancep = variancep * 252
real scalar std
std = sqrt(variancep[1, 1])
sharpe = (rp - 0.04) / std
}

S = optimize_init()
optimize_init_evaluator(S, &sharpe_ratio_no_short())
optimize_init_technique(S, "nr")
optimize_init_params(S, J(1, 4, 0.01))
wh = optimize(S)
wh

w = (exp(wh[1]), exp(wh[2]), exp(wh[3]), exp(wh[4])) / sum((exp(wh[1]), exp(wh[2]), exp(wh[3]), exp(wh[4])))
rp_yearly = ((exp(wh[1]), exp(wh[2]), exp(wh[3]), exp(wh[4])) / sum((exp(wh[1]), exp(wh[2]), exp(wh[3]), exp(wh[4])))) * st_matrix("returns")' * 252
variancep = (w * st_matrix("variance") * w') * 252
std = sqrt(variancep[1, 1])
sharpe = (rp_yearly - 0.04) / std
"The weight vector is: "
w

"The portfolio return is:"
rp_yearly

"The portfolio standard deviation is:"
std

"The portfolio Sharpe-ratio is:"
sharpe
end
`

And the result is:

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
: "The weight vector is: "
The weight vector is:

: w
1 2 3 4
+---------------------------------------------------------+
1 | .4691822479 .5308177521 3.35273e-13 5.02331e-16 |
+---------------------------------------------------------+

:
: "The portfolio return is:"
The portfolio return is:

: rp_yearly
.5616682688

:
: "The portfolio standard deviation is:"
The portfolio standard deviation is:

: std
.3221560385

:
: "The portfolio Sharpe-ratio is:"
The portfolio Sharpe-ratio is:

: sharpe
1.619303091

: end

A Sharpe-ratio function

I also write a function to calculate the Sharpe-ratio:

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
use returndata, clear
drop in 1

local j = 1
foreach i of varlist _all{
if "`i'" != "date"{
qui sum `i'
local r`j' = r(mean)
local j = `j' + 1
}
}
di "`r1', `r2', `r3', `r4'"
mat returns = (`r1', `r2', `r3', `r4')

corr Yintai Dayuecheng Tianying Tonghua, cov
mat variance = r(C)
mata:
mata clear
real scalar sharpe_ratio(real scalar w1, real scalar w2, real scalar w3)
{
real scalar w4
w4 = 1 - w1 - w2 - w3
real colvector w
w = (w1 \ w2 \ w3 \ w4)
real matrix r
real colvector rmean
rmean = st_matrix("returns")
real scalar rp
rp = w' * rmean'
rp = rp * 252
real matrix variancep
variancep = st_matrix("variance")
variancep = w' * variancep * w
variancep = variancep * 252
real scalar std
std = sqrt(variancep[1, 1])
real scalar sharpe
sharpe = (rp - 0.04) / std
return(sharpe)
}
end

mata: sharpe_ratio(0.4691822479 , 0.5308177521, 0)

The result is:

Stata
1
2
. mata: sharpe_ratio(0.4691822479 , 0.5308177521, 0)
1.619303091

This is consistent with the results of above optimization.

When the short selling is allowed, the portfolio whit maximum Sharpe-ratio is to buy half of Yintai resource and half of Dayuecheng。

Next, we use Monte Carlo simulation to observe the relationship between the return and standard deviation of all possible combinations when short selling is allowed.

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
cap prog drop front
prog def front, rclass
version 15.1
use returndata, clear
local w1 = runiform()
local w2 = runiform()
local w3 = runiform()
local w4 = runiform()
mat weight = (`w1' \ `w2' \ `w3' \ `w4')
mat list weight
mat weight = weight / (`w1'+`w2'+`w3'+`w4')
mat list weight
ret scalar w1 = weight[1, 1]
ret scalar w2 = weight[2, 1]
ret scalar w3 = weight[3, 1]
ret scalar w4 = weight[4, 1]
local j = 1
foreach i of varlist _all{
if "`i'" != "date"{
qui sum `i'
local r`j' = r(mean)
local j = `j' + 1
}
}
mat returns = (`r1', `r2', `r3', `r4')
mat list returns
mat a = returns * weight * 252
mat list a
ret scalar ret = a[1, 1]
corr Yintai Dayuecheng Tianying Tonghua, cov
ret list
mat variance = r(C)
mat list variance
mat b = weight' * variance * 252 * weight
mat list b
ret scalar var = b[1, 1]
ret scalar std = sqrt(b[1, 1])
end

simulate ret = r(ret) var = r(var) std = r(std) w1 = r(w1) w2 = r(w2) w3 = r(w3) w4 = r(w4), reps(100000): front
gen sharpe_ratio = (ret - 0.04)/std
save montecarlo, replace

Display using graph:

Stata
1
2
3
4
5
6
7
8
use montecarlo, clear
tw ///
sc ret std, msize(*0.01) xti(Volatility) yti(Return) msymbol(o) ///
xla(#6, format(%6.2f)) yla(, format(%6.1f)) ///
ti("Figure: The Relationship Between Portfolio Return and Volatility") || ///
scatteri 1.210332617 0.5197421221 (9) "Short Selling Allowed", mc("red") msymbol(o) mlabc("red") || ///
scatteri 0.5616682688 0.3221560385 "Short Selling Not Allowed", mc("green") msymbol(o) mlabc("green") ///
mlabp(2) ||, leg(off)

Obviously, short selling increases both returns and volatility.

Minimizing variance

Minimizing variance equal to minimize standard deviation. which is the right-most point on the figure above. The following procedure is used to find the corresponding combination construction method for this point.

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
use returndata, clear
drop in 1

local j = 1
foreach i of varlist _all{
if "`i'" != "date"{
qui sum `i'
local r`j' = r(mean)
local j = `j' + 1
}
}
di "`r1', `r2', `r3', `r4'"
mat returns = (`r1', `r2', `r3', `r4')

corr Yintai Dayuecheng Tianying Tonghua, cov
mat variance = r(C)

/* Short selling is allowed */
mata:
mata clear
void min_std(real scalar todo, real vector w123, std, g, H)
{
real scalar w4
w4 = 1 - sum(w123)
real vector rmean
rmean = st_matrix("returns")
real scalar rp
real vector w
w = (w123, w4)
rp = w * rmean'
rp = rp * 252
real matrix variancep
variancep = st_matrix("variance")
variancep = w * variancep * w'
variancep = variancep * 252
std = -sqrt(variancep[1, 1])
}

S = optimize_init()
optimize_init_evaluator(S, &min_std())
optimize_init_technique(S, "nr")
optimize_init_params(S, J(1, 3, 0.25))
wh = optimize(S)
wh

w = (wh, 1-sum(wh))
rp_yearly = (wh, 1-sum(wh)) * st_matrix("returns")' * 252
variancep = (w * st_matrix("variance") * w') * 252
std = sqrt(variancep[1, 1])
sharpe = (rp_yearly - 0.04) / std
"The weight vector is: "
w

"The portfolio return is:"
rp_yearly

"The portfolio standard deviation is:"
std

"The portfolio Sharpe-ratio is:"
sharpe
end

Note that in the code above, I put a negative sign in std = -sqrt(variancep[1, 1]) which minimizes the standard deviation when maximizing std.
The result is:

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
: w
1 2 3 4
+-------------------------------------------------------------+
1 | .2626422792 .5289960484 .2535941825 -.0452325102 |
+-------------------------------------------------------------+

:
: "The portfolio return is:"
The portfolio return is:

: rp_yearly
.4228356242

:
: "The portfolio standard deviation is:"
The portfolio standard deviation is:

: std
.2972620247

:
: "The portfolio Sharpe-ratio is:"
The portfolio Sharpe-ratio is:

: sharpe
1.287872626

: end

The location of this point is:

Stata
1
2
3
4
5
6
7
8
9
10
use montecarlo, clear
tw ///
sc ret std, msize(*0.01) xti(Volatility) yti(Return) msymbol(o) ///
xla(#6, format(%6.2f)) yla(, format(%6.1f)) ///
ti("Figure: The Relationship Between Portfolio Return and Volatility") || ///
scatteri 1.210332617 0.5197421221 (9) "Short Selling Allowed", mc("red") msymbol(o) mlabc("red") || ///
scatteri 0.5616682688 0.3221560385 "Short Selling Not Allowed", mc("green") msymbol(o) mlabc("green") ///
mc("green") msymbol(o) mlabc("green") mlabp(12) || ///
scatteri 0.4228356242 0.2972620247 "Minimum Variance Point", ///
mc("pink") msymbol(o) mlabc("pink") ||, leg(off)

Constructing the efficient frontier

The efficient frontier is a line on which the point represents the portfolio with minimum volatility under given return rate. Obvious it is the boundary line from the minimum variance point to the maximum Sharpe-ratio point. From the above graph, we can see that the return rate range from the minimum variance to the maximum Sharpe-ratio point which allows short selling is [0.44, 1.77]. Then calculte each standard deviation of at the interval of 0.01.

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
use returndata, clear
drop in 1
local j = 1
foreach i of varlist _all{
if "`i'" != "date"{
qui sum `i'
local r`j' = r(mean)
local j = `j' + 1
}
}
di "`r1', `r2', `r3', `r4'"
mat returns = (`r1', `r2', `r3', `r4')
corr Yintai Dayuecheng Tianying Tonghua, cov
mat variance = r(C)

mata:
mata clear
void min_std(real scalar todo, real vector w1234, std, g, H)
{
real vector rmean
rmean = st_matrix("returns")
real scalar rp
real vector w
w = w1234
rp = w * rmean'
rp = rp * 252
real matrix variancep
variancep = st_matrix("variance")
variancep = w * variancep * w'
variancep = variancep * 252
std = -sqrt(variancep[1, 1])
}
std_vector = J(1, 79, 0)
j = 1
for(i = 0.44; i < 1.23; i = i + 0.01)
{
S = optimize_init()
optimize_init_evaluator(S, &min_std())
optimize_init_technique(S, "nr")
optimize_init_constraints(S, ((st_matrix("returns") * 252 \ J(1, 4, 1)), (i \ 1)))
optimize_init_params(S, J(1, 4, 0.25))
wh = optimize(S)
wh

w = wh
variancep = (w * st_matrix("variance") * w') * 252
std = sqrt(variancep[1, 1])
std_vector[1, j] = std
j++
}
st_matrix("std_vector", std_vector)
end
clear
set obs 79
gen ret = (_n + 43) / 100
gen std = .
forval i = 1/79{
replace std = std_vector[1, `i'] in `i'
}
gen front = 1
save front, replace

Similarly, we plot these 134 points on the graph:

Stata
1
2
3
4
5
6
7
8
9
10
11
12
13
use montecarlo, clear 
append using front
tw ///
sc ret std, msize(*0.01) xti(Volatility) yti(Return) msymbol(o) ///
xla(#6, format(%6.2f)) yla(, format(%6.1f)) ///
ti("Figure: The Relationship Between Portfolio Return and Volatility") || ///
scatteri 1.210332617 0.5197421221 (11) "Short Selling Allowed", mc("red") msymbol(o) mlabc("red") || ///
scatteri 0.5616682688 0.3221560385 (2) "Short Selling Not Allowed", mc("green") msymbol(o) mlabc("green") ///
mc("green") msymbol(o) mlabc("green") mlabp(12) || ///
scatteri 0.4228356242 0.2972620247 "Minimum Variance Point", ///
mc("pink") msymbol(o) mlabc("pink") || ///
sc ret std if front == 1, msize(tiny) msymbol(D) ///
mc("dkorange") leg(off)





Click to download the relevant code for this article


unsplash-logoKeytion

# Stata

Comments

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×