Stata旧笔记整理(三)

Stata旧笔记整理(三)

之前老网站上有很多没有很好整理的笔记。之前也整理过一些,但是还有两百多篇,所以就简单汇总一下,便于检索。

数值积分方法

区间[0, 1]上的积分

1
2
3
4
5
6
7
8
clear all
set more off
set seed 2890
set obs 10000
gen x = uniform()
gen y = x*sin(x)*ln(1+x^2)
qui sum y
di r(mean)

有界区间上的定积分

1
2
3
4
5
6
7
8
clear all
set more off
set seed 5678
set obs 10000
gen t = uniform()
gen y = (1+4*t)*sin(1+4*t)*ln(1+(1+4*t)^2)
qui sum y
di r(mean)*4
1
2
3
4
5
6
7
8
clear all
set more off
set seed 56789
set obs 10000
gen x = uniform()*4 + 1
gen y = x*sin(x)*ln(1+x^2)
qui sum y
di r(mean)*4

有界区域上的曲面积分

1
2
3
4
5
6
7
8
9
10
11
clear all
set more off
set obs 10000
set seed 1234
gen x = uniform()*2 - 1
set seed 6789
gen y = uniform()*2 - 1
keep if x^2 + y^2 <1
gen z = x^5 + 3*x^4 + ln(abs(x) + sin(x))
qui sum z
di r(mean)*c(pi)

无限区间上的广义积分

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
clear all
set more off
cap program drop integration
program define integration, rclass
version 14.2
syntax [, obs(integer 10000)]
drop _all
set obs `obs'
tempvar x
gen `x' = invnorm(uniform())
qui sum `x'
local var = r(Var)
return scalar var = `var'
end
simulate var = r(var), reps(1000): integration, obs(100000)
sum var
1
2
3
4
5
6
7
8
clear all
set more off
set seed 5678
set obs 10000
gen x = rnormal()
gen y = x^2
sum y
di r(mean)
1
2
3
4
5
6
7
8
clear all
set more off
set seed 2345
set obs 10000
gen t = uniform()
gen y = 1/t^2*(1/t-1)^2*exp(-(1/t-1)^2/2)
qui sum y
di r(mean)*2/sqrt(2*c(pi))

当积分区间不是[0, 1]的时候,可以通过一定的变量变换把区间变成[0, 1]

不收敛的广义定积分

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
clear all
set more off
cap program drop integration
program define integration, rclass
version 14.2
syntax [, obs(integer 10000)]
drop _all
set obs `obs'
tempvar x fx
gen `x' = uniform()
gen `fx' = 1/`x' + 1/(1-`x')
qui sum `fx'
local mean = r(mean)
return scalar mean = `mean'
end
simulate mean = r(mean), reps(1000): integration, obs(10000)
sum mean, d

用蒙特卡洛方法计算冰淇淋体积:

1
2
3
4
5
6
7
8
9
clear all
set more off
set obs 10000
gen x = uniform()
gen y = uniform()
keep if x^2 + y^2 <= 1
gen d = (1-sqrt(1-x^2-y^2)) + sqrt(x^2+y^2)
sum y
di r(mean) * c(pi)

数值模拟解方程

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
* 首先计算x^2 + x^3 = 12,先画图观察
clear all
gen y = x^2 + x^3 - 12
tw function y = x^2 + x^3 - 12, range(1.5 2.5)

clear all
set obs 100
local a = 0
local b = 16
gen y = .
gen x = .
local j = 1
forv i = 1/100{
local x = 0.5*(`a' + `b')
local y = `x'^2 + `x'^3 - 12
if `y' == 0 | abs(`y') < 0.000001 {
qui replace x = `x' in `j'
qui replace y = `y' in `j'
continue, break
}
else{
if `y' > 0{
local b = `x'
qui replace x = `x' in `j'
qui replace y = `y' in `j'
}
}
else{
if `y' < 0{
local a = `x'
qui replace x = `x' in `j'
qui replace y = `y' in `j'
}
}
local j = `j' + 1
}
* 此时解出方程的解为2
* 再解一个复杂的:
* 首先看图像
clear all
gen y = 5*x + x^2 + 3*x^3 + x^4 + 8*x^5 - 100
tw function y = 5*x + x^2 + 3*x^3 + x^4 + 8*x^5 - 100, range(1 2)

clear all
set obs 100
local a = 1.4
local b = 1.8
gen y = .
gen x = .
local j = 1
forv i = 1/100{
local x = 0.5*(`a' + `b')
local y = 5*`x' + `x'^2 + 3*`x'^3 + `x'^4 + 8*`x'^5 - 100
if `y' == 0 | abs(`y') < 0.000001 {
qui replace x = `x' in `j'
qui replace y = `y' in `j'
continue, break
}
else{
if `y' > 0{
local b = `x'
qui replace x = `x' in `j'
qui replace y = `y' in `j'
}
}
else{
if `y' < 0{
local a = `x'
qui replace x = `x' in `j'
qui replace y = `y' in `j'
}
}
local j = `j' + 1
}
* 解出x = 1.55

双变量带状热力图

这幅图的做法是将两张散点图叠加在一起,第一张散点图使用数字做标签,第二张散点图使用方块做标签。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
sysuse nlsw88, clear
* 生成一个变量g为8和grade中的较大值,因为绘制将只绘制教育年限大于8年的
gen g = max(grade, 8)
label var g "受教育年限"
rename married m
label var m "是否已婚"
* 按照g 和 m 计算wage的均值(均值是默认的)并生成一个新的数据集
collapse wage, by(g m)
* 生成一个心的变量w为wage保留两位有效数字
gen w = round(wage, 0.01)
sum wage, meanonly
*生成一个变量c用来控制方块的颜色
gen c = round((wage-r(min))/(r(max)-r(min))*255)
qui levelsof c, local(cs)
local g
foreach c of local cs{
local c1 = 120 + round(`c'/2)
local c2 = 255 - round(`c'/2)
local m mc("`c2' `c2' `c1'")
* mc用来设定标签的颜色,这里使用的是三个RGB值,三个255表示白色
local g `g' || scatter m g if c == `c', ms(S) msize(ehuge) `m'
}
local g `g' || scatter m g, ms(i) mlab(w) mlabp(0)
*上面这句就是绘制用变量值作为标签的图层了,指定标签的方向为0点钟的方向
local g `g' legend(off) yla(-0.75 " " 0 "N" 1 "Y" 1.75 " ", notick)
local g `g' xla(7 " " 8/18 19 " ", notick)
tw `g' scheme(s1mono) title(工资热力图) xsize(8) ysize(5)
graph export 工资热力图.png, replace

双因子多变量输出结果对比金字塔图

这幅图的做法是在条形图上添加文字

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
clear all
input input low high mean
1 1.6 2.2 1.9
2 1.5 2.3 1.9
5 2.9 .01 1.9
3 1.0 2.9 1.9
4 0.5 3.2 1.9
end
label var input "投入"
label var low "下限"
label var high "上限"
label var mean "均值"

label define inputlabel ///
1 "变量名1" ///
2 "变量名2" ///
3 "变量名3" ///
4 "变量名4" ///
5 "变量名5"

label val input inputlabel

gen sens = high - low
sort sens
local ml1 = (2.2+1.9)/2
local mh1 = (1.6+1.9)/2
local ml2 = (2.3+1.9)/2
local mh2 = (1.5+1.9)/2
local ml3 = (2.9+1.9)/2
local mh3 = (1.0+1.9)/2
local ml4 = (3.2+1.9)/2
local mh4 = (0.5+1.9)/2
local ml5 = (.01+1.9)/2
local mh5 = (2.9+1.9)/2
local format fcolor(white) lcolor(white)

tw ///
rbar low mean input, horizontal || ///
rbar mean high input, horizontal ///
yla(, val angle(0)) xsize(7.1) ysize(5.2) ///
scheme(s1mono) xtitle(" " "x轴标题") legend(label(1 "低投入值") label(2 "高投入值") rows(2)) ///
text(5 `ml5' "93.5%", box j(center) margin(l+1 r+1 t+1 b+1) `format') ///
text(5 `mh5' "99.9%", box j(center) margin(l+1 r+1 t+1 b+1) `format') ///
text(4 `ml4' "5%", box j(center) margin(l+1 r+1 t+1 b+1) `format') ///
text(4 `mh4' "0.8%", box j(center) margin(l+1 r+1 t+1 b+1) `format') ///
text(3 `ml3' "3", box j(center) margin(l+1 r+1 t+1 b+1) `format') ///
text(3 `mh3' "1", box j(center) margin(l+1 r+1 t+1 b+1) `format') ///
text(2 1.4 "1:28, 600", box margin(l+1 r+1 t+1 b+1) `format' placement(9) j(right)) ///
text(2 2.4 "1:18, 200", box margin(l+1 r+1 t+1 b+1) `format' placement(3) j(left)) ///
text(1 1.5 "13.4million", box margin(l+1 r+1 t+1 b+1) `format' placement(9) j(right)) ///
text(1 2.4 "19million", box margin(l+1 r+1 t+1 b+1) `format' placement(3) j(left)) ///
title(双因子多变量输出结果对比金字塔图) subtitle(一个示例)
graph export 双因子多变量输出结果对比金字塔图.png, replace

四象限图与坐标轴上的杠杠

去掉两个坐标轴,然后使用散点图的方法在两条直线上画杠杠。

1
2
3
4
5
6
7
8
9
10
11
12
13
clear all
sysuse auto, clear
gen y = (price - 5000)/1000
gen x = mpg - 20
* 这个图将会使用scatteri命令,注意这里真的是scatteri
local x `"||scatteri 0 -10 "-10" 0 10 "10" 0 20 "20", mlabpos(6) mlabsize(*1.25) msymbol(none) mlabcolor(black)"'
local x `"`x' ||scatteri 0 -10 "|" 0 10 "|" 0 20 "|", mlabpos(0) msymbol(none) mlabcolor(black)"'
local y `"|| scatteri -3 0 "-3" 3 0 "3" 6 0 "6" 9 0 "9", mlabpos(9) mlabsize(*1.25) msymbol(none) mlabcolor(black) "'
local y `"`y' || scatteri -3 0 "_" 3 0 "_" 6 0 "_" 9 0 "_", mlabpos(0) msymbol(none) mlabcolor(black)"'
local ax "yli(0, lc(black) lw(thin)) xli(0, lc(black) lw(thin))"
local ax "`ax' ysc(off) xsc(off) leg(off)"
scatter y x, ms(t) `ax'`x'`y' scheme(s2mono) graphr(fc(white)) yla(, nogrid)
graph export 四象限图与坐标轴上的杠杠.png, replace

特殊文本的使用

1
2
3
4
vguse allstates, clear
tw sc propval100 popk || ///
lfit propval100 popk ||, ///
ti(y = {&alpha} + {&beta}*pop + {&epsilon})

1
2
3
tw sc propval100 popk || ///
lfit propval100 popk ||, ///
ti(y = {&Alpha} + {&Beta}*pop + {&Epsilon})

1
2
3
tw sc propval100 popk || ///
lfit propval100 popk ||, ///
ti("p {&le} 0.150, y = {&function}(x), 1 {&ne} 2")

1
2
3
tw sc propval100 popk || ///
lfit propval100 popk ||, ///
ti(y = {&Beta}{sub:0} + {&Beta}{sub:1}*Pop + {&Beta}{sub:2}*Pop{sup:2})

1
2
3
tw sc propval100 popk || ///
qfit propval100 popk ||, ///
ti(y = {&Beta}{sub:0} + {&Beta}{sub:1}*Pop + {&Beta}{sub:2}*Pop{sup:2}) note({bf: Source of data: The 1990 US Census 5% PUMS housld file}) caption({it: Type of model: Quadratic regression})

1
2
3
4
tw sc propval100 popk || ///
qfit propval100 popk ||, ///
note({bf: Source of data: The 1990 US Census 5% PUMS housld file}) caption({it: Type of model: Quadratic regression}) ///
ti(stSerif: This is the title in a serif font.)

1
2
3
4
5
6
tw sc propval100 popk || ///
qfit propval100 popk ||, ///
note({stSerif: This note uses a monospace font.}) ///
caption({stSymbol: This caption use a symbol font. ABCDEFG abcdef}) ///
ti(stSans: This is the title in a sans font.) ///
subti(stSerif: This is the subtitle in a serif font.)

1
sc propval100 popk, ti({fontface Papyrus: This is the title in Papyrus.})

1
sc propval100 popk, ti({fontface Zapfino: This is the title in Zapfino.})

1
sc propval100 popk, ti({fontface MarkerFelt-Thin: This is the title in MarkerFelt-Thin.})

1
sc propval100 popk, ti({fontface Herculanum: This is the title in Herculanum.})

1
sc propval100 popk, ti({fontface AmericanTypewriter: This is the title in AmericanTypewriter.})

添加平均柱条

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
sysuse auto, clear
sum price if for == 0 & !missing(rep78)
local ave1 = r(mean)

sum price if for == 1 & !missing(rep78)
local ave2 = r(mean)

set obs `=_N+3'

replace for = 0 in -3
replace rep78 = 6 in -3

replace for = 0 in -2
replace rep78 = 7 in -2

replace price = `ave1' in -2

replace for = 1 in l
replace rep78 = 7 in l
replace price = `ave2' in l

label define kk 1 1 2 2 3 3 4 4 5 5 6 "-" 7 "Average"
label value rep78 kk

gr hbar (mean) price, over(rep78) over(for) showyvars ///
asyvars legend(off) ///
bar(7, color(7)) blabel(bar)

添加文本

placement:n/ne/e/se/s/sw/w/nw/c

1
2
vguse allstates, clear
tw sc ownhome borninstate, text(43 40 "DC", placement(ne))

分轴添加文本

1
2
3
tw sc ownhome propval100, xaxis(1) || ///
sc ownhome borninstate, xaxis(2) ||, ///
text(43 66 "DC") text(43 42 "DC", xaxis(2))

1
tw sc ownhome borninstate, ti("% who own home by" "% that reside in state of birth", box justification(left))

1
tw sc ownhome borninstate, ti("% who own home by" "% that reside in state of birth", box bexpand)

1
tw sc ownhome borninstate, ti("% who own home by" "% that reside in state of birth", box span bexpand)

bmargin(l r t b)

1
tw sc ownhome borninstate, ti("% who own home by" "% that reside in state of birth", box span bexpand justification(left) bmargin(0 0 3 3))

1
tw sc ownhome borninstate, ti("% who own home by" "% that reside in state of birth", box margin(medium))

1
tw sc ownhome borninstate, ti("% who own home by" "% that reside in state of birth", box linegap(4))

1
tw sc ownhome borninstate, ti("% own home by % reside in state", box fc(ltblue) lc(gray))

1
tw sc ownhome borninstate, ti("% own home by % reside in state", box bc(gold))

1
tw sc ownhome borninstate, by(nsw, ti("% own home by % reside in state", ring(0) pos(5) box width(65) height(40)))

1
tw sc ownhome borninstate, by(nsw, ti("% own home by % reside in state", ring(0) pos(5) box width(65) height(40) justification(left) alignment(top)))

条形图

1
2
sysuse sp500, clear
tw bar close date

竖直

1
tw bar close date, horiz

base()

1
tw bar close date, base(1150)

barw()

1
tw bar close date, barw(0.7) base(1150)

柱型

1
tw bar close date in 1/10, lcolor(gs5) fcolor(gs5)

条形图的着色

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
clear all
input ///
country gdp
1 100
2 110
3 240
4 50
5 10
end

list

gsort -gdp

gen gdpred = gdp if _n == 1
gen gdpgreen = gdp if _n == 2
gen gdpblue = gdp if _n > 2
list

gr hbar gdpred gdpgreen gdpblue, over( ///
country, sort(gdp) descending) nofill ///
leg(label(1 "Best") label(2 "2nd best") ///
label(3 "rest") r(1)) bar(1, color(cranberry)) ///
bar(2, color(dkgreen)) bar(3, color(dknavy))

# Stata

评论

程振兴

程振兴 @czxa.top
截止今天,我已经在本博客上写了659.4k个字了!

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×