Stata Old Notes Roundup (Part 5)

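My old website had many notes that were never properly organized. I have tidied up some of them before, but more than two hundred remain, so this post simply collects them in one place to make them easier to search.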

A density plot

cap which spgrid
if _rc != 0{
ssc install spgrid
}
cap which spkde
if _rc != 0{
ssc install spkde
}
cap which mylabels
if _rc != 0{
ssc install mylabels
}

sysuse auto, clear

su price mpg
clonevar x = mpg
* Unlike gen, clonevar also copies the variable and value labels
clonevar y = price
replace x = (x-0)/(50-0)
replace y = (y-0)/(20000-0)
* Define custom axis labels on the rescaled (0-1) axes
mylabels 0(10)50, myscale((@-0)/(50-0)) local(XLAB)
mylabels 0(5000)20000, myscale((@-0)/(20000-0)) local(YLAB)
keep x y
save xy, replace

* Generate the grid of cells and points (100 cells along the x dimension)
spgrid, shape(hexagonal) xdim(100) ///
xrange(0 1) yrange(0 1) ///
dots replace ///
cells("2D-GridCells.dta") ///
points("2D-GridPoints.dta")

spkde using "2D-GridPoints.dta", ///
xcoord(x) ycoord(y) ///
bandwidth(fbw) fbw(0.1) dots ///
saving("2D-Kde.dta", replace)

use "2D-Kde.dta", clear

merge 1:1 _n using xy.dta

tw contour p spgrid_ycoord spgrid_xcoord ///
if p != 0, levels(15) || ///
sc y x, mcolor(black) msize(small) ||, ///
xla(`XLAB', nogrid) xti("Mileage (mpg)") ///
yla(`YLAB', nogrid) yti("Price") ///
plotregion(color(blue)) scheme(s1color) ///
zla(, format(%6.2f))
gr export 一幅密度图.png, replace

A fun text plot

clear programs
program s
sysuse auto, clear
tempname hdle
tempfile info
postfile `hdle' str20 variable missings using `info'
qui ds, has(type numeric)
qui foreach var in `r(varlist)'{
replace `var' = . if runiform() < 0.1
qui count if `var' == .
post `hdle' ("`var'") (r(N))
}
postclose `hdle'

use `info', clear
gen number = _n
list, noobs
gen x = 1
sum missings, meanonly
gen factor = missings/r(mean)
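* Loop over observations so that each name is paired with its own size factor
* (levelsof returns the names in sorted order, which does not match the row order)
forvalues a = 1/`=_N' {
local i = variable[`a']
local f = factor[`a']
local txt `"`txt' text(`a' 1 "`i'", size(*`f'))"'
}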

sc number x, ms(i) ///
xti("") xsc(off) ///
yti("") ysc(off) ///
plotr(lstyle(none)) ///
`txt'
end

s

A regression command

cap prog drop myreg
prog define myreg, eclass
version 14.0
* replay(): 1 if the first nonblank character of local macro `0' is a comma, or if `0' is empty
if !replay(){
syntax varlist(min = 2 numeric) [if] [in] [, Level(cilevel)]
marksample touse
tempname YXX XX Xy b hat V
* Compute the cross-product matrix YXX (matrix accum appends a constant column)
qui matrix accum `YXX' = `varlist' if `touse'
local nobs = r(N)
local df = `nobs' - (rowsof(`YXX') - 1)
matrix `XX' = `YXX'[2..., 2...]
matrix `Xy' = `YXX'[1,2...]
* Compute the coefficient vector b
matrix `b' = `Xy' * invsym(`XX')
* Compute the variance-covariance matrix
matrix `hat' = `b' * `Xy''
matrix `V' = invsym(`XX') * (`YXX'[1,1] - `hat'[1,1]) / `df'
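* Post b and V to e(); tokenize first so that `1' is guaranteed to be the dependent variable
tokenize "`varlist'"
eret post `b' `V', dof(`df') obs(`nobs') depname(`1') esample(`touse')
* Save the estimation information
eret local depvar "`1'"
eret local cmd "myreg"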
}
else {
syntax [, Level(cilevel)]
}
if "`e(cmd)'" != "myreg" error 301
eret di, level(`level')
end


sysuse auto, clear
myreg price mpg
myreg price mpg rep78 headroom
myreg price mpg rep78 headroom, l(90)
myreg
reg price mpg rep78 headroom, l(90)
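
The matrix algebra inside myreg can be checked in isolation. The sketch below is a standalone illustration, not part of the program: the matrix names A, XX, Xy, and b are chosen here for the example. It builds the cross-product matrix with matrix accum, solves the normal equations, and compares the coefficients with those from regress.

```stata
sysuse auto, clear
* matrix accum forms Z'Z for Z = (price, mpg, weight, _cons);
* the constant is appended automatically as the last column
matrix accum A = price mpg weight
* Dropping the first row and column leaves X'X for X = (mpg, weight, _cons)
matrix XX = A[2..., 2...]
* The first row without its first element is y'X
matrix Xy = A[1, 2...]
* OLS coefficients as a row vector: b = y'X (X'X)^-1
matrix b = Xy * invsym(XX)
matrix list b
* regress should return the same coefficients up to rounding error
quietly regress price mpg weight
display mreldif(b, e(b))
```

The displayed relative difference should be tiny; it is not exactly zero only because regress uses a numerically more careful algorithm than the textbook formula.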

Some basic elements and tools

sysdir and adopath

. sysdir
STATA: /Applications/Stata 14/
BASE: /Applications/Stata 14/ado/base/
SITE: /Applications/Stata 14/ado/site/
PLUS: ~/Library/Application Support/Stata/ado/plus/
PERSONAL: ~/Library/Application Support/Stata/ado/personal/
OLDPLACE: ~/ado/
  • BASE contains the ado-files shipped with official Stata;
  • SITE may point to a network drive in a university or corporate setting, where a system administrator places ado-files to be shared by many users;
  • PERSONAL is where you can put your own ado-files;
  • PLUS is created automatically when you install user-written commands.
. adopath
[1] (BASE) "/Applications/Stata 14/ado/base/"
[2] (SITE) "/Applications/Stata 14/ado/site/"
[3] "."
[4] (PERSONAL) "~/Library/Application Support/Stata/ado/personal/"
[5] (PLUS) "~/Library/Application Support/Stata/ado/plus/"
[6] (OLDPLACE) "~/ado/"
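  • The output of adopath is essentially the same as that of sysdir. The third entry, ".", is the current working directory, and the order shown is the order in which Stata searches for ado-files.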
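
As a small sketch of how to query these locations from a do-file, the same directories are available as c-class values, and which reports the file that a command resolves to along the search path. The choice of sysuse as the command to look up is only an example.

```stata
* Individual system directories are available as c-class values
display "`c(sysdir_base)'"
display "`c(sysdir_plus)'"
display "`c(sysdir_personal)'"
* The search order is also stored as a single semicolon-separated string
display "`c(adopath)'"
* which reports the file that implements a command, following that order
which sysuse
```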

Data types

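  • Fixed-width string variables (str#) can be up to 2,045 bytes long.
  • If you need longer strings, declare the variable as strL, which can hold up to 2 billion bytes.
  • Numeric storage types:
Storage type   Minimum                    Maximum                   Bytes
byte           -127                       100                       1
int            -32,767                    32,740                    2
long           -2,147,483,647             2,147,483,620             4
float          $-1.701 \times 10^{38}$    $1.701 \times 10^{38}$    4
double         $-8.988 \times 10^{307}$   $8.988 \times 10^{307}$   8
  • Different storage types have different precision, so 0.01 stored as a float may not equal 0.01 stored as a double. Use double when you need higher precision.
  • IDs with many digits should be stored as strings, because a float holds only about 7 significant digits;
  • When comparing a floating-point value with a constant, do not rely on exact equality; use the reldif() function for an approximate comparison instead.
  • In linear regression the scale of the variables generally does not affect precision, but in nonlinear regression it can.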
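
The precision points above can be seen in a one-observation dataset. This is a minimal sketch; the variable names xf and xd are made up for the illustration.

```stata
clear
set obs 1
* 0.01 has no exact binary representation, so float and double round it differently
generate float  xf = 0.01
generate double xd = 0.01
* The literal 0.01 is evaluated in double precision, so the float never matches it
count if xf == 0.01
* Rounding the constant to float precision first restores the match
count if xf == float(0.01)
* reldif(x, y) = |x - y| / (|y| + 1) allows a tolerance-based comparison
count if reldif(xf, xd) < 1e-6
```

The first count finds no observations, while the other two find the single observation.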

Handling dates and times

Time unit Format
date %td
week %tw
month %tm
quarter %tq
half-year %th
  • Stata also supports business calendars (for example, trading days) through the bcal command.
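
As a sketch of how the formats in the table are used, the example below stores one daily date and converts it to each of the other time units. The date itself (15 January 2020) and the variable names are arbitrary choices for the illustration.

```stata
clear
set obs 1
* A daily date is stored as the number of days since 01jan1960
generate d = mdy(1, 15, 2020)
format d %td
* Convert the daily date to other time units, then apply the matching format
generate w = wofd(d)
format w %tw
generate m = mofd(d)
format m %tm
generate q = qofd(d)
format q %tq
generate h = hofd(d)
format h %th
list
```

Each variable holds a plain integer; the format only controls how that integer is displayed, so a value must be converted with the matching function before a different %t format is applied.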

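Time-series operators

  • L., F., D., and S. are the lag, lead, difference, and seasonal-difference operators, respectively. A number after the letter sets the lag or order: L2.x is x two periods back, and S12.x is x minus its value 12 periods earlier.
  • These can be confusing at first and are best learned by using them in practice.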
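
A small made-up series makes the operators concrete. In this sketch y = t^2 is used only because its lags and differences are easy to verify by hand; the variable names are illustrative.

```stata
clear
set obs 8
generate t = _n
generate y = t^2
* Time-series operators require the data to be tsset (or xtset) first
tsset t
* Lag: y at t-1
generate lag1 = L.y
* Lead: y at t+1
generate lead1 = F.y
* First difference: y(t) - y(t-1)
generate diff1 = D.y
* Second difference: the difference of the first difference
generate diff2 = D2.y
* Lag-4 seasonal difference: y(t) - y(t-4)
generate sdiff4 = S4.y
list
```

Note that D2.y differences twice, whereas S2.y would subtract the value two periods back; observations without enough history are set to missing.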

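Factor variables and operators

  • i.: the factor-variable operator, which treats a variable as categorical and creates indicators for its levels;
  • #: creates an interaction between two variables;
  • ##: creates a full factorial interaction, that is, the main effects plus all their interactions.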
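
The difference between # and ## is easiest to see by fitting the same model both ways. This is a minimal sketch on the auto data; the scalar names r2_long and r2_short are made up for the comparison.

```stata
sysuse auto, clear
* i.rep78 expands the categorical variable into indicator variables
regress price i.rep78
* a#b adds only the interaction; c. marks weight as continuous
regress price i.foreign c.weight i.foreign#c.weight
scalar r2_long = e(r2)
* a##b is shorthand for the main effects plus the interaction
regress price i.foreign##c.weight
scalar r2_short = e(r2)
* Both specifications describe the same model
display reldif(r2_long, r2_short)
```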

The infile command:

infile strL var1 strL var2 using temp.txt, clear
. clear all

. type temp.raw
AK 12.34 .09 262000 Alaska 1
AL 9.02 .075 37800 Alabama 2

. infile str2 state mem prop potential str20 state_name key using temp
(2 observations read)

. list

+--------------------------------------------------+
| state mem prop potent~l state_~e key |
|--------------------------------------------------|
1. | AK 12.34 .09 262000 Alaska 1 |
2. | AL 9.02 .075 37800 Alabama 2 |
+--------------------------------------------------+

The import delimited command:

  • Imports tab-delimited or comma-delimited files
. clear all

. import delimited using temp.csv
(5 vars, 2 obs)

. list

+----------------------------------------+
| v1 v2 v3 v4 v5 |
|----------------------------------------|
1. | AK 12.34 .09 262000 Alaska 1 |
2. | AL 9.02 .075 37800 Alabama 2 |
+----------------------------------------+
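  • One difference between infile and import delimited is that infile can be combined with if or in to select which data to read. For example, in 1/1000 reads only the first 1,000 rows. With import delimited, the rowrange() option serves a similar purpose.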

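The infix command:

  • Most often used in web-scraping programs; it lets you choose which column range of each line to read, for example:

    infix strL v 1-20000 using temp.txt, clear
  • In practice, for complicated data you can import the file interactively first and then copy the resulting command into your own program.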

bivariate

Code

* net install bivariate.pkg, from("http://digital.cgdev.org/doc/stata/MO/Misc/")
* h bivariate
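* Bivariate association between the dependent variable and each independent variable; can also produce descriptive statistics tables, VIFs, and so on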
* Setup
sysuse auto, clear
bivariate price weight mpg rep78 foreign, tabstat obsgain
* Nicer table display
frmttable , statmat(r(TransposedST))
frmttable , statmat(r(bivariate))
* The following commands reproduce the same results
regress price weight mpg rep78 foreign
gen byte insample = e(sample)
summ price weight mpg rep78 if insample
tab foreign if insample, sum(price)

* Suppress the VIF
bivariate price weight mpg rep78 foreign, novif
* frmttable can export the table to a Word file
frmttable using word, replace statmat(r(bivariate)) rtitles("Vehicle weight" \ "Miles per gallon" \ "Repair record" \ "Foreign or domestic") sdec(3,0,0,2,3,1,0) title("Table _. Bivariate relationships between vehicle price and each independent variables")
* You can also specify which statistics to report
bivariate price weight mpg rep78, group(foreign) groupstats(n mean sd)
matrix list r(grouptab)
frmttable, statmat(r(grouptab))
frmttable, statmat(r(frmttable)) substat(1)

Output

. * net install bivariate.pkg, from("http://digital.cgdev.org/doc/stata/MO/Misc/")

. * h bivariate

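. * Bivariate association between the dependent variable and each independent variable; can also produce descriptive statistics tables and VIFs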

. * Setup

. sysuse auto, clear
(1978 Automobile Data)

. bivariate price weight mpg rep78 foreign, tabstat obsgain

Summary of the bivariate relationships between the dependent variable: price
and each of the independent variables: weight mpg rep78 foreign

Casewise deletion deletes : 5 observations.

The analysis uses : 69 observations.
The variance inflation factor is: Centered
Without the variable rep78 nobs is: 74


Bivariate table for the dependent variable: price

| Correla~n For D=0 For D=1 t-stat p-value VIF Obs Gai~d
-------------+-----------------------------------------------------------------------------
weight | .5478396 . . 5.360207 1.10e-06 4.087574 0
mpg | -.455949 . . -4.193346 .0000825 3.104604 0
rep78 | .0065533 . . .053642 .95738 1.64413 5
foreign | . 6179.25 6070.143 -.1421511 .8873872 2.36932 0

If Gallup's -frmttable- is installed, click here:
frmttable , statmat(r(bivariate))

Option -tabstat-:
Descriptive statistics on the dependent and the independent variables:

| mean p50 sd cv min max skewness
-------------+-----------------------------------------------------------------------------
price | 6146.043 5079 2912.44 .4738724 3291 15906 1.687968
weight | 3032.029 3200 792.8515 .2614921 1760 4840 .1180643
mpg | 21.28986 20 5.866408 .2755495 12 41 .9953495
rep78 | 3.405797 3 .9899323 .290661 1 5 -.0570331
foreign | .3043478 0 .4635016 1.522934 0 1 .8504201

If Gallup's -frmttable- is installed, click here:
frmttable , statmat(r(TransposedST))

. * Nicer table display

. frmttable , statmat(r(TransposedST))

------------------------------------------------------------------------------
mean p50 sd cv min max skewness
------------------------------------------------------------------------------
price 6,146.04 5,079.00 2,912.44 0.47 3,291.00 15,906.00 1.69
weight 3,032.03 3,200.00 792.85 0.26 1,760.00 4,840.00 0.12
mpg 21.29 20.00 5.87 0.28 12.00 41.00 1.00
rep78 3.41 3.00 0.99 0.29 1.00 5.00 -0.06
foreign 0.30 0.00 0.46 1.52 0.00 1.00 0.85
------------------------------------------------------------------------------


. frmttable , statmat(r(bivariate))

-------------------------------------------------------------------------------
Correlation For D=0 For D=1 t-stat p-value VIF Obs Gained
-------------------------------------------------------------------------------
weight 0.55 5.36 0.00 4.09 0.00
mpg -0.46 -4.19 0.00 3.10 0.00
rep78 0.01 0.05 0.96 1.64 5.00
foreign 6,179.25 6,070.14 -0.14 0.89 2.37 0.00
-------------------------------------------------------------------------------


. * The following commands reproduce the same results

. regress price weight mpg rep78 foreign

Source | SS df MS Number of obs = 69
-------------+---------------------------------- F(4, 64) = 15.82
Model | 286761158 4 71690289.6 Prob > F = 0.0000
Residual | 290035800 64 4531809.38 R-squared = 0.4972
-------------+---------------------------------- Adj R-squared = 0.4657
Total | 576796959 68 8482308.22 Root MSE = 2128.8

------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
weight | 3.565247 .6582976 5.42 0.000 2.250146 4.880347
mpg | 27.32371 77.53757 0.35 0.726 -127.5754 182.2228
rep78 | 121.1322 334.3828 0.36 0.718 -546.8742 789.1387
foreign | 3520.324 857.318 4.11 0.000 1807.634 5233.013
_cons | -6729.56 3450.835 -1.95 0.056 -13623.4 164.2752
------------------------------------------------------------------------------

. gen byte insample = e(sample)

. summ price weight mpg rep78 if insample

Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
price | 69 6146.043 2912.44 3291 15906
weight | 69 3032.029 792.8515 1760 4840
mpg | 69 21.28986 5.866408 12 41
rep78 | 69 3.405797 .9899323 1 5

. tab foreign if insample, sum(price)

| Summary of Price
Car type | Mean Std. Dev. Freq.
------------+------------------------------------
Domestic | 6,179.25 3,188.969 48
Foreign | 6,070.143 2,220.984 21
------------+------------------------------------
Total | 6,146.043 2,912.44 69

.
. * Suppress the VIF

. bivariate price weight mpg rep78 foreign, novif

Summary of the bivariate relationships between the dependent variable: price
and each of the independent variables: weight mpg rep78 foreign

Casewise deletion deletes : 5 observations.

The analysis uses : 69 observations.
The variance inflation factor is: Suppressed


Bivariate table for the dependent variable: price

| Correla~n For D=0 For D=1 t-stat p-value
-------------+-------------------------------------------------------
weight | .5478396 . . 5.360207 1.10e-06
mpg | -.455949 . . -4.193346 .0000825
rep78 | .0065533 . . .053642 .95738
foreign | . 6179.25 6070.143 -.1421511 .8873872

If Gallup's -frmttable- is installed, click here:
frmttable , statmat(r(bivariate))

. * frmttable can export the table to a Word file

. frmttable using word, replace statmat(r(bivariate)) rtitles("Vehicle weight" \ "Miles per gallon" \ "Repair record" \ "Foreign or domestic
> ") sdec(3,0,0,2,3,1,0) title("Table _. Bivariate relationships between vehicle price and each independent variables")

Table _. Bivariate relationships between vehicle price and each independent variables
-----------------------------------------------------------------------
Correlation For D=0 For D=1 t-stat p-value
-----------------------------------------------------------------------
Vehicle weight 0.548 5.36 0.000
Miles per gallon -0.456 -4.19 0.000
Repair record 0.007 0.05 0.957
Foreign or domestic 6,179 6,070 -0.14 0.887
-----------------------------------------------------------------------


. * You can also specify which statistics to report

. bivariate price weight mpg rep78, group(foreign) groupstats(n mean sd)

Summary of the bivariate relationships between the dependent variable: price
and each of the independent variables: weight mpg rep78

Casewise deletion deletes : 5 observations.

The analysis uses : 69 observations.
The variance inflation factor is: Centered


Bivariate table for the dependent variable: price

| Correla~n t-stat p-value VIF
-------------+--------------------------------------------
weight | .5478396 5.360207 1.10e-06 2.905282
mpg | -.455949 -4.193346 .0000825 2.910836
rep78 | .0065533 .053642 .95738 1.217191

If Gallup's -frmttable- is installed, click here:
frmttable , statmat(r(bivariate))

Estimation sample discrim lda
Summarized by foreign

| foreign
| Domestic Foreign | Total
-------------+----------------------+----------
price | |
Mean | 6179.25 6070.143 | 6146.043
Std dev | 3188.969 2220.984 | 2912.44
-------------+----------------------+----------
weight | |
Mean | 3368.333 2263.333 | 3032.029
Std dev | 688.0108 364.7099 | 792.8515
-------------+----------------------+----------
mpg | |
Mean | 19.54167 25.28571 | 21.28986
Std dev | 4.753312 6.309856 | 5.866408
-------------+----------------------+----------
rep78 | |
Mean | 3.020833 4.285714 | 3.405797
Std dev | .837666 .7171372 | .9899323
-------------+----------------------+----------
| |
N | 48 21 | 69

Statistics by group available in the matrix r(grouptab)

| Domestic | Foreign
| mean sd | mean sd
-------------+----------------------+----------------------
price | 6179.25 3188.969 | 6070.143 2220.984
weight | 3368.333 688.0108 | 2263.333 364.7099
mpg | 19.54167 4.753312 | 25.28571 6.309856
rep78 | 3.020833 .837666 | 4.285714 .7171372

If Gallup's -frmttable- is installed, click on one of the following links:
frmttable , statmat(r(grouptab))
frmttable , statmat(r(frmttable)) note(Statistics are -- mean sd) substat(1)

. matrix list r(grouptab)

r(grouptab)[4,4]
Domestic: Domestic: Foreign: Foreign:
mean sd mean sd
price 6179.25 3188.9693 6070.1429 2220.9835
weight 3368.3333 688.0108 2263.3333 364.70993
mpg 19.541667 4.7533116 25.285714 6.3098562
rep78 3.0208333 .83766604 4.2857143 .71713717

. frmttable, statmat(r(grouptab))

--------------------------------------------------
Domestic Domestic Foreign Foreign
mean sd mean sd
--------------------------------------------------
price 6,179.25 3,188.97 6,070.14 2,220.98
weight 3,368.33 688.01 2,263.33 364.71
mpg 19.54 4.75 25.29 6.31
rep78 3.02 0.84 4.29 0.72
--------------------------------------------------


. frmttable, statmat(r(frmttable)) substat(1)

----------------------------------
Domestic Foreign
----------------------------------
price 6,179.25 6,070.14
(3,188.97) (2,220.98)
weight 3,368.33 2,263.33
(688.01) (364.71)
mpg 19.54 25.29
(4.75) (6.31)
rep78 3.02 4.29
(0.84) (0.72)
----------------------------------

Changing the panel order when using the by() option

* ssc install egenmore
sysuse auto, clear
sc mpg weight, by(rep78) name(a2, replace)
gr export rep78.png, replace

egen mean1 = mean(mpg), by(rep78)
egen axis = axis(mean1 rep78), label(rep78) reverse

sc mpg weight, by(axis) name(a1, replace)
gr export rep781.png, replace

Adding a gap in the middle of a bar chart

sysuse auto, clear
set scheme vg_rose, permanently
gr bar (mean) price turn mpg weight, over(rep78)
gr export 在条形图中间添加间隔1.png, replace

set obs 75
replace rep78 = 2.5 in 75

gr bar (mean) price turn mpg weight, over(rep78, relabel( 3 " "))
gr export 在条形图中间添加间隔2.png, replace

Adding a table beside a graph

sysuse auto, clear
forval n = 1/20{
local sidetable `"`sidetable' "`=make[`n']'" "'
}
forval n = 1/20{
local sidecap `"`sidecap' "$`=price[`n']'" "'
}

sysuse auto, clear
tw sc mpg rep78, msize(small) ms(s) ///
note("{bf:Make }" "-------------------" `sidetable', ///
size(medsmall) color(green) pos(2) m(small) ///
justification(left)) ///
caption("{bf: Price }" "--------" `sidecap', ///
size(medsmall) color(midgreen) ///
pos(2) m(vsmall) justification(left)) sch(vg_rose)
gr export 在图片侧面添加表格.png, replace

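Adding a vertical axis in the middle of a graph

This figure is made by combining two graphs. The key point is how to label the axes of the combined graph; here I add the label as text with the text() option.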

clear all
sysuse auto, clear
tw sc price weight if weight < 3000, graphr(margin(right(-10))) xsc(range(2000 3000)) plotr(margin(0)) name(first, replace) ysc(off) nodraw xtitle("") xlabel(, nogrid)
tw sc price weight if weight > 3000, graphr(margin(left(-10))) xsc(rang(3010 5000)) plotr(margin(0)) name(second, replace) xtitle("") nodraw ytitle("Price") text(3000 3000 "Weight")
graph combine first second
graph export 在图片中间添加一个纵轴.png, replace

Overlaying a distribution on a histogram

sysuse auto, clear
hist mpg, addplot(function normalden(x, 26, 10), ra(10 42) lcolor(black)) yla(, ang(0) nogrid) xla(10(5)40) leg(col(2)) bfcolor(green*0.7) scheme(s1mono)
gr export 在直方图中添加一个分布.png, replace

Adding frequencies inside the bars

*! g2: add frequency labels inside the bars
cap prog drop g2
prog define g2
syntax varname, over(varname)
levelsof `over', local(kk)
graph bar `varlist' , over(`over') blabel(bar, position(base) gap(*22.5) format(%9.1f) size(large) color(black))
local x=1
foreach i of local kk {
summ `varlist' if `over'==`i'
local a1="N= "+string(r(N),"%8.0f")

gr_edit plotregion1.barlabels[`x'].text = {}
gr_edit plotregion1.barlabels[`x']._set_orientation vertical
gr_edit plotregion1.barlabels[`x'].text.Arrpush `"{bf: `a1' }"'
local ++x
}
end

sysuse auto, clear
egen pr2 = cut(price), group(10)
g2 weight, over(pr2)

Adding small annotations to Stata graphs

vguse allstates, clear
  • Add arrows and text
gr tw sc ownhome propval100 || ///
pcarrowi 42.5 26 42.5 61.3, ///
lw(medthick) lcolor(red) ///
text(42.5 12 "Possible Outlier", size(large)) ||, ///
leg(off)
gr export pcarrowi.png, replace

  • Add a point
gr tw sc ownhome propval100 || ///
scatteri 42.6 62.1, ms(S) color(red) ||, ///
leg(off)
gr export dots.png, replace

Adding a table below the x axis

clear all
sysuse auto, clear
tw ///
sc mpg rep78, ysc(r(0 50)) ysize(2) name(a1) xla(1(1)5) msize(small) nodraw
contract rep78 foreign
fillin rep78 foreign
replace _freq = 0 if missing(_freq)

set obs `=_N+2'
tostring _freq, replace
replace rep78 = 0 in `=_N-1'
replace rep78 = 0 in `=_N'

replace foreign = 1 in `=_N-1'
replace foreign = 0 in `=_N'

replace _freq = "Foreign cars" in `=_N-1'
replace _freq = "Domestic cars" in `=_N'

tw ///
sc foreign rep78 if foreign == 1, ///
mlabel(_freq) ms(none) mlabgap(-2) ///
mlabpos(12) mlabsize(small) || ///
sc foreign rep78 if foreign == 0, ///
mlabel(_freq) ms(none) mlabgap(-2) ///
mlabpos(6) mlabsize(small) ||, ///
name(a2, replace) leg(off) yti("b") ///
fysize(5) xsc(off) ///
ysc(off fill) xla(-0.2(1)5) nodraw
gr combine a1 a2, cols(1) xcommon imargin(0 0 0 0)
gr export 在x轴下方添加表格.png, replace

Automatically wrapping long notes

sysuse auto, clear
local z1 "The development of Linux is one of the most prominent"
local z2 " examples of free and open source software"
local z3 "collaboration; typically all the underlying source code"
local z4 "can be used, freely modified, and redistributed, both commercially"
local z5 "and non-commercially, by anyone under licenses such as the GNU"
local z6 "General Public License."
local z7 "Typically Linux is packaged in a format known as a Linux"
local z8 "distribution for desktop and server use."

local total "`z1' `z2' `z3' `z4' `z5' `z6' `z7' `z8'"

* Adjust the maximum line length (in characters)
local size=40

local t1
forval i = 1/12{
local a`i': piece `i' `size' of "`total'"
local t1 `"`t1' `=char(34) + "`a`i''" + char(34)'"'
}
gr bar price, bargap(-30) over(foreign) ///
note(`t1', margin(large) justification(left))

Some diagnostic plot commands

sysuse auto, clear

* Symmetry plot
symplot price

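* symplot first sorts the prices. For the i-th ordered value z(i) it plots y_i = z(N+1-i) - median (the distance above the median) against x_i = median - z(i) (the distance below the median). If all points lie on the reference line, the data are symmetric; if most points lie above the line, the distribution is right-skewed, and if most lie below it, left-skewed. The kernel density plot below also shows that price is right-skewed.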
kdensity price
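
To make the construction concrete, the coordinates of a symmetry plot can be built by hand. This is only a sketch of the idea described in the comment above, not symplot's actual implementation; the names med, below, and above are chosen for the illustration.

```stata
sysuse auto, clear
sort price
quietly summarize price, detail
scalar med = r(p50)
* Distance below the median for the i-th smallest value
generate below = med - price
* Distance above the median for the i-th largest value
generate above = price[_N + 1 - _n] - med
* Only the lower half of the ordered data is needed
keep if _n <= _N/2
* Points above the 45-degree line indicate a longer right tail
twoway (scatter above below) (function y = x, range(below)), ///
    ytitle("Distance above median") xtitle("Distance below median") legend(off)
```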

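* Quantile plot
* Sorts the data in ascending order and plots each ordered value against the corresponding fraction of the data (the quantiles of a uniform distribution), similar in spirit to the Lorenz curve behind the Gini coefficient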
quantile price

* Compare the weight distributions of domestic and foreign cars
generate weightd=weight if !foreign
generate weightf=weight if foreign
qqplot weightd weightf

* Compare the distribution of price with the normal distribution
qnorm price, grid

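* qnorm (normal quantile plot) is sensitive to non-normality in the tails; pnorm (normal probability plot) is sensitive to non-normality in the middle of the distribution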
pnorm price, grid

* Compare ch with a chi-squared distribution with 2 degrees of freedom
egen c1 = std(price)
egen c2 = std(mpg)
generate ch = c1^2 + c2^2
qchi ch, df(2) grid

* Chi-squared probability plot
pchi ch, df(2) grid

程振兴 @czxa.top