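Chapter 5 Exercises
1. Exercise 5.1
Solution: Assume that both populations are normally distributed with equal covariance matrices, that the misclassification costs are equal, and that the prior probabilities are proportional to the group sizes. The prior probabilities computed with SAS are shown in the table: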
Class Level Information

group   Variable Name   Frequency   Weight   Proportion   Prior Probability
G1      G1              6           6.0000   0.428571     0.428571
G2      G2              8           8.0000   0.571429     0.571429
That is, the prior probabilities are 0.428571 for G1 and 0.571429 for G2.
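As a quick check, proportional priors are just the group sizes divided by the total sample size. The sketch below assumes group sizes of 6 and 8, read off the Weight column of the table; the variable names are illustrative only.

```python
# Group sizes taken from the Weight column of the Class Level Information table
# (assumed to equal the observation counts, i.e. unit weights).
counts = {"G1": 6, "G2": 8}

total = sum(counts.values())

# Proportional prior for each group: n_j / N.
priors = {group: n / total for group, n in counts.items()}

for group, q in priors.items():
    print(f"{group}: {q:.6f}")
```

The printed values can be compared with the Prior Probability column above.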
Further computation gives:
The pooled covariance matrix S is computed as:

Pooled Within-Class Covariance Matrix

Variable             x1             x2
x1          1.081944444   -0.310902778
x2         -0.310902778    0.174756944
and:
Compute the generalized squared distance function:
and then the posterior probabilities:
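For reference, under the stated assumptions (pooled covariance matrix, priors proportional to group sizes) these two quantities are usually written as follows. The symbols are assumed notation, not taken from the exercise: \bar{x}_j is the sample mean of group G_j, S is the pooled covariance matrix, and q_j is the prior probability of G_j.

```latex
D_j^2(x) = (x - \bar{x}_j)^{\top} S^{-1} (x - \bar{x}_j) - 2 \ln q_j ,
\qquad j = 1, 2,

P(G_j \mid x) =
  \frac{\exp\{-\tfrac{1}{2} D_j^2(x)\}}
       {\sum_{k=1}^{2} \exp\{-\tfrac{1}{2} D_k^2(x)\}} .
```

An observation x is assigned to the group with the smaller D_j^2(x), which is the group with the larger posterior probability.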
The resubstitution classification results are as follows:

Posterior Probability of Membership in group

Obs   From group   Classified into group   G1       G2
1     G1           G1                      0.9387   0.0613
2     G1           G1                      0.9303   0.0697
3     G1           G1                      0.9999   0.0001
4     G1           G2 *                    0.4207   0.5793
5     G1           G1                      0.9893   0.0107
6     G1           G1                      1.0000   0.0000
7     G2           G2                      0.0007   0.9993
8     G2           G2                      0.0026   0.9974
9     G2           G2                      0.0008   0.9992
10    G2           G2                      0.0586   0.9414
11    G2           G2                      0.0350   0.9650
12    G2           G2                      0.0006   0.9994
13    G2           G2                      0.0038   0.9962
14    G2           G2                      0.0012   0.9988
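One of the 14 observations (observation 4, from G1) is misclassified, so the resubstitution estimate of the misclassification rate is 1/14 ≈ 0.0714.
Under cross-validation, the generalized squared distance is defined as follows:
Each observation is removed in turn and then classified; the posterior probability is computed by the following formula: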
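The leave-one-out procedure can be sketched in a few lines. This is a minimal illustration, assuming two variables, a pooled covariance matrix, and priors that stay fixed while each observation is held out. The data are synthetic (the exercise data are not reproduced here) and all function names are hypothetical.

```python
import math

def mean(rows):
    """Column means of a list of 2-D points."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]

def pooled_cov(groups):
    """Pooled within-class 2x2 covariance, DF = N - number of groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for rows in groups.values():
        m = mean(rows)
        n_total += len(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    df = n_total - len(groups)
    return [[s[a][b] / df for b in range(2)] for a in range(2)]

def gen_sq_dist(x, m, s, prior):
    """Mahalanobis distance to mean m minus 2*ln(prior)."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    d = [x[0] - m[0], x[1] - m[1]]
    maha = sum(d[a] * inv[a][b] * d[b] for a in range(2) for b in range(2))
    return maha - 2.0 * math.log(prior)

def posteriors(x, groups, priors):
    """Posterior probability of each group for point x."""
    s = pooled_cov(groups)
    d2 = {g: gen_sq_dist(x, mean(rows), s, priors[g])
          for g, rows in groups.items()}
    base = min(d2.values())  # subtract the minimum for numerical stability
    w = {g: math.exp(-0.5 * (v - base)) for g, v in d2.items()}
    total = sum(w.values())
    return {g: w[g] / total for g in w}

def loo_errors(groups, priors):
    """Count observations misclassified when each is held out in turn."""
    errors = 0
    for g, rows in groups.items():
        for i, x in enumerate(rows):
            held_out = dict(groups)
            held_out[g] = rows[:i] + rows[i + 1:]  # drop observation i
            post = posteriors(x, held_out, priors)
            if max(post, key=post.get) != g:
                errors += 1
    return errors

# Synthetic, well-separated demo data (NOT the exercise data).
groups = {
    "G1": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "G2": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}
priors = {"G1": 0.5, "G2": 0.5}
print(loo_errors(groups, priors))
```

The key point is that the group mean and the pooled covariance matrix are re-estimated without the held-out observation before that observation is classified, which is why the cross-validated posteriors differ from the resubstitution ones.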
The results computed with SAS are shown in the table below. Again observation 4, which belongs to G1, is misclassified into G2, so the cross-validation estimate of the misclassification rate is also 1/14 ≈ 0.0714.

Posterior Probability of Membership in group

Obs   From group   Classified into group   G1       G2
1     G1           G1                      0.9060   0.0940
2     G1           G1                      0.7641   0.2359
3     G1           G1                      1.0000   0.0000
4     G1           G2 *                    0.1950   0.8050
5     G1           G1                      0.9743   0.0257
6     G1           G1                      1.0000   0.0000
7     G2           G2                      0.0012   0.9988
8     G2           G2                      0.0051   0.9949
9     G2           G2                      0.0014   0.9986
10    G2           G2                      0.0713   0.9287
11    G2           G2                      0.0422   0.9578
12    G2           G2                      0.0009   0.9991
13    G2           G2                      0.0059   0.9941
14    G2           G2                      0.0022   0.9978
where the computed quantity equals 12.1138. The posterior probability finally obtained is p = 0.048709.
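2. Exercise 5.3
Solution: (1) Assuming Σ1 = Σ2 and equal prior probabilities, we construct the linear discriminant function for distance discrimination. The SAS proc discrim procedure first computes the pooled covariance matrix, shown in the table: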
Pooled Within-Class Covariance Matrix

Variable             x1           x2           x3           x4           x5           x6           x7           x8
x1           2.25705591  -0.91513311   0.34259974   -0.6084399   -0.9576508   -0.8929719   -0.0539445   -0.2192724
x2           -0.9151331   25.2318255   -0.3390873   -2.5515272   -5.0966371   0.78571637   -0.0835586   4.37529806
x3           0.34259974  -0.33908734   3.30063123   1.42276017   1.78692343   0.40208409   -0.0676655   -0.0732213
x4           -0.6084399  -2.55152726   1.42276017   6.07845863   5.78100857   2.32039331   -0.3205116   0.48605897
x5           -0.9576508  -5.09663714   1.78692343   5.78100857   8.15854743   3.44983429   -0.1096651   0.08904743
x6           -0.8929719   0.78571637   0.40208409   2.32039331   3.44983429   4.16657066   -0.2236278   0.87862549
x7           -0.0539445  -0.08355869   -0.0676655   -0.3205116   -0.1096651   -0.2236278   0.26009291   -0.0767347
x8           -0.2192724   4.37529806   -0.0732213   0.48605897   0.08904743   0.87862549   -0.0767347   2.51054423
The squared Mahalanobis distances between the populations are given in the table:

Generalized Squared Distance to group

From group   G1         G2
G1           0          24.61468
G2           24.61468   0
The linear discriminant function is:
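For two populations with a common covariance matrix, the linear discriminant function of distance discrimination has the following general form. The notation is assumed here rather than taken from the exercise: \bar{x}_1 and \bar{x}_2 are the group sample means, S is the pooled covariance matrix, and d_j^2(x) is the squared Mahalanobis distance from x to group G_j.

```latex
W(x) = a^{\top} (x - \bar{x}), \qquad
a = S^{-1} (\bar{x}_1 - \bar{x}_2), \qquad
\bar{x} = \tfrac{1}{2} (\bar{x}_1 + \bar{x}_2),

W(x) = \tfrac{1}{2} \left[ d_2^2(x) - d_1^2(x) \right], \qquad
x \in G_1 \ \text{if } W(x) \ge 0, \quad x \in G_2 \ \text{if } W(x) < 0 .
```

The second line shows why the rule is equivalent to assigning x to the nearer group in Mahalanobis distance.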
The resubstitution classification results for the training sample are shown in the table:

Error Count Estimates for group

          G1       G2       Total
Rate      0.0000   0.0000   0.0000
Priors    0.5000   0.5000
Cross-validation classification results for the training sample:

Posterior Probability of Membership in group

From group   Classified into group   G1       G2
G1           G2 *                    0.4501   0.5499
G1           G2 *                    0.0920   0.9080

Error Count Estimates for group

          G1       G2       Total
Rate      0.1000   0.0000   0.0500
Priors    0.5000   0.5000
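(2) Assume that both populations are normally distributed, that the prior probabilities are proportional to the group sizes, and that the misclassification costs are equal. Under equal covariance matrices, Σ1 = Σ2, Bayes discriminant analysis with the SAS discrim procedure gives the following results:

Error Count Estimates for group

          G1       G2       Total
Rate      0.0000   0.0000   0.0000
Priors    0.7407   0.2593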
Cross-validation classification results:

Posterior Probability of Membership in group

From group   Classified into group   G1       G2
G1           G2 *                    0.2246   0.7754
G2           G1 *                    0.5282   0.4718

Error Count Estimates for group

          G1       G2       Total
Rate      0.0500   0.1429   0.0741
Priors    0.7407   0.2593
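Assuming Σ1 ≠ Σ2 and prior probabilities proportional to the group sizes, Bayes discriminant analysis is carried out with the SAS proc discrim procedure. In this case the covariance matrix of each population is estimated separately from that population's own training sample. The resubstitution and cross-validation results for the training sample are as follows: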
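With separate covariance matrices the discriminant rule becomes quadratic in x. A sketch of the usual form of the generalized squared distance in this case, in assumed notation (S_j is the sample covariance matrix of group G_j, \bar{x}_j its mean, q_j its prior probability):

```latex
D_j^2(x) = (x - \bar{x}_j)^{\top} S_j^{-1} (x - \bar{x}_j)
           + \ln \lvert S_j \rvert - 2 \ln q_j ,
\qquad
P(G_j \mid x) =
  \frac{\exp\{-\tfrac{1}{2} D_j^2(x)\}}
       {\sum_{k} \exp\{-\tfrac{1}{2} D_k^2(x)\}} .
```

Each S_j has to be estimated from its own group, so this rule is only reliable when every group has enough observations relative to the number of variables for S_j to be well conditioned.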
Resubstitution results:

Error Count Estimates for group

          G1       G2       Total
Rate      0.0000   0.0000   0.0000
Priors    0.7407   0.2593
Cross-validation classification results:

Posterior Probability of Membership in group

From group   Classified into group   G1       G2
G2           G1 *                    1.0000   0.0000
G2           G1 *                    1.0000   0.0000
G2           G1 *                    1.0000   0.0000
G2           G1 *                    1.0000   0.0000
G2           G1 *                    1.0000   0.0000
G2           G1 *                    1.0000   0.0000
G2           G1 *                    1.0000   0.0000

Error Count Estimates for group

          G1       G2       Total
Rate      0.0000   1.0000   0.2593
Priors    0.7407   0.2593
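(3) Classification of the samples to be classified, under the different assumptions and discriminant methods:
1. Distance discriminant analysis gives the following results for Tibet, Shanghai, and Guangdong:

Posterior Probability of Membership in group

Classified into group   G1       G2
G2                      0.0000   1.0000
G2                      0.0000   1.0000
G2                      0.0000   1.0000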
2. Bayes discrimination under equal covariance matrices gives the following results for Tibet, Shanghai, and Guangdong:

Posterior Probability of Membership in group

Classified into group   G1       G2
G2                      0.0000   1.0000
G2                      0.0000   1.0000
G2                      0.0000   1.0000
3. Bayes discrimination under unequal covariance matrices gives the following results for Tibet, Shanghai, and Guangdong:

Posterior Probability of Membership in group

Classified into group   G1       G2
G1                      1.0000   0.0000
G1                      1.0000   0.0000
G1                      1.0000   0.0000
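3. Exercise 5.4
Solution: (1) Assume that both populations are normally distributed with equal covariance matrices, Σ1 = Σ2, and that the prior probabilities are equal. Bayes discriminant analysis with the SAS discrim procedure gives the following results:
First, the linear discriminant function is obtained:
Resubstitution misclassification results:

Posterior Probability of Membership in group

Obs   From group   Classified into group   G1       G2
9     G1           G2 *                    0.3401   0.6599
29    G2           G1 *                    0.8571   0.1429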
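The results show that observation 9 is misclassified into G2 and observation 29 is misclassified into G1. The misclassification rate is 6.34%.

Error Count Estimates for group

          G1       G2       Total
Rate      0.0833   0.0435   0.0634
Priors    0.5000   0.5000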
Cross-validation classification results: four observations are misclassified, namely observations 9, 28, 29, and 35. The overall misclassification rate is 10.69%.

Posterior Probability of Membership in group

Obs   From group   Classified into group   G1       G2
9     G1           G2 *                    0.0973   0.9027
28    G2           G1 *                    0.6130   0.3870
29    G2           G1 *                    0.9643   0.0357
35    G2           G1 *                    0.8470   0.1530

Error Count Estimates for group

          G1       G2       Total
Rate      0.0833   0.1304   0.1069
Priors    0.5000   0.5000
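The Total column in the error-count tables is the prior-weighted average of the per-group error rates. The sketch below reproduces the two totals reported above, assuming group sizes of 12 (G1) and 23 (G2); these sizes are inferred from the reported per-group rates and are not stated explicitly in the exercise. The function name is illustrative.

```python
def total_error_rate(priors, group_rates):
    """Prior-weighted overall error rate: sum over groups of q_j * e_j."""
    return sum(priors[g] * group_rates[g] for g in priors)

# Equal priors, as assumed in part (1).
priors = {"G1": 0.5, "G2": 0.5}

# Per-group error rates as fractions (assumed group sizes: 12 and 23).
resub_rates = {"G1": 1 / 12, "G2": 1 / 23}   # one error in each group
cv_rates = {"G1": 1 / 12, "G2": 3 / 23}      # one error in G1, three in G2

resub_total = total_error_rate(priors, resub_rates)
cv_total = total_error_rate(priors, cv_rates)

print(round(resub_total, 4))  # 0.0634
print(round(cv_total, 4))     # 0.1069
```

Because the weights are the priors rather than the sample proportions, the total under equal priors differs from the raw fraction of misclassified observations.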
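(2) Assume that both populations are normally distributed with equal covariance matrices, Σ1 = Σ2, that the prior probabilities are proportional to the group sizes, and that the misclassification costs are equal. Bayes discriminant analysis with the SAS discrim procedure gives the following results:
First, the linear discriminant function is obtained:

Linear Discriminant Function for group

Variable            G1           G2
Constant     -99.91796    -95.41991
x1            30.35060     29.87680
x2            -0.15214     -0.15210
x3            -0.78868     -0.22662
x4             1.95176      1.39528
x5             0.58964      0.06490
x6          -108.10195    -85.33735
x7            -0.31156     -0.25957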
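To show how such a table is used, the sketch below evaluates both linear discriminant functions for a sample and converts the scores to posterior probabilities. It assumes that each constant already includes the log-prior term, so the posteriors are obtained by normalizing the exponentiated scores. The sample vector `x_demo` is hypothetical and the function names are illustrative.

```python
import math

# Coefficients copied from the Linear Discriminant Function table above.
LDF = {
    "G1": {"const": -99.91796,
           "coef": [30.35060, -0.15214, -0.78868, 1.95176,
                    0.58964, -108.10195, -0.31156]},
    "G2": {"const": -95.41991,
           "coef": [29.87680, -0.15210, -0.22662, 1.39528,
                    0.06490, -85.33735, -0.25957]},
}

def scores(x):
    """Linear discriminant score of x = (x1, ..., x7) for each group."""
    return {g: f["const"] + sum(c * v for c, v in zip(f["coef"], x))
            for g, f in LDF.items()}

def posteriors(x):
    """Normalize exp(score); assumes the constants include ln(prior)."""
    s = scores(x)
    top = max(s.values())  # subtract the maximum for numerical stability
    w = {g: math.exp(v - top) for g, v in s.items()}
    total = sum(w.values())
    return {g: w[g] / total for g in w}

# Hypothetical sample, NOT taken from the exercise data.
x_demo = [3.0, 50.0, 10.0, 5.0, 8.0, 0.1, 20.0]

post = posteriors(x_demo)
label = max(post, key=post.get)
print(label, post)
```

The sample is assigned to the group with the larger score, which is also the group with the larger posterior probability.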
Resubstitution misclassification results:

Posterior Probability of Membership in group

From group   Classified into group   G1       G2
G1           G2 *                    0.2119   0.7881
G2           G1 *                    0.7579   0.2421

Error Count Estimates for group

          G1       G2       Total
Rate      0.0833   0.0435   0.0571
Priors    0.3429   0.6571
Cross-validation misclassification results:

Posterior Probability of Membership in group

From group   Classified into group   G1       G2
G1           G2 *                    0.3436   0.6564
G1           G2 *                    0.0532   0.9468
G1           G2 *                    0.4052   0.5948
G1           G2 *                    0.3519   0.6481
G2           G1 *                    0.9338   0.0662
G2           G1 *                    0.7428   0.2572

Error Count Estimates for group

          G1       G2       Total
Rate      0.3333   0.0870   0.1714
Priors    0.3429   0.6571