Exercise 9.3 Solution Example - Hoff, A First Course in Bayesian Statistical Methods (Japanese edition: 標準ベイズ統計学)
a)
Answer
The results of Bayesian regression using the g-prior are shown below:
#+begin_src julia
using DelimitedFiles, DataFrames, DataFramesMeta, Distributions, GLM,
      LinearAlgebra, Random, Statistics, StatsBase

# load the crime data
crime_src = readdlm("../../Exercises/crime.dat", header=true)
crime = crime_src[1]
header = crime_src[2] |> vec
df = DataFrame(crime, header)

# residual variance of the OLS fit, used as the default σ₀²
function ResidualSE(y, X)
    fit = lm(X, y)
    return sum(abs2, residuals(fit)) / (size(X, 1) - size(X, 2))
end

# Monte Carlo sampler for the posterior under the g-prior
function lmGprior(y, X; g=length(y), ν₀=1, σ₀²=ResidualSE(y, X), S=1000)
    n, p = size(X)
    Hg = (g / (g + 1)) * X * inv(X'X) * X'
    SSRg = y' * (diagm(ones(n)) - Hg) * y
    # draw σ² from its inverse-gamma marginal posterior
    σ² = 1 ./ rand(Gamma((ν₀ + n) / 2, 2 / (ν₀ * σ₀² + SSRg)), S)
    Vb = g * inv(X'X) / (g + 1)
    Eb = Vb * X' * y
    # draw β | σ² from its multivariate normal conditional posterior
    β = Matrix{Float64}(undef, 0, p)
    for i in 1:S
        s = σ²[i]
        β = vcat(β, rand(MvNormal(Eb, Symmetric(s * Vb)))')
    end
    return β, σ²
end

# set prior parameters
n = size(df, 1)
g = n
ν₀ = 2
σ₀² = 1
y = df[:, :y]
X = select(df, Not(:y)) |> Matrix
X = [ones(size(X, 1)) X]  # add intercept
p = size(X, 2)
S = 5000

BETA, SIGMA² = lmGprior(y, X; g=g, ν₀=ν₀, σ₀²=σ₀², S=S)

# posterior mean for β
β̂_gprior = mean(BETA, dims=1) |> vec
# posterior 95% credible intervals for β
ci_gprior = map(x -> quantile(x, [0.025, 0.975]), eachcol(BETA))

# store results
coef_name = names(df[:, Not(:y)]) |> vec
coef_name = ["intercept", coef_name...]
result_gprior = DataFrame(
    param = coef_name,
    β̂ = β̂_gprior,
    ci_lower = map(x -> x[1], ci_gprior),
    ci_upper = map(x -> x[2], ci_gprior),
)
@transform!(result_gprior, :ci_length = :ci_upper - :ci_lower)
#+end_src
:RESULTS:
16×5 DataFrame
Row │ param β̂ ci_lower ci_upper ci_length
│ String Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────────────────────────
1 │ intercept -0.00138411 -0.145485 0.141785 0.28727
2 │ M 0.280085 0.0326016 0.522226 0.489625
3 │ So 0.000124904 -0.335994 0.326097 0.662091
4 │ Ed 0.531345 0.213679 0.848809 0.63513
5 │ Po1 1.4448 0.0578835 2.9098 2.85191
6 │ Po2 -0.769673 -2.28756 0.708661 2.99622
7 │ LF -0.0649535 -0.345494 0.201018 0.546512
8 │ M.F 0.128508 -0.147925 0.402133 0.550058
9 │ Pop -0.0704276 -0.301267 0.153849 0.455116
10 │ NW 0.105647 -0.200629 0.416185 0.616814
11 │ U1 -0.265548 -0.631514 0.103081 0.734594
12 │ U2 0.359167 0.0283685 0.688258 0.659889
13 │ GDP 0.239074 -0.213783 0.692641 0.906425
14 │ Ineq 0.715697 0.299354 1.13272 0.83337
15 │ Prob -0.27993 -0.527327 -0.0344898 0.492837
16 │ Time -0.0602993 -0.288748 0.181236 0.469984
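For reference, lmGprior above samples i.i.d. from the exact posterior under the g-prior (Hoff, Chapter 9): σ² is drawn from its marginal posterior and β from its conditional posterior,

\begin{align*}
\sigma^2 \mid \boldsymbol y, \mathbf X &\sim \text{inverse-gamma}\left(\frac{\nu_0 + n}{2},\ \frac{\nu_0\sigma_0^2 + \mathrm{SSR}_g}{2}\right),
\qquad \mathrm{SSR}_g = \boldsymbol y^\top \left(\mathbf I - \frac{g}{g+1}\, \mathbf X(\mathbf X^\top \mathbf X)^{-1}\mathbf X^\top\right) \boldsymbol y, \\
\boldsymbol\beta \mid \boldsymbol y, \mathbf X, \sigma^2 &\sim \text{multivariate normal}\left(\frac{g}{g+1}\,(\mathbf X^\top \mathbf X)^{-1}\mathbf X^\top \boldsymbol y,\ \frac{g}{g+1}\,\sigma^2 (\mathbf X^\top \mathbf X)^{-1}\right).
\end{align*}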
The results of least squares estimation are shown below:
#+begin_src julia
# least squares estimates
olsfit = lm(X, y)
tstat = coef(olsfit) ./ stderror(olsfit)
# two-sided p-values via a normal approximation to the t statistics
pval = 2 * (1 .- cdf.(Normal(), abs.(tstat)))
result_ols = DataFrame(
    param = coef_name,
    β̂ = coef(olsfit),
    pvalue = pval,
    ci_lower = confint(olsfit)[:, 1],
    ci_upper = confint(olsfit)[:, 2],
)
@transform!(result_ols, :ci_length = :ci_upper - :ci_lower)
#+end_src

:RESULTS:
16×6 DataFrame
Row │ param β̂ pvalue ci_lower ci_upper ci_length
│ String Float64 Float64 Float64 Float64 Float64
─────┼─────────────────────────────────────────────────────────────────────────
1 │ intercept -0.000458109 0.995369 -0.161444 0.160527 0.321971
2 │ M 0.286518 0.0348684 0.00955607 0.56348 0.553924
3 │ So -0.000114046 0.999506 -0.375493 0.375265 0.750757
4 │ Ed 0.544514 0.00243231 0.178196 0.910832 0.732636
5 │ Po1 1.47162 0.0715397 -0.193934 3.13718 3.33111
6 │ Po2 -0.78178 0.358608 -2.51862 0.955056 3.47367
7 │ LF -0.0659646 0.667281 -0.378923 0.246994 0.625917
8 │ M.F 0.131298 0.397494 -0.185192 0.447788 0.63298
9 │ Pop -0.0702919 0.579692 -0.329144 0.18856 0.517704
10 │ NW 0.109057 0.525852 -0.241574 0.459687 0.701261
11 │ U1 -0.270536 0.168865 -0.671568 0.130495 0.802063
12 │ U2 0.36873 0.0403534 0.00190637 0.735554 0.733648
13 │ GDP 0.238059 0.357931 -0.290079 0.766198 1.05628
14 │ Ineq 0.726292 0.00192334 0.24874 1.20384 0.955105
15 │ Prob -0.285226 0.033017 -0.558095 -0.0123574 0.545738
16 │ Time -0.0615769 0.638379 -0.328802 0.205648 0.534451
The point estimates and 95% confidence/credible intervals are very similar between the Bayesian regression and least squares estimation, although the intervals tend to be somewhat shorter under the Bayesian regression. In the Bayesian results, the 95% credible intervals for the coefficients of M, Ed, Po1, U2, Ineq, and Prob exclude 0, so these variables appear strongly predictive of crime rates. In the least squares results, the p-values for M, Ed, U2, Ineq, and Prob are below 0.05, so those variables appear strongly predictive of crime rates. Under these criteria, the only disagreement is Po1: it is flagged as predictive by the Bayesian regression (its 95% credible interval excludes 0) but not by least squares estimation (p = 0.072).
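These two selection criteria can also be compared programmatically; the following is a minimal sketch reusing the result_gprior and result_ols data frames built above:

#+begin_src julia
# Bayesian criterion: 95% credible interval excludes 0
sig_bayes = result_gprior[(result_gprior.ci_lower .> 0) .| (result_gprior.ci_upper .< 0), :param]

# least squares criterion: p-value below 0.05
sig_ols = result_ols[result_ols.pvalue .< 0.05, :param]

setdiff(sig_bayes, sig_ols)  # variables flagged only by the Bayesian criterion
#+end_src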
b)
Answer
#+begin_src julia
# split data into training and test sets
Random.seed!(1234)
i_te = sample(1:n, div(n, 2), replace=false)
i_tr = setdiff(1:n, i_te)
y_tr = y[i_tr]
y_te = y[i_te]
X_tr = X[i_tr, :]
X_te = X[i_te, :]
#+end_src
#+begin_src julia
# OLS: fit on the training set, predict the test set
olsfit = lm(X_tr, y_tr)
β̂_ols = coef(olsfit)
ŷ_ols = X_te * β̂_ols
error_ols = mean((y_te - ŷ_ols).^2)  # prediction MSE
#+end_src
:RESULTS:
0.6155918321208247
#+begin_src julia
# Bayesian: posterior mean of β from the training set, predict the test set
BETA, SIGMA² = lmGprior(y_tr, X_tr; g=g, ν₀=ν₀, σ₀²=σ₀², S=S)
β̂_bayes = mean(BETA, dims=1) |> vec
ŷ_bayes = X_te * β̂_bayes
error_bayes = mean((y_te - ŷ_bayes).^2)  # prediction MSE
#+end_src
:RESULTS:
0.6012597112455574
From the above results, least squares regression and Bayesian regression yield very similar predictions for the test data. On this particular split the Bayesian regression predicts slightly better than least squares, but the gap is very small.
c)
Answer
#+begin_src julia
# random half-split into training and test sets, as in part b)
function splitData(y, X)
    n = length(y)
    i_te = sample(1:n, div(n, 2), replace=false)
    i_tr = setdiff(1:n, i_te)
    return y[i_tr], y[i_te], X[i_tr, :], X[i_te, :]
end

# prediction MSEs of OLS and the Bayesian g-prior model on one random split
function computeMses(y, X; g=g, ν₀=ν₀, σ₀²=σ₀², S=S)
    y_tr, y_te, X_tr, X_te = splitData(y, X)
    # OLS
    olsfit = lm(X_tr, y_tr)
    β̂_ols = coef(olsfit)
    ŷ_ols = X_te * β̂_ols
    error_ols = mean((y_te - ŷ_ols).^2)
    # Bayesian
    BETA, SIGMA² = lmGprior(y_tr, X_tr; g=g, ν₀=ν₀, σ₀²=σ₀², S=S)
    β̂_bayes = mean(BETA, dims=1) |> vec
    ŷ_bayes = X_te * β̂_bayes
    error_bayes = mean((y_te - ŷ_bayes).^2)
    return error_ols, error_bayes
end

# compute MSEs over many random splits
n_sim = 1000
error_ols = zeros(n_sim)
error_bayes = zeros(n_sim)
for i in 1:n_sim
    error_ols[i], error_bayes[i] = computeMses(y, X)
end

# average MSEs
mean(error_ols), mean(error_bayes)
#+end_src
:RESULTS:
(1.12446194924989, 1.092206057122776)
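Averaged over 1,000 random splits, the Bayesian g-prior regression attains a slightly lower mean squared prediction error than least squares (about 1.09 versus 1.12), consistent with the single-split comparison in part b).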