Preface 9
0.1 Notation, terminology and some guidance for reading the book 13
I Basic concepts in Bayesian methods 15
1 Modes of statistical inference 16
1.1 The frequentist approach: a critical reflection 17
1.1.1 The classical statistical approach 17
1.1.2 The P-value as a measure of evidence 18
1.1.3 The confidence interval as a measure of evidence 21
1.1.4 An historical note on the two frequentist paradigms* 22
1.2 Statistical inference based on the likelihood function 24
1.2.1 The likelihood function 24
1.2.2 The likelihood principles 25
1.3 The Bayesian approach: some basic ideas 28
1.3.1 Introduction 28
1.3.2 Bayes theorem – Discrete version for simple events 30
1.4 Outlook 33
2 Bayes theorem 36
2.1 Introduction 36
2.2 Bayes theorem – The binary version 36
2.3 Probability in a Bayesian context 37
2.4 Bayes theorem – The categorical version 39
2.5 Bayes theorem – The continuous version 39
2.6 The binomial case 40
2.7 The Gaussian case 47
2.8 The Poisson case 54
2.9 The prior and posterior distribution of h(θ) 57
2.10 Bayesian versus likelihood approach 58
2.11 Bayesian versus frequentist approach 58
2.12 The different modes of the Bayesian approach 59
2.13 An historical note on the Bayesian approach 60
2.14 Closing remarks 62
3 Posterior summary measures 64
3.1 Introduction 64
3.2 Summarizing the posterior by probabilities 64
3.3 Posterior summary measures 65
3.3.1 Characterizing the location and variability of the posterior distribution 65
3.3.2 Posterior interval estimation 67
3.4 Predictive distributions 70
3.4.1 The frequentist approach to prediction 71
3.4.2 The Bayesian approach to prediction 71
3.4.3 Applications 72
3.5 Exchangeability 77
3.6 A normal approximation to the posterior 79
3.6.1 A Bayesian analysis based on a normal approximation to the likelihood 79
3.6.2 Asymptotic properties of the posterior distribution 81
3.7 Numerical techniques to determine the posterior 82
3.7.1 Numerical integration 83
3.7.2 Sampling from the posterior 85
3.7.3 Choice of posterior summary measures 92
3.8 Bayesian hypothesis testing 92
3.8.1 Inference based on credible intervals 92
3.8.2 The Bayes factor 93
3.8.3 Bayesian versus frequentist hypothesis testing 97
3.9 Closing remarks 99
4 More than one parameter 103
4.1 Introduction 103
4.2 Joint versus marginal posterior inference 104
4.3 The normal distribution with μ and σ2 unknown 104
4.3.1 No prior knowledge on μ and σ2 is available 105
4.3.2 An historical study is available 108
4.3.3 Expert knowledge is available 110
4.4 Multivariate distributions 110
4.4.1 The multivariate normal and related distributions 111
4.4.2 The multinomial distribution 111
4.5 Frequentist properties of Bayesian inference 114
4.6 Sampling from the posterior distribution: the Method of Composition 115
4.7 Bayesian linear regression models 118
4.7.1 The frequentist approach to linear regression 118
4.7.2 A noninformative Bayesian linear regression model 119
4.7.3 Posterior summary measures for the linear regression model 120
4.7.4 Sampling from the posterior distribution 121
4.7.5 An informative Bayesian linear regression model 123
4.8 Bayesian generalized linear models 123
4.9 More complex regression models 124
4.10 Closing remarks 124
5 The prior distribution 126
5.1 Introduction 126
5.2 The sequential use of Bayes theorem 126
5.3 Conjugate prior distributions 128
5.3.1 Univariate data distributions 129
5.3.2 Normal distribution – mean and variance unknown 132
5.3.3 Priors for multivariate distributions 133
5.3.4 Conditional conjugate and semi-conjugate distributions 134
5.3.5 Hyperpriors 134
5.4 Noninformative prior distributions 136
5.4.1 Introduction 136
5.4.2 Expressing ignorance 137
5.4.3 General principles to choose noninformative priors 138
5.4.4 Improper prior distributions 142
5.4.5 Weak/vague priors 143
5.5 Informative prior distributions 144
5.5.1 Introduction 144
5.5.2 Data-based prior distributions 144
5.5.3 Elicitation of prior knowledge 145
5.5.4 Archetypal prior distributions 150
5.6 Prior distributions for regression models 153
5.6.1 Normal linear regression 153
5.6.2 Generalized linear models 155
5.6.3 Specification of priors in Bayesian software 158
5.7 Modeling priors 158
5.8 Other regression models 160
5.9 Closing remarks 161
6 Markov chain Monte Carlo sampling 164
6.1 Introduction 164
6.2 The Gibbs sampler 165
6.2.1 The bivariate Gibbs sampler 165
6.2.2 The general Gibbs sampler 171
6.2.3 Remarks* 177
6.2.4 Review of Gibbs sampling approaches 178
6.2.5 The Slice sampler* 179
6.3 The Metropolis(-Hastings) algorithm 180
6.3.1 The Metropolis algorithm 181
6.3.2 The Metropolis-Hastings algorithm 183
6.3.3 Remarks* 186
6.3.4 Review of Metropolis(-Hastings) approaches 188
6.4 Justification of the MCMC approaches* 189
6.4.1 Properties of the MH algorithm 191
6.4.2 Properties of the Gibbs sampler 192
6.5 Choice of the sampler 192
6.6 The Reversible Jump MCMC algorithm* 195
6.7 Closing remarks 200
7 MCMC convergence 203
7.1 Introduction 203
7.2 Assessing convergence of a Markov chain 204
7.2.1 Definition of convergence for a Markov chain 204
7.2.2 Checking convergence of the Markov chain 204
7.2.3 Graphical approaches to assess convergence 205
7.2.4 Formal diagnostic tests 208
7.2.5 Computing the Monte Carlo standard error 215
7.2.6 Practical experience with the formal diagnostic procedures 217
7.3 Accelerating convergence 218
7.3.1 Introduction 218
7.3.2 Acceleration techniques 219
7.4 Practical guidelines for assessing and accelerating convergence 224
7.5 Data augmentation 225
7.6 Closing remarks 231
8 Software 233
8.1 WinBUGS and related software 233
8.1.1 A first analysis 234
8.1.2 Assessing and accelerating convergence 237
8.1.3 Vector and matrix manipulations 240
8.1.4 Working in batch mode 242
8.1.5 Troubleshooting 243
8.1.6 Directed acyclic graphs 243
8.1.7 Add-on modules: GeoBUGS and PKBUGS 245
8.1.8 Related software 246
8.2 Bayesian analysis using SAS® 247
8.2.1 Analysis using procedure GENMOD 248
8.2.2 Analysis using procedure MCMC 251
8.2.3 Other Bayesian programs 253
8.3 Additional Bayesian software and comparisons 254
8.3.1 Additional Bayesian software 254
8.3.2 Comparison of Bayesian software 255
8.4 Closing remarks 256
II Bayesian tools for statistical modeling 258
9 Hierarchical models 259
9.1 Introduction 259
9.2 The Poisson-gamma hierarchical model 260
9.2.1 Introduction 260
9.2.2 Model specification 261
9.2.3 Posterior distributions 264
9.2.4 Estimating the parameters 264
9.2.5 Posterior predictive distributions 270
9.3 Full versus Empirical Bayesian approach 271
9.4 Gaussian hierarchical models 273
9.4.1 Introduction 273
9.4.2 The Gaussian hierarchical model 273
9.4.3 Estimating the parameters 274
9.4.4 Posterior predictive distributions 277
9.4.5 Comparison of FB and EB approach 277
9.5 Mixed models 278
9.5.1 Introduction 278
9.5.2 The linear mixed model 278
9.5.3 The generalized linear mixed model 281
9.5.4 Nonlinear mixed models 288
9.5.5 Some further extensions 290
9.5.6 Estimation of the random effects and posterior predictive distributions 291
9.5.7 Choice of the level-2 variance prior 292
9.6 Propriety of the posterior 295
9.7 Assessing and accelerating convergence 296
9.8 Comparison with frequentist methods 298
9.8.1 Estimating the level-2 variance 298
9.8.2 ML and REML estimates compared with Bayesian estimates 299
9.9 Closing remarks 300
10 Model building and assessment 303
10.1 Introduction 303
10.2 Measures for model selection 304
10.2.1 The Bayes factor 304
10.2.2 Information theoretic measures for model selection 310
10.2.3 Model selection based on other predictive loss functions 324
10.3 Model checking 326
10.3.1 Introduction 326
10.3.2 Model checking procedures 327
10.3.3 Sensitivity analysis 333
10.3.4 Posterior predictive checks 339
10.3.5 Model expansion 348
10.4 Closing remarks 355
11 Variable selection 360
11.1 Introduction 360
11.2 Classical variable selection 361
11.2.1 Variable selection techniques 362
11.2.2 Frequentist regularization 363
11.3 Bayesian variable selection: concepts and questions 366
11.4 Introduction to Bayesian variable selection 368
11.4.1 Variable selection for K small 368
11.4.2 Variable selection for K large 372
11.5 Variable selection based on Zellner’s g-prior 376
11.6 Variable selection based on Reversible Jump Markov chain Monte Carlo 379
11.7 Spike and slab priors 383
11.7.1 Stochastic Search Variable Selection (SSVS) 383
11.7.2 Gibbs Variable Selection (GVS) 386
11.7.3 Dependent variable selection using SSVS 388
11.8 Bayesian regularization 389
11.8.1 Bayesian LASSO regression 389
11.8.2 Elastic Net and further extensions of the Bayesian LASSO 393
11.9 The many regressors case 395
11.10 Bayesian model selection 399
11.11 Bayesian model averaging 400
11.12 Closing remarks 403
III Applications 407
12 Bioassay 408
12.1 Bioassay essentials 408
12.1.1 Cell assays 408
12.1.2 Animal assays 409
12.2 A generic in-vitro example 412
12.3 Ames/Salmonella mutagenic assay 414
12.4 Mouse lymphoma assay (L5178Y TK+/-) 416
12.5 Closing remarks 418
13 Measurement error 419
13.1 Continuous measurement error 419
13.1.1 Measurement error in a variable 419
13.1.2 Two types of measurement error on the predictor in linear and nonlinear models 420
13.1.3 Accommodation of predictor measurement error 422
13.1.4 Non-additive errors and other extensions 426
13.2 Discrete measurement error 427
13.2.1 Sources of misclassification 427
13.2.2 Misclassification in the binary predictor 428
13.2.3 Misclassification in a binary response 430
13.3 Closing remarks 434
14 Survival analysis 435
14.1 Basic terminology 435
14.1.1 Endpoint distributions 436
14.1.2 Censoring 437
14.1.3 Random effect specification 438
14.1.4 A general hazard model 438
14.1.5 Proportional hazards 439
14.1.6 The Cox model with random effects 439
14.2 The Bayesian model formulation 439
14.2.1 A Weibull survival model 440
14.2.2 A Bayesian AFT model 441
14.3 Examples 442
14.3.1 The gastric cancer study 442
14.3.2 Prostate cancer in Louisiana: a spatial AFT model 446
14.4 Closing remarks 451
15 Longitudinal analysis 453
15.1 Fixed time periods 453
15.1.1 Introduction 453
15.1.2 A classical growth curve example 454
15.1.3 Alternate data models 460
15.2 Random event times 464
15.3 Dealing with missing data 467
15.3.1 Introduction 467
15.3.2 Response missingness 468
15.3.3 Missingness mechanisms 469
15.3.4 Bayesian considerations 471
15.3.5 Predictor missingness 471
15.4 Joint modeling of longitudinal and survival responses 472
15.4.1 Introduction 472
15.4.2 An example 472
15.5 Closing remarks 476
16 Disease mapping & image analysis 478
16.1 Introduction 478
16.2 Disease mapping 478
16.2.1 Some general spatial epidemiological issues 479
16.2.2 Some spatial statistical issues 481
16.2.3 Count data models 481
16.2.4 A special application area: Disease mapping/risk estimation 482
16.2.5 A special application area: Disease clustering 486
16.2.6 A special application area: Ecological analysis 493
16.3 Image analysis 494
16.3.1 fMRI modeling 496
16.3.2 A note on software 504
17 Final chapter 506
17.1 What this book covered 506
17.2 Additional Bayesian developments 506
17.2.1 Medical decision making 507
17.2.2 Clinical trials 507
17.2.3 Bayesian networks 508
17.2.4 Bioinformatics 508
17.2.5 Missing data 508
17.2.6 Mixture models 509
17.2.7 Nonparametric Bayesian methods 509
17.3 Alternative reading 509
18 Distributions 511
18.1 Introduction 511
18.2 Continuous univariate distributions 512
18.3 Discrete univariate distributions 528
18.4 Multivariate distributions 532
Bibliography 536
Index