What's different about this book
Working with Data in the "Active Learning Exercises"
Acknowledgments
Notation
Part I Introduction and Statistics Review
Chapter 1. Introduction
Preliminaries
Example: Is Growth Good for the Poor?
What's to Come
Exercises
ALE 1a: An Econometrics "Time Capsule"
ALE 1b: Investigating the Slope Graphically Using a Scatterplot
ALE 1c: Examining Some Disturbing Variations on Dollar & Kraay's Model
ALE 1d: The Pitfalls of Making Scatterplots with Trended Time-Series Data
Chapter 2. A Review of Probability Theory
Introduction
Random Variables
Discrete Random Variables
Continuous Random Variables
Some Initial Results on Expectations
Some Results on Variances
A Pair of Random Variables
The Linearity Property of Expectations
Statistical Independence
Normally Distributed Random Variables
Three Special Properties of Normally Distributed Variables
Distribution of a Linear Combination of Normally Distributed Random Variables
Conclusion
Exercises
ALE 2a: The Normal Distribution
ALE 2b: Central Limit Theorem Simulators on the Web
Appendix 1: The Conditional Mean of a Random Variable
Appendix 2: Proof of the Linearity Property for Expectations for a Weighted Sum
of Two Discretely-Distributed Random Variables
Chapter 3. Estimating the Mean of a Normally Distributed Random
Variable
Introduction
Estimating μ by Curve Fitting
The Sampling Distribution of Ȳ
Consistency - A First Pass
Unbiasedness and the Optimal Estimator
The Squared Error Loss Function and the Optimal Estimator
The Feasible Optimality Properties: Efficiency and BLUness
Summary
Conclusions and Lead-in to Next Chapter
Exercises
ALE 3a: Investigating the Consistency of the Sample Mean and Sample Variance
Using Computer-Generated Data
ALE 3b: Estimating Means and Variances Regarding the Standard & Poor's
SP500 Stock Index
Chapter 4. Statistical Inference on the Mean of a Normally Distributed
Random Variable
Summary
Standardizing the Distribution of Ȳ
Confidence Intervals for μ when σ² is Known
Hypothesis Testing when σ² is Known
Using s² to Estimate σ² (And Introducing the Chi-Squared Distribution)
Inference Results on μ when σ² is Unknown
(and Introducing the Student's t Distribution)
Application: State-level U.S. Unemployment Rates
Introduction to Diagnostic Checking:
Testing the Constancy of μ Across the Sample
Introduction to Diagnostic Checking:
Testing the Constancy of σ² Across the Sample
Some General Comments on Diagnostic Checking
Closing Comments
Exercises
ALE 4a: Investigating the Sensitivity of Hypothesis Test P-Values to Departures
from the NIID(μ, σ²) Assumption Using Computer-Generated Data
ALE 4b: Individual Income Data from the Panel Study on Income Dynamics
(PSID) - Does Birth-Month Matter?
Part II Regression Analysis
Chapter 5. The Bivariate Regression Model:
(Introduction, Assumptions, and Parameter Estimates)
Introduction
The Transition from Mean Estimation to Regression:
Analyzing the Variation of Per Capita Real Output Across Countries
The Bivariate Regression Model - Its Form and the "Fixed in Repeated Samples"
Causality Assumption
The Assumptions on the Model Error Term, Ui
Least Squares Estimation of α and β
Interpreting the Least Squares Estimates of α and β
Bivariate Regression with a Dummy Variable:
Quantifying the Impact of College Graduation on Weekly Earnings
Exercises
ALE 5a: Exploring the Penn World Table Data
ALE 5b: Verifying α̂ and β̂ Over a Very Small Data Set
ALE 5c: Extracting and Downloading CPS Data from
the Census Bureau Web Site
ALE 5d: Verifying That β̂ on a Dummy Variable Equals
the Difference in the Sample Means
Appendix 1: When xi is a Dummy Variable
Chapter 6. The Bivariate Regression Model:
(Sampling Distributions and Estimator Properties)
Introduction
Estimates and Estimators
β̂ as a Linear Estimator and the Least Squares Weights
The Sampling Distribution of β̂
Properties of β̂: Consistency
Properties of β̂: Best Linear Unbiasedness
Summary
Exercises
ALE 6a: Outliers and Other Perhaps-Overly Influential Observations:
Investigating the Sensitivity of β̂ to an Outlier Using Computer-Generated Data
ALE 6b: Investigating the Consistency of β̂ Using Computer-Generated Data
Chapter 7. The Bivariate Regression Model: Inference on β
Preliminaries
A Statistic for β with a Known Distribution
A 95% Confidence Interval for β with σ² Given
Estimates Versus Estimators and the Role of the Model Assumptions
Testing a Hypothesis about β with σ² Given
Estimating σ²
A Statistic for β Not Involving σ²
A 95% Confidence Interval for β with σ² Unknown
Testing a Hypothesis about β with σ² Unknown
Application: The Impact of College Graduation on Weekly Earnings
(Inference Results)
Application: Is Growth Good For the Poor?
Summary
Exercises
ALE 7a: Investigating the Sensitivity of Hypothesis Test P-Values to Departures
from the Ui ~ NIID(0, σ²) Assumption Using Computer-Generated Data
ALE 7b: Distorted Inference in Time-Series Regressions with Serially Correlated
Model Errors: An Investigation Using Computer-Generated Data
Appendix 1: Proof that s² is Independent of β̂
Chapter 8. The Bivariate Regression Model: R² and Prediction
Preliminaries
Quantifying How Well the Model Fits the Data
Prediction as a Tool for Model Validation
Predicting YN+1 given xN+1
Exercises
ALE 8a: On the Folly of Trying Too Hard: a Simple Example of "Data Mining"
Chapter 9. The Multiple Regression Model
Preliminaries
Why the Multiple Regression Model is Necessary and Important
Multiple Regression Parameter Estimates Via Least Squares Fitting
Properties and Sampling Distribution of β̂1 ... β̂k
Over-Elaborate Multiple Regression Models
Under-Elaborate Multiple Regression Models
Application: The Curious Relationship Between Marriage and Death
Multicollinearity
Application: The Impact of College Graduation and Gender
on Weekly Earnings
Application: Vote Fraud in Philadelphia Senatorial Elections
Exercises
ALE 9a: A Statistical Examination of the Florida Voting in the November 2000
Presidential Election - Did Mistaken Votes for Pat Buchanan Swing the
Election from Gore to Bush?
ALE 9b: Observing and Interpreting the Symptoms of Multicollinearity
ALE 9c: The Market Value of a Bathroom in Georgia
Appendix 1: Prediction Using the Multiple Regression Model
Chapter 10. Diagnostically Checking and Re-Specifying the Multiple Regression
Model: Dealing With Potential Outliers and Heteroscedasticity in the
Cross-Sectional Data Case
Preliminaries
The Fitting Errors as Large-Sample Estimates of the Model Errors, U1 ... UN
Reasons for Checking the Normality of the Model Errors
Heteroscedasticity and its Consequences
Testing for Heteroscedasticity
Correcting for Heteroscedasticity of Known Form
Correcting for Heteroscedasticity of Unknown Form
Application: Is Growth Good For the Poor? Diagnostically Checking the
Dollar/Kraay (2002) Model.1
Exercises
ALE 10a: The Fitting Errors as Approximations for the Model Errors
ALE 10b: Does Output Per Person Depend on Human Capital? (A Test of the
Augmented Solow Model of Growth)2
ALE 10c: Is Trade Good or Bad for the Environment? (First Pass)3
1Uses data from Dollar, D. and A. Kraay (2002) "Growth is Good for the Poor," Journal of Economic
Growth 7, pp. 195-225.
Chapter 11. Stochastic Regressors and Endogeneity
Introduction
Unbiasedness of the OLS Slope Estimator with a Stochastic Regressor
Independent of the Model Error
A Brief Introduction to Asymptotic Theory
Asymptotic Results for the OLS Slope Estimator with a Stochastic
Regressor
Endogenous Regressors: Omitted Variables
Endogenous Regressors: Measurement Error
Endogenous Regressors: Joint Determination - Introduction to Simultaneous
Equation Macroeconomic and Microeconomic Models
How Large a Sample is "Large Enough"? The Simulation Alternative
Exercises
ALE 11a: Central Limit Theorem Convergence for β̂ in the Bivariate
Regression Model
ALE 11b: Bootstrap Analysis of the Convergence of the Asymptotic Sampling
Distributions for Multiple Regression Parameter Estimators
Appendix 1: The Algebra of Probability Limits
Appendix 2: Derivation of the Asymptotic Sampling Distribution of the OLS
Slope Estimator
Chapter 12. Instrumental Variables Estimation
Introduction - Why It Is Challenging to Test for Endogeneity
Correlation versus Causation - Two Ways to Untie the Knot
The Instrumental Variables Slope Estimator (and Proof of Its Consistency) in
the Bivariate Regression Model
Inference Using the Instrumental Variables Slope Estimator
The Two-Stage Least Squares Estimator for the Over-Identified Case
2Uses data from Mankiw, N. G., D. Romer, and D. N. Weil (1992) "A Contribution to the Empirics of
Economic Growth," The Quarterly Journal of Economics 107(2), 407-37. Mankiw, et al. estimate and test a Solow
growth model, augmenting it with a measure of human capital (quantified by the percentage of the population in
secondary school).
3Uses data from Frankel, J. A. and A. K. Rose (2005) "Is Trade Good or Bad for the Environment? Sorting
Out the Causality," The Review of Economics and Statistics 87(1), 85-91. Frankel and Rose quantify and test the
effect of trade openness {(X + M)/Y} on three measures of environmental damage (SO₂, NO₂, and total suspended
particulates). Since trade openness may well be endogenous, they also obtain 2SLS estimates; these are examined in
an Active Learning Exercise for Chapter 12.
Application: The Relationship Between Education and Wages (Angrist and
Krueger, 1991)
Exercises
ALE 12a: The Role of Institutions ('Rule of Law') in Economic Growth4
ALE 12b: Is Trade Good or Bad for the Environment? (Completion)5
ALE 12c: The Impact of Military Service on the Smoking Behavior of
Veterans6
ALE 12d: The Effect of Measurement-Error Contamination on OLS Regression
Estimates and the Durbin IV Estimator
Appendix 1: Derivation of the Asymptotic Sampling Distribution of the
Instrumental Variables and the OLS Slope Estimators
Appendix 2: Proof That the 2SLS Composite Instrument Is Asymptotically
Uncorrelated with the Model Error Term
Chapter 13. Diagnostically Checking and Re-Specifying the Multiple
Regression Model: the Time-Series Data Case (Part A)
An Introduction to Time-Series Data, with a "Road-Map" for this Chapter
The Bivariate Time-Series Regression Model With Fixed Regressors but Serially
Correlated Model Errors, U1 ... UT
Disastrous Parameter Inference with Correlated Model Errors: Two
Cautionary Examples Based on U.S. Consumption Expenditures Data
The AR(1) Model for Serial Dependence in a Time-Series
The Consistency of φ̂1 as an Estimator of φ1 in the AR(1) Model
and its Asymptotic Distribution
Application of the AR(1) Model to the Errors of the (De-Trended)
U.S. Consumption Function - and a Straightforward Test for Serially
Correlated Regression Errors
Dynamic Model Re-Specification: An Effective Response to Serially
Correlated Regression Model Errors, with an Application to the (De-
Trended) U.S. Consumption Function
Exercises
4Uses data from Acemoglu, D., S. Johnson, and J. A. Robinson (2001) "The Colonial Origins of
Comparative Development," The American Economic Review 91(5), 1369-1401. They argue that the European
mortality rate in colonial times is a valid instrument for current institutional quality because Europeans settled (and
imported their cultural institutions) only in colonies with climates they found healthy.
5See footnote for ALE 10c.
6Uses data from Bedard, K. and O. Deschênes (2006) "The Long-Term Impact of Military Service on
Health: Evidence from World War II and Korean War Veterans." The American Economic Review 96(1), 176-194.
They quantify the impact of the provision of free and/or low-cost tobacco products to servicemen on smoking and
(later) on mortality rates, using instrumental variable methods to control for the non-random selection into military
service.
Appendix 1: Derivation of the Asymptotic Sampling Distribution of φ̂1 in
the AR(1) Model
Chapter 14. Diagnostically Checking and Re-Specifying the Multiple
Regression Model: the Time-Series Data Case (Part B)
Introduction: Generalizing the Results to Multiple Time-Series
The Dynamic Multiple Regression Model
I(1) or "Random Walk" Time-Series
Capstone Example Part 1: Modeling Monthly U.S. Consumption Expenditures
in Growth Rates
Capstone Example Part 2: Modeling Monthly U.S. Consumption Expenditures
in Growth Rates and Levels (Cointegrated Model)
Capstone Example Part 3: Modeling the Level of Monthly U.S. Consumption
Expenditures
Which is Better: To Model in Levels or to Model in Changes?
Exercises
ALE 14a: Analyzing the Food Price Sub-Index of the Monthly U.S. Consumer
Price Index
ALE 14b: Estimating Taylor Rules for How the U.S. Fed Sets Interest Rates
Part III Additional Topics in Regression Analysis
Chapter 15. Regression Modeling with Panel Data (Part A)
Introduction: A Source of Large (But Likely Heterogeneous) Data Sets
Re-Visiting the Chapter 5 Illustrative Example Using Data from the Penn World
Table
A Multivariate Empirical Example
The Fixed Effects and the Between Effects Models
The Random Effects Model
Diagnostic Checking of an Estimated Model
Exercises
Appendix 1: Stata Code for the Generalized Hausman Test
Chapter 16. Regression Modeling with Panel Data (Part B)
Relaxing Strict Exogeneity: Dynamics and Lagged Dependent Variables
Relaxing Strict Exogeneity: The First-Differences Model
Summary
Exercises
ALE 16a: Assessing the Impact of 4-H participation on the Standardized Test
Scores of Florida Schoolchildren
ALE 16b: Using Panel Data Methods to Re-Analyze Data from a Public Goods
Experiment
Appendix 1: Summary of Panel-Data Estimation Methods and the Modeling
Situations for Which They Yield Consistent Parameter Estimation
Chapter 17. A Concise Introduction to Time-Series Analysis and Forecasting
(Part A)
Introduction: the Difference Between Time-Series Analysis and Time-Series
Econometrics
Optimal Forecasts: the Primacy of the Conditional-Mean Forecast and When it is
Better to Use a Biased Forecast
The Crucial Assumption (Stationarity) and the Fundamental Tools: the Time-Plot
and the Sample Correlogram
A Polynomial in the Lag Operator and its Inverse: The Key to Understanding and
Manipulating Linear Time-Series Models
Identification/Estimation/Checking/Forecasting of an Invertible MA(q) Model
Identification/Estimation/Checking/Forecasting of a Stationary AR(p) Model
ARMA(p,q) Models and a Summary of the Box-Jenkins Modeling Algorithm
Exercises
ALE 17a: Conditional Forecasting Using a Large-Scale Macroeconometric Model
ALE 17b: Modeling U.S. GNP
Chapter 18. A Concise Introduction to Time-Series Analysis and Forecasting
(Part B)
Integrated - ARIMA(p,d,q) - Models and 'Trend-like' Behavior
A Univariate Application: Modeling the Monthly U.S. Treasury Bill Rate
Seasonal Time-Series Data and ARMA De-Seasonalization of the U.S.
Total Nonfarm Payroll Time-Series
Multivariate Time-Series Models
Post-Sample Model Forecast Evaluation and Testing for Granger-Causation
Modeling Non-Linear Serial Dependence in a Time-Series
Additional Topics in Forecasting
Exercises
ALE 18a: Modeling the South Korean Won - U.S. Dollar Exchange Rate
ALE 18b: Modeling the Daily Returns to Ford Motor Company Stock
Chapter 19. Parameter Estimation Beyond Curve-Fitting: MLE (with an
Application to Binary-Choice Models) and GMM (with an
Application to IV Regression)
Introduction
Maximum Likelihood Estimation of a Simple Bivariate Regression Model
Maximum Likelihood Estimation of Binary-Choice Regression Models
Generalized Method of Moments Estimation
Exercises
ALE 19a: Probit Modeling of the Determinants of Labor Force Participation
ALE 19b: GMM Estimation of a Model for U.S. State-Level Mortality Rates
as a Function of the Unemployment Rate
Appendix 1: GMM Estimation of β in the Bivariate Regression Model (Optimal
Penalty-Weights and Sampling Distribution)
Chapter 20. Concluding Comments
The Goals of This Book
Diagnostic Checking and Model Re-Specification
Avoiding the Four "Big Mistakes"
Mathematics Review