Description

Book Synopsis
A handbook and reference guide for students and practitioners of statistical regression-based analyses in R.

Handbook of Regression Analysis with Applications in R, Second Edition is a comprehensive and up-to-date guide to conducting complex regressions in the R statistical programming language. The authors' thorough treatment of classical regression analysis in the first edition is complemented here by their discussion of more advanced topics, including time-to-event survival data and longitudinal and clustered data. The book also pays particular attention to methods that have become prominent in the last few decades as increasingly large data sets have made new techniques and applications possible. These include regularization methods, smoothing methods, and tree-based methods.

In the new edition of the Handbook, the data analyst's toolkit is explored and expanded. Examples are drawn from a wide variety of real-life applications and data sets. All of the R code and data used are available via an author-maintained website. Of interest to undergraduate and graduate students taking courses in statistics and regression, the Handbook of Regression Analysis will also be invaluable to practicing data scientists and statisticians.

Table of Contents

Preface to the Second Edition

Preface to the First Edition

Part I The Multiple Linear Regression Model

1 Multiple Linear Regression

1.1 Introduction

1.2 Concepts and Background Material

1.2.1 The Linear Regression Model

1.2.2 Estimation Using Least Squares

1.2.3 Assumptions

1.3 Methodology

1.3.1 Interpreting Regression Coefficients

1.3.2 Measuring the Strength of the Regression Relationship

1.3.3 Hypothesis Tests and Confidence Intervals for β

1.3.4 Fitted Values and Predictions

1.3.5 Checking Assumptions Using Residual Plots

1.4 Example: Estimating Home Prices

1.5 Summary

2 Model Building

2.1 Introduction

2.2 Concepts and Background Material

2.2.1 Using Hypothesis Tests to Compare Models

2.2.2 Collinearity

2.3 Methodology

2.3.1 Model Selection

2.3.2 Example: Estimating Home Prices (continued)

2.4 Indicator Variables and Modeling Interactions

2.4.1 Example: Electronic Voting and the 2004 Presidential Election

2.5 Summary

Part II Addressing Violations of Assumptions

3 Diagnostics for Unusual Observations

3.1 Introduction

3.2 Concepts and Background Material

3.3 Methodology

3.3.1 Residuals and Outliers

3.3.2 Leverage Points

3.3.3 Influential Points and Cook’s Distance

3.4 Example: Estimating Home Prices (continued)

3.5 Summary

4 Transformations and Linearizable Models

4.1 Introduction

4.2 Concepts and Background Material: The Log-Log Model

4.3 Concepts and Background Material: Semilog Models

4.3.1 Logged Response Variable

4.3.2 Logged Predictor Variable

4.4 Example: Predicting Movie Grosses After One Week

4.5 Summary

5 Time Series Data and Autocorrelation

5.1 Introduction

5.2 Concepts and Background Material

5.3 Methodology: Identifying Autocorrelation

5.3.1 The Durbin-Watson Statistic

5.3.2 The Autocorrelation Function (ACF)

5.3.3 Residual Plots and the Runs Test

5.4 Methodology: Addressing Autocorrelation

5.4.1 Detrending and Deseasonalizing

5.4.2 Example: e-Commerce Retail Sales

5.4.3 Lagging and Differencing

5.4.4 Example: Stock Indexes

5.4.5 Generalized Least Squares (GLS): The Cochrane-Orcutt Procedure

5.4.6 Example: Time Intervals Between Old Faithful Geyser Eruptions

5.5 Summary

Part III Categorical Predictors

6 Analysis of Variance

6.1 Introduction

6.2 Concepts and Background Material

6.2.1 One-Way ANOVA

6.2.2 Two-Way ANOVA

6.3 Methodology

6.3.1 Codings for Categorical Predictors

6.3.2 Multiple Comparisons

6.3.3 Levene’s Test and Weighted Least Squares

6.3.4 Membership in Multiple Groups

6.4 Example: DVD Sales of Movies

6.5 Higher-Way ANOVA

6.6 Summary

7 Analysis of Covariance

7.1 Introduction

7.2 Methodology

7.2.1 Constant Shift Models

7.2.2 Varying Slope Models

7.3 Example: International Grosses of Movies

7.4 Summary

Part IV Non-Gaussian Regression Models

8 Logistic Regression

8.1 Introduction

8.2 Concepts and Background Material

8.2.1 The Logit Response Function

8.2.2 Bernoulli and Binomial Random Variables

8.2.3 Prospective and Retrospective Designs

8.3 Methodology

8.3.1 Maximum Likelihood Estimation

8.3.2 Inference, Model Comparison, and Model Selection

8.3.3 Goodness-of-Fit

8.3.4 Measures of Association and Classification Accuracy

8.3.5 Diagnostics

8.4 Example: Smoking and Mortality

8.5 Example: Modeling Bankruptcy

8.6 Summary

9 Multinomial Regression

9.1 Introduction

9.2 Concepts and Background Material

9.2.1 Nominal Response Variable

9.2.2 Ordinal Response Variable

9.3 Methodology

9.3.1 Estimation

9.3.2 Inference, Model Comparisons, and Strength of Fit

9.3.3 Lack of Fit and Violations of Assumptions

9.4 Example: City Bond Ratings

9.5 Summary

10 Count Regression

10.1 Introduction

10.2 Concepts and Background Material

10.2.1 The Poisson Random Variable

10.2.2 Generalized Linear Models

10.3 Methodology

10.3.1 Estimation and Inference

10.3.2 Offsets

10.4 Overdispersion and Negative Binomial Regression

10.4.1 Quasi-likelihood

10.4.2 Negative Binomial Regression

10.5 Example: Unprovoked Shark Attacks in Florida

10.6 Other Count Regression Models

10.7 Poisson Regression and Weighted Least Squares

10.7.1 Example: International Grosses of Movies (continued)

10.8 Summary

11 Models for Time-to-Event (Survival) Data

11.1 Introduction

11.2 Concepts and Background Material

11.2.1 The Nature of Survival Data

11.2.2 Accelerated Failure Time Models

11.2.3 The Proportional Hazards Model

11.3 Methodology

11.3.1 The Kaplan-Meier Estimator and the Log-Rank Test

11.3.2 Parametric (Likelihood) Estimation

11.3.3 Semiparametric (Partial Likelihood) Estimation

11.3.4 The Buckley-James Estimator

11.4 Example: The Survival of Broadway Shows (continued)

11.5 Left-Truncated/Right-Censored Data and Time-Varying Covariates

11.5.1 Left-Truncated/Right-Censored Data

11.5.2 Example: The Survival of Broadway Shows (continued)

11.5.3 Time-Varying Covariates

11.5.4 Example: Female Heads of Government

11.6 Summary

Part V Other Regression Models

12 Nonlinear Regression

12.1 Introduction

12.2 Concepts and Background Material

12.3 Methodology

12.3.1 Nonlinear Least Squares Estimation

12.3.2 Inference for Nonlinear Regression Models

12.4 Example: Michaelis-Menten Enzyme Kinetics

12.5 Summary

13 Models for Longitudinal and Nested Data

13.1 Introduction

13.2 Concepts and Background Material

13.2.1 Nested Data and ANOVA

13.2.2 Longitudinal Data and Time Series

13.2.3 Fixed Effects Versus Random Effects

13.3 Methodology

13.3.1 The Linear Mixed Effects Model

13.3.2 The Generalized Linear Mixed Effects Model

13.3.3 Generalized Estimating Equations

13.3.4 Nonlinear Mixed Effects Models

13.4 Example: Tumor Growth in a Cancer Study

13.5 Example: Unprovoked Shark Attacks in the United States

13.6 Summary

14 Regularization Methods and Sparse Models

14.1 Introduction

14.2 Concepts and Background Material

14.2.1 The Bias–Variance Tradeoff

14.2.2 Large Numbers of Predictors and Sparsity

14.3 Methodology

14.3.1 Forward Stepwise Regression

14.3.2 Ridge Regression

14.3.3 The Lasso

14.3.4 Other Regularization Methods

14.3.5 Choosing the Regularization Parameter(s)

14.3.6 More Structured Regression Problems

14.3.7 Cautions About Regularization Methods

14.4 Example: Human Development Index

14.5 Summary

Part VI Nonparametric and Semiparametric Models

15 Smoothing and Additive Models

15.1 Introduction

15.2 Concepts and Background Material

15.2.1 The Bias–Variance Tradeoff

15.2.2 Smoothing and Local Regression

15.3 Methodology

15.3.1 Local Polynomial Regression

15.3.2 Choosing the Bandwidth

15.3.3 Smoothing Splines

15.3.4 Multiple Predictors, the Curse of Dimensionality, and Additive Models

15.4 Example: Prices of German Used Automobiles

15.5 Local and Penalized Likelihood Regression

15.5.1 Example: The Bechdel Rule and Hollywood Movies

15.6 Using Smoothing to Identify Interactions

15.6.1 Example: Estimating Home Prices (continued)

15.7 Summary

16 Tree-Based Models

16.1 Introduction

16.2 Concepts and Background Material

16.2.1 Recursive Partitioning

16.2.2 Types of Trees

16.3 Methodology

16.3.1 CART

16.3.2 Conditional Inference Trees

16.3.3 Ensemble Methods

16.4 Examples

16.4.1 Estimating Home Prices (continued)

16.4.2 Example: Courtesy in Airplane Travel

16.5 Trees for Other Types of Data

16.5.1 Trees for Nested and Longitudinal Data

16.5.2 Survival Trees

16.6 Summary

Bibliography

Index

Handbook of Regression Analysis With Applications

    £99.86

    RRP £110.95 – you save £11.09 (9%)

    A Hardback by Samprit Chatterjee, Jeffrey S. Simonoff

      Publisher: John Wiley & Sons Inc
      Publication Date: 02/10/2020
      ISBN13: 9781119392378
      ISBN10: 1119392373
