# Applied Predictive Modeling

### Chapter 1 Introduction

Prediction Versus Interpretation; Key Ingredients of Predictive Models; Terminology; Example Data Sets and Typical Data Scenarios; Overview; Notation (15 pages, 3 figures)

## Part I: General Strategies

### Chapter 2 A Short Tour of the Predictive Modeling Process

Case Study: Predicting Fuel Economy; Themes; Summary (8 pages, 6 figures, R packages used)

### Chapter 3 Data Pre-Processing

Case Study: Cell Segmentation in High-Content Screening; Data Transformations for Individual Predictors; Data Transformations for Multiple Predictors; Dealing with Missing Values; Removing Variables; Adding Variables; Binning Variables; Computing; Exercises (32 pages, 11 figures, R packages used)

### Chapter 4 Over-Fitting and Model Tuning

The Problem of Over-Fitting; Model Tuning; Data Splitting; Resampling Techniques; Case Study: Credit Scoring; Choosing Final Tuning Parameters; Data Splitting Recommendations; Choosing Between Models; Computing; Exercises (29 pages, 13 figures, R packages used)

## Part II: Regression Models

### Chapter 5 Measuring Performance in Regression Models

Quantitative Measures of Performance; The Variance-Bias Tradeoff; Computing (4 pages, 3 figures)

### Chapter 6 Linear Regression and Its Cousins

Case Study: Quantitative Structure-Activity Relationship Modeling; Linear Regression; Partial Least Squares; Penalized Models; Computing; Exercises (37 pages, 20 figures, R packages used)

### Chapter 7 Non-Linear Regression Models

Neural Networks; Multivariate Adaptive Regression Splines; Support Vector Machines; K-Nearest Neighbors; Computing; Exercises (28 pages, 10 figures, R packages used)

### Chapter 8 Regression Trees and Rule-Based Models

Basic Regression Trees; Regression Model Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; Cubist; Computing; Exercises (46 pages, 24 figures, R packages used)

### Chapter 9 A Summary of Solubility Models

(3 pages, 3 figures)

### Chapter 10 Case Study: Compressive Strength of Concrete Mixtures

Model Building Strategy; Model Performance; Optimizing Compressive Strength; Computing (12 pages, 5 figures, R packages used)

## Part III: Classification Models

### Chapter 11 Measuring Performance in Classification Models

Class Predictions; Evaluating Predicted Classes; Evaluating Class Probabilities; Computing (20 pages, 9 figures, R packages used)

### Chapter 12 Discriminant Analysis and Other Linear Classification Models

Case Study; Logistic Regression; Linear Discriminant Analysis; Partial Least Squares Discriminant Analysis; Penalized Models; Nearest Shrunken Centroids; Computing; Exercises (52 pages, 20 figures, R packages used)

### Chapter 13 Non-Linear Classification Models

Nonlinear Discriminant Analysis; Neural Networks; Flexible Discriminant Analysis; Support Vector Machines; K-Nearest Neighbors; Naive Bayes; Computing; Exercises (38 pages, 16 figures, R packages used)

### Chapter 14 Classification Trees and Rule-Based Models

Basic Classification Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; C5.0; Wrap-Up; Computing (46 pages, 15 figures, R packages used)

### Chapter 15 A Summary of Grant Application Models

(3 pages, 2 figures)

### Chapter 16 Remedies for Severe Class Imbalance

Case Study: Predicting Caravan Policy Ownership; The Effect of Class Imbalance; Model Tuning; Alternate Cutoffs; Adjusting Prior Probabilities; Unequal Case Weights; Sampling Methods; Cost-Sensitive Training; Computing; Exercises (24 pages, 7 figures, R packages used)

### Chapter 17 Case Study: Job Scheduling

Data Splitting and Model Strategy; Results; Computing (13 pages, 6 figures, R packages used)

## Part IV: Other Considerations

### Chapter 18 Measuring Predictor Importance

Numeric Outcomes; Categorical Outcomes; Other Approaches; Computing; Exercises (24 pages, 10 figures, R packages used)

### Chapter 19 An Introduction to Feature Selection

Consequences of Using Non-Informative Predictors; Approaches for Reducing the Number of Predictors; Wrapper Methods; Filter Methods; Selection Bias; Misuse of Feature Selection; Case Study: Predicting Cognitive Impairment; Computing; Exercises (34 pages, 7 figures, R packages used)

### Chapter 20 Factors That Can Affect Model Performance

Type III Errors; Measurement Error in the Outcome; Measurement Error in the Predictors; Discretizing Continuous Outcomes; When Should You Trust Your Model’s Prediction?; The Impact of a Large Sample; Computing; Exercises (26 pages, 12 figures, R packages used)

## Appendix

These are included in the sample pages on Springer’s website.

### Appendix A A Summary of Various Models

### Appendix B An Introduction to R

Startup and Getting Help; Packages; Creating Objects; Data Types and Basic Structures; Working with Rectangular Data Sets; Objects and Classes; R Functions; The Three Faces of =; The AppliedPredictiveModeling Package; The caret Package; Software Used in This Text (16 pages, 1 figure, R packages used)