Description

Book Synopsis


Table of Contents

Foreword by Ravi Bapna xix

Foreword by Gareth James xxi

Preface to the Second R Edition xxiii

Acknowledgments xxvi

Part I Preliminaries

Chapter 1 Introduction 3

1.1 What Is Business Analytics? 3

1.2 What Is Machine Learning? 5

1.3 Machine Learning, AI, and Related Terms 5

1.4 Big Data 7

1.5 Data Science 8

1.6 Why Are There So Many Different Methods? 8

1.7 Terminology and Notation 9

1.8 Road Maps to This Book 11

Order of Topics 13

Chapter 2 Overview of the Machine Learning Process 17

2.1 Introduction 17

2.2 Core Ideas in Machine Learning 18

Classification 18

Prediction 18

Association Rules and Recommendation Systems 18

Predictive Analytics 19

Data Reduction and Dimension Reduction 19

Data Exploration and Visualization 19

Supervised and Unsupervised Learning 20

2.3 The Steps in a Machine Learning Project 21

2.4 Preliminary Steps 23

Organization of Data 23

Predicting Home Values in the West Roxbury Neighborhood 23

Loading and Looking at the Data in R 24

Sampling from a Database 26

Oversampling Rare Events in Classification Tasks 27

Preprocessing and Cleaning the Data 28

2.5 Predictive Power and Overfitting 35

Overfitting 36

Creating and Using Data Partitions 38

2.6 Building a Predictive Model 41

Modeling Process 41

2.7 Using R for Machine Learning on a Local Machine 46

2.8 Automating Machine Learning Solutions 47

Predicting Power Generator Failure 48

Uber’s Michelangelo 50

2.9 Ethical Practice in Machine Learning 52

Machine Learning Software: The State of the Market (by Herb Edelstein) 53

Problems 57

Part II Data Exploration and Dimension Reduction

Chapter 3 Data Visualization 63

3.1 Uses of Data Visualization 63

Base R or ggplot? 65

3.2 Data Examples 65

Example 1: Boston Housing Data 65

Example 2: Ridership on Amtrak Trains 67

3.3 Basic Charts: Bar Charts, Line Charts, and Scatter Plots 67

Distribution Plots: Boxplots and Histograms 70

Heatmaps: Visualizing Correlations and Missing Values 73

3.4 Multidimensional Visualization 75

Adding Variables: Color, Size, Shape, Multiple Panels, and Animation 76

Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, Filtering 79

Reference: Trend Lines and Labels 83

Scaling Up to Large Datasets 85

Multivariate Plot: Parallel Coordinates Plot 85

Interactive Visualization 88

3.5 Specialized Visualizations 91

Visualizing Networked Data 91

Visualizing Hierarchical Data: Treemaps 93

Visualizing Geographical Data: Map Charts 95

3.6 Major Visualizations and Operations, by Machine Learning Goal 97

Prediction 97

Classification 97

Time Series Forecasting 97

Unsupervised Learning 98

Problems 99

Chapter 4 Dimension Reduction 101

4.1 Introduction 101

4.2 Curse of Dimensionality 102

4.3 Practical Considerations 102

Example 1: House Prices in Boston 103

4.4 Data Summaries 103

Summary Statistics 104

Aggregation and Pivot Tables 104

4.5 Correlation Analysis 107

4.6 Reducing the Number of Categories in Categorical Variables 109

4.7 Converting a Categorical Variable to a Numerical Variable 111

4.8 Principal Component Analysis 111

Example 2: Breakfast Cereals 111

Principal Components 116

Normalizing the Data 117

Using Principal Components for Classification and Prediction 120

4.9 Dimension Reduction Using Regression Models 121

4.10 Dimension Reduction Using Classification and Regression Trees 121

Problems 123

Part III Performance Evaluation

Chapter 5 Evaluating Predictive Performance 129

5.1 Introduction 130

5.2 Evaluating Predictive Performance 130

Naive Benchmark: The Average 131

Prediction Accuracy Measures 131

Comparing Training and Holdout Performance 133

Cumulative Gains and Lift Charts 133

5.3 Judging Classifier Performance 136

Benchmark: The Naive Rule 136

Class Separation 136

The Confusion (Classification) Matrix 137

Using the Holdout Data 138

Accuracy Measures 139

Propensities and Threshold for Classification 139

Performance in Case of Unequal Importance of Classes 143

Asymmetric Misclassification Costs 146

Generalization to More Than Two Classes 149

5.4 Judging Ranking Performance 150

Cumulative Gains and Lift Charts for Binary Data 150

Decile-wise Lift Charts 153

Beyond Two Classes 154

Gains and Lift Charts Incorporating Costs and Benefits 154

Cumulative Gains as a Function of Threshold 155

5.5 Oversampling 156

Creating an Over-sampled Training Set 158

Evaluating Model Performance Using a Non-oversampled Holdout Set 159

Evaluating Model Performance If Only Oversampled Holdout Set Exists 159

Problems 162

Part IV Prediction and Classification Methods

Chapter 6 Multiple Linear Regression 167

6.1 Introduction 167

6.2 Explanatory vs. Predictive Modeling 168

6.3 Estimating the Regression Equation and Prediction 170

Example: Predicting the Price of Used Toyota Corolla Cars 171

Cross-validation and caret 175

6.4 Variable Selection in Linear Regression 176

Reducing the Number of Predictors 176

How to Reduce the Number of Predictors 178

Regularization (Shrinkage Models) 183

Problems 188

Chapter 7 k-Nearest Neighbors (k-NN) 193

7.1 The k-NN Classifier (Categorical Outcome) 193

Determining Neighbors 194

Classification Rule 194

Example: Riding Mowers 195

Choosing k 196

Weighted k-NN 199

Setting the Cutoff Value 200

k-NN with More Than Two Classes 201

Converting Categorical Variables to Binary Dummies 201

7.2 k-NN for a Numerical Outcome 201

7.3 Advantages and Shortcomings of k-NN Algorithms 204

Problems 205

Chapter 8 The Naive Bayes Classifier 207

8.1 Introduction 207

Threshold Probability Method 208

Conditional Probability 208

Example 1: Predicting Fraudulent Financial Reporting 208

8.2 Applying the Full (Exact) Bayesian Classifier 209

Using the “Assign to the Most Probable Class” Method 210

Using the Threshold Probability Method 210

Practical Difficulty with the Complete (Exact) Bayes Procedure 210

8.3 Solution: Naive Bayes 211

The Naive Bayes Assumption of Conditional Independence 212

Using the Threshold Probability Method 212

Example 2: Predicting Fraudulent Financial Reports, Two Predictors 213

Example 3: Predicting Delayed Flights 214

Working with Continuous Predictors 218

8.4 Advantages and Shortcomings of the Naive Bayes Classifier 220

Problems 223

Chapter 9 Classification and Regression Trees 225

9.1 Introduction 226

Tree Structure 227

Decision Rules 227

Classifying a New Record 227

9.2 Classification Trees 228

Recursive Partitioning 228

Example 1: Riding Mowers 228

Measures of Impurity 231

9.3 Evaluating the Performance of a Classification Tree 235

Example 2: Acceptance of Personal Loan 236

9.4 Avoiding Overfitting 239

Stopping Tree Growth 242

Pruning the Tree 243

Best-Pruned Tree 245

9.5 Classification Rules from Trees 247

9.6 Classification Trees for More Than Two Classes 248

9.7 Regression Trees 249

Prediction 250

Measuring Impurity 250

Evaluating Performance 250

9.8 Advantages and Weaknesses of a Tree 250

9.9 Improving Prediction: Random Forests and Boosted Trees 252

Random Forests 252

Boosted Trees 254

Problems 257

Chapter 10 Logistic Regression 261

10.1 Introduction 261

10.2 The Logistic Regression Model 263

10.3 Example: Acceptance of Personal Loan 264

Model with a Single Predictor 265

Estimating the Logistic Model from Data: Computing Parameter Estimates 267

Interpreting Results in Terms of Odds (for a Profiling Goal) 270

10.4 Evaluating Classification Performance 271

10.5 Variable Selection 273

10.6 Logistic Regression for Multi-Class Classification 274

Ordinal Classes 275

Nominal Classes 276

10.7 Example of Complete Analysis: Predicting Delayed Flights 277

Data Preprocessing 282

Model-Fitting and Estimation 282

Model Interpretation 282

Model Performance 284

Variable Selection 285

Problems 289

Chapter 11 Neural Nets 293

11.1 Introduction 293

11.2 Concept and Structure of a Neural Network 294

11.3 Fitting a Network to Data 295

Example 1: Tiny Dataset 295

Computing Output of Nodes 296

Preprocessing the Data 299

Training the Model 300

Example 2: Classifying Accident Severity 304

Avoiding Overfitting 305

Using the Output for Prediction and Classification 305

11.4 Required User Input 307

11.5 Exploring the Relationship Between Predictors and Outcome 308

11.6 Deep Learning 309

Convolutional Neural Networks (CNNs) 310

Local Feature Map 311

A Hierarchy of Features 311

The Learning Process 312

Unsupervised Learning 312

Example: Classification of Fashion Images 313

Conclusion 320

11.7 Advantages and Weaknesses of Neural Networks 320

Problems 322

Chapter 12 Discriminant Analysis 325

12.1 Introduction 325

Example 1: Riding Mowers 326

Example 2: Personal Loan Acceptance 327

12.2 Distance of a Record from a Class 327

12.3 Fisher’s Linear Classification Functions 329

12.4 Classification Performance of Discriminant Analysis 333

12.5 Prior Probabilities 334

12.6 Unequal Misclassification Costs 334

12.7 Classifying More Than Two Classes 336

Example 3: Medical Dispatch to Accident Scenes 336

12.8 Advantages and Weaknesses 339

Problems 341

Chapter 13 Generating, Comparing, and Combining Multiple Models 345

13.1 Ensembles 346

Why Ensembles Can Improve Predictive Power 346

Simple Averaging or Voting 348

Bagging 349

Boosting 349

Bagging and Boosting in R 349

Stacking 350

Advantages and Weaknesses of Ensembles 351

13.2 Automated Machine Learning (AutoML) 352

AutoML: Explore and Clean Data 352

AutoML: Determine Machine Learning Task 353

AutoML: Choose Features and Machine Learning Methods 354

AutoML: Evaluate Model Performance 354

AutoML: Model Deployment 356

Advantages and Weaknesses of Automated Machine Learning 357

13.3 Explaining Model Predictions 358

13.4 Summary 360

Problems 362

Part V Intervention and User Feedback

Chapter 14 Interventions: Experiments, Uplift Models, and Reinforcement Learning 367

14.1 A/B Testing 368

Example: Testing a New Feature in a Photo Sharing App 369

The Statistical Test for Comparing Two Groups (T-Test) 370

Multiple Treatment Groups: A/B/n Tests 372

Multiple A/B Tests and the Danger of Multiple Testing 372

14.2 Uplift (Persuasion) Modeling 373

Gathering the Data 374

A Simple Model 376

Modeling Individual Uplift 376

Computing Uplift with R 378

Using the Results of an Uplift Model 378

14.3 Reinforcement Learning 380

Explore-Exploit: Multi-armed Bandits 380

Example of Using a Contextual Multi-Arm Bandit for Movie Recommendations 382

Markov Decision Process (MDP) 383

14.4 Summary 388

Problems 390

Part VI Mining Relationships Among Records

Chapter 15 Association Rules and Collaborative Filtering 393

15.1 Association Rules 394

Discovering Association Rules in Transaction Databases 394

Example 1: Synthetic Data on Purchases of Phone Faceplates 394

Generating Candidate Rules 395

The Apriori Algorithm 397

Selecting Strong Rules 397

Data Format 399

The Process of Rule Selection 400

Interpreting the Results 401

Rules and Chance 403

Example 2: Rules for Similar Book Purchases 405

15.2 Collaborative Filtering 407

Data Type and Format 407

Example 3: Netflix Prize Contest 408

User-Based Collaborative Filtering: “People Like You” 409

Item-Based Collaborative Filtering 411

Evaluating Performance 412

Example 4: Predicting Movie Ratings with MovieLens Data 413

Advantages and Weaknesses of Collaborative Filtering 416

Collaborative Filtering vs. Association Rules 417

15.3 Summary 419

Problems 421

Chapter 16 Cluster Analysis 425

16.1 Introduction 426

Example: Public Utilities 427

16.2 Measuring Distance Between Two Records 429

Euclidean Distance 429

Normalizing Numerical Variables 430

Other Distance Measures for Numerical Data 432

Distance Measures for Categorical Data 433

Distance Measures for Mixed Data 434

16.3 Measuring Distance Between Two Clusters 434

Minimum Distance 434

Maximum Distance 435

Average Distance 435

Centroid Distance 435

16.4 Hierarchical (Agglomerative) Clustering 437

Single Linkage 437

Complete Linkage 438

Average Linkage 438

Centroid Linkage 438

Ward’s Method 438

Dendrograms: Displaying Clustering Process and Results 439

Validating Clusters 441

Limitations of Hierarchical Clustering 443

16.5 Non-Hierarchical Clustering: The k-Means Algorithm 444

Choosing the Number of Clusters (k) 445

Problems 450

Part VII Forecasting Time Series

Chapter 17 Handling Time Series 455

17.1 Introduction 455

17.2 Descriptive vs. Predictive Modeling 457

17.3 Popular Forecasting Methods in Business 457

Problems 466

Chapter 18 Regression-Based Forecasting 469

18.1 A Model with Trend 469

Linear Trend 469

Exponential Trend 473

Polynomial Trend 474

Problems 489

Chapter 19 Smoothing and Deep Learning Methods for Forecasting 499

19.1 Smoothing Methods: Introduction 500

19.2 Moving Average 500

Centered Moving Average for Visualization 500

Trailing Moving Average for Forecasting 501

Choosing Window Width (w) 504

Problems 516

Part VIII Data Analytics

Chapter 20 Social Network Analytics 527

20.1 Introduction 527

20.2 Directed vs. Undirected Networks 529

20.3 Visualizing and Analyzing Networks 530

Plot Layout 530

Edge List 533

Adjacency Matrix 533

Using Network Data in Classification and Prediction 534

Problems 548

Chapter 21 Text Mining 549

21.1 Introduction 549

21.2 The Tabular Representation of Text 550

21.3 Bag-of-Words vs. Meaning Extraction at Document Level 551

Problems 570

Chapter 22 Responsible Data Science 573

22.1 Introduction 573

22.2 Unintentional Harm 574

22.3 Legal Considerations 576

22.4 Principles of Responsible Data Science 577

Non-maleficence 578

Fairness 578

Transparency 579

Accountability 580

Data Privacy and Security 580

Problems 599

Part IX Cases

Chapter 23 Cases 603

23.1 Charles Book Club 603

The Book Industry 603

Database Marketing at Charles 604

Machine Learning Techniques 606

Assignment 608

23.2 German Credit 610

Background 610

Data 610

Assignment 614

Index 647

Machine Learning for Business Analytics

Product form

£98.96

Includes FREE delivery

RRP £109.95 – you save £10.99 (9%)

A Hardback by Galit Shmueli, Peter C. Bruce, Peter Gedeck

15 in stock


    Publisher: John Wiley & Sons Inc
    Publication Date: 08/02/2023
    ISBN13: 978-1119835172
    ISBN10: 1119835178

