Description
Book Synopsis
Tree-based Methods for Statistical Learning in R provides a thorough introduction to both individual decision tree algorithms (Part I) and ensembles thereof (Part II). Part I of the book brings several different tree algorithms into focus, both conventional and contemporary. Building a strong foundation in how individual decision trees work will help readers understand, at a deeper level, the tree-based ensembles that lie at the cutting edge of modern statistical and machine learning methodology.
The book follows up most ideas and mathematical concepts with code-based examples in the R statistical language, with an emphasis on using as few external packages as possible. For example, readers will write their own random forest and gradient tree boosting functions using simple for loops and basic tree-fitting software (such as rpart and party/partykit). The core chapters also end with a detailed section on relevant software.
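To give a flavor of the kind of exercise described above, here is a minimal, illustrative sketch of bagged regression trees built with a simple for loop around rpart; all function and variable names are our own, and a full random forest would additionally subsample features at each split:

```r
# Bagging sketch: fit rpart trees to bootstrap samples and average predictions.
# Illustrative only; not code from the book itself.
library(rpart)

set.seed(101)
ntree <- 25
n <- nrow(airquality)
fits <- vector("list", ntree)

for (b in seq_len(ntree)) {
  boot <- airquality[sample(n, replace = TRUE), ]  # bootstrap sample of rows
  # A true random forest would also restrict candidate splitters here (mtry)
  fits[[b]] <- rpart(Ozone ~ ., data = boot)
}

# Ensemble prediction for a regression target: average the individual trees
newx <- head(airquality, 5)
preds <- rowMeans(sapply(fits, predict, newdata = newx))
```

The same looping pattern extends to gradient tree boosting by fitting each new tree to the residuals of the current ensemble instead of to a bootstrap sample.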
Trade Review
Tree-based algorithms have been a workhorse for data science teams for decades, but the data science field has lacked an all-encompassing review of trees - and their modern variants like XGBoost - until now. Greenwell has written the ultimate guide for tree-based methods: how they work, their pitfalls, and alternative solutions. He puts it all together in a readable and immediately usable book. You're guaranteed to learn new tips and tricks to help your data science team.
- Alex Gutman, Director of Data Science, author of Becoming a Data Head
"Here’s a new title that is a “must have” for any data scientist who uses the R language. It’s a wonderful learning resource for tree-based techniques in statistical learning, one that’s become my go-to text when I find the need to do a deep dive into various ML topic areas for my work."
- Daniel D. Gutierrez, Editor-in-Chief of insideBIGDATA, February 2023
Table of Contents
1. Introduction
2. Binary recursive partitioning with CART
3. Conditional inference trees
4. The hitchhiker’s GUIDE to modern decision trees
5. Ensemble algorithms
6. Peeking inside the “black box”: post-hoc interpretability
7. Random forests
8. Gradient boosting machines