Description

Table of Contents

Introduction

About This Book

Foolish Assumptions

Icons Used in This Book

Beyond the Book

Where to Go from Here

Book 1: Defining Data Science

Chapter 1: Considering the History and Uses of Data Science

Considering the Elements of Data Science

Considering the emergence of data science

Outlining the core competencies of a data scientist

Linking data science, big data, and AI

Understanding the role of programming

Defining the Role of Data in the World

Enticing people to buy products

Keeping people safer

Creating new technologies

Performing analysis for research

Providing art and entertainment

Making life more interesting in other ways

Creating the Data Science Pipeline

Preparing the data

Performing exploratory data analysis

Learning from data

Visualizing

Obtaining insights and data products

Comparing Different Languages Used for Data Science

Obtaining an overview of data science languages

Defining the pros and cons of using Python

Defining the pros and cons of using R

Learning to Perform Data Science Tasks Fast

Loading data

Training a model

Viewing a result

Chapter 2: Placing Data Science within the Realm of AI

Seeing the Data to Data Science Relationship

Considering the data architecture

Acquiring data from various sources

Performing data analysis

Archiving the data

Defining the Levels of AI

Beginning with AI

Advancing to machine learning

Getting detailed with deep learning

Creating a Pipeline from Data to AI

Considering the desired output

Defining a data architecture

Combining various data sources

Checking for errors and fixing them

Performing the analysis

Validating the result

Enhancing application performance

Chapter 3: Creating a Data Science Lab of Your Own

Considering the Analysis Platform Options

Using a desktop system

Working with an online IDE

Considering the need for a GPU

Choosing a Development Language

Obtaining and Using Python

Working with Python in this book

Obtaining and installing Anaconda for Python

Defining a Python code repository

Working with Python using Google Colaboratory

Defining the limits of using Azure Notebooks with Python and R

Obtaining and Using R

Obtaining and installing Anaconda for R

Starting the R environment

Defining an R code repository

Presenting Frameworks

Defining the differences

Explaining the popularity of frameworks

Choosing a particular library

Accessing the Downloadable Code

Chapter 4: Considering Additional Packages and Libraries You Might Want

Considering the Uses for Third-Party Code

Obtaining Useful Python Packages

Accessing scientific tools using SciPy

Performing fundamental scientific computing using NumPy

Performing data analysis using pandas

Implementing machine learning using Scikit-learn

Going for deep learning with Keras and TensorFlow

Plotting the data using matplotlib

Creating graphs with NetworkX

Parsing HTML documents using Beautiful Soup

Locating Useful R Libraries

Using your Python code in R with reticulate

Conducting advanced training using caret

Performing machine learning tasks using mlr

Visualizing data using ggplot2

Enhancing ggplot2 using esquisse

Creating graphs with igraph

Parsing HTML documents using rvest

Wrangling dates using lubridate

Making big data simpler using dplyr and purrr

Chapter 5: Leveraging a Deep Learning Framework

Understanding Deep Learning Framework Usage

Working with Low-End Frameworks

Chainer

PyTorch

MXNet

Microsoft Cognitive Toolkit/CNTK

Understanding TensorFlow

Grasping why TensorFlow is so good

Making TensorFlow easier by using TFLearn

Using Keras as the best simplifier

Getting your copy of TensorFlow and Keras

Fixing the C++ build tools error in Windows

Accessing your new environment in Notebook

Book 2: Interacting with Data Storage

Chapter 1: Manipulating Raw Data

Defining the Data Sources

Obtaining data locally

Using online data sources

Employing dynamic data sources

Considering other kinds of data sources

Considering the Data Forms

Working with pure text

Accessing formatted text

Deciphering binary data

Understanding the Need for Data Reliability

Chapter 2: Using Functional Programming Techniques

Defining Functional Programming

Differences with other programming paradigms

Understanding its goals

Understanding Pure and Impure Languages

Using the pure approach

Using the impure approach

Comparing the Functional Paradigm

Imperative

Procedural

Object-oriented

Declarative

Using Python for Functional Programming Needs

Understanding How Functional Data Works

Working with immutable data

Considering the role of state

Eliminating side effects

Passing by reference versus by value

Working with Lists and Strings

Creating lists

Evaluating lists

Performing common list manipulations

Understanding the Dict and Set alternatives

Considering the use of strings

Employing Pattern Matching

Looking for patterns in data

Understanding regular expressions

Using pattern matching in analysis

Working with pattern matching

Working with Recursion

Performing tasks more than once

Understanding recursion

Using recursion on lists

Considering advanced recursive tasks

Passing functions instead of variables

Performing Functional Data Manipulation

Slicing and dicing

Mapping your data

Filtering data

Organizing data

Chapter 3: Working with Scalars, Vectors, and Matrices

Considering the Data Forms

Defining Data Type through Scalars

Creating Organized Data with Vectors

Defining a vector

Creating vectors of a specific type

Performing math on vectors

Performing logical and comparison tasks on vectors

Multiplying vectors

Creating and Using Matrices

Creating a matrix

Creating matrices of a specific type

Using the matrix class

Performing matrix multiplication

Executing advanced matrix operations

Extending Analysis to Tensors

Using Vectorization Effectively

Selecting and Shaping Data

Slicing rows

Slicing columns

Dicing

Concatenating

Aggregating

Working with Trees

Understanding the basics of trees

Building a tree

Representing Relations in a Graph

Going beyond trees

Arranging graphs

Chapter 4: Accessing Data in Files

Understanding Flat File Data Sources

Working with Positional Data Files

Accessing Data in CSV Files

Working with a simple CSV file

Making use of header information

Moving On to XML Files

Working with a simple XML file

Parsing XML

Using XPath for data extraction

Considering Other Flat-File Data Sources

Working with Nontext Data

Downloading Online Datasets

Working with package datasets

Using public domain datasets

Chapter 5: Working with a Relational DBMS

Considering RDBMS Issues

Defining the use of tables

Understanding keys and indexes

Using local versus online databases

Working in read-only mode

Accessing the RDBMS Data

Using the SQL language

Relying on scripts

Relying on views

Relying on functions

Creating a Dataset

Combining data from multiple tables

Ensuring data completeness

Slicing and dicing the data as needed

Mixing RDBMS Products

Chapter 6: Working with a NoSQL DBMS

Considering the Ramifications of Hierarchical Data

Understanding hierarchical organization

Developing strategies for freeform data

Performing an analysis

Working around dangling data

Accessing the Data

Creating a picture of the data form

Employing the correct transiting strategy

Ordering the data

Interacting with Data from NoSQL Databases

Working with Dictionaries

Developing Datasets from Hierarchical Data

Processing Hierarchical Data into Other Forms

Book 3: Manipulating Data Using Basic Algorithms

Chapter 1: Working with Linear Regression

Considering the History of Linear Regression

Combining Variables

Working through simple linear regression

Advancing to multiple linear regression

Considering which question to ask

Reducing independent variable complexity

Manipulating Categorical Variables

Creating categorical variables

Renaming levels

Combining levels

Using Linear Regression to Guess Numbers

Defining the family of linear models

Using more variables in a larger dataset

Understanding variable transformations

Doing variable transformations

Creating interactions between variables

Understanding limitations and problems

Learning One Example at a Time

Using Gradient Descent

Implementing Stochastic Gradient Descent

Considering the effects of regularization

Chapter 2: Moving Forward with Logistic Regression

Considering the History of Logistic Regression

Differentiating between Linear and Logistic Regression

Considering the model

Defining the logistic function

Understanding the problems that logistic regression solves

Fitting the curve

Considering a pass/fail example

Using Logistic Regression to Guess Classes

Applying logistic regression

Considering when classes are more

Defining logistic regression performance

Switching to Probabilities

Specifying a binary response

Transforming numeric estimates into probabilities

Working through Multiclass Regression

Understanding multiclass regression

Developing a multiclass regression implementation

Chapter 3: Predicting Outcomes Using Bayes

Understanding Bayes’ Theorem

Delving into Bayes history

Considering the basic theorem

Using Naïve Bayes for Predictions

Finding out that Naïve Bayes isn’t so naïve

Predicting text classifications

Getting an overview of Bayesian inference

Working with Networked Bayes

Considering the network types and uses

Understanding Directed Acyclic Graphs (DAGs)

Employing networked Bayes in predictions

Deciding between automated and guided learning

Considering the Use of Bayesian Linear Regression

Considering the Use of Bayesian Logistic Regression

Chapter 4: Learning with K-Nearest Neighbors

Considering the History of K-Nearest Neighbors

Learning Lazily with K-Nearest Neighbors

Understanding the basis of KNN

Predicting after observing neighbors

Choosing the k parameter wisely

Leveraging the Correct k Parameter

Understanding the k parameter

Experimenting with a flexible algorithm

Implementing KNN Regression

Implementing KNN Classification

Book 4: Performing Advanced Data Manipulation

Chapter 1: Leveraging Ensembles of Learners

Leveraging Decision Trees

Growing a forest of trees

Seeing Random Forests in action

Understanding the importance measures

Configuring your system for importance measures with Python

Seeing importance measures in action

Working with Almost Random Guesses

Understanding the premise

Bagging predictors with AdaBoost

Meeting Again with Gradient Descent

Understanding the GBM difference

Seeing GBM in action

Averaging Different Predictors

Chapter 2: Building Deep Learning Models

Discovering the Incredible Perceptron

Understanding perceptron functionality

Touching the nonseparability limit

Hitting Complexity with Neural Networks

Considering the neuron

Pushing data with feed-forward

Defining hidden layers

Executing operations

Considering the details of data movement through the neural network

Using backpropagation to adjust learning

Understanding More about Neural Networks

Getting an overview of the neural network process

Defining the basic architecture

Documenting the essential modules

Solving a simple problem

Looking Under the Hood of Neural Networks

Choosing the right activation function

Relying on a smart optimizer

Setting a working learning rate

Explaining Deep Learning Differences with Other Forms of AI

Adding more layers

Changing the activations

Adding regularization by dropout

Using online learning

Transferring learning

Learning end to end

Chapter 3: Recognizing Images with CNNs

Beginning with Simple Image Recognition

Considering the ramifications of sight

Working with a set of images

Extracting visual features

Recognizing faces using Eigenfaces

Classifying images

Understanding CNN Image Basics

Moving to CNNs with Character Recognition

Accessing the dataset

Reshaping the dataset

Encoding the categories

Defining the model

Using the model

Explaining How Convolutions Work

Understanding convolutions

Simplifying the use of pooling

Describing the LeNet architecture

Detecting Edges and Shapes from Images

Visualizing convolutions

Unveiling successful architectures

Discussing transfer learning

Chapter 4: Processing Text and Other Sequences

Introducing Natural Language Processing

Defining the human perspective as it relates to data science

Considering the computer perspective as it relates to data science

Understanding How Machines Read

Creating a corpus

Performing feature extraction

Understanding the BoW

Processing and enhancing text

Maintaining order using n-grams

Stemming and removing stop words

Scraping textual datasets from the web

Handling problems with raw text

Storing processed text data in sparse matrices

Understanding Semantics Using Word Embeddings

Using Scoring and Classification

Performing classification tasks

Analyzing reviews from e-commerce

Book 5: Performing Data-Related Tasks

Chapter 1: Making Recommendations

Realizing the Recommendation Revolution

Downloading Rating Data

Navigating through anonymous web data

Encountering the limits of rating data

Leveraging SVD

Considering the origins of SVD

Understanding the SVD connection

Chapter 2: Performing Complex Classifications

Using Image Classification Challenges

Delving into ImageNet and Coco

Learning the magic of data augmentation

Distinguishing Traffic Signs

Preparing the image data

Running a classification task

Chapter 3: Identifying Objects

Distinguishing Classification Tasks

Understanding the problem

Performing localization

Classifying multiple objects

Annotating multiple objects in images

Segmenting images

Perceiving Objects in Their Surroundings

Considering vision needs in self-driving cars

Discovering how RetinaNet works

Using the Keras-RetinaNet code

Overcoming Adversarial Attacks on Deep Learning Applications

Tricking pixels

Hacking with stickers and other artifacts

Chapter 4: Analyzing Music and Video

Learning to Imitate Art and Life

Transferring an artistic style

Reducing the problem to statistics

Understanding that deep learning doesn’t create

Mimicking an Artist

Defining a new piece based on a single artist

Combining styles to create new art

Visualizing how neural networks dream

Using a network to compose music

Other creative avenues

Moving toward GANs

Finding the key in the competition

Considering a growing field

Chapter 5: Considering Other Task Types

Processing Language in Texts

Considering the processing methodologies

Defining understanding as tokenization

Putting all the documents into a bag

Using AI for sentiment analysis

Processing Time Series

Defining sequences of events

Performing a prediction using LSTM

Chapter 6: Developing Impressive Charts and Plots

Starting a Graph, Chart, or Plot

Understanding the differences between graphs, charts, and plots

Considering the graph, chart, and plot types

Defining the plot

Drawing multiple lines

Drawing multiple plots

Saving your work

Setting the Axis, Ticks, and Grids

Getting the axis

Formatting the ticks

Adding grids

Defining the Line Appearance

Working with line styles

Adding markers

Using Labels, Annotations, and Legends

Adding labels

Annotating the chart

Creating a legend

Creating Scatterplots

Depicting groups

Showing correlations

Plotting Time Series

Representing time on axes

Plotting trends over time

Plotting Geographical Data

Getting the toolkit

Drawing the map

Plotting the data

Visualizing Graphs

Understanding the adjacency matrix

Using NetworkX basics

Book 6: Diagnosing and Fixing Errors

Chapter 1: Locating Errors in Your Data

Considering the Types of Data Errors

Obtaining the Required Data

Considering the data sources

Obtaining reliable data

Making human input more reliable

Using automated data collection

Validating Your Data

Figuring out what’s in your data

Removing duplicates

Creating a data map and a data plan

Manicuring the Data

Dealing with missing data

Considering data misalignments

Separating out useful data

Dealing with Dates in Your Data

Formatting date and time values

Using the right time transformation

Chapter 2: Considering Outrageous Outcomes

Deciding What Outrageous Means

Considering the Five Mistruths in Data

Commission

Omission

Perspective

Bias

Frame-of-reference

Considering Detection of Outliers

Understanding outlier basics

Finding more things that can go wrong

Understanding anomalies and novel data

Examining a Simple Univariate Method

Using the pandas package

Leveraging the Gaussian distribution

Making assumptions and checking out

Developing a Multivariate Approach

Using principal component analysis

Using cluster analysis

Automating outliers detection with Isolation Forests

Chapter 3: Dealing with Model Overfitting and Underfitting

Understanding the Causes

Considering the problem

Looking at underfitting

Looking at overfitting

Plotting learning curves for insights

Determining the Sources of Overfitting and Underfitting

Understanding bias and variance

Having insufficient data

Being fooled by data leakage

Guessing the Right Features

Selecting variables like a pro

Using nonlinear transformations

Regularizing linear models

Chapter 4: Obtaining the Correct Output Presentation

Considering the Meaning of Correct

Determining a Presentation Type

Considering the audience

Defining a depth of detail

Ensuring that the data is consistent with audience needs

Understanding timeliness

Choosing the Right Graph

Telling a story with your graphs

Showing parts of a whole with pie charts

Creating comparisons with bar charts

Showing distributions using histograms

Depicting groups using boxplots

Defining a data flow using line graphs

Seeing data patterns using scatterplots

Working with External Data

Embedding plots and other images

Loading examples from online sites

Obtaining online graphics and multimedia

Chapter 5: Developing Consistent Strategies

Standardizing Data Collection Techniques

Using Reliable Sources

Verifying Dynamic Data Sources

Considering the problem

Analyzing streams with the right recipe

Looking for New Data Collection Trends

Weeding Old Data

Considering the Need for Randomness

Considering why randomization is needed

Understanding how probability works

Index

Data Science Programming All-in-One For Dummies

    A Paperback / softback by John Paul Mueller, Luca Massaron

      Publisher: John Wiley & Sons Inc
      Publication Date: 10/02/2020
      ISBN13: 9781119626114, 978-1119626114
      ISBN10: 1119626110
      Also in:
      Mathematics


      Understanding the k parameter 342

      Experimenting with a flexible algorithm 343

      Implementing KNN Regression 345

      Implementing KNN Classification 347

      Book 4: Performing Advanced Data Manipulation 351

      Chapter 1: Leveraging Ensembles of Learners 353

      Leveraging Decision Trees 354

      Growing a forest of trees 356

      Seeing Random Forests in action 358

      Understanding the importance measures 360

      Configuring your system for importance measures with Python 361

      Seeing importance measures in action 361

      Working with Almost Random Guesses 364

      Understanding the premise 365

      Bagging predictors with AdaBoost 366

      Meeting Again with Gradient Descent 369

      Understanding the GBM difference 369

      Seeing GBM in action 371

      Averaging Different Predictors 372

      Chapter 2: Building Deep Learning Models 373

      Discovering the Incredible Perceptron 374

      Understanding perceptron functionality 375

      Touching the nonseparability limit 376

      Hitting Complexity with Neural Networks 378

      Considering the neuron 379

      Pushing data with feed-forward 381

      Defining hidden layers 383

      Executing operations 384

      Considering the details of data movement through the neural network 386

      Using backpropagation to adjust learning 387

      Understanding More about Neural Networks 390

      Getting an overview of the neural network process 391

      Defining the basic architecture 391

      Documenting the essential modules 393

      Solving a simple problem 396

      Looking Under the Hood of Neural Networks 399

      Choosing the right activation function 399

      Relying on a smart optimizer 401

      Setting a working learning rate 402

      Explaining Deep Learning Differences with Other Forms of AI 402

      Adding more layers 403

      Changing the activations 405

      Adding regularization by dropout 406

      Using online learning 407

      Transferring learning 407

      Learning end to end 408

      Chapter 3: Recognizing Images with CNNs 409

      Beginning with Simple Image Recognition 410

      Considering the ramifications of sight 410

      Working with a set of images 411

      Extracting visual features 417

      Recognizing faces using Eigenfaces 419

      Classifying images 423

      Understanding CNN Image Basics 427

      Moving to CNNs with Character Recognition 429

      Accessing the dataset 430

      Reshaping the dataset 431

      Encoding the categories 432

      Defining the model 432

      Using the model 433

      Explaining How Convolutions Work 435

      Understanding convolutions 435

      Simplifying the use of pooling 439

      Describing the LeNet architecture 440

      Detecting Edges and Shapes from Images 446

      Visualizing convolutions 447

      Unveiling successful architectures 449

      Discussing transfer learning 450

      Chapter 4: Processing Text and Other Sequences 453

      Introducing Natural Language Processing 454

      Defining the human perspective as it relates to data science 454

      Considering the computer perspective as it relates to data science 455

      Understanding How Machines Read 456

      Creating a corpus 457

      Performing feature extraction 457

      Understanding the BoW 458

      Processing and enhancing text 459

      Maintaining order using n-grams 461

      Stemming and removing stop words 462

      Scraping textual datasets from the web 465

      Handling problems with raw text 470

      Storing processed text data in sparse matrices 473

      Understanding Semantics Using Word Embeddings 478

      Using Scoring and Classification 482

      Performing classification tasks 482

      Analyzing reviews from e-commerce 485

      Book 5: Performing Data-Related Tasks 491

      Chapter 1: Making Recommendations 493

      Realizing the Recommendation Revolution 494

      Downloading Rating Data 495

      Navigating through anonymous web data 496

      Encountering the limits of rating data 499

      Leveraging SVD 506

      Considering the origins of SVD 506

      Understanding the SVD connection 508

      Chapter 2: Performing Complex Classifications 509

      Using Image Classification Challenges 510

      Delving into ImageNet and COCO 511

      Learning the magic of data augmentation 513

      Distinguishing Traffic Signs 516

      Preparing the image data 517

      Running a classification task 520

      Chapter 3: Identifying Objects 525

      Distinguishing Classification Tasks 526

      Understanding the problem 526

      Performing localization 527

      Classifying multiple objects 528

      Annotating multiple objects in images 529

      Segmenting images 530

      Perceiving Objects in Their Surroundings 531

      Considering vision needs in self-driving cars 531

      Discovering how RetinaNet works 532

      Using the Keras-RetinaNet code 534

      Overcoming Adversarial Attacks on Deep Learning Applications 538

      Tricking pixels 539

      Hacking with stickers and other artifacts 541

      Chapter 4: Analyzing Music and Video 543

      Learning to Imitate Art and Life 544

      Transferring an artistic style 545

      Reducing the problem to statistics 546

      Understanding that deep learning doesn’t create 548

      Mimicking an Artist 548

      Defining a new piece based on a single artist 549

      Combining styles to create new art 550

      Visualizing how neural networks dream 551

      Using a network to compose music 551

      Other creative avenues 552

      Moving toward GANs 553

      Finding the key in the competition 554

      Considering a growing field 556

      Chapter 5: Considering Other Task Types 559

      Processing Language in Texts 560

      Considering the processing methodologies 560

      Defining understanding as tokenization 561

      Putting all the documents into a bag 562

      Using AI for sentiment analysis 566

      Processing Time Series 574

      Defining sequences of events 574

      Performing a prediction using LSTM 575

      Chapter 6: Developing Impressive Charts and Plots 579

      Starting a Graph, Chart, or Plot 580

      Understanding the differences between graphs, charts, and plots 580

      Considering the graph, chart, and plot types 582

      Defining the plot 583

      Drawing multiple lines 584

      Drawing multiple plots 584

      Saving your work 586

      Setting the Axis, Ticks, and Grids 587

      Getting the axis 587

      Formatting the ticks 590

      Adding grids 590

      Defining the Line Appearance 591

      Working with line styles 592

      Adding markers 593

      Using Labels, Annotations, and Legends 594

      Adding labels 595

      Annotating the chart 596

      Creating a legend 598

      Creating Scatterplots 599

      Depicting groups 599

      Showing correlations 600

      Plotting Time Series 603

      Representing time on axes 604

      Plotting trends over time 605

      Plotting Geographical Data 608

      Getting the toolkit 608

      Drawing the map 609

      Plotting the data 613

      Visualizing Graphs 615

      Understanding the adjacency matrix 615

      Using NetworkX basics 615

      Book 6: Diagnosing and Fixing Errors 619

      Chapter 1: Locating Errors in Your Data 621

      Considering the Types of Data Errors 622

      Obtaining the Required Data 624

      Considering the data sources 624

      Obtaining reliable data 625

      Making human input more reliable 626

      Using automated data collection 628

      Validating Your Data 629

      Figuring out what’s in your data 629

      Removing duplicates 631

      Creating a data map and a data plan 632

      Manicuring the Data 634

      Dealing with missing data 634

      Considering data misalignments 639

      Separating out useful data 640

      Dealing with Dates in Your Data 640

      Formatting date and time values 641

      Using the right time transformation 641

      Chapter 2: Considering Outrageous Outcomes 643

      Deciding What Outrageous Means 644

      Considering the Five Mistruths in Data 645

      Commission 645

      Omission 646

      Perspective 646

      Bias 647

      Frame-of-reference 648

      Considering Detection of Outliers 649

      Understanding outlier basics 649

      Finding more things that can go wrong 651

      Understanding anomalies and novel data 651

      Examining a Simple Univariate Method 653

      Using the pandas package 653

      Leveraging the Gaussian distribution 655

      Making assumptions and checking out 656

      Developing a Multivariate Approach 657

      Using principal component analysis 658

      Using cluster analysis 659

      Automating outlier detection with Isolation Forests 661

      Chapter 3: Dealing with Model Overfitting and Underfitting 663

      Understanding the Causes 664

      Considering the problem 664

      Looking at underfitting 665

      Looking at overfitting 666

      Plotting learning curves for insights 668

      Determining the Sources of Overfitting and Underfitting 670

      Understanding bias and variance 671

      Having insufficient data 671

      Being fooled by data leakage 672

      Guessing the Right Features 672

      Selecting variables like a pro 673

      Using nonlinear transformations 676

      Regularizing linear models 684

      Chapter 4: Obtaining the Correct Output Presentation 689

      Considering the Meaning of Correct 690

      Determining a Presentation Type 691

      Considering the audience 691

      Defining a depth of detail 692

      Ensuring that the data is consistent with audience needs 693

      Understanding timeliness 693

      Choosing the Right Graph 694

      Telling a story with your graphs 694

      Showing parts of a whole with pie charts 694

      Creating comparisons with bar charts 695

      Showing distributions using histograms 697

      Depicting groups using boxplots 699

      Defining a data flow using line graphs 700

      Seeing data patterns using scatterplots 701

      Working with External Data 702

      Embedding plots and other images 703

      Loading examples from online sites 703

      Obtaining online graphics and multimedia 704

      Chapter 5: Developing Consistent Strategies 707

      Standardizing Data Collection Techniques 707

      Using Reliable Sources 709

      Verifying Dynamic Data Sources 711

      Considering the problem 712

      Analyzing streams with the right recipe 714

      Looking for New Data Collection Trends 715

      Weeding Old Data 716

      Considering the Need for Randomness 717

      Considering why randomization is needed 718

      Understanding how probability works 718

      Index 721


      © 2026 Book Curl
