{"product_id":"statistical-data-analytics-9781118619650","title":"Statistical Data Analytics","description":"\u003cb\u003eBook Synopsis\u003c\/b\u003e\u003cbr\u003eA comprehensive introduction to statistical methods for data mining and knowledge discovery.\u003cbr\u003e\u003cbr\u003e\u003cb\u003eTable of Contents\u003c\/b\u003e\u003cbr\u003ePreface xiii \u003cp\u003e\u003cb\u003ePart I Background: Introductory Statistical Analytics 1\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003e1 Data analytics and data mining 3\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e1.1 Knowledge discovery: finding structure in data 3\u003c\/p\u003e \u003cp\u003e1.2 Data quality versus data quantity 5\u003c\/p\u003e \u003cp\u003e1.3 Statistical modeling versus statistical description 7\u003c\/p\u003e \u003cp\u003e\u003cb\u003e2 Basic probability and statistical distributions 10\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e2.1 Concepts in probability 10\u003c\/p\u003e \u003cp\u003e2.1.1 Probability rules 11\u003c\/p\u003e \u003cp\u003e2.1.2 Random variables and probability functions 12\u003c\/p\u003e \u003cp\u003e2.1.3 Means, variances, and expected values 17\u003c\/p\u003e \u003cp\u003e2.1.4 Median, quartiles, and quantiles 18\u003c\/p\u003e \u003cp\u003e2.1.5 Bivariate expected values, covariance, and correlation 20\u003c\/p\u003e \u003cp\u003e2.2 Multiple random variables∗ 21\u003c\/p\u003e \u003cp\u003e2.3 Univariate families of distributions 23\u003c\/p\u003e \u003cp\u003e2.3.1 Binomial distribution 23\u003c\/p\u003e \u003cp\u003e2.3.2 Poisson distribution 26\u003c\/p\u003e \u003cp\u003e2.3.3 Geometric distribution 27\u003c\/p\u003e \u003cp\u003e2.3.4 Negative binomial distribution 27\u003c\/p\u003e \u003cp\u003e2.3.5 Discrete uniform distribution 28\u003c\/p\u003e \u003cp\u003e2.3.6 Continuous uniform distribution 29\u003c\/p\u003e \u003cp\u003e2.3.7 Exponential distribution 29\u003c\/p\u003e \u003cp\u003e2.3.8 Gamma and chi-square distributions 30\u003c\/p\u003e 
\u003cp\u003e2.3.9 Normal (Gaussian) distribution 32\u003c\/p\u003e \u003cp\u003e2.3.10 Distributions derived from normal 37\u003c\/p\u003e \u003cp\u003e2.3.11 The exponential family 41\u003c\/p\u003e \u003cp\u003e\u003cb\u003e3 Data manipulation 49\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e3.1 Random sampling 49\u003c\/p\u003e \u003cp\u003e3.2 Data types 51\u003c\/p\u003e \u003cp\u003e3.3 Data summarization 52\u003c\/p\u003e \u003cp\u003e3.3.1 Means, medians, and central tendency 52\u003c\/p\u003e \u003cp\u003e3.3.2 Summarizing variation 56\u003c\/p\u003e \u003cp\u003e3.3.3 Summarizing (bivariate) correlation 59\u003c\/p\u003e \u003cp\u003e3.4 Data diagnostics and data transformation 60\u003c\/p\u003e \u003cp\u003e3.4.1 Outlier analysis 60\u003c\/p\u003e \u003cp\u003e3.4.2 Entropy∗ 62\u003c\/p\u003e \u003cp\u003e3.4.3 Data transformation 64\u003c\/p\u003e \u003cp\u003e3.5 Simple smoothing techniques 65\u003c\/p\u003e \u003cp\u003e3.5.1 Binning 66\u003c\/p\u003e \u003cp\u003e3.5.2 Moving averages∗ 67\u003c\/p\u003e \u003cp\u003e3.5.3 Exponential smoothing∗ 69\u003c\/p\u003e \u003cp\u003e\u003cb\u003e4 Data visualization and statistical graphics 76\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e4.1 Univariate visualization 77\u003c\/p\u003e \u003cp\u003e4.1.1 Strip charts and dot plots 77\u003c\/p\u003e \u003cp\u003e4.1.2 Boxplots 79\u003c\/p\u003e \u003cp\u003e4.1.3 Stem-and-leaf plots 81\u003c\/p\u003e \u003cp\u003e4.1.4 Histograms and density estimators 83\u003c\/p\u003e \u003cp\u003e4.1.5 Quantile plots 87\u003c\/p\u003e \u003cp\u003e4.2 Bivariate and multivariate visualization 89\u003c\/p\u003e \u003cp\u003e4.2.1 Pie charts and bar charts 90\u003c\/p\u003e \u003cp\u003e4.2.2 Multiple boxplots and QQ plots 95\u003c\/p\u003e \u003cp\u003e4.2.3 Scatterplots and bubble plots 98\u003c\/p\u003e \u003cp\u003e4.2.4 Heatmaps 102\u003c\/p\u003e \u003cp\u003e4.2.5 Time series plots∗ 105\u003c\/p\u003e \u003cp\u003e\u003cb\u003e5 Statistical inference 
115\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e5.1 Parameters and likelihood 115\u003c\/p\u003e \u003cp\u003e5.2 Point estimation 117\u003c\/p\u003e \u003cp\u003e5.2.1 Bias 118\u003c\/p\u003e \u003cp\u003e5.2.2 The method of moments 118\u003c\/p\u003e \u003cp\u003e5.2.3 Least squares\/weighted least squares 119\u003c\/p\u003e \u003cp\u003e5.2.4 Maximum likelihood∗ 120\u003c\/p\u003e \u003cp\u003e5.3 Interval estimation 123\u003c\/p\u003e \u003cp\u003e5.3.1 Confidence intervals 123\u003c\/p\u003e \u003cp\u003e5.3.2 Single-sample intervals for normal (Gaussian) parameters 124\u003c\/p\u003e \u003cp\u003e5.3.3 Two-sample intervals for normal (Gaussian) parameters 128\u003c\/p\u003e \u003cp\u003e5.3.4 Wald intervals and likelihood intervals∗ 131\u003c\/p\u003e \u003cp\u003e5.3.5 Delta method intervals∗ 135\u003c\/p\u003e \u003cp\u003e5.3.6 Bootstrap intervals∗ 137\u003c\/p\u003e \u003cp\u003e5.4 Testing hypotheses 138\u003c\/p\u003e \u003cp\u003e5.4.1 Single-sample tests for normal (Gaussian) parameters 140\u003c\/p\u003e \u003cp\u003e5.4.2 Two-sample tests for normal (Gaussian) parameters 142\u003c\/p\u003e \u003cp\u003e5.4.3 Wald tests, likelihood ratio tests, and ‘exact’ tests∗ 145\u003c\/p\u003e \u003cp\u003e5.5 Multiple inferences∗ 148\u003c\/p\u003e \u003cp\u003e5.5.1 Bonferroni multiplicity adjustment 149\u003c\/p\u003e \u003cp\u003e5.5.2 False discovery rate 151\u003c\/p\u003e \u003cp\u003e\u003cb\u003ePart II Statistical Learning and Data Analytics 161\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e\u003cb\u003e6 Techniques for supervised learning: simple linear regression 163\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e6.1 What is “supervised learning?” 163\u003c\/p\u003e \u003cp\u003e6.2 Simple linear regression 164\u003c\/p\u003e \u003cp\u003e6.2.1 The simple linear model 164\u003c\/p\u003e \u003cp\u003e6.2.2 Multiple inferences and simultaneous confidence bands 171\u003c\/p\u003e \u003cp\u003e6.3 Regression diagnostics 175\u003c\/p\u003e \u003cp\u003e6.4 
Weighted least squares (WLS) regression 184\u003c\/p\u003e \u003cp\u003e6.5 Correlation analysis 187\u003c\/p\u003e \u003cp\u003e6.5.1 The correlation coefficient 187\u003c\/p\u003e \u003cp\u003e6.5.2 Rank correlation 190\u003c\/p\u003e \u003cp\u003e\u003cb\u003e7 Techniques for supervised learning: multiple linear regression 198\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e7.1 Multiple linear regression 198\u003c\/p\u003e \u003cp\u003e7.1.1 Matrix formulation 199\u003c\/p\u003e \u003cp\u003e7.1.2 Weighted least squares for the MLR model 200\u003c\/p\u003e \u003cp\u003e7.1.3 Inferences under the MLR model 201\u003c\/p\u003e \u003cp\u003e7.1.4 Multicollinearity 208\u003c\/p\u003e \u003cp\u003e7.2 Polynomial regression 210\u003c\/p\u003e \u003cp\u003e7.3 Feature selection 211\u003c\/p\u003e \u003cp\u003e7.3.1 R\u003csup\u003e2\u003c\/sup\u003e\u003csub\u003ep\u003c\/sub\u003e plots 212\u003c\/p\u003e \u003cp\u003e7.3.2 Information criteria: AIC and BIC 215\u003c\/p\u003e \u003cp\u003e7.3.3 Automated variable selection 216\u003c\/p\u003e \u003cp\u003e7.4 Alternative regression methods∗ 223\u003c\/p\u003e \u003cp\u003e7.4.1 Loess 224\u003c\/p\u003e \u003cp\u003e7.4.2 Regularization: ridge regression 230\u003c\/p\u003e \u003cp\u003e7.4.3 Regularization and variable selection: the Lasso 238\u003c\/p\u003e \u003cp\u003e7.5 Qualitative predictors: ANOVA models 242\u003c\/p\u003e \u003cp\u003e\u003cb\u003e8 Supervised learning: generalized linear models 258\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e8.1 Extending the linear regression model 258\u003c\/p\u003e \u003cp\u003e8.1.1 Nonnormal data and the exponential family 258\u003c\/p\u003e \u003cp\u003e8.1.2 Link functions 259\u003c\/p\u003e \u003cp\u003e8.2 Technical details for GLiMs∗ 259\u003c\/p\u003e \u003cp\u003e8.2.1 Estimation 260\u003c\/p\u003e \u003cp\u003e8.2.2 The deviance function 261\u003c\/p\u003e \u003cp\u003e8.2.3 Residuals 262\u003c\/p\u003e \u003cp\u003e8.2.4 Inference and model assessment 264\u003c\/p\u003e \u003cp\u003e8.3 Selected forms of GLiMs 265\u003c\/p\u003e 
\u003cp\u003e8.3.1 Logistic regression and binary-data GLiMs 265\u003c\/p\u003e \u003cp\u003e8.3.2 Trend testing with proportion data 271\u003c\/p\u003e \u003cp\u003e8.3.3 Contingency tables and log-linear models 273\u003c\/p\u003e \u003cp\u003e8.3.4 Gamma regression models 281\u003c\/p\u003e \u003cp\u003e\u003cb\u003e9 Supervised learning: classification 291\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e9.1 Binary classification via logistic regression 292\u003c\/p\u003e \u003cp\u003e9.1.1 Logistic discriminants 292\u003c\/p\u003e \u003cp\u003e9.1.2 Discriminant rule accuracy 296\u003c\/p\u003e \u003cp\u003e9.1.3 ROC curves 297\u003c\/p\u003e \u003cp\u003e9.2 Linear discriminant analysis (LDA) 297\u003c\/p\u003e \u003cp\u003e9.2.1 Linear discriminant functions 297\u003c\/p\u003e \u003cp\u003e9.2.2 Bayes discriminant\/classification rules 302\u003c\/p\u003e \u003cp\u003e9.2.3 Bayesian classification with normal data 303\u003c\/p\u003e \u003cp\u003e9.2.4 Naïve Bayes classifiers 308\u003c\/p\u003e \u003cp\u003e9.3 k-Nearest neighbor classifiers 308\u003c\/p\u003e \u003cp\u003e9.4 Tree-based methods 312\u003c\/p\u003e \u003cp\u003e9.4.1 Classification trees 312\u003c\/p\u003e \u003cp\u003e9.4.2 Pruning 314\u003c\/p\u003e \u003cp\u003e9.4.3 Boosting 321\u003c\/p\u003e \u003cp\u003e9.4.4 Regression trees 321\u003c\/p\u003e \u003cp\u003e9.5 Support vector machines∗ 322\u003c\/p\u003e \u003cp\u003e9.5.1 Separable data 322\u003c\/p\u003e \u003cp\u003e9.5.2 Nonseparable data 325\u003c\/p\u003e \u003cp\u003e9.5.3 Kernel transformations 326\u003c\/p\u003e \u003cp\u003e\u003cb\u003e10 Techniques for unsupervised learning: dimension reduction 341\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e10.1 Unsupervised versus supervised learning 341\u003c\/p\u003e \u003cp\u003e10.2 Principal component analysis 342\u003c\/p\u003e \u003cp\u003e10.2.1 Principal components 342\u003c\/p\u003e \u003cp\u003e10.2.2 Implementing a PCA 344\u003c\/p\u003e \u003cp\u003e10.3 Exploratory factor analysis 
351\u003c\/p\u003e \u003cp\u003e10.3.1 The factor analytic model 351\u003c\/p\u003e \u003cp\u003e10.3.2 Principal factor estimation 353\u003c\/p\u003e \u003cp\u003e10.3.3 Maximum likelihood estimation 354\u003c\/p\u003e \u003cp\u003e10.3.4 Selecting the number of factors 355\u003c\/p\u003e \u003cp\u003e10.3.5 Factor rotation 356\u003c\/p\u003e \u003cp\u003e10.3.6 Implementing an EFA 357\u003c\/p\u003e \u003cp\u003e10.4 Canonical correlation analysis∗ 361\u003c\/p\u003e \u003cp\u003e\u003cb\u003e11 Techniques for unsupervised learning: clustering and association 373\u003c\/b\u003e\u003c\/p\u003e \u003cp\u003e11.1 Cluster analysis 373\u003c\/p\u003e \u003cp\u003e11.1.1 Hierarchical clustering 376\u003c\/p\u003e \u003cp\u003e11.1.2 Partitioned clustering 384\u003c\/p\u003e \u003cp\u003e11.2 Association rules\/market basket analysis 395\u003c\/p\u003e \u003cp\u003e11.2.1 Association rules for binary observations 396\u003c\/p\u003e \u003cp\u003e11.2.2 Measures of rule quality 397\u003c\/p\u003e \u003cp\u003e11.2.3 The Apriori algorithm 398\u003c\/p\u003e \u003cp\u003e11.2.4 Statistical measures of association quality 402\u003c\/p\u003e \u003cp\u003eA Matrix manipulation 411\u003c\/p\u003e \u003cp\u003eA.1 Vectors and matrices 411\u003c\/p\u003e \u003cp\u003eA.2 Matrix algebra 412\u003c\/p\u003e \u003cp\u003eA.3 Matrix inversion 414\u003c\/p\u003e \u003cp\u003eA.4 Quadratic forms 415\u003c\/p\u003e \u003cp\u003eA.5 Eigenvalues and eigenvectors 415\u003c\/p\u003e \u003cp\u003eA.6 Matrix factorizations 416\u003c\/p\u003e \u003cp\u003eA.6.1 QR decomposition 417\u003c\/p\u003e \u003cp\u003eA.6.2 Spectral decomposition 417\u003c\/p\u003e \u003cp\u003eA.6.3 Matrix square root 417\u003c\/p\u003e \u003cp\u003eA.6.4 Singular value decomposition 418\u003c\/p\u003e \u003cp\u003eA.7 Statistics via matrix operations 419\u003c\/p\u003e \u003cp\u003eB Brief introduction to R 421\u003c\/p\u003e \u003cp\u003eB.1 Data entry and manipulation 422\u003c\/p\u003e \u003cp\u003eB.2 A 
turbo-charged calculator 426\u003c\/p\u003e \u003cp\u003eB.3 R functions 427\u003c\/p\u003e \u003cp\u003eB.3.1 Inbuilt R functions 427\u003c\/p\u003e \u003cp\u003eB.3.2 Flow control 429\u003c\/p\u003e \u003cp\u003eB.3.3 User-defined functions 429\u003c\/p\u003e \u003cp\u003eB.4 R packages 430\u003c\/p\u003e \u003cp\u003eReferences 432\u003c\/p\u003e \u003cp\u003eIndex 453\u003c\/p\u003e","brand":"John Wiley \u0026 Sons Inc","offers":[{"title":"Default Title","offer_id":49528833343831,"sku":"9781118619650","price":73.76,"currency_code":"GBP","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0817\/1739\/5799\/files\/9781118619650.jpg?v=1731873199","url":"https:\/\/bookcurl.com\/products\/statistical-data-analytics-9781118619650","provider":"Book Curl","version":"1.0","type":"link"}