{"product_id":"data-mining-442-wiley-series-in-probability-and-statistics-9780471268512","title":"Data Mining 442 Wiley Series in Probability and Statistics","description":"\u003cb\u003eBook Synopsis\u003c\/b\u003e\u003cbr\u003eThis reference book develops a systematic process of data exploration, data cleaning and evolving a suitable modelling strategy to help analysts determine and implement a final technique.\u003cbr\u003e\u003cbr\u003e\u003cb\u003eTrade Review\u003c\/b\u003e\u003cbr\u003e\"Statisticians not conversant with today's statistical take on DQ should read this book…and be stimulated to do important research in DQ.\" (\u003ci\u003eJournal of the American Statistical Association\u003c\/i\u003e, March 2006)  \u003cp\u003e\"…uniquely integrates several approaches for data cleaning and exploration…\" (\u003ci\u003eJournal of Statistical Computation \u0026amp; Simulation\u003c\/i\u003e, April 2004)\u003c\/p\u003e \u003cp\u003e\"...provides a uniquely integrated approach...for serious data analysts everywhere...\" (\u003ci\u003eZentralblatt Math\u003c\/i\u003e, Vol. 1027, 2004)\u003c\/p\u003e\u003cbr\u003e\u003cbr\u003e\u003cb\u003eTable of Contents\u003c\/b\u003e\u003cbr\u003e0.1 Preface.  
\u003cp\u003e1 Exploratory Data Mining and Data Cleaning: An Overview.\u003c\/p\u003e \u003cp\u003e1.1 Introduction.\u003c\/p\u003e \u003cp\u003e1.2 Cautionary Tales.\u003c\/p\u003e \u003cp\u003e1.3 Taming the Data.\u003c\/p\u003e \u003cp\u003e1.4 Challenges.\u003c\/p\u003e \u003cp\u003e1.5 Methods.\u003c\/p\u003e \u003cp\u003e1.6 EDM.\u003c\/p\u003e \u003cp\u003e1.6.1 EDM Summaries - Parametric.\u003c\/p\u003e \u003cp\u003e1.6.2 EDM Summaries - Nonparametric.\u003c\/p\u003e \u003cp\u003e1.7 End-to-End Data Quality (DQ).\u003c\/p\u003e \u003cp\u003e1.7.1 DQ in Data Preparation.\u003c\/p\u003e \u003cp\u003e1.7.2 EDM and Data Glitches.\u003c\/p\u003e \u003cp\u003e1.7.3 Tools for DQ.\u003c\/p\u003e \u003cp\u003e1.7.4 End-to-End DQ: The Data Quality Continuum.\u003c\/p\u003e \u003cp\u003e1.7.5 Measuring Data Quality.\u003c\/p\u003e \u003cp\u003e1.8 Conclusion.\u003c\/p\u003e \u003cp\u003e2 Exploratory Data Mining.\u003c\/p\u003e \u003cp\u003e2.1 Introduction.\u003c\/p\u003e \u003cp\u003e2.2 Uncertainty.\u003c\/p\u003e \u003cp\u003e2.2.1 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e2.3 EDM: Exploratory Data Mining.\u003c\/p\u003e \u003cp\u003e2.4 EDM Summaries.\u003c\/p\u003e \u003cp\u003e2.4.1 Typical Values.\u003c\/p\u003e \u003cp\u003e2.4.2 Attribute Variation.\u003c\/p\u003e \u003cp\u003e2.4.3 Example.\u003c\/p\u003e \u003cp\u003e2.4.4 Attribute Relationships.\u003c\/p\u003e \u003cp\u003e2.4.5 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e2.5 What Makes a Summary Useful?\u003c\/p\u003e \u003cp\u003e2.5.1 Statistical Properties.\u003c\/p\u003e \u003cp\u003e2.5.2 Computational Criteria.\u003c\/p\u003e \u003cp\u003e2.5.3 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e2.6 Data-Driven Approach - Nonparametric Analysis.\u003c\/p\u003e \u003cp\u003e2.6.1 The Joy of Counting.\u003c\/p\u003e \u003cp\u003e2.6.2 Empirical Cumulative Distribution Function (ECDF).\u003c\/p\u003e \u003cp\u003e2.6.3 Univariate Histograms.\u003c\/p\u003e \u003cp\u003e2.6.4 
Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e2.7 EDM in Higher Dimensions.\u003c\/p\u003e \u003cp\u003e2.8 Rectilinear Histograms.\u003c\/p\u003e \u003cp\u003e2.9 Depth and Multivariate Binning.\u003c\/p\u003e \u003cp\u003e2.9.1 Data Depth.\u003c\/p\u003e \u003cp\u003e2.9.2 Aside: Depth-Related Topics.\u003c\/p\u003e \u003cp\u003e2.9.3 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e2.10 Conclusion.\u003c\/p\u003e \u003cp\u003e3 Partitions and Piecewise Models.\u003c\/p\u003e \u003cp\u003e3.1 Divide and Conquer.\u003c\/p\u003e \u003cp\u003e3.1.1 Why Do We Need Partitions?\u003c\/p\u003e \u003cp\u003e3.1.2 Dividing Data.\u003c\/p\u003e \u003cp\u003e3.1.3 Applications of Partition-based EDM Summaries.\u003c\/p\u003e \u003cp\u003e3.2 Axis-Aligned Partitions and Data Cubes.\u003c\/p\u003e \u003cp\u003e3.3 Nonlinear Partitions.\u003c\/p\u003e \u003cp\u003e3.3.1 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e3.4 DataSpheres (DS).\u003c\/p\u003e \u003cp\u003e3.4.1 Layers.\u003c\/p\u003e \u003cp\u003e3.4.2 Data Pyramids.\u003c\/p\u003e \u003cp\u003e3.4.3 EDM Summaries.\u003c\/p\u003e \u003cp\u003e3.4.4 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e3.5 Set Comparison Using EDM Summaries.\u003c\/p\u003e \u003cp\u003e3.5.1 Motivation.\u003c\/p\u003e \u003cp\u003e3.5.2 Comparison Strategy.\u003c\/p\u003e \u003cp\u003e3.5.3 Statistical Tests for Change.\u003c\/p\u003e \u003cp\u003e3.5.4 Application - Two Case Studies.\u003c\/p\u003e \u003cp\u003e3.5.5 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e3.6 Discovering Complex Structure in Data with EDM Summaries.\u003c\/p\u003e \u003cp\u003e3.6.1 Exploratory Model Fitting in Interactive Response Time.\u003c\/p\u003e \u003cp\u003e3.6.2 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e3.7 Piecewise Linear Regression.\u003c\/p\u003e \u003cp\u003e3.7.1 An Application.\u003c\/p\u003e \u003cp\u003e3.7.2 Regression Coefficients.\u003c\/p\u003e \u003cp\u003e3.7.3 Improvement in Fit.\u003c\/p\u003e 
\u003cp\u003e3.7.4 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e3.8 One-Pass Classification.\u003c\/p\u003e \u003cp\u003e3.8.1 Quantile-Based Prediction with Piecewise Models.\u003c\/p\u003e \u003cp\u003e3.8.2 Simulation Study.\u003c\/p\u003e \u003cp\u003e3.8.3 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e3.9 Conclusion.\u003c\/p\u003e \u003cp\u003e4 Data Quality.\u003c\/p\u003e \u003cp\u003e4.1 Introduction.\u003c\/p\u003e \u003cp\u003e4.2 The Meaning of Data Quality.\u003c\/p\u003e \u003cp\u003e4.2.1 An Example.\u003c\/p\u003e \u003cp\u003e4.2.2 Data Glitches.\u003c\/p\u003e \u003cp\u003e4.2.3 Gaps in Time Series Records.\u003c\/p\u003e \u003cp\u003e4.2.4 Conventional Definition.\u003c\/p\u003e \u003cp\u003e4.2.5 Times Have Changed.\u003c\/p\u003e \u003cp\u003e4.2.6 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e4.3 Updating DQ Metrics: Data Quality Continuum.\u003c\/p\u003e \u003cp\u003e4.3.1 Data Gathering.\u003c\/p\u003e \u003cp\u003e4.3.2 Data Delivery.\u003c\/p\u003e \u003cp\u003e4.3.3 Data Monitoring.\u003c\/p\u003e \u003cp\u003e4.3.4 Data Storage.\u003c\/p\u003e \u003cp\u003e4.3.5 Data Integration.\u003c\/p\u003e \u003cp\u003e4.3.6 Data Retrieval.\u003c\/p\u003e \u003cp\u003e4.3.7 Data Mining\/Analysis.\u003c\/p\u003e \u003cp\u003e4.3.8 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e4.4 The Meaning of Data Quality Revisited.\u003c\/p\u003e \u003cp\u003e4.4.1 Data Interpretation.\u003c\/p\u003e \u003cp\u003e4.4.2 Data Suitability.\u003c\/p\u003e \u003cp\u003e4.4.3 Dataset Type.\u003c\/p\u003e \u003cp\u003e4.4.4 Attribute Type.\u003c\/p\u003e \u003cp\u003e4.4.5 Application Type.\u003c\/p\u003e \u003cp\u003e4.4.6 Data Quality - A Many Splendored Thing.\u003c\/p\u003e \u003cp\u003e4.4.7 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e4.5 Measuring Data Quality.\u003c\/p\u003e \u003cp\u003e4.5.1 DQ Components and Their Measurement.\u003c\/p\u003e \u003cp\u003e4.5.2 Combining DQ Metrics.\u003c\/p\u003e \u003cp\u003e4.6 The DQ 
Process.\u003c\/p\u003e \u003cp\u003e4.7 Conclusion.\u003c\/p\u003e \u003cp\u003e4.7.1 Four Complementary Approaches.\u003c\/p\u003e \u003cp\u003e4.7.2 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e5 Data Quality: Techniques and Algorithms.\u003c\/p\u003e \u003cp\u003e5.1 Introduction.\u003c\/p\u003e \u003cp\u003e5.2 DQ Tools Based on Statistical Techniques.\u003c\/p\u003e \u003cp\u003e5.2.1 Missing Values.\u003c\/p\u003e \u003cp\u003e5.2.2 Incomplete Data.\u003c\/p\u003e \u003cp\u003e5.2.3 Outliers.\u003c\/p\u003e \u003cp\u003e5.2.4 Time Series Outliers: A Case Study.\u003c\/p\u003e \u003cp\u003e5.2.5 Goodness-of-Fit.\u003c\/p\u003e \u003cp\u003e5.2.6 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e5.3 Database Techniques for DQ.\u003c\/p\u003e \u003cp\u003e5.3.1 What is a Relational Database?\u003c\/p\u003e \u003cp\u003e5.3.2 Why Are Data Dirty?\u003c\/p\u003e \u003cp\u003e5.3.3 Extraction, Transformation, and Loading (ETL).\u003c\/p\u003e \u003cp\u003e5.3.4 Approximate Matching.\u003c\/p\u003e \u003cp\u003e5.3.5 Database Profiling.\u003c\/p\u003e \u003cp\u003e5.3.6 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e5.4 Metadata and Domain Expertise.\u003c\/p\u003e \u003cp\u003e5.4.1 Lineage Tracing.\u003c\/p\u003e \u003cp\u003e5.4.2 Annotated Bibliography.\u003c\/p\u003e \u003cp\u003e5.5 Measuring Data Quality?\u003c\/p\u003e \u003cp\u003e5.5.1 Inventory Building - A Case Study.\u003c\/p\u003e \u003cp\u003e5.5.2 Learning and Recommendations.\u003c\/p\u003e \u003cp\u003e5.6 Data Quality and Its Challenges.\u003c\/p\u003e","brand":"Wiley","offers":[{"title":"Default Title","offer_id":53515424563543,"sku":"9780471268512","price":116.96,"currency_code":"GBP","in_stock":true}],"url":"https:\/\/bookcurl.com\/products\/data-mining-442-wiley-series-in-probability-and-statistics-9780471268512","provider":"Book Curl","version":"1.0","type":"link"}