Data mining Books
O'Reilly Media Learning SQL
Book Synopsis: As data floods into your company, you need to put it to work right away, and SQL is the best tool for the job. With the latest edition of this introductory guide, author Alan Beaulieu helps developers get up to speed with SQL fundamentals for writing database applications, performing administrative tasks, and generating reports.
£39.74
Springer International Publishing AG Neural Networks and Deep Learning
Book Synopsis: Chapters 6 and 7 present radial-basis function (RBF) networks and restricted Boltzmann machines. Advanced topics in neural networks: Chapters 8, 9, and 10 discuss recurrent neural networks, convolutional neural networks, and graph neural networks.
£42.74
Cambridge University Press Computer Age Statistical Inference Student
Book Synopsis: The twenty-first century has seen a breathtaking expansion of statistical methodology, both in scope and influence. "Data science" and "machine learning" have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? How does it all fit together? Now in paperback and fortified with exercises, this book delivers a concentrated course in modern statistical thinking. Beginning with classical inferential theories - Bayesian, frequentist, Fisherian - individual chapters take up a series of influential topics: survival analysis, logistic regression, empirical Bayes, the jackknife and bootstrap, random forests, neural networks, Markov Chain Monte Carlo, inference after model selection, and dozens more. The distinctly modern approach integrates methodology and algorithms with statistical inference. Each chapter ends with class-tested exercises, and the book concludes with speculation on the future direction of statistics and data science.
Table of Contents: Part I. Classic Statistical Inference: 1. Algorithms and inference; 2. Frequentist inference; 3. Bayesian inference; 4. Fisherian inference and maximum likelihood estimation; 5. Parametric models and exponential families; Part II. Early Computer-Age Methods: 6. Empirical Bayes; 7. James–Stein estimation and ridge regression; 8. Generalized linear models and regression trees; 9. Survival analysis and the EM algorithm; 10. The jackknife and the bootstrap; 11. Bootstrap confidence intervals; 12. Cross-validation and Cp estimates of prediction error; 13. Objective Bayes inference and Markov chain Monte Carlo; 14. Statistical inference and methodology in the postwar era; Part III. Twenty-First-Century Topics: 15. Large-scale hypothesis testing and false-discovery rates; 16. Sparse modeling and the lasso; 17. Random forests and boosting; 18. Neural networks and deep learning; 19. Support-vector machines and kernel methods; 20. Inference after model selection; 21. Empirical Bayes estimation strategies; Epilogue; References; Author Index; Subject Index.
£30.99
Cengage Learning, Inc Data Visualization
Book Synopsis: DATA VISUALIZATION: Exploring and Explaining with Data is designed to introduce best practices in data visualization to undergraduate and graduate students. This is one of the first books on data visualization designed for college courses. The book contains material on effective design, choice of chart type, effective use of color, how to explore data visually, and how to explain concepts and results visually in a compelling way with data. The book explains both the "why" of data visualization and the "how." That is, the book provides lucid explanations of the guiding principles of data visualization through the use of interesting examples.
Table of Contents: 1. Introduction. 2. Selecting a Chart Type. 3. Data Visualization and Design. 4. Purposeful Use of Color. 5. Visualizing Variability. 6. Exploring Data Visually. 7. Explaining Visually to Influence with Data. 8. Data Dashboards. 9. Telling the Truth with Data Visualization.
£58.89
Springer-Verlag New York Inc. Recommender Systems Handbook
Table of Contents: Preface.- Introduction.- Part 1: General Recommendation Techniques.- Trust Your Neighbors: A Comprehensive Survey of Neighborhood-based Methods for Recommender Systems (Desrosiers).- Advances in Collaborative Filtering (Koren).- Item Recommendation from Implicit Feedback (Rendle).- Deep Learning for Recommender Systems (Zhang).- Context-Aware Recommender Systems: From Foundations to Recent Developments (Bauman).- Semantics and Content-based Recommendations (Musto).- Part 2: Special Recommendation Techniques.- Session-based Recommender Systems (Jannach).- Adversarial Recommender Systems: Attack, Defense, and Advances (Di Noia).- Group Recommender Systems: Beyond Preference Aggregation (Masthoff).- People-to-People Reciprocal Recommenders (Koprinska).- Natural Language Processing for Recommender Systems (Sar-Shalom).- Design and Evaluation of Cross-domain Recommender Systems (Cremonesi).- Part 3: Value and Impact of Recommender Systems.- Value and Impact of Recommender Systems (Zanker).- Evaluating Recommender Systems (Shani).- Novelty and Diversity in Recommender Systems (Castells).- Multistakeholder Recommender Systems (Burke).- Fairness in Recommender Systems (Ekstrand).- Part 4: Human Computer Interaction.- Beyond Explaining Single Item Recommendations (Tintarev).- Personality and Recommender Systems (Tkalčič).- Individual and Group Decision Making and Recommender Systems (Jameson).- Part 5: Recommender Systems Applications.- Social Recommender Systems (Guy).- Food Recommender Systems (Trattner).- Music Recommendation Systems: Techniques, Use Cases, and Challenges (Schedl).- Multimedia Recommender Systems: Algorithms and Challenges (Deldjoo).- Fashion Recommender Systems (Dokoohaki).
£999.99
John Wiley & Sons Inc Statistical Data Analytics
Book Synopsis: A comprehensive introduction to statistical methods for data mining and knowledge discovery.
Table of Contents:
Preface
Part I Background: Introductory Statistical Analytics
1 Data analytics and data mining; 1.1 Knowledge discovery: finding structure in data; 1.2 Data quality versus data quantity; 1.3 Statistical modeling versus statistical description
2 Basic probability and statistical distributions; 2.1 Concepts in probability; 2.1.1 Probability rules; 2.1.2 Random variables and probability functions; 2.1.3 Means, variances, and expected values; 2.1.4 Median, quartiles, and quantiles; 2.1.5 Bivariate expected values, covariance, and correlation; 2.2 Multiple random variables∗; 2.3 Univariate families of distributions; 2.3.1 Binomial distribution; 2.3.2 Poisson distribution; 2.3.3 Geometric distribution; 2.3.4 Negative binomial distribution; 2.3.5 Discrete uniform distribution; 2.3.6 Continuous uniform distribution; 2.3.7 Exponential distribution; 2.3.8 Gamma and chi-square distributions; 2.3.9 Normal (Gaussian) distribution; 2.3.10 Distributions derived from normal; 2.3.11 The exponential family
3 Data manipulation; 3.1 Random sampling; 3.2 Data types; 3.3 Data summarization; 3.3.1 Means, medians, and central tendency; 3.3.2 Summarizing variation; 3.3.3 Summarizing (bivariate) correlation; 3.4 Data diagnostics and data transformation; 3.4.1 Outlier analysis; 3.4.2 Entropy∗; 3.4.3 Data transformation; 3.5 Simple smoothing techniques; 3.5.1 Binning; 3.5.2 Moving averages∗; 3.5.3 Exponential smoothing∗
4 Data visualization and statistical graphics; 4.1 Univariate visualization; 4.1.1 Strip charts and dot plots; 4.1.2 Boxplots; 4.1.3 Stem-and-leaf plots; 4.1.4 Histograms and density estimators; 4.1.5 Quantile plots; 4.2 Bivariate and multivariate visualization; 4.2.1 Pie charts and bar charts; 4.2.2 Multiple boxplots and QQ plots; 4.2.3 Scatterplots and bubble plots; 4.2.4 Heatmaps; 4.2.5 Time series plots∗
5 Statistical inference; 5.1 Parameters and likelihood; 5.2 Point estimation; 5.2.1 Bias; 5.2.2 The method of moments; 5.2.3 Least squares/weighted least squares; 5.2.4 Maximum likelihood∗; 5.3 Interval estimation; 5.3.1 Confidence intervals; 5.3.2 Single-sample intervals for normal (Gaussian) parameters; 5.3.3 Two-sample intervals for normal (Gaussian) parameters; 5.3.4 Wald intervals and likelihood intervals∗; 5.3.5 Delta method intervals∗; 5.3.6 Bootstrap intervals∗; 5.4 Testing hypotheses; 5.4.1 Single-sample tests for normal (Gaussian) parameters; 5.4.2 Two-sample tests for normal (Gaussian) parameters; 5.4.3 Wald tests, likelihood ratio tests, and ‘exact’ tests∗; 5.5 Multiple inferences∗; 5.5.1 Bonferroni multiplicity adjustment; 5.5.2 False discovery rate
Part II Statistical Learning and Data Analytics
6 Techniques for supervised learning: simple linear regression; 6.1 What is “supervised learning?”; 6.2 Simple linear regression; 6.2.1 The simple linear model; 6.2.2 Multiple inferences and simultaneous confidence bands; 6.3 Regression diagnostics; 6.4 Weighted least squares (WLS) regression; 6.5 Correlation analysis; 6.5.1 The correlation coefficient; 6.5.2 Rank correlation
7 Techniques for supervised learning: multiple linear regression; 7.1 Multiple linear regression; 7.1.1 Matrix formulation; 7.1.2 Weighted least squares for the MLR model; 7.1.3 Inferences under the MLR model; 7.1.4 Multicollinearity; 7.2 Polynomial regression; 7.3 Feature selection; 7.3.1 R2p plots; 7.3.2 Information criteria: AIC and BIC; 7.3.3 Automated variable selection; 7.4 Alternative regression methods∗; 7.4.1 Loess; 7.4.2 Regularization: ridge regression; 7.4.3 Regularization and variable selection: the Lasso; 7.5 Qualitative predictors: ANOVA models
8 Supervised learning: generalized linear models; 8.1 Extending the linear regression model; 8.1.1 Nonnormal data and the exponential family; 8.1.2 Link functions; 8.2 Technical details for GLiMs∗; 8.2.1 Estimation; 8.2.2 The deviance function; 8.2.3 Residuals; 8.2.4 Inference and model assessment; 8.3 Selected forms of GLiMs; 8.3.1 Logistic regression and binary-data GLiMs; 8.3.2 Trend testing with proportion data; 8.3.3 Contingency tables and log-linear models; 8.3.4 Gamma regression models
9 Supervised learning: classification; 9.1 Binary classification via logistic regression; 9.1.1 Logistic discriminants; 9.1.2 Discriminant rule accuracy; 9.1.3 ROC curves; 9.2 Linear discriminant analysis (LDA); 9.2.1 Linear discriminant functions; 9.2.2 Bayes discriminant/classification rules; 9.2.3 Bayesian classification with normal data; 9.2.4 Naïve Bayes classifiers; 9.3 k-Nearest neighbor classifiers; 9.4 Tree-based methods; 9.4.1 Classification trees; 9.4.2 Pruning; 9.4.3 Boosting; 9.4.4 Regression trees; 9.5 Support vector machines∗; 9.5.1 Separable data; 9.5.2 Nonseparable data; 9.5.3 Kernel transformations
10 Techniques for unsupervised learning: dimension reduction; 10.1 Unsupervised versus supervised learning; 10.2 Principal component analysis; 10.2.1 Principal components; 10.2.2 Implementing a PCA; 10.3 Exploratory factor analysis; 10.3.1 The factor analytic model; 10.3.2 Principal factor estimation; 10.3.3 Maximum likelihood estimation; 10.3.4 Selecting the number of factors; 10.3.5 Factor rotation; 10.3.6 Implementing an EFA; 10.4 Canonical correlation analysis∗
11 Techniques for unsupervised learning: clustering and association; 11.1 Cluster analysis; 11.1.1 Hierarchical clustering; 11.1.2 Partitioned clustering; 11.2 Association rules/market basket analysis; 11.2.1 Association rules for binary observations; 11.2.2 Measures of rule quality; 11.2.3 The Apriori algorithm; 11.2.4 Statistical measures of association quality
A Matrix manipulation; A.1 Vectors and matrices; A.2 Matrix algebra; A.3 Matrix inversion; A.4 Quadratic forms; A.5 Eigenvalues and eigenvectors; A.6 Matrix factorizations; A.6.1 QR decomposition; A.6.2 Spectral decomposition; A.6.3 Matrix square root; A.6.4 Singular value decomposition; A.7 Statistics via matrix operations
B Brief introduction to R; B.1 Data entry and manipulation; B.2 A turbo-charged calculator; B.3 R functions; B.3.1 Inbuilt R functions; B.3.2 Flow control; B.3.3 User-defined functions; B.4 R packages
References
Index
£73.76
John Wiley & Sons Inc Data Mining and Learning Analytics
Book Synopsis: Addresses the impacts of data mining on education and reviews applications in educational research, teaching, and learning. This book discusses the insights, challenges, issues, expectations, and practical implementation of data mining (DM) within educational mandates. The initial series of chapters offers a general overview of DM, Learning Analytics (LA), and data collection models in the context of educational research, while also defining and discussing data mining's four guiding principles: prediction, clustering, rule association, and outlier detection. The next series of chapters showcases the pedagogical applications of Educational Data Mining (EDM) and features case studies drawn from Business, Humanities, Health Sciences, Linguistics, and Physical Sciences education that serve to highlight the successes and some of the limitations of data mining research applications in educational settings. The remaining chapters focus exclusively on EDM's emerging role in helping to a...
Table of Contents:
Notes on Contributors
Introduction: Education At Computational Crossroads (Samira ElAtia, Donald Ipperciel, and Osmar R. Zaïane)
Part I At The Intersection of Two Fields: EDM
Chapter 1 Educational Process Mining: A Tutorial and Case Study Using Moodle Data Sets (Cristóbal Romero, Rebeca Cerezo, Alejandro Bogarín, and Miguel Sanchez‐Santillán); 1.1 Background; 1.2 Data Description and Preparation; 1.2.1 Preprocessing Log Data; 1.2.2 Clustering Approach for Grouping Log Data; 1.3 Working with ProM; 1.3.1 Discovered Models; 1.3.2 Analysis of the Models’ Performance; 1.4 Conclusion; Acknowledgments; References
Chapter 2 On Big Data And Text Mining in the Humanities (Geoffrey Rockwell and Bettina Berendt); 2.1 Busa and the Digital Text; 2.2 Thesaurus Linguae Graecae and the Ibycus Computer as Infrastructure; 2.2.1 Complete Data Sets; 2.3 Cooking with Statistics; 2.4 Conclusions; References
Chapter 3 Finding Predictors in Higher Education (David Eubanks, William Evers Jr., and Nancy Smith); 3.1 Contrasting Traditional and Computational Methods; 3.2 Predictors and Data Exploration; 3.3 Data Mining Application: An Example; 3.4 Conclusions; References
Chapter 4 Educational Data Mining: A MOOC Experience (Ryan S. Baker, Yuan Wang, Luc Paquette, Vincent Aleven, Octav Popescu, Jonathan Sewall, Carolyn Rosé, Gaurav Singh Tomar, Oliver Ferschke, Jing Zhang, Michael J. Cennamo, Stephanie Ogden, Therese Condit, José Diaz, Scott Crossley, Danielle S. McNamara, Denise K. Comer, Collin F. Lynch, Rebecca Brown, Tiffany Barnes, and Yoav Bergner); 4.1 Big Data in Education: The Course; 4.1.1 Iteration 1: Coursera; 4.1.2 Iteration 2: edX; 4.2 Cognitive Tutor Authoring Tools; 4.3 Bazaar; 4.4 Walkthrough; 4.4.1 Course Content; 4.4.2 Research on BDEMOOC; 4.5 Conclusion; Acknowledgments; References
Chapter 5 Data Mining and Action Research (Ellina Chernobilsky, Edith Ries, and Joanne Jasmine); 5.1 Process; 5.2 Design Methodology; 5.3 Analysis and Interpretation of Data; 5.3.1 Quantitative Data Analysis and Interpretation; 5.3.2 Qualitative Data Analysis and Interpretation; 5.4 Challenges; 5.5 Ethics; 5.6 Role of Administration in the Data Collection Process; 5.7 Conclusion; References
Part II Pedagogical Applications of EDM
Chapter 6 Design of an Adaptive Learning System and Educational Data Mining (Zhiyong Liu and Nick Cercone); 6.1 Dimensionalities of the User Model in ALS; 6.2 Collecting Data for ALS; 6.3 Data Mining in ALS; 6.3.1 Data Mining for User Modeling; 6.3.2 Data Mining for Knowledge Discovery; 6.4 ALS Model and Function Analyzing; 6.4.1 Introduction of Module Functions; 6.4.2 Analyzing the Workflow; 6.5 Future Works; 6.6 Conclusions; Acknowledgment; References
Chapter 7 The “Geometry” of Naive Bayes: Teaching Probabilities by “Drawing” Them (Giorgio Maria Di Nunzio); 7.1 Introduction; 7.1.1 Main Contribution; 7.1.2 Related Works; 7.2 The Geometry of NB Classification; 7.2.1 Mathematical Notation; 7.2.2 Bayesian Decision Theory; 7.3 Two-Dimensional Probabilities; 7.3.1 Working with Likelihoods and Priors Only; 7.3.2 De‐normalizing Probabilities; 7.3.3 NB Approach; 7.3.4 Bernoulli Naïve Bayes; 7.4 A New Decision Line: Far from the Origin; 7.4.1 De‐normalization Makes (Some) Problems Linearly Separable; 7.5 Likelihood Spaces, When Logarithms make a Difference (or a SUM); 7.5.1 De‐normalization Makes (Some) Problems Linearly Separable; 7.5.2 A New Decision in Likelihood Spaces; 7.5.3 A Real Case Scenario: Text Categorization; 7.6 Final Remarks; References
Chapter 8 Examining the Learning Networks of a MOOC (Meaghan Brugha and Jean‐Paul Restoule); 8.1 Review of Literature; 8.2 Course Context; 8.3 Results and Discussion; 8.4 Recommendations for Future Research; 8.5 Conclusions; References
Chapter 9 Exploring the Usefulness of Adaptive ELearning Laboratory Environments in Teaching Medical Science (Thuan Thai and Patsie Polly); 9.1 Introduction; 9.2 Software for Learning and Teaching; 9.2.1 Reflective Practice: ePortfolio; 9.2.2 Online Quizzes; 9.2.3 Online Practical Lessons; 9.2.4 Virtual Laboratories; 9.2.5 The Gene Suite; 9.3 Potential Limitations; 9.4 Conclusion; Acknowledgments; References
Chapter 10 Investigating Co‐Occurrence Patterns of Learners’ Grammatical Errors across Proficiency Levels and Essay Topics Based on Association Analysis (Yutaka Ishii); 10.1 Introduction; 10.1.1 The Relationship between Data Mining and Educational Research; 10.1.2 English Writing Instruction in the Japanese Context; 10.2 Literature Review; 10.3 Method; 10.3.1 Konan‐JIEM Learner Corpus; 10.3.2 Association Analysis; 10.4 Experiment 1; 10.5 Experiment 2; 10.6 Discussion and Conclusion; Appendix A: Example of Learner’s Essay (University Life); Appendix B: Support Values of all Topics; Appendix C: Support Values of Advanced, Intermediate, and Beginner Levels of Learners; References
Part III EDM and Educational Research
Chapter 11 Mining Learning Sequences in MOOCs: Does Course Design Constrain Students’ Behaviors Or Do Students Shape Their Own Learning? (Lorenzo Vigentini, Simon McIntyre, Negin Mirriahi, and Dennis Alonzo); 11.1 Introduction; 11.1.1 Perceptions and Challenges of MOOC Design; 11.1.2 What Do We Know About Participants’ Navigation: Choice and Control; 11.2 Data Mining in MOOCs: Related Work; 11.2.1 Setting the Hypotheses; 11.3 The Design and Intent of the LTTO MOOC; 11.3.1 Course Grading and Certification; 11.3.2 Delivering the Course; 11.3.3 Operationalize Engagement, Personal Success, and Course Success in LTTO; 11.4 Data Analysis; 11.4.1 Approaches to Process the Data Sources; 11.4.2 LTTO in Numbers; 11.4.3 Characterizing Patterns of Completion and Achievement; 11.4.4 Redefining Participation and Engagement; 11.5 Mining Behaviors and Intents; 11.5.1 Participants’ Intent and Behaviors: A Classification Model; 11.5.2 Natural Clustering Based on Behaviors; 11.5.3 Stated Intents and Behaviors: Are They Related?; 11.6 Closing the Loop: Informing Pedagogy and Course Enhancement; 11.6.1 Conclusions, Lessons Learnt, and Future Directions; References
Chapter 12 Understanding Communication Patterns in MOOCs: Combining Data Mining and Qualitative Methods (Rebecca Eynon, Isis Hjorth, Taha Yasseri, and Nabeel Gillani); 12.1 Introduction; 12.2 Methodological Approaches to Understanding Communication Patterns in MOOCs; 12.3 Description; 12.3.1 Structural Connections; 12.4 Examining Dialogue; 12.5 Interpretative Models; 12.6 Understanding Experience; 12.7 Experimentation; 12.8 Future Research; References
Chapter 13 An Example of Data Mining: Exploring The Relationship Between Applicant Attributes and Academic Measures of Success in a Pharmacy Program (Dion Brocks and Ken Cor); 13.1 Introduction; 13.2 Methods; 13.3 Results; 13.4 Discussion; 13.4.1 Prerequisite Predictors; 13.4.2 Demographic Predictors; 13.5 Conclusion; Appendix A; References
Chapter 14 A New Way of Seeing: Using a Data Mining Approach to Understand Children’s Views of Diversity and “Difference” in Picture Books (Robin A. Moeller and Hsin‐liang Chen); 14.1 Introduction; 14.2 Study 1: Using Data Mining to Better Understand Perceptions of Race; 14.2.1 Background; 14.2.2 Research Questions; 14.2.3 Methods; 14.2.4 Findings; 14.2.5 Discussion; 14.3 Study 2: Translating Data Mining Results to Picture Book Concepts of “Difference”; 14.3.1 Background; 14.3.2 Research Questions; 14.3.3 Methodology; 14.3.4 Findings; 14.3.5 Discussion and Implications; 14.4 Conclusions; References
Chapter 15 Data Mining with Natural Language Processing and Corpus Linguistics: Unlocking Access to School Children’s Language in Diverse Contexts to Improve Instructional and Assessment Practices (Alison L. Bailey, Anne Blackstock‐Bernstein, Eve Ryan, and Despina Pitsoulakis); 15.1 Introduction; 15.2 Identifying the Problem; 15.3 Use of Corpora and Technology in Language Instruction and Assessment; 15.3.1 Language Corpora in ESL and EFL Teaching and Learning; 15.3.2 Previous Extensions of Corpus Linguistics to School‐Age Language; 15.3.3 Corpus Linguistics in Language Assessment; 15.3.4 Big Data Purposes, Techniques, and Technology; 15.4 Creating a School‐Age Learner Corpus and Digital Data Analytics System; 15.4.1 Language Measures Included in DRGON; 15.4.2 The DLLP as a Promising Practice; 15.5 Next Steps, “Modest Data,” and Closing Remarks; Acknowledgments; Appendix A: Examples of Oral and Written Explanation Elicitation Prompts; References
Index
£98.06
APress IoT Solutions in Microsoft's Azure IoT Suite
Book Synopsis: Collect and analyze sensor and usage data from Internet of Things applications with Microsoft Azure IoT Suite. Internet connectivity to everyday devices such as light bulbs, thermostats, and even voice-command devices such as Google Home and Amazon.com's Alexa is exploding. These connected devices and their respective applications generate large amounts of data that can be mined to enhance user-friendliness and make predictions about what a user might be likely to do next. Microsoft's Azure IoT Suite is a cloud-based platform that is ideal for collecting data from connected devices. You'll learn in this book about data acquisition and analysis, including real-time analysis. Real-world examples are provided to teach you to detect anomalous patterns in your data that might lead to business advantage. We live in a time when the amount of data being generated and stored is growing at an exponential rate. Understanding and getting real-time insight into these dat...
Table of Contents:
Introduction
Part I: Getting Started; 1. The World of Big Data and IoT; 2. Generating Data with Devices
Part II: Data on the Move; 3. Azure IoT Hub; 4. Ingesting Data with Azure IoT Hub; 5. Azure Stream Analytics; 6. Real-Time Data Streaming; 7. Azure Data Factory; 8. Integrating Data Between Data Stores Using Azure Data Factory
Part III: Data at Rest; 9. Azure Data Lake Store; 10. Azure Data Lake Analytics; 11. U-SQL; 12. Azure HDInsight; 13. Real-time Insights and Reporting on Big Data; 14. Azure Machine Learning
Part IV: More on Cortana Intelligence; 15. Azure Data Catalog; 16. Azure Event Hubs
£999.99
APress Data versus Democracy
Book Synopsis: Human attention is in the highest demand it has ever been. The drastic increase in available information has compelled individuals to find a way to sift through the media that is literally at their fingertips. Content recommendation systems have emerged as the technological solution to this social and informational problem, but they've also created a bigger crisis in confirming our biases by showing us only, and exactly, what they predict we want to see. Data versus Democracy investigates and explores how, in the era of social media, human cognition, algorithmic recommendation systems, and human psychology are all working together to reinforce (and exaggerate) human bias. The dangerous confluence of these factors is driving media narratives, influencing opinions, and possibly changing election results. In this book, algorithmic recommendations, clickbait, familiarity bias, propaganda, ...
Trade Review: “A very well written book that has an engaging style of writing, doesn’t become dry or bogged down in the details, but still showcases the depth of knowledge that Shaffer has on the subject. … It’s accessible and it provides a satisfying read to those looking for deep analysis of this emerging problem faced by the world.” (The Robotics Law Journal, Vol. 5 (2), September - October, 2019)
Table of Contents: Part I: The Propaganda Problem.- Chapter 1: Pay Attention: How Information Abundance Affects the Way We Consume Media.- Chapter 2: Cog in the System: How the Limits of Our Brains Leave Us Vulnerable to Cognitive Hacking.- Chapter 3: Swimming Upstream: How Content Recommendation Engines Impact Information and Manipulate Our Attention.- Part II: Case Studies.- Chapter 4: Domestic Disturbance: Ferguson, GamerGate, and the Rise of the American Alt-Right.- Chapter 5: Democracy Hacked, Part 1: Russian Interference and the New Cold War.- Chapter 6: Democracy Hacked, Part 2: Rumors, Bots, and Genocide in the Global South.- Chapter 7: Conclusion: Where Do We Go from Here?
£22.49
APress Finding Ghosts in Your Data
a huge range and FREE tracked UK delivery on ALL orders.
£49.49
O'Reilly Media Data Science from Scratch
Book Synopsis: With this updated second edition, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch.
£39.74
De Gruyter Practical AI for Business Leaders Product
Book Synopsis: Most economists agree that AI is a general purpose technology (GPT) like the steam engine, electricity, and the computer. AI will drive innovation in all sectors of the economy for the foreseeable future. Practical AI for Business Leaders, Product Managers, and Entrepreneurs is a technical guidebook for the business leader or anyone responsible for leading AI-related initiatives in their organization. The book can also be used as a foundation to explore the ethical implications of AI. Authors Alfred Essa and Shirin Mojarad provide a gentle introduction to foundational topics in AI. Each topic is framed as a triad: concept, theory, and practice. The concept chapters develop the intuition, culminating in a practical case study. The theory chapters reveal the underlying technical machinery. The practice chapters provide code in Python to implement the models discussed in the case study. With this book, readers will learn: The technical foundations of machine learning and deep lea...
Table of Contents:
Introduction: What is AI and why it is at the center of major business transformation? How is it related to machine learning? What is deep learning, and how is it related to ML? Why is it important? How the book is organized. Who is the audience?
Section 1: Machine Learning
Chapter 1.1, introduction, machine learning, different types of machine learning
Chapter 1.2, Machine Learning Technical Overview
Chapter 1.3, Hands-On Machine Learning with Scikit Learn
Chapter 1.4, Advanced Topics/flavors of Machine learning
Appendix: mathematical interlude
Section 2: Deep Learning
Chapter 2.1, introduction (what is it, why is it important)
Chapter 2.2, Deep Learning Technical Overview
Chapter 2.3, Hands-On Deep Learning with Keras
Chapter 2.4, Advanced Topics/flavors of deep learning
Appendix: mathematical interlude
Section 3: Putting AI into Practice: Innovation Framework
Chapter 3.1: Diffusion and Dynamics of Innovation
Chapter 3.2: Managing an Innovation Portfolio
£40.95
Packt Publishing Limited TensorFlow Powerful Predictive Analytics with
Book Synopsis: Predictive analytics discovers hidden patterns from structured and unstructured data for automated decision making in business intelligence. Predictive decisions are becoming a huge trend worldwide, catering to wide industry sectors by predicting which decisions are more likely to give maximum results. TensorFlow, Google's brainchild, is ...
£999.99
Springer Nature Switzerland AG Core Data Analysis: Summarization, Correlation,
Book Synopsis: This text examines the goals of data analysis with respect to enhancing knowledge, and identifies data summarization and correlation analysis as the core issues. Data summarization, both quantitative and categorical, is treated within the encoder-decoder paradigm, bringing forward a number of mathematically supported insights into the methods and the relations between them. Two chapters describe methods for categorical summarization (partitioning, divisive clustering, and separate cluster finding), and another explains the methods for quantitative summarization, Principal Component Analysis and PageRank.
Features:
· An in-depth presentation of K-means partitioning, including a corresponding Pythagorean decomposition of the data scatter.
· Advice regarding such issues as clustering of categorical and mixed scale data, similarity and network data, interpretation aids, anomalous clusters, the number of clusters, etc.
· Thorough attention to data-driven modelling, including a number of mathematically stated relations between statistical and geometrical concepts, including those between goodness-of-fit criteria for decision trees and data standardization, similarity and consensus clustering, modularity clustering and uniform partitioning.
New edition highlights:
· Inclusion of ranking issues such as Google PageRank, linear stratification and tied rankings median, consensus clustering, semi-average clustering, one-cluster clustering
· Restructured to make the logic more straightforward and sections self-contained
Core Data Analysis: Summarization, Correlation and Visualization is aimed at those who are eager to participate in developing the field as well as appealing to novices and practitioners.
Trade Review: “This book provides a clear overview of the data analysis process, the different types of statistical techniques employed for data analysis, and their role and purpose. … There is good use of a variety of examples to demonstrate how the different techniques are applied in practice. The book’s main purpose would be as a textbook for undergraduate students, or a reference book for data analysts.” (Mark Taylor, Computing Reviews, May 5, 2022)
£54.99
Springer Nature Switzerland AG Mining Over Air: Wireless Communication Networks Analytics
£80.99
Springer Nature Switzerland AG Tracing the Life Cycle of Ideas in the Humanities and Social Sciences
£80.99
Springer Nature Switzerland AG Artificial Adaptive Systems Using Auto Contractive Maps: Theory, Applications and Extensions
Book SynopsisThis book offers an introduction to artificial adaptive systems and a general model of the relationships between the data and algorithms used to analyze them. It subsequently describes artificial neural networks as a subclass of artificial adaptive systems, and reports on the backpropagation algorithm, while also identifying an important connection between supervised and unsupervised artificial neural networks. The book’s primary focus is on the auto contractive map, an unsupervised artificial neural network employing a fixed point method versus traditional energy minimization. This is a powerful tool for understanding, associating and transforming data, as demonstrated in the numerous examples presented here. A supervised version of the auto contractive map is also introduced as an outstanding method for recognizing digits and defects. In closing, the book walks readers through the theory and examples of how the auto contractive map can be used in conjunction with another artificial neural network, the “spin-net,” as a dynamic form of auto-associative memory.Table of ContentsAn Introduction.- Artificial Neural Networks.- Auto-Contractive Maps.- Visualization of Auto-CM Output.- Dataset Transformations and Auto-CM.- Comparison of Auto-CM to Various Other Data Understanding Approaches.
£80.99
Springer Nature Switzerland AG Domain-Specific Knowledge Graph Construction
Book SynopsisThe vast amounts of ontologically unstructured information on the Web, including HTML, XML and JSON documents, natural language documents, tweets, blogs, markups, and even structured documents like CSV tables, all contain useful knowledge that can present a tremendous advantage to the Artificial Intelligence community if extracted robustly, efficiently and semi-automatically as knowledge graphs. Domain-specific Knowledge Graph Construction (KGC) is an active research area that has recently witnessed impressive advances due to machine learning techniques like deep neural networks and word embeddings. This book synthesizes Knowledge Graph Construction over Web Data in an engaging and accessible manner. The book describes a timely topic for both early- and mid-career researchers. Every year, more papers continue to be published on knowledge graph construction, especially for difficult Web domains. This book serves as a useful reference, as well as an accessible but rigorous overview of this body of work. The book presents interdisciplinary connections when possible to engage researchers looking for new ideas or synergies. The book also appeals to practitioners in industry and data scientists, since it has chapters on data collection as well as a chapter on querying and off-the-shelf implementations.Table of Contents1. What is a knowledge graph?.- 2. Information Extraction.- 3. Entity Resolution.- 4. Advanced Topic: Knowledge Graph Completion.- 5. Ecosystems
£52.24
Springer Nature Switzerland AG Advances in Knowledge Discovery and Data Mining: 23rd Pacific-Asia Conference, PAKDD 2019, Macau, China, April 14-17, 2019, Proceedings, Part II
Book SynopsisThe three-volume set LNAI 11439, 11440, and 11441 constitutes the thoroughly refereed proceedings of the 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2019, held in Macau, China, in April 2019. The 137 full papers presented were carefully reviewed and selected from 542 submissions. The papers present new ideas, original research results, and practical development experiences from all KDD related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, visualization, decision-making systems, and the emerging applications. They are organized in the following topical sections: classification and supervised learning; text and opinion mining; spatio-temporal and stream data mining; factor and tensor analysis; healthcare, bioinformatics and related topics; clustering and anomaly detection; deep learning models and applications; sequential pattern mining; weakly supervised learning; recommender system; social network and graph mining; data pre-processing and feature selection; representation learning and embedding; mining unstructured and semi-structured data; behavioral data mining; visual data mining; and knowledge graph and interpretable data mining.
£62.99
Springer Nature Switzerland AG Computational Intelligence in Music, Sound, Art and Design: 8th International Conference, EvoMUSART 2019, Held as Part of EvoStar 2019, Leipzig, Germany, April 24–26, 2019, Proceedings
Book SynopsisThis book constitutes the refereed proceedings of the 8th International Conference on Computational Intelligence in Music, Sound, Art and Design, EvoMUSART 2019, held in Leipzig, Germany, in April 2019, co-located with the Evo*2019 events EuroGP, EvoCOP and EvoApplications. The 16 revised full papers presented were carefully reviewed and selected from 24 submissions. The papers cover a wide range of topics and application areas, including: visual art and music generation, analysis, and interpretation; sound synthesis; architecture; video; poetry; design; and other creative tasks.Table of ContentsDeep Learning Concepts for Evolutionary Art.- Adversarial Evolution and Deep Learning – How Does An Artist Play with Our Visual System.- Autonomy, Authenticity, Authorship and Intention in Computer Generated Art.- Camera Obscurer: Generative Art for Design Inspiration.- Swarm-Based Identification of Animation Key Points from 2D-medialness Maps.- Paintings, Polygons and Plant Propagation.- Evolutionary Games for Audiovisual Works: Exploring the Demographic Prisoner's Dilemma.- Emojinating: Evolving Emoji Blends.- Automatically Generating Engaging Presentation Slide Decks.- Tired of choosing? Just Add Structure and Virtual Reality.- EvoChef: Show Me What to Cook! Artificial Evolution of Culinary Arts.- Comparing Models for Harmony Prediction in An Interactive Audio Looper.- Stochastic Synthesizer Patch Exploration in Edisyn.- Evolutionary Multi-Objective Training Set Selection of Data Instances and Augmentations for Vocal Detection.- Automatic Jazz Melody Composition Through a Learning-Based Genetic Algorithm.- Exploring Transfer Functions in Evolved CTRNNs for Music Generation.
£44.99
Springer Nature Switzerland AG Database Systems for Advanced Applications: DASFAA 2019 International Workshops: BDMS, BDQM, and GDMA, Chiang Mai, Thailand, April 22–25, 2019, Proceedings
Book SynopsisThis book constitutes the workshop proceedings of the 24th International Conference on Database Systems for Advanced Applications, DASFAA 2019, held in Chiang Mai, Thailand, in April 2019. The 14 full papers presented were carefully selected and reviewed from 26 submissions to the three following workshops: the 6th International Workshop on Big Data Management and Service, BDMS 2019; the 4th International Workshop on Big Data Quality Management, BDQM 2019; and the Third International Workshop on Graph Data Management and Analysis, GDMA 2019. This volume also includes the short papers, demo papers, and tutorial papers of the main conference DASFAA 2019.Table of ContentsThe 6th International Workshop on Big Data Management and Service (BDMS 2019).- A Probabilistic Approach for Inferring Latent Entity Associations in Textual Web Contents.- UHRP Uncertainty-Based Pruning Method for Anonymized Data Linear Regression.- Meta-path based MiRNA-disease Association Prediction.- Medical Question Retrieval based on Siamese Neural Network and Transfer learning method.- An adaptive Kalman filter based Ocean Wave Prediction Model using Motion Reference Unit Data.- ASLM: Adaptive Single Layer Model for Learned Index.- SparseMAAC: Sparse Attention for Multi-Agent Reinforcement Learning.- The 4th International Workshop on Big Data Quality Management (BDQM 2019).- Identifying Reference Relationship of Desktop Files Based on Access Logs.- Visualization of Photo Album: Selecting a Representative Photo of a Specific Event.- Data Quality Management in Institutional Research Output Data Center.- Generalized Bayesian Structure Learning from Noisy Datasets.- The Third International Workshop on Graph Data Management and Analysis (GDMA 2019).- ANDMC: An Algorithm for Author Name Disambiguation Based on Molecular Cross Clustering.- Graph Based Aspect Extraction and Rating Classification of Customer Review Data.- Streaming Massive Electric Power Data Analysis Based on Spark 
Streaming.- Short Papers.- Deletion Robust k-Coverage Queries.- Episodic Memory Network with Self-Attention for Emotion Detection.- Detecting Suicidal Ideation with Data Protection in Online Communities.- Hierarchical Conceptual Labeling.- Anomaly Detection in Time-Evolving Attributed Networks.- A Multi-task Learning Framework for Automatic Early Detection of Alzheimer’s.- Top-k Spatial Keyword Query with Typicality and Semantics.- Align Reviews with Topics in Attention Network for Rating Prediction.- PSMSP: A Parallelized Sampling-based Approach for Mining Top-k Sequential Patterns in Database Graphs.- Value-Oriented Ranking of Online Reviews Based on Reviewer-influenced Graph.- Ancient Chinese Landscape Painting Composition Classification by Using Semantic Variational Autoencoder.- Learning Time-Aware Distributed Representations of Locations from Spatio-Temporal Trajectories.- Hyper2vec: Biased Random Walk for Hyper-Network Embedding.- Privacy-preserving and dynamic spatial range aggregation query processing in wireless sensor networks.- Adversarial Discriminative Denoising for Distant Supervision Relation Extraction.- Nonnegative Spectral Clustering for Large-Scale Semi-Supervised Learning.- Distributed PARAFAC Decomposition Method based on In-Memory Big Data System.- GPU-Accelerated Dynamic Graph Coloring.- Relevance-based Entity Embedding.- An Iterative Map-Trajectory Co-Optimisation Framework Based on Map-Matching and Map Update.- Exploring Regularity in Traditional Chinese Medicine Clinical Data Using Heterogeneous Weighted Networks Embedding.- AGREE: Attentive Tour Group Recommendation with Multi-Modal Data.- Random Decision DAG: An Entropy Based Compression Approach for Random Forest.- Generating Behavior Features for Cold-Start Spam Review Detection.- TCL: Tensor-CNN-LSTM for Travel Time Prediction with Sparse Trajectory Data.- A Semi-supervised Classification Approach for Multiple Time-varying Networks with Total Variation.- Multidimensional Skylines 
Over Streaming Data.- A domain adaptation approach for multistream classification.- Gradient Boosting Censored Regression for Winning Price Prediction in Real-Time Bidding.- Deep Sequential Multi-task Modeling for Next Check-in Time and Location Prediction.- SemiSync: Semi-supervised Clustering by Synchronization.- Neural Review Rating Prediction with Hierarchical Attentions and Latent Factors.- MVS-match: An Efficient Subsequence Matching Approach Based on the Series Synopsis.- Temporal-Spatial Recommendation for On-demand Cinemas.- Finding the key influences on the house price by Finite Mixture Model based on the real estate data in Changchun.- Semi-supervised Clustering with Deep Metric Learning.- Spatial Bottleneck Minimum Task Assignment with Time-delay.- A Mimic Learning Method for Disease Risk Prediction with Incomplete Initial Data.- Hospitalization Behavior Prediction Based on Attention and Time Adjustment Factors in Bidirectional LSTM.- Modeling Item Category for Effective Recommendation.- Distributed Reachability Queries on Massive Graphs.- Edge-Based Shortest Path Caching in Road Networks.- Extracting Definitions and Hypernyms with a Two-Phase Framework.- Tag Recommendation by Word-Level Tag Sequence Modeling.- A New Statistics Collecting Method with Adaptive Strategy.- Word Sense Disambiguation with Massive Contextual Texts.- Learning DMEs from Positive and Negative Examples.- Serial and Parallel Recurrent Convolutional Neural Networks for Biomedical Named Entity Recognition.- DRGAN: A GAN-based Framework for Doctor Recommendation in Chinese On-line QA Communities.- Attention-based Abnormal-Aware Fusion Network for Radiology Report Generation.- LearningTour: A Machine Learning Approach for Tour Recommendation based on Users’ Historical Travel Experience.- TF-Miner: Topic-specific Facet Mining by Label Propagation.- Fast Raft Replication for Transactional Database Systems over Unreliable Networks.- Parallelizing Big De Bruijn Graph Traversal for Genome 
Assembly on GPU Clusters.- GScan: Exploiting Sequential Scans for Subgraph Matching.- SIMD Accelerates the Probe Phase of Star Joins in Main Memory Databases.- A Deep Recommendation Model Incorporating Adaptive Knowledge-based Representations.- BLOMA: Explain Collaborative Filtering via Boosted Local Rank-One Matrix Approximation.- Spatiotemporal-Aware Region Recommendation with Deep Metric Learning.- On the Impact of the Length of Subword Vectors on Word Embeddings.- Using Dilated Residual Network to Model Distant Supervision Relation Extraction.- Modeling More Globally: A Hierarchical Attention Network via Multi-Task Learning for Aspect-Based Sentiment Analysis.- A Sparse Matrix-based Join for SPARQL Query Processing.- Change Point Detection for Streaming High-Dimensional time series.- Demo Papers.- Distributed Query Engine for Multiple-Query Optimization over Data Stream.- Adding Value by Combining Business and Sensor Data: An Industry 4.0 Use Case.- AgriKG: An Agricultural Knowledge Graph and Its Applications.- KGVis: An Interactive Visual Query Language for Knowledge Graphs.- OperaMiner: Extracting Character Relations from Opera Scripts using Deep Neural Networks.- GparMiner: A System to mine Graph Pattern Association Rules.- A Data Publishing System Based on Privacy Preservation.- Privacy as a Service: Publishing Data and Models.- Dynamic Bus Route Adjustment Based on Hot Bus Stop Pair Extraction.- DHDSearch: A Framework for Batch Time Series Searching on MapReduce.- Bus Stop Refinement based on Hot Spot Extraction.- Adaptive Transaction Scheduling for Highly Contended Workloads.- IMOptimizer: An Online Interactive Parameter Optimization System based on Big Data.- Tutorial Papers.- Cohesive Subgraphs with Hierarchical Decomposition on Big Graphs.- Tracking User Behaviours: Laboratory-Based and In-The-Wild User S.- Mining Knowledge Graphs for Vision Tasks.- Enterprise Knowledge Graph From Specific Business Task to Enterprise Knowledge Management.- Knowledge 
Graph Data Management.- Deep learning for Healthcare Data Processing.
£62.99
Springer Nature Switzerland AG Intelligent Tutoring Systems: 15th International Conference, ITS 2019, Kingston, Jamaica, June 3–7, 2019, Proceedings
Book SynopsisThis book constitutes the proceedings of the 15th International Conference on Intelligent Tutoring Systems, ITS 2019, held in Kingston, Jamaica, in June 2019. The 14 full papers and 13 short papers presented in this volume were carefully reviewed and selected from 42 submissions. In the back matter of the volume 4 poster papers are included. They deal with the use of advanced computer technologies and interdisciplinary research for enabling, supporting, and enhancing human learning.Table of ContentsA Learning Early-warning Model Based on Knowledge Points.- Adaptive Learning Spaces with Context-Awareness.- Agents’ Cognitive vs. Socio-affective Support in Response to Learner’s Confusion.- An Adaptive Approach to Provide Feedback for Students in Programming Problem Solving.- Analysis and Prediction of Student Emotions While Doing Programming Exercises.- Analyzing Best Hints for a Programming IST.- Analyzing the Group Formation Process in Intelligent Tutoring Systems.- Analyzing the usage of the classical ITS software architecture and refining it.- Assessing Students’ Clinical Reasoning using Gaze and EEG Features.- Computer-Aided Intervention for Reading Comprehension Disabilities.- Conceptualization of IMS that Estimates Learners’ Mental States from Learners’ Physiological Information Using Deep Neural Network Algorithm.- Data-Driven Student Clusters Based on Online Learning Behavior in a Flipped Classroom with an Intelligent Tutoring System.- Decision Support for an Adversarial Game Environment using Automatic Hint Generation.- Detecting Collaborative Learning through Emotions: An Investigation using Facial Expression Recognition.- Fact Checking Misinformation Using Recommendations from Emotional Pedagogical Agents.- Intelligent On-line Exam Management and Evaluation System.- Learning by Arguing in Argument-Based Machine Learning Framework.- Model for data analysis process and its relationship to the hypothesis-driven and data-driven research 
approaches.- On the discovery of educational patterns using biclustering.- Parent-Child Interaction in Children's Learning How to Use a New Application.- Patterns of Collaboration Dialogue Acts in Typed-Chat Group Problem-Solving.- PKULAE: A Learning Attitude Evaluation Method Based on Learning Behavior.- Predicting MOOCs Dropout Using only two easily obtainable Features from the First Week’s Activities.- Predicting subjective enjoyment of aspects of a videogame from psychophysiological measures of arousal and valence.- Providing the Option to Skip Feedback – A Reproducibility Study.- Reducing Annotation Effort in Automatic Essay Evaluation Using Locality Sensitive Hashing.- Representing and Evaluating Strategies for Solving Parsons Puzzles.- Testing the Robustness of Inquiry Practices once Scaffolding is Removed.- Toward Real-Time System Adaptation using Excitement Detection from Eye Tracking.- Towards Predicting Attention And Workload During Math Problem Solving.- Using a Simulator to Choose the Best Hints in a Reinforcement Learning-Based Multimodal ITS.
£44.99
Springer Nature Switzerland AG Discovery Science: 22nd International Conference, DS 2019, Split, Croatia, October 28–30, 2019, Proceedings
Book SynopsisThis book constitutes the proceedings of the 22nd International Conference on Discovery Science, DS 2019, held in Split, Croatia, in October 2019. The 21 full and 19 short papers presented together with 3 abstracts of invited talks in this volume were carefully reviewed and selected from 63 submissions. The scope of the conference includes the development and analysis of methods for discovering scientific knowledge, coming from machine learning, data mining, intelligent data analysis, big data analysis as well as their application in various scientific domains. The papers are organized in the following topical sections: Advanced Machine Learning; Applications; Data and Knowledge Representation; Feature Importance; Interpretable Machine Learning; Networks; Pattern Discovery; and Time Series.Table of ContentsAdvanced Machine Learning.- Applications.- Data and Knowledge Representation.- Feature Importance.- Interpretable Machine Learning.- Networks.- Pattern Discovery.- Time Series.
£62.99
Springer Nature Switzerland AG Intelligence Science and Big Data Engineering. Visual Data Engineering: 9th International Conference, IScIDE 2019, Nanjing, China, October 17–20, 2019, Proceedings, Part I
Book SynopsisThe two volumes LNCS 11935 and 11936 constitute the proceedings of the 9th International Conference on Intelligence Science and Big Data Engineering, IScIDE 2019, held in Nanjing, China, in October 2019. The 84 full papers presented were carefully reviewed and selected from 252 submissions. The papers are organized in two parts: visual data engineering; and big data and machine learning. They cover a large range of topics including information theoretic and Bayesian approaches, probabilistic graphical models, big data analysis, neural networks and neuro-informatics, bioinformatics, computational biology and brain-computer interfaces, as well as advances in fundamental pattern recognition techniques relevant to image processing, computer vision and machine learning.
£62.99
Springer Nature Switzerland AG Applying Predictive Analytics: Finding Value in
Book SynopsisThe new edition of this textbook presents a practical, updated approach to predictive analytics for classroom learning. The authors focus on using analytics to solve business problems and compare several different modeling techniques, all explained from examples using the SAS Enterprise Miner software. The authors demystify complex algorithms to show how they can be utilized and explained within the context of enhancing business opportunities. Each chapter includes an opening vignette that provides real-life examples of how business analytics have been used in various aspects of organizations to solve issues or improve their results. A running case provides an example of how to build and analyze a complex analytics model and utilize it to predict future outcomes. The new edition includes chapters on clusters and associations and text mining to support predictive models. An additional case is also included that can be used with each chapter or as a semester project.
Table of Contents
Chapter 1 Introduction to Predictive Analytics 1.1 Predictive Analytics in Action 1.2 Analytics Landscape 1.3 Analytics 1.3.2 Predictive Analytics 1.4 Regression Analysis 1.5 Machine Learning Techniques 1.6 Predictive Analytics Model 1.7 Opportunities in Analytics 1.8 Introduction to the Automobile Insurance Claim Fraud Example 1.9 Chapter Summary References
Chapter 2 Know Your Data – Data Preparation 2.1 Classification of Data 2.1.1 Qualitative versus Quantitative 2.1.2 Scales of Measurement 2.2 Data Preparation Methods 2.2.1 Inconsistent Formats 2.2.2 Missing Data 2.2.3 Outliers 2.2.4 Other Data Cleansing Considerations 2.3 Data Sets and Data Partitioning 2.4 SAS Enterprise Miner™ Model Components 2.4.1 Step 1. Create Three of the Model Components 2.4.2 Step 2. Import an Excel File and Save as a SAS File 2.4.3 Step 3. Create the Data Source 2.4.4 Step 4. Partition the Data Source 2.4.5 Step 5. Data Exploration 2.4.6 Step 6. Missing Data 2.4.7 Step 7. Handling Outliers 2.4.8 Step 8. Categorical Variables with Too Many Levels 2.5 Chapter Summary References
Chapter 3 What do Descriptive Statistics Tell Us 3.1 Descriptive Analytics 3.2 The Role of the Mean, Median and Mode 3.3 Variance and Distribution 3.4 The Shape of the Distribution 3.4.2 Kurtosis 3.5 Covariance and Correlation 3.6 Variable Reduction 3.6.1 Variable Clustering 3.6.2 Principal Component Analysis 3.7 Hypothesis Testing 3.8 Analysis of Variance (ANOVA) 3.9 Chi Square 3.10 Fit Statistics 3.11 Stochastic Models 3.12 Chapter Summary References
Chapter 4 Predictive Models Using Regression 4.1 Regression 4.1.1 Classical assumptions 4.2 Ordinary Least Squares 4.3 Simple Linear Regression 4.3.1 Determining Relationship Between Two Variables 4.3.2 Line of Best Fit and Simple Linear Regression Equation 4.4 Multiple Linear Regression 4.4.1 Metrics to Evaluate the Strength of the Regression Line 4.3.2 Best-fit model 4.3.3 Selection of Variables in Regression 4.5 Principal Component Regression 4.5.1 Principal Component Analysis Revisited 4.5.2 Principal Component Regression 4.6 Partial Least Squares 4.7 Logistic Regression 4.7.1 Binary Logistic Regression 4.7.2 Examination of Coefficients 4.7.3 Multinomial Logistic Regression 4.7.4 Ordinal Logistic Regression 4.8 Implementation of Regression in SAS Enterprise Miner™ 4.8.1 Regression Node Train Properties: Class Targets 4.8.2 Regression Node Train Properties: Model Options 4.8.3 Regression Node Train Properties: Model Selection 4.9 Implementation of Two-Factor Interaction and Polynomial Terms 4.9.1 Regression Node Train Properties: Equation 4.10 DMINE Regression in SAS Enterprise Miner™ 4.10.1 DMINE Properties 4.10.2 DMINE Results 4.11 Partial Least Squares Regression in SAS Enterprise Miner™ 4.11.1 Partial Least Squares Properties 4.11.2 Partial Least Squares Results 4.12 Least Angles Regression in SAS Enterprise Miner™ 4.12.1 Least Angle Regression Properties 4.12.2 Least Angles Regression Results 4.13 Other Forms of Regression 4.14 Chapter Summary References
Chapter 5 The Second of the Big Three – Decision Trees 5.1 What is a Decision Tree? 5.2 Creating a Decision Tree 5.3 Data Partitions and Decision Trees 5.4 Creating a Decision Tree Using SAS Enterprise Miner™ The key properties include: Subtree Properties 5.4.1 Overfitting 5.5 Creating an Interactive Decision Tree using SAS Enterprise Miner™ 5.6 Creating a Maximal Decision Tree using SAS Enterprise Miner™ 5.7 Chapter Summary References
Chapter 6 The Third of the Big Three – Neural Networks 6.1 What is a Neural Network? 6.2 History of Neural Networks 6.3 Components of a Neural Network 6.4 Neural Network Architectures 6.5 Training a Neural Network 6.6 Radial Basis Function Neural Networks 6.7 Creating a Neural Network using SAS Enterprise Miner™ 6.8 Using SAS Enterprise Miner™ to Automatically Generate a Neural Network 6.9 Explaining a Neural Network 6.10 Chapter Summary References
Chapter 7 Model Comparisons and Scoring 7.1 Beyond the Big 7.2 Gradient Boosting 7.3 Ensemble Models 7.4 Random Forests 7.6 Two-Stage Model 7.7 Comparing Predictive Models 7.7.1 Evaluating Fit Statistics – Which Model Do We Use? 7.8 Using Historical Data to Predict the Future – Scoring 7.8.1 Analyzing and Reporting Results 7.8.2 Save Data Node 7.8.3 Reporter Node 7.9 The Importance of Predictive Analytics 7.9.1 What Should We Expect for Predictive Analytics in the Future? 7.10 Chapter Summary References
Chapter 8 Finding Associations in Data through Cluster Analysis 8.1 Applications and Uses of Cluster Analysis 8.2 Types of Clustering Techniques 8.3 Hierarchical Clustering 8.3.1 Agglomerative Clustering 8.3.2 Divisive Clustering 8.3.3 Agglomerative vs Divisive Clustering 8.4 Non-hierarchical clustering 8.4.1 K-means Clustering 8.4.2 Initial Centroid Selection 8.4.3 Determining the Number of Clusters 8.4.4 Evaluating your clusters 8.5 Hierarchical vs Nonhierarchical 8.6 Cluster Analysis using SAS Enterprise Miner™ 8.6.1 Cluster Node 8.6.2 Additional Key Properties of the Cluster Node 8.7 Applying Cluster Analysis to the Insurance Claim Fraud Data Set 8.8 Chapter Summary References
9.1 What is Text Analytics? 9.2 Information Retrieval 9.3 Text Parsing 9.4 Zipf’s Law 9.5 Text Filter 9.6 Text Cluster 9.7 Text Topic 9.8 Text Rule Builder 9.9 Text Profile 9.10 Chapter Summary Discussion Questions References
Appendix A Data Dictionary for the Automobile Insurance Claim Fraud Data Example
Appendix B Can you Predict the Money Laundering Cases? B.1 Introduction B.2 Business Problem B.3 Analyze Data B.4 Development and Optimization of a Best Fit Model B.5 Final Report References
£56.99
Springer Nature Switzerland AG Federated Learning for IoT Applications
Book SynopsisThis book presents how federated learning helps to understand and learn from user activity in Internet of Things (IoT) applications while protecting user privacy. The authors first show how federated learning provides a unique way to build personalized models using data without intruding on users’ privacy. The authors then provide a comprehensive survey of state-of-the-art research on federated learning, giving the reader a general overview of the field. The book also investigates how a personalized federated learning framework is needed in cloud-edge architecture as well as in wireless-edge architecture for intelligent IoT applications. To cope with the heterogeneity issues in IoT environments, the book investigates emerging personalized federated learning methods that are able to mitigate the negative effects caused by heterogeneities in different aspects. The book provides case studies of IoT based human activity recognition to demonstrate the effectiveness of personalized federated learning for intelligent IoT applications, as well as multiple controller design and system analysis tools including model predictive control, linear matrix inequalities, optimal control, etc. This unique and complete co-design framework will benefit researchers, graduate students and engineers in the fields of control theory and engineering. Table of ContentsChapter 1. Introduction to Federated Learning.- Chapter 2. Federated Learning for IoT Devices.- Chapter 3. Personalized Federated Learning.- Chapter 4. Federated Learning for an IoT Application.- Chapter 5. Some observations on the behaviour of Federated Learning.- Chapter 6. Federated Learning with Cooperating Devices: A Consensus Approach.- Chapter 7. A prospective study of federated machine learning in medical image fusion.- Chapter 8. Communication-Efficient Federated Learning in Wireless-Edge Architecture.- Chapter 9. Towards Ubiquitous AI in 6G with Federated Learning.- Chapter 10. 
Federated Learning using Tensor Flow.- Chapter 11. Cyber Security and privacy of Connected and Automated Vehicles (CAVs) based Federated Learning: Challenges, Opportunities and Open Issues.- Chapter 12. Security Issues & Solutions for Healthcare Informatics.- Chapter 13. Federated Learning: Challenges, Methods, and Future Directions.- Chapter 14. Quantum Federated Learning for Wireless Communications.- Chapter 15. Federated machine learning with data mining in health care.- Chapter 16. Federated Learning for data mining in Healthcare.
£94.99
Springer Nature Switzerland AG Text Mining with MATLAB®
Book SynopsisText Mining with MATLAB® provides a comprehensive introduction to text mining using MATLAB. It is designed to help text mining practitioners, as well as those with little-to-no experience with text mining in general, familiarize themselves with MATLAB and its complex applications. The book is structured in three main parts: The first part, Fundamentals, introduces basic procedures and methods for manipulating and operating with text within the MATLAB programming environment. The second part of the book, Mathematical Models, is devoted to motivating, introducing, and explaining the two main paradigms of mathematical models most commonly used for representing text data: the statistical and the geometrical approach. Finally, the third part of the book, Techniques and Applications, addresses general problems in text mining and natural language processing applications such as document categorization, document search, content analysis, summarization, question answering, and conversational systems. This second edition includes updates in line with the recently released “Text Analytics Toolbox” within the MATLAB product and introduces three new chapters and six new sections in existing ones. All descriptions presented are supported with practical examples that are fully reproducible. Further reading, as well as additional exercises and projects, are proposed at the end of each chapter for those readers interested in conducting further experimentation. Table of Contents1. Introduction.- PART I: FUNDAMENTALS.- 2. Handling Text Data.- 3. Regular Expressions.- 4. Basic Operations with Strings.- 5. Reading and Writing Files.- 6. The Structure of Language.- PART II: MATHEMATICAL MODELS.- 7. Basic Corpus Statistics.- 8. Statistical Models.- 9. Geometrical Models.- 10. Dimensionality Reduction.- PART III: METHODS AND APPLICATIONS.- 11. Document Categorization.- 12. Document Search.- 13. Content Analysis.- 14. Keyword Extraction and Summarization.- 15. 
Question Answering and Dialogue.
£56.99
Springer Nature Switzerland AG Discovery Science: 24th International Conference, DS 2021, Halifax, NS, Canada, October 11–13, 2021, Proceedings
Book SynopsisThis book constitutes the proceedings of the 24th International Conference on Discovery Science, DS 2021, which took place virtually during October 11-13, 2021. The 36 papers presented in this volume were carefully reviewed and selected from 76 submissions. The contributions were organized in topical sections named: applications; classification; data streams; graph and network mining; machine learning for COVID-19; neural networks and deep learning; preferences and recommender systems; representation learning and feature selection; responsible artificial intelligence; and spatial, temporal and spatiotemporal data. Table of ContentsApplications.- Automated Grading of Exam Responses: An Extensive Classification Benchmark.- Automatic human-like detection of code smells.- HTML-LSTM: Information Extraction from HTML Tables in Web Pages using Tree-Structured LSTM.- Predicting reach to find persuadable customers: improving uplift models for churn prevention.- Classification.- A Semi-Supervised Framework for Misinformation Detection.- An Analysis of Performance Metrics for Imbalanced Classification.- Combining Predictions under Uncertainty: The Case of Random Decision Trees.- Shapley-Value Data Valuation for Semi-Supervised Learning.- Data streams.- A Network Intrusion Detection System for Concept Drifting Network Traffic Data.- Incremental k-Nearest Neighbors Using Reservoir Sampling for Data Streams.- Statistical Analysis of Pairwise Connectivity.- Graph and Network Mining.- FHA: Fast Heuristic Attack against Graph Convolutional Networks.- Ranking Structured Objects with Graph Neural Networks.- Machine Learning for COVID-19.- Knowledge discovery of the delays experienced in reporting covid19 confirmed positive cases using time to event models.- Multi-Scale Sentiment Analysis of Location-Enriched COVID-19 Arabic Social Data.- Prioritization of COVID-19 literature via unsupervised keyphrase extraction and document representation learning.- Sentiment Nowcasting during the COVID-19 Pandemic.- Neural Networks and Deep Learning.- A Sentence-level Hierarchical BERT Model for Document Classification with Limited Labelled Data.- Calibrated Resampling for Imbalance and Long-Tails in Deep learning.- Consensus Based Vertically Partitioned Multi-Layer Perceptrons for Edge Computing.- Controlling BigGAN Image Generation with a Segmentation Network.- GANs for tabular healthcare data generation: a review on utility and privacy.- Preferences and Recommender Systems.- An Ensemble Hypergraph Learning framework for Recommendation.- KATRec: Knowledge Aware aTtentive Sequential Recommendations.- Representation Learning and Feature Selection.- Elliptical Ordinal Embedding.- Unsupervised Feature Ranking via Attribute Networks.- Responsible Artificial Intelligence.- Deriving a Single Interpretable Model by Merging Tree-based Classifiers.- Ensemble of Counterfactual Explainers.- Learning Time Series Counterfactuals via Latent Space Representations.- Leveraging Grad-CAM to Improve the Accuracy of Network Intrusion Detection Systems.- Local Interpretable Classifier Explanations with Self-generated Semantic Features.- Privacy risk assessment of individual psychometric profiles.- The Case for Latent Variable vs Deep Learning Methods in Misinformation Detection: An Application to COVID-19.- Spatial, Temporal and Spatiotemporal Data.- Local Exceptionality Detection in Time Series Using Subgroup Discovery.- Neural Additive Vector Autoregression Models for Causal Discovery in Time Series.- Spatially-Aware Autoencoders for Detecting Contextual Anomalies in Geo-Distributed Data.
£62.99
Springer International Publishing AG Data Analytics and Management in Data Intensive Domains: 23rd International Conference, DAMDID/RCDL 2021, Moscow, Russia, October 26–29, 2021, Revised Selected Papers
Book SynopsisThis book constitutes the post-conference proceedings of the 23rd International Conference on Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2021, held in Moscow, Russia, in October 2021 (virtually, due to the COVID-19 pandemic). The 16 revised full papers were carefully reviewed and selected from 61 submissions. The papers are organized in the following topical sections: problem solving infrastructures, experiment organization, and machine learning applications; data analysis in astronomy; data analysis in material and earth sciences; information extraction from text.Table of ContentsProblem Solving Infrastructures, Experiment Organization, and Machine Learning Applications.- MLDev: Data Science Experiment Automation and Reproducibility Software.- Response to Cybersecurity Threats of Informational Infrastructure Based on Conceptual Models.- Social Network Analysis of the Professional Community Interaction - Movie Industry Case.- Data Analysis in Astronomy.- Cross-Matching of Large Sky Surveys and Study of Astronomical Objects Apparent in Ultraviolet Band Only.- The Diversity of Light Curves of Supernovae Associated with Gamma-Ray Bursts.- Application of Machine Learning Methods for Cross-Matching Astronomical Catalogs.- Pipeline for Detection of Transient Objects in Optical Surveys.- VALD in Astrophysics.- Data Analysis in Material and Earth Sciences.- Machine Learning Application to Predict New Inorganic Compounds – Results and Perspectives.- Interoperability and Architecture Requirements Analysis and Metadata Standardization for a Research Data Infrastructure in Catalysis.- Fast Predictions of Lattice Energies by Continuous Isometry Invariants of Crystal Structures.- Image Recognition for Large Soil Maps Archive Overview: Metadata Extraction and Georeferencing Tool Development.- Information Extraction from Text.- Cross-lingual Plagiarism Detection Method.- Methods for Automatic Argumentation Structure Prediction.- A System for Information Extraction from Scientific Texts in Russian.- Improving Neural Abstractive Summarization with Reliable Sentence Sampling.
£58.49
Springer International Publishing AG Learning to Quantify
Book SynopsisThis open access book provides an introduction and an overview of learning to quantify (a.k.a. “quantification”), i.e. the task of training estimators of class proportions in unlabeled data by means of supervised learning. In data science, learning to quantify is a task of its own, related to classification yet different from it, since estimating class proportions by simply classifying all data and counting the labels assigned by the classifier is known to often return inaccurate (“biased”) class proportion estimates. The book introduces learning to quantify by looking at the supervised learning methods that can be used to perform it, at the evaluation measures and evaluation protocols that should be used for evaluating the quality of the returned predictions, at the numerous fields of human activity in which the use of quantification techniques may provide improved results with respect to the naive use of classification techniques, and at advanced topics in quantification research. The book is suitable for researchers, data scientists, or PhD students who want to come up to speed with the state of the art in learning to quantify, but also for researchers wishing to apply data science technologies to fields of human activity (e.g., the social sciences, political science, epidemiology, market research) which focus on aggregate (“macro”) data rather than on individual (“micro”) data.Table of Contents- 1. The Case for Quantification. - 2. Applications of Quantification. - 3. Evaluation of Quantification Algorithms. - 4. Methods for Learning to Quantify. - 5. Advanced Topics. - 6. The Quantification Landscape. - 7. The Road Ahead.
£999.99
Springer International Publishing AG Computational Intelligence Methods for Bioinformatics and Biostatistics: 17th International Meeting, CIBB 2021, Virtual Event, November 15–17, 2021, Revised Selected Papers
Book SynopsisThis book constitutes revised selected papers from the 17th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2021, which was held virtually during November 15–17, 2021. The 19 papers included in these proceedings were carefully reviewed and selected from 26 submissions, and they focus on bioinformatics, computational biology, health informatics, cheminformatics, biotechnology, biostatistics, and biomedical imaging.Table of ContentsChemical Neural Networks and Synthetic Cell Biotechnology: Preludes to Chemical AI.- Development of Bayesian network for multiple sclerosis risk factor interaction analysis.- Real-Time Automatic Plankton Detection, Tracking and Classification on Raw Hologram.- The first in-silico model of leg movement activity during sleep.- Transfer learning and magnetic resonance imaging techniques for deep neural network-based diagnosis of early cognitive decline and dementia.- Improving bacterial sRNA identification by combining genomic context and sequence-derived features.- High-dimensional multi-trait GWAS by reverse prediction of genotypes using machine learning methods.- A Non-Negative Matrix Tri-Factorization based Method for Predicting Antitumor Drug Sensitivity.- A Rule-based Approach for Generating Synthetic Biological Pathways.- Machine Learning Classifiers based on Dimensionality Reduction Techniques for the Early Diagnosis of Alzheimer’s Disease using Magnetic Resonance Imaging and Positron Emission Tomography Brain Data.- Text Mining Enhancements for Image Recognition of Gene Names and Gene Relations.- Sentence Classification to Detect Tables for Helping Extraction of Regulatory Interactions in Bacteria.- RF-Isolation: a Novel Representation of Structural Connectivity Networks for Multiple Sclerosis Classification.- Summarizing Global SARS–CoV–2 Geographical Spread by Phylogenetic Multitype Branching Models.- Explainable AI Models for COVID-19 Diagnosis using CT-Scan Images and Clinical Data.- The need of standardised metadata to encode causal relationships: Towards safer data-driven machine learning biological solutions.- Deep Recurrent Neural Networks for the Generation of Synthetic Coronavirus Spike Protein Sequences.- Recent Dimensionality Reduction Techniques for High-Dimensional COVID-19 Data.- Soft brain ageing indicators based on light-weight LeNet-like neural networks and localized 2D brain age biomarkers.
£999.99
Springer International Publishing AG ICT Innovations 2022. Reshaping the Future Towards a New Normal: 14th International Conference, ICT Innovations 2022, Skopje, Macedonia, September 29 – October 1, 2022, Proceedings
Book SynopsisThis book constitutes the refereed proceedings of the 14th International Conference on ICT Innovations 2022. Reshaping the Future Towards a New Normal, ICT Innovations 2022, held in Skopje, Macedonia, during September 29–October 1, 2022. The 14 full papers and 1 short paper included in this book were carefully reviewed and selected from 42 submissions. They were organized in topical sections as follows: theoretical foundations and distributed computing; artificial intelligence and deep learning; applied artificial intelligence; education; and medical informatics.Table of ContentsThe New Normal: Innovative Informal Digital Learning after the Pandemic.- Theoretical foundations and distributed computing.- StegIm: Image in Image Steganography.- A Property of an Error-Detecting Code Based on Quasigroups.- Multi-access edge computing smart relocation approach from an NFV perspective.- Artificial intelligence and deep learning.- MACEDONIZER - The Macedonian Transformer Language Model.- Deep learning-based sentiment classification of social network texts in Amharic language.- Using centrality measures to extract knowledge from cryptocurrencies’ interdependencies networks.- Applied artificial intelligence.- Evaluating micro frontend approaches for code reusability.- Combining Static and Dynamic Features to Improve Longitudinal Image Retrieval for Alzheimer's Disease.- Architecture for collecting and analysing data from sensor devices.- Education.- Adapting a Web 2.0-based Course to a Fully Online Course and Readapting it Back for Face-to-Face Use.- Challenges and opportunities for women studying STEM.- Medical informatics.- Novel Methodology for Improving the Generalization Capability of Chemo-Informatics Deep Learning Models.- An exploration of Autism Spectrum Disorder classification from structural and functional MRI images.- Detection of High Noise Levels in Electrocardiograms.
£56.99
Springer International Publishing AG Data Science and Big Data Computing: Frameworks
Book SynopsisThis illuminating text/reference surveys the state of the art in data science, and provides practical guidance on big data analytics. Expert perspectives are provided by authoritative researchers and practitioners from around the world, discussing research developments and emerging trends, presenting case studies on helpful frameworks and innovative methodologies, and suggesting best practices for efficient and effective data analytics. Features: reviews a framework for fast data applications, a technique for complex event processing, and agglomerative approaches for the partitioning of networks; introduces a unified approach to data modeling and management, and a distributed computing perspective on interfacing physical and cyber worlds; presents techniques for machine learning for big data, and identifying duplicate records in data repositories; examines enabling technologies and tools for data mining; proposes frameworks for data extraction, and adaptive decision making and social media analysis.Trade Review“This title presents recent research and future trends in the area of big data. … It will be of value to students and researchers looking for research topics and to data scientists exploring ongoing work in the field of big data. Summing Up: Recommended. Graduate students; faculty and professionals.” (C. Tappert, Choice, Vol. 54 (7), March, 2017)Table of ContentsPart I: Data Science Applications and Scenarios.- An Interoperability Framework and Distributed Platform for Fast Data Applications (José Carlos Martins Delgado).- Complex Event Processing Framework for Big Data Applications (Renta Chintala Bhargavi).- Agglomerative Approaches for Partitioning of Networks in Big Data Scenarios (Anupam Biswas, Gourav Arora, Gaurav Tiwari, Srijan Khare, Vyankatesh Agrawal and Bhaskar Biswas).- Identifying Minimum-Sized Influential Vertices on Large-Scale Weighted Graphs: A Big Data Perspective (Ying Xie, Jing (Selena) He and Vijay V. Raghavan).- Part II: Big Data Modelling and Frameworks.- A Unified Approach to Data Modelling and Management in Big Data Era (Catalin Negru, Florin Pop, Mariana Mocanu and Valentin Cristea).- Interfacing Physical and Cyber Worlds: A Big Data Perspective (Zartasha Baloch, Faisal Karim Shaikh and Mukhtiar A. Unar).- Distributed Platforms and Cloud Services: Enabling Machine Learning for Big Data (Daniel Pop, Gabriel Iuhasz and Dana Petcu).- An Analytics Driven Approach to Identify Duplicate Bug Records in Large Data Repositories (Anjaneyulu Pasala, Sarbendu Guha, Gopichand Agnihotram, Satya Prateek B and Srinivas Padmanabhuni).- Part III: Big Data Tools and Analytics.- Large Scale Data Analytics Tools: Apache Hive, Pig and HBase (N. Maheswari and M. Sivagami).- Big Data Analytics: Enabling Technologies and Tools (Mohanavadivu Periasamy and Pethuru Raj).- A Framework for Data Mining and Knowledge Discovery in Cloud Computing (Derya Birant and Pelin Yıldırım).- Feature Selection for Adaptive Decision Making in Big Data Analytics (Jaya Sil and Asit Kumar Das).- Social Impact and Social Media Analysis Relating to Big Data (Nirmala Dorasamy and Nataša Pomazalová)
£98.99
Springer International Publishing AG Machine Learning for Health Informatics: State-of-the-Art and Future Challenges
Book SynopsisMachine learning (ML) is the fastest growing field in computer science, and Health Informatics (HI) is amongst the greatest application challenges, providing future benefits in improved medical diagnoses, disease analyses, and pharmaceutical development. However, successful ML for HI needs a concerted effort, fostering integrative research between experts from diverse disciplines, from data science to visualization. Tackling complex challenges needs both disciplinary excellence and cross-disciplinary networking without any boundaries. Following the HCI-KDD approach, which combines the best of two worlds, the aim is to support human intelligence with machine intelligence. This state-of-the-art survey is an output of the international HCI-KDD expert network and features 22 carefully selected and peer-reviewed chapters on hot topics in machine learning for health informatics; they discuss open problems and future challenges in order to stimulate further research and international progress in this field.Table of ContentsMachine Learning for Health Informatics.- Bagging Soft Decision Trees.- Grammars for Discrete Dynamics.- Empowering Bridging Term Discovery for Cross-domain Literature Mining in the TextFlows Platform.- Visualisation of Integrated Patient-Centric Data as Pathways: Enhancing Electronic Medical Records in Clinical Practice.- Deep learning trends for focal brain pathology segmentation in MRI.- Differentiation between Normal and Epileptic EEG using K-Nearest-Neighbors Technique.- Survey on Feature Extraction and Applications of Biosignals.- Argumentation for knowledge representation, conflict resolution, defeasible inference and its integration with machine learning.- Machine Learning and Data mining Methods for Managing Parkinson’s Disease.- Challenges of Medical Text and Image Processing: Machine Learning Approaches.- Visual Intelligent Decision Support Systems in the medical field: design and evaluation.
£53.99
Springer International Publishing AG The Data Science Design Manual
Book SynopsisThis engaging and clearly written textbook/reference provides a must-have introduction to the rapidly emerging interdisciplinary field of data science. It focuses on the principles fundamental to becoming a good data scientist and the key skills needed to build systems for collecting, analyzing, and interpreting data. The Data Science Design Manual is a source of practical insights that highlights what really matters in analyzing data, and provides an intuitive understanding of how these core concepts can be used. The book does not emphasize any particular programming language or suite of data-analysis tools, focusing instead on high-level discussion of important design principles. This easy-to-read text ideally serves the needs of undergraduate and early graduate students embarking on an “Introduction to Data Science” course. It reveals how this discipline sits at the intersection of statistics, computer science, and machine learning, with a distinct heft and character of its own. Practitioners in these and related fields will find this book perfect for self-study as well. Additional learning tools: contains “War Stories,” offering perspectives on how data science applies in the real world; includes “Homework Problems,” providing a wide range of exercises and projects for self-study; provides a complete set of lecture slides and online video lectures at www.data-manual.com; provides “Take-Home Lessons,” emphasizing the big-picture concepts to learn from each chapter; recommends exciting “Kaggle Challenges” from the online platform Kaggle; highlights “False Starts,” revealing the subtle reasons why certain approaches fail; offers examples taken from the data science television show “The Quant Shop” (www.quant-shop.com). Trade Review “The book is more than a typical manual. In fact, the author himself designates it as a textbook for an introductory course on data science. The chapters are richly equipped with exercises. The topics are always explained starting with a proper motivation and continuing with practical examples. This is perhaps the most outstanding feature of the book. It can serve as a regular textbook for an academic course. In fact, I should like to recommend it exactly for this purpose. On the other hand, it provides a wealth of material for people from industry, such as software engineers, and can serve as a manual for them to accomplish data science tasks. It should be noted that the book is not just a text, but a much more complex product, including a full set of lecture slides available online as well as a solutions wiki.” (P. Navrat, Computing Reviews, February 23, 2018) Table of ContentsWhat is Data Science?.- Mathematical Preliminaries.- Data Munging.- Scores and Rankings.- Statistical Analysis.- Visualizing Data.- Mathematical Models.- Linear Algebra.- Linear and Logistic Regression.- Distance and Network Methods.- Machine Learning.- Big Data: Achieving Scale.
£45.55
Springer International Publishing AG Multimodal Analysis of User-Generated Multimedia Content
£116.99
Springer-Verlag Berlin and Heidelberg GmbH & Co. KG New Frontiers in Artificial Intelligence: JSAI
Book SynopsisThis book constitutes the thoroughly refereed joint post-proceedings of three international workshops organized by the Japanese Society for Artificial Intelligence, held in Tokyo, Japan in June 2006 during the 20th Annual Conference JSAI 2006. The volume starts with eight award winning papers of the JSAI 2006 main conference that are presented along with the 21 revised full workshop papers, carefully reviewed and selected for inclusion in the volume.Table of ContentsAwarded Papers.- Overview of Awarded Papers – The 20th Annual Conference of JSAI.- Translational Symmetry in Subsequence Time-Series Clustering.- Visualization of Contents Archive by Contour Map Representation.- Discussion Ontology: Knowledge Discovery from Human Activities in Meetings.- Predicting Types of Protein-Protein Interactions Using a Multiple-Instance Learning Model.- Lattice for Musical Structure and Its Arithmetics.- Viewlon: Visualizing Information on Semantic Sensor Network.- Cooperative Task Achievement System Between Humans and Robots Based on Stochastic Memory Model of Spatial Environment.- People Who Create Knowledge Sharing Communities.- Logic and Engineering of Natural Language Semantics.- Logic and Engineering of Natural Language Semantics (LENLS) 3.- A Dynamic Semantics of Intentional Identity.- Prolegomena to General-Imaging-Based Probabilistic Dynamic Epistemic Logic.- Logical Dynamics of Commands and Obligations.- On Factive Islands: Pragmatic Anomaly vs. Pragmatic Infelicity.- Aspects of the Indefiniteness Effect.- Interpreting Metaphors in a New Semantic Theory of Concept.- Covert Emotive Modality Is a Monster.- Conversational Implicatures Via General Pragmatic Pressures.- Dake-wa: Exhaustifying Assertions.- Unembedded ‘Negative’ Quantifiers.- Learning with Logics and Logics for Learning.- The Fourth Workshop on Learning with Logics and Logics for Learning (LLLL2006).- Consistency Conditions for Inductive Inference of Recursive Functions.- Inferability of Closed Set Systems from Positive Data.- An Extended Branch and Bound Search Algorithm for Finding Top-N Formal Concepts of Documents.- N-Gram Analysis Based on Zero-Suppressed BDDs.- Risk Mining.- Risk Mining - Overview.- Analysis on a Relation Between Enterprise Profit and Financial State by Using Data Mining Techniques.- Unusual Condition Detection of Bearing Vibration in Hydroelectric Power Plants for Risk Management.- Structural Health Assessing by Interactive Data Mining Approach in Nuclear Power Plant.- Developing Mining-Grid Centric e-Finance Portals for Risk Management.- Knowledge Discovery from Click Stream Data and Effective Site Management.- Sampling-Based Stream Mining for Network Risk Management.- Relation Between Abductive and Inductive Types of Nursing Risk Management.
£44.99
Springer-Verlag Berlin and Heidelberg GmbH & Co. KG Ubiquitous Social Media Analysis: Third International Workshops MUSE 2012, Bristol, UK, September 24, 2012, and MSM 2012, Milwaukee, WI, USA, June 25, 2012, Revised Selected Papers
Book SynopsisThis book constitutes the thoroughly refereed joint post-proceedings of the Third International Workshop on Mining Ubiquitous and Social Environments, MUSE 2012, held in Bristol, UK, in September 2012, and the Third International Workshop on Modeling Social Media, MSM 2012, held in Milwaukee, WI, USA, in June 2012. The 8 full papers included in the book are revised and significantly extended versions of papers submitted to the workshops. They cover a wide range of topics organized in three main themes: communities and group structure in ubiquitous social media; ubiquitous modeling; and aspects of social interactions and influence.Table of ContentsHow to Carve up the World: Learning and Collaboration for Structure Recommendation.- A Topological Approach for Detecting Twitter Communities with Common Interests.- Using Geographic Cost Functions to Discover Vessel Itineraries from AIS Messages.- Social Media as a Source of Sensing to Study City Dynamics and Urban Social Behavior: Approaches, Models and Opportunities.- An Analysis of Interactions within and between Extreme Right Communities in Social Media.- Who will Interact with Whom? A Case-Study in Second Life Using Online Social Network and Location-Based Social Network Features to Predict Interactions between Users.- Identifying Influential Users by Their Postings in Social Networks.- Modeling a Web Forum Ecosystem into an Enriched Social Graph.
£41.99
Springer Fachmedien Wiesbaden Data Analytics
Book SynopsisThis book is a comprehensive introduction to the methods and algorithms of modern data analytics. It provides a sound mathematical basis, discusses advantages and drawbacks of different approaches, and enables the reader to design and implement data analytics solutions for real-world applications.
£999.99
Springer-Verlag Berlin and Heidelberg GmbH & Co. KG Fundamentals of Business Intelligence
Book SynopsisThis book presents a comprehensive and systematic introduction to transforming process-oriented data into information about the underlying business process, which is essential for all kinds of decision-making. To that end, the authors develop step-by-step models and analytical tools for obtaining high-quality data structured in such a way that complex analytical tools can be applied. The main emphasis is on process mining and data mining techniques and the combination of these methods for process-oriented data. After a general introduction to the business intelligence (BI) process and its constituent tasks in chapter 1, chapter 2 discusses different approaches to modeling in BI applications. Chapter 3 is an overview and provides details of data provisioning, including a section on big data. Chapter 4 tackles data description, visualization, and reporting. Chapter 5 introduces data mining techniques for cross-sectional data. Different techniques for the analysis of temporal data are then detailed in Chapter 6. Subsequently, chapter 7 explains techniques for the analysis of process data, followed by the introduction of analysis techniques for multiple BI perspectives in chapter 8. The book closes with a summary and discussion in chapter 9. Throughout the book, (mostly open source) tools are recommended, described and applied; a more detailed survey on tools can be found in the appendix, and a detailed code for the solutions together with instructions on how to install the software used can be found on the accompanying website. Also, all concepts presented are illustrated and selected examples and exercises are provided.The book is suitable for graduate students in computer science, and the dedicated website with examples and solutions makes the book ideal as a textbook for a first course in business intelligence in computer science or business information systems. 
Additionally, practitioners and industrial developers who are interested in the concepts behind business intelligence will benefit from the clear explanations and many examples.Trade Review“The usage of examples and case studies enable real life application and brings a sophisticated text to life. … the book is a comprehensive and thoroughly well thought out introduction to the subject of business intelligence and the reader will not be left wanting as the clear examples are numerous. … Readers interested in the value of data and the concepts behind business intelligence will find the book and its accompanying website highly informative.” (Georgette Banham, bcs, The Chartered Institute for IT, bcs.org, August, 2016)“This book focuses primarily on the data mining, data warehousing, data analytics, data visualization, data presentation, and process analysis dimensions of BI in detail. … One of the noteworthy strengths of this book is the inclusion of comprehensive lists with very recent and relevant references for BI at the end of each chapter. This should make the book very useful for academic research on the topic.” (Satya Prakash Saraswat, Computing Reviews, February, 2016)Table of Contents1 Introduction.- 2 Modeling in Business Intelligence.- 3 Data Provisioning.- 4 Data Description and Visualization.- 5 Data Mining for Cross-Sectional Data.- 6 Data Mining for Temporal Data.- 7 Process Analysis.- 8 Analysis of Multiple Business Perspectives.- 9 Summary.- A Survey on Business Intelligence Tools.
£61.74
Bpb Publications SAP S/4HANA Central Finance and Group Reporting:
Book Synopsis
£999.99
Springer Verlag, Singapore Crowdsourced Data Management: Hybrid Machine-Human Computing
Book SynopsisThis book provides an overview of crowdsourced data management. Covering all aspects including the workflow, algorithms and research potential, it particularly focuses on the latest techniques and recent advances. The authors identify three key aspects in determining the performance of crowdsourced data management: quality control, cost control and latency control. By surveying and synthesizing a wide spectrum of studies on crowdsourced data management, the book outlines important factors that need to be considered to improve crowdsourced data management. It also introduces a practical crowdsourced-database-system design and presents a number of crowdsourced operators. Self-contained and covering theory, algorithms, techniques and applications, it is a valuable reference resource for researchers and students new to crowdsourced data management with a basic knowledge of data structures and databases.Table of Contents1. Introduction.- 2. Crowdsourcing Background.- 3. Quality Control.- 4. Cost Control.- 5. Latency Control.- 6. Crowdsourcing Database Systems and Optimization.- 7. Crowdsourced Operators.- Conclusion.
£80.99
Springer International Publishing AG Dimensionality Reduction in Data Science
Book Synopsis: This book provides a practical and fairly comprehensive review of Data Science through the lens of dimensionality reduction, as well as hands-on techniques to tackle problems with data collected in the real world. State-of-the-art results and solutions from statistics, computer science and mathematics are explained from the point of view of a practitioner in any domain science, such as biology, cyber security, chemistry, sports science and many others. Quantitative and qualitative assessment methods are described to implement and validate the solutions back in the real world where the problems originated. The ability to generate, gather and store volumes of data in the order of tera- and exabytes daily has far outpaced our ability to derive useful information with available computational resources for many domains. This book focuses on data science and problem definition, data cleansing, feature selection and extraction, statistical, geometric, information-theoretic, biomolecular and machine learning methods for dimensionality reduction of big datasets and problem solving, as well as a comparative assessment of solutions in a real-world setting. This book targets professionals working within related fields with an undergraduate degree in any science area, particularly quantitative. Readers should be able to follow examples in this book that introduce each method or technique. These motivating examples are followed by precise definitions of the technical concepts required and presentation of the results in general situations. These concepts require a degree of abstraction that can be followed by re-interpreting concepts as in the original example(s). Finally, each section closes with solutions to the original problem(s) afforded by these techniques, perhaps in various ways to compare and contrast dis/advantages to other solutions. Table of Contents: 1. What is Data Science (DS)? 1.1 Major Families of Data Science Problems; 1.1.1 Classification Problems; 1.1.2 Prediction Problems; 1.1.3 Clustering Problems; 1.2 Data, Big Data and Pre-processing; 1.2.1 What is Data? 1.2.2 Big data; 1.2.3 Data Cleansing; 1.2.4 Data Visualization; 1.2.5 Data Understanding; 1.3 Populations and Data Sampling; 1.3.1 Sampling; 1.3.2 Training, Testing and Validation; 1.4 Overview and Scope; 1.4.1 Prerequisites and Layout; 1.4.2 Data Science Methodology; 1.4.3 Scope of the Book; 2. Solutions to Data Science Problems; 2.1 Conventional Statistical Solutions; 2.1.1 Linear Multiple Regression Model: Continuous Response; 2.1.2 Logistic Regression: Categorical Response; 2.1.3 Variable Selection and Model Building; 2.1.4 Generalized Linear Model (GLM); 2.1.5 Decision Trees; 2.1.6 Bayesian Learning; 2.2 Machine Learning Solutions: Supervised; 2.2.1 k-Nearest Neighbors (kNN); 2.2.2 Ensemble Methods; 2.2.3 Support Vector Machines (SVMs); 2.2.4 Neural Networks (NNs); 2.3 Machine Learning Solutions: Unsupervised; 2.3.1 Hard Clustering; 2.3.2 Soft Clustering; 2.4 Controls, Evaluation and Assessment; 2.4.1 Evaluation Methods; 2.4.2 Metrics for Assessment; 3. What is Dimensionality Reduction (DR)? 3.1 Dimensionality Reduction; 3.2 Major Approaches to Dimensionality Reduction; 3.2.1 Conventional Statistical Approaches; 3.2.2 Geometric Approaches; 3.2.3 Information-theoretic Approaches; 3.2.4 Molecular Computing Approaches; 3.3 The Blessings of Dimensionality; 4. Conventional Statistical Approaches; 4.1 Principal Component Analysis (PCA); 4.1.1 Obtaining the Principal Components; 4.1.2 Singular value decomposition (SVD); 4.2 Nonlinear PCA; 4.2.1 Kernel PCA; 4.2.2 Independent component analysis (ICA); 4.3 Nonnegative Matrix Factorization (NMF); 4.3.1 Approximate Solutions; 4.3.2 Clustering and Other Applications; 4.4 Discriminant Analysis; 4.4.1 Linear discriminant analysis (LDA); 4.4.2 Quadratic discriminant analysis (QDA); 4.5 Sliced Inverse Regression (SIR); 5. Geometric Approaches; 5.1 Introduction to Manifolds; 5.2 Manifold Learning Methods; 5.2.1 Multi-Dimensional Scaling (MDS); 5.2.2 Isometric Mapping (ISOMAP); 5.2.3 t-Stochastic Neighbor Embedding (t-SNE); 5.3 Exploiting Randomness (RND); 6. Information-theoretic Approaches; 6.1 Shannon Entropy (H); 6.2 Reduction by Conditional Entropy; 6.3 Reduction by Iterated Conditional Entropy; 6.4 Reduction by Conditional Entropy on Targets; 6.5 Other Variations; 7. Molecular Computing Approaches; 7.1 Encoding Abiotic Data into DNA; 7.2 Deep Structure of DNA Spaces; 7.2.1 Structural Properties of DNA Spaces; 7.2.2 Noncrosshybridizing (nxh) Bases; 7.3 Reduction by Genomic Signatures; 7.3.1 Background; 7.3.2 Genomic Signatures; 7.4 Reduction by Pmeric Signatures; 8. Statistical Learning Approaches; 8.1 Reduction by Multiple Regression; 8.2 Reduction by Ridge Regression; 8.3 Reduction by Lasso Regression; 8.4 Selection versus Shrinkage; 8.5 Further refinements; 9. Machine Learning Approaches; 9.1 Autoassociative Feature Encoders; 9.1.1 Undercomplete Autoencoders; 9.1.2 Sparse Autoencoders; 9.1.3 Variational Autoencoders; 9.1.4 Dimensionality Reduction in MNIST Images; 9.2 Neural Feature Selection; 9.2.1 Facial Features, Expressions and Displays; 9.2.2 The Cohn-Kanade Dataset; 9.2.3 Primary and Derived Features; 9.3 Other Methods; 10. Metaheuristics of DR Methods; 10.1 Exploiting Feature Grouping; 10.2 Exploiting Domain Knowledge; 10.2.1 What is Domain Knowledge? 10.2.2 Domain Knowledge for Dimensionality Reduction; 10.3 Heuristic Rules for Feature Selection, Extraction and Number; 10.4 About Explainability of Solutions; 10.4.1 What is Explainability? 10.4.2 Explainability in Dimensionality Reduction; 10.5 Choosing Wisely; 10.6 About the Curse of Dimensionality; 10.7 About the No-Free-Lunch Theorem (NFL); 11. Appendices; 11.1 Statistics and Probability Background; 11.1.1 Commonly Used Discrete Distributions; 11.1.2 Commonly Used Continuous Distributions; 11.1.3 Major Results In Probability and Statistics; 11.2 Linear Algebra Background; 11.2.1 Fields, Vector Spaces and Subspaces; 11.2.2 Linear independence, Bases and Dimension; 11.2.3 Linear Transformations and Matrices; 11.2.4 Eigenvalues and Spectral Decomposition; 11.3 Computer Science Background; 11.3.1 Computational Science and Complexity; 11.3.2 Machine Learning; 11.4 Typical Data Science Problems; 11.5 A Sample of Common and Big Datasets; 11.6 Computing Platforms; 11.6.1 The Environment R; 11.6.2 Python environments; References
£43.99
HarperCollins Publishers Inc Everybody Lies
Book Synopsis: New York Times Bestseller. Foreword by Steven Pinker,...
£21.74
Pearson Education (US) Pandas for Everyone
Book Synopsis: Daniel Chen is a graduate student in the Interdisciplinary PhD program in Genetics, Bioinformatics & Computational Biology (GBCB) at Virginia Polytechnic Institute and State University (Virginia Tech). He is involved with Software Carpentry as an instructor and Mentoring Committee Member, and currently serves as the Assessment Committee Chair. He completed his Master's in Public Health at Columbia University Mailman School of Public Health in Epidemiology with a certificate in Advanced Epidemiology and is currently extending his Master's thesis work in the Social and Decision Analytics Laboratory under the Virginia Bioinformatics Institute on attitude diffusion in social networks. Table of Contents: Foreword by Anne M. Brown; Foreword by Jared Lander; Preface; Changes in the Second Edition; Part I: Introduction; Chapter 1. Pandas DataFrame Basics; Learning Objectives; 1.1 Introduction; 1.2 Load Your First Data Set; 1.3 Look at Columns, Rows, and Cells; 1.4 Grouped and Aggregated Calculations; 1.5 Basic Plot; Conclusion; Chapter 2. Pandas Data Structures Basics; Learning Objectives; 2.1 Create Your Own Data; 2.2 The Series; 2.3 The DataFrame; 2.4 Making Changes to Series and DataFrames; 2.5 Exporting and Importing Data; Conclusion; Chapter 3. Plotting Basics; Learning Objectives; 3.1 Why Visualize Data? 3.2 Matplotlib Basics; 3.3 Statistical Graphics Using matplotlib; 3.4 Seaborn; 3.5 Pandas Plotting Method; Conclusion; Chapter 4. Tidy Data; Learning Objectives; Note About This Chapter; 4.1 Columns Contain Values, Not Variables; 4.2 Columns Contain Multiple Variables; 4.3 Variables in Both Rows and Columns; Conclusion; Chapter 5. Apply Functions; Learning Objectives; Note About This Chapter; 5.1 Primer on Functions; 5.2 Apply (Basics); 5.3 Vectorized Functions; 5.4 Lambda Functions (Anonymous Functions); Conclusion; Part II: Data Processing; Chapter 6. Data Assembly; Learning Objectives; 6.1 Combine Data Sets; 6.2 Concatenation; 6.3 Observational Units Across Multiple Tables; 6.4 Merge Multiple Data Sets; Conclusion; Chapter 7. Data Normalization; Learning Objectives; 7.1 Multiple Observational Units in a Table (Normalization); Conclusion; Chapter 8. Groupby Operations: Split-Apply-Combine; Learning Objectives; 8.1 Aggregate; 8.2 Transform; 8.3 Filter; 8.4 The pandas.core.groupby.DataFrameGroupBy object; 8.5 Working with a MultiIndex; Conclusion; Part III: Data Types; Chapter 9. Missing Data; Learning Objectives; 9.1 What Is a NaN Value? 9.2 Where Do Missing Values Come From? 9.3 Working with Missing Data; 9.4 Pandas Built-In NA Missing; Conclusion; Chapter 10. Data Types; Learning Objectives; 10.1 Data Types; 10.2 Converting Types; 10.3 Categorical Data; Conclusion; Chapter 11. Strings and Text Data; Introduction; Learning Objectives; 11.1 Strings; 11.2 String Methods; 11.3 More String Methods; 11.4 String Formatting (F-Strings); 11.5 Regular Expressions (RegEx); 11.6 The regex Library; Conclusion; Chapter 12. Dates and Times; Learning Objectives; 12.1 Python's datetime Object; 12.2 Converting to datetime; 12.3 Loading Data That Include Dates; 12.4 Extracting Date Components; 12.5 Date Calculations and Timedeltas; 12.6 Datetime Methods; 12.7 Getting Stock Data; 12.8 Subsetting Data Based on Dates; 12.9 Date Ranges; 12.10 Shifting Values; 12.11 Resampling; 12.12 Time Zones; 12.13 Arrow for Better Dates and Times; Conclusion; Part IV: Data Modeling; Chapter 13. Linear Regression (Continuous Outcome Variable); 13.1 Simple Linear Regression; 13.2 Multiple Regression; 13.3 Models with Categorical Variables; 13.4 One-Hot Encoding in scikit-learn with Transformer Pipelines; Conclusion; Chapter 14. Generalized Linear Models; About This Chapter; 14.1 Logistic Regression (Binary Outcome Variable); 14.2 Poisson Regression (Count Outcome Variable); 14.3 More Generalized Linear Models; Conclusion; Chapter 15. Survival Analysis; 15.1 Survival Data; 15.2 Kaplan Meier Curves; 15.3 Cox Proportional Hazard Model; Conclusion; Chapter 16. Model Diagnostics; 16.1 Residuals; 16.2 Comparing Multiple Models; 16.3 k-Fold Cross-Validation; Conclusion; Chapter 17. Regularization; 17.1 Why Regularize? 17.2 LASSO Regression; 17.3 Ridge Regression; 17.4 Elastic Net; 17.5 Cross-Validation; Conclusion; Chapter 18. Clustering; 18.1 k-Means; 18.2 Hierarchical Clustering; Conclusion; Part V: Conclusion; Chapter 19. Life Outside of Pandas; 19.1 The (Scientific) Computing Stack; 19.2 Performance; 19.3 Dask; 19.4 Siuba; 19.5 Ibis; 19.6 Polars; 19.7 PyJanitor; 19.8 Pandera; 19.9 Machine Learning; 19.10 Publishing; 19.11 Dashboards; Conclusion; Chapter 20. It's Dangerous To Go Alone! 20.1 Local Meetups; 20.2 Conferences; 20.3 The Carpentries; 20.4 Podcasts; 20.5 Other Resources; Conclusion; Appendices; A. Concept Maps; B. Installation and Setup; C. Command Line; D. Project Templates; E. Using Python; F. Working Directories; G. Environments; H. Install Packages; I. Importing Libraries; J. Code Style; K. Containers: Lists, Tuples, and Dictionaries; L. Slice Values; M. Loops; N. Comprehensions; O. Functions; P. Ranges and Generators; Q. Multiple Assignment; R. Numpy ndarray; S. Classes; T. SettingWithCopyWarning; U. Method Chaining; V. Timing Code; W. String Formatting; X. Conditionals (if-elif-else); Y. New York ACS Logistic Regression Example; Z. Replicating Results in R; Index
£34.19
John Wiley & Sons Inc Making Sense of Data I
Book Synopsis: Praise for the First Edition: '...a well-written book on data analysis and data mining that provides an excellent foundation.' CHOICE. This is a must-read book for learning practical statistics and data analysis. Table of Contents: PREFACE; 1 INTRODUCTION; 1.1 Overview; 1.2 Sources of Data; 1.3 Process for Making Sense of Data; 1.4 Overview of Book; 1.5 Summary; Further Reading; 2 DESCRIBING DATA; 2.1 Overview; 2.2 Observations and Variables; 2.3 Types of Variables; 2.4 Central Tendency; 2.5 Distribution of the Data; 2.6 Confidence Intervals; 2.7 Hypothesis Tests; Exercises; Further Reading; 3 PREPARING DATA TABLES; 3.1 Overview; 3.2 Cleaning the Data; 3.3 Removing Observations and Variables; 3.4 Generating Consistent Scales Across Variables; 3.5 New Frequency Distribution; 3.6 Converting Text to Numbers; 3.7 Converting Continuous Data to Categories; 3.8 Combining Variables; 3.9 Generating Groups; 3.10 Preparing Unstructured Data; Exercises; Further Reading; 4 UNDERSTANDING RELATIONSHIPS; 4.1 Overview; 4.2 Visualizing Relationships Between Variables; 4.3 Calculating Metrics About Relationships; Exercises; Further Reading; 5 IDENTIFYING AND UNDERSTANDING GROUPS; 5.1 Overview; 5.2 Clustering; 5.3 Association Rules; 5.4 Learning Decision Trees from Data; Exercises; Further Reading; 6 BUILDING MODELS FROM DATA; 6.1 Overview; 6.2 Linear Regression; 6.3 Logistic Regression; 6.4 k-Nearest Neighbors; 6.5 Classification and Regression Trees; 6.6 Other Approaches; Exercises; Further Reading; APPENDIX A ANSWERS TO EXERCISES; APPENDIX B HANDS-ON TUTORIALS; B.1 Tutorial Overview; B.2 Access and Installation; B.3 Software Overview; B.4 Reading in Data; B.5 Preparation Tools; B.6 Tables and Graph Tools; B.7 Statistics Tools; B.8 Grouping Tools; B.9 Models Tools; B.10 Apply Model; B.11 Exercises; BIBLIOGRAPHY; INDEX
£59.36
Taylor & Francis Inc Leadership Strategies in the Age of Big Data
Book Synopsis: Harnessing the power of technology is one of the key measures of effective leadership. Leadership Strategies in the Age of Big Data, Algorithms, and Analytics will help leaders think and act like strategists to maintain a leading-edge competitive advantage. Written by a leading expert in the field, this book provides new insights on how to successfully transition companies by aligning an organization's culture to accept the benefits of digital technology. The author emphasizes the importance of creating a team spirit with employees to embrace the digital age and develop strategic business plans that pinpoint new markets for growth, strengthen customer relationships, and develop competitive strategies. Understanding how to deal with inconsistencies when facts generated by data analytics disagree with your own experience, intuition, and knowledge of the competitive situation is key to successful leadership. Table of Contents: Chapter 1. Developing Effective Leadership: The human interface with big data, algorithms, and analytics. Chapter 2. Initiate speed of implementation to maintain a digital advantage. Chapter 3. Apply analytics to concentrate at the decisive point for maximum impact. Chapter 4. Activate maneuver and indirect approach to create surprise. Chapter 5. Employ big data to determine the culminating point of a competitive campaign. Chapter 6. Use data to determine how long to maintain offensive action. Chapter 7. Align big data with the corporate culture. Chapter 8. Decide on a bold approach or cautious restraint based on data analytics. Chapter 9. Utilize big data, algorithms, and analytics to maximize use of competitor intelligence. Chapter 10. Choose offensive and defensive strategies by understanding the human interaction. Chapter 11. Factor-in friction and luck that make analytics a gamble. Chapter 12. Use data to neutralize the competitor’s effectiveness. Appendix. Strategic Business Plan outline.
£47.49
Cambridge University Press The Art of Feature Engineering
Book Synopsis: When machine learning engineers work with data sets, they may find the results aren't as good as they need. Instead of improving the model or collecting more data, they can use the feature engineering process to help improve results by modifying the data's features to better capture the nature of the problem. This practical guide to feature engineering is an essential addition to any data scientist's or machine learning engineer's toolbox, providing new ideas on how to improve the performance of a machine learning solution. Beginning with the basic concepts and techniques, the text builds up to a unique cross-domain approach that spans data on graphs, texts, time series, and images, with fully worked out case studies. Key topics include binning, out-of-fold estimation, feature selection, dimensionality reduction, and encoding variable-length data. The full source code for the case studies is available on a companion website as Python Jupyter notebooks. Trade Review: 'Pablo Duboue is a true grandmaster of the art and science of feature engineering. His foundational contributions to the creation of IBM Watson were a critical component of its success. Now readers can benefit from his expertise. His book provides deep insights into how to develop, assess, combine, and enhance machine learning features. Of particular interest to advanced practitioners is his discussion of feature engineering and deep learning; there is a pervasive myth in the industry that deep learning and big data have made feature engineering obsolete, but the book explains why that is often incorrect for real-world computing applications and explains the relationship between building effective features and deep neural network architectures. The book engages with countless other basic and advanced topics in the area of machine learning and feature engineering, making it a valuable resource for machine learning practitioners of all levels of experience.' J. William Murdock, IBM. 'Feature engineering is the process of identifying, selecting and evaluating input variables to statistical and machine learning models for a given problem. Pablo Duboue's The Art of Feature Engineering introduces the process with rich detail from a practitioner's point of view, and adds new insights through four input data scenarios for the same prediction task. Highly recommended!' Nelson Correa, Andinum Inc. 'TAoFE is a comprehensive handbook - sure to be a hit with data science practitioners. With highly accessible and didactic explanations of complex concepts, the book represents the state-of-the-art, and shows in practical terms how it applies to a wide range of real-world case studies.' Gavin Brown, University of Manchester. 'This book provides a large catalogue of feature manipulation techniques along with non-trivial examples to illustrate their applicability and impact on performance. It could be suitable as a textbook for an upper level undergrad or graduate text mining or multimodal data analysis class. Recent graduates starting in field data mining and text analysis will find this a useful text.' Wlodek Zadrozny, University of North Carolina. Table of Contents: Part I. Fundamentals: 1. Introduction; 2. Features, combined; 3. Features, expanded; 4. Features, reduced; 5. Advanced topics; Part II. Case Studies: 6. Graph data; 7. Timestamped data; 8. Textual data; 9. Image data; 10. Other domains.
£39.89