Description

Book Synopsis
Thomas Erl is a top-selling IT author, founder of Arcitura Education and series editor of the Prentice Hall Service Technology Series from Thomas Erl. With more than 200,000 copies in print worldwide, his books have become international bestsellers and have been formally endorsed by senior members of major IT organizations, such as IBM, Microsoft, Oracle, Intel, Accenture, IEEE, HL7, MITRE, SAP, CISCO, HP and many others. As CEO of Arcitura Education Inc., Thomas has led the development of curricula for the internationally recognized Big Data Science Certified Professional (BDSCP), Cloud Certified Professional (CCP) and SOA Certified Professional (SOACP) accreditation programs, which have established a series of formal, vendor-neutral industry certifications obtained by thousands of IT professionals around the world. Thomas has toured more than 20 countries as a speaker and instructor. More than 100 articles and interviews by Thomas have been published in numerous publica

Table of Contents
Acknowledgments xvii
Reader Services xviii
PART I: THE FUNDAMENTALS OF BIG DATA
Chapter 1: Understanding Big Data 3

Concepts and Terminology 5
Datasets 5
Data Analysis 6
Data Analytics 6
Descriptive Analytics 8
Diagnostic Analytics 9
Predictive Analytics 10
Prescriptive Analytics 11
Business Intelligence (BI) 12
Key Performance Indicators (KPI) 12
Big Data Characteristics 13
Volume 14
Velocity 14
Variety 15
Veracity 16
Value 16
Different Types of Data 17
Structured Data 18
Unstructured Data 19
Semi-structured Data 19
Metadata 20
Case Study Background 20
History 20
Technical Infrastructure and Automation Environment 21
Business Goals and Obstacles 22
Case Study Example 24
Identifying Data Characteristics 26
Volume 26
Velocity 26
Variety 26
Veracity 26
Value 27
Identifying Types of Data 27
Chapter 2: Business Motivations and Drivers for Big Data Adoption 29
Marketplace Dynamics 30
Business Architecture 33
Business Process Management 36
Information and Communications Technology 37
Data Analytics and Data Science 37
Digitization 38
Affordable Technology and Commodity Hardware 38
Social Media 39
Hyper-Connected Communities and Devices 40
Cloud Computing 40
Internet of Everything (IoE) 42
Case Study Example 43
Chapter 3: Big Data Adoption and Planning Considerations 47
Organization Prerequisites 49
Data Procurement 49
Privacy 49
Security 50
Provenance 51
Limited Realtime Support 52
Distinct Performance Challenges 53
Distinct Governance Requirements 53
Distinct Methodology 53
Clouds 54
Big Data Analytics Lifecycle 55
Business Case Evaluation 56
Data Identification 57
Data Acquisition and Filtering 58
Data Extraction 60
Data Validation and Cleansing 62
Data Aggregation and Representation 64
Data Analysis 66
Data Visualization 68
Utilization of Analysis Results 69
Case Study Example 71
Big Data Analytics Lifecycle 73
Business Case Evaluation 73
Data Identification 74
Data Acquisition and Filtering 74
Data Extraction 74
Data Validation and Cleansing 75
Data Aggregation and Representation 75
Data Analysis 75
Data Visualization 76
Utilization of Analysis Results 76
Chapter 4: Enterprise Technologies and Big Data Business Intelligence 77
Online Transaction Processing (OLTP) 78
Online Analytical Processing (OLAP) 79
Extract Transform Load (ETL) 79
Data Warehouses 80
Data Marts 81
Traditional BI 82
Ad-hoc Reports 82
Dashboards 82
Big Data BI 84
Traditional Data Visualization 84
Data Visualization for Big Data 85
Case Study Example 86
Enterprise Technology 86
Big Data Business Intelligence 87
PART II: STORING AND ANALYZING BIG DATA
Chapter 5: Big Data Storage Concepts 91

Clusters 93
File Systems and Distributed File Systems 93
NoSQL 94
Sharding 95
Replication 97
Master-Slave 98
Peer-to-Peer 100
Sharding and Replication 103
Combining Sharding and Master-Slave Replication 104
Combining Sharding and Peer-to-Peer Replication 105
CAP Theorem 106
ACID 108
BASE 113
Case Study Example 117
Chapter 6: Big Data Processing Concepts 119
Parallel Data Processing 120
Distributed Data Processing 121
Hadoop 122
Processing Workloads 122
Batch 123
Transactional 123
Cluster 124
Processing in Batch Mode 125
Batch Processing with MapReduce 125
Map and Reduce Tasks 126
Map 127
Combine 127
Partition 129
Shuffle and Sort 130
Reduce 131
A Simple MapReduce Example 133
Understanding MapReduce Algorithms 134
Processing in Realtime Mode 137
Speed Consistency Volume (SCV) 137
Event Stream Processing 140
Complex Event Processing 141
Realtime Big Data Processing and SCV 141
Realtime Big Data Processing and MapReduce 142
Case Study Example 143
Processing Workloads 143
Processing in Batch Mode 143
Processing in Realtime 144
Chapter 7: Big Data Storage Technology 145
On-Disk Storage Devices 147
Distributed File Systems 147
RDBMS Databases 149
NoSQL Databases 152
Characteristics 152
Rationale 153
Types 154
Key-Value 156
Document 157
Column-Family 159
Graph 160
NewSQL Databases 163
In-Memory Storage Devices 163
In-Memory Data Grids 166
Read-through 170
Write-through 170
Write-behind 172
Refresh-ahead 172
In-Memory Databases 175
Case Study Example 179
Chapter 8: Big Data Analysis Techniques 181
Quantitative Analysis 183
Qualitative Analysis 184
Data Mining 184
Statistical Analysis 184
A/B Testing 185
Correlation 186
Regression 188
Machine Learning 190
Classification (Supervised Machine Learning) 190
Clustering (Unsupervised Machine Learning) 191
Outlier Detection 192
Filtering 193
Semantic Analysis 195
Natural Language Processing 195
Text Analytics 196
Sentiment Analysis 197
Visual Analysis 198
Heat Maps 198
Time Series Plots 200
Network Graphs 201
Spatial Data Mapping 202
Case Study Example 204
Correlation 204
Regression 204
Time Series Plot 205
Clustering 205
Classification 205
Appendix A: Case Study Conclusion 207
About the Authors 211

Thomas Erl 211
Wajid Khattak 211
Paul Buhler 212
Index 213

Big Data Fundamentals

Product form

£28.02

Includes FREE delivery

RRP £29.49 – you save £1.47 (4%)

Order before 4pm today for delivery by Fri 19 Dec 2025.

A Paperback / softback by Thomas Erl, Wajid Khattak, Paul Buhler

Out of stock


    View other formats and editions of Big Data Fundamentals by Thomas Erl

    Publisher: Pearson Education (US)
    Publication Date: 13/01/2016
    ISBN13: 9780134291079, 978-0134291079
    ISBN10: 0134291077

    Description

    Book Synopsis
    Thomas Erl is a top-selling IT author, founder of Arcitura Education and series editor of the Prentice Hall Service Technology Series from Thomas Erl. With more than 200,000 copies in print worldwide, his books have become international bestsellers and have been formally endorsed by senior members of major IT organizations, such as IBM, Microsoft, Oracle, Intel, Accenture, IEEE, HL7, MITRE, SAP, CISCO, HP and many others. As CEO of Arcitura Education Inc., Thomas has led the development of curricula for the internationally recognized Big Data Science Certified Professional (BDSCP), Cloud Certified Professional (CCP) and SOA Certified Professional (SOACP) accreditation programs, which have established a series of formal, vendor-neutral industry certifications obtained by thousands of IT professionals around the world. Thomas has toured more than 20 countries as a speaker and instructor. More than 100 articles and interviews by Thomas have been published in numerous publica

    Table of Contents
    Acknowledgments xvii
    Reader Services xviii
    PART I: THE FUNDAMENTALS OF BIG DATA
    Chapter 1: Understanding Big Data 3

    Concepts and Terminology 5
    Datasets 5
    Data Analysis 6
    Data Analytics 6
    Descriptive Analytics 8
    Diagnostic Analytics 9
    Predictive Analytics 10
    Prescriptive Analytics 11
    Business Intelligence (BI) 12
    Key Performance Indicators (KPI) 12
    Big Data Characteristics 13
    Volume 14
    Velocity 14
    Variety 15
    Veracity 16
    Value 16
    Different Types of Data 17
    Structured Data 18
    Unstructured Data 19
    Semi-structured Data 19
    Metadata 20
    Case Study Background 20
    History 20
    Technical Infrastructure and Automation Environment 21
    Business Goals and Obstacles 22
    Case Study Example 24
    Identifying Data Characteristics 26
    Volume 26
    Velocity 26
    Variety 26
    Veracity 26
    Value 27
    Identifying Types of Data 27
    Chapter 2: Business Motivations and Drivers for Big Data Adoption 29
    Marketplace Dynamics 30
    Business Architecture 33
    Business Process Management 36
    Information and Communications Technology 37
    Data Analytics and Data Science 37
    Digitization 38
    Affordable Technology and Commodity Hardware 38
    Social Media 39
    Hyper-Connected Communities and Devices 40
    Cloud Computing 40
    Internet of Everything (IoE) 42
    Case Study Example 43
    Chapter 3: Big Data Adoption and Planning Considerations 47
    Organization Prerequisites 49
    Data Procurement 49
    Privacy 49
    Security 50
    Provenance 51
    Limited Realtime Support 52
    Distinct Performance Challenges 53
    Distinct Governance Requirements 53
    Distinct Methodology 53
    Clouds 54
    Big Data Analytics Lifecycle 55
    Business Case Evaluation 56
    Data Identification 57
    Data Acquisition and Filtering 58
    Data Extraction 60
    Data Validation and Cleansing 62
    Data Aggregation and Representation 64
    Data Analysis 66
    Data Visualization 68
    Utilization of Analysis Results 69
    Case Study Example 71
    Big Data Analytics Lifecycle 73
    Business Case Evaluation 73
    Data Identification 74
    Data Acquisition and Filtering 74
    Data Extraction 74
    Data Validation and Cleansing 75
    Data Aggregation and Representation 75
    Data Analysis 75
    Data Visualization 76
    Utilization of Analysis Results 76
    Chapter 4: Enterprise Technologies and Big Data Business Intelligence 77
    Online Transaction Processing (OLTP) 78
    Online Analytical Processing (OLAP) 79
    Extract Transform Load (ETL) 79
    Data Warehouses 80
    Data Marts 81
    Traditional BI 82
    Ad-hoc Reports 82
    Dashboards 82
    Big Data BI 84
    Traditional Data Visualization 84
    Data Visualization for Big Data 85
    Case Study Example 86
    Enterprise Technology 86
    Big Data Business Intelligence 87
    PART II: STORING AND ANALYZING BIG DATA
    Chapter 5: Big Data Storage Concepts 91

    Clusters 93
    File Systems and Distributed File Systems 93
    NoSQL 94
    Sharding 95
    Replication 97
    Master-Slave 98
    Peer-to-Peer 100
    Sharding and Replication 103
    Combining Sharding and Master-Slave Replication 104
    Combining Sharding and Peer-to-Peer Replication 105
    CAP Theorem 106
    ACID 108
    BASE 113
    Case Study Example 117
    Chapter 6: Big Data Processing Concepts 119
    Parallel Data Processing 120
    Distributed Data Processing 121
    Hadoop 122
    Processing Workloads 122
    Batch 123
    Transactional 123
    Cluster 124
    Processing in Batch Mode 125
    Batch Processing with MapReduce 125
    Map and Reduce Tasks 126
    Map 127
    Combine 127
    Partition 129
    Shuffle and Sort 130
    Reduce 131
    A Simple MapReduce Example 133
    Understanding MapReduce Algorithms 134
    Processing in Realtime Mode 137
    Speed Consistency Volume (SCV) 137
    Event Stream Processing 140
    Complex Event Processing 141
    Realtime Big Data Processing and SCV 141
    Realtime Big Data Processing and MapReduce 142
    Case Study Example 143
    Processing Workloads 143
    Processing in Batch Mode 143
    Processing in Realtime 144
    Chapter 7: Big Data Storage Technology 145
    On-Disk Storage Devices 147
    Distributed File Systems 147
    RDBMS Databases 149
    NoSQL Databases 152
    Characteristics 152
    Rationale 153
    Types 154
    Key-Value 156
    Document 157
    Column-Family 159
    Graph 160
    NewSQL Databases 163
    In-Memory Storage Devices 163
    In-Memory Data Grids 166
    Read-through 170
    Write-through 170
    Write-behind 172
    Refresh-ahead 172
    In-Memory Databases 175
    Case Study Example 179
    Chapter 8: Big Data Analysis Techniques 181
    Quantitative Analysis 183
    Qualitative Analysis 184
    Data Mining 184
    Statistical Analysis 184
    A/B Testing 185
    Correlation 186
    Regression 188
    Machine Learning 190
    Classification (Supervised Machine Learning) 190
    Clustering (Unsupervised Machine Learning) 191
    Outlier Detection 192
    Filtering 193
    Semantic Analysis 195
    Natural Language Processing 195
    Text Analytics 196
    Sentiment Analysis 197
    Visual Analysis 198
    Heat Maps 198
    Time Series Plots 200
    Network Graphs 201
    Spatial Data Mapping 202
    Case Study Example 204
    Correlation 204
    Regression 204
    Time Series Plot 205
    Clustering 205
    Classification 205
    Appendix A: Case Study Conclusion 207
    About the Authors 211

    Thomas Erl 211
    Wajid Khattak 211
    Paul Buhler 212
    Index 213

    Recently viewed products

    © 2025 Book Curl

      • American Express
      • Apple Pay
      • Diners Club
      • Discover
      • Google Pay
      • Maestro
      • Mastercard
      • PayPal
      • Shop Pay
      • Union Pay
      • Visa

      Login

      Forgot your password?

      Don't have an account yet?
      Create account