Description

Book Synopsis

Move your career forward with AWS certification! Prepare for the AWS Certified Data Analytics Specialty Exam with this thorough study guide.

This comprehensive study guide will help you assess your technical skills and prepare for the updated AWS Certified Data Analytics exam. Earning this AWS certification will confirm your expertise in designing and implementing AWS services to derive value from data. The AWS Certified Data Analytics Study Guide: Specialty (DAS-C01) Exam is designed for business analysts and IT professionals who perform complex Big Data analyses.

This AWS Specialty Exam guide gets you ready for certification testing with expert content, real-world knowledge, key exam concepts, and topic reviews. Gain confidence by studying the subject areas and working through the practice questions. Big data concepts covered in the guide include:

  • Collection
  • Storage
  • Processing
  • Analysis
  • Visualization
  • D

    Table of Contents

    Introduction xxi

    Assessment Test xxx

    Chapter 1 History of Analytics and Big Data 1

    Evolution of Analytics Architecture Over the Years 3

    The New World Order 5

    Analytics Pipeline 6

    Data Sources 7

    Collection 8

    Storage 8

    Processing and Analysis 9

    Visualization, Predictive and Prescriptive Analytics 9

    The Big Data Reference Architecture 10

    Data Characteristics: Hot, Warm, and Cold 11

    Collection/Ingest 12

    Storage 13

    Process/Analyze 14

    Consumption 15

    Data Lakes and Their Relevance in Analytics 16

    What is a Data Lake? 16

    Building a Data Lake on AWS 19

    Step 1: Choosing the Right Storage – Amazon S3 Is the Base 19

    Step 2: Data Ingestion – Moving the Data into the Data Lake 21

    Step 3: Cleanse, Prep, and Catalog the Data 22

    Step 4: Secure the Data and Metadata 23

    Step 5: Make Data Available for Analytics 23

    Using Lake Formation to Build a Data Lake on AWS 23

    Exam Objectives 24

    Objective Map 25

    Assessment Test 27

    References 29

    Chapter 2 Data Collection 31

    Exam Objectives 32

    AWS IoT 33

    Common Use Cases for AWS IoT 35

    How AWS IoT Works 36

    Amazon Kinesis 38

    Amazon Kinesis Introduction 40

    Amazon Kinesis Data Streams 40

    Amazon Kinesis Data Analytics 54

    Amazon Kinesis Video Streams 61

    AWS Glue 64

    Glue Data Catalog 66

    Glue Crawlers 68

    Authoring ETL Jobs 69

    Executing ETL Jobs 71

    Change Data Capture with Glue Bookmarks 71

    Use Cases for AWS Glue 72

    Amazon SQS 72

    Amazon Data Migration Service 74

    What is AWS DMS Anyway? 74

    What Does AWS DMS Support? 75

    AWS Data Pipeline 77

    Pipeline Definition 77

    Pipeline Schedules 78

    Task Runner 79

    Large-Scale Data Transfer Solutions 81

    AWS Snowcone 81

    AWS Snowball 82

    AWS Snowmobile 85

    AWS Direct Connect 86

    Summary 87

    Review Questions 88

    References 90

    Exercises & Workshops 91

    Chapter 3 Data Storage 93

    Introduction 94

    Amazon S3 95

    Amazon S3 Data Consistency Model 96

    Data Lake and S3 97

    Data Replication in Amazon S3 100

    Server Access Logging in Amazon S3 101

    Partitioning, Compression, and File Formats on S3 101

    Amazon S3 Glacier 103

    Vault 103

    Archive 104

    Amazon DynamoDB 104

    Amazon DynamoDB Data Types 105

    Amazon DynamoDB Core Concepts 108

    Read/Write Capacity Mode in DynamoDB 108

    DynamoDB Auto Scaling and Reserved Capacity 111

    Read Consistency and Global Tables 111

    Amazon DynamoDB: Indexing and Partitioning 113

    Amazon DynamoDB Accelerator 114

    Amazon DynamoDB Streams 115

    Amazon DynamoDB Streams – Kinesis Adapter 116

    Amazon DocumentDB 117

    Why a Document Database? 117

    Amazon DocumentDB Overview 119

    Amazon Document DB Architecture 120

    Amazon DocumentDB Interfaces 120

    Graph Databases and Amazon Neptune 121

    Amazon Neptune Overview 122

    Amazon Neptune Use Cases 123

    Storage Gateway 123

    Hybrid Storage Requirements 123

    AWS Storage Gateway 125

    Amazon EFS 127

    Amazon EFS Use Cases 130

    Interacting with Amazon EFS 132

    Amazon EFS Security Model 132

    Backing Up Amazon EFS 132

    Amazon FSx for Lustre 133

    Key Benefits of Amazon FSx for Lustre 134

    Use Cases for Lustre 135

    AWS Transfer for SFTP 135

    Summary 136

    Exercises 137

    Review Questions 140

    Further Reading 142

    References 142

    Chapter 4 Data Processing and Analysis 143

    Introduction 144

    Types of Analytical Workloads 144

    Amazon Athena 146

    Apache Presto 147

    Apache Hive 148

    Amazon Athena Use Cases and Workloads 149

    Amazon Athena DDL, DML, and DCL 150

    Amazon Athena Workgroups 151

    Amazon Athena Federated Query 153

    Amazon Athena Custom UDFs 154

    Using Machine Learning with Amazon Athena 154

    Amazon EMR 155

    Apache Hadoop Overview 156

    Amazon EMR Overview 157

    Apache Hadoop on Amazon EMR 158

    EMRFS 166

    Bootstrap Actions and Custom AMI 167

    Security on EMR 167

    EMR Notebooks 168

    Apache Hive and Apache Pig on Amazon EMR 169

    Apache Spark on Amazon EMR 174

    Apache HBase on Amazon EMR 182

    Apache Flink, Apache Mahout, and Apache MXNet 184

    Choosing the Right Analytics Tool 186

    Amazon Elasticsearch Service 188

    When to Use Elasticsearch 188

    Elasticsearch Core Concepts (the ELK Stack) 189

    Amazon Elasticsearch Service 191

    Amazon Redshift 192

    What is Data Warehousing? 192

    What is Redshift? 193

    Redshift Architecture 195

    Redshift AQUA 198

    Redshift Scalability 199

    Data Modeling in Redshift 205

    Data Loading and Unloading 213

    Query Optimization in Redshift 217

    Security in Redshift 221

    Kinesis Data Analytics 225

    How Does It Work? 226

    What is Kinesis Data Analytics for Java? 228

    Comparing Batch Processing Services 229

    Comparing Orchestration Options on AWS 230

    AWS Step Functions 230

    Comparing Different ETL Orchestration Options 230

    Summary 231

    Exam Essentials 232

    Exercises 232

    Review Questions 235

    References 237

    Recommended Workshops 237

    Amazon Athena Blogs 238

    Amazon Redshift Blogs 240

    Amazon EMR Blogs 241

    Amazon Elasticsearch Blog 241

    Amazon Redshift References and Further Reading 242

    Chapter 5 Data Visualization 243

    Introduction 244

    Data Consumers 245

    Data Visualization Options 246

    Amazon QuickSight 247

    Getting Started 248

    Working with Data 250

    Data Preparation 255

    Data Analysis 256

    Data Visualization 258

    Machine Learning Insights 261

    Building Dashboards 262

    Embedding QuickSight Objects into Other Applications 264

    Administration 265

    Security 266

    Other Visualization Options 267

    Predictive Analytics 270

    What is Predictive Analytics? 270

    The AWS ML Stack 271

    Summary 273

    Exam Essentials 273

    Exercises 274

    Review Questions 275

    References 276

    Additional Reading Material 276

    Chapter 6 Data Security 279

    Introduction 280

    Shared Responsibility Model 280

    Security Services on AWS 282

    AWS IAM Overview 285

    IAM User 285

    IAM Groups 286

    IAM Roles 287

    Amazon EMR Security 289

    Public Subnet 290

    Private Subnet 291

    Security Configurations 293

    Block Public Access 298

    VPC Subnets 298

    Security Options during Cluster Creation 299

    EMR Security Summary 300

    Amazon S3 Security 301

    Managing Access to Data in Amazon S3 301

    Data Protection in Amazon S3 305

    Logging and Monitoring with Amazon S3 306

    Best Practices for Security on Amazon S3 308

    Amazon Athena Security 308

    Managing Access to Amazon Athena 309

    Data Protection in Amazon Athena 310

    Data Encryption in Amazon Athena 311

    Amazon Athena and AWS Lake Formation 312

    Amazon Redshift Security 312

    Levels of Security within Amazon Redshift 313

    Data Protection in Amazon Redshift 315

    Redshift Auditing 316

    Redshift Logging 317

    Amazon Elasticsearch Security 317

    Elasticsearch Network Configuration 318

    VPC Access 318

    Accessing Amazon Elasticsearch and Kibana 319

    Data Protection in Amazon Elasticsearch 322

    Amazon Kinesis Security 325

    Managing Access to Amazon Kinesis 325

    Data Protection in Amazon Kinesis 326

    Amazon Kinesis Best Practices 326

    Amazon QuickSight Security 327

    Managing Data Access with Amazon QuickSight 327

    Data Protection 328

    Logging and Monitoring 329

    Security Best Practices 329

    Amazon DynamoDB Security 329

    Access Management in DynamoDB 329

    IAM Policy with Fine-Grained Access Control 330

    Identity Federation 331

    How to Access Amazon DynamoDB 332

    Data Protection with DynamoDB 332

    Monitoring and Logging with DynamoDB 333

    Summary 334

    Exam Essentials 334

    Exercises/Workshops 334

    Review Questions 336

    References and Further Reading 337

    Appendix Answers to Review Questions 339

    Chapter 1: History of Analytics and Big Data 340

    Chapter 2: Data Collection 342

    Chapter 3: Data Storage 343

    Chapter 4: Data Processing and Analysis 344

    Chapter 5: Data Visualization 346

    Chapter 6: Data Security 346

    Index 349

AWS Certified Data Analytics Study Guide

    Product form

    £35.62

    Includes FREE delivery

    RRP £47.50 – you save £11.88 (25%)


    A Paperback / softback by Asif Abbasi


      Publisher: John Wiley & Sons Inc
      Publication Date: 08/02/2021
      ISBN13: 978-1119649472
      ISBN10: 1119649471

