Description

Book Synopsis
Prepare for the Azure Data Engineering certification, and an exciting new career in analytics, with this must-have study aid.

In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech.

In the book, you'll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. As you learn to integrate, transform, and consolidate data from various structured and unstructured data systems into a structure that is suitable for building analytics solutions, you'll get up to speed quickly and efficiently with Sybex's easy-to-use study aids and tools.

This Study Guide also offers:

Career-ready advice for anyone hoping to ace their first data engineering job interview and excel in their first day in the field
Indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety
Complimentary access to Sybex's expansive online study tools, accessible across multiple devices, and offering access to hundreds of bonus practice questions, electronic flashcards, and a searchable, digital glossary of key terms

A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or make a career change into a popular new area of tech.

Table of Contents

Introduction xxvii

Part I Azure Data Engineer Certification and Azure Products 1

Chapter 1 Gaining the Azure Data Engineer Associate Certification 3

The Journey to Certification 7

How to Pass Exam DP-203 8

Understanding the Exam Expectations and Requirements 9

Use Azure Daily 17

Read Azure Articles to Stay Current 17

Have an Understanding of All Azure Products 20

Azure Product Name Recognition 21

Azure Data Analytics 23

Azure Synapse Analytics 23

Azure Databricks 26

Azure HDInsight 28

Azure Analysis Services 30

Azure Data Factory 31

Azure Event Hubs 33

Azure Stream Analytics 34

Other Products 35

Azure Storage Products 36

Azure Data Lake Storage 37

Azure Storage 40

Other Products 42

Azure Databases 43

Azure Cosmos DB 43

Azure SQL Server Products 46

Additional Azure Databases 46

Other Products 47

Azure Security 48

Azure Active Directory 48

Role-Based Access Control 51

Attribute-Based Access Control 53

Azure Key Vault 53

Other Products 55

Azure Networking 56

Virtual Networks 56

Other Products 59

Azure Compute 59

Azure Virtual Machines 59

Azure Virtual Machine Scale Sets 60

Azure App Service Web Apps 60

Azure Functions 60

Azure Batch 60

Azure Management and Governance 60

Azure Monitor 61

Azure Purview 61

Azure Policy 62

Azure Blueprints (Preview) 62

Azure Lighthouse 62

Azure Cost Management and Billing 62

Other Products 63

Summary 64

Exam Essentials 64

Review Questions 66

Chapter 2 CREATE DATABASE dbName; GO 69

The Brainjammer 70

A Historical Look at Data 71

Variety 73

Velocity 74

Volume 74

Data Locations 74

Data File Formats 75

Data Structures, Types, and Concepts 83

Data Structures 83

Data Types and Management 92

Data Concepts 95

Data Programming and Querying for Data Engineers 125

Data Programming 126

Querying Data 143

Understanding Big Data Processing 169

Big Data Stages 169

ETL, ELT, ELTL 174

Analytics Types 175

Big Data Layers 176

Summary 177

Exam Essentials 177

Review Questions 179

Part II Design and Implement Data Storage 181

Chapter 3 Data Sources and Ingestion 183

Where Does Data Come From? 185

Design a Data Storage Structure 189

Design an Azure Data Lake Solution 190

Recommended File Types for Storage 198

Recommended File Types for Analytical Queries 199

Design for Efficient Querying 200

Design for Data Pruning 203

Design a Folder Structure That Represents the Levels of Data Transformation 203

Design a Distribution Strategy 205

Design a Data Archiving Solution 206

Design a Partition Strategy 207

Design a Partition Strategy for Files 209

Design a Partition Strategy for Analytical Workloads 210

Design a Partition Strategy for Efficiency and Performance 211

Design a Partition Strategy for Azure Synapse Analytics 211

Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2 212

Design the Serving/Data Exploration Layer 213

Design Star Schemas 214

Design Slowly Changing Dimensions 215

Design a Dimensional Hierarchy 219

Design a Solution for Temporal Data 220

Design for Incremental Loading 222

Design Analytical Stores 223

Design Metastores in Azure Synapse Analytics and Azure Databricks 224

The Ingestion of Data into a Pipeline 228

Azure Synapse Analytics 228

Azure Data Factory 268

Azure Databricks 275

Event Hubs and IoT Hub 301

Azure Stream Analytics 303

Apache Kafka for HDInsight 314

Migrating and Moving Data 316

Summary 317

Exam Essentials 317

Review Questions 319

Chapter 4 The Storage of Data 321

Implement Physical Data Storage Structures 322

Implement Compression 322

Implement Partitioning 325

Implement Sharding 328

Implement Different Table Geometries with Azure Synapse Analytics Pools 329

Implement Data Redundancy 331

Implement Distributions 341

Implement Data Archiving 342

Azure Synapse Analytics Develop Hub 346

Implement Logical Data Structures 360

Build a Temporal Data Solution 361

Build a Slowly Changing Dimension 365

Build a Logical Folder Structure 368

Build External Tables 369

Implement File and Folder Structures for Efficient Querying and Data Pruning 372

Implement a Partition Strategy 375

Implement a Partition Strategy for Files 376

Implement a Partition Strategy for Analytical Workloads 377

Implement a Partition Strategy for Streaming Workloads 378

Implement a Partition Strategy for Azure Synapse Analytics 378

Design and Implement the Data Exploration Layer 379

Deliver Data in a Relational Star Schema 379

Deliver Data in Parquet Files 385

Maintain Metadata 386

Implement a Dimensional Hierarchy 386

Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster 388

Recommend Azure Synapse Analytics Database Templates 389

Implement Azure Synapse Analytics Database Templates 389

Additional Data Storage Topics 390

Storing Raw Data in Azure Databricks for Transformation 390

Storing Data Using Azure HDInsight 392

Storing Prepared, Trained, and Modeled Data 393

Summary 394

Exam Essentials 395

Review Questions 396

Part III Develop Data Processing 399

Chapter 5 Transform, Manage, and Prepare Data 401

Ingest and Transform Data 402

Transform Data Using Azure Synapse Pipelines 404

Transform Data Using Azure Data Factory 410

Transform Data Using Apache Spark 414

Transform Data Using Transact-SQL 429

Transform Data Using Stream Analytics 431

Cleanse Data 433

Split Data 435

Shred JSON 439

Encode and Decode Data 445

Configure Error Handling for the Transformation 450

Normalize and Denormalize Values 451

Transform Data by Using Scala 461

Perform Exploratory Data Analysis 463

Transformation and Data Management Concepts 473

Transformation 473

Data Management 480

Azure Databricks 481

Data Modeling and Usage 485

Data Modeling with Machine Learning 486

Usage 494

Summary 500

Exam Essentials 500

Review Questions 502

Chapter 6 Create and Manage Batch Processing and Pipelines 505

Design and Develop a Batch Processing Solution 507

Design a Batch Processing Solution 510

Develop Batch Processing Solutions 512

Create Data Pipelines 538

Handle Duplicate Data 560

Handle Missing Data 569

Handle Late-Arriving Data 571

Upsert Data 572

Configure the Batch Size 578

Configure Batch Retention 581

Design and Develop Slowly Changing Dimensions 582

Design and Implement Incremental Data Loads 583

Integrate Jupyter/IPython Notebooks into a Data Pipeline 590

Revert Data to a Previous State 591

Handle Security and Compliance Requirements 592

Design and Create Tests for Data Pipelines 593

Scale Resources 593

Design and Configure Exception Handling 593

Debug Spark Jobs Using the Spark UI 594

Implement Azure Synapse Link and Query the Replicated Data 594

Use PolyBase to Load Data to a SQL Pool 595

Read from and Write to a Delta Table 595

Manage Batches and Pipelines 596

Trigger Batches 597

Schedule Data Pipelines 597

Validate Batch Loads 598

Implement Version Control for Pipeline Artifacts 604

Manage Data Pipelines 607

Manage Spark Jobs in a Pipeline 609

Handle Failed Batch Loads 610

Summary 610

Exam Essentials 611

Review Questions 612

Chapter 7 Design and Implement a Data Stream Processing Solution 615

Develop a Stream Processing Solution 617

Design a Stream Processing Solution 618

Create a Stream Processing Solution 630

Process Time Series Data 657

Design and Create Windowed Aggregates 658

Process Data Within One Partition 661

Process Data Across Partitions 663

Upsert Data 665

Handle Schema Drift 674

Configure Checkpoints/Watermarking During Processing 680

Replay Archived Stream Data 685

Design and Create Tests for Data Pipelines 688

Monitor for Performance and Functional Regressions 689

Optimize Pipelines for Analytical or Transactional Purposes 689

Scale Resources 690

Design and Configure Exception Handling 691

Handle Interruptions 694

Ingest and Transform Data 694

Transform Data Using Azure Stream Analytics 694

Monitor Data Storage and Data Processing 695

Monitor Stream Processing 695

Summary 695

Exam Essentials 696

Review Questions 697

Part IV Secure, Monitor, and Optimize Data Storage and Data Processing 699

Chapter 8 Keeping Data Safe and Secure 701

Design Security for Data Policies and Standards 702

Design a Data Auditing Strategy 711

Design a Data Retention Policy 716

Design for Data Privacy 717

Design to Purge Data Based on Business Requirements 719

Design Data Encryption for Data at Rest and in Transit 719

Design Row-Level and Column-Level Security 722

Design a Data Masking Strategy 723

Design Access Control for Azure Data Lake Storage Gen2 724

Implement Data Security 730

Implement a Data Auditing Strategy 731

Manage Sensitive Information 739

Implement a Data Retention Policy 745

Encrypt Data at Rest and in Motion 748

Implement Row-Level and Column-Level Security 749

Implement Data Masking 753

Manage Identities, Keys, and Secrets Across Different Data Platform Technologies 755

Implement Access Control for Azure Data Lake Storage Gen2 765

Implement Secure Endpoints (Private and Public) 772

Implement Resource Tokens in Azure Databricks 778

Load a DataFrame with Sensitive Information 779

Write Encrypted Data to Tables or Parquet Files 780

Develop a Batch Processing Solution 781

Handle Security and Compliance Requirements 782

Design and Implement the Data Exploration Layer 784

Browse and Search Metadata in Microsoft Purview Data Catalog 784

Push New or Updated Data Lineage to Microsoft Purview 785

Summary 786

Exam Essentials 787

Review Questions 789

Chapter 9 Monitoring Azure Data Storage and Processing 791

Monitoring Data Storage and Data Processing 793

Implement Logging Used by Azure Monitor 793

Configure Monitoring Services 799

Understand Custom Logging Options 821

Measure Query Performance 822

Monitor Data Pipeline Performance 823

Monitor Cluster Performance 824

Measure Performance of Data Movement 824

Interpret Azure Monitor Metrics and Logs 825

Monitor and Update Statistics about Data Across a System 828

Schedule and Monitor Pipeline Tests 830

Interpret a Spark Directed Acyclic Graph 830

Monitor Stream Processing 832

Implement a Pipeline Alert Strategy 832

Develop a Batch Processing Solution 832

Design and Create Tests for Data Pipelines 832

Develop a Stream Processing Solution 837

Monitor for Performance and Functional Regressions 837

Design and Create Tests for Data Pipelines 838

Azure Monitoring Overview 841

Azure Batch 841

Azure Key Vault 842

Azure SQL 843

Summary 844

Exam Essentials 844

Review Questions 846

Chapter 10 Troubleshoot Data Storage Processing 849

Optimize and Troubleshoot Data Storage and Data Processing 851

Optimize Resource Management 854

Compact Small Files 857

Handle Skew in Data 859

Handle Data Spill 860

Find Shuffling in a Pipeline 862

Tune Shuffle Partitions 864

Tune Queries by Using Indexers 869

Tune Queries by Using Cache 876

Optimize Pipelines for Analytical or Transactional Purposes 877

Optimize Pipeline for Descriptive versus Analytical Workloads 886

Troubleshoot a Failed Spark Job 888

Troubleshoot a Failed Pipeline Run 890

Rewrite User-Defined Functions 899

Design and Develop a Batch Processing Solution 901

Design and Configure Exception Handling 902

Debug Spark Jobs by Using the Spark UI 902

Scale Resources 902

Monitor Batches and Pipelines 904

Handle Failed Batch Loads 904

Design and Develop a Stream Processing Solution 905

Optimize Pipelines for Analytical or Transactional Purposes 905

Handle Interruptions 906

Scale Resources 908

Summary 909

Exam Essentials 910

Review Questions 912

Appendix Answers to Review Questions 915

Chapter 1: Gaining the Azure Data Engineer Associate Certification 916

Chapter 2: CREATE DATABASE dbName; GO 916

Chapter 3: Data Sources and Ingestion 917

Chapter 4: The Storage of Data 918

Chapter 5: Transform, Manage, and Prepare Data 918

Chapter 6: Create and Manage Batch Processing and Pipelines 919

Chapter 7: Design and Implement a Data Stream Processing Solution 920

Chapter 8: Keeping Data Safe and Secure 921

Chapter 9: Monitoring Azure Data Storage and Processing 921

Chapter 10: Troubleshoot Data Storage Processing 922

Index 925

MCA Microsoft Certified Associate Azure Data

    £45.12

    Includes FREE delivery

    RRP £47.50 – you save £2.38 (5%)

    A Paperback / softback by Benjamin Perkins

    5 in stock

      Publisher: John Wiley & Sons Inc
      Publication Date: 06/09/2023
      ISBN13: 978-1119885429
      ISBN10: 1119885426


      © 2026 Book Curl
