Description

Book Synopsis

Gain a foundational understanding of SRE and learn its basic concepts and architectural best practices for deploying Azure IaaS, PaaS, and microservices-based resilient architectures.

The book starts with the base concepts of SRE operations and developer needs, followed by definitions and acronyms of Service Level Agreements in real-world scenarios. Moving forward, you will learn how to build resilient IaaS solutions, PaaS solutions, and microservices architectures in Azure. Here you will go through the Azure reference architecture for highly available storage, networking, and virtual machine computing, with Availability Sets, Availability Zones, and Scale Sets as the main scenarios. You will explore similar reference architectures for Platform Services such as App Services with Web Apps, and work with data solutions like Azure SQL and Azure Cosmos DB.

Next, you will learn how to use automation to enable SRE with Azure DevOps Pipelines and GitHub Actions. You'll also gain an unders

Table of Contents

Chapter 1: The foundation of SRE

This chapter lays out the foundation of Site Reliability Engineering, a discipline that originated at Google. It ranges from the base concepts of how IT Operations and Developers need to collaborate to how SRE helps organizations run business-critical workloads without major downtime.

Chapter 2: Service Level Management definitions and acronyms and their meaning in a real-life context

This chapter describes the common Service Level Agreement (SLA) definitions and acronyms, viewed through real-world scenarios to provide a clear understanding.

o Some examples: SLA, SLO, MTTF, MTBF, MTTR, …

Chapter 3: Architecting Resilient Infrastructure as a Service (IaaS) Solutions in Azure

SRE is all about maximizing the uptime of your organization’s workloads, and this chapter covers that in relation to Azure IaaS compute solutions. It explains the Azure reference architecture for highly available storage, networking, and virtual machine computing, with Availability Sets, Availability Zones, and Scale Sets as the main scenarios. It also touches on preparing for disaster recovery with Azure Backup and Azure Site Recovery, helping you quickly mitigate outages in case of a failure.

Chapter 4: Architecting Resilient Platform as a Service (PaaS) Solutions in Azure

Following on from the Virtual Machines scenario, this chapter details similar reference architectures for Platform Services such as App Services with Web Apps, and also touches on data solutions like Azure SQL and Azure Cosmos DB.

Chapter 5: Architecting Resilient Serverless and Microservices architectures in Azure

This third chapter in the reference architecture topic describes how to build highly available, business-critical scenarios using serverless Azure Functions and Azure Logic Apps, as well as microservices scenarios using Azure Container Instances and Azure Kubernetes Service (AKS).

Chapter 6: Automation to enable SRE with Azure DevOps Pipelines / GitHub Actions

Automation is the cornerstone of SRE, allowing businesses not only to deploy new workloads easily, but also to avoid critical outages or, when an outage does occur, to mitigate the problem as fast as possible. Drawing on both Azure DevOps Pipelines and GitHub Actions, this chapter gives the reader many real-life examples to reuse in their own environment.

Chapter 7: Efficiently handling blameless post-mortems

Post-mortems are the way to look back at what caused an outage and to record the lessons learned, helping to avoid a similar outage in the future or to fix an identical incident quickly. Blameless means the focus is on finding the root cause of the problem, without singling out any individual or team for blame. This chapter describes how an open culture around post-mortems dramatically helps in optimizing SRE and the overall company culture around managing and running IT systems and application workloads.

Chapter 8: Monitoring as the key to knowledge

Besides automated deployments, monitoring is the second big technical topic in any SRE scenario. You can’t manage what you don’t know. This chapter provides an overview of Azure Monitor and Log Analytics, which form the foundation for monitoring Azure and hybrid workloads. Starting from metrics for the different Azure services touched on in earlier chapters, this chapter also covers how to export logs to third-party solutions such as Splunk and how to integrate dashboarding tools like Grafana.

The Art of Site Reliability Engineering SRE with

Product form

£28.49

Includes FREE delivery

RRP £29.99 – you save £1.50 (5%)


A Paperback / softback by Unai Huete Beloki

Out of stock



    Publisher: APress
    Publication Date: 20/09/2022
    ISBN13: 9781484287033
    ISBN10: 1484287037

