Description

Book Synopsis

A practical, step-by-step guide to designing world-class, high-availability systems using both classical and DFSS reliability techniques

Whether designing telecom, aerospace, automotive, medical, financial, or public safety systems, every engineer aims for the utmost reliability and availability in the systems he or she designs. But between the dream of world-class performance and reality falls the shadow of complexities that can bedevil even the most rigorous design process. While there is an array of robust predictive engineering tools, there has been no single-source guide to understanding and using them . . . until now.

Offering a case-based approach to designing, predicting, and deploying world-class high-availability systems from the ground up, this book brings together the best classical and DFSS reliability techniques. Although it focuses on technical aspects, this guide considers the business and market constraints that require that systems be design

Table of Contents
Preface xiii

List of Abbreviations xvii

1. Introduction 1

2. Initial Considerations for Reliability Design 3

2.1 The Challenge 3

2.2 Initial Data Collection 3

2.3 Where Do We Get MTBF Information? 5

2.4 MTTR and Identifying Failures 6

2.5 Summary 7

3. A Game of Dice: An Introduction to Probability 8

3.1 Introduction 8

3.2 A Game of Dice 10

3.3 Mutually Exclusive and Independent Events 10

3.4 Dice Paradox Problem and Conditional Probability 15

3.5 Flip a Coin 21

3.6 Dice Paradox Revisited 23

3.7 Probabilities for Multiple Dice Throws 24

3.8 Conditional Probability Revisited 27

3.9 Summary 29

4. Discrete Random Variables 30

4.1 Introduction 30

4.2 Random Variables 31

4.3 Discrete Probability Distributions 33

4.4 Bernoulli Distribution 34

4.5 Geometric Distribution 35

4.6 Binomial Coefficients 38

4.7 Binomial Distribution 40

4.8 Poisson Distribution 43

4.9 Negative Binomial Random Variable 48

4.10 Summary 50

5. Continuous Random Variables 51

5.1 Introduction 51

5.2 Uniform Random Variables 52

5.3 Exponential Random Variables 53

5.4 Weibull Random Variables 54

5.5 Gamma Random Variables 55

5.6 Chi-Square Random Variables 59

5.7 Normal Random Variables 59

5.8 Relationship between Random Variables 60

5.9 Summary 61

6. Random Processes 62

6.1 Introduction 62

6.2 Markov Process 63

6.3 Poisson Process 63

6.4 Deriving the Poisson Distribution 64

6.5 Poisson Interarrival Times 69

6.6 Summary 71

7. Modeling and Reliability Basics 72

7.1 Introduction 72

7.2 Modeling 75

7.3 Failure Probability and Failure Density 77

7.4 Unreliability, F(t) 78

7.5 Reliability, R(t) 79

7.6 MTTF 79

7.7 MTBF 79

7.8 Repairable System 80

7.9 Nonrepairable System 80

7.10 MTTR 80

7.11 Failure Rate 81

7.12 Maintainability 81

7.13 Operability 81

7.14 Availability 82

7.15 Unavailability 84

7.16 Five 9s Availability 85

7.17 Downtime 85

7.18 Constant Failure Rate Model 85

7.19 Conditional Failure Rate 88

7.20 Bayes’s Theorem 94

7.21 Reliability Block Diagrams 98

7.22 Summary 107

8. Discrete-Time Markov Analysis 110

8.1 Introduction 110

8.2 Markov Process Defined 112

8.3 Dynamic Modeling 116

8.4 Discrete Time Markov Chains 116

8.5 Absorbing Markov Chains 123

8.6 Nonrepairable Reliability Models 129

8.7 Summary 140

9. Continuous-Time Markov Systems 141

9.1 Introduction 141

9.2 Continuous-Time Markov Processes 141

9.3 Two-State Derivation 143

9.4 Steps to Create a Markov Reliability Model 147

9.5 Asymptotic Behavior (Steady-State Behavior) 148

9.6 Limitations of Markov Modeling 154

9.7 Markov Reward Models 154

9.8 Summary 155

10. Markov Analysis: Nonrepairable Systems 156

10.1 Introduction 156

10.2 One Component, No Repair 156

10.3 Nonrepairable Systems: Parallel System with No Repair 165

10.4 Series System with No Repair: Two Identical Components 172

10.5 Parallel System with Partial Repair: Identical Components 176

10.6 Parallel System with No Repair: Nonidentical Components 183

10.7 Summary 192

11. Markov Analysis: Repairable Systems 193

11.1 Repairable Systems 193

11.2 One Component with Repair 194

11.3 Parallel System with Repair: Identical Component Failure and Repair Rates 204

11.4 Parallel System with Repair: Different Failure and Repair Rates 217

11.5 Summary 239

12. Analyzing Confidence Levels 240

12.1 Introduction 240

12.2 pdf of a Squared Normal Random Variable 240

12.3 pdf of the Sum of Two Random Variables 243

12.4 pdf of the Sum of Two Gamma Random Variables 245

12.5 pdf of the Sum of n Gamma Random Variables 246

12.6 Goodness-of-Fit Test Using Chi-Square 249

12.7 Confidence Levels 257

12.8 Summary 264

13. Estimating Reliability Parameters 266

13.1 Introduction 266

13.2 Bayes’ Estimation 268

13.3 Example of Estimating Hardware MTBF 273

13.4 Estimating Software MTBF 273

13.5 Revising Initial MTBF Estimates and Tradeoffs 274

13.6 Summary 277

14. Six Sigma Tools for Predictive Engineering 278

14.1 Introduction 278

14.2 Gathering Voice of Customer (VOC) 279

14.3 Processing Voice of Customer 281

14.4 Kano Analysis 282

14.5 Analysis of Technical Risks 284

14.6 Quality Function Deployment (QFD) or House of Quality 284

14.7 Program Level Transparency of Critical Parameters 287

14.8 Mapping DFSS Techniques to Critical Parameters 287

14.9 Critical Parameter Management (CPM) 287

14.10 First Principles Modeling 289

14.11 Design of Experiments (DOE) 289

14.12 Design Failure Modes and Effects Analysis (DFMEA) 289

14.13 Fault Tree Analysis 290

14.14 Pugh Matrix 290

14.15 Monte Carlo Simulation 291

14.16 Commercial DFSS Tools 291

14.17 Mathematical Prediction of System Capability instead of “Gut Feel” 293

14.18 Visualizing System Behavior Early in the Life Cycle 297

14.19 Critical Parameter Scorecard 297

14.20 Applying DFSS in Third-Party Intensive Programs 298

14.21 Summary 300

15. Design Failure Modes and Effects Analysis 302

15.1 Introduction 302

15.2 What Is Design Failure Modes and Effects Analysis (DFMEA)? 302

15.3 Definitions 303

15.4 Business Case for DFMEA 303

15.5 Why Conduct DFMEA? 305

15.6 When to Perform DFMEA 305

15.7 Applicability of DFMEA 306

15.8 DFMEA Template 306

15.9 DFMEA Life Cycle 312

15.10 The DFMEA Team 324

15.11 DFMEA Advantages and Disadvantages 327

15.12 Limitations of DFMEA 328

15.13 DFMEAs, FTAs, and Reliability Analysis 328

15.14 Summary 330

16. Fault Tree Analysis 331

16.1 What Is Fault Tree Analysis? 331

16.2 Events 332

16.3 Logic Gates 333

16.4 Creating a Fault Tree 335

16.5 Fault Tree Limitations 339

16.6 Summary 339

17. Monte Carlo Simulation Models 340

17.1 Introduction 340

17.2 System Behavior over Mission Time 344

17.3 Reliability Parameter Analysis 344

17.4 A Worked Example 348

17.5 Component and System Failure Times Using Monte Carlo Simulations 359

17.6 Limitations of Using Nontime-Based Monte Carlo Simulations 361

17.7 Summary 365

18. Updating Reliability Estimates: Case Study 367

18.1 Introduction 367

18.2 Overview of the Base Station Controller—Data Only (BSC-DO) System 367

18.3 Downtime Calculation 368

18.4 Calculating Availability from Field Data Only 371

18.5 Assumptions Behind Using the Chi-Square Methodology 372

18.6 Fault Tree Updates from Field Data 372

18.7 Summary 376

19. Fault Management Architectures 377

19.1 Introduction 377

19.2 Faults, Errors, and Failures 378

19.3 Fault Management Design 381

19.4 Repair versus Recovery 382

19.5 Design Considerations for Reliability Modeling 383

19.6 Architecture Techniques to Improve Availability 383

19.7 Redundancy Schemes 384

19.8 Summary 395

20. Application of DFMEA to Real-Life Example 397

20.1 Introduction 397

20.2 Cage Failover Architecture Description 397

20.3 Cage Failover DFMEA Example 399

20.4 DFMEA Scorecard 401

20.5 Lessons Learned 402

20.6 Summary 403

21. Application of FTA to Real-Life Example 404

21.1 Introduction 404

21.2 Calculating Availability Using Fault Tree Analysis 404

21.3 Building the Basic Events 405

21.4 Building the Fault Tree 406

21.5 Steps for Creating and Estimating the Availability Using FTA 408

21.6 Summary 416

22. Complex High Availability System Analysis 420

22.1 Introduction 420

22.2 Markov Analysis of the Hardware Components 420

22.3 Building a Fault Tree from the Hardware Markov Model 427

22.4 Markov Analysis of the Software Components 427

22.5 Markov Analysis of the Combined Hardware and Software Components 433

22.6 Techniques for Simplifying Markov Analysis 437

22.7 Summary 446

References 447

Index 450

Designing High Availability Systems

£104.36

Includes FREE delivery

RRP £115.95 – you save £11.59 (9%)

A Hardback by Zachary Taylor, Subramanyam Ranganathan

15 in stock


    Publisher: John Wiley & Sons Inc
    Publication Date: 10/12/2013
    ISBN13: 9781118551127
    ISBN10: 1118551125

    © 2025 Book Curl
