Description
Book Synopsis
A guide to the essential techniques for designing and building dependable distributed systems. Rather than surveying a broad range of research for each dependability strategy, it focuses on a select few and explains each in depth, usually with a comprehensive set of examples.
Table of Contents
List of Figures xiii
List of Tables xxi
Acknowledgements xxiii
Preface xxv
References xxviii
1 Introduction to Dependable Distributed Computing 1
1.1 Basic Concepts and Terminologies 2
1.2 Means to Achieve Dependability 9
References 13
2 Logging and Checkpointing 15
2.1 System Model 16
2.2 Checkpoint-Based Protocols 21
2.3 Log Based Protocols 34
References 54
3 Recovery-Oriented Computing 57
3.1 System Model 59
3.2 Fault Detection and Localization 62
3.3 Microreboot 83
3.4 Overcoming Operator Errors 87
References 93
4 Data and Service Replication 97
4.1 Service Replication 99
4.2 Data Replication 105
4.3 Optimistic Replication 111
4.4 CAP Theorem 131
References 138
5 Group Communication Systems 141
5.1 System Model 143
5.2 Sequencer Based Group Communication System 146
5.3 Sender Based Group Communication System 160
5.4 Vector Clock Based Group Communication System 186
References 191
6 Consensus and the Paxos Algorithms 193
6.1 The Consensus Problem
6.2 The Paxos Algorithm 196
6.3 Multi-Paxos 206
6.4 Dynamic Paxos 210
6.5 Fast Paxos 221
6.6 Implementations of the Paxos Family Algorithms 229
References 236
7 Byzantine Fault Tolerance 239
7.1 The Byzantine Generals Problem 240
7.2 Practical Byzantine Fault Tolerance 255
7.3 Fast Byzantine Agreement 271
7.4 Speculative Byzantine Fault Tolerance 271
References 284