Analysis of Distributed Snapshot Algorithms

Many problems in distributed systems can be cast as the problem of detecting global states. For instance, a global state detection algorithm helps to solve an important class of problems: stable property detection. A stable property is one that persists: once it becomes true, it remains true thereafter. Examples of stable properties are “the computation has terminated”, “the system is deadlocked” and “all tokens in a token ring have disappeared” [3].

Distributed snapshot algorithms are categorized by the underlying message delivery mechanism: FIFO, non-FIFO or causal ordering. On FIFO channels, messages arrive in the order in which they were transmitted; on non-FIFO channels, the order is not guaranteed. A causal ordering mechanism delivers messages in an order consistent with the causal order in which they were sent.

The snapshot recording duration at each process contributes to the overall efficiency of the algorithm. In this paper we present the variations observed in snapshot recording durations at the processes of a distributed system. We conclude with the key characteristics of a reliable and effective snapshot algorithm. Simulations were carried out using the SimGrid Java API.
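To make the FIFO case concrete, the following is a minimal, illustrative sketch of a marker-based snapshot in the style of the Chandy-Lamport algorithm, written in plain Python. It is not the SimGrid simulation used in this paper, and every name in it (Process, run_snapshot, the money-transfer workload, the random scheduler) is a hypothetical choice made only for the example. The sketch assumes reliable FIFO channels between every pair of processes and checks one consequence of a consistent snapshot: the recorded balances plus the recorded in-flight transfers add up to the total amount in the system.

```python
import random
from collections import deque

MARKER = "MARKER"  # control message that separates pre- and post-snapshot traffic


class Process:
    def __init__(self, pid, balance):
        self.pid = pid
        self.balance = balance
        self.recorded_balance = None  # local state captured by the snapshot
        self.channel_state = {}       # sender pid -> transfers recorded as in flight
        self.recording = set()        # incoming channels still being recorded


def run_snapshot(n=3, initial=100, steps=200, seed=0):
    """Simulate random transfers, take one snapshot, return (balances, in_flight)."""
    rng = random.Random(seed)
    procs = [Process(i, initial) for i in range(n)]
    # One FIFO queue per directed channel (assumption: fully connected topology).
    channels = {(s, d): deque() for s in range(n) for d in range(n) if s != d}

    def record(p):
        # Record the local state, start recording all incoming channels,
        # and send a marker on every outgoing channel.
        p.recorded_balance = p.balance
        p.recording = {s for s in range(n) if s != p.pid}
        p.channel_state = {s: [] for s in p.recording}
        for d in range(n):
            if d != p.pid:
                channels[(p.pid, d)].append(MARKER)

    def deliver(src, dst):
        msg = channels[(src, dst)].popleft()  # FIFO: oldest message first
        p = procs[dst]
        if msg == MARKER:
            if p.recorded_balance is None:
                record(p)                     # first marker triggers local recording
            p.recording.discard(src)          # this channel's state is now final
        else:
            p.balance += msg
            if src in p.recording:            # sent before, received after the cut
                p.channel_state[src].append(msg)

    def send_transfer():
        src, dst = rng.sample(range(n), 2)
        amount = rng.randint(1, 10)
        if procs[src].balance >= amount:
            procs[src].balance -= amount
            channels[(src, dst)].append(amount)

    def busy_channels():
        return [c for c, q in channels.items() if q]

    for step in range(steps):
        if step == steps // 2:
            record(procs[0])                  # process 0 initiates the snapshot
        if rng.random() < 0.5:
            send_transfer()
        elif busy_channels():
            deliver(*rng.choice(busy_channels()))

    while busy_channels():                    # drain so that every marker arrives
        deliver(*rng.choice(busy_channels()))

    balances = [p.recorded_balance for p in procs]
    in_flight = sum(sum(v) for p in procs for v in p.channel_state.values())
    return balances, in_flight


if __name__ == "__main__":
    balances, in_flight = run_snapshot()
    # For a consistent snapshot the total should equal n * initial.
    print(balances, in_flight, sum(balances) + in_flight)
```

The check relies on the FIFO assumption: a transfer sent after a marker on the same channel can never overtake that marker. On non-FIFO channels this simple marker rule is no longer sufficient, which is why the algorithms for the other delivery mechanisms differ.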

Related Works

Report on learning methods from the following course in Coursera