Skip to topic | Skip to bottom
Minfo.DistTD6r1.18 - 04 Oct 2016 - 09:12 - FrancoiseBaudetopic end

Start of topic | Skip to actions
Exercise for Monday 15 of October 2015 You have to write the answer for the Exercise number 1 of the Exam from 2009. This exercise is about deadlock detection. It takes some inspiration from the Dijkstra-VanGastern termination detection algorithm. Here is a correction


The exercices related to this chapter content, plus some additional exercices taken from Chapter9 from the book of Ghosh.

Here you find the best solutions given by students last years, one in French, and another in English.

For those who are interested, they can study this other termination algorithm, due to Dijkstra and Scholten, published in Information Processing Letters in 1980. A nice synthesis is also available in the book of Ghosh. This is an interesting probe-echo algorithm. If you decide to summarize how it works, you can illustrate it using a simple network of processes.

You find a correction regarding exercice 5 of Ghosh's exercices here

Here Proof of the 4 counters algorithm (sorry in French) for termination detection, by Mattern, as a complement of Exercise 1.

It is encouraged to read e.g. chapter 14 of Ghosh Book which constitutes a very nice summary of key problems in distributed transactions management.

To complement the study of commit of distributed transactions, you can solve the last exercise that is recalled next: Explain in your own words why the 3 phases commit protocol is a non-blocking one compared to the 2 phases commit protocol.

No limpid answer was found in the various student copies in the previous years, that could be provided as good answer for all. An hopefully better explanation is below.

From wi state, in the 2PC, a cohort may be blocked in case the coordinator has failed; and ci needs to wait until coordinator recovers in order to learn about the decision taken at coordinator before its failure: This gives a blocking nature of the 2PC.

On the contrary, in 3PC, from wi state a cohort is not blocked, but will automatically on timeout if the prepare message or abort message is not received, change to abort state (ai). The non reception of the decision message can indeed be due to the coordinator failure.

The key point here is to ask whether it could happen that the coordinator eventually decide to commit (as all cohorts agreed initially to commit) but at least one of the cohorts has changed to ai state (and thus will not apply the uniform decision) ? The risk being that on recovery, the coordinator executes a commit on its side, but not all cohorts apply this same decision. The reply to this question is by chance NO: indeed there are two cases

1) the coordinator had no time to decide of a commit and send the Prepare msg to all cohorts, but fails in between: it will change to a1 so, it will not commit. And the same will happen due to this timeout on each ci for all i>1 => all go to ai state eventually. Ci are not blocked waiting the coordinator decision.

2) the coordinator has been able to send the 'Prepare to commit' message to all cohorts. It now waits for an ack but if no all ack arrive back it will change to a1 and send an abort to all ci; if a ci is already in pi state, this abort message will make it change to ai. Everyone will abort.

The coordinator decision is to commit, after all acks are received. This is confirmed by sending a Commit msg. In 2PC this could be blocking in case the coordinator fails. Here, this is not. Indeed, once the ack are all received, in case of failure of c1, it will always recover into c1 state. Meaning even if coordinator had no time to send Commit message to cohorts, commit on c1 will happen. On ci: once the ack has been sent back, and if no explicit abort message from coordinator arrives, ci will always commit. In all below 3 situations, ci will always commit: a) either it is lucky to receive the Commit message from coordinator; b) if it does not receive it after a timeout (meaning the message Commit was even not sent), c) and even in case of ci failure then recovery. All this, again, gives this protocol its non-blocking property.


A Zookeper version for the 2PC: consult this interesting recipe described here

Commment about the number of messages complexity, that is quoted as being a drawback: "There are two important drawbacks of the approach described above. One is the message complexity, which is O(nē)."

-- FrancoiseBaude - 04 Oct 2016
to top

You are here: Minfo > DistributedAlgo > DistTD6

to top

Copyright © 1999-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding WIKIDeptinfo? Send feedback