Submitted:
03 September 2025
Posted:
04 September 2025
Abstract

Keywords:
1. Introduction
1.1. Contributions
- (i) Protocol-Level Innovations Within a Ring-Based Framework: This study introduces DCTOP, a novel improvement to the classical LCR protocol that retains its ring-based design. DCTOP introduces:
- Lamport logical clocks for message timestamping, which order concurrent messages efficiently, reducing latency and improving fairness.
- Dynamic last-process identification, which replaces LCR’s globally fixed last-process assumption and accelerates message stabilization and delivery.
- (ii) Relaxed Failure Assumption: DCTOP relaxes LCR’s crash failure assumption from N = f + 1 to N = 2f + 1; tolerating fewer failures enables faster message delivery.
- (iii) Foundation for Real-World Deployment: Although the simulations excluded failures and large-scale setups, ongoing work involves a cloud-based, fault-tolerant implementation to validate DCTOP under practical conditions.
2. System Model
3. Daisy Chain Total Order Protocol (DCTOP)
- (a) First, to improve the latency of LCR by using Lamport logical clocks to sequence concurrent messages.
- (b) Second, to employ the novel concept of a dynamically determined “last” process for ordering concurrent messages, while ensuring optimal achievable throughput.
- (c) Third, to relax the crash failure assumption of LCR.
3.1. Data Structures
- Logical clock (LC): An integer object initialized to zero and used to timestamp messages.
- Stability clock (SC): An integer object that holds the largest timestamp known to the process to be stable. It is initially zero.
- Message Buffer (mBuffer): Holds the messages sent or received by the process.
- Delivery Queue (DQ): Messages waiting to be delivered are queued in this queue object.
- Garbage Collection Queue: After a message is delivered, it is transferred to this queue to be garbage collected.
- Message origin field (M_origin): Holds the id of the process that initiated the message multicast.
- Message timestamp field (M_ts): Holds the timestamp given to M by M_origin.
- Message destination field (M_destn): Holds the destination of M, which is the CN of the process that sends or forwards M.
- Message flag (M_flag): A Boolean field that is set to false when M is formed.
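The data structures above can be pictured as two small records. The following is only a minimal sketch under stated assumptions: the names `Message`, `ProcessState`, `pid`, `m_buffer`, `dq`, and `gcq` are illustrative choices rather than identifiers from the protocol, and the container types (a list and two double-ended queues) are one possible representation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    origin: int         # M_origin: id of the process that initiated the multicast
    ts: int             # M_ts: timestamp given to M by M_origin
    destn: int          # M_destn: CN of the process that sends or forwards M
    flag: bool = False  # M_flag: false when M is formed


@dataclass
class ProcessState:
    pid: int                                       # hypothetical process id
    lc: int = 0                                    # logical clock, initialized to zero
    sc: int = 0                                    # stability clock, initially zero
    m_buffer: list = field(default_factory=list)   # messages sent or received
    dq: deque = field(default_factory=deque)       # messages waiting to be delivered
    gcq: deque = field(default_factory=deque)      # delivered messages awaiting collection
```

Using `default_factory` gives every process its own buffer and queues, so two `ProcessState` instances never share a container.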
3.2. DCTOP Principles
- (1) Message Sending, Receiving and Forwarding: The Lamport logical clock is used to timestamp a message m within the ring network before m is sent. Therefore, m_ts denotes the timestamp of message m.
- (2) Timestamp Stability: A message timestamp TS is said to be stable in a given process if and only if the process is guaranteed not to receive any further message whose timestamp is less than or equal to TS.
Observations:
- 1) Any timestamp smaller than TS is also stable in a process once TS becomes stable in that process.
- 2) The term “stable” reflects the fact that once TS becomes stable in a process, it remains stable forever. This usage corresponds to the “stable” property used by Chandy and Lamport [24]. Therefore, the later discussion is concerned with the earliest instant at which a given TS becomes stable in a process.
- 3) When TS becomes stable in a process, the process can potentially total order (TO) deliver all received but undelivered messages with timestamps up to TS, because the stability of TS eliminates the possibility of ever receiving another such message in the future.
- (3) Crashproofing of Messages: A message m is crashproof if m is in the possession of at least (f + 1) processes. Therefore, m is crashproof in a process when that process knows that m has been received by at least (f + 1) processes. The rationale is that when at least f + 1 processes have received m, even if f of them crash, at least one process remains that can be relied on to send m to the others.
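The principles above can be illustrated with a short sketch. It assumes the standard Lamport receive rule (advance the clock past any received timestamp), which this section does not spell out, and it represents knowledge of how many processes hold a message as a plain count; `LamportClock`, `stamp`, `observe`, `is_crashproof`, and `holders` are hypothetical names used only for illustration.

```python
class LamportClock:
    """Integer logical clock, initialized to zero."""

    def __init__(self):
        self.value = 0

    def stamp(self) -> int:
        """Advance the clock and return the timestamp for a message about to be sent."""
        self.value += 1
        return self.value

    def observe(self, received_ts: int) -> None:
        """Standard Lamport receive rule (an assumption here, not taken from the text)."""
        self.value = max(self.value, received_ts) + 1


def is_crashproof(holders: int, f: int) -> bool:
    """A message is crashproof once at least f + 1 processes are known to hold it."""
    return holders >= f + 1
```

With f = 2, for example, a message held by three processes survives the crash of any two of them, which is the rationale given for the f + 1 bound.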
3.3. DCTOP Algorithm Main Points
- When forms and sends m, it sets m_flag = false, before it deposits m in its .
- When receives m and = m_origin:
- It checks if ≥ f. If this is true, then m is crashproofed, although it is not delivered immediately; the process sets m_flag = true and deposits m in its . If m is not crashproofed, m_flag remains false.
- It then checks if ≠ . If this is true, it sets the destination of m, m_destn = , and deposits m in its .
- Otherwise, m is stable: the process updates its stability clock as SC = max{SC, m_ts} and transfers every m with m_ts ≤ SC to DQ. It then forms µ(m), sets the two header fields, µ(m)_origin= and µ(m)_destn= , and deposits µ(m) in
- When receives µ(m), it knows that every process has received m.
- If m in µ(m) does not indicate a higher stabilisation in , that is, m_ts ≤ SC and ≥ f, then the process ignores µ(m); otherwise, if < f, it sets m_flag = true and µ(m)_destn = , and deposits µ(m) in
- However, if m in µ(m) indicates a higher stabilisation in , that is, m_ts > SC, the process updates its stability clock as SC = max{SC, m_ts} and transfers every m with m_ts ≤ SC to DQ.
- If = , the process ignores µ(m); otherwise, it sets µ(m)_destn = and deposits µ(m) in
- Whenever DQ is non-empty, the process dequeues m from the head of DQ and delivers it to the application process. It then enters a copy of m into the garbage collection queue to record a successful TO delivery. This action is repeated until DQ becomes empty.
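The stabilisation and delivery steps in the points above can be sketched as follows. This is a simplified illustration, not the protocol itself: it covers only the stability clock update, the transfer to the delivery queue, and the delivery loop, and it leaves out forwarding, the µ(m) messages, and the crashproof flag. It assumes that the message buffer is already kept in delivery order; the dictionary keys and the function names `new_state`, `stabilise`, and `drain` are hypothetical.

```python
from collections import deque


def new_state():
    """Per-process state: stability clock, message buffer, delivery and GC queues."""
    return {"sc": 0, "m_buffer": [], "dq": deque(), "gcq": deque()}


def stabilise(state, m_ts):
    """Raise the stability clock to m_ts and move every stable message to DQ."""
    state["sc"] = max(state["sc"], m_ts)
    ready = [m for m in state["m_buffer"] if m["ts"] <= state["sc"]]
    state["m_buffer"] = [m for m in state["m_buffer"] if m["ts"] > state["sc"]]
    state["dq"].extend(ready)  # buffer order is preserved


def drain(state, deliver):
    """Deliver from the head of DQ until it is empty, recording each delivery."""
    while state["dq"]:
        m = state["dq"].popleft()
        deliver(m)                # hand m to the application process
        state["gcq"].append(m)    # keep a copy for garbage collection
```

Because the stability clock only moves forward through `max`, a late notification carrying a smaller timestamp leaves the state unchanged.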
3.4. DCTOP Delivery Requirements
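- (i) m_ts must be stable in the process,
- (ii) m must be crashproof in the process, and
- (iii) any two stable and crashproofed messages are delivered in total order: the message with the smaller timestamp is delivered first and, when the timestamps are equal, the message from the higher origin is delivered first.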
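The ordering rule in the last requirement can be expressed as a sort key. The sketch below assumes that messages are represented as `(timestamp, origin)` pairs and that equal timestamps are resolved in favour of the higher origin, as assumed in the correctness discussion; `delivery_key` and `total_order` are hypothetical names.

```python
def delivery_key(message):
    """Smaller timestamp first; on equal timestamps, higher origin first."""
    ts, origin = message
    return (ts, -origin)


def total_order(messages):
    """Return stable, crashproofed messages in the order they would be delivered."""
    return sorted(messages, key=delivery_key)
```

Since every process applies the same key to the same set of messages, every process arrives at the same delivery sequence.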
3.5. Group Membership Changes
3.6. Proof of Correctness
- (a) If mi_ts = mk_ts, then the process orders mk before mi in its mBuffer, since mk comes from the higher origin (this study assumes that, when messages have equal timestamps, the message from the higher origin is ordered before the message from the lower origin). When the process receives µ(mi) for message mi, it transfers both messages to DQ and can utoDeliver both, mk before mi, because the timestamp equality means that TS is already known to be stable.
- (b) If mi_ts < mk_ts, then the process orders mi before mk in its mBuffer. When it receives µ(mi) for message mi, it transfers both messages to DQ and can utoDeliver only mi, since mi is stable and is at the head of DQ. It will eventually utoDeliver mk when it receives µ(mk), since mk is at the head of DQ once mi has been delivered.
- (c) Option (a) or (b) applies in every other process within the DCTOP system, since there are no membership changes. Thus, if any correct process sends a message m, then it eventually delivers m.
- Case 1:
- (a) The link between any pair of consecutive processes in the ring maintains FIFO order, and
- (b) Processes forward messages in the order in which they received them.
- Case 2:
4. Fairness Control Environment
- (1) the is empty, or
- (2) the is not empty and either
- (2.1) the process had forwarded exactly one message originating from every other process, or
- (2.2) the message at the head of the originates from a process whose message the process had already forwarded.
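The conditions above form a single predicate, which the sketch below evaluates. Several things are assumptions rather than facts from this section: the queue is passed in as a list of origin ids with the head at index 0, the caller keeps a count of forwarded messages per origin over whatever window the protocol defines, and the names `fairness_condition_holds`, `queue`, `forwarded_counts`, and `other_processes` are hypothetical. The sketch only evaluates the predicate; it does not model the action the protocol takes when the predicate holds.

```python
def fairness_condition_holds(queue, forwarded_counts, other_processes):
    """Evaluate conditions (1), (2.1) and (2.2) for one process."""
    if not queue:
        return True  # (1) the queue is empty
    # (2.1) exactly one message forwarded from every other process
    one_from_each = all(forwarded_counts.get(p, 0) == 1 for p in other_processes)
    # (2.2) the head message comes from a process already forwarded for
    head_already_served = forwarded_counts.get(queue[0], 0) > 0
    return one_from_each or head_already_served
```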
5. Experiments and Performance Comparison
5.1. Results and Discussion
6. Conclusions and Future Work
Abbreviations
| LCR | Logical Clock and Ring Protocol |
| DCTOP | Daisy Chain Total Order Protocol |
| VC | Vector Clock |
| LC | Logical Clock |
| TO | Total Order |
| CN | Clockwise Neighbour |
| ACN | Anti-Clockwise Neighbour |
| SC | Stability Clock |
| DQ | Delivery Queue |
| DCQ | Garbage Collection Queue |
| UTO | Uniform Total Order (uto) |
References
- A. Choudhury, G. Garimella, A. Patra, D. Ravi, and P. Sarkar, "Crash-tolerant consensus in directed graph revisited." In International Colloquium on Structural Information and Communication Complexity (pp. 55-71). Cham: Springer International Publishing.
- M. Pease, R. Shostak, and L. Lamport, “Reaching agreement in the presence of faults,” Journal of the ACM (JACM), vol. 27, no. 2, 1980, pp. 228-234.
- E. W. Vollset, and P. D. Ezhilchelvan, "Design and performance-study of crash-tolerant protocols for broadcasting and reaching consensus in manets." In 24th IEEE Symposium on Reliable Distributed Systems, pp. 166-175.
- M. Correia, D. G. Ferro, F. P. Junqueira, and M. Serafini, "Practical hardening of crash-tolerant systems." In Proceedings of the 2012 USENIX conference on Annual Technical Conference, pp. 453-466.
- M. Wiesmann, F. Pedone, A. Schiper, B. Kemme, and G. Alonso, "Understanding replication in databases and distributed systems." pp. 464-474.
- A. Helal, A. A. Heddaya, and B. B. Bhargava, Replication techniques in distributed systems: Springer Science & Business Media, 2006.
- X. Défago, A. Schiper, and P. Urbán, “Total order broadcast and multicast algorithms: Taxonomy and survey,” ACM Computing Surveys (CSUR), vol. 36, no. 4, pp. 372-421, 2004.
- D. Ongaro, and J. Ousterhout, "In search of an understandable consensus algorithm (extended version)," Tech Report. May, 2014. http://ramcloud.stanford.edu/Raft.pdf, (Accessed on June 6, 2024).
- F. Junqueira, and B. Reed, ZooKeeper: distributed process coordination: O'Reilly Media, Inc., 2013.
- P. Hunt, M. Konar, F. P. Junqueira, and B. Reed, "ZooKeeper: Wait-free Coordination for Internet-scale Systems." In Proceedings of the 2010 USENIX conference on USENIX annual technical conference (USENIXATC'10).
- M. Burrows, "The Chubby lock service for loosely-coupled distributed systems." In Proceedings of the 7th symposium on Operating systems design and implementation (OSDI '06), pp. 335-350.
- A. Ejem, and P. Ezhilchelvan, "Design and Performance Evaluation of High Throughput and Low Latency Total Order Protocol." In 38th Annual UK Performance Engineering Workshop (2022).
- J. Pu, M. Gao, and H. Qu, “SimpleChubby: a simple distributed lock service,” https://www.scs.stanford.edu/14au-cs244b/labs/projects/pu_gao_qu.pdf, (Accessed on May 31, 2024).
- L. Lamport, “Paxos made simple,” ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121, December 2001), pp. 51-58, 2001.
- D. Ongaro, and J. Ousterhout, "In search of an understandable consensus algorithm." In Proceedings of the 2014 USENIX conference on USENIX Annual Technical Conference (USENIX ATC'14), pp. 305-319.
- K. Shvachko, H. Kuang, S. Radia, and R. Chansler, "The hadoop distributed file system." 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), 2010, doi: 10.1109/MSST.2010.5496972, pp. 1-10.
- A. Ejem, and P. Ezhilchelvan, "Design and performance evaluation of raft variations." In 39th Annual UK Performance Engineering Workshop (2023).
- R. Guerraoui, R. R. Levy, B. Pochon, and V. Quéma, “Throughput optimal total order broadcast for cluster environments,” ACM Transactions on Computer Systems (TOCS), vol. 28, no. 2, pp. 1-32, 2010.
- I. Moraru, D. G. Andersen, and M. Kaminsky, "There is more consensus in Egalitarian parliaments." In Proceedings of the 24th ACM Symposium on Operating Systems Principles, 2013, pp. 358–372.
- M. Biely, Z. Milosevic, N. Santos, and A. Schiper, "S-paxos: Offloading the leader for high throughput state machine replication." 2012 IEEE 31st Symposium on Reliable Distributed Systems, 2012, pp. 111-120.
- R. Guerraoui, R. R. Levy, B. Pochon, and V. Quéma, "High Throughput Total Order Broadcast for Cluster Environments." In International Conference on Dependable Systems and Networks (DSN'06), 2006, pp. 549-557.
- A. Toshniwal, S. Taneja, A. Shukla, K. Ramasamy, J. M. Patel, S. Kulkarni, J. Jackson, K. Gade, M. Fu, and J. Donham, "Storm@twitter." In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD '14), pp. 147-156.
- K. M. Chandy, and L. Lamport, “Distributed snapshots: Determining global states of distributed systems,” ACM Transactions on Computer Systems (TOCS), vol. 3, no. 1, pp. 63-75, 1985.
- B. Liskov, and J. Cowling, "Viewstamped Replication Revisited." MIT Technical Report MIT-CSAIL-TR-2012-021, 2012, https://pmg.csail.mit.edu/papers/vr-revisited.pdf, (Accessed on July 4, 2024).
- K. Birman, and T. Joseph, "Exploiting virtual synchrony in distributed systems." In Proceedings of the eleventh ACM Symposium on Operating systems principles (SOSP '87), pp. 123–138.
- Y. Amir, and J. Stanton, The spread wide area group communication system. Johns Hopkins University, Center for Networking and Distributed Systems, Technical Report CNDS 98-4, 1998, (Accessed on April 28, 2024).
- A. Ejem, C. N. Njoku, O. F. Uzoh, and J. N. Odii, "Queue Control Model in a Clustered Computer Network using M/M/m Approach," International Journal of Computer Trends and Technology (IJCTT), vol. 35, no. 1, pp. 12-20, 2016.









Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).