
Failover Mechanism Pillars

August 17th, 2009

A successful failover solution is one that accomplishes its intended goal transparently. There are three elements that contribute to a successful failover mechanism: data replication, node failure detection, and concerted DNS propagation.

Cross Datacenter Failover


Data replication is the mechanism that ensures changes on the active node, whether to a database or simply to content, make their way to the backup node. The assumption here is that the backup node will at some point take over the role of the main node, which can happen any time the main node experiences a failure. Data replication can be either real-time or delayed.
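To make the difference between real-time and delayed replication concrete, here is a minimal in-memory sketch. The `Node` and `Replicator` classes, the `realtime` flag, and the `flush` method are all hypothetical names chosen for illustration; a real setup would rely on the replication features of the database or file-sync tool in use.

```python
class Node:
    """A stand-in for a server; `data` plays the role of its database or content."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Replicator:
    """Copies writes from the active node to the backup node (illustrative only)."""

    def __init__(self, active, backup, realtime=True):
        self.active = active
        self.backup = backup
        self.realtime = realtime
        self.pending = []  # changes not yet applied to the backup

    def write(self, key, value):
        # Every write lands on the active node first.
        self.active.data[key] = value
        if self.realtime:
            # Real-time: the backup sees the change immediately.
            self.backup.data[key] = value
        else:
            # Delayed: queue the change until the next flush.
            self.pending.append((key, value))

    def lag(self):
        """Number of changes the backup is still missing."""
        return len(self.pending)

    def flush(self):
        """Apply queued changes to the backup in order; return how many were applied."""
        applied = len(self.pending)
        for key, value in self.pending:
            self.backup.data[key] = value
        self.pending.clear()
        return applied
```

With `realtime=False`, anything still in `pending` when the main node fails is missing from the backup at takeover. That gap is the price of delayed replication.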

Node failure can be caused by several factors, such as network failure, process-level failure, or infrastructure failure. Regardless of the root cause, the failover mechanism should function as intended. There are several ways to check the health of a node, chiefly heartbeat checks and daemon-level checks. Heartbeat checks are only useful for local area failover configurations. Regional failover health checks are better performed by remotely probing daemons, such as HTTP on port 80 or HTTPS on port 443.
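A remote daemon-level check can be sketched in a few lines of Python using the standard library's `http.client`. The function `probe_http`, the `FailureDetector` class, and the threshold of three consecutive failures are assumptions made for this example, not a prescribed design; treating any non-5xx response as healthy is likewise just one reasonable policy.

```python
import http.client
from typing import Callable


def probe_http(host: str, port: int = 80, path: str = "/",
               timeout: float = 3.0, use_tls: bool = False) -> bool:
    """Return True if the remote daemon answers with a non-5xx status."""
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status < 500
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, and TLS errors all count as a failed probe.
        return False
    finally:
        conn.close()


class FailureDetector:
    """Declares a node failed only after several consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], threshold: int = 3):
        self.probe = probe
        self.threshold = threshold
        self.consecutive_failures = 0

    def check(self) -> bool:
        """Run one probe; return True when the node should be considered failed."""
        if self.probe():
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

A detector for an HTTPS daemon could be built as `FailureDetector(lambda: probe_http("www.example.com", 443, use_tls=True))`, with `check()` called on a timer. Requiring several failures in a row keeps a single dropped probe from triggering a failover.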

A strong DNS system is essential to a seamless failover solution, since DNS is how domains ultimately resolve to IP addresses. For a seamless failover system, the DNS system has to be dispersed worldwide and serve records with very low propagation times (in practice, short TTL values). Dispersal ensures resiliency at the DNS level, but it is not enough on its own: all DNS actions also have to be coordinated among the individual DNS servers.
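The coordination requirement can be illustrated with a small in-memory model. Everything here is hypothetical: `ARecord`, `DnsServer`, and `coordinated_update` are names invented for the sketch, and real DNS servers are updated through their own management interfaces. The idea is simply that a failover change is applied to every server or to none, so that no two servers hand out different answers.

```python
from dataclasses import dataclass, field


@dataclass
class ARecord:
    ip: str
    ttl: int  # seconds a resolver may cache the answer


@dataclass
class DnsServer:
    name: str
    records: dict = field(default_factory=dict)
    reachable: bool = True  # flipped to False to simulate an outage

    def set_record(self, domain, record):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.records[domain] = record


def coordinated_update(servers, domain, record):
    """Apply the record to all servers, or roll back and return False."""
    updated = []  # (server, previous record) pairs, for rollback
    try:
        for server in servers:
            previous = server.records.get(domain)
            server.set_record(domain, record)
            updated.append((server, previous))
    except ConnectionError:
        # Undo the partial change so every server still agrees.
        for server, previous in updated:
            if previous is None:
                server.records.pop(domain, None)
            else:
                server.records[domain] = previous
        return False
    return True
```

In this model a failover would call `coordinated_update(servers, "example.com", ARecord("198.51.100.20", ttl=60))`. The short TTL bounds how long resolvers may keep serving the old address from cache once the change is in place.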

In this post, we have touched briefly on each of the three elements that make up a failover system. We shall continue covering all aspects of failover in subsequent entries.

Stay tuned!
