Admin Process (AP) Quorum
To ensure safety (never returning an incorrect result), any change to the domain configuration requires an Admin Process (AP) quorum.
An AP quorum exists when the APs that are available to the other running APs are a majority of the number of APs in the domain.
Typically, the minimum number of APs required for a majority is easy to identify.
You can also apply one of the following formulas according to whether there is an odd number or an even number of APs in the domain membership.
Suppose that there are n APs in the domain.
Number of APs in the Domain Membership | Minimum Number of Available APs Required for Quorum |
---|---|
Even number | (n / 2) + 1 |
Odd number | (n + 1) / 2 |
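The majority rule can be sketched in a few lines of Python. This is an illustration only, not part of NuoDB: the function name `min_aps_for_quorum` is hypothetical, and the two branches simply encode "a majority of n" for even and odd values of n.

```python
def min_aps_for_quorum(n: int) -> int:
    """Return the smallest number of available APs that is a majority of n APs.

    Hypothetical helper for illustration; not a NuoDB API.
    """
    if n < 1:
        raise ValueError("a domain must contain at least one AP")
    if n % 2 == 0:
        # Even membership: one half of the APs, plus one.
        return n // 2 + 1
    # Odd membership: (n + 1) / 2, which is always a whole number here.
    return (n + 1) // 2


if __name__ == "__main__":
    for n in (2, 3, 5, 6):
        print(f"{n} APs in the domain: quorum needs {min_aps_for_quorum(n)}")
```

Both branches give the same value as the integer expression `n // 2 + 1`, so a single formula works if the even and odd cases do not need to be distinguished.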
For maximum failure tolerance, create an Admin Domain that contains an odd number of APs (three or more) and an initialMembership of more than one AP.
This reduces the chance of losing the AP quorum when an AP fails or becomes unreachable because of a network or hardware failure.
In addition, including more than one AP in the initialMembership will avoid a possible situation where a single initial membership AP (e.g. admin-0) loses its raftlog storage system and is then unable to rejoin the domain.
Changes to the initialMembership cannot be made after the Domain has been bootstrapped.
For more information about configuring the Admin Domain and the initialMembership, see Configuring NuoDB Admin.
Without an AP quorum, a database may continue to run and serve clients. See Confirming Domain and Database Status.
To perform any of the following tasks, there must be an AP quorum because each of these tasks entails an update to the durable domain configuration, which is maintained by each AP:
- Creating a database
- Cold restart (no processes are running) of an existing database
- Starting a TE or SM
- Adding or removing an AP from the domain
NuoDB APs function according to the Raft Consensus Algorithm. The durable domain configuration is implemented by using domain state machines (DSM) that use a Raft log. See Durable Domain Configuration.
When a domain contains an even number of APs, a quorum requires that one half of the total number of APs, plus one, are available.
Admin Domain Configuration Examples
Two Admin Processes
If a domain contains two APs, they both must be running for there to be an AP quorum. If an AP disconnects from a domain with two APs, the domain no longer has an AP quorum. None of the tasks that change the durable domain configuration can be performed until an AP quorum is restored. While a domain with two APs provides durability if one AP fails, a domain must have at least three APs to continue normal operations if one AP fails.
Three Admin Processes
If a domain contains three APs (as in the example in Admin Process (AP)) it is fully operational as long as two APs are available. If one AP host machine is disconnected from the network, the other APs continue to allow safe operations in the domain. Also, they continue to ping and try to reconnect to the missing AP until it is back online. Upon reconnection, the third AP safely catches up by synchronizing its durable domain configuration with the durable domain configuration of the other two APs.
In a domain that has three APs, the minimum number of available APs required for a quorum is two. However, an important distinction from a domain with two APs is that, in the domain with three APs, one AP can disconnect from the domain and the domain continues to operate safely.
Five Admin Processes
If a domain contains five APs then there is a quorum when there are at least three available APs.
At any given moment, the leader AP (see Admin Process (AP)) determines whether there is a quorum. It does not matter whether there is more than one region. The leader AP determines the number of available APs in the domain regardless of region.
Minimum Admin Processes Required Table
The following table shows the minimum number of APs that are required for a quorum according to the number of APs in the durable domain configuration. It also shows the maximum number of APs in a domain that can fail without limiting the performance of domain tasks.
Number of APs in the Durable Domain Configuration | Minimum Number of APs Required for Quorum | Maximum Number of APs That Can Fail Without Limiting Operations |
---|---|---|
1 | 1 | 0 |
2 | 2 | 0 |
3 | 2 | 1 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 2 |
7 | 4 | 3 |
8 | 5 | 3 |
9 | 5 | 4 |
10 | 6 | 4 |
As you configure your domain, you should consider what portion of your domain can become disconnected without preventing domain operations. In your domain, if you assign APs to multiple regions, remember that it is the number of available APs in the domain, and not in a particular region, that determines whether there is a quorum. For example, suppose there is a domain with the following configuration:
- APs AP1, AP2, and AP3 are in the west region.
- APs AP4 and AP5 are in the central region.
- AP AP6 is in the east region.
The following figure illustrates this configuration:
This domain has six APs, which means that a quorum requires a minimum of four APs to be available.
If the east region becomes disconnected, then one AP becomes unavailable, as shown in the following figure:
The domain has an AP quorum as long as at least four of the five APs in the west and central regions are available.
If only the central region is disconnected, then the domain loses two APs, as shown in the following figure:
Together, the west and east regions have four APs.
As long as they are all available, the domain has an AP quorum.
However, if the central and east regions are connected but the west region becomes disconnected, an AP quorum is no longer possible.
Together, the central and east regions have only three available APs, as shown in the following figure:
The loss of the west region is a potential single point of failure with regard to tasks that change the durable domain configuration.
This is the case even though two other regions remain connected.
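The same reasoning can be applied to any planned layout before deployment. The following Python sketch checks, for the six-AP example above, whether the domain keeps a quorum when one whole region is disconnected. It assumes that every AP in the remaining regions is available; the `regions` mapping and the `has_quorum_without` function are hypothetical names used only for this illustration.

```python
# Example layout from this section: AP names mapped to their regions.
regions = {
    "west": ["AP1", "AP2", "AP3"],
    "central": ["AP4", "AP5"],
    "east": ["AP6"],
}


def has_quorum_without(layout: dict[str, list[str]], lost_region: str) -> bool:
    """Return True if the domain still has a majority after losing one region.

    Assumes all APs outside lost_region remain available (illustrative only).
    """
    total = sum(len(aps) for aps in layout.values())
    available = total - len(layout[lost_region])
    return available >= total // 2 + 1


survives = {name: has_quorum_without(regions, name) for name in regions}

if __name__ == "__main__":
    for name, ok in survives.items():
        status = "quorum kept" if ok else "quorum lost"
        print(f"lose {name}: {status}")
```

Any region whose loss leaves fewer than a majority of APs, such as west in this layout, is a single point of failure for tasks that change the durable domain configuration.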