About Safe Commit Protocol

NuoDB strongly recommends that you use the safe commit protocol.

Table Partitions and Storage Groups automatically use the safe commit protocol.

The safe commit protocol is enabled when you start a Transaction Engine.

Note: You need to restart a Transaction Engine in order to change the commit setting of that Transaction Engine.

Note: In managed databases, the default setting for the commit option is remote:1. In unmanaged databases, the commit option is whatever the user specifies when starting each engine; the default engine commit option is local.

Details about the safe commit protocol are organized as follows:

What the Safe Commit Protocol Does
Restarting Databases That Use Safe Commit
Durability Under Safe Commit
Examples of Safe Commit Behavior
Comparison with remote:n Commit Protocol

What the Safe Commit Protocol Does

The safe commit protocol works as follows for successful commits that insert, delete or update data:

  1. The client requests a commit.
  2. The transaction engine sends a pre-commit message to each of the database's available storage managers and transaction engines (all nodes).
  3. Each available storage manager acknowledges the pre-commit.
  4. The transaction engine sends a commit message to the storage managers.
  5. The transaction engine acknowledges the commit to the client.
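The exchange above can be sketched as follows. This is an illustrative model only; the class and method names are hypothetical and do not correspond to NuoDB's implementation.

```python
# Illustrative sketch of the safe commit message flow (not NuoDB code).

class StorageManager:
    def __init__(self, name):
        self.name = name
        self.committed = set()

    def pre_commit(self, txn_id):
        # Durably record the transaction's changes, then acknowledge.
        return True

    def commit(self, txn_id):
        self.committed.add(txn_id)


class TransactionEngine:
    def __init__(self, storage_managers):
        self.storage_managers = storage_managers

    def commit(self, txn_id):
        # Steps 2-3: send pre-commit to every available SM, collect acks.
        acks = [sm.pre_commit(txn_id) for sm in self.storage_managers]
        if not all(acks):
            raise RuntimeError(f"Transaction {txn_id} failed to pre-commit")
        # Step 4: send the commit message to the storage managers.
        for sm in self.storage_managers:
            sm.commit(txn_id)
        # Step 5: acknowledge the commit to the client. The transaction
        # is now both visible to other clients and durable.
        return True
```

For example, with two storage managers, `TransactionEngine([StorageManager("SM1"), StorageManager("SM2")]).commit(42)` returns only after both SMs have acknowledged the pre-commit.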

Note: A transaction that modifies the database modifies one or more storage groups. These are the modified storage groups of a transaction. In order to commit a transaction T, T must successfully pre-commit to the running SMs that serve the modified storage groups of T.

As for all commit protocols, the acknowledgment to the requesting client indicates that the transaction is visible to other clients. With safe commit, the acknowledgment to the requesting client also guarantees that the transaction commit is durable.

Even when a database uses the safe commit protocol, there are several ways a transaction can fail, and the outcome depends on which failure occurs.

Note: If a storage group SG goes offline, an ongoing transaction that modifies SG will not be resolved as committed or failed until that storage group comes back online (or is deleted). The client waiting to commit that transaction will block. If the commit fails after all modified storage groups come back online, the error message will be in the format "Transaction NNNNN failed because storage groups X, Y, Z went offline during commit".

Restarting Databases That Use Safe Commit

When you need to restart a database that uses the safe commit protocol, you must do so as follows:

Durability Under Safe Commit

Failure of a database process (TE or SM) is typically transient, in that the process can be restarted. For example, if a storage manager fails (perhaps due to a power failure), you can usually resolve the failure (restore power) and restart the storage manager. In rare cases, storage media can suffer a permanent failure that prevents the storage manager from being restarted. This results in the permanent loss of an archive or journal. A permanently failed database process cannot be restarted and so cannot resume serving the database.

A failure event is over when the failed database process is replaced with a running database process. For example, a failure event caused by a permanently lost disk is over when the failed storage manager is replaced with a running storage manager.

An archive that permanently fails is no longer available. An unavailable archive might cause a database to be unable to enforce durability. This can happen if you need to perform a cold restart of the database and the missing, failed archive is the only archive that contains one or more updates.

Note: A cold restart means that the database is completely shut down; no database processes are running when you start the database. In this configuration, if an archive is permanently lost when restarting a storage group, then durability may be violated (committed transactions may be lost).

To enforce durability in the event of one or more permanently failed archives, set the max-lost-archives database option to the number of simultaneous permanent archive losses that the database must tolerate. Each increment raises the number of storage managers that must be running before a write transaction can commit:

  max-lost-archives   Running SMs required to commit writes
  0 (default)         1
  1                   2
  2                   3
  3                   4
  4                   5
  5                   6

Note: This table shows settings for up to 5 permanent failures to illustrate the behavior. This is not an enforced limit.

By default, max-lost-archives is set to 0. This means that a database cannot tolerate the permanent loss of any archives. In this configuration, if an archive is permanently lost when restarting a storage group, then durability may be violated (committed transactions may be lost). If you set max-lost-archives to 1, your database can permanently lose 1 archive at a time and a cold restart will not violate durability. In general, setting max-lost-archives to n ensures durability after a cold restart even if up to n archives were permanently lost.

There is a cost for ensuring durability in the event of permanent storage failure. While the number of running storage managers is less than or equal to the value of the max-lost-archives database option, the database is unavailable to commit transactions that insert, delete, or update data. This is how durability is guaranteed if a cold restart of the database is required. Read-only transactions can still be executed.

When configuring a database, you must consider the likelihood of permanently losing a disk, and set max-lost-archives appropriately. Of course, it is not possible to configure safe commit to ensure durability in the event that all archives become permanently lost at one time. Safe commit does not replace a backup plan.
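The availability rule described above can be expressed as a simple predicate. This is a sketch; the function name is hypothetical, not a NuoDB API.

```python
def can_commit_writes(running_sms: int, max_lost_archives: int) -> bool:
    """A write commit is allowed only while the number of running
    storage managers exceeds max-lost-archives (illustrative sketch)."""
    return running_sms > max_lost_archives

# With the default max-lost-archives of 0, one running SM suffices:
assert can_commit_writes(running_sms=1, max_lost_archives=0)
# With max-lost-archives set to 1, one running SM is not enough:
assert not can_commit_writes(running_sms=1, max_lost_archives=1)
```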

Examples of Safe Commit Behavior

This section contains three examples:

Scenario 1 - Permanent Storage Loss without Cold Restart

Scenario 2 - Permanent Storage Loss with Cold Restart

Scenario 3 - Permanent Storage Loss with Cold Restart

Consider the following database configuration: one transaction engine (TE1) and two storage managers (SM1 and SM2).

Scenario 1 - Permanent Storage Loss without Cold Restart

In the following scenario, there is a permanent storage loss but a cold restart is not needed. The database continues to run and durability is not violated.

  1. TE1 commits transaction T1.
  2. SM1 crashes but the disk is not lost. SM2 continues running.
  3. TE1 commits transaction T2. This is allowed because when max-lost-archives is set to 0, only one storage manager is required to be running in order to commit transactions that insert, delete, or update data. Only SM2 has T2.
  4. Restart SM1.
  5. SM1 synchronizes with SM2. Both SM1 and SM2 have T2.
  6. SM2 crashes and cannot be restarted.
  7. Start SM3.
  8. SM3 synchronizes with SM1. Both SMs have T2.
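The steps of Scenario 1 can be replayed as a toy model: the crashed SM resynchronizes before the other archive is permanently lost, so no committed transaction is lost. Illustrative only; not NuoDB code.

```python
# Toy replay of Scenario 1, modeling each archive as a set of
# committed transactions (illustrative sketch, not NuoDB code).

archives = {"SM1": {"T1"}, "SM2": {"T1"}}       # step 1: both SMs have T1
archives["SM2"].add("T2")                        # steps 2-3: SM1 is down, T2 lands on SM2 only
archives["SM1"] |= archives["SM2"]               # steps 4-5: SM1 restarts and syncs from SM2
del archives["SM2"]                              # step 6: SM2's disk is permanently lost
archives["SM3"] = set(archives["SM1"])           # steps 7-8: SM3 starts and syncs from SM1

# Every acknowledged transaction survives on the remaining archives.
assert archives["SM1"] == {"T1", "T2"}
assert archives["SM3"] == {"T1", "T2"}
```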

Scenario 2 - Permanent Storage Loss with Cold Restart

In this scenario, a sequence of failures (perhaps a power outage) requires a cold restart. The complicating factor is the permanent failure of SM2, which (in this case) results in the loss of transaction T2. Because the database does not continue to run, durability is violated.

  1. TE1 commits transaction T1.
  2. SM1 crashes but the disk is not lost. SM2 continues running.
  3. TE1 commits transaction T2, which modifies the database. Only SM2 has T2.
  4. SM2 crashes and cannot be restarted.
  5. The only transaction engine terminates. The database is down.
  6. Restart SM1. This archive defines the database.
  7. Start a TE. T2 is lost.

Durability is violated because T2 was committed on only one archive (SM2) and that archive was permanently lost before another archive could synchronize with it.

Note: If all Storage Managers crash, Transaction Engines do not continue running; they also terminate. This is why a cold restart of the database is required in this scenario.

Scenario 3 - Permanent Storage Loss with Cold Restart

Here, we have the same sequence as Scenario 2 - Permanent Storage Loss with Cold Restart, but with max-lost-archives set to 1 instead of 0:

  1. TE1 commits transaction T1.
  2. SM1 crashes but the disk is not lost. SM2 continues running.
  3. TE1 tries to commit transaction T2, which would modify the database.
  4. NuoDB does not allow this commit because when max-lost-archives is set to 1, two storage managers must be running in order to commit a transaction that inserts, deletes, or updates data.
  5. SM2 crashes and cannot be restarted.
  6. The only transaction engine terminates. The database is down.
  7. Restart SM1. This archive defines the database.
  8. Start a TE. No transactions are lost.

In this scenario, durability is not violated. No transactions were committed when there was only one storage manager running. A cold restart was required and durability was maintained.
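The difference between Scenarios 2 and 3 comes down to the commit gate: with max-lost-archives set to 0, the write in step 3 succeeds on the sole surviving SM (and is later lost), while with the option set to 1 the same write is refused. The following toy replay illustrates this; the names and simulation are ours, not NuoDB code.

```python
# Toy replay of Scenarios 2 and 3 (illustrative sketch, not NuoDB code).

def run_scenario(max_lost_archives):
    acknowledged = []                 # transactions acknowledged to clients
    archives = {"SM1": set(), "SM2": set()}
    running = {"SM1", "SM2"}

    def commit(txn):
        if len(running) <= max_lost_archives:
            return False              # safe commit refuses the write
        for sm in running:
            archives[sm].add(txn)
        acknowledged.append(txn)
        return True

    commit("T1")                      # both SMs running: T1 lands on SM1 and SM2
    running.discard("SM1")            # SM1 crashes (disk intact)
    commit("T2")                      # allowed only if enough SMs are running
    del archives["SM2"]               # SM2's disk is permanently lost
    running.clear()                   # database is down; cold restart from SM1
    surviving = archives["SM1"]
    return [t for t in acknowledged if t not in surviving]

assert run_scenario(max_lost_archives=0) == ["T2"]   # Scenario 2: T2 is lost
assert run_scenario(max_lost_archives=1) == []       # Scenario 3: durable
```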

Comparison with remote:n Commit Protocol

With the safe commit protocol, you can add or remove storage managers without changing the database's configuration and durability continues to be guaranteed. The safe commit protocol always requires a commit acknowledgment from each running storage manager.

With the remote:n commit protocol, you can replace n with the number of the database's storage managers and achieve the same durability guarantee as the safe commit protocol. However, if you add or remove storage managers and you want to maintain the durability guarantee then you must update the database's configuration.
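The contrast can be sketched as follows: safe commit derives its required acknowledgment count from the current set of running storage managers, while remote:n fixes it in configuration. The helper name below is hypothetical, not a NuoDB API.

```python
def required_sm_acks(protocol, running_sms, remote_n=None):
    """How many SM acknowledgments a write commit must collect (sketch).

    Safe commit tracks the set of running storage managers, so adding or
    removing an SM automatically adjusts the requirement. remote:n is a
    fixed number that must be updated by hand to keep the same guarantee.
    """
    if protocol == "safe":
        return len(running_sms)
    if protocol == "remote":
        return remote_n
    raise ValueError(f"unknown protocol: {protocol}")

# Two SMs: remote:2 matches safe commit's guarantee.
assert required_sm_acks("safe", ["SM1", "SM2"]) == \
       required_sm_acks("remote", ["SM1", "SM2"], remote_n=2)

# Add a third SM: safe commit adapts; remote:2 now requires too few acks.
assert required_sm_acks("safe", ["SM1", "SM2", "SM3"]) == 3
assert required_sm_acks("remote", ["SM1", "SM2", "SM3"], remote_n=2) == 2
```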

For example, suppose your database uses two storage managers and the commit option is set to remote:2. You have the same durability guarantee as if the database were using the safe commit protocol. To continue to enforce the durability guarantee after you add a storage manager, you need to do the following:

  1. Shut down the database's running transaction engines.
  2. Perform the following two steps in either order: