How Journaling Ensures Data Recovery

Data Recovery Scenario

Remote Commit with Journaling

Two inserts are made to foo. As before, one will be persisted to the archive before the SM goes offline, the other will not. The TE sends a pre-commit message for each transaction to the SM and waits for an ACK message in return. The SM will only send its ACK message after the journal write is complete. When it is, the SM acknowledges the message to the TE, which then sends a commit ACK back to the client.

The first transaction, foo.1, is queued and then written to the archive. The second transaction, foo.2, is queued to the archive as before, but has not yet been written when the SM terminates. This example assumes the commit messages were not cached in the TE memory when the SM terminated. This scenario, where the commit cache is cleared before data is archived, can occur when the SM terminates, when a user kills one or more processes, or when power is lost on the host. The following diagrams illustrate the role of the journal in a use case where data would be temporarily lost and then automatically recovered at SM restart.

As before, two inserts are made to foo. In this example, transaction messages are written to the journal before a pre-commit acknowledgment is issued. Any transactions that have not yet been persisted to the archive when the SM terminates are preserved in the journal.
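The ordering described above (journal first, acknowledge second, archive later) can be sketched as a toy model. This is an illustration only, not NuoDB's implementation: the class `ToyStorageManager`, its methods, and the use of in-memory Python lists and dicts to stand in for the on-disk journal and archive are all assumptions made for the example.

```python
class ToyStorageManager:
    """Toy model of journal-before-ACK ordering (hypothetical, not NuoDB code)."""

    def __init__(self):
        self.journal = []   # stands in for the durable journal on disk
        self.archive = {}   # stands in for the durable archive on disk
        self.pending = []   # in-memory queue of archive writes, lost on termination

    def pre_commit(self, txn_id, row):
        # 1. Record the message in the journal first.
        self.journal.append((txn_id, row))
        # 2. Queue the archive write; it is performed later.
        self.pending.append((txn_id, row))
        # 3. Only now acknowledge the pre-commit to the TE.
        return "ACK"

    def write_next_to_archive(self):
        # Persist the oldest queued write to the archive.
        txn_id, row = self.pending.pop(0)
        self.archive[txn_id] = row

    def terminate(self):
        # Everything held in memory is lost; journal and archive survive.
        self.pending.clear()


sm = ToyStorageManager()
ack1 = sm.pre_commit("foo.1", {"id": 1})
ack2 = sm.pre_commit("foo.2", {"id": 2})
sm.write_next_to_archive()  # foo.1 reaches the archive
sm.terminate()              # foo.2 was still only queued

print(sorted(sm.archive))          # ['foo.1']
print([t for t, _ in sm.journal])  # ['foo.1', 'foo.2']
```

In this sketch both transactions were acknowledged, yet only foo.1 reached the archive. The journal still holds a record of foo.2, which is what makes recovery possible.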

Note: If the Journal's volume runs out of disk space, the SM will ASSERT ("...file write failed: No space left on device"). To resolve this issue, free up additional disk space, delete the Journal and run nuochk on the archive.

State of the SM Post-termination

When the storage manager goes down, all transaction and atom messages held in its memory are lost, as are the connections to the archive and journal directories. (Atoms are the internal object structure representing all data in a NuoDB database. They are self-coordinating objects that represent specific types of information, such as data, indexes, or schemas.) However, transaction engine messages that have been acknowledged are guaranteed to have been written to the journal by the SM. In this example, pre-commit messages for foo.1 and foo.2 were written to the journal before the pre-commit messages were acknowledged. Thus a record of each transaction survives even if the atoms were not persisted to the archive.

SM Restart and Query

Automatic Data Recovery at SM Restart

Recovery is immediate and fully automated when the storage manager is restarted and pointed at the archive and journal. The journal recovery reader starts automatically and reads the journal into memory. The SM checks the journaled messages against the data persisted in the archive. If the archive is not consistent with the journal, the SM recreates the data from the journaled messages and persists it to the archive.
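The check-and-replay step can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual journal recovery reader: the function `recover`, the list-of-tuples journal, and the dict archive are hypothetical stand-ins, and consistency is reduced to "is this transaction present in the archive".

```python
def recover(journal, archive):
    """Replay journaled messages that are missing from the archive.

    Returns the ids of the transactions that had to be recreated.
    (Hypothetical helper for illustration only.)
    """
    recovered = []
    for txn_id, row in journal:
        if txn_id not in archive:
            # The archive is behind the journal: recreate and persist the data.
            archive[txn_id] = row
            recovered.append(txn_id)
    return recovered


# State after the SM terminated: both messages journaled, only foo.1 archived.
journal = [("foo.1", {"id": 1}), ("foo.2", {"id": 2})]
archive = {"foo.1": {"id": 1}}

recovered = recover(journal, archive)
print(recovered)        # ['foo.2']
print(sorted(archive))  # ['foo.1', 'foo.2']
```

Replaying only what is missing keeps the step idempotent in this sketch: running `recover` a second time against the same journal and archive recreates nothing.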

Data Recovered

After the journal recovery reader and archive writer have recovered the data that was committed but not yet persisted, the archive is up to date and, when queried, returns a complete and consistent view of the database.