Using Journal Hot Copy

A journal hot copy creates a transactionally-consistent copy of the changes in an SM since the previous journal hot copy.

An SM Running a Journal Hot Copy
Figure 1. An SM Running an Journal Hot Copy
  • Like full and incremental hot copy, journal hot copy includes every storage group the hot copied SM serves.

  • Journal hot copy also requires the use of a backup set.

  • Journal hot copy enables point-in-time restore, allowing restoration to a chosen transaction.

  • Unlike full and incremental hot copy, journal hot copy does not copy atom files, it only copies journal files.

  • A journal hot copy of an SM is approximately the size of all journal files created since the last journal hot copy. This is proportional to the number of changes made in the SM’s archive.

    • Note that for a small database with a high rate of change, the backup set can exceed the size of the original database due to the accumulation of incementals and journals. Run full hot copy more frequently to avoid this.

  • Once a full hot copy has commenced, it is only possible to run journal hot copy against that new backup set. However, journal hot copies can continue while the full hot copy is running. Note that this has implications for Backup Set Location.

  • SMs ordinarily reap (delete) journals once their messages have been processed and changed atoms copied into the archive. Journal hot copy delays that process; the journal must also have been backed up before it can be reaped. This has implications for the size of the journal volume and how often you run journal hot copy.

Journal hot copy is temporarily suspended by archive synchronization. To re-enable journal hot copy after archive synchronization, run a full or incremental hot copy on the SM after archive synchronization completes.

Point in Time Restore

When restoring from a backup set containing journal hot copies, the user must request a transaction to restore. This is known as a point-in-time restore. To restore the latest state possible specify the most recent transaction in the backup set.

NuoDB Archive will apply only those messages from the hot copied journal files necessary to restore this transaction (and its dependencies).

  • If the backup set contains no incremental hot copies, then journal messages will be applied to the full hot copy. However applying journal files during restore can be expensive.

  • If the backup set contains both journal hot copies and incremental hot copies, restore will only apply journal files that were hot copied after the last incremental hot copy that was taken before the transaction being restored. This minimizes restore time.

It is important to take periodic incremental hot copies, or else to start new backup sets regularly with full hot copy. Otherwise the latency of point-in-time restore may become large.

Backup Set Location

If the point-in-time chosen occurs during the time a full hot copy backup was occurring, restoration will require both the backup set created by that full hotcopy and the previous backup set.

  • For this reason, backup sets must be created within the same parent directory.

    • This is not necessary if you are only running incremental backups, but keeping backup sets together in this way does make them easier to manage.

  • When performing a point-in-time restore, simply specify the later backup set and, if needed, nuoarchive will look for the previous backup set stored alongside.

  • When compressing or removing backup sets, make sure to keep at least the two most recent backup sets. In practice this means you need to allow room for three backup sets. Once the third backup set has been created by a full hot copy, you can consider removing the first.

  • Try to schedule full hot copies during a quiet time, if possible, to reduce both the load on the SM and the likelihood of a point-in-time transaction overlapping two backup sets in this manner.

Enabling Journal Hot Copy

To use journal hot copy, an SM must first be enabled with the journal-hot-copy startup option:

nuocmd start process --db-name test --engine-type SM --options journal-hot-copy enable ...

Typically journal hotcopy only needs enabling for some SMs, but it depends on your requirements. For this reason, it is not recommended to set journal-hot-copy enable as a default option when the database is created.

When started with the option journal-hot-copy enable, an SM will retain each journal file on disk until it has been copied into a backup set.

This means that unless journal hot copy is run often enough, an SM running with journal hot copy enabled could exhaust free disk space on the journal device.

  • To avoid this situation, run journal hot copy regularly and ensure that it was successful.

  • If a journal hot copy fails, the journals cannot be reaped.

  • If an SM runs out of disk space it shuts down as it can no longer write changes. By shutting down, it allows the database to continue provided there are other SM(s) running.

Recommendations:

  • You should monitor the size of the journal directory on each hot copy SM to understand the necessary frequency for journal hot copy, based on your application workload.

  • Also monitor for failed hot copies as these can cause the disk to fill with unreaped journals.

Creating a large, empty "balloon" file in the journal directory is also recommended (it can be quickly deleted at any time to provide extra disk space in an emergency).

Journal Hot Copy and Point-in-Time Restore

To request a journal hot copy specify --type journal when executing the hot copy and specify the latest backup set directory.

  • You must use the most-recently created backup set when initiating journal hot copy. This is different to incremental backup which can be run against any backup set containing a successfull full hot copy.

  • The backup set directory does not need to contain a completed full hot copy - the full hot copy may be in progress when the journal hot copy is executed.

Transactions from journal hot copies in a backup set must be restored into a new archive in order to use them to start an SM. Restoration using journal hot copies is a two-step process using the nuoarchive utility.

  1. Identify a transaction of interest in a backup set by its timestamp using the --report-timestamps option of nuoarchive.

    nuoarchive restore --report-timestamps <backup-set-directory>
  2. Restore up to and including that transaction by passing the transaction id to the --restore-snapshot option:

    nuoarchive restore --restore-snapshot <txn-id> --restore-dir <dest-dir> <backup-set-directory>
    • This will extract up to and including the transaction specified by --restore-snapshot into the directory specified by --restore-dir.

    • Note that <dest-dir> must either be empty or not exist.

    • Only committed transactions can be restored.

Which Transaction

In a production system with many transactions running concurrently it can be very hard to identify the transaction to restore to just using --report-timestamps.

A database administrator or application may choose to make note of transaction ids of interest and save them for later with associated information (such as the SQL that was run or other context). Such information might be saved to a dedicated database table.

  • For example, during an important transaction like an ETL, prior to commit, the transaction id might be extracted via gettransactionid() and saved in case it is needed later for point-in-time restore.

Restoring a transaction will restore the effects of that transaction and the effects of all transactions visible to it at the time it executed on a TE (its dependencies, that is all previously committed transactions that the selected transaction depends on).

There is no option to restore all transactions in a journal hot copy in a backup set. However, you can do this by specifying the transaction id of the last transaction listed by nuoarchive restore --report-timestamps.

Examples

Example 1: Running Journal Hot Copy into a Backup Set

In this example:

  • An empty table foo with a single integer column has already been created.

  • We will use archive id 0 to generate our backups.

Procedure:

  1. Before journal hot copy can be executed, a backup set must exist.

    Create a backup set first using full hot copy (this step is identical regardless whether incremental and/or journal backup is being used):

    nuocmd hotcopy database --db-name test --type full --backup-dirs 0 /volumes/backups/test-2022-04-18 --timeout 60
  2. To produce an interesting journal hot copy, we’ll perform three inserts using autocommit into the empty table foo.

    SQL> insert into foo values(1);
    SQL> insert into foo values(2);
    SQL> insert into foo values(3);

    To perform a point-in-time restoration to one of these transactions you will need to know its transaction-id

    1. Use SQL statement and transaction logging to log record transactions as they happen (however this will significantly increase the size of nuaodmin.log).

    2. Switch to explicit transactions for each insert and get the transation id in the SQL:

      SQL> start transaction;
      SQL> insert into foo values(1);
      SQL> select gettransactionid() from dual;  -- Your application must record this
      SQL> commit
      -- Repeat for each insert
  3. Perform a type journal hot copy into the new backup set. Journal backup must always be to your latest backup set.

    nuocmd hotcopy database --db-name test --type journal ---backup-dirs 0 /volumes/backups/test-2022-04-18 --timeout 60
    • Note that, unlike incremental hot copy, the full hot copy invoked to create the backup set does not need to have finished before running the journal hot copy.

Example 2: Restoring to a Point in Time

  1. Use nuoarchive to report the timestamps of transactions available for restore. Three transactions are restorable.

    nuoarchive restore --report-timestamps --timeout 60 /volumes/backups/test-2022-04-18
    
    2021-12-19T20:58:15 6530
    2021-12-19T20:58:16 6658
    2021-12-19T20:58:18 6786
  2. We will restore the transaction with id 6658. It corresponds to the second transaction we committed: insert into foo values (2).

    nuoarchive restore --restore-snapshot 6658 \
                       --restore-dir /volumes/archives/test2 \
                       /volumes/backups/test-2022-04-18
  3. If we start a new database on the restored archive (say test2) we can see that we restored the requested transaction.

    $ nuosql test2 --user <username>
    Password:
    SQL> select * from foo;
    N
    --
    1
    2