NuoDB System Architecture

NuoDB is a distributed, relational database management system that uses a microservice-style architecture that divides processing into three layers:

Management
Transaction
Storage

Figure 1. System Overview

Management Layer (Administration)

The administration layer consists of Admin Processes (APs). The AP is a Java application called nuoadmin. On a bare-metal or VM-based implementation, every node (host) must have an AP running on it. In a containerized implementation such as Kubernetes, far fewer APs are typically deployed and they run in their own separate pods.

The APs also provide a connection brokering and load-balancing service to clients - see Client Connections below.

Transaction Layer (SQL Processing)

Transaction Engines (TEs) run in this layer and perform the parsing, optimization and execution of queries. They fetch data as required, either from another TE or an SM. All changes are passed to the SMs to be committed to disk. Once data has been loaded by a TE, it is stored in its cache for faster access next time. For more information, see Data Caching.

This layer is where Atomicity, Consistency and Isolation are implemented. NuoDB offers both SQL compliance and ACID transactions even though the database is distributed. This is in marked contrast to many NoSQL databases that typically have their own custom "SQL-like" query languages and usually do not offer ACID guarantees.

Multiple TEs are recommended both for high-availability (redundancy) and throughput (to handle the SQL load on the database).

Storage Layer (Persistence)

Storage Managers (SMs) are responsible for persistence (Durability). The TEs communicate with SMs either to request data or to store data. Data is stored in files (ideally on a local disk) and reread as necessary.

In NuoDB, the directory on the disk and all its contents is called an archive (unlike other databases where an archive is a backup). By default, every SM manages a complete copy of the entire database. NuoDB can use an SM archive to restore individual SMs after a planned or unplanned outage. For more information, see Archive Synchronization.

NuoDB aligns with the principles of the CAP Theorem by prioritizing data consistency over availability. When a TE needs to commit changes, it sends the data to all the running SMs. The TE waits for all the SMs to acknowledge the commit before acknowledging the commit to its client. This approach may cause a delay. However, it ensures that all running SMs are always up to date.

For performance and resilience, like most database systems, NuoDB appends all changes to a journal on disk, only periodically writing changes to its archive. Once changes in the journal have been written to the archive, they are removed from the journal (a process called reaping). If a system crash occurs, NuoDB uses the journal to recover the database to a consistent state.

Domain

The set of APs, and any databases they manage, are referred to as a NuoDB Domain. Databases are unique to their domain and the TEs and SMs for a database are unique to that database. TEs and SMs are never shared by more than one database.

NuoDB provides a command-line utility, nuocmd, for domain and database management. The utility communicates with APs via a REST-style interface over HTTP or HTTPS, and the APs, in turn carry out the requests. nuocmd can be used to create archives and databases, start and stop database processes, fetch runtime and debugging information, and much more. The REST API is documented and can be used by your own scripts or applications.

Subdividing a database across subsets of SMs is possible using Data Partitioning. Partitioning allows each SM to contain only a portion of the entire database.

For a form of Eventual Consistency (data is always available, but may be slightly out of date) use Asynchronous SMs.

Client Connections

Processes may come and go in a distributed environment, for example due to scaling in or out. To avoid clients having to keep track of the processes in the database, clients always connect via an AP (any AP, it does not matter which). All the APs have complete knowledge of all processes in the domain at all times and can return the address of a suitable TE for the client to talk to.

Once connected to a TE, the client will run SQL directly against that TE. They only ever need to talk to an AP again if they need new connections. Clients never talk to SMs.

Connections should be made to timeout, allowing the AP to load-balance connections across all TEs over time. This is something developers must ensure - for example by using connection pools that support connection timeout. Why is this important? Suppose a new TE is added to ease the SQL load on existing TEs. As connections to the other TEs timeout, some of the new connections will be to the new TE. If client connections never timed out, they would continue using the original TEs and never connect to the new TE.

See Connection Brokering for more details.

Database Processes

Because both TEs and SMs share common-code, in particular, data caching, they are both implemented by the same C++ application called nuodb in NUODB_HOME/bin. When nuodb starts, a run-time option tells it to run as either a TE or an SM. The nuodb application cannot be invoked directly. It must be started by the domain.