Admin Platform

A NuoDB Domain is defined by a set of coordinating APs peered together.

Each AP uses a local file (its raftlog) to hold all the information about its domain, both its expected and actual state. The APs are completely redundant, any one of them has complete up-to-date knowledge of the domain and all databases defined within it, running or otherwise.

Ensuring Majority

Changes to the domain are only possible whilst a majority (over 50%) of all known APs are running at any given time. If the number of APs falls below that number the domain goes into read-only mode. Databases will keep running but no changes to the domain or database processes are possible and no new connections will be given to clients. This mechanism comes into play if network connectivity fails between some APs or if the AP processes themselves fail or are taken down.

Once sufficient APs are running again, majority can be restored. Majority decisions are handled using the RAFT algorithm, hence the name of the AP’s local file.

Domain Management

Once the APs are running and peering successfully, the domain can be managed using the nuocmd utility. This is the only way to create archives and databases and to start and stop database processes.

A NuoDB database is not just files stored on disk. It is also the engine processes (TEs and SMs) that make it work.

In a bare-metal or VM-based deployment, each TE or SM is started by an AP process. The APs decide amongst themselves which AP is responsible for which engine process(es).

Under Kubernetes, the engines are started inside their own pod by Kubernetes itself. Once the engine has started, it is handed off to an AP to manage it from then on. A dedicated script, nuodocker, is installed in our container for this purpose.

Client Connection Load Balancing

When a client requests a connection, the AP it is talking to will decide the best TE for the client to connect to. By default connections are allocated randomly across all available TEs for a given database.

However the APs load-balancing capability can force connections to, or away from, a specific TE or subset of TEs. Hence, if certain connections can be restricted to querying only a subset of tables, connections can also be restricted to a subset of TEs. Only those TEs will have those tables cached in their local memory. In this way the cache requirements of different TEs can be managed by restricting the data they will cache. Alternatively, different TEs can be restricted to specific types of queries, for example OLTP or OLAP.

See the dedicated section on load-balancing for full details.