Automatic Management of NuoDB State

Introduction

In Kubernetes-based deployments, the creation of NuoDB objects in the NuoDB domain state is automated by nuodocker, which is the entrypoint script for Admin, TE, and SM containers. For example, nuodocker start sm is the entrypoint for an SM container, which creates an archive object and a database object (if they do not already exist) before starting an SM process. This is in contrast with bare-metal deployments, where users manage NuoDB domain state by interacting with NuoDB directly, using the nuocmd command-line tool.

In Kubernetes-based deployments, NuoDB objects like NuoDB Admin Processes, archives, databases, and database processes are created automatically by nuodocker, while users typically interact only with Kubernetes. To allow users to manage NuoDB clusters indirectly via Kubernetes, a mapping exists between Kubernetes objects and NuoDB objects:

  • Pods controlled by the Admin StatefulSet (identified by Pod name) are associated with Admin servers (identified by server ID).

  • Pods controlled by the database StatefulSet (identified by Pod name) are associated with SM processes (identified by start ID).

  • Pods controlled by the database Deployment (identified by Pod name) are associated with TE processes (identified by start ID).

  • Persistent Volume Claims (PVCs) bound to SM Pods (identified by claim name) are associated with archives (identified by archive ID).

  • Database StatefulSet and Deployment objects for SM and TE respectively are associated with a database.

To enable observability of NuoDB state via these Kubernetes objects, readiness probes are defined for NuoDB Pods that invoke nuocmd check subcommands. The section below describes how NuoDB objects are automatically updated or deleted in response to Kubernetes events.
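
For example, a readiness probe on the Admin container might be defined as in the sketch below. The nuocmd check server flags and timing values shown here are assumptions for illustration; the probes defined by the NuoDB Helm Charts may use different subcommands and parameters.

    readinessProbe:
      exec:
        command:
          - nuocmd
          - check
          - server
          # flags below are illustrative; verify against your NuoDB version
          - --check-converged
          - --check-active
          - --timeout
          - "10"
      initialDelaySeconds: 10
      periodSeconds: 15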

Kubernetes-Aware Admin (KAA)

The NuoDB Admin layer has Operator-like capabilities that allow it to keep Kubernetes objects in sync with the corresponding NuoDB objects throughout all stages of the object lifecycle, including deletion. This includes the following use cases:

  1. When the Admin StatefulSet is scaled down, the Admin Processes that will no longer be scheduled are excluded from consensus, but not removed from the Raft membership, so that the Admin Process can be included when the Admin StatefulSet is scaled back up.

  2. When the database StatefulSet (SMs) is scaled down, the archive IDs associated with the PVCs that will no longer be scheduled are removed, but not purged, so that they can be reused when the database StatefulSet is scaled back up.

  3. When the PVC associated with an archive is deleted, the archive ID is removed and purged, because the archive can no longer be bound by the StatefulSet to any scheduled SM Pod.

  4. When the container associated with a particular process start ID exits, either because the Pod was deleted or the container exited and was subsequently replaced by a new one within the same Pod, the process is removed from the domain state.

The functionality above is supported by the Kubernetes-Aware Admin (KAA) module, which is enabled by default in Kubernetes-based deployments that follow the conventions described below.

Kubernetes Conventions

In order to enable the resync use cases above, a set of conventions is defined that Kubernetes implementations must follow so that the Admin can unambiguously map Kubernetes objects to NuoDB objects and vice-versa. These conventions are already followed by the NuoDB Helm Charts (ignoring configurations that use DaemonSets for SMs) and are listed below:

  • Admins and SMs are defined by StatefulSets, and TEs are defined by Deployments.

  • The Admin StatefulSet creates Pods with names that are identical to the server IDs.

  • The Admin StatefulSet defines a volumeClaimTemplate with name raftlog, which results in Admin Pods having the volume raftlog used to store Raft data. For example:

    apiVersion: "apps/v1"
    kind: StatefulSet
    metadata:
      annotations:
        description: |-
          NuoAdmin statefulset resource for NuoDB Admin layer.
        ...
      name: {{ template "admin.fullname" . }}
    spec:
      ...
      volumeClaimTemplates:
      - metadata:
          # PVC for Raft data must be named "raftlog"
          name: raftlog
          labels:
            ...
  • Database StatefulSets and Deployments contain the database name in the label database.

  • Database StatefulSets each define a volumeClaimTemplate with name archive-volume, which results in SM Pods having the volume archive-volume used to store archive data. For example:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      labels:
        database: {{ .Values.database.name }}
        ...
      name: sm-{{ template "database.fullname" . }}
    spec:
      ...
      volumeClaimTemplates:
      - metadata:
          # PVC for archive must be named "archive-volume"
          name: archive-volume
          labels:
            database: {{ .Values.database.name }}
            ...
  • A Kubernetes service account, role, and role binding are defined that grant access to various Kubernetes resources, including Pods, StatefulSets, and PVCs (verbs get, list, and watch). The service account credentials must be made available to the Admin and database process Pods. The service account also has privileges to inspect, create, and update Lease objects (verbs get, create, and update), which are used for coordination among Admins when performing resync.
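
For example, a minimal Role granting these permissions might look like the sketch below. The role name is illustrative, and the resource list in the official NuoDB Helm Charts may be broader (Deployments are included here because TEs are defined by Deployments):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: nuodb-kube-inspector   # illustrative name
    rules:
    - apiGroups: [""]
      resources: ["pods", "persistentvolumeclaims"]
      verbs: ["get", "list", "watch"]
    - apiGroups: ["apps"]
      resources: ["statefulsets", "deployments"]
      verbs: ["get", "list", "watch"]
    - apiGroups: ["coordination.k8s.io"]
      resources: ["leases"]
      verbs: ["get", "create", "update"]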

Resync Using Kubernetes State - A Working Example

With explicit mappings between Kubernetes objects and NuoDB objects, it is possible for the Admin to inspect Kubernetes state and to execute Raft commands to cause NuoDB domain state to converge with it, as described above. Each Admin Process receives a stream of Kubernetes state-change events which begins with an image of the state at the time that the Admin begins listening.

For the examples below, assume the following minimally redundant NuoDB deployment, consisting of 2 Admins, 2 SMs, and 2 TEs.

kubectl get pod
NAME                                                  READY   STATUS    RESTARTS   AGE
dom-nuodb-cluster0-admin-0                            1/1     Running   0          42m
dom-nuodb-cluster0-admin-1                            1/1     Running   0          10m
sm-db-nuodb-cluster0-demo-database-0                  1/1     Running   0          84s
sm-db-nuodb-cluster0-demo-database-1                  1/1     Running   0          5m44s
te-db-nuodb-cluster0-demo-database-6d9c946569-ghvgf   1/1     Running   0          4m27s
te-db-nuodb-cluster0-demo-database-6d9c946569-tv2rb   1/1     Running   0          3m3s
kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show domain
server version: 4.1.vee-2-644d1d6206, server license: Enterprise
server time: 2020-08-05T14:52:45.414, client token: a0405af0f77144187d3ded054295abd60bba9bc1
Servers:
  [dom-nuodb-cluster0-admin-0] dom-nuodb-cluster0-admin-0.nuodb.default.svc.cluster.local:48005 [last_ack = 1.64] [member = ADDED] [raft_state = ACTIVE] (LEADER, Leader=dom-nuodb-cluster0-admin-0, log=0/182/182) Connected *
  [dom-nuodb-cluster0-admin-1] dom-nuodb-cluster0-admin-1.nuodb.default.svc.cluster.local:48005 [last_ack = 1.63] [member = ADDED] [raft_state = ACTIVE] (FOLLOWER, Leader=dom-nuodb-cluster0-admin-0, log=0/182/182) Connected
Databases:
  demo [state = RUNNING]
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-tv2rb/172.17.0.11:48006 [start_id = 12] [server_id = dom-nuodb-cluster0-admin-0] [pid = 39] [node_id = 3] [last_ack =  5.00] MONITORED:RUNNING
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-ghvgf/172.17.0.7:48006 [start_id = 13] [server_id = dom-nuodb-cluster0-admin-1] [pid = 39] [node_id = 2] [last_ack =  6.00] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-1/172.17.0.9:48006 [start_id = 15] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 4] [last_ack =  5.00] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-0/172.17.0.8:48006 [start_id = 16] [server_id = dom-nuodb-cluster0-admin-1] [pid = 59] [node_id = 5] [last_ack =  2.90] MONITORED:RUNNING

Use Case 1: Admin Scale-down

To support Admin scale-down, all Admin Processes need access to the Admin StatefulSet (assuming that there is only one Admin StatefulSet per namespace), and each should exclude from consensus all server IDs in the Raft membership with ordinals greater than or equal to the current replica count of the Admin StatefulSet. For example, if the Admin StatefulSet has replicas: 3 and the Raft membership consists of server IDs admin-0, admin-1, admin-2, admin-3, admin-4, then admin-3 and admin-4 should be excluded from consensus. This is done automatically by the Admin Processes by overriding the membership in the Raft membership state machine.

Consider the deployment environment above, which has replicas: 2 for the Admin StatefulSet dom-nuodb-cluster0-admin. If we invoke kubectl scale --replicas=1 on dom-nuodb-cluster0-admin, then the Admin with server ID dom-nuodb-cluster0-admin-1 will no longer be scheduled by Kubernetes. Since a majority of 2 Admins is 2, the remaining Admin alone could not achieve consensus, so dom-nuodb-cluster0-admin-0 automatically excludes dom-nuodb-cluster0-admin-1 from consensus in order to allow configuration changes to be made in its absence.

kubectl scale --replicas=1 statefulset dom-nuodb-cluster0-admin
statefulset.apps/dom-nuodb-cluster0-admin scaled
kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show domain
server version: 4.1.vee-2-644d1d6206, server license: Enterprise
server time: 2020-08-05T15:15:03.729, client token: 38590c7b6cbfaab83be4c1ef2c57eb0d4ce977bd
Servers:
  [dom-nuodb-cluster0-admin-0] dom-nuodb-cluster0-admin-0.nuodb.default.svc.cluster.local:48005 [last_ack = 1.35] [member = ADDED] [raft_state = ACTIVE] (LEADER, Leader=dom-nuodb-cluster0-admin-0, log=0/182/182) Connected *
  [dom-nuodb-cluster0-admin-1] dom-nuodb-cluster0-admin-1.nuodb.default.svc.cluster.local:48005 [last_ack = 7.36] [member = ADDED] [raft_state = ACTIVE] (FOLLOWER, Leader=dom-nuodb-cluster0-admin-0, log=0/182/182) Evicted
Databases:
  demo [state = RUNNING]
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-tv2rb/172.17.0.11:48006 [start_id = 12] [server_id = dom-nuodb-cluster0-admin-0] [pid = 39] [node_id = 3] [last_ack =  3.15] MONITORED:RUNNING
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-ghvgf/172.17.0.7:48006 [start_id = 13] [server_id = dom-nuodb-cluster0-admin-1] [pid = 39] [node_id = 2] [last_ack = 14.17] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-1/172.17.0.9:48006 [start_id = 15] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 4] [last_ack =  3.16] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-0/172.17.0.8:48006 [start_id = 16] [server_id = dom-nuodb-cluster0-admin-1] [pid = 59] [node_id = 5] [last_ack = 10.99] MONITORED:RUNNING

Note that dom-nuodb-cluster0-admin-1 still appears in nuocmd show domain output, as do all database processes that are connected to it (start IDs 13 and 16), but it is shown as Evicted to signal that it is not participating in Raft consensus. To permanently remove the scaled-down Admin from the membership, the user can manually delete its PVC, as follows:

kubectl delete pvc raftlog-dom-nuodb-cluster0-admin-1
persistentvolumeclaim "raftlog-dom-nuodb-cluster0-admin-1" deleted
kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show domain
server version: 4.1.vee-2-644d1d6206, server license: Enterprise
server time: 2020-08-05T15:19:29.365, client token: 497c19844e489c6307a2dd315bc43d3475f02191
Servers:
  [dom-nuodb-cluster0-admin-0] dom-nuodb-cluster0-admin-0.nuodb.default.svc.cluster.local:48005 [last_ack = 0.88] [member = ADDED] [raft_state = ACTIVE] (LEADER, Leader=dom-nuodb-cluster0-admin-0, log=0/183/183) Connected *
Databases:
  demo [state = RUNNING]
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-tv2rb/172.17.0.11:48006 [start_id = 12] [server_id = dom-nuodb-cluster0-admin-0] [pid = 39] [node_id = 3] [last_ack =  8.76] MONITORED:RUNNING
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-ghvgf/172.17.0.7:48006 [start_id = 13] [server_id = dom-nuodb-cluster0-admin-1] [pid = 39] [node_id = 2] [last_ack = >60] MONITORED:UNREACHABLE(RUNNING)
    [SM] sm-db-nuodb-cluster0-demo-database-1/172.17.0.9:48006 [start_id = 15] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 4] [last_ack =  8.76] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-0/172.17.0.8:48006 [start_id = 16] [server_id = dom-nuodb-cluster0-admin-1] [pid = 59] [node_id = 5] [last_ack = >60] MONITORED:UNREACHABLE(RUNNING)

This leaves the database processes that were connected to that Admin in the domain state. They can be manually restarted by deleting the Pods, which will cause the SM and TE to be replaced by ones connected to the remaining Admin:

kubectl delete pod sm-db-nuodb-cluster0-demo-database-0 te-db-nuodb-cluster0-demo-database-6d9c946569-ghvgf

Use Case 2: SM Scale-down

SM scale-down is similar to Admin scale-down, except that the Admin has to map the unscheduled SM Pods to the archive IDs to be removed. The Admin performs the following actions when it detects an SM scale-down event:

  1. Find all of the non-running archive IDs for the database whose StatefulSet was scaled down.

  2. Find the most recent tombstone for each non-running archive ID.

  3. For each such archive ID, if the pod-name NuoDB process label in its most recent tombstone has an ordinal greater than or equal to the current replica count, remove (but do not purge) the archive.

The archive is removed but not purged so that if the database StatefulSet is scaled back up, the archive can be resurrected.
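
This resync action is roughly equivalent to the following manual procedure for each affected archive ID (shown here for archive ID 1, which corresponds to the example that follows); the nuocmd flags are based on the standard nuocmd CLI and should be verified against your NuoDB version:

# inspect the archives for the database and their most recent SM tombstones
nuocmd show archives --db-name demo
# remove, but do not purge, an archive whose pod-name ordinal is greater than or equal to the replica count
nuocmd delete archive --archive-id 1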

Consider the deployment environment above, which has replicas: 2 for the SM StatefulSet sm-db-nuodb-cluster0-demo-database. If we invoke kubectl scale --replicas=1 on sm-db-nuodb-cluster0-demo-database, then the SM sm-db-nuodb-cluster0-demo-database-1 will no longer be scheduled by Kubernetes. Since storage group leader assignment requires that archive histories be collected for all archive objects of the database, the Admin must remove the archive object so that the absence of an SM on that archive does not block database restart.

kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show domain
server version: 4.1.vee-2-644d1d6206, server license: Enterprise
server time: 2020-08-05T15:36:15.209, client token: 3b7260437f92d4867df91ff75abf60e9f3bddd81
Servers:
  [dom-nuodb-cluster0-admin-0] dom-nuodb-cluster0-admin-0.nuodb.default.svc.cluster.local:48005 [last_ack = 0.37] [member = ADDED] [raft_state = ACTIVE] (LEADER, Leader=dom-nuodb-cluster0-admin-0, log=0/207/207) Connected *
Databases:
  demo [state = RUNNING]
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-tv2rb/172.17.0.11:48006 [start_id = 12] [server_id = dom-nuodb-cluster0-admin-0] [pid = 39] [node_id = 3] [last_ack =  4.52] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-1/172.17.0.9:48006 [start_id = 15] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 4] [last_ack =  4.52] MONITORED:RUNNING
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-4js2b/172.17.0.5:48006 [start_id = 17] [server_id = dom-nuodb-cluster0-admin-0] [pid = 41] [node_id = 6] [last_ack =  8.51] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-0/172.17.0.7:48006 [start_id = 18] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 7] [last_ack =  4.42] MONITORED:RUNNING
kubectl scale --replicas=1 statefulset sm-db-nuodb-cluster0-demo-database
statefulset.apps/sm-db-nuodb-cluster0-demo-database scaled
kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show domain
server version: 4.1.vee-2-644d1d6206, server license: Enterprise
server time: 2020-08-05T15:36:39.996, client token: 6fa1dd68425ce936a956f797279f3023034cb112
Servers:
  [dom-nuodb-cluster0-admin-0] dom-nuodb-cluster0-admin-0.nuodb.default.svc.cluster.local:48005 [last_ack = 1.15] [member = ADDED] [raft_state = ACTIVE] (LEADER, Leader=dom-nuodb-cluster0-admin-0, log=0/212/212) Connected *
Databases:
  demo [state = RUNNING]
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-tv2rb/172.17.0.11:48006 [start_id = 12] [server_id = dom-nuodb-cluster0-admin-0] [pid = 39] [node_id = 3] [last_ack =  9.32] MONITORED:RUNNING
    [TE] te-db-nuodb-cluster0-demo-database-6d9c946569-4js2b/172.17.0.5:48006 [start_id = 17] [server_id = dom-nuodb-cluster0-admin-0] [pid = 41] [node_id = 6] [last_ack =  3.98] MONITORED:RUNNING
    [SM] sm-db-nuodb-cluster0-demo-database-0/172.17.0.7:48006 [start_id = 18] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 7] [last_ack =  9.22] MONITORED:RUNNING
kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show archives
[0] <NO VALUE> : /var/opt/nuodb/archive/nuodb/demo @ demo [journal_path = ] [snapshot_archive_path = ] RUNNING
  [SM] sm-db-nuodb-cluster0-demo-database-0/172.17.0.7:48006 [start_id = 18] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 7] [last_ack =  8.10] MONITORED:RUNNING
kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show archives --removed
[1] <NO VALUE> : /var/opt/nuodb/archive/nuodb/demo @ demo [journal_path = ] [snapshot_archive_path = ] REMOVED(NOT_RUNNING)
  [SM] sm-db-nuodb-cluster0-demo-database-1/172.17.0.9:48006 [start_id = 15] [server_id = dom-nuodb-cluster0-admin-0] [pid = 59] [node_id = 4] EXITED(REQUESTED_SHUTDOWN:SHUTTING_DOWN):(2020-08-05T15:36:32.197+0000) Gracefully shutdown engine (?)

Note that the database object is in RUNNING state even though there is only one SM process (nuocmd show domain), and that there is one active archive object for the database (nuocmd show archives). The archive object still exists in the domain state as a removed archive (nuocmd show archives --removed), so that if the SM StatefulSet is scaled back up, the archive will be bound to and restarted by the next instance of sm-db-nuodb-cluster0-demo-database-1 that is scheduled by Kubernetes.

Use Case 3: PVC Deletion

If a PVC is explicitly deleted for a Pod controlled by a StatefulSet, then a new PVC will be provisioned by Kubernetes for it the next time it is scheduled. In this case, the Admin will automatically remove and purge the associated archive ID, since an SM will never be started on it.

Continuing the example above, archive ID 1 is associated with PVC archive-volume-sm-db-nuodb-cluster0-demo-database-1, which still exists even though the SM StatefulSet has been scaled down. The archive can be purged from the domain state by deleting that PVC:

kubectl delete pvc archive-volume-sm-db-nuodb-cluster0-demo-database-1
persistentvolumeclaim "archive-volume-sm-db-nuodb-cluster0-demo-database-1" deleted
kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show archives --removed
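
The resync action performed by KAA in this case is roughly equivalent to the following manual command, which removes the archive and purges it from the domain state (the flag is based on the standard nuocmd CLI and should be verified against your NuoDB version):

nuocmd delete archive --archive-id 1 --purge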

Use Case 4: Pod Deletion or Container Exit

If a Pod is deleted or a database container exits, the Admin should remove any process object generated by it. Normally, the Admin Process connected to a database process will detect that it has exited, either because a TCP_RST is generated by the socket connection with the database process, or because the timeout specified by the processLivenessCheckSec property in nuoadmin.conf has elapsed since it last received a message from the database process. If the connected Admin is not running, as was the case in the Use Case 1: Admin Scale-down example when the Pods for the orphaned database processes were deleted, then the command to remove the process object from the domain state is executed as a result of the deletion or state-change event on the Pod.
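
As a sketch of how this timeout might be tuned when using the NuoDB Helm Charts, the fragment below assumes that the admin.options Helm value is passed through to nuoadmin.conf; the value shown is illustrative and the exact mechanism should be verified against your chart version:

    admin:
      options:
        # timeout, in seconds, after which an unresponsive database process is considered exited
        processLivenessCheckSec: "30"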

Use Case 5: Automatic Database Protocol Upgrade

A new version of the NuoDB database software may also introduce a new version of the database protocol. NuoDB supports explicit database protocol version upgrade, which is performed after upgrading the NuoDB image for all database processes. To simplify NuoDB version rollout, the Kubernetes-Aware Admin (KAA) automatically upgrades the database protocol and restarts a Transaction Engine (TE) as an upgrade finalization step.

For more information on how to perform the steps manually, see Upgrade the Database Protocol.

The database protocol version is eligible for an upgrade to the maximum available version if all conditions below are met:

  • the database is in the RUNNING state

  • all database processes are MONITORED:RUNNING

  • the automatic upgrade is enabled for this database (disabled by default)

  • all database processes are running with the same release binary version

  • there are protocol versions available to upgrade to

Automatic database protocol upgrade is enabled at the database level by setting the nuodb.com/automatic-database-protocol-upgrade annotation to true. The annotation should be added to the TE Deployment on the entry point cluster only. This configuration is exposed by the NuoDB Helm Charts using the database.automaticProtocolUpgrade.enabled Helm option.
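
For example, the annotation can be set on the TE Deployment as in the sketch below (the Deployment name is taken from the example deployment above; when using the Helm Charts, setting database.automaticProtocolUpgrade.enabled=true produces an equivalent result):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: te-db-nuodb-cluster0-demo-database
      annotations:
        # enables KAA automatic database protocol upgrade for this database
        nuodb.com/automatic-database-protocol-upgrade: "true"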

Consider, for example, a NuoDB minor release upgrade from 4.1.3 to 4.2.3 which requires a database protocol upgrade. The KAA automatic database protocol upgrade will be used so that the NuoDB upgrade is finalized automatically.

The effective database protocol version is checked by running the nuocmd show database-versions command, which combines information about the database's effective version, the available protocol versions, and the release versions of the database processes.

kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show database-versions --db-name demo
effective version ID: 1376256, effective version: 4.1|4.1.1|4.1.2|4.1.3, max version ID: 1376256
Available versions:
Process versions:
  version ID: 1376256, version: 4.1|4.1.1|4.1.2|4.1.3, release: 4.1.3.rel413-2-68cf9daff3
    [SM] sm-database-nuodb-cluster0-demo-0/172.17.0.16:48006 [start_id = 0] [server_id = dom-nuodb-cluster0-admin-0] [pid = 112] [node_id = 1] [last_ack =  6.85] MONITORED:RUNNING
    [SM] sm-database-nuodb-cluster0-demo-1/172.17.0.17:48006 [start_id = 1] [server_id = dom-nuodb-cluster0-admin-0] [pid = 112] [node_id = 2] [last_ack =  3.82] MONITORED:RUNNING
    [TE] te-database-nuodb-cluster0-demo-64d58b4955-9hzwr/172.17.0.15:48006 [start_id = 2] [server_id = dom-nuodb-cluster0-admin-0] [pid = 44] [node_id = 3] [last_ack =  7.95] MONITORED:RUNNING
    [TE] te-database-nuodb-cluster0-demo-64d58b4955-5gp7d/172.17.0.14:48006 [start_id = 3] [server_id = dom-nuodb-cluster0-admin-0] [pid = 43] [node_id = 4] [last_ack =  6.44] MONITORED:RUNNING

When using the NuoDB Helm Charts, the NuoDB release is upgraded by running the helm upgrade command and setting the new NuoDB image, as sketched below.
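
The commands below are a sketch of such an upgrade; the release names, chart names, and image value paths are illustrative and should be checked against the values.yaml of your chart version:

# upgrade the Admin release to the new NuoDB image
helm upgrade admin nuodb/admin \
  --reuse-values \
  --set nuodb.image.tag=4.2.3
# upgrade the database release to the same image
helm upgrade database nuodb/database \
  --reuse-values \
  --set nuodb.image.registry=docker.io \
  --set nuodb.image.repository=nuodb/nuodb-ce \
  --set nuodb.image.tag=4.2.3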

After the database protocol version has been upgraded, the NuoDB Archive cannot be used with NuoDB software versions that only support the previous database protocol version. As a result, downgrading after the database protocol version has been changed requires restoring a backup of the database. Make sure that a recent and valid database backup is available before proceeding with the NuoDB upgrade.

The database versions shown below reflect the state after all NuoDB Admin and database Pods have been upgraded to the new NuoDB image, but before the database protocol has been upgraded.

kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show database-versions --db-name demo
effective version ID: 1376256, effective version: 4.1|4.1.1|4.1.2|4.1.3, max version ID: 1441792
Available versions:
  version ID: 1441792, version: 4.2|4.2.1|4.2.2|4.2.3
Process versions:
  version ID: 1441792, version: 4.2|4.2.1|4.2.2|4.2.3, release: 4.2.3.rel42dev-349-fa78133f5a
    [SM] sm-database-nuodb-cluster0-demo-1/172.17.0.17:48006 [start_id = 5] [server_id = dom-nuodb-cluster0-admin-0] [pid = 95] [node_id = 6] [last_ack =  7.88] MONITORED:RUNNING
    [SM] sm-database-nuodb-cluster0-demo-0/172.17.0.16:48006 [start_id = 6] [server_id = dom-nuodb-cluster0-admin-1] [pid = 96] [node_id = 7] [last_ack =  9.70] MONITORED:RUNNING
    [TE] te-database-nuodb-cluster0-demo-5d5f5f6bf9-l5fwh/172.17.0.19:48006 [start_id = 7] [server_id = dom-nuodb-cluster0-admin-1] [pid = 43] [node_id = 8] [last_ack =  2.29] MONITORED:RUNNING
    [TE] te-database-nuodb-cluster0-demo-5d5f5f6bf9-nrgvp/172.17.0.18:48006 [start_id = 8] [server_id = dom-nuodb-cluster0-admin-0] [pid = 44] [node_id = 9] [last_ack =  1.50] MONITORED:RUNNING

All database processes are running the NuoDB 4.2.3 release, and one newer protocol version (version ID 1441792) is available.

If all the conditions for automatic database protocol upgrade are satisfied, the KAA will automatically upgrade the protocol to the maximum available protocol version and restart a TE database process.

The database versions after KAA performs the upgrade finalization phase are shown below.

kubectl exec dom-nuodb-cluster0-admin-0 -- nuocmd show database-versions --db-name demo
effective version ID: 1441792, effective version: 4.2|4.2.1|4.2.2|4.2.3, max version ID: 1441792
Available versions:
Process versions:
  version ID: 1441792, version: 4.2|4.2.1|4.2.2|4.2.3, release: 4.2.3.rel42dev-349-fa78133f5a
    [SM] sm-database-nuodb-cluster0-demo-1/172.17.0.17:48006 [start_id = 5] [server_id = dom-nuodb-cluster0-admin-0] [pid = 95] [node_id = 6] [last_ack =  4.22] MONITORED:RUNNING
    [SM] sm-database-nuodb-cluster0-demo-0/172.17.0.16:48006 [start_id = 6] [server_id = dom-nuodb-cluster0-admin-1] [pid = 96] [node_id = 7] [last_ack =  6.04] MONITORED:RUNNING
    [TE] te-database-nuodb-cluster0-demo-5d5f5f6bf9-l5fwh/172.17.0.19:48006 [start_id = 7] [server_id = dom-nuodb-cluster0-admin-1] [pid = 43] [node_id = 8] [last_ack =  8.63] MONITORED:RUNNING
    [TE] te-database-nuodb-cluster0-demo-5d5f5f6bf9-nrgvp/172.17.0.18:48006 [start_id = 9] [server_id = dom-nuodb-cluster0-admin-0] [pid = 45] [node_id = 10] [last_ack =  0.95] MONITORED:RUNNING

The database effective version is upgraded to 4.2|4.2.1|4.2.2|4.2.3 (version ID 1441792), the TE database process with start ID 8 has been shut down, and Kubernetes has started a new one with start ID 9. KAA resync actions are shown in the logs of the NuoDB Admin Process (AP) that performs the resync.

kubectl logs dom-nuodb-cluster0-admin-1 | grep KubernetesResourceInspector
2021-10-01T08:22:58.877+0000 INFO  [dom-nuodb-cluster0-admin-1:main] KubernetesResourceInspector Scheduling domain resync with initial delay 3s every 10s...
2021-10-01T08:32:13.346+0000 INFO  [dom-nuodb-cluster0-admin-1:kubeInspector.adminResyncExecutor5-1] KubernetesResourceInspector Upgrading protocol version for database dbName=demo, availableVersions=[DatabaseVersion{name=4.2|4.2.1|4.2.2|4.2.3, versionId=1441792}]
2021-10-01T08:32:13.394+0000 INFO  [dom-nuodb-cluster0-admin-1:kubeInspector.adminResyncExecutor5-1] KubernetesResourceInspector Shutting down startId=8 to finalize the database protocol upgrade

The TE database process selected to be shut down can be configured at the database level by setting the nuodb.com/automatic-database-protocol-upgrade.te-preference-query annotation. It should be added to the TE Deployment on the entry point cluster only. This configuration is exposed by the NuoDB Helm Charts using the database.automaticProtocolUpgrade.tePreferenceQuery Helm option. Its value should be a valid load balancer query expression, which is used as a user preference when selecting the TE to shut down. If the annotation is not configured, a random TE in MONITORED state is selected. If no MONITORED TE database process is found, the automatic database protocol upgrade is aborted.
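
As a sketch, the preference query can be set alongside the enablement annotation on the TE Deployment; the Deployment name is taken from the example above and the query expression is illustrative:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: te-db-nuodb-cluster0-demo-database
      annotations:
        nuodb.com/automatic-database-protocol-upgrade: "true"
        # prefer shutting down a TE with a specific zone label (expression is illustrative)
        nuodb.com/automatic-database-protocol-upgrade.te-preference-query: "random(label(zone DR))"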

Use Case 6: Load Balancer configuration

Load Balancer Policies can be configured in a declarative way via Kubernetes annotations. All load balancer annotations are configured on the entry point cluster only.

There are different types of NuoDB Admin load balancer configuration. For each one, the corresponding Kubernetes annotation key, the Kubernetes controller it is set on, the equivalent Helm option, and an example value are listed below.

  • Global Default Policy
    Annotation Key: nuodb.com/load-balancer-default
    Kubernetes Controller: NuoDB Admin StatefulSet
    Helm Option: admin.lbConfig.default
    Example Value: random(first(label(zone ${ZONE_NAME:-}) any))

  • Global Default Pre-filter
    Annotation Key: nuodb.com/load-balancer-prefilter
    Kubernetes Controller: NuoDB Admin StatefulSet
    Helm Option: admin.lbConfig.prefilter
    Example Value: not(label(region tiebreaker))

  • Named Load Balancer Policies
    Annotation Key: nuodb.com/load-balancer-policy.<name>
    Kubernetes Controller: NuoDB Admin StatefulSet
    Helm Option: admin.lbConfig.policies.<name>
    Example Value: random(first(label(pod ${pod:-}) label(node ${node:-}) label(zone ${zone:-}) any))

  • Database Default Policy
    Annotation Key: nuodb.com/load-balancer-default
    Kubernetes Controller: Database TE Deployment
    Helm Option: database.lbConfig.default
    Example Value: random(first(label(zone ${ZONE_NAME:-}) any))

  • Database Default Pre-filter
    Annotation Key: nuodb.com/load-balancer-prefilter
    Kubernetes Controller: Database TE Deployment
    Helm Option: database.lbConfig.prefilter
    Example Value: not(label(zone DR))

For more information and examples, see Load Balancer Policies.

The Kubernetes-Aware Admin (KAA) automatically resyncs the load balancer configuration from the Kubernetes state. The NuoDB Admin load balancer can still be configured manually by using the nuocmd set load-balancer and nuocmd set load-balancer-config commands. Manually configured policies are replaced if they are also configured with annotations. By default, any policies that are not configured using annotations are preserved. This behavior can be overridden by setting the nuodb.com/sync-load-balancer-config annotation on the NuoDB Admin StatefulSet to true. The setting is exposed by the NuoDB Helm Charts using the admin.lbConfig.fullSync Helm option.
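
As a sketch, the Helm options listed above can be combined in a values.yaml fragment such as the following; the example expressions are taken from the list above, and the policy name nearest is illustrative:

    admin:
      lbConfig:
        fullSync: false
        prefilter: "not(label(region tiebreaker))"
        default: "random(first(label(zone ${ZONE_NAME:-}) any))"
        policies:
          nearest: "random(first(label(pod ${pod:-}) label(node ${node:-}) label(zone ${zone:-}) any))"
    database:
      lbConfig:
        prefilter: "not(label(zone DR))"
        default: "random(first(label(zone ${ZONE_NAME:-}) any))"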

Scheduling Resync Actions

The actions for use cases 2, 3, 4, 5 and 6 should only be executed by a specific Admin Process to avoid executing redundant Raft commands. NuoDB achieves this by using Kubernetes Lease objects to designate a Resync Leader, which is the only Admin Process in the Kubernetes cluster that can perform resync actions until the lease expires.
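
The Lease objects used for this coordination live in the coordination.k8s.io API group and can be inspected with kubectl; the lease name used by the Admin is deployment-specific and is not shown here:

# list the Lease objects in the current namespace
kubectl get leases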

Multi-cluster Support

In a multi-cluster Kubernetes deployment, NuoDB processes are scheduled across separate Kubernetes clusters. Since different Kubernetes clusters generate disjoint events, resync actions are performed by an Admin Process in each cluster. The use of Kubernetes Lease objects allows an Admin in each Kubernetes cluster to act as Resync Leader, allowing NuoDB state to converge with multi-cluster Kubernetes state for use cases 2, 3, 4, 5 and 6. Use case 1, which requires all running Admins to determine the complete set of peers that are not being scheduled due to StatefulSet scale down, is not supported in multi-cluster Kubernetes deployments.