Handling Unreachable Processes

An unreachable NuoDB Admin Process (AP) continues to be recorded in the durable domain configuration. When NuoDB Command’s show domain command indicates that an AP is unreachable, any database processes that communicate with that AP are also shown as unreachable.

The show domain command is issued using NuoDB Command (nuocmd). For more information on NuoDB Command and other command line tools, see Command Line Tools.

Although these database processes are shown as unreachable, they might still be running even though they have no AP to communicate with. It is safe for these processes to continue running, but it is not safe to start another process for the same database. You can start a database process only when the output of the show domain command indicates that all existing processes that serve that database are running. If you try to start a database process while any existing process that serves that database is not shown as running, NuoDB prevents creation of the new process to avoid running the database in split-brain mode.
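The start rule above can be expressed as a small check. The following Python sketch is illustrative only: the DbProcess record, the status labels, and the can_start_process function are hypothetical names invented for this example. They are not part of NuoDB or nuocmd, and the labels are not the strings that show domain prints.

```python
from dataclasses import dataclass

# Hypothetical status labels for this sketch; not actual nuocmd output strings.
RUNNING = "running"
UNREACHABLE = "unreachable"


@dataclass(frozen=True)
class DbProcess:
    database: str   # name of the database the process serves
    server_id: str  # identifier of the AP that manages the process
    status: str     # RUNNING or UNREACHABLE, as read from the domain state


def can_start_process(processes, database):
    """Return True only if every known process for `database` is reported as running."""
    serving = [p for p in processes if p.database == database]
    # Note: in this sketch, a database with no recorded processes passes trivially.
    return all(p.status == RUNNING for p in serving)


# Example domain state (made-up names): one process for "orders" is unreachable.
domain = [
    DbProcess("orders", "ap-1", RUNNING),
    DbProcess("orders", "ap-2", UNREACHABLE),
    DbProcess("billing", "ap-1", RUNNING),
]
```

With this sample data, can_start_process(domain, "orders") is False because one process that serves "orders" is unreachable, while can_start_process(domain, "billing") is True.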

As soon as you determine that there are database processes that have no AP to communicate with, try to restart the AP on that host. If you can restart the AP, it automatically resumes communicating with the formerly unreachable database processes. Invoking show domain should then show that those processes are running.

If you cannot restart the AP, do not kill the database processes solely because the local AP is down. The processes can continue to run, and when an AP becomes available on the host, that AP connects to those processes. However, if the processes are not running, you need to remove the server associated with the unreachable processes from the durable domain configuration by removing that server from the domain. See Removing an Unreachable Admin Server from the Durable Domain.
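The recovery guidance in this topic can be summarized as a decision table. The Python sketch below is only a reading aid: the next_step function and the strings it returns are hypothetical and do not correspond to any NuoDB command or API.

```python
def next_step(ap_restarted, processes_running):
    """Map the situation described in this topic to a recommended operator action.

    ap_restarted:      True if the AP on the affected host was restarted successfully.
    processes_running: True if the unreachable database processes are still running.
    """
    if ap_restarted:
        # The AP reconnects to its processes on its own; confirm the result.
        return "verify with show domain"
    if processes_running:
        # Do not kill processes just because the local AP is down.
        return "leave processes running"
    # The AP cannot be restarted and the processes are gone.
    return "remove server from domain"
```

For example, next_step(False, True) returns "leave processes running", which reflects the advice not to kill running database processes only because their AP is down.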