Monitor

This is a NuoDB Manager command. See NuoDB Manager. See also Monitor Alarms and Monitor Hosts.

Description

Subscribes to statistical metrics.

In NuoDB's monitoring framework, each broker tracks all the statistics for its host, and NuoDB Manager reports these statistics through its single connection to that broker. The following statistic sets are available using the monitor command:

Database process statistics
Operating system metrics
Database aggregate statistics
Alarms

The monitoring framework is disabled by default and can be enabled by setting the NuoDB Manager monitorDomainNew property to true.

Note: By default, the monitor command reports only database process statistics.

NuoDB Manager subscribes to metrics from the command line when the monitor command is passed to the --command option. Monitors are not available when running NuoDB Manager interactively. The first set of statistics from the monitor command includes all available metrics. Each subsequent set includes only those metrics that have changed since the metrics were last published. Typically, the metrics are published every ten seconds. The exact interval is published as the Milliseconds metric.
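Because later statistic sets carry only changed metrics, a consumer that wants a complete view has to keep the earlier values itself. The following is a minimal sketch of that bookkeeping; the function name apply_update and the dictionary representation are assumptions made for illustration, and the metric values are taken from the sample output on this page.

```python
def apply_update(snapshot, update):
    """Merge one published statistic set into the running snapshot.

    The first set carries every metric and later sets carry only the
    changed ones, so overwriting key by key keeps the snapshot complete.
    """
    merged = dict(snapshot)
    merged.update(update)
    return merged


# First set: all metrics. Later set: only what changed (hypothetical value).
first = {"HeapAllocated": "15395448", "Milliseconds": "10000", "UserMilliseconds": "16"}
later = {"UserMilliseconds": "20"}

state = apply_update({}, first)
state = apply_update(state, later)

# The publishing interval is itself a metric, reported in milliseconds.
interval_seconds = int(state["Milliseconds"]) / 1000
print(state["UserMilliseconds"], interval_seconds)
```

Unchanged metrics such as HeapAllocated stay in the snapshot even though the second set does not mention them.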

Database Process Statistics

The commands monitor domain, monitor database, and monitor process subscribe to the stream of NuoDB metrics (see Metrics Published by Database Processes) published by the database processes (TE, SM, or SSM).

With monitorDomainNew equal to false (default)

Each database process statistic set is preceded by two process status lines. The first line is of the format:

date time [nodetype] hostname/address:broker_port (region) [ pid = nnn ] [ db = name ] [ nodeId = n ] process_status

For example:

Dec 12, 2014 5:03:10 PM [SM] ip-172-31-40-24/107.23.52.10:48006 (us-east-1) [ pid = 32594 ] [ db = test2 ] [ nodeId = 1 ] RUNNING
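A script that consumes this output usually needs the fields of the first status line as separate values. The following is a sketch of one way to split it with a regular expression. The pattern, the field names, and the timestamp layout are assumptions inferred from the format and the sample line above, not a specification published by NuoDB.

```python
import re
from datetime import datetime

STATUS_RE = re.compile(
    r"^(?P<timestamp>.+?) \[(?P<nodetype>\w+)\] "
    r"(?P<hostname>[^/\s]+)/(?P<address>[^:\s]+):(?P<port>\d+) "
    r"\((?P<region>[^)]*)\) "
    r"\[ pid = (?P<pid>\d+) \] "
    r"\[ db = (?P<db>.+?) \] "
    r"\[ nodeId = (?P<node_id>\d+) \] "
    r"(?P<status>\S+)\s*$"
)


def parse_status_line(line):
    """Return the fields of a process status line, or None if it does not match."""
    match = STATUS_RE.match(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("port", "pid", "node_id"):
        info[key] = int(info[key])
    # Assumed timestamp layout, inferred from the sample line only.
    info["timestamp"] = datetime.strptime(info["timestamp"], "%b %d, %Y %I:%M:%S %p")
    return info


sample = ("Dec 12, 2014 5:03:10 PM [SM] ip-172-31-40-24/107.23.52.10:48006 "
          "(us-east-1) [ pid = 32594 ] [ db = test2 ] [ nodeId = 1 ] RUNNING")
info = parse_status_line(sample)
print(info["nodetype"], info["db"], info["pid"], info["status"])
```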

The second process status line is the usage summary line.

Sample usage summary line from a TE:

Idle 8% + CPU 15% Runnable 0% Sync 25% Lock 0% Fetch 1% Commit 1% Throttle 57% Network Send 2% 

Sample usage summary line from an SM or SSM:

Idle 0% + CPU 0% Runnable 0% Sync 0% Fetch 0% Throttle 0% Network Send 0% Archive Read 0% Write 0% Journal 0%

This usage summary line breaks down time spent by a database process (TE, SM, or SSM). Idle is the percentage of time that the process had no work to do. The remaining figures break down the time when the process was not idle. CPU is the percentage of this time that was spent running code, in user or kernel mode. Runnable is an estimate of how much CPU would increase if the hardware had more cores. The remaining figures have some unknown amount of overlap with CPU because of the way the accounting is done. Percentages are rounded to the nearest integer.

Output | Description | Reported by
Sync | Time one thread was waiting for another thread to release exclusive access to an object. | TE/SM/SSM
Lock | Time a client was waiting for another client to finish a transaction that locked an SQL record. | TE
Fetch | Time spent waiting to fetch an atom from another node or from disk. An atom is the internal object structure representing all data in a NuoDB database; atoms are self-coordinating objects that represent specific types of information (such as data, indexes, or schemas). | TE/SM/SSM
Commit | Time a client was waiting in remote commit, waiting for a transaction to be made durable on disk. | TE
Throttle | Time processing was paused due to overconsumption of a resource such as memory or I/O bandwidth. This percentage includes the time provided by the WriteThrottleTime metric. See Metrics Published by Database Processes. | TE/SM/SSM
Network Send | Time spent sending messages over the network. | TE/SM/SSM
Archive Read | Time spent reading atoms from the archive on disk. | SM/SSM
Write | Time spent writing atoms to the archive on disk. | SM/SSM
Journal | Time spent writing messages to the journal for crash recovery. | SM/SSM
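The usage summary line can be turned into a mapping from label to percentage. The sketch below assumes that every figure has the form "label NN%" and that labels contain only letters and spaces, which holds for the two sample lines above; the function name is hypothetical.

```python
import re

# A label (letters and spaces), whitespace, then an integer percentage.
USAGE_RE = re.compile(r"([A-Za-z][A-Za-z ]*?)\s+(\d+)%")


def parse_usage_summary(line):
    """Map each label on a usage summary line to its integer percentage."""
    return {label: int(pct) for label, pct in USAGE_RE.findall(line)}


te_line = ("Idle 8% + CPU 15% Runnable 0% Sync 25% Lock 0% Fetch 1% "
           "Commit 1% Throttle 57% Network Send 2% ")
usage = parse_usage_summary(te_line)
print(usage["Idle"], usage["Throttle"], usage["Network Send"])
```

The figures after Idle describe non-idle time and may overlap with CPU, so they should not be expected to sum to 100. On an SM or SSM line the archive write figure appears under the key Write, because that is the label printed in the output.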

With monitorDomainNew equal to true

Each database process statistic set is preceded by one process status line of the format:

date time dbname:process_pid [ hostname/address:broker_port ]

For example:

Nov 3, 2015 1:14:22 PM test:8339 [ ip-172-31-3-189/52.10.143.12:48004 ]
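The one-line status format can be split in a similar way. This sketch assumes 12-hour timestamps ending in AM or PM, as in the sample line above; the pattern and field names are illustrative assumptions rather than a published grammar.

```python
import re

NEW_STATUS_RE = re.compile(
    r"^(?P<timestamp>.+ [AP]M) "
    r"(?P<db>[^:\s]+):(?P<pid>\d+) "
    r"\[ (?P<hostname>[^/\s]+)/(?P<address>[^:\s]+):(?P<port>\d+) \]\s*$"
)


def parse_new_status_line(line):
    """Return the fields of a one-line process status header, or None."""
    match = NEW_STATUS_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["port"] = int(fields["port"])
    return fields


sample = "Nov 3, 2015 1:14:22 PM test:8339 [ ip-172-31-3-189/52.10.143.12:48004 ]"
fields = parse_new_status_line(sample)
print(fields["db"], fields["pid"], fields["address"], fields["port"])
```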

With monitorDomainNew equal to true, you can disable reporting of database process statistics by setting the property monitorDatabaseProcessStats to false.

Operating System Metrics

Operating system (OS) metrics are available only if the property monitorDomainNew is set to true (the default is false). OS metrics are gathered by the broker for its host. All OS metrics are prefixed with OS- so that the database aggregate statistics set can distinguish OS metrics from NuoDB database process metrics. OS metric statistics sets are preceded by a line with the format:

date time (OS) [ hostname/address:broker_port ]:

For example:

Nov 3, 2015 1:00:02 PM (OS) [ ip-172-31-1-136/172.31.1.136:48004 ]: 

With monitorDomainNew equal to true, you can disable reporting of OS metrics by setting the property monitorHostStats to false.
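Since OS metrics are identified only by their OS- prefix, a consumer can separate them from database process metrics with a simple prefix test. A minimal sketch follows; the function name and the choice to strip the prefix from the returned names are assumptions, and the metric names come from the sample output on this page.

```python
def split_os_metrics(metrics, prefix="OS-"):
    """Separate OS metrics (prefix removed) from database process metrics."""
    os_metrics = {}
    process_metrics = {}
    for name, value in metrics.items():
        if name.startswith(prefix):
            os_metrics[name[len(prefix):]] = value
        else:
            process_metrics[name] = value
    return os_metrics, process_metrics


mixed = {
    "OS-cpuTotalTimePercent": 0.0,
    "OS-memUsedPercent": 5.41,
    "BytesSent": 68.59,
}
os_metrics, process_metrics = split_os_metrics(mixed)
print(sorted(os_metrics), sorted(process_metrics))
```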

Database Aggregate Statistics

Database aggregate statistics are available only if the property monitorDomainNew is set to true (the default is false). Database aggregate statistics include both database process metrics and OS metrics. All OS metrics are prefixed with OS- to distinguish them from NuoDB database process metrics. Database aggregate statistics sets are preceded by a line with the format:

date time (aggregate) [ dbname ]:

For example:

Nov 3, 2015 1:00:02 PM (aggregate) [ test2 ]: 

With monitorDomainNew equal to true, you can disable reporting of database aggregate statistics by setting the property monitorDomainAggregates to false.
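The OS and aggregate header lines share a shape: a timestamp, a kind in parentheses, and a bracketed subject followed by a colon. The sketch below classifies both with one pattern. The pattern and the field names kind and subject are assumptions based on the two formats and sample lines on this page.

```python
import re

SET_HEADER_RE = re.compile(
    r"^(?P<timestamp>.+ [AP]M) \((?P<kind>OS|aggregate)\) \[ (?P<subject>.+?) \]:\s*$"
)


def parse_set_header(line):
    """Return (kind, subject) for an OS or aggregate header line, else None."""
    match = SET_HEADER_RE.match(line)
    if match is None:
        return None
    return match.group("kind"), match.group("subject")


aggregate_header = parse_set_header("Nov 3, 2015 1:00:02 PM (aggregate) [ test2 ]: ")
os_header = parse_set_header(
    "Nov 3, 2015 1:00:02 PM (OS) [ ip-172-31-1-136/172.31.1.136:48004 ]: ")
print(aggregate_header, os_header)
```

For an aggregate header the subject is the database name; for an OS header it is the hostname/address:broker_port of the reporting broker.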

Alarms

Reporting of alarms triggered by the database is available only if the property monitorDomainNew is set to true (the default is false). If the property alarmPretty is true, alarms are reported as in the following example:

Nov 4, 2015 10:09:45 AM NodeLeft Alarm [domainleft]:
  Details: Node SM db=[test] pid=676 id=5 req=SMs (local)
  Alarm Definition [domainleft]: type=NodeLeft dimension=Domain entity=* (Warning)

Alarms are reported as in the following example if the property alarmPretty is false:

Alarm id=43f44ec2-bc6e-472c-87ef-13a8cf3bd6ef entity=[domain] def=[Alarm Definition [domainleft]: type=NodeLeft dimension=Domain entity=* (Warning)] ud=[Node SM db=[test] pid=742 id=7 req=SMs (local)]	
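The single-line alarm format nests square brackets inside bracketed values, so splitting on spaces or on the first closing bracket gives wrong results. The following sketch walks the line and balances brackets to recover the top-level fields. It is based only on the sample line above; the function name and the assumption that every field is key=value or key=[value] are not taken from NuoDB documentation.

```python
def parse_compact_alarm(line):
    """Split a single-line alarm report into its top-level key=value fields."""
    if not line.startswith("Alarm "):
        raise ValueError("not a single-line alarm report")
    fields = {}
    pos = len("Alarm ")
    while pos < len(line):
        eq = line.find("=", pos)
        if eq == -1:
            break
        key = line[pos:eq].strip()
        if line.startswith("[", eq + 1):
            # Bracketed value: walk forward until the brackets balance.
            depth, end = 0, eq + 1
            while end < len(line):
                if line[end] == "[":
                    depth += 1
                elif line[end] == "]":
                    depth -= 1
                    if depth == 0:
                        break
                end += 1
            fields[key] = line[eq + 2:end]
        else:
            # Bare value: runs to the next space or the end of the line.
            end = line.find(" ", eq + 1)
            if end == -1:
                end = len(line)
            fields[key] = line[eq + 1:end]
        pos = end + 1
    return fields


sample = ("Alarm id=43f44ec2-bc6e-472c-87ef-13a8cf3bd6ef entity=[domain] "
          "def=[Alarm Definition [domainleft]: type=NodeLeft dimension=Domain "
          "entity=* (Warning)] ud=[Node SM db=[test] pid=742 id=7 req=SMs (local)]")
alarm = parse_compact_alarm(sample)
print(alarm["id"], alarm["entity"])
print(alarm["ud"])
```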

With monitorDomainNew equal to true, you can disable reporting of triggered alarms by setting the property monitorDomainAlarm to false.

Syntax

monitor domain
monitor database database_name
monitor process host host_name pid process_id
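When monitors are started from a script, the three syntax forms can be generated by a small helper. This is a sketch: the helper names are hypothetical, and the nuodbmgr options used (--broker, --password, --command) are the ones shown in the scripting examples on this page.

```python
def monitor_command(scope, database=None, host=None, pid=None):
    """Build the monitor command text for one of the three documented forms."""
    if scope == "domain":
        return "monitor domain"
    if scope == "database":
        if not database:
            raise ValueError("monitor database needs a database name")
        return f"monitor database {database}"
    if scope == "process":
        if not host or pid is None:
            raise ValueError("monitor process needs a host and a pid")
        return f"monitor process host {host} pid {pid}"
    raise ValueError(f"unknown scope: {scope}")


def nuodbmgr_argv(broker, password, command):
    """Build an argument list suitable for subprocess, avoiding shell quoting."""
    return ["nuodbmgr", "--broker", broker, "--password", password,
            "--command", command]


argv = nuodbmgr_argv("host", "password", monitor_command("database", database="test2"))
print(argv)
```

The resulting list can be passed to subprocess.Popen with stdout=subprocess.PIPE to read the statistic sets line by line as they are published.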

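Parameters

database_name: The name of the database whose processes are monitored.
host_name: The host on which the database process to monitor is running.
process_id: The process ID (pid) of the database process to monitor.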

Useful Properties

monitorDomainNew

The following properties are only relevant if monitorDomainNew equals true.

monitorDatabaseProcessStats
monitorDomainAggregates
monitorDomainAlarm
monitorHostStats
alarmPretty

See NuoDB Manager Properties.
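Properties are set in the same --command string as the monitor command, using the "set property name value true" form shown in the scripting example on this page. The sketch below builds such a string from a mapping of properties. The helper name is hypothetical, and chaining more than one set property command with semicolons is an assumption extrapolated from the single-property example.

```python
def with_properties(command, properties):
    """Prefix a command with one 'set property' statement per property."""
    parts = [
        f"set property {name} value {'true' if enabled else 'false'}"
        for name, enabled in properties.items()
    ]
    parts.append(command)
    return "; ".join(parts)


single = with_properties("monitor domain", {"monitorDomainNew": True})
# Assumption: several properties can be chained in the same way.
several = with_properties(
    "monitor domain",
    {"monitorDomainNew": True, "monitorHostStats": False},
)
print(single)
print(several)
```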

Interactive Example

Currently the monitor command is only supported via the nuodbmgr --command parameter.

Scripting Example

The following example shows monitor domain output for the entire domain. Because monitorDomainNew is not set, it keeps its default value of false.

$ nuodbmgr --broker host --password password --command "monitor domain"
Dec 12, 2014 5:03:10 PM [SM] ip-172-31-40-24/107.23.52.10:48006 (us-east-1) [ pid = 32594 ] [ db = test2 ] [ nodeId = 1 ] RUNNING
Idle 0% + CPU 0% Runnable 0% Sync 0% Fetch 0% Throttle 0% Network Send 0% Archive Read 0% Write 0% Journal 0%
ActualVersion = 105
AdminReceived = 0
AdminSent = 1
ArchiveBufferedBytes = 0
ArchiveDirectory = /var/opt/nuodb/production-archives/test2
...
Dec 12, 2014 5:03:11 PM [TE] ip-172-31-40-24/107.23.52.10:48007 (us-east-1) [ pid = 32611 ] [ db = test2 ] [ nodeId = 2 ] RUNNING
Idle 8% + CPU 15% Runnable 0% Sync 25% Lock 0% Fetch 1% Commit 1% Throttle 57% Network Send 2% 
ActualVersion = 105
AdminReceived = 0
AdminSent = 1
ArchiveBufferedBytes = 0
...
Dec 12, 2014 5:07:31 PM [TE] ip-172-31-14-118/54.173.96.113:48006 (us-east-1) [ pid = 25120 ] [ db = test ] [ nodeId = 4 ] RUNNING
Idle 8% + CPU 16% Runnable 0% Sync 28% Lock 0% Fetch 1% Commit 1% Throttle 23% Network Send 2% 
HeapAllocated = 15395448
Milliseconds = 10000
UserMilliseconds = 16
 
Dec 12, 2014 5:07:31 PM [SM] ip-172-31-46-122/54.165.58.157:48005 (us-east-1) [ pid = 30522 ] [ db = test ] [ nodeId = 1 ] RUNNING
Idle 0% + CPU 0% Runnable 0% Sync 0% Fetch 0% Throttle 0% Network Send 0% Archive Read 0% Write 0% Journal 0%
HeapActive = 52891648
HeapAllocated = 45047248
KernelMilliseconds = 44
Milliseconds = 10000
NodeApplyPingTime = 0
PacketsReceived = 80
UserMilliseconds = 0
...
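Output in this default format can be grouped into one record per statistic set: a status line, a usage summary line, and the metric lines that follow. The sketch below is one way to do that, assuming that metric lines have the form "Name = value" and that usage summary lines begin with "Idle ", as in the sample above. It does not handle the alarm lines produced when monitorDomainNew is true.

```python
import re

METRIC_RE = re.compile(r"^([\w-]+) = (.*)$")


def group_statistic_sets(lines):
    """Group monitor output lines into one record per statistic set."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line == "...":
            continue  # blank lines and the elisions used in this page's samples
        metric = METRIC_RE.match(line)
        if metric:
            if current is not None:
                current["metrics"][metric.group(1)] = metric.group(2)
        elif line.startswith("Idle "):
            if current is not None:
                current["usage"] = line
        else:
            # Anything else is treated as a status line opening a new set.
            current = {"header": line, "usage": None, "metrics": {}}
            records.append(current)
    return records


sample = """\
Dec 12, 2014 5:07:31 PM [TE] ip-172-31-14-118/54.173.96.113:48006 (us-east-1) [ pid = 25120 ] [ db = test ] [ nodeId = 4 ] RUNNING
Idle 8% + CPU 16% Runnable 0% Sync 28% Lock 0% Fetch 1% Commit 1% Throttle 23% Network Send 2%
HeapAllocated = 15395448
Milliseconds = 10000
UserMilliseconds = 16

Dec 12, 2014 5:07:31 PM [SM] ip-172-31-46-122/54.165.58.157:48005 (us-east-1) [ pid = 30522 ] [ db = test ] [ nodeId = 1 ] RUNNING
Idle 0% + CPU 0% Runnable 0% Sync 0% Fetch 0% Throttle 0% Network Send 0% Archive Read 0% Write 0% Journal 0%
HeapActive = 52891648
PacketsReceived = 80
...
"""
records = group_statistic_sets(sample.splitlines())
for record in records:
    print(record["header"].split(" [ pid")[0], len(record["metrics"]))
```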

The following example sets monitorDomainNew to true. The output uses the one-line process status format and also includes operating system metrics, aggregate statistics, and triggered alarms.

$ nuodbmgr --broker host --password password --command "set property monitorDomainNew value true; monitor domain"
Nov 3, 2015 1:13:02 PM (aggregate) [ domain ]: 
ArchiveQueue = 0.0
BytesReceived = 61.0
BytesSent = 68.59
ClientCncts = 0.0
ClientReceived = 0.0
...
Nov 3, 2015 1:13:02 PM (OS) [ ip-172-31-1-136/172.31.1.136:48004 ]: 
OS-cpuSystemTimePercent = 0.0
OS-cpuTotalTimePercent = 0.0
OS-fsVarDirUsePercent = 43.0
OS-memUsedPercent = 5.41
OS-netAllInbound = 0.0
OS-netAllOutbound = 1.0
...
Nov 3, 2015 1:13:22 PM (aggregate) [ test ]: 
ArchiveQueue = 0.0
BytesReceived = 61.0
BytesSent = 68.15
ClientCncts = 0.0
ClientReceived = 0.0
...
SqlListenerThrottleTime = 0.0
StallPointWaitTime = 0.0
Stalls = 0.0
Updates = 0.0
Nov 3, 2015 1:13:11 PM NodeLeft Alarm [domainleft]:
  Details: Node TE db=[test2] pid=17021 id=22 req=TEs (local)
  Alarm Definition [domainleft]: type=NodeLeft dimension=Domain entity=* (Warning)
...

For more examples, see Obtaining Metrics for the Domain, Hosts and Processes.