NuoDB Metrics Available from the Collector

Telegraf, an open-source collector, runs alongside the NuoDB Storage Manager (SM) and Transaction Engine (TE) processes to collect and publish their metrics. This section describes key metrics collected about a NuoDB database that can help in understanding database behavior.

The following are some example InfluxDB column names with processing formulas.

  • raw = no processing

  • value = raw/Milliseconds

  • normvalue = value/NumberCores

  • rate = raw * 1000/Milliseconds

  • ncore = raw * 0.01

  • Milliseconds is taken from the raw statistics and is the length of the measurement period in milliseconds, typically 10 s (10000 ms).

  • NumberCores is taken from the raw statistics and is the number of cores available on the host.

  • rate is the metric per second, for example, the number of transactions per second.

  • ncore converts a raw percentage into a number of cores. A typical application is CPU usage: a 4-core system reports percentages between 0 and 400%, so multiplying by 0.01 (dividing by 100) gives the number of cores used, between 0 and 4.
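The processing formulas above can be written out as small functions. This is a minimal sketch, not collector code: the function names and sample numbers are illustrative, and a 10000 ms measurement period is assumed.

```python
def value(raw: float, milliseconds: float) -> float:
    """Raw time divided by the measurement period."""
    return raw / milliseconds


def normvalue(raw: float, milliseconds: float, number_cores: int) -> float:
    """value, normalized by the number of cores on the host."""
    return value(raw, milliseconds) / number_cores


def rate(raw: float, milliseconds: float) -> float:
    """Raw count converted to a per-second rate."""
    return raw * 1000 / milliseconds


def ncore(raw_percent: float) -> float:
    """Raw percentage converted to a number of cores."""
    return raw_percent * 0.01


if __name__ == "__main__":
    period_ms = 10_000  # assumed measurement period
    print(value(5_000, period_ms))         # time-based metric
    print(normvalue(20_000, period_ms, 4)) # same, per core on a 4-core host
    print(rate(250, period_ms))            # count-based metric, per second
    print(ncore(350))                      # 350% CPU as a number of cores
```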

Each entry below lists, in order: Collector Statistic, Definition, Value Range, Use Case, and Collector Value Derivation.

Summary.CPU

Summary.CPU is the CPU time per SM and TE

0 to max Millisecond interval

Typical values should not exceed 60%-80% of the total CPU time, excluding peaks of increased workload. Sustained use of over 80% must be investigated.

( UserMilliseconds + KernelMilliseconds )/Milliseconds
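The 80% guidance can be checked against a series of samples. The sketch below makes two assumptions that are not in the metric definition: it divides by NumberCores to express CPU as a fraction of the total available, and it treats "sustained" as a run of consecutive samples above the threshold (min_samples is a hypothetical parameter).

```python
from typing import Iterable


def cpu_fraction(user_ms: float, kernel_ms: float,
                 period_ms: float, number_cores: int) -> float:
    """(UserMilliseconds + KernelMilliseconds)/Milliseconds, per core (assumption)."""
    return (user_ms + kernel_ms) / period_ms / number_cores


def sustained_high_cpu(fractions: Iterable[float],
                       threshold: float = 0.8,
                       min_samples: int = 6) -> bool:
    """True if min_samples consecutive samples exceed the threshold."""
    run = 0
    for fraction in fractions:
        run = run + 1 if fraction > threshold else 0
        if run >= min_samples:
            return True
    return False
```

With a 10-second period, min_samples=6 corresponds to one minute above the threshold; a suitable window depends on the workload.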

Summary.Commit

Summary.Commit is the time taken to commit a transaction durably

0 to max Millisecond interval, better values towards 0

Using history, observe high and low periods of database activity. After a baseline is established, investigate commit times that fall outside the usual pattern.

RemoteCommitTime/Milliseconds

Summary.NtwkSend

Summary.NtwkSend is the time spent sending packets for an SM or a TE

0 to max interval, better values towards 0

Using history, observe high and low periods of activity. Actual numbers will depend on network speed and the contention of unrelated traffic.

NodeSocketBufferWriteTime/Milliseconds

Summary.Sync

Summary.Sync is the time per SM and TE spent waiting

0 to max interval, better values towards 0

Examples are waiting for other threads or waiting for system operations to complete. Investigate when waiting time is out of line with the other summary statistics.

( SyncPointWaitTime + StallPointWaitTime - PlatformObjectCheckOpenTime - PlatformObjectCheckPopulatedTime - PlatformObjectCheckCompleteTime )/Milliseconds

Summary.Fetch

Summary.Fetch is the time spent fetching data

0 to hardware max, better values are towards 0

Examples are waiting to load data from disk or waiting for system operations to complete. Investigate when fetch time is out of line with the other summary statistics.

( PlatformObjectCheckOpenTime + PlatformObjectCheckPopulatedTime + PlatformObjectCheckCompleteTime + LoadObjectTime )/Milliseconds

Summary.Lock

Summary.Lock is the time transactions spent blocked by another transaction

0 to hardware max, better values are towards 0

For example, in READ_COMMITTED mode, if one transaction is modifying a row and a second transaction wants to read that row, the second transaction is deliberately delayed until the first transaction has committed or rolled back.

TransactionBlockedTime/Milliseconds

Summary.Throttle

Summary.Throttle is the time duration for which the DBMS deliberately degrades performance

0 to hardware max, better values are towards 0

Typically, hardware limitations cause the DBMS to deliberately slow down transactions to balance database performance against hardware performance. The usual culprit is variable disk performance.

( ArchiveSyncThrottleTime + MemoryThrottleTime + WriteThrottleTime + ArchiveBandwidthThrottleTime + JournalBandwidthThrottleTime )/Milliseconds
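The Summary.* derivations above can be computed together from one raw sample. This is a sketch under stated assumptions: the raw statistics are available as a dictionary keyed by the field names used in the derivations, and terms listed on separate lines in a derivation are summed. The function name summary_metrics is illustrative.

```python
THROTTLE_FIELDS = (
    "ArchiveSyncThrottleTime",
    "MemoryThrottleTime",
    "WriteThrottleTime",
    "ArchiveBandwidthThrottleTime",
    "JournalBandwidthThrottleTime",
)

CHECK_FIELDS = (
    "PlatformObjectCheckOpenTime",
    "PlatformObjectCheckPopulatedTime",
    "PlatformObjectCheckCompleteTime",
)


def summary_metrics(raw: dict) -> dict:
    """Derive the Summary.* values for one measurement period."""
    ms = raw["Milliseconds"]
    # The PlatformObjectCheck* times are subtracted from Sync and added to Fetch.
    check = sum(raw[name] for name in CHECK_FIELDS)
    throttle = sum(raw[name] for name in THROTTLE_FIELDS)
    return {
        "Summary.CPU": (raw["UserMilliseconds"] + raw["KernelMilliseconds"]) / ms,
        "Summary.Commit": raw["RemoteCommitTime"] / ms,
        "Summary.NtwkSend": raw["NodeSocketBufferWriteTime"] / ms,
        "Summary.Sync": (raw["SyncPointWaitTime"] + raw["StallPointWaitTime"] - check) / ms,
        "Summary.Fetch": (check + raw["LoadObjectTime"]) / ms,
        "Summary.Lock": raw["TransactionBlockedTime"] / ms,
        "Summary.Throttle": throttle / ms,
    }
```

Because all seven values share the same denominator, they can be charted on one axis and compared directly, which is how a value "out of line with other summary statistics" would show up.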

Commits

Commits is the number of transaction commits

0 to hardware max

Using workload history, chart periods of high activity and low activity to discover any anomalies. Using other graphs, correlate events.

( Commits/Milliseconds ) * 1000

Rollbacks

Rollbacks is the number of transaction rollbacks

0 to hardware max

Using workload history, chart periods of high activity and low activity, to discover any anomalies. Using other graphs, correlate events. Note that the rollbacks metric can include database-generated rollbacks.

( Rollbacks/Milliseconds ) * 1000

Active SQL Transactions

CurrentActiveTransactions is the number of active SQL transactions

0 to hardware max

Using history, find the number of transactions to see periods of high and low database utilization.

CurrentActiveTransactions

SQL Transaction Time

SqlListenerSqlProcTime is the processing time for transactions

0 to hardware max, better values towards 0

Using history, find the peaks and troughs of how long transactions are taking. Longer times may need investigation.

SqlListenerSqlProcTime/Milliseconds

SQL Transaction Idle Time

SqlListenerIdleTransactionTime is the time a running transaction was idle. It is a subset of SqlListenerSqlProcTime.

0 to hardware max, better values towards 0

Using the history of workload, find how much time is spent waiting for the application or other system events.

SqlListenerIdleTransactionTime/Milliseconds
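Because the idle time is a subset of the SQL processing time, the two can be combined to show how much of the transaction time was not idle. This is a sketch; the function name and the "active" label are illustrative and are not collector column names.

```python
def sql_time_breakdown(proc_ms: float, idle_ms: float, period_ms: float) -> dict:
    """Split SqlListenerSqlProcTime into idle and non-idle parts."""
    total = proc_ms / period_ms  # SqlListenerSqlProcTime/Milliseconds
    idle = idle_ms / period_ms   # SqlListenerIdleTransactionTime/Milliseconds
    return {"total": total, "idle": idle, "active": total - idle}
```

A high idle share suggests that transactions are mostly waiting on the application rather than on the database.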

Client connections

ClientCnct is the number of client connections to the database

0 to hardware max, depending on the application

Using the history of workload, find peaks and troughs of connections. Typically, the number of connections is steady when using a connection pool. It may vary using other methods.

ClientCnct

Inserts

Inserts is the number of SQL inserts executed

0 to hardware max, depending on the application

Using the history of workload, determine peaks and troughs of inserts.

( Inserts/Milliseconds ) * 1000

Deletes

Deletes is the number of SQL deletes executed

0 to hardware max, depending on the application

Using the history of workload, determine peaks and troughs of deletes.

( Deletes/Milliseconds ) * 1000

Updates

Updates is the number of SQL updates executed

0 to hardware max, depending on the application

Using the history of workload, determine peaks and troughs of updates.

( Updates/Milliseconds ) * 1000

Inserts pending time

PendingInsertWaitTime is the time spent waiting for SQL inserts to be executed

0 to hardware max, depending on the application

Using the history of workload, find peaks and troughs of wait time for inserts. A longer pending time is indicative of a performance issue.

PendingInsertWaitTime/Milliseconds

Updates pending time

PendingUpdateWaitTime is the time spent waiting for SQL updates to be executed

0 to hardware max, depending on the application

Using the history of workload, find peaks and troughs of wait time for updates. A longer pending time is indicative of a performance issue.

PendingUpdateWaitTime/Milliseconds

HeapAllocated

HeapAllocated is the memory used in an SM and TE process in bytes

0 to hardware max, typical values towards --mem setting

HeapAllocated will show a sawtooth pattern over time.

HeapAllocated

Objects

Objects is the number of atoms in memory.

0 to hardware max, depending on the workload.

The number of atoms in memory depends on the working set and workload cadence.

Objects

Objects Created

ObjectsCreated is the number of atoms created in memory

0 to hardware max, depending on the workload.

The number of atoms created in memory depends on the working set and workload cadence.

( ObjectsCreated/Milliseconds ) * 1000

Objects Dropped

ObjectsDropped is the number of atoms dropped from memory. They are gone from the database completely.

0 to hardware max, depending on the workload

The number of atoms in memory dropped depends on workload and workload cadence.

( ObjectsDropped/Milliseconds ) * 1000

Objects Loaded

ObjectsLoaded is the number of atoms loaded into memory. They are loaded from archive storage.

0 to hardware max, depending on the workload

The number of atoms in memory loaded depends on workload and workload cadence.

( ObjectsLoaded/Milliseconds ) * 1000

Objects Reloaded

ObjectsReloaded is the number of atoms reloaded into SM memory. They are loaded from TE memory.

0 to hardware max, depending on the workload

The number of atoms reloaded into memory depends on workload and workload cadence.

( ObjectsReloaded/Milliseconds ) * 1000

Objects Purged

ObjectsPurged is the number of SM atoms marked as not being used in any transaction

0 to hardware max, depending on the workload

The number of atoms purged depends on workload and workload cadence.

( ObjectsPurged/Milliseconds ) * 1000

Objects Dropped Purged

DroppedPurged is the number of SM atoms deleted from SM memory. They were previously marked as purged.

0 to hardware max, depending on the workload. Better values towards zero

The number of atoms purged and then dropped depends on workload and workload cadence. Note that DroppedPurged atoms are also counted in the ObjectsDropped statistic.

( DroppedPurged/Milliseconds ) * 1000
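Since atoms that were purged and then dropped are also counted in the dropped statistic, subtracting one rate from the other separates the two causes. This is a sketch; the function name and the "dropped_other" label are illustrative.

```python
def drop_rates(objects_dropped: float, dropped_purged: float,
               period_ms: float) -> dict:
    """Per-second drop rates, with purged drops separated out."""
    dropped = objects_dropped * 1000 / period_ms
    purged = dropped_purged * 1000 / period_ms
    return {
        "dropped": dropped,                 # all dropped atoms
        "dropped_purged": purged,           # dropped after being purged
        "dropped_other": dropped - purged,  # dropped for any other reason
    }
```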

Objects Imported

Imported is the number of atoms in memory imported from an SM or a TE

0 to hardware max, depending on the workload. Better values towards zero

The number of atoms in memory imported depends on workload and workload cadence. Note that one SM or TE should export atoms and there should be a corresponding SM or TE importing atoms.

( Imported/Milliseconds ) * 1000

Objects Exported

Exported is the number of atoms in memory exported from an SM or a TE

0 to hardware max, depending on the workload. Better values towards zero

The number of atoms in memory exported depends on workload and workload cadence. Note that one SM or TE should export atoms and there should be a corresponding SM or TE importing atoms.

( Exported/Milliseconds ) * 1000
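The pairing of exports with imports can be checked by summing both counters across all processes for the same period. This is a sketch under assumptions: the per-process samples are available as dictionaries keyed by a process label (the labels below are hypothetical), and the totals may not match exactly when measurement periods are not aligned across processes.

```python
from typing import Mapping


def export_import_gap(samples: Mapping[str, Mapping[str, int]]) -> int:
    """Total Exported minus total Imported across all SM and TE samples."""
    exported = sum(s.get("Exported", 0) for s in samples.values())
    imported = sum(s.get("Imported", 0) for s in samples.values())
    return exported - imported


if __name__ == "__main__":
    period = {
        "sm-1": {"Exported": 120, "Imported": 0},
        "te-1": {"Exported": 0, "Imported": 120},
    }
    print(export_import_gap(period))  # a gap near zero means exports were matched
```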

ObjectsSaved

ObjectsSaved is the number of atoms in memory saved to disk

0 to hardware max, depending on the workload

The number of atoms in memory saved depends on workload and workload cadence.

( ObjectsSaved/Milliseconds ) * 1000

Archive Fsync Time

ArchiveFsyncTime is the time spent completing file writes to the archive

0 to hardware max, better values towards 0

Archive file sync time should be a small fraction of total measurement time.

ArchiveFsyncTime/Milliseconds

Archive Directory Time

ArchiveDirectoryTime is the time spent managing archive directories during the measurement period

0 to hardware max, better values towards 0

Directory management should account for very little of the measurement period.

ArchiveDirectoryTime/Milliseconds