NuoDB Metrics Available from the Collector
The open-source collector Telegraf runs alongside the NuoDB Storage Manager (SM) and Transaction Engine (TE) processes, collects metrics from them, and publishes those metrics. This section describes key metrics collected about a NuoDB database that can help in understanding database behavior.
The following are some example InfluxDB column names with processing formulas.
- raw = no processing
- value = raw/Milliseconds
- normvalue = value/NumberCores
- rate = raw * 1000/Milliseconds
- ncore = raw * 0.01
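The processing formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the collector: the function name `derive_columns`, the `DerivedColumns` type, and the argument names are hypothetical, and the inputs are assumed to be the raw counter, the length of the sample interval in milliseconds, and the number of cores on the host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivedColumns:
    """Derived columns for one raw sample (hypothetical container)."""
    raw: float
    value: float
    normvalue: float
    rate: float
    ncore: float


def derive_columns(raw: float, milliseconds: float, number_cores: int) -> DerivedColumns:
    """Apply the five processing formulas to a single raw sample."""
    if milliseconds <= 0:
        raise ValueError("milliseconds must be positive")
    if number_cores <= 0:
        raise ValueError("number_cores must be positive")
    value = raw / milliseconds              # value = raw/Milliseconds
    return DerivedColumns(
        raw=raw,                            # raw = no processing
        value=value,
        normvalue=value / number_cores,     # normvalue = value/NumberCores
        rate=raw * 1000 / milliseconds,     # rate = raw * 1000/Milliseconds
        ncore=raw * 0.01,                   # ncore = raw * 0.01
    )
```

For a raw counter of 5000 over a 10000 ms interval on a 4-core host, `derive_columns(5000, 10000, 4)` gives a value of 0.5, a normvalue of 0.125, a rate of 500 per second, and an ncore of about 50.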
Collector Statistics | Definition | Value Range | Use Case | Collector Value Derivation |
---|---|---|---|---|
Summary.CPU | Summary.CPU is the CPU time per SM and TE | 0 to max Millisecond interval | Typical values should not exceed 60%-80% of the total CPU time, excluding peaks of increased workload. Sustained use of over 80% must be investigated. | ( UserMilliseconds + KernelMilliseconds )/Milliseconds |
Summary.Commit | Summary.Commit is the time taken to commit a transaction durably | 0 to max Millisecond interval, better values towards 0 | Using history, observe high and low periods of database activity. After a history is established, investigate activity that falls outside those periods. | RemoteCommitTime/Milliseconds |
Summary.NtwkSend | Summary.NtwkSend is the time spent sending packets for an SM or a TE | 0 to max interval, better values towards 0 | Using history, observe high and low periods of activity. Actual numbers will depend on network speed and the contention of unrelated traffic. | NodeSocketBufferWriteTime/Milliseconds |
Summary.Sync | Summary.Sync is the time per SM and TE spent waiting | 0 to max interval, better values towards 0 | Examples are waiting for other threads or waiting for system operations to complete. Investigate when waiting time is out of line with the other summary statistics. | (SyncPointWaitTime |
Summary.Fetch | Summary.Fetch is the time spent fetching data | 0 to hardware max, better values are towards 0 | Examples are waiting to load data from disk or waiting for system operations to complete. Investigate when fetch time is out of line with the other summary statistics. | ( PlatformObjectCheckOpenTime |
Summary.Lock | Summary.Lock is the time transactions spent blocked by another transaction | 0 to hardware max, better values are towards 0 | For example, in READ_COMMITTED mode, if one transaction is modifying a row that a second transaction wants to read, the second transaction is deliberately delayed until the first transaction has committed or rolled back. | TransactionBlockedTime/Milliseconds |
Summary.Throttle | Summary.Throttle is the time during which the DBMS deliberately degrades performance | 0 to hardware max, better values are towards 0 | Typically, hardware limitations cause the DBMS to slow down transactions deliberately in an attempt to balance database performance against hardware performance. The usual culprit is variable disk performance. | ( ArchiveSyncThrottleTime + MemoryThrottleTime |
Commits | Commits is the number of transaction commits | 0 to hardware max | Using workload history, chart periods of high and low activity to discover any anomalies. Using other graphs, correlate events. | ( Commits/Milliseconds ) * 1000 |
Rollbacks | Rollbacks is the number of transaction rollbacks | 0 to hardware max | Using workload history, chart periods of high and low activity to discover any anomalies. Using other graphs, correlate events. Note that the rollbacks metric can include database-generated rollbacks. | ( Rollbacks/Milliseconds ) * 1000 |
Active SQL Transactions | CurrentActiveTransactions is the number of active SQL transactions | 0 to hardware max | Using history, track the number of active transactions to identify periods of high and low database utilization. | CurrentActiveTransactions |
SQL Transaction Time | SqlListenerSqlProcTime is the processing time for transactions | 0 to hardware max, better values towards 0 | Using history, find the peaks and troughs of how long transactions are taking. Longer times may need investigation. | SqlListenerSqlProcTime/Milliseconds |
SQL Transaction Idle Time | SqlListenerIdleTransactionTime is the time a running transaction was idle. It is a subset of SqlListenerSqlProcTime. | 0 to hardware max, better values towards 0 | Using the history of workload, find how much time is spent waiting for the application or other system events. | SqlListenerIdleTransactionTime/Milliseconds |
Client connections | ClientCnct is the number of client connections to the database | 0 to hardware max, depending on the application | Using the history of workload, find peaks and troughs of connections. Typically, the number of connections is steady when using a connection pool. It may vary with other connection methods. | ClientCnct |
Inserts | Inserts is the number of SQL inserts executed | 0 to hardware max, depending on the application | Using the history of workload, determine peaks and troughs of inserts. | ( Inserts/Milliseconds ) * 1000 |
Deletes | Deletes is the number of SQL deletes executed | 0 to hardware max, depending on the application | Using the history of workload, determine peaks and troughs of deletes. | ( Deletes/Milliseconds ) * 1000 |
Updates | Updates is the number of SQL updates executed | 0 to hardware max, depending on the application | Using the history of workload, determine peaks and troughs of updates. | ( Updates/Milliseconds ) * 1000 |
Inserts pending time | PendingInsertWaitTime is the time spent waiting for SQL inserts to be executed | 0 to hardware max, depending on the application | Using the history of workload, find peaks and troughs of wait time for inserts. A longer pending time is indicative of a performance issue. | PendingInsertWaitTime/Milliseconds |
Updates pending time | PendingUpdateWaitTime is the time spent waiting for SQL updates to be executed | 0 to hardware max, depending on the application | Using the history of workload, find peaks and troughs of wait time for updates. A longer pending time is indicative of a performance issue. | PendingUpdateWaitTime/Milliseconds |
HeapAllocated | HeapAllocated is the memory, in bytes, used by an SM or TE process | 0 to hardware max, typical values towards --mem setting | Heap allocation typically follows a sawtooth pattern. | HeapAllocated |
Objects | Objects is the number of atoms in memory | 0 to hardware max, depending on the workload | The number of atoms in memory depends on the working set and workload cadence. | Objects |
Objects Created | ObjectsCreated is the number of atoms created in memory | 0 to hardware max, depending on the workload | The number of atoms created in memory depends on the working set and workload cadence. | ( ObjectsCreated/Milliseconds ) * 1000 |
Objects Dropped | ObjectsDropped is the number of atoms dropped from memory. They are gone from the database completely. | 0 to hardware max, depending on the workload | The number of atoms dropped from memory depends on workload and workload cadence. | ( ObjectsDropped/Milliseconds ) * 1000 |
Objects Loaded | ObjectsLoaded is the number of atoms loaded into memory. They are loaded from archive storage. | 0 to hardware max, depending on the workload | The number of atoms loaded into memory depends on workload and workload cadence. | ( ObjectsLoaded/Milliseconds ) * 1000 |
Objects Reloaded | ObjectsReloaded is the number of atoms reloaded into SM memory. They are loaded from TE memory. | 0 to hardware max, depending on the workload | The number of atoms reloaded into memory depends on workload and workload cadence. | ( ObjectsReloaded/Milliseconds ) * 1000 |
Objects Purged | ObjectsPurged is the number of SM atoms marked as not being used in any transaction | 0 to hardware max, depending on the workload | The number of atoms purged depends on workload and workload cadence. | ( ObjectsPurged/Milliseconds ) * 1000 |
Objects Dropped Purged | DroppedPurged is the number of SM atoms deleted from SM memory. They were previously marked as purged. | 0 to hardware max, depending on the workload. Better values towards 0 | The number of atoms purged and then dropped from memory depends on workload and workload cadence. Note that DroppedPurged atoms are also counted in the ObjectsDropped statistic. | ( DroppedPurged/Milliseconds ) * 1000 |
Objects Imported | Imported is the number of atoms in memory imported from an SM or a TE | 0 to hardware max, depending on the workload. Better values towards 0 | The number of atoms imported into memory depends on workload and workload cadence. Note that when one SM or TE exports atoms, there should be a corresponding SM or TE importing them. | ( Imported/Milliseconds ) * 1000 |
Objects Exported | Exported is the number of atoms in memory exported from an SM or a TE | 0 to hardware max, depending on the workload. Better values towards 0 | The number of atoms exported from memory depends on workload and workload cadence. Note that when one SM or TE exports atoms, there should be a corresponding SM or TE importing them. | ( Exported/Milliseconds ) * 1000 |
ObjectsSaved | ObjectsSaved is the number of atoms in memory saved to disk | 0 to hardware max, depending on the workload | The number of atoms saved from memory depends on workload and workload cadence. | ( ObjectsSaved/Milliseconds ) * 1000 |
Archive Fsync Time | ArchiveFsyncTime is the time spent completing file writes to the archive | 0 to hardware max, better values towards 0 | Archive file sync time should be a small fraction of total measurement time. | ArchiveFsyncTime/Milliseconds |
Archive Directory Time | ArchiveDirectoryTime is the amount of time within the measurement interval spent managing directories | 0 to hardware max, better values towards 0 | Little directory administration is expected, so this value should stay low. | ArchiveDirectoryTime/Milliseconds |
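As an illustration of the derivations in the table, the sketch below computes a few collector values from one raw sample. It is not taken from the collector itself: the function names `summarize` and `sustained_above` and the dictionary layout are hypothetical, and it only covers rows whose derivation is given in full above. Treating "sustained" as "every sample in the window is above the threshold" is also an assumption, as is comparing the unnormalized Summary.CPU ratio with the 80% guideline.

```python
from typing import Dict, Iterable


def summarize(sample: Dict[str, float]) -> Dict[str, float]:
    """Derive selected collector values from one raw sample (hypothetical layout)."""
    ms = sample["Milliseconds"]
    if ms <= 0:
        raise ValueError("Milliseconds must be positive")
    return {
        # Time-based metrics: fraction of the measurement interval.
        "Summary.CPU": (sample["UserMilliseconds"] + sample["KernelMilliseconds"]) / ms,
        "Summary.Commit": sample["RemoteCommitTime"] / ms,
        "Summary.Lock": sample["TransactionBlockedTime"] / ms,
        # Count-based metrics: events per second.
        "Commits": sample["Commits"] / ms * 1000,
        "Rollbacks": sample["Rollbacks"] / ms * 1000,
    }


def sustained_above(values: Iterable[float], threshold: float = 0.8) -> bool:
    """Return True when every value in a non-empty window exceeds the threshold."""
    window = list(values)
    return bool(window) and all(v > threshold for v in window)
```

A caller could apply `summarize` to each sample in a time window and pass the resulting Summary.CPU values to `sustained_above` to flag sustained use over 80%, while ignoring isolated peaks.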