Obtaining Index Use Statistics

After examining the query plan, and modifying the Index as necessary, we can then look at the statistics for that index. Index statistics are useful both to the Query Optimizer and to the user. They tell us more about the index and its effectiveness with the particular data on which it is executing. Index statistics involves four System Tables:

SYSTEM.INDEXSTATISTICS
SYSTEM.INDEXPREFIXSTATISTICS
SYSTEM.INDEXHISTOGRAMS
SYSTEM.INDEXHISTOGRAMBUCKETS

The ANALYZE command should be executed prior to looking at the index statistics. (See ANALYZE).

Index statistics are only available if the object of interest is in memory on the Transaction Engine (TE) to which the client (e.g. nuosql) is connected. If, for whatever reason (e.g. garbage collection), the object is no longer in memory, then index statistics will not be available.

INDEXSTATISTICS

SYSTEM.INDEXSTATISTICS is a system pseudo table containing one row for every index in your database. Its columns are defined below. This is the only table that maps Index statistics (including INDEXHISTOGRAMS and INDEXHISTOGRAMBUCKETS) back to the name of the index. For any given index, if no statistics about the index exist in the table, then no statistics are available to the NuoDB Query Optimizer and you should run ANALYZE on that table or index.

INDEXSTATISTICS Columns

Column Description

OBJECTID

The ID for the database object - important for looking up hstogram information below.

CATALOGID

The catalog ID for the database object - important for looking up histogram information below.

SCHEMA

The schema in which the index was created

TABLENAME

The table name on which the index exists

INDEXNAME

The name of the index

KEYCOUNT

The number of keys in the index - should be the same as the cardinality of the table

INDEXPREFIXSTATISTICS

SYSTEM.INDEXPREFIXSTATISTICS is a system pseudo table containing multiple rows for each index defined in the database. Its columns are defined below. For any given index, if no statistics about the index exist in the table, then no statistics are available to the NuoDB Query Optimizer and you should run ANALYZE on that table or index.

INDEXPREFIXSTATISTICS Columns

Column Description

OBJECTID

The database object ID for the index

CATALOGID

The database catalog ID for the index

PREFIXLENGTH

The number of index key columns that this particular distinct count is valid for

DISTINCTCOUNT

An estimate of the number of distinct values with that prefix

AVERAGEKEYLENGTH

Average key size for the keys whose prefix count is given in this row

SOURCE

"statistics" or "calculated"
We calculate the first prefix, 2 columns of an index and the full key. All other values are interpolated.
"statistics" - means that this is a good faith estimate of the number of distinct values using actual values in the index keys
"calculated" - means that this value has been estimated using the available statistics based on actual values

INDEXHISTOGRAMS

SYSTEM.INDEXHISTOGRAMS is a system pseudo table containing one row for every index in your database. This table represents data related to the index histogram. It is currently almost an exact subset of INDEXSTATISTICS with the exception of FIELDCOUNT, but this is subject to change in the future.

INDEXHSITOGRAMS Columns

Column Description

OBJECTID

The ID for the database object

CATALOGID

The Catalog ID for the database object

HISTOGRAMID

The ID for the histogram, unique to the table

FIELDCOUNT

The number of fields in the index

MAXRESOLUTION

The maximum number of histogram buckets. Index histograms are one of the most important parts of index statistics. By default, the number of histogram buckets created is 10. Raising this number (see CREATE INDEX) will make more precise statistics available to the query optimizer.

INDEXHISTOGRAMBUCKETS

SYSTEM.INDEXHISTOGRAMBUCKETS is a system pseudo table containing one row for each bucket in the Index’s histogram. So, if MAXRESOLUTION is 10, there will be ten rows per index.

INDEXHISTOGRAMBUCKETS Columns

Column Description

OBJECTID

The ID for the database object

CATALOGID

The Catalog ID for the database object

HISTOGRAMID

The ID for the histogram, unique to the table

BUCKETINDEX

The index of the bucket within the histogram. Values in this column are 0 to MAXRESOLUTION.

BOUNDARY

For all but the first bucket, this value is the upper most value for each field in the index, in that bucket. The first bucket contains one key which is the lowest key in the index.