Obtaining Index Use Statistics

After examining the query plan, and modifying the Index as necessary, we can then look at the statistics for that index. Index statistics are useful both to the Query Optimizer and to the user. They tell us more about the index and its effectiveness with the particular data on which it is executing. Index statistics involves four System Tables:

SYSTEM.INDEXSTATISTICS
SYSTEM.INDEXPREFIXSTATISTICS
SYSTEM.INDEXHISTOGRAMS
SYSTEM.INDEXHISTOGRAMBUCKETS

The ANALYZE command should be executed prior to looking at the index statistics. (See ANALYZE).

Index statistics are only available if the object of interest is in memory on the transaction engine (TE) to which the client (e.g. nuosql) is connected. If, for whatever reason (e.g. garbage collection), the object is no longer in memory, then index statistics will not be available.

INDEXSTATISTICS

SYSTEM.INDEXSTATISTICS is a system pseudo table containing one row for every index in your database. Its columns are defined below. This is the only table that maps Index statistics (including INDEXHISTOGRAMS and INDEXHISTOGRAMBUCKETS) back to the name of the index. For any given index, if no statistics about the index exist in the table, then no statistics are available to the NuoDB Query Optimizer and you should run ANALYZE on that table or index.

INDEXSTATISTICS Columns

Column Description
OBJECTID The ID for the database object - important for looking up hstogram information below.
CATALOGID The catalog ID for the database object - important for looking up histogram information below.
SCHEMA The schema in which the index was created
TABLENAME The table name on which the index exists
INDEXNAME The name of the index
KEYCOUNT The number of keys in the index - should be the same as the cardinality of the table

INDEXPREFIXSTATISTICS

SYSTEM.INDEXPREFIXSTATISTICS is a system pseudo table containing multiple rows for each index defined in the database. Its columns are defined below. For any given index, if no statistics about the index exist in the table, then no statistics are available to the NuoDB Query Optimizer and you should run ANALYZE on that table or index.

INDEXPREFIXSTATISTICS Columns

Column Description
OBJECTID The database object ID for the index
CATALOGID The database catalog ID for the index
PREFIXLENGTH The number of index key columns that this particular distinct count is valid for
DISTINCTCOUNT An estimate of the number of distinct values with that prefix
AVERAGEKEYLENGTH Average key size for the keys whose prefix count is given in this row
SOURCE "statistics" or "calculated"
We calculate the first prefix, 2 columns of an index and the full key. All other values are interpolated.
"statistics" - means that this is a good faith estimate of the number of distinct values using actual values in the index keys
"calculated" - means that this value has been estimated using the available statistics based on actual values

INDEXHISTOGRAMS

SYSTEM.INDEXHISTOGRAMS is a system pseudo table containing one row for every index in your database. This table represents data related to the index histogram. It is currently almost an exact subset of INDEXSTATISTICS with the exception of FIELDCOUNT, but this is subject to change in the future.

INDEXHSITOGRAMS Columns

Column Description
OBJECTID The ID for the database object
CATALOGID The Catalog ID for the database object
HISTOGRAMID The ID for the histogram, unique to the table
FIELDCOUNT The number of fields in the index
MAXRESOLUTION The maximum number of histogram buckets. Index histograms are one of the most important parts of index statistics. By default, the number of histogram buckets created is 10. Raising this number (see CREATE INDEX) will make more precise statistics available to the query optimizer.

INDEXHISTOGRAMBUCKETS

SYSTEM.INDEXHISTOGRAMBUCKETS is a system pseudo table containing one row for each bucket in the Index's histogram. So, if MAXRESOLUTION is 10, there will be ten rows per index.

INDEXHISTOGRAMBUCKETS Columns

Column Description
OBJECTID The ID for the database object
CATALOGID The Catalog ID for the database object
HISTOGRAMID The ID for the histogram, unique to the table
BUCKETINDEX The index of the bucket within the histogram. Values in this column are 0 to MAXRESOLUTION.
BOUNDARY For all but the first bucket, this value is the upper most value for each field in the index, in that bucket. The first bucket contains one key which is the lowest key in the index.