CREATE STATISTICS
CREATE STATISTICS
— compute statistics for specific table column(s)
Syntax
CREATE STATISTICS statistics_name
ON [schema.]table_name ( column_name [, ...] )
WITH ( statistics_option [, ...] )
Where statistics_option
can be one or more of the following:
FULLSCAN|SAMPLE n {PERCENT|ROWS}
RESOLUTION n
AUTODROP {ON|OFF}
n
must be an integer value.
Description
CREATE STATISTICS
instructs the system to compute statistics for one or more columns on the specified table_name
, without creating an index.
The statistics will be computed the next time that ANALYZE
is performed on table_name
.
Statistics can be computed for any table column with the exception of LOB type columns.
CREATE STATISTICS
requests the following additional table statistics:
-
Histograms for any column or column group
Histograms are used to estimate queries that filter by a known literal value. For example, a histogram computed on columns f1, f2, and f3 could be used to estimate this predicate:
WHERE f1 = 1 AND f2 = 2 AND f3 > 3 AND f3 < 10
Maximum key size for Histograms is 2048 bytes. -
Scalar statistics for a column group (Scalar statistics are already calculated for individual columns)
Scalar statistics are used to estimate parameterized queries, join conditions, and grouping. For example, scalar statistics on (f1,f2) may improve the estimates for queries using these constructs:
WHERE f1 = ? AND f2 = ?
GROUP BY f1,f2
|
Statistics creation is not allowed for unbounded data types, such as STRING , BYTES , CLOBS , and BLOBS .
|
Parameters
statistics_name
-
Name of the statistics to be computed. No schema name can be included here; the statistics are always computed on the schema of the parent table.
table_name
-
Name (optionally schema-qualified) of the table to be analyzed.
column_name
-
Name of the column(s) of the table to be analyzed.
FULLSCAN|SAMPLE n {PERCENT|ROWS}
-
The statistics are computed by sampling the table. The sampling rate can be controlled by specifying either a specific number of rows or a percentage of the total number.
FULLSCAN
requests that all rows in the table are scanned during the computation of the statistics.FULLSCAN
andSAMPLE 100 PERCENT
produce the same result.
Default isSAMPLE 10000 ROWS
. RESOLUTION n
-
This option allows you to specify the
RESOLUTION
of the histogram, that is, the number of "buckets" used in the histogram. Increasing this number will improve the precision of statistics available to the query optimizer, which will then allow the optimizer to make better decisions, potentially leading to an improvement in query performance. The downside of increasing the number of buckets is that the statistics will consume more memory and require additional compuation.
ValidRESOLUTION
values for statistics are 10 - 4096.
Default isRESOLUTION 256
. AUTODROP {ON|OFF}
-
When
ON
schema changes are not blocked by the presence of statistics on the table.
Default isOFF
.
Examples
The examples below use tables in the HOCKEY
schema, which is a schema defined in the test
database.
The test
database is created when the QuickStart process is run.
See Running the SQL QuickStart.
- Example 1: Creating composite
STATISTICS
onteams
table.
CREATE STATISTICS stats_teams ON hockey.teams(conferenceid, year);
- Example 2: Creating
STATISTICS
with custom sampling.
CREATE STATISTICS stats_teams ON hockey.teams(conferenceid, year) WITH (SAMPLE 20 PERCENT);
- Example 3: Creating
STATISTICS
with full table scan.
CREATE STATISTICS stats_teams ON hockey.teams(conferenceid, year) WITH (FULLSCAN);
- Example 4: Creating
STATISTICS
with custom resolution.
CREATE STATISTICS stats_teams ON hockey.teams(conferenceid, year) WITH (RESOLUTION 512);
- Example 5: Creating
STATISTICS
with all options.
CREATE STATISTICS stats_teams ON hockey.teams(conferenceid, year)
WITH (SAMPLE 10000 ROWS, RESOLUTION 512, AUTODROP ON);