Understanding EXPLAIN Command Output
The purpose of reading the output from the NuoDB EXPLAIN
command (see EXPLAIN) is to see the execution plan that the optimizer will choose to quickly retrieve, join, group or summarize, and sort the rows that the SQL statement operates on or returns.
The SQL statement can be a SELECT
statement or any DML statement that requires the optimizer to first retrieve the rows that will be modified.
The NuoDB EXPLAIN
command displays the execution plan as a tree of operators where each child operator produces the records consumed by its parent operator.
When you read the plan from top to bottom you can see the order in which NuoDB SQL forms the plan.
When you read the plan from bottom to top you see the plan as the query engine will execute it.
An exception to this is the order of the join operations, which should be interpreted top to bottom.
With the information provided in the execution plan, you might choose to modify SQL statement or database structure to improve performance. For example, you might decide to add or remove indexes or override the indexes that the optimizer chooses by using hints (see Using Optimizer Hints).
The EXPLAIN
command output provides the following information:
-
SQL statement - this is the type of SQL statement. This can be any one of the following:
SELECT, INSERT, UPDATE, REPLACE or DELETE
. -
list of fields- the columns being selected, updated, replaced, deleted - the projection.
-
execution plan - operator tree generated by the optimizer. The execution plan takes a set of inputs, displayed as sub-trees, and returns a stream of records as a result of the SQL statement.
Execution Plan Operators
The execution plan is a hierarchical sequence of the following operators:
Operator | Description |
---|---|
|
The parent operation of all index access methods that produce a bitmap of record IDs given some input conditions and index[es] This operation doesn’t pass the matching keys to the parent node, only the record IDs. Depending on the available indexes and predicates there may be various bitmap index scan plans, described in Bitmap Scan Plans. |
|
Eliminates duplicates from the input stream. |
|
Represents any predicate in the Join conditions are represented as |
|
Retrieve all keys between a lower and/or an upper key boundaries from an index.
Unlike the |
|
Builds a hash table from input rows.
Used in combination with other operators like the |
|
Aggregates all rows for In the first phase all incoming rows are consumed and aggregated into a hash table which breaks streamed processing. In the second phase the aggregated values are streamed group-wise in no particular order. Unlike Streamed grouping, |
|
Joins to a table using a hash table (see Hash). The joined table is available as a hash table which allows very efficient lookup of join hits for a single join column. This avoids access paths via the table such as table scan with filter. The compute and memory usage for building of the hash table price to be paid to use this efficient join. |
|
A dedicated join implementation for existential subqueries using hashing. |
|
A dedicated join implementation for existential subqueries using hashing. |
|
A dedicated join implementation for existential subqueries using hashing. |
|
Joins two or more tables based on the predicate conditions of the The output from the |
|
Sorts the input stream given that a prefix of the sort key is already sorted. |
|
Fetch all records given a set of record IDs and produce a stream of records.
The input set of record IDs is represented as a bitmap and is produced by a Since the records are fetched according to their order in the input bitmap, the resulting stream of records has no specific ordering with respect to the keys and cannot be used to satisfy an |
|
A dedicated join implementation for existential subqueries using hashing. |
|
A dedicated join implementation for existential subqueries using hashing. |
|
A dedicated implementation of a right outer join using hashing. |
|
A dedicated join implementation for existential subqueries using hashing. |
|
A dedicated join implementation for existential subqueries looking up qualifying tuples in an index. |
|
Sorts input by sort keys to produce a sorted table. This operator processes all input rows so it breaks streamed processing. The input rows are materialized for the sorted representation, so the memory footprint might be significantly high. Sort supports Spill to Disk. |
|
Aggregates rows for the only group of an aggregated select without If aggregate functions are used without a As only a single result row is produced the Streamed aggregation processes all input rows before returning the result. |
|
Aggregates sorted input rows for For Streamed grouping the input rows need to be sorted by the Streamed grouping has a low memory footprint as it only needs to keep a single group in memory at a time. The sorting requirement means it can be applied in less situations than the more general Hash grouping unless used in combination with a Sort operator. |
|
Retrieve the records that correspond to the keys of an input If all columns of the accessed table that are referenced in the query are present in the index, there is no need to read any records from the table.
Such index scans are called Non-covering index scans fetch the matching records in the order of the record IDs in the input stream.
Since the input is ordered by key, and not by record ID, the records are fetched in via random I/O reads, which is slower compared to the The result of a |
|
Retrieves all records of a table and passes them to the parent operation for further processing.
Table scan is chosen either if there is no index, or the indexes defined are not usable based on the predicates in the The optimizer would determine a potential index to be too costly if, based on the predicates, the majority of rows in the table would be retrieved. In this case, a full table scan would be more efficient. Table scans can be |
|
Combines multiple inputs of the same structure into one result stream. This is a streaming operator. |
|
Calculates the window function across a set of rows specified by the OVER clause. For more information, see SQL Window Functions. |
Bitmap Scan Plans
Indexes in NuoDB can be logically thought of as sequences of pairs (Key, RecordID), ordered by the Key. The result of a bitmap index scan consists of all RecordIDs that match a range of keys specified by a lower and/or an upper key boundaries. The set of matching RecordIDs are represented as a bitmap which is passed to the parent operator to retrieve the resulting records given their record IDs.
The simplest possible bitmap index scan plan consists of a single Bitmap scan
operator that scans a range of keys of a single index.
The boundaries that define this range of keys are derived from the conjuncts in WHERE
clause that reference the index’s columns.
For example, given an index I1(col1, col2)
, and a WHERE
clause (col1 = 'abc' AND col2 >= 5 AND col2 ⇐ 8)
, the range of keys that will be scanned starts with the key 'abc5'
and ends with the key 'abc8'
.
If the index contains the following (Key, RecordID) pairs: {('aaa3', 4), ('abc5', 2), ('abc7', 5), ('abc8', 3), ('xyz1', 1)}, the result of this index scan would be a bitmap with the following bits set: {2,3,5}.
Bitmap index scan operators can be organized into composite plans that utilize more than one index on the same table.
For instance given two indexes I1(col1)
and I2(col2)
the conjunction above could be evaluated by scanning each index using the boundary from the corresponding condition, and then performing a logical AND
between the resulting bitmaps.
Similarly, a disjunction like (col1 = 'abc' OR (col2 >= 5 AND col2 ⇐ 8))
can be evaluated by scanning the two indexes using a Bitmap scan
operation, and then performing a logical OR
between the resulting bitmaps.
Arbitrarily nested AND/OR
conditions can be computed via correctly nested bitmap intersection and union operations.
Such composite plans are called index union and intersection (IIU)
plans.
In order to compute such plans, NuoDB employs two more operations:
-
Bitmap And - this operation takes as input at least two bitmap-producing operations (either
Bitmap scan
orBitmap Or
), and produces a bitmap that is the logicalAND
of all its inputs. -
Bitmap Or - this operation takes as input at least two bitmap-producing operations (either
Bitmap scan
orBitmap And
), and produces a bitmap that is the logicalOR
of all its inputs.
These Bitmap And/Or
operations can be arbitrarily nested.
The cost-based optimizer of NuoDB decides what is the optimal combination of bitmap scan operators that reduces the total cost of scanning the indexes in the IIU plan, plus the subsequent cost of fetching the resulting records via a Record fetch
operation.
Index Ordering Properties
A Record fetch
takes as input a bitmap of record IDs (produced by a bitmap scan plan), and fetches records in record ID order.
This order is unrelated to the order of the keys in the scanned index(es).
As a result bitmap index scan plans cannot be used to optimize GROUP BY
and ORDER BY
clauses, and the records are then processed by additional sorting and grouping operations.
A Streaming record fetch
operation retrieves records in index key order because its input Filter scan
operator produces the resulting keys according to their index order.
The optimizer can make use of this ordering property to optimize the processing of GROUP BY
and ORDER BY
clauses of the SQL statement.
If the order of records output by the streaming index scan matches the order specified by the ORDER BY
clause of the SQL statement, then the optimizer will completely optimize out the sort from the execution plan.
A Streaming covering record fetch
operation is similar to a streaming index scan in that it also fetches records in index key order.
The optimizer will choose this index method if the index contains all of the columns that are to be retrieved.
Therefore, the optimizer can make use of covering indexes to retrieve the result set information without fetching table records, and optimize the processing of both the GROUP BY
and the ORDER BY
clauses.
Execution plan statistics
The EXPLAIN
command, in addition to displaying the execution plan, can provide additional information that the optimizer uses to choose the most efficient execution plan for a SQL statement.
It is also possible to display the actual statistics resulting from the execution of the plan.
See EXPLAIN for the command usage.
This section describes this additional information provided by each of these EXPLAIN
command options.
Cost Estimates
In addition to the query execution plan, the EXPLAIN
command also shows the cost and cardinality estimates that the optimizer used for each operation in the plan.
This functionality is controlled by EXPLAIN
option OPT_ESTIMATES
. its default is ON
, so normally the estimates used by the optimizer are shown as part of the query plan.
These estimates are based on database statistics that are computed for each table and index, and stored in NuoDB system tables (see Improving Query Performance).
These estimates are computed internally by the optimizer and cannot be changed. See Example 6 below.
In order for the optimizer to choose the most efficient execution plan, database statistics should be kept up-to-date to reflect changes to the underlying tables.
To update database statistics, run the ANALYZE
command.
This ANALYZE
command should be run for every table in the SQL statement being passed to the EXPLAIN
command.
Alternatively, it can be run for each index on every table in the SQL statement (see ANALYZE
command).
Some of the information provided by these cost estimates include:
Type of Estimate |
Description |
cost |
This represents an abstract measure of how expensive an execution plan is. It is currently measured in terms of abstact I/O units. The lower the cost of an operation, the faster it is supposed to evaluate. This is the case both for atomic operations, composite operations, and the query execution plan as a whole. |
est. rows |
The number of rows that will be produced by the corresponding operator. |
est. keys |
The number of keys that will be produced by the corresponding index scan operator. |
Analyze Cost Statistics
The ANALYZE
option of the EXPLAIN
command evaluates the SQL statement, and collects actual statistics for each operator of the execution plan.
These statistics are displayed in EXPLAIN
format annotated with the counters' information.
The ANALYZE
command supports all DML statements.
If a SQL statement has bind parameters NuoDB may dynamically re-optimize the execution plan at run time, when the parameters values are provided.
This could cause the optimizer to chose a different execution plan from the one displayed by the EXPLAIN
command output.
The ANALYZE
option is very useful in this case, as it shows the actual execution plan being evaluated.
When the ANALYZE
option is set to ON
, the EXPLAIN
command output displays execution plan operators and the values of various timers and counters as a comma-separated list enclosed in parentheses.
These include:
time |
This is the total time that this node took during execution. The time unit is shown right after the measured time. |
actual rows |
This is the actual number of records produced by the operator.
This number can be compared to the |
actual keys |
This is the number of keys produced as output by an index scan operator.
It can be compared to the |
batches |
The number of vectors used by the operation. |
The Coerce Operator
The COERCE operator is the operator which performs the implicit conversions decided by the Data Type Conversion mechanism. The presence of this operator in a query plan indicates that the initial query contains discrepancies in terms of types which were corrected automatically by the NuoDB query compiler. The COERCE operator is functionally similar to the CAST operator, but is not available as a separate call (it is only used internally). Unlike the CAST operator which performs truncation and/or rounding, in certain cases when the value to be converted does not fit in the target type, the COERCE operator is designed to be lossless and throws an error if the conversion would require truncation and/or rounding.
Example:
create table t(name binary(10));
insert into t values ('Albertine'), ('Hunter'), ('Mandy');
explain select UPPER(name) from t;
Project [UPPER]
Evaluation UPPER(COERCE(T.NAME AS string))
Table scan T (cost: 16.0, est. rows: 3)
In the example above, the column name has type binary(10) but the UPPER expression expects its argument to be of type string.
The NuoDB query compiler is implicitly converting that argument to a string type.
|
About Execution Plans
A few things to note when looking at execution plans:
-
Full Table scans are often (but not always) to be avoided. Indexes can be added to tables based on columns used in the
WHERE
,ORDER BY
, andGROUP BY
clauses. However, adding indexes can have a negative impact on SQL DML statement performance. -
Index hints can be used to override indexes that the optimizer might choose (see Using Optimizer Hints).
-
The output can display additional information which the optimizer computes based on options that can be specified for the
EXPLAIN
command. See EXPLAIN command for a detailed description of these options and the information that it displays. Several of the examples below show this additional information. -
Updating index statistics can provide the optimizer with more accurate cost estimates (see ANALYZE).
-
In multi-table joins, the join order is very important. Poor join order selection can make a query very slow. It is the job of the optimizer to choose the best join order.
-
The optimizer attempts to coerce query constants to the appropriate types to compare against fields in a table. Make sure the constants are of the appropriate data types.
The output from the EXPLAIN
command gives a complete view of the plan that the optimizer produced.
However, you should not expect the execution plan to stay static throughout the life of a database.
Some reasons for this are:
-
As tables shrink and grow relative to other tables the optimizer could make different decisions about the best execution plan.
-
The distribution of the values in each column may change over time, which in turn may change the number of records that would match a condition on this column.
-
Different processes (Transaction Engines) in a database may have slightly different (newer) information about tables and indexes. The execution plan generated may depend on to which TE the client is connected.
Examples of EXPLAIN Command Output
- Example 1: Simple
EXPLAIN
command output with full table scan -
The simplest execution plan is one for which there is no index and no filter conditions. A simple execution plan is generated from the following query:
EXPLAIN SELECT * FROM hockey.players;
This results in the following query plan:
-
a
Project
operator, representing the list of columns being selected from its input stream. -
a
Table scan
operator returns a stream of records representing the entireHOCKEY.PLAYERS
table. TheTable scan
operator also shows the estimated cost to scan the whole table, and the estimated number or rows of theHOCKEY.PLAYERS
. The cost is proportional to the estimated number of records and the estimated record size.The
EXPLAIN
command output for such an execution plan looks as follows.Project PLAYERID, FIRSTNAME, LASTNAME, HEIGHT, WEIGHT, FIRSTNHL, LASTNHL, POSITION, BIRTHYEAR, BIRTHMON, BIRTHDAY, BIRTHCOUNTRY, BIRTHSTATE, BIRTHCITY Table scan PLAYERS (cost: 1141765.6, est. rows: 7520)
-
- Example 2:
EXPLAIN
command output for SQL statement with a predicate -
This example uses the same SQL SELECT statement as the previous example and adds a predicate. However, the column in the
WHERE
clause is not indexed.EXPLAIN SELECT * FROM players WHERE firstname = 'Mike';
This results in a
Project
operator with a single input - aFilter
operator that applies the only condition(PLAYERS.FIRSTNAME = 'Mike')
to the result of its inputTable scan
. TheFilter
operator returns a stream of records with only those records where “FIRSTNAME='Mike'
” is true.The
EXPLAIN
command output for such an execution plan looks as follows.Project PLAYERID, FIRSTNAME, LASTNAME, HEIGHT, WEIGHT, FIRSTNHL, LASTNHL, POSITION, BIRTHYEAR, BIRTHMON, BIRTHDAY, BIRTHCOUNTRY, BIRTHSTATE, BIRTHCITY Filter (PLAYERS.FIRSTNAME = 'Mike') Table scan PLAYERS (cost: 1141765.6, est. rows: 7520)
- Example 3:
EXPLAIN
command output for SQL statement with Index -
Let’s modify the query, adding a predicate on
PLAYERID
, so that it uses the primary key index on thePLAYERS.PLAYERID
field as in:EXPLAIN SELECT firstname, lastname FROM players WHERE PLAYERID > 'a'; Project FIRSTNAME, LASTNAME Filter (PLAYERS.PLAYERID > 'a') Table scan PLAYERS (cost: 1141765.6, est. rows: 7520)
This also results in the same query plan as the previous table, except for the
Project
operator which selects only two of all its input fields.The
EXPLAIN
command output is still showing aTable scan
operator on thePLAYERS
table, even though we are providing a predicate based on the column defined by the primary key index. Why does the optimizer choose to not use the unique primary key index? The optimizer has the benefit of knowing, based on the index statistics, that all of the rows in the table will return a true evaluation of the predicate. In this case, the optimizer computes a cost estimate for a full table scan to be more efficient than a cost estimate for an Index scan.What if the predicate is modified to reduce the number of rows evaluating to true as compared to the total number of rows. At some point, the optimizer will compute the cost of an Index scan to be lower than the cost of a full table scan. Lower cost is interpreted as "more efficient" by the optimizer, therefore it will choose to perform an Index scan.
EXPLAIN SELECT firstname, lastname FROM players WHERE PLAYERID > 'm'; Project FIRSTNAME, LASTNAME Record fetch PLAYERS (cost: 1064919.7, est. rows: 3396) Bitmap filter # Note: Index scan on index PLAYERS..PRIMARY_KEY Bitmap scan PLAYERS..PRIMARY_KEY (PLAYERID[>'m':inf]) (cost: 33688.3, est. keys: 3396) # Lower bound: "c", upper bound: "inf"
This plan performs a
Bitmap index scan
on indexPLAYERS..PRIMARY_KEY
, where the lower bound of the key range is "c", and there is no upper bound. The resulting bitmap is consumed by aRecord fetch
operation that fetches the matching records from tablePLAYERS
. Note that the optimizer removed theFilter
operation from the previous plan because the index scan guarantees that it will produce only the keys that match the conditionPLAYERID > 'm'
.If an upper bound is supplied to the SQL
SELECT
statement, you’ll see a few differences in the output:EXPLAIN SELECT * FROM players WHERE playerid > 'm' AND playerid < 'q'; Project PLAYERID, FIRSTNAME, LASTNAME, HEIGHT, WEIGHT, FIRSTNHL, LASTNHL, POSITION, BIRTHYEAR, BIRTHMON, BIRTHDAY, BIRTHCOUNTRY, BIRTHSTATE, BIRTHCITY Record fetch PLAYERS (cost: 466451.3, est. rows: 1488) Bitmap filter Bitmap scan PLAYERS..PRIMARY_KEY (PLAYERID[>'m':<'q']) (cost: 14756.0, est. keys: 1488) # An upper bound "q" has been added
- Example 4:
EXPLAIN
command output with a JOIN clause -
Now let’s try a
clause. Let’s select all of the players from the 2011 Stanley Cup Champions: The Boston Bruins. The execution plan for this SQL statement is as follows:JOIN
Project FIRSTNAME, LASTNAME, GOALS EXPLAIN SELECT P.FIRSTNAME, P.LASTNAME, S.GOALS FROM PLAYERS P, SCORING S WHERE S.PLAYERID = P.PLAYERID AND P.BIRTHCITY LIKE 'Bos%'; Nested Loop Join (cost: 1771589.6, est. rows: 6193) Filter (P.BIRTHCITY >= 'Bos') Filter (P.BIRTHCITY < 'Bot') Filter (P.BIRTHCITY LIKE 'Bos%' ESCAPE '\\') Table scan P (PLAYERS) (cost: 1141765.6, est. rows: 7520) Record fetch S (SCORING) (cost: 620.4, est. rows: 6) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[=P.PLAYERID] YEAR[] STINT[] TEAMID[] POSITION[]) (cost: 157.7, est. keys: 6)
The
Nested Loop Join
operator has two inputs:-
A sequence of
Filter
operators that filter the output of aTable scan
on tablePLAYERS
. -
An
Record fetch
operator on tableSCORING
, that fetches the records of the bitmap resulting from its childBitmap scan
operator. The join conditionS.PLAYERID = P.PLAYERID
is "pushed" to the index scan itself, which uses the fact that the first column of the index isPLAYERID
. The join condition is evaluated by the index scan, and therefore is not present as a separateFilter
operation.Notice that the condition
P.BIRTHCITY LIKE 'Bos%'
has been rewritten to the conjunction(P.BIRTHCITY >= 'Bos' AND P.BIRTHCITY < 'Bot' AND P.BIRTHCITY LIKE 'Bos%')
. The additional conditions were generated automatically by the optimizer to enable potential index use. In this case there was no suitable index to use on column PLAYER.BIRTHCITY.
-
- Example 5:
EXPLAIN
command output with JOIN clause with Index -
In the previous example, the full table scan on the
PLAYERS
table is something we may want to look at. If this was a really big table it could take a considerable amount of time to process. A review of the data model shows that a composite index on both thePLAYERID
andBIRTHCITY
columns of thePLAYER
table would allow for a more direct access based on the predicate, as defined by theWHERE
clause. For simplicity theEXPLAIN
below shows the plan without the optimizer’s estimates (by using theopt_estimates off
option).CREATE INDEX players_pid_city_idx on players(playerid, birthcity); ANALYZE INDEX players_pid_city_idx; EXPLAIN (opt_estimates off) SELECT P.FIRSTNAME, P.LASTNAME, S.GOALS FROM PLAYERS P, SCORING S WHERE S.PLAYERID = P.PLAYERID AND P.BIRTHCITY LIKE 'Bos%'; Project FIRSTNAME, LASTNAME, GOALS Nested Loop Join Filter (P.BIRTHCITY LIKE 'Bos%' ESCAPE '\\') Record fetch P (PLAYERS) Bitmap filter Bitmap scan PLAYERS_PID_CITY_IDX (PLAYERID[] BIRTHCITY[>='Bos':<'Bot']) Record fetch S (SCORING) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[=P.PLAYERID] YEAR[] STINT[] TEAMID[] POSITION[])
After the new index is added, the
Nested Loop Join
operator has changed. Its first input (the outer join table) changed fromTable scan
to aRecord fetch
with inputBitmap scan
using the auto-generated condition(P.BIRTHCITY >= 'Bos') AND (P.BIRTHCITY < 'Bot')
. The two conditions themselves have been eliminated from theFilter
operation because they are computed by the index scan.The overall benefit of this new execution plan is reduced number of input records for the
Join
operator. This should result in less execution time. - Example 6:
EXPLAIN
command output with Optimizer Estimates displayed -
Let’s look at this same SQL statement with Index Statistics turned on. Remember to run
ANALYZE
on all tables involved in the SQL statement.ANALYZE TABLE SCORING; ANALYZE TABLE PLAYERS; EXPLAIN (opt_estimates on) SELECT P.FIRSTNAME, P.LASTNAME, S.GOALS FROM PLAYERS P, SCORING S WHERE S.PLAYERID = P.PLAYERID AND P.BIRTHCITY LIKE 'Bos%'; Project FIRSTNAME, LASTNAME, GOALS Nested Loop Join (cost: 1135379.5, est. rows: 6193) Filter (P.BIRTHCITY LIKE 'Bos%' ESCAPE '\\') Record fetch P (PLAYERS) (cost: 505555.6, est. rows: 1128) Bitmap filter Bitmap scan PLAYERS_PID_CITY_IDX (PLAYERID[] BIRTHCITY[>='Bos':<'Bot']) (cost: 151001.6, est. keys: 1128) Record fetch S (SCORING) (cost: 620.4, est. rows: 6) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[=P.PLAYERID] YEAR[] STINT[] TEAMID[] POSITION[]) (cost: 157.7, est. keys: 6)
The cost of the
Nested Loop Join
went down from 1771589.6 to 1135379.5 compared to the plan from Example 4. The reason for this reduction is that the plan withTable scan
from Example 4 will scan all 7520 records, estimated as having cost 1141765.6. The new plan to access tablePLAYERS
is an index scan where:-
the cost to scan the keys between 'Bos' and 'Bot' is 151001.6, with much fewer keys to scan than records - 1128 keys.
-
the cost to fetch those 1128 keys by the
Record fetch
operation is 505555.6, which includes also the cost of its inputBitmap scan
operation.It is important to notice that the numbers of records shown in the
EXPLAIN
output are only estimates.
-
- Example 7:
EXPLAIN
command output withANALYZE
option -
For the purpose of this example, we create a table with multiple indexes:
CREATE TABLE testtab (column1 INTEGER, column2 INTEGER, column3 INTEGER, column4ni INTEGER); INSERT INTO testtab VALUES (1,1,1,1), (2,2,2,2), (3,3,3,3), (4,4,4,4), (5,5,5,5); INSERT INTO testtab SELECT column1+5, column2+5, column3+5, column4ni+5 FROM testtab; INSERT INTO testtab SELECT column1+10, column2+10, column3+10, column4ni+10 FROM testtab; INSERT INTO testtab select column1+15, column2+15, column3+15, column4ni+15 FROM testtab; INSERT INTO testtab select column1+20, column2+20, column3+20, column4ni+20 FROM testtab; INSERT INTO testtab select column1+25, column2+25, column3+25, column4ni+25 FROM testtab; INSERT INTO testtab select column1+30, column2+30, column3+30, column4ni+30 FROM testtab; INSERT INTO testtab select column1+35, column2+35, column3+35, column4ni+35 FROM testtab; CREATE INDEX idx_testtab_column1 ON testtab (column1); CREATE INDEX idx_testtab_column2 ON testtab (column2); ANALYZE TABLE testtab;
The following execution plan is generated statically based on default statistics, without knowledge of the parameter values. In other words, the cost estimates are exactly the same, regardless of the parameter values that are substituted. The examples below show that the plan stays the same no matter what are the parameter values, despite the radically different number of records due to different constants in the predicates.
EXPLAIN SELECT * FROM testtab WHERE (column1 BETWEEN ? AND ? OR column1 BETWEEN ? AND ?) AND column2 < ?; Enter value: 143 Enter value: 3 Enter value: 144 Enter value: 50 Project COLUMN1, COLUMN2, COLUMN3, COLUMN4NI Filter (TESTTAB.COLUMN2 < ?:5) Filter (((TESTTAB.COLUMN1 >= ?:1) AND (TESTTAB.COLUMN1 <= ?:2)) OR ((TESTTAB.COLUMN1 >= ?:3) AND (TESTTAB.COLUMN1 <= ?:4))) Record fetch TESTTAB (cost: 3348.7, est. rows: 63) Bitmap filter Bitmap And (cost: 810.4, est. keys: 63) Bitmap Or (cost: 385.9, est. keys: 192) Bitmap scan IDX_TESTTAB_COLUMN1 (COLUMN1[>=?:1:<=?:2]) (cost: 193.0, est. keys: 96) Bitmap scan IDX_TESTTAB_COLUMN1 (COLUMN1[>=?:3:<=?:4]) (cost: 193.0, est. keys: 96) Bitmap scan IDX_TESTTAB_COLUMN2 (COLUMN2[>=-2147483648:<?:5]) (cost: 424.5, est. keys: 211) EXPLAIN SELECT * FROM testtab WHERE (column1 BETWEEN ? AND ? OR column1 BETWEEN ? AND ?) AND column2 < ?; Enter value: 0 Enter value: 0 Enter value: 0 Enter value: 0 Enter value: 0 Project COLUMN1, COLUMN2, COLUMN3, COLUMN4NI Filter (TESTTAB.COLUMN2 < ?:5) Filter (((TESTTAB.COLUMN1 >= ?:1) AND (TESTTAB.COLUMN1 <= ?:2)) OR ((TESTTAB.COLUMN1 >= ?:3) AND (TESTTAB.COLUMN1 <= ?:4))) Record fetch TESTTAB (cost: 3348.7, est. rows: 63) Bitmap filter Bitmap And (cost: 810.4, est. keys: 63) Bitmap Or (cost: 385.9, est. keys: 192) Bitmap scan IDX_TESTTAB_COLUMN1 (COLUMN1[>=?:1:<=?:2]) (cost: 193.0, est. keys: 96) Bitmap scan IDX_TESTTAB_COLUMN1 (COLUMN1[>=?:3:<=?:4]) (cost: 193.0, est. keys: 96) Bitmap scan IDX_TESTTAB_COLUMN2 (COLUMN2[>=-2147483648:<?:5]) (cost: 424.5, est. keys: 211)
When the SQL statement executes the parameter values are substituted. If the optimizer option
DelayedQueryOptimization
is set totrue
, the optimizer could calculate different cost estimates that can change the execution plan. With theEXPLAIN
command optionANALYZE
set toON
, the evaluated plan shows the query engine statistics/counters obtained from the actual execution. This execution plan is generated by the optimizer just before query execution and takes into account the parameter values that have been substituted at execution time. This query plan is much more realistic.set opt_options DelayedQueryOptimization = true; EXPLAIN SELECT * FROM testtab WHERE (column1 BETWEEN ? AND ? OR column1 BETWEEN ? AND ?) AND column2 < ?; Enter value: 2 Enter value: 143 Enter value: 3 Enter value: 144 Enter value: 50 Project COLUMN1, COLUMN2, COLUMN3, COLUMN4NI (actual rows: 143, time: 847us, batches: 2) Filter (TESTTAB.COLUMN2 < 50) (actual rows: 143, time: 843us, batches: 2) Filter (((TESTTAB.COLUMN1 >= 2) AND (TESTTAB.COLUMN1 <= 143)) OR ((TESTTAB.COLUMN1 >= 3) AND (TESTTAB.COLUMN1 <= 144))) Record fetch TESTTAB (cost: 6053.3, est. rows: 144, actual rows: 144, time: 549us, batches: 1) Bitmap filter Bitmap scan IDX_TESTTAB_COLUMN2 (COLUMN2[>=-2147483648:<50]) (cost: 289.4, est. keys: 144, actual keys: 144, time: 307us)
If we eliminate the runtime parameters, by substituting the parameter values as constants into the SQL statement, the
EXPLAIN
command output below shows the static execution plan generated with these constant values. This is the exact same plan as the previous execution plan, which was dynamically generated because of theEXPLAIN
command optionANALYZE
. This is a different plan than the static execution plan that was generated without theANALYZE
option.EXPLAIN (ANALYZE ON) SELECT * FROM testtab WHERE (column1 BETWEEN 2 AND 143 OR column1 BETWEEN 3 AND 144) AND column2 < 50; Project COLUMN1, COLUMN2, COLUMN3, COLUMN4NI (actual rows: 143, time: 577us, batches: 2) Filter (TESTTAB.COLUMN2 <= 49) (actual rows: 143, time: 573us, batches: 2) Filter (((TESTTAB.COLUMN1 >= 2) AND (TESTTAB.COLUMN1 <= 143)) OR ((TESTTAB.COLUMN1 >= 3) AND (TESTTAB.COLUMN1 <= 144))) Record fetch TESTTAB (cost: 6053.3, est. rows: 144, actual rows: 144, time: 537us, batches: 1) Bitmap filter Bitmap scan IDX_TESTTAB_COLUMN2 (COLUMN2[>=-2147483648:<=49]) (cost: 289.4, est. keys: 144, actual keys: 144, time: 299us)
- Example 8: EXPLAIN command output with Indexed SubQuery Execution
-
This example assumes the HOCKEY schema has been restored. The index added in Example #5 has been dropped. In this example, we want to select the number of goals per player per year for players on the Boston team that were born after 1990.
The optimizer starts with the execution of the subquery from the
PLAYERS
table to return only players born after 1990. It then uses that result set to perform an Index scan search of the outer SQL statement table,SCORING
, based on theplayerid
column, which is the first column of theSCORING
table’s primary key index.The Select List shows the columns being retrieved from the
SCORING
table for these players that are returned by the subquery.USE HOCKEY EXPLAIN SELECT playerid, year, goals FROM scoring WHERE teamid = 'BOS' AND playerid IN (SELECT playerid FROM players WHERE birthyear > 1990); Project PLAYERID, YEAR, GOALS Nested Loop Join (cost: 1255445.9, est. rows: 214) Record fetch SCORING (cost: 1220617.4, est. rows: 214) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[] YEAR[] STINT[] TEAMID[='BOS'] POSITION[]) (cost: 1188221.1, est. keys: 214) Filter (SCORING.PLAYERID = PLAYERS.PLAYERID) Filter (PLAYERS.BIRTHYEAR >= 1991) Record fetch PLAYERS (cost: 162.8, est. rows: 1) Bitmap filter Bitmap scan PLAYERS..PRIMARY_KEY (PLAYERID[=SCORING.PLAYERID]) (cost: 9.9, est. keys: 1)
- Example 10: EXPLAIN command output with Materialization Strategy for SubQueries
-
This example assumes the HOCKEY schema has been restored. The index added in Example #5 has been dropped. The materialization strategy avoids computation of the same subquery multiple times. It is applicable only for deterministic and non-correlated subqueries, which would produce the same result for each row of the outer SQL statement table.
In the following example, we want to select the teamid and total number of goals per team for all teams in the year 2011 that were ranked 1,2, or 3 in the New England division.
Without a materialization strategy, the
IN
subquery would need to be executed for each row of theSCORING
table, of which there are over 45K rows, to find a match on theteams.teamid
column, using thescoring.teamid
column.Using a subquery materialization strategy, the execution will compute the subquery once and store the result set in an internal, in-memory table, hashed on the column being used in the predicate, the
teams.teamid
column. This computation happens when retrieving the first row of the outer table, thescoring
table. On retrieving each subsequent row of thescoring
table, the materialized result set is searched for a match onteamid
. If found, theIN
select predicate evaluates toTRUE
.USE HOCKEY EXPLAIN SELECT teamid, sum(goals) as total_goals FROM scoring WHERE year = 2011 AND teamid IN ( SELECT teamid FROM teams WHERE rank < 4 AND year = 2011 AND divisionid ='NE') GROUP BY teamid; Project TEAMID, TOTAL_GOALS StreamedGrouping keys { SCORING.TEAMID } aggregates { SUM(SCORING.GOALS) } (cost: 1227794.6, est. rows: 136) Sort SCORING.TEAMID ASC Nested Loop Join (cost: 1227794.6, est. rows: 214) Record fetch SCORING (cost: 1220617.4, est. rows: 214) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[] YEAR[=2011] STINT[] TEAMID[] POSITION[]) (cost: 1188221.1, est. keys: 214) Hash lookup on (SCORING.TEAMID = TEAMID) (cost: 6962.0, est. rows: 1) Materialize stream Distinct (cost: 6961.0, est. rows: 1) Filter (TEAMS.YEAR = 2011) Filter (TEAMS.DIVISIONID = 'NE') Filter (TEAMS.RANK <= 3) Record fetch TEAMS (cost: 6961.0, est. rows: 30) Bitmap filter Bitmap scan TEAMS..PRIMARY_KEY (YEAR[=2011] TEAMID[]) (cost: 240.0, est. keys: 30)
In the previous example, the
Bitmap scan
operator on the outer table,SCORING
applies an optimization calledindex condition pushdown
. The operation scans the whole index, and applies the condition(year = 2011)
to the second column of each index key. This allows to filter a large portion of the keys, and reduce the cost of the subsequentRecord fetch
operation.How would this
EXPLAIN
command output differ if an index could be used on this outer table,SCORING
.In the following example, we want to select all of the teams that scored greater than 250 points in the year 2000. The primary key column,
teamid
, on the outer table,TEAMS
, is used to fetch all of the rows for the outer table that return true for the year=2000 predicate. TheHash lookup
operator computes theIN
select predicate using the materialized in-memory subquery result set, which only needs to be executed for the rows returned by the Index scan of the outer table.USE HOCKEY; EXPLAIN SELECT year, teamid, rank FROM teams WHERE year = 2000 AND teamid IN ( SELECT teamid FROM scoring WHERE year = 2000 GROUP BY teamid HAVING SUM(goals) > 250); Project YEAR, TEAMID, RANK Nested Loop Join (cost: 1227608.7, est. rows: 30) Record fetch TEAMS (cost: 6961.0, est. rows: 30) Bitmap filter Bitmap scan TEAMS..PRIMARY_KEY (YEAR[=2000] TEAMID[]) (cost: 240.0, est. keys: 30) Hash lookup on (TEAMS.TEAMID = TEAMID) (cost: 1220618.4, est. rows: 1) Materialize stream Distinct (cost: 1220617.4, est. rows: 136) Filter (SUM(SCORING.GOALS) >= 251) StreamedGrouping keys { SCORING.TEAMID } aggregates { SUM(SCORING.GOALS) } (cost: 1220617.4, est. rows: 136) Sort SCORING.TEAMID ASC Record fetch SCORING (cost: 1220617.4, est. rows: 214) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[] YEAR[=2000] STINT[] TEAMID[] POSITION[]) (cost: 1188221.1, est. keys: 214)
- Example 11: EXPLAIN command output with Bitmap Index Access Ordering Property
-
This example assumes the HOCKEY schema has been restored. The index added in Example #5 has been dropped. The following example illustrates the ordering property of the bitmap index access method.
Suppose we’d like to find information about all players based on first name. To do this efficiently we create an index on column
players.firstname
. The optimizer will choose aBitmap scan
operation to quickly scan the index for all matching rows based on the predicate as defined, "FIRSTNAME = 'John'". The rows are returned by theRecord fetch
operation, sorted in record ID order. The rows are then ordered, based on theORDER BY
clause, by theSort
operator.create index players_firstname_idx on players(firstname); explain SELECT playerid, firstname,lastname FROM players WHERE firstname = 'John' ORDER BY playerid; Project PLAYERID, FIRSTNAME, LASTNAME Sort PLAYERS.PLAYERID ASC (cost: 43512.5, est. rows: 125) Record fetch PLAYERS (cost: 37759.8, est. rows: 125) Bitmap filter Bitmap scan PLAYERS_FIRSTNAME_IDX (FIRSTNAME[='John']) (cost: 711.3, est. keys: 125)
- Example 12: EXPLAIN command output with Streaming Index Access Ordering Property
-
This example assumes the HOCKEY schema has been restored. The index added in Example #5 has been dropped. The streaming index fetches rows in index key order. The optimizer can make use of this ordering property to optimize the processing of
GROUP BY
andORDER BY
clauses of the SQL statement. If this index key order of rows, output by the streaming index, matches the order specified by theORDER BY
clause of the SQL statement, then the optimizer will completely optimize out theSort
operator from the execution plan.As an example, the following query does not specify an
ORDER BY
clause. Therefore, the optimizer chooses a bitmap index access method for that query and does not sort the query result set output.SHOW TABLE scoring Found table SCORING in schema HOCKEY PLAYERID varchar(10) Nullable: No YEAR integer Nullable: No STINT integer Nullable: No TEAMID varchar(3) Nullable: No POSITION varchar(4) Nullable: No GAMESPLAYED integer Nullable: No Default: 0 GOALS integer Nullable: No Default: 0 ASSISTS integer Nullable: No Default: 0 PENALTYMINUTES integer Nullable: No Default: 0 Primary Index: SCORING..PRIMARY_KEY on fields: PLAYERID, YEAR, STINT, TEAMID, POSITION Foreign Key FK_HOCKEY_SCORING_YEAR_TEAMID_#_HOCKEY_TEAMS_YEAR_TEAMID: (YEAR,TEAMID) references TEAMS(YEAR,TEAMID) Foreign Key FK_HOCKEY_SCORING_PLAYERID_#_HOCKEY_PLAYERS_PLAYERID: (PLAYERID) references PLAYERS(PLAYERID) EXPLAIN SELECT * FROM scoring WHERE playerid > 'p'; Project PLAYERID, YEAR, STINT, TEAMID, POSITION, GAMESPLAYED, GOALS, ASSISTS, PENALTYMINUTES Record fetch SCORING (cost: 2502849.1, est. rows: 14122) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[>'p':inf] YEAR[] STINT[] TEAMID[] POSITION[]) (cost: 365053.7, est. keys: 14122)
The following example adds an
ORDER BY
clause to the previous example. This particularORDER BY
clause specifies a column that is included in the index. However, this column is not the first column defined for the index. Therefore, the optimizer cannot take advantage of the index to order the records. For the rows to be output in the order specified by theORDER BY
clause, aSort
operator is included in the execution plan.EXPLAIN SELECT * FROM scoring WHERE playerid > 'p' ORDER BY teamid; Project PLAYERID, YEAR, STINT, TEAMID, POSITION, GAMESPLAYED, GOALS, ASSISTS, PENALTYMINUTES Sort SCORING.TEAMID ASC (cost: 3151785.9, est. rows: 14122) Record fetch SCORING (cost: 2502849.1, est. rows: 14122) Bitmap filter Bitmap scan SCORING..PRIMARY_KEY (PLAYERID[>'p':inf] YEAR[] STINT[] TEAMID[] POSITION[]) (cost: 365053.7, est. keys: 14122)
In the following example, we modify the
ORDER BY
clause to order the result set using one or more columns defined by the index. ThisORDER BY
column list has the columns appearing in the same order as defined by the index. In this case, the optimizer can choose aStreaming record fetch
access method and can eliminate theSort
operator that was required in the previous example.EXPLAIN SELECT * FROM scoring WHERE playerid > 'p' ORDER BY playerid, year; Project PLAYERID, YEAR, STINT, TEAMID, POSITION, GAMESPLAYED, GOALS, ASSISTS, PENALTYMINUTES Streaming record fetch SCORING (cost: 2502849.1, est. rows: 14122) Filter scan SCORING..PRIMARY_KEY (PLAYERID[>'p':inf] YEAR[] STINT[] TEAMID[] POSITION[]) ASC
- Example 13: EXPLAIN command output with Streaming Covering Index Access Ordering Property
-
This example assumes the HOCKEY schema has been restored. The index added in Example #5 has been dropped. A streaming covering index scan is similar to a streaming index scan. It will also output the records in the order of the index key, which allows the optimizer to eliminate the
Sort
operator.The optimizer will choose this type of index access method if the index contains all columns that are included in the SQL statement column list, where clause, group by, etc. When this is the case, the optimizer can make use of the streaming covering index access method to retrieve the column rows for the result set without accessing the table. This usually results in better performance because the index contains fewer columns compared to the table, and thus less data is fetched.
In this case, as in the previous example, the optimizer can also eliminate the
Sort
operator. The streaming covering index access method outputs the records in the order of the index key.In the following example, the optimizer will use the primary key index based on the predicate, as defined by the
WHERE
clause, to search for all rows where "
> 'f' ". With the SQLPLAYERS.PLAYERID
SELECT
statement column list selecting only those column(s) that are defined by the primary key index, the optimizer chooses this very efficient execution plan. This index access method also optimizes the output record ordering, therefore eliminating theSort
operator for theORDER BY
clause.SHOW TABLE scoring Found table SCORING in schema HOCKEY PLAYERID varchar(10) Nullable: No YEAR integer Nullable: No STINT integer Nullable: No TEAMID varchar(3) Nullable: No POSITION varchar(4) Nullable: No GAMESPLAYED integer Nullable: No Default: 0 GOALS integer Nullable: No Default: 0 ASSISTS integer Nullable: No Default: 0 PENALTYMINUTES integer Nullable: No Default: 0 Primary Index: SCORING..PRIMARY_KEY on fields: PLAYERID, YEAR, STINT, TEAMID, POSITION Foreign Key FK_HOCKEY_SCORING_YEAR_TEAMID_#_HOCKEY_TEAMS_YEAR_TEAMID: (YEAR,TEAMID) references TEAMS(YEAR,TEAMID) Foreign Key FK_HOCKEY_SCORING_PLAYERID_#_HOCKEY_PLAYERS_PLAYERID: (PLAYERID) references PLAYERS(PLAYERID) EXPLAIN SELECT playerid, year, stint, teamid, position FROM scoring WHERE playerid > 'm' ORDER BY playerid, year; Project PLAYERID, YEAR, STINT, TEAMID, POSITION <-- only columns included in the index are returned in the result set Streaming covering record fetch SCORING (cost: 541553.7, est. rows: 20949) Filter scan SCORING..PRIMARY_KEY (PLAYERID[>'m':inf] YEAR[] STINT[] TEAMID[] POSITION[]) ASC