Understanding EXPLAIN Command Output

The purpose of reading the output from the NuoDB EXPLAIN command (see EXPLAIN) is to see the execution plan that the optimizer will choose to quickly retrieve, join, group or summarize, and sort the rows that the SQL statement operates on or returns. The SQL statement can be a SELECT statement or any DML statement that requires the optimizer to first retrieve the rows to execute for the DML operation.

The NuoDB EXPLAIN command output displays the execution plan as a hierarchical list of operators and the stream of records that result from applying those operators. When you read the plan from top to bottom you can see the order in which NuoDB SQL forms the plan. When you read the plan from bottom to top you see the plan as the query engine will execute it. The exception to this is the JOIN order. This should be interpreted from top to bottom.

With the information provided in the execution plan, you might choose to update SQL statements or database structure to improve performance. For example, you might decide to add/remove indexes or override the indexes that the optimizer chooses by using hints (see Using Optimizer Hints).

The EXPLAIN command output provides the following information:

Execution Plan Top Level Operators

The execution plan is a hierarchical sequence of operators. The top level operators are:

Operator Description
Table Scan

Represents a full table scan. Either there is no index, or the indexes defined are not usable based on the predicates in the WHERE clause, or using the index would be too costly for the optimizer. The optimizer would determine a potential index to be too costly if, based on the predicates, the majority of rows in the table would be retrieved. In this case, a full table scan would be more efficient.

This operator has no input (no sub-tree is displayed) and the stream of records it outputs represents the entire table.

Index scan

Represents the usage of an index access method. There are various access methods that an Inversion operator can use to scan an index. Input to this operator can be a bitmap of record IDs or a list of key values. This operator outputs a stream of records. The order of the records is dependent on the Index Access Method. See Index Ordering Properties.

Boolean sieve

Represents any predicate in the WHERE clause. This operator can be thought of as a filter (like a sieve) that applies boolean conditions to output only those records that satisfy the condition[s]. It always takes two inputs, displayed as sub-trees. The first input is a series of boolean conditions and the second input is a record stream on which to apply those conditions. Output from the Boolean sieve operator is a record stream containing only records that satisfy the boolean condition.
Possible boolean conditions for the first input are:

  • And, Or, Eql, Not, Gtr, Lss, Neq, etc.
  • Functions, such as Upper, Lower, Sum, Min, Max, etc.
  • In List (DynamicInList, ConstantInList, ScanInList)
  • In Select
  • Like
  • Between

This is not a comprehensive list, but is meant to give you an idea of what types of inputs are possible.

Nested Loop Join

Represents the method that the optimizer will use to join two or more tables based on the predicate conditions of the JOIN clause.

This operator takes as input two or more boolean operators, each representing certain conjuncts in a JOIN clause and each specific to a certain table used in that JOIN clause. Output from the Join operator is a stream of records, from both tables, that match the JOIN criteria.

Description of Bitmap Index Operators

A bitmap index operator represents an index intersection and union (IIU) plan that takes one or more index scans as input and applies intersection or union operations for search conditions based on the conjuncts in the query predicate of the WHERE clause. The output of this plan is a bitmap of record IDs. There are various types of bitmap index operators in an execution plan and they are described in the following table.

Bitmap Index Operators

Description

Bitmap This bitmap represents a scan over an interval of an index key where the interval is deduced from the conditions in the query predicate. For each conjunct in the predicate that uses the index generating this bitmap, there are one or more sub-trees displayed - one node if the boundary is equality and two nodes (min,max) if the boundary is less than or greater than.
BitmapAnd This bitmap represents a scan on an index when AND is part of the query predicate and two or more separate indexes are used. It has two or more sub-trees displayed, each representing a Bitmap Index Operator for the indexes used for the operands of the AND.
BitmapOr This bitmap represents a scan on an index when OR is part of the query predicate and two or more separate indexes are used. It has two or more sub-trees displayed, each representing a Bitmap Index Operator for the indexes used for the operands of the OR.
InSelectBitmap

This bitmap represents a scan on an index of the outer table, using all the values of the subquery. It has two subtrees displayed:

  • the column list of the outer table over which the Index scan is performed.
  • the execution plan of the IN subquery.
LikeBitmap This bitmap represents an Index scan when LIKE is part of the query predicate. It has one subtree displayed which represents the constant string used in the LIKE clause, to which the field in the index is bound.
MinMaxBitmap This bitmap represents an Index scan when either MIN or MAX is part of the SELECT column_list. It has one subtree displayed, but this is ignored and is always null.

Index Ordering Properties

A bitmap index fetches records in record ID order. The records are then processed based on the GROUP BY and ORDER BY clauses.

A streaming index fetches records in index key order. The optimizer can make use of this ordering property to optimize the processing of GROUP BY and ORDER BY clauses of the SQL statement. If the order of records output by the streaming index matches the order specified by the ORDER BY clause of the SQL statement, then the optimizer will completely optimize out the sort from the execution plan.

A streaming covering index is similar to a streaming index in that it also fetches records in index key order. The optimizer will choose this index method if the index contains all of the columns that are to be retrieved. Therefore, the optimizer can make use of covering indexes to retrieve the result set information (without fetching table records) and optimize the processing of both the GROUP BY and the ORDER BY clauses.

Query Participants

Query participants as predicates in the WHERE clause are represented in the tree as follows:

Node

Participant

Parameter A parameter that is passed to the SQL statement at runtime
Constant A quoted string (even for numeric values)

Description of Optimizer Parameters

The EXPLAIN command, in addition to displaying the execution plan, can provide additional information that the optimizer uses to chose the most efficient execution plan for a SQL statement. See EXPLAIN for the command usage. This section describes this additional information provided by each of these EXPLAIN command options.

Cost Estimates

Running the EXPLAIN command with the option OPT_ESTIMATES set to ON provides optimizer cost estimates for each index that the optimizer choose to scan. These are the costs associated with that execution plan. The lower the cost, the more efficient the plan. These costs are based on statistics that are computed for each index and stored in NuoDB system tables (see Improving Query Performance). These estimates are computed internally by the optimizer and cannot be changed. See Example 6 below.

In order for the optimizer to chose the most efficient execution plan, the index statistics should be kept up-to-date to reflect changes to the underlying table rows (i.e. rows being inserted, updated or deleted). See Improving Query Performance for information about the NuoDB system tables that store these index statistics. To update the index statistics, run the ANALYZE command. This ANALYZE command should be run for every table in the SQL statement being passed to the EXPLAIN command. Alternatively, it can be run for each index on every table in the SQL statement (see ANALYZE command).

Some of the information provided by these cost estimates include:

Type of Estimate Description
cost This represents an abstract measure of how expensive an execution plan is. It is currently measured in I/O cost, in terms of bytes read.
interval sel This represents the percentage of the total number of keys that an Index scan will fetch. It is called interval sel because the lower and/or upper boundaries of an Index scan define an interval over the whole index.
predicate sel This represents the additional selectivity of any predicates over other columns of an index that could not be used for the Index scan itself. This is usually the same as interval sel as quite often there are no additional predicates.
keys This represents the number of rows that an Index scan will output.
rows This represents an estimated number of result (output) rows that the optimizer retrieve for that particular operation. For example, if it is a full table scan, this would be the total number of rows in the table. If it is an index, this would be the number of rows retrieved by the index search operation.
Note: If any of these estimates are shown as -1, this means that the optimizer was not able to compute a cost estimate for that particular operator.
Note: The selectivity ratios and the keys estimate are calculated and displayed if the optimizer chooses to use an index.

Analyze Cost Statistics

The ANALYZE option of the EXPLAIN command evaluates the SQL  statement, but instead of calculating cost estimates based on index statistics that NuoDB generates, this ANALYZE option forces the SQL statement to be executed. This allows the optimizer to collect actual statistics for each operator of the execution plan. These statistics are displayed in EXPLAIN format annotated with the counters' information.

Unlike other EXPLAIN command options, the ANALYZE option supports only SQL SELECT statements. The primary reason is that the ANALYZE option causes the SQL statement to be executed. Therefore, running the EXPLAIN command with the ANALYZE option on a DML statement may accidentally modify the database.

If a SQL SELECT statement has bind parameters NuoDB may dynamically re-optimize the execution plan at run time, when the parameters values are provided. This could cause the optimizer to chose a different execution plan from the one displayed by the EXPLAIN command output. The ANALYZE option is very useful in this case, as it shows the actual execution plan being evaluated.

When the ANALYZE option is set to ON, the EXPLAIN command output displays execution plan operators and their values as a comma-separated list enclosed in parentheses. These include:

micros This is the total time, in microseconds, that this node took during execution.
f/o This is the time, in microseconds, spent in "fetch" calls.
This is the time, in microseconds, spent in "open" calls.
The sum of these two values equals the micros value.
rows in This is the number of records/keys that are input to the operator.
rows out This is the number of records/keys produced as output by the operator.
rows in / rows out This is the actual selectivity of the operator, comparing the number of rows that are input to the operator versus the number of rows produced as output by the operator.

Optimizer Partition Pruning

NuoDB supports table partitioning (see Using Table Partitions and Storage Groups). The table partition criteria is stored in NuoDB system tables. The system table stores the column that the table is partitioned on. The system table also stores the partition criteria for each table partition (see PARTITIONIDS System Table Description and PARTITIONCRITERIA System Table Description). With a SQL statement predicate that includes this partitioning column, the optimizer can compare the predicate value to the partition criteria for each table partition. This gives the optimizer the ability to ignore, or "prune", one or more table partitions when searching for table rows that match the predicate values.

When the EXPLAIN command PARTITION_IDS option is set to ON, the EXPLAIN command output will display, for each table that is partitioned, the SYSTEM.PARTITIONIDS.PARTITIONID value for each partition that the optimizer searched. This option is ignored for tables that are not partitioned (see Example 14).

About Execution Plans

A few things to note when looking at execution plans:

The output from the EXPLAIN command gives a complete view of the plan that the optimizer produced. However, you should not expect the execution plan to stay static throughout the life of a database. Some reasons for this are:

Examples of EXPLAIN Command Output