Indexing

Indexing is the creation of data structures to speed up the data retrieval operations on a database table. These data structures are called indexes. Indexes help locate and access the rows of a table based on the values in one or more columns.

NuoDB uses two types of indexing:

B Tree Indexing

A B Tree index is a read-optimized index implemented using a B+ tree. The B Tree index sorts and stores the values in the indexed column(s) of each row in the indexed table in a single tree structure. This tree is comprised of one root index atom and zero or more leaf index atoms. For very small indexes, the entire index fits in the root. For large indexes each of one or more leaf atoms manages a range of values, and the root contains a lookup structure to find the leaf(s) containing a value or range of values. Leaf atoms are linked together for efficient range scans.

Use a B Tree index for smaller indexes that easily fit entirely in memory or for read-mostly indexes of any size. By default, when CREATE INDEX is used without the USING parameter, the index is created using B Tree.

Merge Tree Indexing

Merge Tree indexing is supported only for non-unique indexes.

A Merge Tree index is a write-optimized index implemented using multiple B+ tree components. Each component is a B Tree index, with components storing overlapping (rather than distinct) ranges. The first component contains only a root index atom. Other components may have a root index atom and leaf index atoms. The number of components is varied dynamically and automatically. When a value is inserted into the index, it is added to the first component. Values are merged in batches from smaller components into larger components, ultimately being merged into a top-level B Tree component.

By merging values in batches, modifications to the Merge Tree can be much more efficient than with a single B Tree. The tradeoff is that reads can be slower when there are multiple components in the Merge Tree. During a write-heavy workload, the number of components is increased to increase batch sizes and better amortize the cost of modifications to the index. During a read-mostly workload, the number of components is decreased to improve the performance of reads by merging values up to the top-level B Tree component.

A Merge Tree index will have much better and more predictable write throughput than a B Tree index, even for very large indexes. Use a Merge Tree index for large indexes with a heavy insert, update, or delete workload that modifies indexed columns in random (non-sorted) order.

Ideal Use Cases:

  • Logging Systems where log entries are continuously added and queried.

  • Time-Series Data where data is inserted and retrieved in real-time.

  • Real-Time Analytics that analyze data and requires frequent updates and quick access.

  • Event Sourcing where events are stored as they occur.

  • Telemetry where system or network performance data is collected and monitored continuously.

To create a Merge Tree index, use CREATE INDEX <index _name> ON <table_name> (FIELDI, FIELD2) USING MERGE TREE. For more information, see CREATE INDEX.

To rebuild an index using Merge Tree index, use ALTER TABLE <table_name> REBUILD INDEXES USING MERGE TREE. For more information, see ALTER TABLE.

To view information about Merge Tree indexes and their components, use the SYSTEM.MULTIINDEXES and SYSTEM.LOCALINDEXATOMS tables.

To get the following stats, use nuocmd get stats.

  • MemtablesMoved - The number of multi index memtables moved into sstables

  • SstablesMerged - The number of multi index stables merged into B+ tree

  • BtreeObjectsChanged - The number of changed multi index B+ tree objects

  • BtreeObjectsMerged - The number of merged multi index B+ tree objects

  • MemtableWaitTime - Time waiting for new memtable of multi index

To learn more about how to automate rebuilding indexes to use Merge Tree, see the NuoDB Support Knowledge Base article, Merge Tree Index.

Use SET AUTOCOMMIT OFF to rebuild an index safely, so that the index is replaced only on COMMIT.