Auto Generating Keys

Identifiers (keys) uniquely identify each persistent entity and the Jakarta Persistence API (JPA) allows them to be generated automatically when an entity is persisted. One function of the NuoDB Hibernate Dialect is to generate the appropriate NuoDB SQL to do this.

Auto-generated keys typically rely on using Sequences in a specific way, a feature Hibernate calls pooled sequences.

  • A client application uses the sequence to get a batch (pool) of values (keys) that it can call its own.

    • For example, the client might get 100 values to use.

  • Each time an entity is persisted the next value from the pool is used as the unique id.

  • When the pool is exhausted, another batch of keys is fetched from the sequence.

  • Concurrent clients get different, non-overlapping sets of key values from the shared sequence allowing them to persist entities in parallel knowing the keys they allocate are unique.

  • Using pooled sequences avoids each client having to get a new key value every time it persists an entity and avoids a popular sequence becoming a hot-spot in the database.

NuoDB uses a similar mechanism to support multiple TEs running in parallel on shared tables.

  • Each TE gets a batch (pool) of values (keys) to use from a shared sequence.

  • Each TE can allocate keys from its pool without accessing the sequence or communicating with any other TE.

  • This mechanism avoids multiple TEs contending for the shared sequence.

Unfortunately managing the pool in the TE rather than the client gives rise to different semantics and so NuoDB’s Hibernate Dialect does not support pooled sequences.

This section discusses NuoDB’s auto-generation SQL syntax, our recommendations for key generation and, for those interested, how each of Hibernate’s auto-generation strategies works with NuoDB and why we recommend the IDENTITY strategy.

NuoDB Auto-Generation SQL Syntax

NuoDB SQL supports the following syntax variations for automatic identifier generation:

  • To always generate the identifier automatically, use GENERATED ALWAYS AS IDENTITY:

    CREATE TABLE t1 (id BIGINT GENERATED ALWAYS AS IDENTITY ...
  • To generate identifies automatically, but only when the value is not provided in the INSERT statement:

    CREATE TABLE t1 (id BIGINT GENERATED BY DEFAULT AS IDENTITY ...

In both cases NuoDB automatically generates a sequence, called t1$IDENTITY_SEQUENCE, that is used to generate the keys. To define the sequence manually, see below.

NuoDB SQL supports auto-generation of keys of any primitive type but in practice id columns are usually of type INTEGER, BIGINT, DECIMAL, or STRING. In all cases the key is a 64-bit integer stored appropriately for the type.

Recommendations

JPA defines four auto-generation strategies but we only recommend one of them.

For further details and background, each is discussed below. In particular, unlike with most other RDBMS, we do not recommend using the SEQUENCE strategy directly (see Generation Type SEQUENCE).
  1. Use @GeneratedValue(strategy = GenerationType.IDENTITY)

    • Automatically implemented using a dedicated sequence for each table.

      • You can override this by defining the table manually if you prefer - see recommendation 2.

    • Key generation occurs in the TE and you get full advantage of NuoDB’s distributed sequence implementation (see discussion on SEQUENCE strategy).

    • The ID column is declared as GENERATED BY DEFAULT so you can always provide a key explicitly if you wish.

    • This option may preclude batched inserts (Hibernate 5.3 or earlier) - see below.

  2. The sequence has a default starting value of 1 and a quantum size of 100. To override, declare the table and its sequence manually. For example:

    CREATE SEQUENCE seq1 START WITH 1000, QUANTUM SIZE 200;
    CREATE TABLE tab1 ( id BIGINT GENERATED BY DEFAULT AS IDENTITY(seq1), ...);

    If you prefer, use id BIGINT GENERATED ALWAYS AS IDENTITY(seq1).

  3. Another reason to control the underlying sequence might be to ensure unique identifier values across multiple tables (for example when implementing a one-one relationship). Manually declare 2 or more tables sharing the same sequence:

    CREATE SEQUENCE seq1 ...;
    CREATE TABLE tab1 ( id BIGINT GENERATED ALWAYS AS IDENTITY(seq1), ...);
    CREATE TABLE tab2 ( id BIGINT GENERATED ALWAYS AS IDENTITY(seq1), ...);
    ...

Unless you have a particular reason for not doing so, the IDENTITY strategy (possibly using manual table definitions) is the best overall implementation.

Background

Here is how each of the four JPA strategies is implemented when using NuoDB, with points to consider for each one. The example assumes you have a Long id that is stored as a BIGINT.

Generation Type AUTO

Suppose you have the following identifier declaration in your Java entity class:

@Id
@GeneratedValue(strategy = GenerationType.AUTO)
private Long id;   // Long not long, so it can be null for non-persistent (transient)  entities

If you use Hibernate’s ability to run DDL and create tables for you, NuoDB’s Hibernate dialect creates tables with the ID column declared as ID BIGINT NOT NULL. In addition Hibernate creates a sequence called hibernate_sequence to generate the identifier values automatically upon insert.

Considerations
  • Hibernate 5.x uses the same hibernate_sequence to create keys for all entities using the AUTO strategy. See Generation Type IDENTITY for an alternative. From Hibernate 6.0 this limitation is removed and each table gets its own sequence.

  • Every new entity requires execution of two SQL statements - first to get the next value from the sequence, the second to insert the new row, using that id.

Generation Type IDENTITY - Recommended

This is our recommended strategy.

If you specify strategy=GenerationType.IDENTITY and request Hibernate to create tables for you, NuoDB’s Hibernate dialect creates tables with the ID column defined as ID BIGINT GENERATED BY DEFAULT AS IDENTITY NOT NULL.

The identifier is (typically) specified in Java as an Integer, a Long, a BigDecimal or a String.

  • The ID should always be an object, not a primitive type, so it can be null if the entity has no ID yet.

The identifier is a positive 64-bit integer, cast as necessary for the ID column type. Even when the identifier is a string it still holds an auto-generated numeric value.

Considerations
  • Persisting an entity requires just a single INSERT statement. The new key value is returned from the insert using Statement.getGeneratedKeys().

  • By default, each table gets its own sequence for key generation - in contrast to AUTO above.

  • Up to Hibernate 5.3, an INSERT is run as soon as EntityManager.persist() is invoked (to set the id in the new entity). This behaviour prevents the use of SQL batching.

    • From Hibernate 5.4, Hibernate is more optimized and, if the id is not used, delays invoking EntityManager.persist() to the end of the transaction, thus allowing batching to be supported.

    • Should this enhancement cause problems, you can disable it by setting hibernate.id.disable_delayed_identity_inserts=true, but this will prevent batching also.

Generation Type SEQUENCE - NOT Recommended

Sequences in NuoDB are optimized for use in a distributed system. Instead of an INCREMENT BY option, it supports a QUANTUM SIZE. This allows each TE to get batches of values to use, reducing contention for the shared sequence. See CREATE SEQUENCE for full details.

As a result, the sequence values returned by SELECT NEXT VALUE depend on which TE you are connected to and how many other clients that TE is also supplying values to.

Example:

  • Two TEs share a sequence with default quantum size of 100.

  • Both get 100 values to use.

    • TE1 gets 1-100 and TE2 gets 101-200.

  • TE1 is serving two clients fetching values from the sequence.

  • So the values returned to client-1 might be 1, 2, 5, 7 …​ because the other client is getting intermediate values.

  • If client-1 is multi-threaded and getting values from both TE1 and TE2, the values returned might be 1,101,2,5,102,7,103 …​

Hibernate uses sequences with increment greater than 1 in a similar way to reduce round trips to the database. If a sequence has increment 100, and SELECT NEXT VALUE returns 200, then Hibernate knows that any other client getting a value from that sequence will get a value > 299. Therefore it has a "pool" of contiguous values 200-299 that it can use to supply unique keys.

Hibernate calls these Pooled Sequences and they are (from V5) the default implementation for the SEQUENCE strategy. When a new entity is persisted, the next value from the pool is set as the id in the entity and in the new row inserted into the corresponding table.

As you can see from the example, this assumption of a contiguous batch of values does not hold with NuoDB so our dialect tells Hibernate that it does not support Pooled Sequences. As a result, Hibernate chooses an alternative implementation using an underlying table to simulate a pooled sequence (equivalent to strategy=GenerationType.TABLE) requiring two database accesses to persist one entity (just like the AUTO strategy) plus an extra write per pool (to increment the next-value held in the table for the next pool).

Considerations
  • The table used for key generation can become a contention hot spot with concurrent inserts by different clients. (NuoDB’s sequences are implemented as they are to avoid precisely this problem.)

  • Requires two (occasionally three) SQL statements to persist one object.

  • Other strategies should be used whenever possible - see Recommendations.

Generation Type TABLE

This is the explicit JPA strategy to use a table to simulate a sequence. It allows multiple entities to use the same table to implement their "sequence". Each pseudo-sequence is defined by a different row in the same table.

Considerations
  • The table used for key generation can become a contention hot spot with concurrent inserts by different clients.

    • It is recommended to use a different table for each entity (this can be requested via the @TableGenerator.table attribute).

  • Requires two database accesses to persist one entity (just like the AUTO strategy) plus an extra write per pool (to increment the next-value held in the table for the next pool)

  • Other strategies should be used whenever possible - see Recommendations.