SQL Table Functions

SQL Table Functions are built-in table functions that can be used in SQL statements where other table expressions, such as a table or a subquery, are expected. The result set of a table function is a temporary table that exists during SQL statement execution. This table can be queried further, joined with other tables, or used in any other way that a regular table can be used. Table functions resemble the user-defined table functions that are created using the CREATE TABLE statement.

DATAAFFINITYMAPPING

DATAAFFINITYMAPPING returns a table with a mapping of PARTITION to NODEID for any table. This provides information about the NODEIDs where each partition will be processed if the table is processed as a distributed query.

Syntax

DATAAFFINITYMAPPING('<tablename_in_current_schema>')
DATAAFFINITYMAPPING('<schema_name>','<table_name>')

tablename_in_current_schema is used for the specified table in the current schema. schema_name and table_name are used for the specified table in the specified schema.

Example

SELECT * FROM DATAAFFINITYMAPPING('TEST','T1')
ORDER BY PARTITION,NODEID;
PARTITION  NODEID
---------- -------
P1        3
P2        1
P3        2
P4        1

GETAGGREGATEDSCALARSTATS

GETAGGREGATEDSCALARSTATS returns the aggregated scalar stats per statistics (column, index, and column group) across one or more partitions. It takes two required parameters: a table name and a list of partition names, and an optional parameter, the schema name. If the schema name is not specified, it will default to the current schema. To see stats per partition, see GETSCALARSTATS.

Syntax

GETAGGREGATEDSCALARSTATS(['<schema_name>',] '<table_name>', '<partition_list>');
GETAGGREGATEDSCALARSTATS(['<schema_name>',] '<table_name>', null);

partition_list is a string that can contain one partition or a comma-separated list of partitions such as, p1 or p1, p2, p3. The null data type can be used to represent a list of all the partition names.

Examples

SELECT * FROM GETAGGREGATEDSCALARSTATS('t2', 'p1, p2, p3');
| FIELDS  NDV  NULLCOUNT
| ------- ---- ----------
|   C       3       0
|   C, D    6       0
|   D       4       4
SELECT * FROM GETAGGREGATEDSCALARSTATS('t2', null);
|  FIELDS  NDV  NULLCOUNT
| ------- ---- ----------
|   C       3       0
|   C, D    6       0
|   D       4       4

GETHISTOGRAMBUCKETS

GETHISTOGRAMBUCKETS returns a table containing the full contents of the histogram stats for a specified index or statistics object. It takes two required parameters: the table name and the index/statistics name, and one optional parameter, the schema name. If the schema name is not specified, it will default to the current schema. There will be one row in the table per bucket in the histogram.

Syntax

GETHISTOGRAMBUCKETS (['<schema_name>',]'<table_name>','<statistics_name>');

Example

SELECT * FROM GETHISTOGRAMBUCKETS('test', 't1', 's1');
  PARTITION  BUCKETINDEX            BOUNDARYLOW            BOUNDARYLOWCARD               BOUNDARY               BOUNDARYCARD  BUCKETNDV
 ---------- ------------ -------------------------------- ---------------- ----------------------------------- ------------- ----------
    <null>        0       [Null], [Null], [Null]                  49        [Null], [Null], [Null]                   49           1
    <null>        1       [1], [one], [110E-2]                     1        [2], [oneone], [220E-2]                   6           2
    <null>        2       [3], [oneoneone], [330E-2]              15        [4], [oneoneoneone], [440E-2]            18           2
    <null>        3       [5], [oneoneoneoneone], [550E-2]        10        [6], [oneoneoneoneoneone], [660E-2]       2           2

GETHISTOGRAMS

GETHISTOGRAMS returns a table containing data about the histogram stats for a specified table. It takes one required parameter, the table name, and one optional parameter, the schema name. If the schema name is not specified, it will default to the current schema. There will be one row, for each index and statistics object created on the table, per histogram.

Syntax

GETHISTOGRAMS (['<schema_name>',]'<table_name>');

Example

SELECT * FROM GETHISTOGRAMS('t1');
  PARTITION  NAME     FIELDS    KEYCOUNT              MINKEY                                         MAXKEY                              BUCKETCOUNT
 ---------- ----- ------------ --------- ------------------------------ --------------------------------------------------------------- ------------
    <null>    S1   I, S, N         101    [Null], [Null], [Null]         [6], [oneoneoneoneoneone], [660E-2]                                  6
    <null>    S2   D, T, B         101    [Null], [Null], [Null]         [1111-11-16], [2024-01-06 01:01:01], [010101010101010101010101]      6
    <null>    S4   DBL, T, N, I    101    [Null], [Null], [Null], [Null] [666.666], [2024-01-06 01:01:01], [660E-2], [6]                      6

GETSCALARSTATS

GETSCALARSTATS returns a table containing the current values of the scalar stats for a specified table. It takes one required parameter, the table name, and one optional parameter, the schema name. If the schema name is not specified, it will default to the current schema.

Syntax

GETSCALARSTATS(['<schema_name>',] '<table_name>');

Example

SELECT * FROM GETSCALARSTATS('t');
 PARTITION  FIELDS   NDV  NULLCOUNT
 ---------- ------- ----- ----------

     P2        E    12345    6789
     P1        E    12345    6789

GETSTATSMETADATA

GETSTATSMETADATA returns a table containing one row for scalar stats metadata, one row for histogram metadata, and one row for each index on the table. For partitioned tables, it returns these rows for each partition. It takes one optional parameter, the schema name, and one required parameter, the table name. GETSTATSMETADATA table function allows the inspection of the Stats v3 metadata.

Syntax

GETSTATSMETADATA(['<schema_name>',] '<table_name>')

Example

CREATE INDEX I1 ON t1 (f2);
SELECT * FROM getStatsMetadata('t1');
 PARTITION  METADATATYPE  CARDINALITY  MODCOUNT  MODCOUNTTHRESHOLD        TIMEOFLASTSCAN       CARDINALITYATLASTSCAN  SAMPLESIZE
 ---------- ------------- ------------ --------- ------------------ -------------------------- ---------------------- -----------

   <null>   SCALAR STATS     30000       40000         42500        2024-10-16 13:06:14.026422          30000           <null>
   <null>   HISTOGRAM        30000       40000         33000        2024-10-16 13:06:52.29099           30000            40000
   <null>   I1               30000       40000         33000        <null>                             <null>           <null>

VALUESBETWEEN

VALUESBETWEEN generates a table with a single column called VAL containing all integer values from lower_bound to upper_bound, including the lower_bound and the upper_bound. This function can be used for generating test data.

Syntax

VALUESBETWEEN(<lower_bound>,<upper_bound>)

Example

SELECT
    val AS id,
    MOD(val * 89, 50) AS intcol,
    DATE '2020-01-01' + MOD(val * 71, 365) AS datecol,
    CASE MOD(val, 3)
        WHEN 0 THEN 'abc'
        WHEN 1 THEN 'defghi'
        ELSE NULL
    END AS strcol
FROM
    valuesbetween(1, 10);
ID  INTCOL   DATECOL   STRCOL
--- ------- ---------- -------
1    39    2020-03-12 defghi
2    28    2020-05-22 <null>
3    17    2020-08-01 abc
4     6    2020-10-11 defghi
5    45    2020-12-21 <null>
6    34    2020-03-02 abc
7    23    2020-05-12 defghi
8    12    2020-07-22 <null>
9     1    2020-10-01 abc
10    40    2020-12-11 defghi