in black and white
Main menu
Share a book About us Home
Biology Business Chemistry Computers Culture Economics Fiction Games Guide History Management Mathematical Medicine Mental Fitnes Physics Psychology Scince Sport Technics

Teradata RDBMS Database Administration - NCR

NCR Teradata RDBMS Database Administration - NCR , 2004. - 616 p.
Download (direct link): teradatadatabaseadmin2004.pdf
Previous << 1 .. 28 29 30 31 32 33 < 34 > 35 36 37 38 39 40 .. 218 >> Next

Primary (PI)

Non-partitioned (NPPI)

Partitioned (PPI)

Identity Column used as a PI

Secondary (SI)

Join (JI)

Hash (HI)

Any combination of secondary, join, and hash indexes can be defined on one table. Keep in mind that:

Triggers are mutually exclusive with join indexes and hash indexes.

Some load utilities cannot insert rows into a table that has been defined with an identity column or triggers, or secondary, join, or hash indexes.

An identity column cannot be defined as part of a join index, hash index, partitioned primary index, or value-ordered index.

A value-ordered secondary index counts as two indexes.

This section introduces each type of index, provides summaries of the tools used for creating, changing, and dropping indexes, and gives procedures for and examples of prototyping indexes.

Performance advantages can far outweigh the maintenance and resource costs of indexes. For a discussion of the pros and cons of using indexes, see Teradata RDBMS Performance Optimization as well as "Selecting a Secondary Index" and "Join Index Benefits and Costs" in Teradata RDBMS Database Design.

Teradata RDBMS Database Administration

2 - 21 Chapter 2: Building the Teradata RDBMS

Secondary Indexes (SI)

Primary Index (PI)

A PI performs vital functions and provides many benefits, such as:

A PI is used to .



distribute rows among the AMPs

The column or columns defined for the PI affect how evenly rows are distributed across the disks.

Unique values provide a more even distribution than repeated values. Because balanced distribution enhances AMP parallelism, choosing a good PI is usually critical to performance.

How distribution is accomplished depends on whether the PI is partioned or not, as follows:

IF the PI is.




THEN row distribution is based on .

a hashing algorithm calculated on the PI value of each row.

The hashed result determines which hash bucket, and hence which AMP, a row is sent to.

first the row hash value to determine the AMP, then the partition number within the AMP, based on the value of the partitioning expression defined in the table definition.

Within each partition, rows are ordered by hash value and uniqueness.

Note: A PPI can be unique only if all partitioning columns are included in the set of primary index columns. Otherwise, define the index as a NUPPI and define the remaining partioning columns as a USI.

"Primary Indexes" in Teradata RDBMS Database Design

enable efficient aggregations

The GROUP BY clause specified on the PI columns can result in a more efficient aggregation.

"SQL and Performance" in Teradata RDBMS Performance Optimization

enable efficient joins

An equality join constraint on the PI may enable a direct join to the target table. This eliminates the need to redistribute, spool, and sort rows before the join operation.

2 - 22

Teradata RDBMS Database Administration Chapter 2: Building the Teradata RDBMS

Secondary Indexes (SI)

A PI is used to . Description References
provide access to rows more efficiently than block-at-atime full-table scan If you specify the values of all the PI columns in a query, only one AMP needs to be accessed to retrieve the row. With a PPI, even faster access is possible when you narrow the focus within the AMP to only the qualifying partitions, by specifying one of: - The values of all the partitioning columns - A range constraint using a partial range on the partitioning columns - A range constraint on range-partitioned columns Note: Other methods for fast retrieval include using USI values, a join index, a mix of index types, or a Cylinder Read full-table scan. "Partitioned and Nonpartioned Primary Indexes" under "CREATE TABLE (Index Defintion Clause)" in Teradata RDBMS SQL Reference, Volume 4 "Managing I/O with Cylinder Read" on page 15-2

The basic rules of defining a PI for a new table include the following:

Each data table must have one and only one PI.

The PI does not have a structure physically separate from the data rows of its table.

The PI may be defined on one or many columns.

The PI may be non-partitioned (NPPI) or partitioned (PPI).

An NPPI may be unique (UPI) or non-unique (NUPI).


- May always be non-unique (NPPI)

- May be unique (UPPI) only if you include all the partitioning columns in the set of PI columns

Because every table must have a PI, Teradata RDBMS assignes one by default if you do not define one. The column or columns used and whether it is unique or non-unique depend on the table definition, as follows:

__ I

IF you do not specify a PI column or column set . THEN the default for the table is .
but you do define one or both of: A primary key A unique constraint on one or more columns a UPI on the first column with such a constraint in the CREATE TABLE statement.
Previous << 1 .. 28 29 30 31 32 33 < 34 > 35 36 37 38 39 40 .. 218 >> Next