Books
in black and white
Main menu
Share a book About us Home
Books
Biology Business Chemistry Computers Culture Economics Fiction Games Guide History Management Mathematical Medicine Mental Fitnes Physics Psychology Scince Sport Technics
Ads

Introduction to the Teradata® RDBMS for UNIX® Version 2 Release 2.1 - NCR

NCR Introduction to the Teradata® RDBMS for UNIX® Version 2 Release 2.1 - NCR, 1998. - 315 p.
Download (direct link): inntroduktionteradata1998.pdf
Previous << 1 .. 24 25 26 27 28 29 < 30 > 31 32 33 34 35 36 .. 76 >> Next

Full table scan • Efficient because each row is touched only once • All AMPS, all rows • Spool file might be as large as the base table

4-12

Introduction to the Teradata RDBMS for UNIX
Data Definition

Primary Indexes

Introduction

Primary Indexes

In the Teradata RDBMS, at least one primary index is required for each table, while no secondary indexes are required.

Primary indexes:

• Affect the distribution of rows across AMPs.

• Do not have subtables

• Can be unique and nonunique

The primary index for a table should represent the data values most used by the SQL to access the data for the table.

In the following example, DeptNo would be a suitable primary key.

DeptNo DeptName EmpCount Loc MgrNo Budget
100 Administration 3 NYC 10004
300 Exec Office 2 NYC 10012
500 Engineering 5 ATL 10008
600 Manufacturing 3 CHI 10009

When a row of information is inserted into the table, the primary index value for that row is processed using a hashing algorithm which determines a specific location within the system for the row of information.

At a later time, in order to retrieve the information, the primary index value provided is used in the hashing algorithm to “compute” the location of the row.

Since the primary index value is supplied when the row is initially stored, it can be used to locate the row without the additional overhead involved in maintaining a separate index subtable.

Introduction to the Teradata RDBMS for UNIX

4-13
Data Definition

Primary Indexes

Unique Primary Index

Nonunique Primary Index

Guidelines for Selecting a Primary Index

A unique primary index (UPI) corresponds to one and only one data I

row in the table. Use a UPI when you want to ensure even |

distribution of data for a table across all AMPs. Even distribution optimizes both retrieval and disk storage.

The UPI is typically assigned to the column (or columns) that comprise the primary key in the logical design.

If you choose not to have a UPI for a table, you should assign a USI to the “primary key” column set. |

A non-unique primary index (NUPI) corresponds to one or more data rows in the table. Distribution of rows depends on the value of the primary index for a table, to obtain the most even distribution of rows possible across AMPs, you should select as “nearly unique” a nonunique primary index as possible.

All rows with the same primary index value are stored on the same AMP, so if you choose a frequently duplicated column as a nonunique primary index, row distribution could be very uneven. |

Keep these guidelines in mind when you select a primary index for your tables.

• The column (or column set) chosen should be the set selection most frequently used to select rows from the table and should be unique (UPI) or close to unique (NUPI)

• Selection should be based on an equality search

• The data values for the index should not be subject to change

• Distinct index values hash evenly across all AMPs, while

duplicate index values hash to the same AMP |

• A nonunique primary index should have no more than 100

duplicate values |

4-14

Introduction to the Teradata RDBMS for UNIX
Data Definition

Secondary Indexes

Introduction

Subtables

Journaling and Secondary Indexes

Secondary Indexes

Use a secondary index as an alternative access path to enhance retrieval performance.

Secondary indexes come in two forms: unique and nonunique.

Secondary indexes always have an associated subtable. Because of this, table updates, inserts, and deletes are slower than they would be without a secondary index because each updated row in the base table is accompanied by an updated row for the subtable.

For this reason, secondary indexes may not be appropriate in situations like a heavily used OLTP application.

On the other hand, multiple secondary indexes are useful in a decision support environment.

A subtable is a table of indexes comprised of two fields: a data value (which may be composite) and a pointer to one (in the case of a USI) or several (in the case of a NUSI) data rows.

The optimizer uses secondary index subtables for aggregate processing whenever it can because queries can frequently be answered based on the data in the subtable alone without having to refer to the base table.

The overall usefulness of a NUSI is heavily dependent on the |

If a secondary index is weakly selective, then the optimizer does not use it.

Changes to USI rows are kept in the Transient Journal.

Changes to NUSI rows are not journaled.

Introduction to the Teradata RDBMS for UNIX

4-15
Data Definition

Secondary Indexes

The following table provides examples of how a secondary index Advantages of Secondary would enhance performance on operations on a particular table.

Indexes

SQL Operation Notes USI on Name Column NUSI on Name Column
SELECT... WHERE Name = 'Leidner P'; None Involves two AMPs Step is first sent to index AMP Step is forwarded to primary AMP Involves all AMPs Reduces the number of rows examined on each AMP
Previous << 1 .. 24 25 26 27 28 29 < 30 > 31 32 33 34 35 36 .. 76 >> Next