in black and white
Main menu
Share a book About us Home
Biology Business Chemistry Computers Culture Economics Fiction Games Guide History Management Mathematical Medicine Mental Fitnes Physics Psychology Scince Sport Technics

Teradata RDBMS Database Administration - NCR

NCR Teradata RDBMS Database Administration - NCR , 2004. - 616 p.
Download (direct link): teradatadatabaseadmin2004.pdf
Previous << 1 .. 37 38 39 40 41 42 < 43 > 44 45 46 47 48 49 .. 218 >> Next

The COLLECT STATISTICS (Optimizer form) statement records statistical data about access to columns and the use of joins. The computed results are stored in the Data Dictionary for use during the optimizing phase of statement parsing. You can use the Teradata Manager STATS module to view Optimizer statistics. (For details on STATS, see Teradata Manager User Guide.)


You should collect statistics using the Optimizer form on newly created data tables while they are still empty. An empty collection defines the columns, indexes, and synoptic data structure for loaded collections. You can easily collect statistics again once the table is populated for prototyping, and again when it is in production (see "Re-Collecting Statistics" on page 2-47).

Use the Optimizer form of COLLECT STATISTICS to collect statistics on a:

Unique index, which can be:

Primary or secondary

Single or multiple column

Partitioned or non-partitioned

Non-unique index, which can be:

Primary or secondary

Single or multiple column

Partitioned or non-partitioned

With or without COMPRESS fields

Non-indexed column or set of columns, which can be:

Partitioned or non-partitioned

With or without COMPRESS fields

Join index

Hash index

Temporary table

If you specify the TEMPORARY keyword but a materialized table does not exist, the system first materializes an instance based on the column names you specify. This means that after a true instance is created, you can update (re-collect) statistics on the columns just by entering COLLECT STATISTICS with no options.

If you omit the TEMPORARY keyword but the table is a temporary table, statistics are collected for an empty base table rather than the materialized instance.

Sample (system-selected percentage) of the rows of a data table or index, to detect data skew and dynamically increase the sample size when found.

2 - 64 Teradata RDBMS Database Administration

Chapter 2: Building the Teradata RDBMS

Converting Non-Relational Data Structures

The SAMPLE option is not supported for global temporary tables, join indexes, or hash indexes.

The system does not store both sampled and defined statistics for the same index or column set. Once sampled statistics have been collected, implicit re-collection hits the same columns and indexes, and operates in the same mode. To change this, specify any keywords or options and name the columns and/or indexes.


Optimizer statistics are valuable for prototyping, especially when used in conjunction with the EXPLAIN modifier and the query text and processing performance captured via the DBQL feature. (For details, see Chapter 14: "Tracking Processing Behavior with the Database Query Log (DBQL)". Accurate statistics help the Optimizer determine the least costly access path and the most efficient join plans.

It is particularly important for the Optimizer to have accurate demographics on the PI to determine whether row distribution is skewed (distribution across the AMPs is lumpy).

Skew might occur when the following are true for a table:

The PI is not unique (a NUPI or NUPPI)

More than 10% of the PI values are the same

A UNIQUE column or column set, such as a USI or an identity column (see "Using an Identity Column" on page 2-66), are not defined or not used

If data is skewed or the access path the Optimizer chooses is not optimum, modify the PI and re-collect statistics until desired results are indicated. You can use the Query Capture Feature (QCF) and Teradata Visual Explain to analyze the differing plans (see Chapter 13: "Database Analysis").

Collecting Statistics on Indexes

To obtain accurate EXPLAIN information on whether and how the Optimizer will use indexes, collect statistics on the following columns (if you defined several indexes on a table, you may prefer the CREATE INDEX form):

IF you want EXPLAIN output on the . THEN collect statistics on the .
Primary index (unique or not, partitioned or not) base table columns
Secondary index, if any (unique or not)
Join queries
Join/hash indexes
NUPI of the JI or SI (the default is the first column in the index DDL; an index PI is always non-unique) NUSI on a JI (an HI cannot carry a SI) JI or HI columns

2 - 64

Teradata RDBMS Database Administration Chapter 2: Building the Teradata RDBMS

Collecting Optimizer Statistics

Collecting Index Statistics with CREATE INDEX

To collect statistics on multiple indexes of one table with a single statement, you can use the CREATE INDEX form of the COLLECT STATISTICS statement. To do this:

Step Action
1 Submit a SHOW INDEX statement.
2 Change the keywords CREATE INDEX to the keywords COLLECT

Re-Collecting Statistics

To update demographics automatically, enter the COLLECT STATISTICS statement with only the tablename (omit column or index specifications):

collect statistics ON table-name ;
Previous << 1 .. 37 38 39 40 41 42 < 43 > 44 45 46 47 48 49 .. 218 >> Next