Download (direct link):
Note: NCR recommends that you run tests on an adequate amount (approximately 10%) of real data before choosing a utility.
Loading Tables with Triggers
To load data into a trigger-enabled base table, you must disable all defined triggers before you can run the FastLoad or MultiLoad utility. To disable a table that has more than one trigger, you can use the statement:
ALTER TRIGGER tablename DISABLE ;
Supplying a table name instead of a trigger name disables all triggers for that table (see "ALTER TRIGGER" in Teradata RDBMS SQL Reference, Volume 4).
BTEQ and load utilities like TPump, which perform standard SQL inserts and updates, are supported for tables on which triggers are enabled.
Loading Tables with PPIs
Load utilities can be used for a table with a PPI, with the following advisories:
• Plan for the fact that
• MultiLoad does not support USIs
• FastLoad does not support SIs of any kind (and tables with PPIs often also have USIs). However, you can load the tables first and create the SIs later.
• The MultLoad IMPORT task:
• Does not support updates of
- The PI columns (whether PPI or NPPI)
- The partitioning columns
• For update/deletes, requires:
- All values of the PI columns
- All values of the partitioning columns
2 - 10 Teradata RDBMS Database Administration
Chapter 2: Building the Teradata RDBMS
Using an Identity Column
IDENTITY is an optional attribute used to generate a unique number for every
row inserted into the data table on which it is defined. (The identity column
does not have to be the first column in the table or defined as an index.)
You can use an identity column to:
• Avoid load preprocessing by eliminating duplicates when loading tables
• Achieve uniqueness of PK values without incurring the overhead of a UNIQUE constraint
When you use an identity column as the PI, it is useful for:
• Achieving row uniqueness without the need for a composite index.
• Ensuring row uniqueness during a merge of several tables that each have a UPI. The identity column PI is the UPI for the final result table.
The row insertion may be a singleton insert or a bulk insert, using:
• Single INSERT statements made through multiple concurrent sessions (BTEQ IMPORTs into the same table)
• Multiple INSERT statements made through multiple concurrent sessions (TPump inserts)
• INSERT-SELECT statements
• Atomic UPSERTs (with a non-PI identity column)
• Single-row MERGE-INTO
Controlling Identity Numbering Parameters
You control various aspects of identity numbering, including:
• The CREATE TABLE phrase AS IDENTITY and its options: GENERATED ALWAYS/BY DEFAULT
INCREMENT BY MINVALUE MAXVALUE CYCLE/NO CYCLE
• The dictionary table DBC.IdCol, which contains one row for each table with an identity column. The AvailValue field holds the starting value for the numbering pool. The beginning value is the value you defined as START WITH and is incremented by the INCREMENT BY value, or 1 if you did not specify an increment.
• The user-tunable field IdColBatchSize in the DBS Control GDO. This field determines the size of the number pool. You set or change the value with the DBS Control Utility (see "Reviewing or Changing Global Parameters" on page A-5).
• The incremental value for the identity field of each row, which you define with INCREMENT BY.
2 - 10
Teradata RDBMS Database AdministrationChapter 2: Building the Teradata RDBMS
Identity Column Numbering Considerations
The Identity Column numbering method takes advantage of the Teradata RDBMS parallel processing capabilities. For example, assume you create the following table:
CREATE TABLE TableZ (Y INT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 2 MINVALUE 1
MAXVALUE 10000 NO CYCLE); X INT)
UNIQUE PRIMARY INDEX(Y);
For the sake of this example, also assume you set IdColBatchSize to 10000.
To process a bulk load job for TableZ, AvailValue is initialized to the START WITH value. Each AMP then:
• Retrieves the AvailValue
• Reserves it in local cache
• Updates its value by an increment of IdColBatchSize (except for one AMP, which keeps the START WITH value of 1)
• Updates field Y in a row with the first number from cache
• Increments the next number in cache by the specified INCREMENT BY (in this case, 2) and updates field Y in another row
• Continues until it runs out of reserved numbers, then retrieves AvailValue, increments it by IdColBatchSize, and continues
The generated numbers for TableZ depend on the batch size, the number of rows loaded, and the number of AMPs. For example, if the batch size is 10000 and only 5 rows are loaded into a 2-AMP system, the numbering may be:
1, 3, 5, 10001, 10003 ...
When deciding how to apply an identity column, be aware that:
• Because table rows are distributed across all AMPs and all AMPs work in parallel, the numbers are not generated in strict consecutive order according to the sequence of row insertion. For example, a row with a generated number of 3 may be written to disk before the row with a generated number of 1.