Teradata RDBMS for UNIX SQL Reference - NCR

NCR Teradata RDBMS for UNIX SQL Reference - NCR, 1997. - 913 p.

If character strings of unequal length are compared on a European Teradata RDBMS, the shorter field is padded with blanks on the right prior to the comparison. If Japanese character strings of unequal length are compared, the shorter string is padded with one or more <single byte space> characters.

For comparisons that are NOT CASESPECIFIC, any lowercase single byte Latin letters are converted to uppercase before the comparison begins. The prepared strings are then compared byte by byte, and any trailing spaces are ignored.
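The padding and case folding rules above can be illustrated with a minimal Python sketch. It is not Teradata code: the function names `prepare` and `compare` are hypothetical, strings are modeled as raw single byte data, and `bytes.upper()` folds only the ASCII letters a through z, which is a simplification of "single byte Latin letters".

```python
def prepare(value: bytes, length: int, case_specific: bool) -> bytes:
    """Pad on the right with single byte spaces, then fold case if allowed."""
    padded = value.ljust(length, b" ")
    if not case_specific:
        # bytes.upper() converts only ASCII a-z; other bytes are left unchanged.
        padded = padded.upper()
    return padded


def compare(left: bytes, right: bytes, case_specific: bool) -> int:
    """Return -1, 0, or 1 after padding both operands to the same length."""
    length = max(len(left), len(right))
    a = prepare(left, length, case_specific)
    b = prepare(right, length, case_specific)
    # Python compares bytes objects byte by byte.
    return (a > b) - (a < b)


print(compare(b"george", b"GEORGE  ", case_specific=False))
print(compare(b"george", b"George", case_specific=True))
```

Because both operands are padded to the same length before the byte comparison, trailing spaces never affect the result, which matches the behavior described above.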

Because of the long persistence of the CASESPECIFIC attribute for columns, and the shorter persistence for attributes carried in the source text of queries, the same statement can have different results for a user logged on in ANSI versus Teradata mode.

IF . . .                               THEN the comparison is . . .
either argument is CASESPECIFIC        case specific.
both arguments are NOT CASESPECIFIC    case blind for <simple Latin


Teradata RDBMS for UNIX SQL Reference
Data Definition

Case Sensitivity in Comparisons

Consider the following query:



IF column FIRSTNAME is . . .    AND the session is in this mode . . .    THEN the match succeeds for rows with FIRSTNAME containing . . .
CASESPECIFIC                    either                                   'George'
NOT CASESPECIFIC                Teradata                                 'george', 'GEORGE', 'George', and any other combination of cases spelling out the name George.
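The interaction between the column attribute and the session mode can be sketched in Python as follows. This is an illustration only, and it rests on two assumptions that the table does not state outright: that the query compares FIRSTNAME with the literal 'George', and that a literal in the query text is treated as CASESPECIFIC in ANSI mode and NOT CASESPECIFIC in Teradata mode. The function names are hypothetical.

```python
def literal_is_case_specific(session_mode: str) -> bool:
    """Assumed default attribute of a character literal for each session mode."""
    if session_mode not in ("ANSI", "Teradata"):
        raise ValueError(f"unknown session mode: {session_mode}")
    return session_mode == "ANSI"


def firstname_matches(stored: str, literal: str,
                      column_case_specific: bool, session_mode: str) -> bool:
    """The comparison is case specific if either argument is CASESPECIFIC."""
    case_specific = column_case_specific or literal_is_case_specific(session_mode)
    if case_specific:
        return stored == literal
    return stored.upper() == literal.upper()


for name in ("george", "GEORGE", "George"):
    print(name, firstname_matches(name, "George", False, "Teradata"))
```

Under these assumptions, a NOT CASESPECIFIC column matches every spelling only in Teradata mode, while a CASESPECIFIC column matches 'George' alone in either mode.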

If you want to ensure case blind comparisons and have an ANSI compliant application, the recommended approach is to define FIRSTNAME with the ANSI default (CASESPECIFIC).

Use the statement:


Fixed Length KanjiEBCDIC:

Character Data for a Japanese Character Supported Site

On a Japanese character site, the Teradata RDBMS assumes that all character data is mixed single and multibyte characters. Mixed single and multibyte character data is associated with any column defined as CHAR, VARCHAR, or LONG VARCHAR.

Encoding of character data will be either KanjiEBCDIC, KanjiEUC, or KanjiShift-JIS, depending on the current character set for the session.

Depending on the character set of the session, single byte and multibyte characters are distinguished as described by the following table.

Character Set    Definition
KanjiEBCDIC      Characters are assumed to be single byte until a Shift-Out character is encountered. Subsequent characters are assumed to be multibyte characters until a Shift-In character is encountered. If the end of the string is reached without finding Shift-In, an error condition occurs.
KanjiEUC         The first byte of a multibyte character always has the most significant bit on. Multibyte characters are two bytes except KanjiEUCcs3 characters, which require three bytes.

The size of a single byte/multibyte character column must be expressed in the number of bytes required to store the internal representation of the data.
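The two scanning rules in the table can be sketched in Python. This is a simplified illustration, not the RDBMS implementation: the function names are hypothetical, Shift-Out and Shift-In are taken to be the bytes 0E and 0F, and the use of 8F as the lead byte of a three byte code set 3 character follows the standard EUC-JP convention, which is an assumption about KanjiEUC.

```python
SHIFT_OUT = 0x0E  # switches the scanner to multibyte mode
SHIFT_IN = 0x0F   # switches the scanner back to single byte mode


def split_kanji_ebcdic(data: bytes) -> list:
    """Split a KanjiEBCDIC string into ("single", ...) and ("multi", ...) runs."""
    runs = []
    i = 0
    while i < len(data):
        if data[i] == SHIFT_OUT:
            end = data.find(bytes([SHIFT_IN]), i + 1)
            if end == -1:
                # End of string reached without finding Shift-In.
                raise ValueError("Shift-Out without a matching Shift-In")
            runs.append(("multi", data[i + 1:end]))
            i = end + 1
        else:
            start = i
            while i < len(data) and data[i] != SHIFT_OUT:
                i += 1
            runs.append(("single", data[start:i]))
    return runs


def kanji_euc_char_length(first_byte: int) -> int:
    """Return the byte length of a KanjiEUC character from its first byte."""
    if (first_byte & 0x80) == 0:
        return 1  # most significant bit off: single byte character
    if first_byte == 0x8F:
        return 3  # assumed code set 3 prefix (standard EUC-JP convention)
    return 2


print(split_kanji_ebcdic(bytes.fromhex("0E42E242E342D942C942D542D70F3132")))
```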


Assume that a fixed-length column is to contain the following KanjiEBCDIC data:

< S T R I N G > 12

The column definition must be at least 16 bytes (CHAR(16)) to accommodate the internal representation of the data plus the Shift-Out and Shift-In characters, which is as follows:

0E 42E2 42E3 42D9 42C9 42D5 42D7 0F 31 32
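The 16 byte figure can be checked with a short Python sketch. The helper `kanji_ebcdic_length` is hypothetical and simply counts two bytes per double byte character, one byte per single byte character, and two bytes (Shift-Out plus Shift-In) per shifted run.

```python
def kanji_ebcdic_length(double_byte_chars: int, single_byte_chars: int,
                        shifted_runs: int = 1) -> int:
    """Bytes needed for a KanjiEBCDIC string, including shift characters."""
    return 2 * double_byte_chars + single_byte_chars + 2 * shifted_runs


# The internal representation quoted above, with 0E and 0F as the shift bytes.
internal = bytes.fromhex("0E 42E2 42E3 42D9 42C9 42D5 42D7 0F 31 32")

print(len(internal))
print(kanji_ebcdic_length(double_byte_chars=6, single_byte_chars=2))
```

Both values come to 16, so CHAR(16) is the smallest fixed-length column that holds this string.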

Note that each of the three client representations of multibyte character data could require a different length for the same sequence of symbols.


Fixed Length KanjiShift-JIS: Example

Character Expressions Assigned to Shorter or Longer CHAR Columns

SQL Terminal Symbols

Character Data Validation and Storage

The same string in KanjiShift-JIS is as follows:

S T R I N G 12

and requires a length of only 14 bytes (CHAR(14)). Here the letters S T R I N G are double byte characters and the digits 12 are single byte characters. The internal equivalent for the KanjiShift-JIS string is as follows:

8272 8273 8271 8268 826D 8266 31 32
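The byte values above can be reproduced with Python's standard `shift_jis` codec, used here only as a stand-in for the KanjiShift-JIS client character set. The sketch assumes that the double byte letters are the fullwidth Latin capitals (U+FF21 onward) and that the two codecs agree for this particular string.

```python
# Fullwidth S, T, R, I, N, G followed by the single byte digits 1 and 2.
text = "\uFF33\uFF34\uFF32\uFF29\uFF2E\uFF27" + "12"

encoded = text.encode("shift_jis")

print(encoded.hex())
print(len(encoded))
```

The result is 14 bytes long because Shift-JIS needs no Shift-Out or Shift-In characters around the double byte run.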

If a character expression of some length is assigned to a CHAR column of a longer length, the field is padded with the <single byte space> character.

In Teradata mode, if a character expression is assigned to a CHAR column of a shorter length, the extra bytes are truncated. This may result in an improper string. You are not informed that this truncation has occurred.

In ANSI mode, an error occurs if a nonblank character is truncated.

Shorter strings are padded with single byte spaces, regardless of whether the mode is Teradata or ANSI. Only truncation differs between the two modes.
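The padding and truncation rules can be summarized in a minimal Python sketch. The function `assign_to_char` is hypothetical, models the value as raw bytes, and uses a Python exception to stand in for the ANSI mode error.

```python
def assign_to_char(value: bytes, column_length: int, mode: str) -> bytes:
    """Model assignment of a character value to a CHAR(column_length) column."""
    if mode not in ("ANSI", "Teradata"):
        raise ValueError(f"unknown session mode: {mode}")
    if len(value) <= column_length:
        # Shorter values are padded with single byte spaces in both modes.
        return value.ljust(column_length, b" ")
    kept, dropped = value[:column_length], value[column_length:]
    if mode == "ANSI" and dropped.strip(b" "):
        # ANSI mode: truncating a nonblank character is an error.
        raise ValueError("nonblank character would be truncated")
    # Teradata mode: extra bytes are dropped silently.
    return kept


print(assign_to_char(b"abc", 5, "Teradata"))
print(assign_to_char(b"abcdef", 4, "Teradata"))
```

Because truncation in this sketch counts bytes rather than characters, cutting a mixed string at an odd offset can split a double byte character, which is one way the improper string mentioned above can arise.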

Each of the valid SQL terminal symbols is translated into the same internal code for KanjiEBCDIC, KanjiEUC, and KanjiShift-JIS. Thus, a string containing only SQL terminal symbols is retrievable from any of the three character sets.