Teradata RDBMS for UNIX SQL Reference - NCR

NCR Teradata RDBMS for UNIX SQL Reference - NCR, 1997. - 913 p.


Objects Not Accessible Across Character Sets

Constraints with Non-Kanji Character Sets

Accessing and Sharing Japanese Character Objects and Data

All objects and data created under one of the supported character sets can be accessed under the same character set via the standard Teradata RDBMS facilities (Teradata SQL, client utilities and interfaces, console utilities, and so forth). This means that a particular user will be able to access objects and retrieve data in the same form in which they were created or inserted.

All SQL terminal symbols (simple Latin letters, digits (0-9), the dollar sign ($), the number sign (#), and the underscore (_)) always have the same canonical representation and thus can be shared across all character sets.

Single byte Katakana (Hankaku) characters from KanjiEBCDIC and KanjiShift-JIS are translated into canonical form; this allows single byte character data to be shared between KanjiEBCDIC and KanjiShift-JIS users.

Multibyte characters are stored in the dictionary tables without translation. This means that any name could be stored in one of three encodings: KanjiEBCDIC, KanjiEUC, or KanjiShift-JIS. Thus, an object name created under a particular character set will be accessible only by users in the same environment running under the same character set, unless the object name is referenced by its internal hexadecimal representation.

For example, if an IBM user running under a KanjiEBCDIC character set creates object names containing multibyte characters, another IBM user running under the standard EBCDIC character set will not be able to access those objects easily.
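The effect of storing multibyte names without translation can be illustrated outside the RDBMS. The following Python sketch is not Teradata code: it uses Python's standard `shift_jis` and `euc_jp` codecs as stand-ins for the KanjiShift-JIS and KanjiEUC client encodings (an assumption; the Teradata character sets may differ in detail), and both sample names are hypothetical.

```python
# Illustration only: Python's built-in codecs stand in for the
# KanjiShift-JIS and KanjiEUC client character sets (an assumption).

def name_bytes_hex(name: str, codec: str) -> str:
    """Return the encoded form of an object name as an uppercase hex string."""
    return name.encode(codec).hex().upper()

kanji_name = "表"       # hypothetical object name containing one Kanji character
latin_name = "EMP_1"    # hypothetical name using only Latin letters, a digit, and an underscore

sjis_hex = name_bytes_hex(kanji_name, "shift_jis")
euc_hex = name_bytes_hex(kanji_name, "euc_jp")

# The same Kanji name produces different byte sequences under the two encodings,
# so an untranslated stored name matches only the encoding it was created under.
print("Kanji name:", sjis_hex, "vs", euc_hex)

# A name built from simple Latin characters has the same bytes under both.
print("Latin name:", name_bytes_hex(latin_name, "shift_jis"),
      "vs", name_bytes_hex(latin_name, "euc_jp"))
```

The hexadecimal strings printed here correspond to the idea of an internal hexadecimal representation mentioned above; the exact syntax for referencing a name by that representation is not shown in this section and is not assumed here.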

This restriction can be avoided if all names are created using only simple Latin letters and the standard SQL terminal symbols.

Note: If non-canonical characters are used in object names, make sure that there will be no naming conflicts when the RDBMS does support a canonical form. For example, do not use the same table name, in different character sets, in the same database.
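As a rough aid for following this advice, a site could screen proposed object names before creating them. The sketch below is a hypothetical helper, not part of any Teradata utility: `is_portable_name` accepts only the characters the text lists as having one canonical representation (simple Latin letters, digits, `$`, `#`, and `_`). It does not check any other naming rule, such as length limits or reserved words.

```python
import re

# Characters described above as shared across all character sets:
# simple Latin letters, digits 0-9, dollar sign, number sign, underscore.
_PORTABLE_NAME = re.compile(r"[A-Za-z0-9$#_]+")

def is_portable_name(name: str) -> bool:
    """Return True if every character of the name is in the shared set.

    Hypothetical helper for illustration; other naming rules are not checked.
    """
    return _PORTABLE_NAME.fullmatch(name) is not None

for candidate in ("SALES_1997$", "売上"):
    print(candidate, is_portable_name(candidate))
```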

On a Japanese character site, all character data is expected to be mixed single byte character/multibyte character.

Therefore, even when the current session is running standard EBCDIC or ASCII, certain character configurations are interpreted either as starting a multibyte character string or as control characters.

If users on a Japanese character site wish to use the standard EBCDIC or ASCII character sets, the following restrictions apply to name and data characters:

These restrictions apply equally to ASCII and EBCDIC character sets.

Do not use characters with the following client encodings, for the reasons given:

• 0x0E or 0x0F, because they are interpreted as the Shift-Out or Shift-In character, respectively, which delimit the start or end of a multibyte character string.

• 0x80 or 0xFF, because they are translated internally into characters which are reserved for the ss2 and ss3 escape characters of KanjiEUC data.

• Anything that is internally translated into an encoding that falls outside the range of the defined characters in the JIS-x0201 standard, because its translation into anything recognizable is at best problematic.

Acceptable ranges for the character sets are as follows:

• ASCII users should use the 7-bit ANSI standard code (external range 0x00-0x7F, except 0x0E-0x0F and 0x10-0x14). External code points will be interpreted internally as Katakana (as defined in JIS-x0201).

• EBCDIC users should use printable single byte characters; otherwise, there is a risk of problems with uppercase conversion during comparison and sorting operations.
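The acceptable ASCII range stated above (0x00-0x7F, except 0x0E-0x0F and 0x10-0x14) can be expressed as a simple byte check. This is a minimal sketch under that reading of the text; `find_restricted_bytes` is a hypothetical name, the check covers only the ASCII rule, and it does not model the EBCDIC guidance or any internal translation performed by the RDBMS.

```python
def find_restricted_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) pairs that fall outside the ASCII range
    described in the text: 0x00-0x7F, excluding 0x0E-0x0F and 0x10-0x14.

    Hypothetical helper for illustration only.
    """
    problems = []
    for offset, value in enumerate(data):
        # 0x0E-0x0F and 0x10-0x14 together form the contiguous range 0x0E-0x14;
        # anything above 0x7F (including 0x80 and 0xFF) is outside 7-bit ASCII.
        if value > 0x7F or 0x0E <= value <= 0x14:
            problems.append((offset, value))
    return problems

# A name made of plain 7-bit characters passes; Shift-Out (0x0E) and 0xFF are flagged.
print(find_restricted_bytes(b"EMP_TABLE$1"))
print([(off, hex(val)) for off, val in find_restricted_bytes(b"A\x0eB\xff")])
```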

International and Japanese Character Support

Japanese Character Sites


Japanese characters use the Kanji Hashing Algorithm.

Introduction

On a Japanese character site, the Teradata RDBMS provides the following character support:

• Predefined translation codes for Japanese character sets are provided in a script file named KANJINST (see also the following section, “Setting Up Japanese Character Support”, for specific script file names).

• During a session running under a Japanese character set, you can create object names and insert and retrieve data strings that contain Kanji ideographs.

• When the AltCurrency flag is 1 (ON) in the RDBMS Control Record, you can format numeric data with Japanese currency symbols such as the Yen sign.

Character sets in KANJINST support IBM channel-attached clients as well as UNIX and DOS/V network-attached clients. The names of the character set (or sets) associated with each client are as follows:

• IBM (channel-attached) clients:

• KANJIEBCDIC5026_0I

• KANJIEBCDIC5035_0I

• KATAKANAEBCDIC

• UNIX network-attached clients:

• KANJIEUC_0U

• DOS/V network-attached clients: