|
Posted on 2005-11-23 17:18 by 非鱼. Category: Sybase
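Sybase Adaptive Server Enterprise offers only a narrow choice of sort orders for Chinese. With CP850 (which can carry GB2312-encoded data), quite a few sort orders are available. But the GB2312 character repertoire is too small; in most cases we need at least CP936, or even GB18030. With the CP936 and GB18030 character sets, binary is the only sort order you can choose.

The problem: if an older application ran on CP850 with a case-insensitive sort order, then after upgrading to a newer character set with the binary sort order, the application has to be changed to cope with case-sensitive comparison.

Problems like this can be solved by editing the sort-order file of the character set in question. You can even define your own sort order.

First, look at the binary sort order for GB18030. It is defined in $SYB_HOME/charsets/gb18030/binary.srt. Open the file in a text editor and you will see the following: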
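
    ; semi-colon is the comment character
    [sortorder]
    ;===============================================================================
    ;
    ; Sort Order Overview:
    ; --------------------
    ; Based on the gb18030 simplified Chinese character set, this sort
    ; order is a binary ordering.
    ;
    ;===============================================================================

    class = 0x01 ; Class `1' sort order
    id = 0x32 ; id = 50
    name = bin_gb18030
    menuname = "Binary ordering, for gb18030."
    charset = gb18030

    description = "Binary sort order for simplified Chinese using gb18030."

    binary = "true"

class, id, name, menuname, charset and description need no changes. The line binary = "true" selects binary ordering; in effect, that one line is equivalent to defining every character from 0x01 to 0xFF individually.

Next, look at a case-insensitive sort-order file from another character set, such as $SYB_HOME/charsets/iso_1/noaccents.srt (its contents are not reproduced here). You will find that case insensitivity amounts to nothing more than declaring a and A, b and B, and so on, as equal. Using that file as a reference, binary.srt can be redefined like this:

1. Set a=A ... z=Z;
2. Define the remaining characters.

With these changes, GB18030 supports case-insensitive ordering.

After editing the sort-order file, load the character set and sort order with the charset utility:

    charset -Usa -Ppassword -Sserver sort_order_file charset_name

Then, in isql, run sp_configure 'default sortorder id' with the id from the file in decimal (0x32 in the file above is 50). After restarting Sybase twice, the change of sort order is complete.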
This approach works on every ASE version from 11.9.2 through 15.0 and has held up in practice.
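The two steps (pair up a=A ... z=Z, then define the remaining characters) can also be scripted instead of typed by hand. The following is only a sketch: `build_char_lines` is a hypothetical helper, not part of any Sybase tooling, and it assumes the layout of non-alphanumeric bytes first, then digits, then letter pairs. It emits only the `char =` lines; the `[sortorder]` header still has to be written by hand.

```python
# Hypothetical helper: generates the "char =" lines of a case-insensitive
# single-byte ordering. Assumed layout: every non-alphanumeric byte
# 0x01-0xFF in code order, then digits, then upper/lower letter pairs.

def build_char_lines():
    upper = range(0x41, 0x5B)    # A-Z
    lower = range(0x61, 0x7B)    # a-z
    digits = range(0x30, 0x3A)   # 0-9
    alnum = set(upper) | set(lower) | set(digits)

    lines = []
    # Step 2 of the recipe: every byte that is not a digit or a letter.
    for code in range(0x01, 0x100):
        if code not in alnum:
            lines.append(f"char = 0x{code:02X}")
    for code in digits:
        lines.append(f"char = 0x{code:02X}")
    # Step 1 of the recipe: each upper-case letter equals its lower-case form.
    for code in upper:
        lines.append(f"char = 0x{code:02X}=0x{code + 0x20:02X}")
    return lines


if __name__ == "__main__":
    print("\n".join(build_char_lines()))
```

One consequence of this layout is worth noticing: bytes 0x80 to 0xFF are listed before the digits and letters, so they sort ahead of them, which differs from a pure binary ordering.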
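For earlier versions that do not ship with CP936 or GB18030, you can copy the character set over from a 15.0 installation.

Here is the GB18030 binary.srt file as I modified it: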

    ; semi-colon is the comment character
    [sortorder]
    ;===============================================================================
    ;
    ; Sort Order Overview:
    ; --------------------
    ; Based on the gb18030 simplified Chinese character set, this sort
    ; order is a binary ordering.
    ;
    ;===============================================================================

    class = 0x01 ; Class `1' sort order
    id = 0x32 ; id = 50
    name = bin_gb18030
    menuname = "Binary ordering, for gb18030."
    charset = gb18030
    ;preference = false ; Do not use preference

    description = "Binary sort order for simplified Chinese using gb18030."
    ; Ligatures

    ;lig = 0xC6=0xE6, after ae ; AE, ae ligature
    ;lig = 0xDF, after ss ; small german letter sharp s

    ; Control characters

    char = 0x01 ;(SOH) start of heading
    char = 0x02 ;(STX) start of text
    char = 0x03 ;(ETX) end of text
    char = 0x04 ;(EOT) end of transmission
    char = 0x05 ;(ENQ) enquiry
    char = 0x06 ;(ACK) acknowledge
    char = 0x07 ;(BEL) bell
    char = 0x08 ;(BS) backspace
    char = 0x09 ;(HT) horizontal tab
    char = 0x0A ;(LF) newline, or line feed
    char = 0x0B ;(VT) vertical tab
    char = 0x0C ;(FF) form feed
    char = 0x0D ;(CR) carriage return
    char = 0x0E ;(SO) shift out
    char = 0x0F ;(SI) shift in
    char = 0x10 ;(DLE) data link escape
    char = 0x11 ;(DC1) device control 1
    char = 0x12 ;(DC2) device control 2
    char = 0x13 ;(DC3) device control 3
    char = 0x14 ;(DC4) device control 4
    char = 0x15 ;(NAK) negative acknowledge
    char = 0x16 ;(SYN) synchronous idle
    char = 0x17 ;(ETB) end transmission blk
    char = 0x18 ;(CAN) cancel
    char = 0x19 ;(EM) end of medium
    char = 0x1A ;(SUB) substitute
    char = 0x1B ;(ESC) escape
    char = 0x1C ;(FS) file separator
    char = 0x1D ;(GS) group separator
    char = 0x1E ;(RS) record separator
    char = 0x1F ;(US) unit separator

    ; All non-alphanumeric characters, including punctuation
    ; These are sorted merely by their numerical ordering
    ; based on the ISO 8859-1 standard for clarity and
    ; consistency

    char = 0x20 ;( ) space
    char = 0x21 ;(!) exclamation mark
    char = 0x22 ;(") quotation mark
    char = 0x23 ;(#) number sign
    char = 0x24 ;($) dollar sign
    char = 0x25 ;(%) percent sign
    char = 0x26 ;(&) ampersand
    char = 0x27 ;(') apostrophe
    char = 0x28 ;(() left parenthesis
    char = 0x29 ;()) right parenthesis
    char = 0x2A ;(*) asterisk
    char = 0x2B ;(+) plus sign
    char = 0x2C ;(,) comma
    char = 0x2D ;(-) hyphen, minus sign
    char = 0x2E ;(.) full stop
    char = 0x2F ;(/) solidus
    char = 0x3A ;(:) colon
    char = 0x3B ;(;) semicolon
    char = 0x3C ;(<) less-than sign
    char = 0x3D ;(=) equals sign
    char = 0x3E ;(>) greater-than sign
    char = 0x3F ;(?) question mark
    char = 0x40 ;(@) commercial at
    char = 0x5B ;([) left square bracket
    char = 0x5C ;(\) reverse solidus
    char = 0x5D ;(]) right square bracket
    char = 0x5E ;(^) circumflex accent
    char = 0x5F ;(_) low line
    char = 0x60 ;(`) grave accent
    char = 0x7B ;({) left curly bracket
    char = 0x7C ;(|) vertical line
    char = 0x7D ;(}) right curly bracket
    char = 0x7E ;(~) tilde
    char = 0x7F ;delete, or rubout
    char = 0x80 ; undefined
    char = 0x81 ; undefined
    char = 0x82 ; undefined
    char = 0x83 ; undefined
    char = 0x84 ; undefined
    char = 0x85 ; undefined
    char = 0x86 ; undefined
    char = 0x87 ; undefined
    char = 0x88 ; undefined
    char = 0x89 ; undefined
    char = 0x8A ; undefined
    char = 0x8B ; undefined
    char = 0x8C ; undefined
    char = 0x8D ; undefined
    char = 0x8E ; undefined
    char = 0x8F ; undefined
    char = 0x90 ; undefined
    char = 0x91 ; undefined
    char = 0x92 ; undefined
    char = 0x93 ; undefined
    char = 0x94 ; undefined
    char = 0x95 ; undefined
    char = 0x96 ; undefined
    char = 0x97 ; undefined
    char = 0x98 ; undefined
    char = 0x99 ; undefined
    char = 0x9A ; undefined
    char = 0x9B ; undefined
    char = 0x9C ; undefined
    char = 0x9D ; undefined
    char = 0x9E ; undefined
    char = 0x9F ; undefined
    char = 0xA0 ;no-break space
    char = 0xA1 ;inverted exclamation mark
    char = 0xA2 ;cent sign
    char = 0xA3 ;pound sign
    char = 0xA4 ;currency sign
    char = 0xA5 ;yen sign
    char = 0xA6 ;broken bar
    char = 0xA7 ;paragraph sign, section sign
    char = 0xA8 ;diaeresis
    char = 0xA9 ;copyright sign
    char = 0xAA ;feminine ordinal indicator
    char = 0xAB ;left angle quotation mark
    char = 0xAC ;not sign
    char = 0xAD ;soft hyphen
    char = 0xAE ;registered trade mark sign
    char = 0xAF ;macron
    char = 0xB0 ;ring above or degree sign
    char = 0xB1 ;plus/minus (+/-) sign
    char = 0xB2 ;superscript 2
    char = 0xB3 ;superscript 3
    char = 0xB4 ;acute accent
    char = 0xB5 ;micro sign
    char = 0xB6 ;pilcrow or paragraph sign
    char = 0xB7 ;middle dot
    char = 0xB8 ;cedilla
    char = 0xB9 ;superscript 1
    char = 0xBA ;masculine ordinal indicator
    char = 0xBB ;right angle quotation mark
    char = 0xBC ;vulgar fraction one quarter
    char = 0xBD ;vulgar fraction one half
    char = 0xBE ;vulgar fraction three quarter
    char = 0xBF ;inverted question mark
    char = 0xC0
    char = 0xC1
    char = 0xC2
    char = 0xC3
    char = 0xC4
    char = 0xC5
    char = 0xC6
    char = 0xC7
    char = 0xC8
    char = 0xC9
    char = 0xCA
    char = 0xCB
    char = 0xCC
    char = 0xCD
    char = 0xCE
    char = 0xCF
    char = 0xD0
    char = 0xD1
    char = 0xD2
    char = 0xD3
    char = 0xD4
    char = 0xD5
    char = 0xD6
    char = 0xD7 ;multiplication sign
    char = 0xD8
    char = 0xD9
    char = 0xDA
    char = 0xDB
    char = 0xDC
    char = 0xDD
    char = 0xDE
    char = 0xDF
    char = 0xE0
    char = 0xE1
    char = 0xE2
    char = 0xE3
    char = 0xE4
    char = 0xE5
    char = 0xE6
    char = 0xE7
    char = 0xE8
    char = 0xE9
    char = 0xEA
    char = 0xEB
    char = 0xEC
    char = 0xED
    char = 0xEE
    char = 0xEF
    char = 0xF0
    char = 0xF1
    char = 0xF2
    char = 0xF3
    char = 0xF4
    char = 0xF5
    char = 0xF6
    char = 0xF7
    char = 0xF8
    char = 0xF9
    char = 0xFA
    char = 0xFB
    char = 0xFC
    char = 0xFD
    char = 0xFE
    char = 0xFF

    ; Digits

    char = 0x30 ;(0) digit zero
    char = 0x31 ;(1) digit one
    char = 0x32 ;(2) digit two
    char = 0x33 ;(3) digit three
    char = 0x34 ;(4) digit four
    char = 0x35 ;(5) digit five
    char = 0x36 ;(6) digit six
    char = 0x37 ;(7) digit seven
    char = 0x38 ;(8) digit eight
    char = 0x39 ;(9) digit nine

    ; Latin Alphabet

    char = 0x41=0x61
     ;A, a, A-grave, a-grave, A-acute, a-acute, A-circumflex, a-circumflex,
     ;A-tilde, a-tilde, A-diaeresis, a-diaeresis, A-ring, a-ring
    char = 0x42=0x62 ;letter B, b
    char = 0x43=0x63
    char = 0x44=0x64 ;letter D, d
    char = 0x45=0x65
     ;E, e, E-grave, e-grave, E-acute, e-acute, E-circumflex, e-circumflex,
     ;E-diaeresis, e-diaeresis
    char = 0x46=0x66 ;letter F, f
    char = 0x47=0x67 ;letter G, g
    char = 0x48=0x68 ;letter H, h
    char = 0x49=0x69
     ;I, i, I-grave, i-grave, I-acute, i-acute, I-circumflex, i-circumflex,
     ;I-diaeresis, i-diaeresis
    char = 0x4A=0x6A ;letter J, j
    char = 0x4B=0x6B ;letter K, k
    char = 0x4C=0x6C ;letter L, l
    char = 0x4D=0x6D ;letter M, m
    char = 0x4E=0x6E
    char = 0x4F=0x6F
     ;O, o, O-grave, o-grave, O-acute, o-acute, O-circumflex, o-circumflex,
     ;O-tilde, o-tilde, O-diaeresis, o-diaeresis, O-stroke, o-stroke
    char = 0x50=0x70 ;letter P, p
    char = 0x51=0x71 ;letter Q, q
    char = 0x52=0x72 ;letter R, r
    char = 0x53=0x73 ;letter S, s
    char = 0x54=0x74 ;letter T, t
    char = 0x55=0x75
     ;U, u, U-grave, u-grave, U-acute, u-acute, U-circumflex, u-circumflex,
     ;U-diaeresis, u-diaeresis
    char = 0x56=0x76 ;letter V, v
    char = 0x57=0x77 ;letter W, w
    char = 0x58=0x78 ;letter X, x
    char = 0x59=0x79
    char = 0x5A=0x7A ;letter Z, z

    ; Alpha characters not used in English, French or German:

    ;char = 0xD0=0xF0 ;icelandic capital letter eth, small letter eth
    ;char = 0xDE=0xFE ;icelandic capital letter thorn, small letter thorn
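To see what a file like the one above expresses, it can help to model it outside the server. The sketch below is only an illustration under simplifying assumptions: it treats each `char =` line as one rank in the ordering, gives bytes joined by `=` the same rank, and compares single-byte text only. `build_weights` and `sort_key` are hypothetical names, and nothing here reflects how ASE actually implements comparison.

```python
# Simplified model of a sort-order file: each "char =" line defines the next
# rank, and bytes joined by "=" on one line share that rank.

def build_weights(srt_lines):
    weights = {}
    rank = 0
    for raw in srt_lines:
        text = raw.split(";", 1)[0].strip()   # drop the comment part
        key, sep, value = text.partition("=")
        if not sep or key.strip() != "char":  # skips charset, class, id, ...
            continue
        rank += 1
        for code in value.split("="):
            weights[int(code.strip(), 16)] = rank
    return weights


def sort_key(text, weights):
    # Handles only bytes that the file defines (ASCII text in this sketch).
    return [weights[b] for b in text.encode("ascii")]


# A small excerpt in the same format, used as sample input.
SAMPLE = [
    "charset = gb18030",
    "char = 0x20 ;( ) space",
    "char = 0x30 ;(0) digit zero",
    "char = 0x41=0x61",
    "char = 0x42=0x62 ;letter B, b",
    ";char = 0xD0=0xF0 ;icelandic capital letter eth, small letter eth",
]

weights = build_weights(SAMPLE)
print(sort_key("aB 0", weights) == sort_key("Ab 0", weights))
print(sort_key("0", weights) < sort_key("a", weights))
```

Under this model, "aB 0" and "Ab 0" get identical keys, which is the case-insensitive behaviour the letter pairs are meant to produce.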
|