
    javajohn

    金色年華

    Chinese Characters or Unicode

    Converting between Chinese characters and Unicode escape codes

    (July 17, 2006, 11:07:58)

    1. Overview:

    If a project uses GBK encoding, converting Chinese characters is not a problem. If it uses UTF-8 instead, handling Chinese characters takes somewhat more work.
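As a rough illustration of why an ASCII-only escaped form can be easier to move between systems, here is a minimal sketch. It is not from the original post: the class name `EncodingMismatchDemo` and its method are illustrative, and it assumes the trouble arises when a reader decodes bytes with a different charset than the writer used.

```java
import java.nio.charset.StandardCharsets;

final class EncodingMismatchDemo {
    // Encodes text as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1.
    static String roundTripWithWrongCharset(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String raw = "\u4e2d\u6587";        // two real Chinese characters (U+4E2D, U+6587)
        String escaped = "\\u4e2d\\u6587";  // the same text spelled out as 12 ASCII characters

        // The raw characters are garbled by the charset mismatch.
        System.out.println(raw.equals(roundTripWithWrongCharset(raw)));         // false
        // The escaped form contains only ASCII, so it survives unchanged.
        System.out.println(escaped.equals(roundTripWithWrongCharset(escaped))); // true
    }
}
```

The escaped text is six times longer than the original two characters, so this trades size for robustness.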

    2. Implementation:

    The code is as follows:

    // Imports needed by the methods below:
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    // Convert to Unicode escapes and write them to the stream
    public static void writeUnicode(final DataOutputStream out,
            final String value) {
        try {
            final String unicode = gbEncoding(value);
            // The escaped string is pure ASCII, so the charset is stated explicitly
            final byte[] data = unicode.getBytes(StandardCharsets.US_ASCII);
            final int dataLength = data.length;

            System.out.println("Data Length is: " + dataLength);
            System.out.println("Data is: " + value);
            out.writeInt(dataLength); // first write the length of the string
            out.write(data, 0, dataLength); // then write the converted string
        } catch (IOException e) {
            // Report the failure instead of silently swallowing it
            System.err.println("writeUnicode failed: " + e.getMessage());
        }
    }


    // Escape every char of the input as a backslash-u sequence with four hex digits
    public static String gbEncoding(final String gbString) {
        final StringBuilder unicodeBytes = new StringBuilder();
        for (final char c : gbString.toCharArray()) {
            // Always pad to exactly four hex digits: decodeUnicode reads
            // fixed-width escapes, so 1- or 3-digit values would break it
            unicodeBytes.append("\\u").append(String.format("%04x", (int) c));
        }
        // System.out.println("unicodeBytes is: " + unicodeBytes);
        return unicodeBytes.toString();
    }


    /**
     * This method will decode the String to a recognized String in ui.
     * Purpose: converts backslash-u escape sequences back into the characters
     * they represent, ready to be displayed or written out (for example as UTF-8).
     * Text outside the escape sequences is copied through unchanged.
     * @author javajohn
     * @param dataStr the string to decode; may be null
     * @return the decoded text (empty if dataStr is null)
     */
    public static StringBuffer decodeUnicode(final String dataStr) {
        final StringBuffer buffer = new StringBuffer();
        if (dataStr == null) {
            return buffer;
        }
        int pos = 0;
        while (pos < dataStr.length()) {
            final int escape = dataStr.indexOf("\\u", pos);
            // No further escape, or a truncated one at the end: copy the rest as-is
            if (escape == -1 || escape + 6 > dataStr.length()) {
                buffer.append(dataStr.substring(pos));
                break;
            }
            buffer.append(dataStr, pos, escape); // plain text before the escape
            final String charStr = dataStr.substring(escape + 2, escape + 6);
            buffer.append((char) Integer.parseInt(charStr, 16)); // parse the four hex digits
            pos = escape + 6;
        }
        return buffer;
    }
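The post shows only the writing side. As a minimal sketch of how the length-prefixed data might be read back, the example below writes an escaped string to an in-memory stream and decodes it again. The class `EscapeRoundTrip` and its methods `escape`, `unescape`, and `sendAndReceive` are illustrative names rather than part of the original code, and the sketch assumes the reader knows the payload is an `int` length followed by that many ASCII bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class EscapeRoundTrip {
    // Escapes every char as a backslash-u sequence with four lowercase hex digits.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            sb.append("\\u").append(String.format("%04x", (int) c));
        }
        return sb.toString();
    }

    // Reverses escape(); assumes the input consists only of well-formed six-character escapes.
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 6 <= escaped.length(); i += 6) {
            sb.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
        }
        return sb.toString();
    }

    // Writes [int length][ASCII bytes] to memory, then reads it back and decodes it.
    static String sendAndReceive(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(sink)) {
                byte[] data = escape(text).getBytes(StandardCharsets.US_ASCII);
                out.writeInt(data.length);
                out.write(data, 0, data.length);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(sink.toByteArray()))) {
                byte[] data = new byte[in.readInt()]; // length prefix written above
                in.readFully(data);                   // blocks until all bytes are read
                return unescape(new String(data, StandardCharsets.US_ASCII));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // U+4E2D U+6587
        System.out.println(escape(original).length());                // 12
        System.out.println(sendAndReceive(original).equals(original)); // true
    }
}
```

`readFully` is used rather than `read` because a plain `read` may return fewer bytes than requested.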

    3. Conclusion:

    posted on 2006-07-17 11:07 javajohn | Views (5532) | Comments (1) | Category: 我的記憶 (My Memories)

    Feedback

    # re: Chinese Characters or Unicode 2006-07-18 17:11 小豬

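    My understanding of code units and code points:
    1. A code point may consist of one or two code units (in UTF-16).
    2. In my test program, "我" also occupies only one code unit, i.e. the number of code points equals the number of code units.
    Below are some notes I found on the official Unicode website about Chinese, Korean, and Japanese in Unicode: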
    Q: I have heard that UTF-8 does not support some Japanese characters. Is this correct?

    A: There is a lot of misinformation floating around about the support of Chinese, Japanese and Korean (CJK) characters. The Unicode Standard supports all of the CJK characters from JIS X 0208, JIS X 0212, JIS X 0221, or JIS X 0213, for example, and many more. This is true no matter which encoding form of Unicode is used: UTF-8, UTF-16, or UTF-32.

    Unicode supports over 70,000 CJK characters right now, and work is underway to encode further additions. The International Standard ISO/IEC 10646 and the Unicode Standard are completely synchronized in repertoire and content. And that means that Unicode has the same repertoire as GB 18030, since that also is synchronized with ISO 10646 — although with a different ordering and byte format.
    So whichever encoding form is used (UTF-8, UTF-16, or UTF-32), Chinese is fully supported?
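A small check of this point can be sketched as follows. The class `EncodingFormsDemo` and its helper names are illustrative, and the sketch covers only UTF-8 and UTF-16, using the charset constants from `java.nio.charset.StandardCharsets`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingFormsDemo {
    // Encodes text with the given charset and decodes it again.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Number of bytes the text occupies in the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String wo = "\u6211"; // the Chinese character U+6211
        System.out.println(byteLength(wo, StandardCharsets.UTF_8));    // 3
        System.out.println(byteLength(wo, StandardCharsets.UTF_16BE)); // 2
        System.out.println(roundTrip(wo, StandardCharsets.UTF_8).equals(wo));    // true
        System.out.println(roundTrip(wo, StandardCharsets.UTF_16BE).equals(wo)); // true
    }
}
```

The byte counts differ between the two forms, but both round-trip the character without loss, which is what the quoted FAQ answer describes.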


    My test program is as follows:
    public class test0 {
        public static void main(String[] args) {
            String a = "我 ";
            // Number of UTF-16 code units (chars) in the string
            int cuCount = a.length();
            System.out.println("the number of code units required for string \"我 \" in the UTF-16 encoding is " + cuCount);
            // Number of Unicode code points in the string
            int cpCount = a.codePointCount(0, a.length());
            System.out.println("the number of code points is " + cpCount);
            System.out.println("the end of string \"我 \" is " + a.charAt(a.length() - 1));
        }
    }

    The output is:
    the number of code units required for string "我 " in the UTF-16 encoding is 2
    the number of code points is 2
    the end of string "我 " is [space]
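The string tested above contains only characters from the Basic Multilingual Plane, so the two counts are bound to match. To see a code point that needs two code units, a supplementary character can be used instead. The following is a minimal sketch; the class `CodePointDemo` and its method names are illustrative, and U+20000 (a CJK Extension B ideograph) is chosen only as an example.

```java
final class CodePointDemo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "\u6211";                 // U+6211, inside the BMP
        String supplementary = "\uD840\uDC00"; // U+20000, stored as a surrogate pair

        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));                     // 1 1
        System.out.println(codeUnits(supplementary) + " " + codePoints(supplementary)); // 2 1
        System.out.println(Integer.toHexString(supplementary.codePointAt(0)));          // 20000
    }
}
```

A char-by-char escaper such as `gbEncoding` emits two escapes for such a character, one per surrogate, and decoding them in order restores the pair.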

    I found the Set Encoding option in Eclipse, where the encoding can be configured.

