
    javajohn

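    Golden Years

    Chinese characters or Unicode

    Converting between Chinese characters and Unicode encoding

    (July 17, 2006, 11:07:58)

    1. Overview:

    If a project uses the GBK encoding, converting Chinese characters is not a problem. If it uses UTF-8, however, handling Chinese characters is somewhat more troublesome.

    2. Implementation:

    The code below converts a string to \uXXXX escape sequences before writing it to a stream, and decodes such sequences back into characters: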

    // Imports required by the methods below (the methods belong inside a class).
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    // Convert the string to Unicode escapes and write it, length-prefixed, to the stream.
    public static void writeUnicode(final DataOutputStream out,
            final String value) {
        try {
            final String unicode = gbEncoding(value);
            // The escaped form is pure ASCII, so name the charset explicitly
            // instead of relying on the platform default.
            final byte[] data = unicode.getBytes(StandardCharsets.US_ASCII);
            final int dataLength = data.length;

            System.out.println("Data Length is: " + dataLength);
            System.out.println("Data is: " + value);
            out.writeInt(dataLength); // write the length of the string first
            out.write(data, 0, dataLength); // then write the converted string
        } catch (IOException e) {
            // Report the failure instead of silently swallowing it.
            e.printStackTrace();
        }
    }


    public static String gbEncoding(final String gbString) {
        final char[] utfBytes = gbString.toCharArray();
        final StringBuilder unicodeBytes = new StringBuilder(utfBytes.length * 6);
        for (int byteIndex = 0; byteIndex < utfBytes.length; byteIndex++) {
            String hexB = Integer.toHexString(utfBytes[byteIndex]);
            // Left-pad to exactly four hex digits; decodeUnicode expects
            // every escape to be six characters long.
            while (hexB.length() < 4) {
                hexB = "0" + hexB;
            }
            unicodeBytes.append("\\u").append(hexB);
        }
        return unicodeBytes.toString();
    }


    /**
     * Decodes a string containing backslash-u escapes (as produced by
     * gbEncoding) back into readable text for the UI. Text outside the
     * escapes is copied through unchanged.
     *
     * @author javajohn
     * @param dataStr the escaped string; may be null
     * @return a buffer holding the decoded text (empty if dataStr is null)
     */
    public static StringBuffer decodeUnicode(final String dataStr) {
        final StringBuffer buffer = new StringBuffer();
        if (dataStr == null) {
            return buffer;
        }
        int pos = 0;
        while (pos < dataStr.length()) {
            final int escape = dataStr.indexOf("\\u", pos);
            if (escape == -1 || escape + 6 > dataStr.length()) {
                // No complete escape remains: copy the rest as-is.
                buffer.append(dataStr.substring(pos));
                break;
            }
            buffer.append(dataStr, pos, escape); // literal text before the escape
            // Parse the four hex digits that follow the backslash-u prefix.
            final char letter = (char) Integer.parseInt(
                    dataStr.substring(escape + 2, escape + 6), 16);
            buffer.append(letter);
            pos = escape + 6;
        }
        return buffer;
    }
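    The listing above writes a length prefix followed by the escaped bytes, but does not show the reading side. The following is a minimal sketch of a full round trip through an in-memory stream, assuming the reader mirrors the writer (read the int length, then exactly that many bytes). The class name EscapeRoundTrip and the helpers escape, unescape, write, read, and roundTrip are hypothetical names chosen for illustration; they are not part of the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class EscapeRoundTrip {

    // Escape every UTF-16 code unit as backslash-u plus exactly four hex digits.
    public static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() * 6);
        for (int i = 0; i < text.length(); i++) {
            String hex = Integer.toHexString(text.charAt(i));
            sb.append("\\u");
            for (int pad = hex.length(); pad < 4; pad++) {
                sb.append('0');
            }
            sb.append(hex);
        }
        return sb.toString();
    }

    // Return the value of four hex digits starting at 'start', or -1 if they are missing.
    private static int hex4(String s, int start) {
        if (start + 4 > s.length()) {
            return -1;
        }
        int value = 0;
        for (int i = start; i < start + 4; i++) {
            int digit = Character.digit(s.charAt(i), 16);
            if (digit < 0) {
                return -1;
            }
            value = value * 16 + digit;
        }
        return value;
    }

    // Decode well-formed escapes; anything else is copied through unchanged.
    public static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder(escaped.length());
        int i = 0;
        while (i < escaped.length()) {
            int value = escaped.startsWith("\\u", i) ? hex4(escaped, i + 2) : -1;
            if (value >= 0) {
                sb.append((char) value);
                i += 6;
            } else {
                sb.append(escaped.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }

    // Writer side: length prefix first, then the ASCII bytes of the escaped text.
    public static void write(DataOutputStream out, String text) throws IOException {
        byte[] data = escape(text).getBytes(StandardCharsets.US_ASCII);
        out.writeInt(data.length);
        out.write(data);
    }

    // Reader side (assumed counterpart): read the length, then exactly that many bytes.
    public static String read(DataInputStream in) throws IOException {
        byte[] data = new byte[in.readInt()];
        in.readFully(data);
        return unescape(new String(data, StandardCharsets.US_ASCII));
    }

    // Write to an in-memory buffer and read the value straight back.
    public static String roundTrip(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                write(out, text);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return read(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters U+4E2D U+6587
        System.out.println(escape(text));
        System.out.println(roundTrip(text).equals(text)); // prints "true"
    }
}
```

    Because every code unit is escaped to a fixed six characters, the payload is pure ASCII and its byte length is always six times the string length, which keeps the length prefix independent of the platform's default charset.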

    3. Conclusion:

    posted on 2006-07-17 11:07 javajohn  Views (5543)  Comments (1)  Category: My Memories

    Feedback

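    # re: Chinese characters or Unicode 2006-07-18 17:11 小豬

    My understanding of code units and code points:
    1. In UTF-16, one code point may consist of one or two code units.
    2. In my test program, "我" occupies only one code unit, so the number of code points equals the number of code units.
    Below are some notes on Chinese, Korean, and Japanese support that I found on the official Unicode website: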
    Q: I have heard that UTF-8 does not support some Japanese characters. Is this correct?

    A: There is a lot of misinformation floating around about the support of Chinese, Japanese and Korean (CJK) characters. The Unicode Standard supports all of the CJK characters from JIS X 0208, JIS X 0212, JIS X 0221, or JIS X 0213, for example, and many more. This is true no matter which encoding form of Unicode is used: UTF-8, UTF-16, or UTF-32.

    Unicode supports over 70,000 CJK characters right now, and work is underway to encode further additions. The International Standard ISO/IEC 10646 and the Unicode Standard are completely synchronized in repertoire and content. And that means that Unicode has the same repertoire as GB 18030, since that also is synchronized with ISO 10646 — although with a different ordering and byte format.
    So whichever encoding form is used (UTF-8, UTF-16, or UTF-32), Chinese is fully supported?
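    A small check of that reading: the sketch below encodes the character 我 (U+6211) in two encoding forms and decodes it back. The class name EncodingForms and its helper methods are hypothetical. UTF-32 is left out only because java.nio.charset.StandardCharsets has no constant for it; this is a limitation of the sketch, not of the encoding form.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingForms {

    // Number of bytes the text occupies in the given charset.
    public static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // True if encoding and then decoding returns the original text.
    public static boolean roundTrips(String text, Charset charset) {
        return new String(text.getBytes(charset), charset).equals(text);
    }

    public static void main(String[] args) {
        String wo = "\u6211"; // the character U+6211, written as an escape to avoid source-encoding issues
        Charset[] forms = { StandardCharsets.UTF_8, StandardCharsets.UTF_16BE };
        for (Charset form : forms) {
            System.out.println(form.name() + ": " + encodedLength(wo, form)
                    + " bytes, round trip ok = " + roundTrips(wo, form));
        }
    }
}
```

    The byte counts differ (three bytes in UTF-8, two in UTF-16BE), but both forms represent the character without loss.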


    My test program is as follows:
    public class test0 {
        public static void main(String[] args) {
            String a = "我 ";
            int cuCount = a.length();
            System.out.println("the number of code units required for string \"test\" in the UTF-16 encoding is " + cuCount);
            int cpCount = a.codePointCount(0, a.length());
            System.out.println("the number of code points is " + cpCount);
            System.out.println("the end of string \"我 \" is " + a.charAt(a.length() - 1));
        }
    }

    The output is:
    the number of code units required for string "test" in the UTF-16 encoding is 2
    the number of code points is 2
    the end of string "我 " is [space]
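
    The two counts match here because both characters in the test string lie in the Basic Multilingual Plane. For contrast, here is a minimal sketch using U+20000, a CJK character outside the BMP, where one code point takes two UTF-16 code units. The class name CodePointDemo and its helpers are hypothetical.

```java
public class CodePointDemo {

    // Number of UTF-16 code units (what String.length() reports).
    public static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the whole string.
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "\u6211"; // U+6211, inside the BMP
        String supplementary = new String(Character.toChars(0x20000)); // U+20000, outside the BMP

        System.out.println(codeUnits(bmp) + " code unit(s), " + codePoints(bmp) + " code point(s)");
        System.out.println(codeUnits(supplementary) + " code unit(s), "
                + codePoints(supplementary) + " code point(s)");
        // The first char of the supplementary string is a high surrogate.
        System.out.println(Character.isHighSurrogate(supplementary.charAt(0))); // prints "true"
    }
}
```

    A per-char escaping scheme such as the one in the post therefore emits two escapes (a surrogate pair) for a character like this, which still decodes correctly as long as both halves are kept together.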

    I found the "Set Encoding" option in Eclipse, where the encoding can be configured.

