
[Repost] Parsing an XML file to get the value of encoding

Reposted from http://faq.csdn.net/read/207889.html

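An important task completed for DOM Level 3 was aligning the DOM data model with the XML Information Set (Infoset) by adding new methods for querying XML Infoset information that was previously missing. For example, the Document interface (which maps to the Infoset document information item) now lets you query and modify the information stored in an XML declaration, such as version, standalone, and encoding. Similarly, the base URI and declaration base URI properties are handled according to XML Base and are exposed on the Node interface. You can also get the XML Infoset element content whitespace property, which indicates whether a Text node contains only ignorable whitespace. This property is available through the Text interface (which maps to the XML Infoset character information items). Listing 1 shows the actual method signatures for these interfaces in the Java language binding.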

Listing 1. Method signatures in the Java language binding

// XML declaration information on
// the org.w3c.dom.Document interface
public String getXmlEncoding();
// Note: this setter comes from the draft described here; in the final
// DOM Level 3 Core Recommendation xmlEncoding is read-only.
public void setXmlEncoding(String xmlEncoding);
public boolean getXmlStandalone();
public void setXmlStandalone(boolean xmlStandalone)
                                  throws DOMException;
public String getXmlVersion();
public void setXmlVersion(String xmlVersion)
                                  throws DOMException;

// element content whitespace property on the Text interface
// (named isElementContentWhitespace() in the final Recommendation)
public boolean isWhitespaceInElementContent();
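
As a rough illustration of the declaration accessors, the sketch below reads the encoding, version, and standalone values through the standard JAXP parser that ships with the JDK (Java 5 or later, where org.w3c.dom.Document carries the DOM Level 3 methods). The class name XmlDeclarationInfo and the helper describe are hypothetical names chosen for this example; they do not come from the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, used only for this illustration.
final class XmlDeclarationInfo {

    /** Parses the bytes and returns "encoding/version/standalone" from the XML declaration. */
    static String describe(byte[] xmlBytes) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Parse from bytes so the parser itself reads the encoding declaration.
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getXmlEncoding() + "/" + doc.getXmlVersion() + "/" + doc.getXmlStandalone();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version='1.0' encoding='ISO-8859-1' standalone='yes'?>"
                + "<Message>Hi there</Message>";
        System.out.println(describe(xml.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Because only the org.w3c.dom.Document interface is used, this variant does not depend on a particular parser implementation class.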

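Through the schemaTypeInfo attribute of the Attr interface, you can also get the value of the attribute type property of an attribute information item, that is, the type of an attribute. A later section of the original article covers this in more detail.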
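
The following sketch shows one way the schemaTypeInfo attribute might be read in the Java binding, via Attr.getSchemaTypeInfo(). The class AttrTypeInfoSketch, the helper typeInfoOf, and the sample DTD are assumptions made for illustration. What the returned TypeInfo reports depends on the parser and on whether a DTD or schema was applied; the type name may well be null.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.TypeInfo;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, used only for this illustration.
final class AttrTypeInfoSketch {

    /** Returns the TypeInfo of the named attribute on the document element. */
    static TypeInfo typeInfoOf(String xml, String attrName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Attr attr = doc.getDocumentElement().getAttributeNode(attrName);
            if (attr == null) {
                throw new IllegalArgumentException("No such attribute: " + attrName);
            }
            return attr.getSchemaTypeInfo();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML input", e);
        }
    }

    public static void main(String[] args) {
        // Sample document with an internal DTD subset declaring "id" as an ID attribute.
        String xml = "<!DOCTYPE Message ["
                + "<!ELEMENT Message (#PCDATA)>"
                + "<!ATTLIST Message id ID #IMPLIED>"
                + "]><Message id='m1'>Hi there</Message>";
        TypeInfo info = typeInfoOf(xml, "id");
        // The values printed here are parser-dependent and may be null.
        System.out.println("type name: " + info.getTypeName());
        System.out.println("type namespace: " + info.getTypeNamespace());
    }
}
```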

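In addition, a new feature returns the Document in the form closest to the XML Infoset. Before this, various editing operations (such as inserting or removing nodes) would typically cause the document to drift further away from the XML Infoset. This is part of what a document normalization operation can do, which the original article describes in its section on document normalization.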
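
A small sketch of document normalization, assuming the standard Document.normalizeDocument() method from DOM Level 3: an edit leaves two adjacent Text nodes under one element, and normalization merges them as a save-and-reload cycle would. The class and method names are hypothetical and chosen only for this example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, used only for this illustration.
final class NormalizeDocumentSketch {

    /** Returns the root element's child count before and after normalizeDocument(). */
    static int[] childCountsBeforeAndAfter() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            // createElementNS gives a namespace-aware (DOM Level 2+) element node.
            Element root = doc.createElementNS(null, "Message");
            doc.appendChild(root);
            // Two separate appends leave two adjacent Text nodes,
            // a shape that parsing the serialized document would not produce.
            root.appendChild(doc.createTextNode("Hi "));
            root.appendChild(doc.createTextNode("there"));
            int before = root.getChildNodes().getLength();
            doc.normalizeDocument();
            int after = root.getChildNodes().getLength();
            return new int[] {before, after};
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DocumentBuilder", e);
        }
    }

    public static void main(String[] args) {
        int[] counts = childCountsBeforeAndAfter();
        System.out.println("children before: " + counts[0] + ", after: " + counts[1]);
    }
}
```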

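Finally, the new Appendix C provides a mapping between the XML Infoset model and the DOM. In this mapping, each XML Infoset information item maps to its corresponding Node and vice versa, and each property of an information item maps to a property of the corresponding Node. This appendix should give you a good overview of the DOM data model and show how to access the information you are looking for.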

---------------------------------------------------------------

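// Test and fileName come from the poster's own project.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream in = Test.class.getResourceAsStream(fileName);
// The cast works only when Apache Xerces is the active JAXP parser.
DocumentImpl doc = (DocumentImpl) builder.parse(in);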

---------------------------------------------------------------

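import java.io.FileInputStream;
import java.io.InputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.xerces.dom.DocumentImpl;

// Inside a method that may throw Exception; args[0] is the path to the XML file.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream in = new FileInputStream(args[0]);
// The cast works only when Apache Xerces is the active JAXP parser.
DocumentImpl doc = (DocumentImpl) builder.parse(in);
// Prints the encoding named in the XML declaration.
System.out.println(doc.getXmlEncoding());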

============================================================================
import org.dom4j.Document;
import org.dom4j.DocumentHelper;

// dom4j: parseText throws org.dom4j.DocumentException on malformed input.
String xml = "<?xml version='1.0' encoding='iso-8859-1'?><Message>Hi there</Message>";
Document doc = DocumentHelper.parseText(xml);
System.out.println("The encoding is " + doc.getXMLEncoding());
System.out.println("As XML: " + doc.asXML());

The result is:

The encoding is iso-8859-1
As XML: <?xml version="1.0" encoding="iso-8859-1"?>
<Message>Hi there</Message>

=================================

String xml = "<?xml version='1.0' encoding='UTF-8'?><Message>Hi there</Message>";
Document doc = DocumentHelper.parseText(xml);
System.out.println("The encoding is " + doc.getXMLEncoding());
System.out.println("As XML: " + doc.asXML());

The result is:

The encoding is UTF-8
As XML: <?xml version="1.0" encoding="UTF-8"?>
<Message>Hi there</Message>

====================================

String xml = "<?xml version='1.0' encoding='GBK'?><Message>Hi there</Message>";
Document doc = DocumentHelper.parseText(xml);
System.out.println("The encoding is " + doc.getXMLEncoding());
System.out.println("As XML: " + doc.asXML());

The result is:

The encoding is GBK
As XML: <?xml version="1.0" encoding="GBK"?>
<Message>Hi there</Message>

posted on 2008-11-13 10:33 by fatbear, category: XML/XSLT
