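An aside: in my own experience of learning, when you first meet a new technology, a thorough classic text is not necessarily the best starting point. If you want a gentle learning curve, the best approach is to hear someone who has already learned it explain the overall concepts and to look at a working example. Once you have an intuitive feel for it, throwing yourself into serious study works remarkably well. That is also why I wrote this article. It is a rough piece, so please bear with it.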
SAX (Simple API for XML) is an event-driven API for XML.
To parse an XML document with SAX, and to have your own code run when certain events occur during parsing, prepare three things first:
- Get an XML parser: Xerces can be downloaded for free from xml.apache.org.
- Get the SAX classes: the Xerces parser mentioned above already includes them; remember to put them on your classpath.
- Get an XML document: I trust you can manage this one yourself.
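If you would rather not download anything, here is a minimal sketch of getting a reader from the parser that ships with the JDK. It assumes a reasonably modern JDK, where JAXP's SAXParserFactory is available out of the box; the class name BuiltInParserDemo and the helper methods are made up for this illustration and are not part of the original example.

```java
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the name is not from the article.
class BuiltInParserDemo {

    // Asks JAXP for whatever SAX parser the runtime provides.
    static XMLReader newReader() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so callbacks receive a usable localName
        return factory.newSAXParser().getXMLReader();
    }

    // Returns true if the XML text is well-formed, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            XMLReader reader = newReader();
            DefaultHandler handler = new DefaultHandler(); // no-op callbacks
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler); // its fatalError() rethrows the exception
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            return false; // the document is not well-formed
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("parser class: " + newReader().getClass().getName());
        System.out.println("well-formed? " + parses("<greeting>hello</greeting>"));
    }
}
```

The reader you get this way can be used exactly like one created by class name, so the rest of the flow stays the same.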
The code below largely speaks for itself and shows the basic flow; only the ContentHandler interface needs a little explanation. SAX 2.0 defines four core handler interfaces: ContentHandler, ErrorHandler, DTDHandler, and EntityResolver. The first two are the ones used most often. EntityResolver deals with entities in XML, and now that schemas are more popular than DTDs, DTDHandler probably needs little attention. These handlers are set on a parser. While the parser works through the XML document, whenever a particular event occurs it calls the corresponding method of the handler. What that method does is, of course, up to you, which is all a callback is. That is SAX in a nutshell.
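Since ErrorHandler is the other handler you will meet often, here is a small sketch of setting one on a reader. It collects problems in a list instead of printing them. The class names ErrorHandlerDemo and CollectingErrorHandler and the method problemsIn are invented for this illustration, and the reader comes from the JDK's built-in parser rather than from Xerces by class name, which is my assumption and not part of the original example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the names are not from the article.
class ErrorHandlerDemo {

    // Records every problem the parser reports.
    static class CollectingErrorHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) { record("warning", e); }

        @Override
        public void error(SAXParseException e) { record("error", e); }

        @Override
        public void fatalError(SAXParseException e) throws SAXParseException {
            record("fatal", e);
            throw e; // parsing cannot sensibly continue after a fatal error
        }

        private void record(String level, SAXParseException e) {
            messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
        }
    }

    // Parses the text and returns whatever the error handler collected.
    static List<String> problemsIn(String xml) {
        CollectingErrorHandler errors = new CollectingErrorHandler();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler());
            reader.setErrorHandler(errors); // set on the reader just like a ContentHandler
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Already recorded by fatalError() above.
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
        return errors.messages;
    }

    public static void main(String[] args) {
        System.out.println(problemsIn("<a><b></a>")); // mismatched tags
        System.out.println(problemsIn("<a><b/></a>")); // well-formed
    }
}
```

The pattern is the same as for ContentHandler: the parser decides when to call, and you decide what happens.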
After reading and running the code below, I believe you will understand how SAX works; what remains is to get familiar with the methods of the other handlers. I hope this article helps you grasp SAX quickly. Many details are left for you to explore on your own.

import java.io.IOException;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class Practice1 {

    // Name of the reader class to instantiate.
    private String vendroParserClass = "org.apache.xerces.parsers.SAXParser";

    // Path of the XML file to read, relative to the bin directory.
    private String uri = "xmlDocuments/xmlPractise.xml";

    // When this is 0, debugPrint() prints nothing.
    private final int debug = 1;

    public void test() throws IOException, SAXException {
        // The factory method creates a reader instance from the class name.
        // If it fails, the SAXException propagates to the caller; swallowing it
        // here would only lead to a NullPointerException a few lines later.
        XMLReader reader = XMLReaderFactory.createXMLReader(vendroParserClass);
        // Wrap the URI in an InputSource. reader.parse() also accepts a plain
        // URI string, but InputSource is the better choice.
        InputSource inputSource = new InputSource(uri);
        // Set the content handler.
        reader.setContentHandler(new MyContentHandler());
        // Run the parse.
        reader.parse(inputSource);
        System.out.println("test completed.");
    }

    public void debugPrint(String msg) {
        if (debug > 0) {
            System.out.print(msg);
        }
    }

    // Content handler
    class MyContentHandler implements ContentHandler {
        // The locator reports the current position in the document being parsed.
        // It is only valid during parsing, so do not touch it once parsing has finished.
        private Locator locator;

        // Called first, at the very beginning of parsing.
        public void setDocumentLocator(Locator locator) {
            debugPrint("setDocumentLocator get called.\n");
            this.locator = locator;
        }

        // Called at the start of the XML document.
        public void startDocument() throws SAXException {
            debugPrint("startDocument() get called.\n");
        }

        // Called at the end of the XML document.
        public void endDocument() throws SAXException {
            debugPrint("endDocument() get called.\n");
        }

        // Called when a namespace prefix mapping comes into scope.
        public void startPrefixMapping(String prefix, String uri)
                throws SAXException {
            debugPrint("start of : uri: " + uri + ", prefix: " + prefix + ".\n");
        }

        // Called when a namespace prefix mapping goes out of scope.
        // Only the prefix is passed in; the original version also printed "uri",
        // which here would have been the file path field of the outer class.
        public void endPrefixMapping(String prefix) throws SAXException {
            debugPrint("end of: prefix: " + prefix + ".\n");
        }

        // Called at the start of an element.
        // uri is the namespace, localName is the element name without a prefix,
        // qName is the prefixed name, and atts is the attribute list.
        public void startElement(String uri, String localName, String qName,
                Attributes atts) throws SAXException {
            debugPrint("<" + localName + ">");
        }

        // Called at the end of an element.
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            debugPrint("</" + localName + ">");
        }

        // Called when text content appears.
        // start and length give the offset and length within the char array.
        // Converting to a String before handling the content is a good approach.
        // Note that the parser may deliver one run of text in several calls.
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            String s = new String(ch, start, length);
            debugPrint(s);
        }

        // Called when ignorable whitespace appears. There is a lot to say about
        // whitespace in XML, so we will not go into it here.
        public void ignorableWhitespace(char[] ch, int start, int length)
                throws SAXException {
            // Intentionally left empty.
        }

        // Called when a processing instruction appears.
        public void processingInstruction(String target, String data)
                throws SAXException {
            debugPrint("PI target: '" + target + "', data: '" + data + "'.");
        }

        // Called when an entity in the XML file is skipped by a
        // non-validating parser.
        public void skippedEntity(String name) throws SAXException {
            // Intentionally left empty.
        }

    }

    public static void main(String[] args) throws IOException, SAXException {
        new Practice1().test();
    }
}
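One detail about characters() is worth a sketch of its own: the parser may hand you the text of a single element in several pieces, so it is safer to collect the pieces in a buffer and use the result in endElement(). The example below also shows the Locator in use and extends DefaultHandler so that only the interesting callbacks need to be written. The names TextCollectorDemo, TextCollector and collect are invented for this illustration, the reader comes from the JDK's built-in parser (my assumption), and resetting the buffer in startElement() is a simplification that only suits elements containing plain text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the names are not from the article.
class TextCollectorDemo {

    // DefaultHandler supplies empty versions of every callback,
    // so only the ones of interest are overridden here.
    static class TextCollector extends DefaultHandler {
        final List<String> lines = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private Locator locator;

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start a fresh buffer for this element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                // The locator is only meaningful while parsing is in progress.
                int line = (locator != null) ? locator.getLineNumber() : -1;
                lines.add(qName + "=" + value + " (line " + line + ")");
            }
            text.setLength(0);
        }
    }

    // Parses the text and returns one entry per element that contained text.
    static List<String> collect(String xml) {
        TextCollector handler = new TextCollector();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<order><item>tea &amp; milk</item><item>bread</item></order>";
        for (String line : collect(xml)) {
            System.out.println(line);
        }
    }
}
```

The entity reference in the first item is there on purpose: it is a typical spot where a parser splits the text into more than one characters() call.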
Posted on 2006-09-07 16:57 by Ye Yiliang. Category: Java