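A side note: in my own experience, an exhaustive classic text is not necessarily the best starting point when you first meet a new technology. If you want a gentle learning curve, the best approach is to hear someone who has already learned it describe the overall concepts and to look at a working example. Once you have an intuitive feel for it, serious study goes much better. That is why I wrote this post. It is a rough piece, so please bear with it.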
SAX, the Simple API for XML, is an event-driven interface for parsing XML.
Before you can use SAX to parse an XML document and run your own code when certain events occur, you need three things:
- An XML parser: download one for free from xml.apache.org.
- The SAX classes: the Xerces parser from xml.apache.org already includes them; remember to put them on your classpath.
- An XML document: I trust you can manage that one yourself.
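If you would rather not download anything, current JDKs bundle a JAXP parser that can hand you an XMLReader as well. The sketch below is only an illustration under that assumption; the class name JdkReaderSketch and its methods are made up for this example and are not part of the original listing.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// JdkReaderSketch and its method names are hypothetical, chosen for this illustration.
final class JdkReaderSketch {

    // Obtains a namespace-aware XMLReader from the JAXP factory bundled with the JDK.
    static XMLReader newReader() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // report namespace URIs and local names
        return factory.newSAXParser().getXMLReader();
    }

    // Parses an in-memory document and counts its elements, as a quick smoke test.
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            XMLReader reader = newReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><c/></a>")); // prints 3
    }
}
```

The reader obtained this way can be used in place of the Xerces-specific class name, since both implement org.xml.sax.XMLReader.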
The code that follows largely explains the flow on its own; only the ContentHandler interface needs a few words. SAX 2.0 defines four core handler interfaces: ContentHandler, ErrorHandler, DTDHandler, and EntityResolver. The first two are the ones most often used. EntityResolver deals with entities in XML, and now that schemas are more popular, DTDHandler probably needs little attention. These handlers are set on a parser; when a particular event occurs while the parser is working through the XML document, the corresponding handler method is called. What those methods do is, of course, up to you: that is what a callback is. That is SAX in a nutshell.
After reading and running the code below, I believe you will have a clear picture of how SAX works. What remains is to get familiar with the methods of the other handlers. I hope this post helps you understand SAX quickly; there are still plenty of details left for you to explore on your own.
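The main listing only wires up a ContentHandler. As a small aside on ErrorHandler, the other commonly used handler, here is a rough sketch that collects parse problems into a list. It assumes the parser bundled with the JDK rather than Xerces, and the names ErrorHandlerSketch, CollectingErrorHandler and problemsIn are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// All class and method names here are hypothetical, chosen for this illustration.
final class ErrorHandlerSketch {

    // Records problems instead of printing them, so the caller can inspect them.
    static final class CollectingErrorHandler implements ErrorHandler {
        final List<String> problems = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            problems.add(describe("warning", e));
        }

        @Override
        public void error(SAXParseException e) {
            problems.add(describe("error", e));
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            problems.add(describe("fatal", e));
            throw e; // a document that is not well-formed cannot be parsed further
        }

        private static String describe(String level, SAXParseException e) {
            return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
        }
    }

    // Parses an in-memory document and returns whatever the handler recorded.
    static List<String> problemsIn(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Already recorded by fatalError(); nothing more to do here.
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
        return handler.problems;
    }

    public static void main(String[] args) {
        System.out.println(problemsIn("<a><b/></a>")); // well-formed: prints []
        System.out.println(problemsIn("<a><b></a>"));  // mismatched tag: one fatal entry
    }
}
```

An ErrorHandler is registered with reader.setErrorHandler(...) in the same way a ContentHandler is registered with reader.setContentHandler(...).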
import java.io.IOException;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class Practice1 {

    // Name of the reader class to instantiate (Xerces must be on the classpath).
    private String vendorParserClass = "org.apache.xerces.parsers.SAXParser";

    // Path of the XML file, relative to the bin directory.
    private String uri = "xmlDocuments/xmlPractise.xml";

    // When this is 0, debugPrint() prints nothing.
    private final int debug = 1;

    public void test() throws IOException, SAXException {
        // The factory method returns a reader instance for the given class name.
        // If the class cannot be loaded it throws a SAXException, which is left
        // to propagate instead of carrying on with a null reader.
        XMLReader reader = XMLReaderFactory.createXMLReader(vendorParserClass);

        // Wrap the URI in an InputSource. reader.parse() also accepts a plain
        // URI string, but an InputSource is the better choice.
        InputSource inputSource = new InputSource(uri);

        // Set the content handler.
        reader.setContentHandler(new MyContentHandler());

        // Run the parse.
        reader.parse(inputSource);
        System.out.println("test completed.");
    }

    public void debugPrint(String msg) {
        if (debug > 0) {
            System.out.print(msg);
        }
    }

    // The content handler.
    class MyContentHandler implements ContentHandler {

        // The locator reports the current position in the document being parsed.
        // It is only valid during parsing, so do not touch it once parsing is done.
        private Locator locator;

        // Called first, at the very beginning of the parse.
        public void setDocumentLocator(Locator locator) {
            debugPrint("setDocumentLocator got called.\n");
            this.locator = locator;
        }

        // Called at the start of the XML document.
        public void startDocument() throws SAXException {
            debugPrint("startDocument() got called.\n");
        }

        // Called at the end of the XML document.
        public void endDocument() throws SAXException {
            debugPrint("endDocument() got called.\n");
        }

        // Called when a namespace prefix mapping comes into scope.
        public void startPrefixMapping(String prefix, String uri)
                throws SAXException {
            debugPrint("start of: uri: " + uri + ", prefix: " + prefix + ".\n");
        }

        // Called when a namespace prefix mapping goes out of scope.
        // Only the prefix is passed in; the original version printed the outer
        // class's file path field here by mistake.
        public void endPrefixMapping(String prefix) throws SAXException {
            debugPrint("end of: prefix: " + prefix + ".\n");
        }

        // Called at the start of an element.
        // uri is the namespace, localName is the element name without its prefix,
        // qName is the prefix plus the element name, and atts is the attribute list.
        public void startElement(String uri, String localName, String qName,
                Attributes atts) throws SAXException {
            debugPrint("<" + localName + ">");
        }

        // Called at the end of an element.
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            debugPrint("</" + localName + ">");
        }

        // Called when text content is encountered.
        // start and length delimit the relevant slice of the char array.
        // Converting it to a String before processing is a good approach.
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            String s = new String(ch, start, length);
            debugPrint(s);
        }

        // Called when ignorable whitespace is encountered. There is a lot to
        // say about whitespace in XML, but not here.
        public void ignorableWhitespace(char[] ch, int start, int length)
                throws SAXException {
            // Intentionally empty.
        }

        // Called when a processing instruction is encountered.
        public void processingInstruction(String target, String data)
                throws SAXException {
            debugPrint("PI target: '" + target + "', data: '" + data + "'.");
        }

        // Called when a non-validating parser skips an entity in the XML file.
        public void skippedEntity(String name) throws SAXException {
            // Intentionally empty.
        }
    }

    public static void main(String[] args) throws IOException, SAXException {
        new Practice1().test();
    }
}
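Two details of the listing are worth a follow-up sketch: the stored Locator is never read, and characters() may be called more than once for a single run of text, so buffering is safer than handling each call in isolation. The example below is one possible way to address both. It assumes the JDK's bundled parser and extends DefaultHandler so that only the interesting callbacks need overriding. TitleCollector and the title element are hypothetical names, not taken from the original document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// TitleCollector and the <title> element are hypothetical, chosen for this illustration.
final class TitleCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    final List<String> titles = new ArrayList<>();
    final List<Integer> endLines = new ArrayList<>();
    private Locator locator;

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // only meaningful while parsing is in progress
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(qName)) {
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may arrive in several chunks
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString().trim());
            // Record where the element ended, or -1 if no locator was supplied.
            endLines.add(locator == null ? -1 : locator.getLineNumber());
        }
    }

    // Parses an in-memory document and returns the text of every <title> element.
    static List<String> titlesIn(String xml) {
        TitleCollector handler = new TitleCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
        return handler.titles;
    }

    public static void main(String[] args) {
        // prints [A & B, C]
        System.out.println(titlesIn("<books><title>A &amp; B</title><title>C</title></books>"));
    }
}
```

The default SAXParserFactory is not namespace-aware, which is why this sketch matches on qName rather than localName.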
Posted on 2006-09-07 16:57 by Ye Yiliang. Category: Java