In a recent project I used HtmlParser (version 1.5). While using it (for example, when fetching the URL http://athena2002.vip.china.alibaba.com/), I ran into the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: invalid cookie name: Discard
at org.htmlparser.http.Cookie.<init>(Cookie.java:136)
at org.htmlparser.http.ConnectionManager.parseCookies(ConnectionManager.java:1126)
at org.htmlparser.http.ConnectionManager.openConnection(ConnectionManager.java:621)
at org.htmlparser.http.ConnectionManager.openConnection(ConnectionManager.java:792)
at org.htmlparser.Parser.<init>(Parser.java:251)
at org.htmlparser.Parser.<init>(Parser.java:261)
Checking the code, I found this in:
org.htmlparser.http.Cookie
public Cookie (String name, String value)
{
    if (!isToken (name) || name.equalsIgnoreCase ("Comment") // rfc2019
        || name.equalsIgnoreCase ("Discard") // 2019++
        || name.equalsIgnoreCase ("Domain")
        || name.equalsIgnoreCase ("Expires") // (old cookies)
        || name.equalsIgnoreCase ("Max-Age") // rfc2019
        || name.equalsIgnoreCase ("Path")
        || name.equalsIgnoreCase ("Secure")
        || name.equalsIgnoreCase ("Version"))
        throw new IllegalArgumentException ("invalid cookie name: " + name);
    mName = name;
    mValue = value;
    mComment = null;
    mDomain = null;
    mExpiry = null; // not persisted
    mPath = "/";
    mSecure = false;
    mVersion = 0;
}
As soon as the name is "Discard" (or any other reserved attribute name), the constructor throws.
Meanwhile, the cookie-parsing code in org.htmlparser.http.ConnectionManager.parseCookies (URLConnection connection) looks like this:
if (key.equals ("domain"))
    cookie.setDomain (value);
else if (key.equals ("path"))
    cookie.setPath (value);
else if (key.equals ("secure"))
    cookie.setSecure (true);
else if (key.equals ("comment"))
    cookie.setComment (value);
else if (key.equals ("version"))
    cookie.setVersion (Integer.parseInt (value));
else if (key.equals ("max-age"))
{
    Date date = new Date ();
    long then = date.getTime () + Integer.parseInt (value) * 1000;
    date.setTime (then);
    cookie.setExpiryDate (date);
}
else
{   // error,? unknown attribute,
    // maybe just another cookie not separated by a comma
    cookie = new Cookie (name, value); // this is where it goes wrong
    cookies.addElement (cookie);
}
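There is no special handling for Discard, so it falls through to the final else branch, where new Cookie ("Discard", ...) throws.
Left with no better option, I overrode this method and added a case for Discard that simply does continue :)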
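The workaround can be sketched roughly as follows. This is a simplified, standalone illustration and not the HtmlParser source: the class DiscardSkippingSketch, the method cookieNames, and the semicolon-based splitting are all assumptions made for the example. It only shows the idea of skipping a Discard token instead of treating it as a new cookie name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the overridden parseCookies(); not HtmlParser code.
class DiscardSkippingSketch {
    // Attribute names that the quoted 1.5 snippet already recognises.
    private static final Set<String> KNOWN = new HashSet<>(Arrays.asList(
            "domain", "path", "secure", "comment", "version", "max-age"));

    // Returns the names that would be turned into cookies for one header value.
    static List<String> cookieNames(String header) {
        List<String> names = new ArrayList<>();
        for (String part : header.split(";")) {
            String token = part.trim();
            if (token.isEmpty())
                continue;
            int eq = token.indexOf('=');
            String name = (eq < 0 ? token : token.substring(0, eq)).trim();
            String key = name.toLowerCase(Locale.ROOT);
            if (key.equals("discard"))
                continue; // the workaround: ignore Discard rather than build a cookie from it
            if (KNOWN.contains(key))
                continue; // a real parser would apply these to the current cookie
            names.add(name); // anything else is treated as a cookie name
        }
        return names;
    }
}
```

Without the Discard check, the token would reach the last line and be handed to the cookie constructor, which is exactly the failing path in the stack trace.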
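Today, while writing this blog post, I tested against the 1.6 code and the problem did not occur. Reading the code, I found two changes:
1. ConnectionManager now checks a condition before calling parseCookies: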
if (getCookieProcessingEnabled ())
    parseCookies (ret);
By default this condition is false, so cookies are not parsed at all.
2. parseCookies now catches the exception:
// error,? unknown attribute,
// maybe just another cookie
// not separated by a comma
try
{
    cookie = new Cookie (name,
        value);
    cookies.addElement (cookie);
}
catch (IllegalArgumentException iae)
{
    // should print a warning
    // for now just bail
    break;
}
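This makes the exception go away, but the code clearly still does not recognize Discard as a legitimate attribute.
As I understand it, the most reasonable fix would be:
1. Add a boolean discard property (with its accessor methods) to org.htmlparser.http.Cookie
2. In org.htmlparser.http.ConnectionManager.parseCookies(), handle Discard explicitly: if the attribute is present, set cookie.discard = true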
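A minimal sketch of that proposal, assuming a simplified cookie type: SketchCookie, DiscardAwareSketch, setDiscard and getDiscard are hypothetical names invented for this illustration and are not part of HtmlParser. The parsing is deliberately reduced to one cookie per header value, and other attributes are ignored.

```java
import java.util.Locale;

// Hypothetical cookie with the proposed discard flag; not org.htmlparser.http.Cookie.
class SketchCookie {
    private final String name;
    private final String value;
    private boolean discard; // proposed new field, false unless the attribute is seen

    SketchCookie(String name, String value) {
        this.name = name;
        this.value = value;
    }

    String getName() { return name; }
    String getValue() { return value; }
    boolean getDiscard() { return discard; }
    void setDiscard(boolean discard) { this.discard = discard; }
}

// Hypothetical parser showing where Discard would be handled.
class DiscardAwareSketch {
    // Parses one header value; the first name=value pair is taken as the cookie.
    static SketchCookie parse(String header) {
        SketchCookie cookie = null;
        for (String part : header.split(";")) {
            String token = part.trim();
            if (token.isEmpty())
                continue;
            int eq = token.indexOf('=');
            String name = (eq < 0 ? token : token.substring(0, eq)).trim();
            String value = eq < 0 ? "" : token.substring(eq + 1).trim();
            if (cookie == null) {
                cookie = new SketchCookie(name, value);
                continue;
            }
            if (name.toLowerCase(Locale.ROOT).equals("discard"))
                cookie.setDiscard(true); // record the attribute instead of rejecting it
            // domain, path, max-age and the rest are omitted in this sketch
        }
        return cookie;
    }
}
```

With this shape, a caller could check getDiscard() when deciding whether to keep the cookie after the session ends, which is what the attribute is meant to signal.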
For the definition of Discard, see
http://www.faqs.org/rfcs/rfc2965.html:
Discard
OPTIONAL. The Discard attribute instructs the user agent to
discard the cookie unconditionally when the user agent terminates