无码天堂va亚洲va在线va,亚洲av一综合av一区,亚洲av成本人无码网站

分享代碼系列——parseInt(包含java和c語言的atoi方法）

jdk中的Integer類是int對象的包裝類，正常的Integer占用內存開銷要比int大，比例大概是1:4 。今天分享的代碼是Integer類中的靜態方法parseInt(String, int)。這個方法眾所周知，甚至在我們一開始學習編程時就嘗試的寫過這樣的代碼，一個正常的思路：遍歷輸入的字符數組（java的字符串就是一個字符數組），然后parse每個char，依據參數給定的進制，判斷每個char是否滿足，滿足則繼續，否則拋出異常或中斷，直到處理完畢所有字符，返回結果。

那么我們看看jdk給出的實現：

    public static int parseInt(String s, int radix)
        throws NumberFormatException
    {
        if (s == null) {
            throw new NumberFormatException("null");
        }

    if (radix < Character.MIN_RADIX) {
        throw new NumberFormatException("radix " + radix +
                        " less than Character.MIN_RADIX");
    }

    if (radix > Character.MAX_RADIX) {
        throw new NumberFormatException("radix " + radix +
                        " greater than Character.MAX_RADIX");
    }

    int result = 0;
    boolean negative = false;
    int i = 0, max = s.length();
    int limit;
    int multmin;
    int digit;

    if (max > 0) {
        if (s.charAt(0) == '-') {
        negative = true;
        limit = Integer.MIN_VALUE;
        i++;
        } else {
        limit = -Integer.MAX_VALUE;
        }
        multmin = limit / radix;
        if (i < max) {
        digit = Character.digit(s.charAt(i++),radix);
        if (digit < 0) {
            throw NumberFormatException.forInputString(s);
        } else {
            result = -digit;
        }
        }
        while (i < max) {
        // Accumulating negatively avoids surprises near MAX_VALUE
        digit = Character.digit(s.charAt(i++),radix);
        if (digit < 0) {
            throw NumberFormatException.forInputString(s);
        }
        if (result < multmin) {
            throw NumberFormatException.forInputString(s);
        }
        result *= radix;
        if (result < limit + digit) {
            throw NumberFormatException.forInputString(s);
        }
        result -= digit;
        }
    } else {
        throw NumberFormatException.forInputString(s);
    }
    if (negative) {
        if (i > 1) {
        return result;
        } else {    /* Only got "-" */
        throw NumberFormatException.forInputString(s);
        }
    } else {
        return -result;
    }
    }

過程就是按照思路來的，但是更全面一些，首先做一些參數檢查，然后定義了局部變量用于計算：result是對應的int結果，negative對應是否是負數的判斷，i是遍歷用的索引指針，max代表字符串的長度，limit是合法數字的上限（下限），digit是當前掃描到的字符對應的數字，multmin是在做乘法計算時能走到的合法下限。

嚴謹是這段程序最大的特點，因為有符號int的上下限是-2147483648~2147483647，可見負數表達的范圍比正數多一個，這樣就好理解為什么在開頭要把limit全部表達為負數（下限），這樣的操作減少了后續的判斷，可以一步到位，相當于二者選擇取其大一樣，大的包含了小的。同理，那么multmin也就是負數了，而且可以認為是只和進制參數radix有關系。接著每個char的掃描計算digit利用到了Character.digit(char,int) 方法，這個方法就是在調用CharacterDataLatin1.digit(codePoint, radix) 方法，而這個新的方法其實只是去靜態數組中取個映射而已。最后當順利的執行完while循環后，result結果也就計算好了。

作為程序設計人員，我最初接觸的語言是C++，當初用到的庫函數是atoi，那么我們看看atoi的庫標準實現：

int
atoi(str)
	const char *str;
{
	_DIAGASSERT(str != NULL);

	return((int)strtol(str, (char **)NULL, 10));
}

其中調用了strtol方法，參數傳遞的radix是10，也就是說我們常用的atoi是默認轉化字符串到10進制的。其中開始時還進行了一個trim的操作，而且支持16進制的0x開頭，可謂完全的盡善盡美啊。

strtol方法：

#define	_FUNCNAME	strtol
#define	__INT		long
#define	__INT_MIN	LONG_MIN
#define	__INT_MAX	LONG_MAX

__INT
_FUNCNAME(const char *nptr, char **endptr, int base)
{
	const char *s;
	__INT acc, cutoff;
	char c;
	int i, neg, any, cutlim;

	_DIAGASSERT(nptr != NULL);
	/* endptr may be NULL */

	/* check base value */
	if (base && (base < 2 || base > 36)) {
		errno = EINVAL;
		return(0);
	}

	/*
	 * Skip white space and pick up leading +/- sign if any.
	 * If base is 0, allow 0x for hex and 0 for octal, else
	 * assume decimal; if base is already 16, allow 0x.
	 */
	s = nptr;
	do {
		c = *s++;
	} while (isspace(c));
	if (c == '-') {
		neg = 1;
		c = *s++;
	} else {
		neg = 0;
		if (c == '+')
			c = *s++;
	}
	if ((base == 0 || base == 16) &&
	    c == '0' && (*s == 'x' || *s == 'X')) {
		c = s[1];
		s += 2;
		base = 16;
	}
	if (base == 0)
		base = c == '0' ? 8 : 10;

	/*
	 * Compute the cutoff value between legal numbers and illegal
	 * numbers.  That is the largest legal value, divided by the
	 * base.  An input number that is greater than this value, if
	 * followed by a legal input character, is too big.  One that
	 * is equal to this value may be valid or not; the limit
	 * between valid and invalid numbers is then based on the last
	 * digit.  For instance, if the range for longs is
	 * [-2147483648..2147483647] and the input base is 10,
	 * cutoff will be set to 214748364 and cutlim to either
	 * 7 (neg==0) or 8 (neg==1), meaning that if we have accumulated
	 * a value > 214748364, or equal but the next digit is > 7 (or 8),
	 * the number is too big, and we will return a range error.
	 *
	 * Set any if any `digits' consumed; make it negative to indicate
	 * overflow.
	 */
	cutoff = neg ? __INT_MIN : __INT_MAX;
	cutlim = (int)(cutoff % base);
	cutoff /= base;
	if (neg) {
		if (cutlim > 0) {
			cutlim -= base;
			cutoff += 1;
		}
		cutlim = -cutlim;
	}
	for (acc = 0, any = 0;; c = *s++) {
		if (!isascii(c))
			break;
		if (isdigit(c))
			i = c - '0';
		else if (isalpha(c))
			i = c - (isupper(c) ? 'A' - 10 : 'a' - 10);
		else
			break;
		if (i >= base)
			break;
		if (any < 0)
			continue;
		if (neg) {
			if (acc < cutoff || (acc == cutoff && i > cutlim)) {
				any = -1;
				acc = __INT_MIN;
				errno = ERANGE;
			} else {
				any = 1;
				acc *= base;
				acc -= i;
			}
		} else {
			if (acc > cutoff || (acc == cutoff && i > cutlim)) {
				any = -1;
				acc = __INT_MAX;
				errno = ERANGE;
			} else {
				any = 1;
				acc *= base;
				acc += i;
			}
		}
	}
	if (endptr != 0)
		/* LINTED interface specification */
		*endptr = __DECONST(char *, (any ? s - 1 : nptr));
	return(acc);
}

當然，類似的代碼還有很多，這里只列出了兩大語言的庫實現，總體思路是一致的，當我們設計api時，這種編程思路和風格以及功能的考慮是我們需要學習的。

下面這兩篇stackoverflow的問答給出了一些比較全面的c風格代碼，可以參考，這里不貼全文只給link：

http://stackoverflow.com/questions/194465/how-to-parse-a-string-to-an-int-in-c

http://stackoverflow.com/questions/4442658/c-parse-int-from-string

參考文獻：

jdk文檔及源碼

c庫函數源碼及文檔

posted on 2012-04-06 12:20 changedi 閱讀(2666) 評論(2) 編輯收藏所屬分類: 好代碼分享


只有注冊用戶登錄后才能發表評論。




網站導航: 博客園 IT新聞 Chat2DB C++博客博問管理
相關文章: 分享代碼系列——HashedArrayTree 分享一個LRUMap的實現——來自apache common-collections框架分享代碼系列——vlist 分享代碼系列——parseInt(包含java和c語言的atoi方法）分享一個ThreadMonitor

# re: 分享代碼系列——parseInt(包含java和c語言的atoi方法） 2014-04-29 22:46 zuidaima

Change Dir

導航

公告

隨筆分類(125)

隨筆檔案(123)

統計

留言簿(18)

積分與排名

“牛”們的博客

各個公司技術

我的鏈接

淘寶技術

閱讀排行榜

評論排行榜

分享代碼系列——parseInt(包含java和c語言的atoi方法）

評論

Change Dir

導航

公告

隨筆分類(125)

隨筆檔案(123)

統計

留言簿(18)

積分與排名

“牛”們的博客

各個公司技術

我的鏈接

淘寶技術

閱讀排行榜

評論排行榜

分享代碼系列——parseInt(包含java和c語言的atoi方法）

評論

# re: 分享代碼系列&mdash;&mdash;parseInt(包含java和c語言的atoi方法） 2014-04-29 22:46 zuidaima

# re: 分享代碼系列——parseInt(包含java和c語言的atoi方法） 2014-04-29 22:46 zuidaima