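The ROC curve (Receiver Operating Characteristic curve) is a graphical way to display the trade-off between a classification model's true positive rate and its false positive rate.
Some definitions needed to read an ROC plot: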
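True positive (TP): a positive sample that the model predicts as positive
False negative (FN): a positive sample that the model predicts as negative
False positive (FP): a negative sample that the model predicts as positive
True negative (TN): a negative sample that the model predicts as negative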
True positive rate (TPR), also called sensitivity
   TPR = TP / (TP + FN)
   positive samples correctly predicted as positive / actual number of positive samples
False negative rate (FNR)
   FNR = FN / (TP + FN)
   positive samples predicted as negative / actual number of positive samples
False positive rate (FPR)
   FPR = FP / (FP + TN)
   negative samples predicted as positive / actual number of negative samples
True negative rate (TNR), also called specificity
   TNR = TN / (TN + FP)
   negative samples correctly predicted as negative / actual number of negative samples
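The four rates above can be checked with a few lines of Python. This is a minimal sketch; the function name `rates` and the counts in the example are made up for illustration.

```python
def rates(tp, fn, fp, tn):
    """Return TPR, FNR, FPR and TNR from the four confusion-matrix counts."""
    positives = tp + fn  # actual number of positive samples
    negatives = fp + tn  # actual number of negative samples
    return {
        "TPR": tp / positives,  # sensitivity
        "FNR": fn / positives,
        "FPR": fp / negatives,
        "TNR": tn / negatives,  # specificity
    }

# Hypothetical counts: 50 actual positives, 50 actual negatives.
r = rates(tp=40, fn=10, fp=5, tn=45)
print(r)
```

Note that TPR + FNR = 1 and FPR + TNR = 1, since each pair shares the same denominator.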
The value of the target attribute that is chosen as the outcome of interest is called "positive".
Interpretation of a few key points on the ROC curve:
( TPR=0, FPR=0 ) a model that predicts every instance as the negative class
( TPR=1, FPR=1 ) a model that predicts every instance as the positive class
( TPR=1, FPR=0 ) the ideal model
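The three key points can be reproduced with trivial classifiers. The sketch below assumes labels are encoded as 1 (positive) and 0 (negative); the helper `tpr_fpr` and the tiny data set are illustrative only.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels encoded as 1 = positive, 0 = negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)

y_true = [1, 1, 0, 0, 0]          # made-up ground truth
all_negative = [0] * len(y_true)  # predicts every instance as negative
all_positive = [1] * len(y_true)  # predicts every instance as positive
ideal = list(y_true)              # predicts every instance correctly

print(tpr_fpr(y_true, all_negative))  # (0.0, 0.0)
print(tpr_fpr(y_true, all_positive))  # (1.0, 1.0)
print(tpr_fpr(y_true, ideal))         # (1.0, 0.0)
```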
(Figure to be added later.)
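A good classification model should lie as close as possible to the upper-left corner of the plot, whereas a model that guesses at random lies on the main diagonal connecting the points (TPR=0, FPR=0) and (TPR=1, FPR=1).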
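The area under the ROC curve (AUC) offers another way to evaluate a model's average performance. A perfect model has AUC = 1, a model that simply guesses at random has AUC = 0.5, and if one model is better than another, its area under the curve is larger.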
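To make the AUC concrete, here is a minimal sketch that builds ROC points by sweeping a threshold over the model's scores and then integrates with the trapezoidal rule. The function names, labels (1 = positive, 0 = negative) and scores are assumptions made for illustration, not part of any particular library.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct score used as a threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # A threshold above every score predicts nothing as positive: point (0, 0).
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under a piecewise-linear curve, by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up labels and model scores (higher score = more likely positive).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

model_auc = auc(roc_points(y_true, scores))
# A model that gives every instance the same score cannot rank them at all.
constant_auc = auc(roc_points(y_true, [0.5] * len(y_true)))
print(model_auc, constant_auc)
```

In this example the model ranks 8 of the 9 positive/negative pairs correctly, so its AUC is 8/9, while the constant-score model sits on the diagonal with AUC = 0.5.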

An explanation of ROC from the Oracle forums
This explanation comes from one of our algorithm engineers:
"The ROC analysis applies to binary classification problems. One of the classes is selected as a "positive" one. The ROC chart plots the true positive rate as a function of the false positive rate. It is parametrized by the probability threshold values. The true positive rate represents the fraction of positive cases that were correctly classified by the model. The false positive rate represents the fraction of negative cases that were incorrectly classified as positive. Each point on the ROC plot represents a true_positive_rate/false_positive_rate pair corresponding to a particular probability threshold. Each point has a corresponding confusion matrix. The user can analyze the confusion matrices produced at different threshold levels and select a probability threshold to be used for scoring. The probability threshold choice is usually based on application requirements (i.e., acceptable level of false positives).
The ROC does not represent a model. Instead it quantifies its discriminatory ability and assists the user in selecting an appropriate operating point for scoring."
I would add to this that you can select a threshold point in the build activity to bias the apply process. Currently we generate a cost matrix based on the selected threshold point rather than use the threshold point directly. http://forums.oracle.com/forums/thread.jspa?threadID=415870&tstart=15
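The threshold selection described in the quote can be sketched as follows. This is only an illustration of choosing an operating point under an acceptable false positive rate; it is not Oracle's implementation (which, per the quote, derives a cost matrix instead), and the function name `pick_threshold` and the data are hypothetical.

```python
def pick_threshold(y_true, scores, max_fpr):
    """Return (threshold, TPR, FPR) with the highest TPR whose FPR <= max_fpr.

    Labels are 1 = positive, 0 = negative; an instance is predicted positive
    when its score is >= the threshold. Returns None if no threshold qualifies.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        tpr, fpr = tp / n_pos, fp / n_neg
        if fpr <= max_fpr and (best is None or tpr > best[1]):
            best = (thr, tpr, fpr)
    return best

# Made-up labels and predicted probabilities of the positive class.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

strict = pick_threshold(y_true, scores, max_fpr=0.0)    # no false positives allowed
relaxed = pick_threshold(y_true, scores, max_fpr=0.34)  # up to about one in three
print(strict, relaxed)
```

With no false positives tolerated, the chosen threshold is 0.8 and one positive case is missed; allowing one false positive lowers the threshold to 0.6 and recovers every positive case.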