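Window functions
Oracle has provided analytic functions since 8.1.6. An analytic function computes an aggregate value over a group of rows. It differs from an aggregate function in that it returns multiple rows for each group, whereas an aggregate function returns only one row per group.
The windowing clause specifies the size of the data window that the analytic function works on. This window can change from row to row, as the following examples show: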
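1: What can follow over:
over(order by salary) accumulates in salary order. When order by is given without an explicit window, the default window is range between unbounded preceding and current row
over(partition by deptno) partitions by department
over(partition by deptno order by salary) partitions by department and accumulates in salary order within each department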
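2: The window range:
over(order by salary range between 5 preceding and 5 following): the window covers the rows whose salary lies between the current row's salary minus 5 and the current row's salary plus 5.
Example:
--sum(s)over(order by s range between 2 preceding and 2 following) sums the s values within plus or minus 2 of the current s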
select name,class,s, sum(s)over(order by s range between 2 preceding and 2 following) mm from t2
adf 3 45 45 --45 plus or minus 2 gives 43 to 47, and the only s in that range is 45
asdf 3 55 55
cfe 2 74 74
3dd 3 78 158 --for 78 the range 76 to 80 contains 78 and 80, which sum to 158
fda 1 80 158
gds 2 92 92
ffd 1 95 190
dss 1 95 190
ddd 3 99 198
gf 3 99 198
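The sums above can be cross-checked outside Oracle. The following is a minimal sketch, assuming Python's bundled sqlite3 module backed by SQLite 3.28 or later (which accepts the same RANGE frame syntax). The in-memory table mirrors t2 from this article; it is not the original Oracle environment.

```python
import sqlite3

# Scores from table t2 in this article: (name, class, s).
T2_ROWS = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (name TEXT, class INTEGER, s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2_ROWS)

# RANGE frame: sum every row whose s lies in [current s - 2, current s + 2].
range_sums = conn.execute(
    """
    SELECT s,
           SUM(s) OVER (ORDER BY s
                        RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS mm
    FROM t2
    ORDER BY s
    """
).fetchall()

for s, mm in range_sums:
    print(s, mm)
```

Rows with equal s always get the same sum under a RANGE frame, because the frame is defined by value rather than by position.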
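over(order by salary rows between 5 preceding and 5 following): the window covers the 5 rows before and the 5 rows after the current row.
Example:
--sum(s)over(order by s rows between 2 preceding and 2 following) sums from the two rows above through the two rows below the current row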
select name,class,s, sum(s)over(order by s rows between 2 preceding and 2 following) mm from t2
adf 3 45 174 (45+55+74=174)
asdf 3 55 252 (45+55+74+78=252)
cfe 2 74 332 (74+55+45+78+80=332)
3dd 3 78 379 (78+74+55+80+92=379)
fda 1 80 419
gds 2 92 440
ffd 1 95 461
dss 1 95 480
ddd 3 99 388
gf 3 99 293
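The same cross-check works for the ROWS frame. This is again a sketch that assumes Python's sqlite3 module (SQLite 3.25 or later) and an in-memory copy of t2. Note that with a ROWS frame, which of two tied rows (95/95 or 99/99) receives which sum depends on the arbitrary order among ties, so the query sorts the output by s and then by the sum to make it reproducible.

```python
import sqlite3

# Scores from table t2 in this article: (name, class, s).
T2_ROWS = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (name TEXT, class INTEGER, s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2_ROWS)

# ROWS frame: sum the current row plus up to two physical rows on each side.
rows_sums = conn.execute(
    """
    SELECT s,
           SUM(s) OVER (ORDER BY s
                        ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS mm
    FROM t2
    ORDER BY s, mm
    """
).fetchall()

for s, mm in rows_sums:
    print(s, mm)
```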
over(order by salary
range between unbounded preceding and unbounded following) or
over(order by salary rows between unbounded preceding and unbounded following): the window is unrestricted
3. Functions used together with over
Using row_number() over(), rank() over() and dense_rank() over()
The class score table t2 is used below to illustrate them
Table t2 contains (columns name, class, s):
cfe 2 74
dss 1 95
ffd 1 95
fda 1 80
gds 2 92
gf 3 99
ddd 3 99
adf 3 45
asdf 3 55
3dd 3 78
select * from
(
select name,class,s,rank()over(partition by class order by s desc) mm from t2
)
where mm=1;
The result is:
dss 1 95 1
ffd 1 95 1
gds 2 92 1
gf 3 99 1
ddd 3 99 1
Note:
1. To find the top score, row_number() cannot be used, because when two students in a class tie for first place, row_number() returns only one of them:
select * from
(
select name,class,s,row_number()over(partition by class order by s desc) mm from t2
)
where mm=1;
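1 95 1 --two students scored 95, but only one is shown
2 92 1
3 99 1 --two students scored 99, but again only one is shown
2. rank() and dense_rank() find all of them:
As shown above, rank returns everyone tied for first place.
Difference between rank() and dense_rank():
--rank() leaves gaps: when two rows tie for second place, the next row is ranked fourth;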
select name,class,s,rank()over(partition by class order by s desc) mm from t2
dss 1 95 1
ffd 1 95 1
fda 1 80 3 --jumps straight to third
gds 2 92 1
cfe 2 74 2
gf 3 99 1
ddd 3 99 1
3dd 3 78 3
asdf 3 55 4
adf 3 45 5
--dense_rank() ranks consecutively: when two rows tie for second place, the next row is still ranked third
select name,class,s,dense_rank()over(partition by class order by s desc) mm from t2
dss 1 95 1
ffd 1 95 1
fda 1 80 2 --consecutive ranking (still 2)
gds 2 92 1
cfe 2 74 2
gf 3 99 1
ddd 3 99 1
3dd 3 78 2
asdf 3 55 3
adf 3 45 4
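The three ranking functions can be compared side by side. The sketch below assumes Python's sqlite3 module (SQLite 3.25 or later), whose ROW_NUMBER, RANK and DENSE_RANK follow the same rules as described here; the table is an in-memory copy of t2.

```python
import sqlite3

# Scores from table t2 in this article: (name, class, s).
T2_ROWS = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (name TEXT, class INTEGER, s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2_ROWS)

ranked = conn.execute(
    """
    SELECT name, class, s,
           ROW_NUMBER() OVER (PARTITION BY class ORDER BY s DESC) AS rn,
           RANK()       OVER (PARTITION BY class ORDER BY s DESC) AS rk,
           DENSE_RANK() OVER (PARTITION BY class ORDER BY s DESC) AS drk
    FROM t2
    ORDER BY class, s DESC, name
    """
).fetchall()

ranks = [row[4] for row in ranked]
dense_ranks = [row[5] for row in ranked]

# Filtering on 1 keeps one row per class with row_number, but all ties with rank.
top_by_row_number = [row[0] for row in ranked if row[3] == 1]
top_by_rank = [row[0] for row in ranked if row[4] == 1]

print(ranks)
print(dense_ranks)
print(len(top_by_row_number), top_by_rank)
```

Which of the tied students row_number() keeps is not defined by the query, which is why only the count is printed for it.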
--Using sum() over()
select name,class,s, sum(s)over(partition by class order by s desc) mm from t2 --running sum of scores within each class
dss 1 95 190 --the two 95s tie for first place, so the running total at either row includes both
ffd 1 95 190
fda 1 80 270 --first place plus second place
gds 2 92 92
cfe 2 74 166
gf 3 99 198
ddd 3 99 198
3dd 3 78 276
asdf 3 55 331
adf 3 45 376
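Tied rows share one running total because an order by without an explicit window uses a RANGE frame ending at the current row, and a RANGE frame includes all peers of that row. The sketch below contrasts this with an explicit ROWS frame, which advances one physical row at a time. It assumes Python's sqlite3 module (SQLite 3.25 or later) and an in-memory copy of t2.

```python
import sqlite3

# Scores from table t2 in this article: (name, class, s).
T2_ROWS = [
    ("cfe", 2, 74), ("dss", 1, 95), ("ffd", 1, 95), ("fda", 1, 80),
    ("gds", 2, 92), ("gf", 3, 99), ("ddd", 3, 99), ("adf", 3, 45),
    ("asdf", 3, 55), ("3dd", 3, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (name TEXT, class INTEGER, s INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)", T2_ROWS)

totals = conn.execute(
    """
    SELECT class, s,
           -- default frame: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           SUM(s) OVER (PARTITION BY class ORDER BY s DESC) AS range_total,
           -- explicit ROWS frame: ties are accumulated one row at a time
           SUM(s) OVER (PARTITION BY class ORDER BY s DESC
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_total
    FROM t2
    ORDER BY class, s DESC, rows_total
    """
).fetchall()

for row in totals:
    print(row)
```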
Using first_value() over() and last_value() over()
--Find the first and the last record type for each of these three circuits
SELECT opr_id,res_type,
first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type rows BETWEEN unbounded preceding AND unbounded following) high
FROM rm_circuit_route
WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
ORDER BY opr_id;

Note: using rows BETWEEN unbounded preceding AND unbounded following
--Result of last_value without rows BETWEEN unbounded preceding AND unbounded following
SELECT opr_id,res_type,
first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) low,
last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type) high
FROM rm_circuit_route
WHERE opr_id IN ('000100190000000000021311','000100190000000000021355','000100190000000000021339')
ORDER BY opr_id;
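As the figure below shows, without
rows BETWEEN unbounded preceding AND unbounded following, the window defaults to range BETWEEN unbounded preceding AND current row. Because the rows are ordered by res_type, last_value then sees only the rows up to the current res_type, so it returns the current row's res_type rather than the last record type of the whole circuit.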
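The effect of the missing window clause is easy to reproduce on a small table. This is a sketch only: the route table and its values are hypothetical stand-ins for rm_circuit_route, and it runs on Python's sqlite3 module (SQLite 3.25 or later), whose default frame is the same as Oracle's.

```python
import sqlite3

# Hypothetical stand-in for rm_circuit_route: two circuits, numeric record types.
ROUTES = [("A", 1), ("A", 2), ("A", 3), ("B", 5), ("B", 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE route (opr_id TEXT, res_type INTEGER)")
conn.executemany("INSERT INTO route VALUES (?, ?)", ROUTES)

last_values = conn.execute(
    """
    SELECT opr_id, res_type,
           -- default frame stops at the current row
           LAST_VALUE(res_type) OVER (PARTITION BY opr_id ORDER BY res_type) AS default_frame,
           -- explicit frame spans the whole partition
           LAST_VALUE(res_type) OVER (PARTITION BY opr_id ORDER BY res_type
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS full_frame
    FROM route
    ORDER BY opr_id, res_type
    """
).fetchall()

for row in last_values:
    print(row)
```

With the default frame the third column simply repeats res_type; only the full frame yields the last value per circuit (3 for A, 7 for B).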
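Using ignore nulls in first_value and last_value
The data is as follows:
Take the first record of the circuit: with ignore nulls, if the field being examined is null in the first record, the next non-null value is returned instead. The result is as follows: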
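The ignore nulls rule can be modelled in a few lines of plain Python. This is an illustrative sketch, not Oracle's implementation: the helper names and the sample window are hypothetical, and None stands in for SQL NULL.

```python
def first_value(values, ignore_nulls=False):
    """Return the first element, or the first non-None one when ignore_nulls is set."""
    for value in values:
        if value is not None or not ignore_nulls:
            return value
    return None  # empty window, or only nulls with ignore_nulls=True


def last_value(values, ignore_nulls=False):
    """Mirror of first_value, scanning the whole window from its end."""
    return first_value(list(reversed(values)), ignore_nulls)


# Hypothetical record types of one circuit, in window order.
window = [None, "port", "cable", None]

print(first_value(window), first_value(window, ignore_nulls=True))
print(last_value(window), last_value(window, ignore_nulls=True))
```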

--lag() over() usage (fetches the value from n rows earlier)
lag(expression,<offset>,<default>)
with a as
(select 1 id,'a' name from dual
union
select 2 id,'b' name from dual
union
select 3 id,'c' name from dual
union
select 4 id,'d' name from dual
union
select 5 id,'e' name from dual
)
select id,name,lag(id,1,'')over(order by name) from a;
--lead() over() usage (fetches the value from n rows later)
lead(expression,<offset>,<default>)
with a as
(select 1 id,'a' name from dual
union
select 2 id,'b' name from dual
union
select 3 id,'c' name from dual
union
select 4 id,'d' name from dual
union
select 5 id,'e' name from dual
)
select id,name,lead(id,1,'')over(order by name) from a;
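To see what lag and lead return for the five-row set above, the following sketch runs equivalent queries through Python's sqlite3 module (an assumption: SQLite 3.25 or later, where LAG and LEAD take the same three arguments). The third column pair uses an offset of 2 and a default of -1, values chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")],
)

shifted = conn.execute(
    """
    SELECT id, name,
           LAG(id, 1)     OVER (ORDER BY name) AS prev_id,   -- NULL on the first row
           LEAD(id, 1)    OVER (ORDER BY name) AS next_id,   -- NULL on the last row
           LAG(id, 2, -1) OVER (ORDER BY name) AS prev2_or_default
    FROM a
    ORDER BY name
    """
).fetchall()

for row in shifted:
    print(row)
```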
--ratio_to_report(a) usage: the expression inside ratio_to_report() is the numerator, and the sum of that expression over the window defined inside over() is the denominator
with a as (select 1 a from dual
union all
select 1 a from dual
union all
select 1 a from dual
union all
select 2 a from dual
union all
select 3 a from dual
union all
select 4 a from dual
union all
select 4 a from dual
union all
select 5 a from dual
)
select a, ratio_to_report(a)over(partition by a) b from a
order by a;
with a as (select 1 a from dual
union all
select 1 a from dual
union all
select 1 a from dual
union all
select 2 a from dual
union all
select 3 a from dual
union all
select 4 a from dual
union all
select 4 a from dual
union all
select 5 a from dual
)
select a, ratio_to_report(a)over() b from a --with an empty over() the denominator is the total over all rows
order by a;
with a as (select 1 a from dual
union all
select 1 a from dual
union all
select 1 a from dual
union all
select 2 a from dual
union all
select 3 a from dual
union all
select 4 a from dual
union all
select 4 a from dual
union all
select 5 a from dual
)
select a, ratio_to_report(a)over() b from a
group by a order by a;--share of each distinct value after grouping
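The three ratio_to_report variants above differ only in what the denominator sums over. The plain-Python sketch below reproduces that arithmetic with exact fractions; the helper ratio_to_report and its parameters are hypothetical names written for this illustration, not an Oracle or library API.

```python
from collections import defaultdict
from fractions import Fraction


def ratio_to_report(rows, partition=lambda row: None):
    """Return row / (sum of all rows in the same partition) for every row."""
    totals = defaultdict(int)
    for row in rows:
        totals[partition(row)] += row
    return [Fraction(row, totals[partition(row)]) for row in rows]


data = [1, 1, 1, 2, 3, 4, 4, 5]  # same values as column a above

over_partition = ratio_to_report(data, partition=lambda a: a)  # over(partition by a)
over_all = ratio_to_report(data)                               # over()
grouped = ratio_to_report(sorted(set(data)))                   # group by a, then over()

print(over_partition)
print(over_all)
print(grouped)
```

With partition by a, each of the three 1s gets 1/3; with an empty over() every row is divided by the grand total 21; after group by the denominator is 1+2+3+4+5 = 15.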
percent_rank usage
Formula: (rank within the group - 1) divided by (number of rows in the group - 1). As shown below, the hand-computed pr1 equals the pr2 returned by percent_rank:
SELECT a.deptno,
a.ename,
a.sal,
a.r,
b.n,
(a.r-1)/(n-1) pr1,
percent_rank() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
FROM (SELECT deptno,
ename,
sal,
rank() over(PARTITION BY deptno ORDER BY sal) r --rank within the group
FROM emp
ORDER BY deptno, sal) a,
(SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b --number of employees in each department
WHERE a.deptno = b.deptno;
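The formula can be checked on a small sample. The sketch below assumes Python's sqlite3 module (SQLite 3.25 or later, which provides PERCENT_RANK) and uses six department-30 salaries as found in the classic SCOTT.EMP demo table; that data set is an assumption, chosen because it matches the 1500 and 1600 values quoted in this article.

```python
import sqlite3

# Assumed sample: department-30 rows of the classic SCOTT.EMP demo table.
EMP_ROWS = [
    ("JAMES", 950, 30), ("WARD", 1250, 30), ("MARTIN", 1250, 30),
    ("TURNER", 1500, 30), ("ALLEN", 1600, 30), ("BLAKE", 2850, 30),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP_ROWS)

rows = conn.execute(
    """
    SELECT ename, sal,
           RANK()         OVER (PARTITION BY deptno ORDER BY sal) AS r,
           COUNT(*)       OVER (PARTITION BY deptno)              AS n,
           PERCENT_RANK() OVER (PARTITION BY deptno ORDER BY sal) AS pr2
    FROM emp
    ORDER BY deptno, sal, ename
    """
).fetchall()

# pr1 = (r - 1) / (n - 1); a one-row group is treated as 0 to avoid dividing by zero.
pr1 = [(r - 1) / (n - 1) if n > 1 else 0.0 for _, _, r, n, _ in rows]
pr2 = [row[4] for row in rows]

for row, manual in zip(rows, pr1):
    print(row[0], row[1], manual, row[4])
```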

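cume_dist function
Formula: rank within the group divided by the number of rows in the group; when there are ties, add (number of tied rows - 1) to the rank first.
As shown below, the hand-computed pr1 equals the pr2 returned by cume_dist: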
SELECT a.deptno,
a.ename,
a.sal,
a.r,
b.n,
c.rn,
(a.r + c.rn - 1) / n pr1,
cume_dist() over(PARTITION BY a.deptno ORDER BY a.sal) pr2
FROM (SELECT deptno,
ename,
sal,
rank() over(PARTITION BY deptno ORDER BY sal) r
FROM emp
ORDER BY deptno, sal) a,
(SELECT deptno, COUNT(1) n FROM emp GROUP BY deptno) b,
(SELECT deptno, r, COUNT(1) rn,sal
FROM (SELECT deptno,sal,
rank() over(PARTITION BY deptno ORDER BY sal) r
FROM emp)
GROUP BY deptno, r,sal
ORDER BY deptno) c --c gives, for each department, the number of employees with the same salary
WHERE a.deptno = b.deptno
AND a.deptno = c.deptno(+)
AND a.sal = c.sal;
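The same kind of check works for cume_dist. As before, this is a sketch under assumptions: it runs on Python's sqlite3 module (SQLite 3.25 or later) and uses the six department-30 salaries of the classic SCOTT.EMP demo table as sample data. The tie count is obtained with a second window instead of the join on c.

```python
import sqlite3

# Assumed sample: department-30 rows of the classic SCOTT.EMP demo table.
EMP_ROWS = [
    ("JAMES", 950, 30), ("WARD", 1250, 30), ("MARTIN", 1250, 30),
    ("TURNER", 1500, 30), ("ALLEN", 1600, 30), ("BLAKE", 2850, 30),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", EMP_ROWS)

rows = conn.execute(
    """
    SELECT sal,
           RANK()      OVER (PARTITION BY deptno ORDER BY sal) AS r,
           COUNT(*)    OVER (PARTITION BY deptno, sal)         AS ties,
           COUNT(*)    OVER (PARTITION BY deptno)              AS n,
           CUME_DIST() OVER (PARTITION BY deptno ORDER BY sal) AS pr2
    FROM emp
    ORDER BY deptno, sal
    """
).fetchall()

# pr1 = (rank + number of tied rows - 1) / rows in the group
pr1 = [(r + ties - 1) / n for _, r, ties, n, _ in rows]
pr2 = [row[4] for row in rows]

for row, manual in zip(rows, pr1):
    print(row[0], manual, row[4])
```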
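percentile_cont function
Meaning: takes a percentile (on the scale computed by percent_rank) and returns the value at that position, interpolating linearly between the two neighboring values when the position falls between rows
For example, with an input of 0.7: because 0.7 lies between 0.6 and 0.8, the result is interpolated between 1500 (the sal at 0.6) and 1600 (the sal at 0.8). Since 0.7 is midway between them, the result is their average, 1550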
SELECT ename,
sal,
deptno,
percentile_cont(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
FROM emp
WHERE deptno IN (30, 60);

With an input of 0.6, the result is simply the sal at 0.6, namely 1500
SELECT ename,
sal,
deptno,
percentile_cont(0.6) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Cont",
percent_rank() over(PARTITION BY deptno ORDER BY sal) "Percent_Rank"
FROM emp
WHERE deptno IN (30, 60);
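The interpolation can be written out directly. The sketch below follows the row-position rule RN = 1 + p * (N - 1) with linear interpolation between the rows on either side; the function name and the sample salaries (department 30 of the classic SCOTT.EMP demo table) are assumptions for illustration. Percentiles are passed as strings so that Fraction keeps them exact.

```python
import math
from fractions import Fraction


def percentile_cont(values, p):
    """Interpolated percentile at row position RN = 1 + p * (N - 1)."""
    ordered = sorted(values)
    rn = 1 + Fraction(p) * (len(ordered) - 1)
    lower, upper = math.floor(rn), math.ceil(rn)
    if lower == upper:
        # RN lands exactly on a row: return that row's value.
        return Fraction(ordered[lower - 1])
    # Otherwise weight the two neighbouring rows by their distance from RN.
    return (upper - rn) * ordered[lower - 1] + (rn - lower) * ordered[upper - 1]


dept30_sal = [950, 1250, 1250, 1500, 1600, 2850]  # assumed sample data

print(percentile_cont(dept30_sal, "0.7"))  # between rows 4 and 5
print(percentile_cont(dept30_sal, "0.6"))  # exactly row 4
```

For 0.7 the position is 4.5, halfway between 1500 and 1600, which is why the result coincides with their average.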
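PERCENTILE_DISC function
Description: returns the data value that corresponds to the input distribution percentile. The distribution percentile is computed as described for CUME_DIST. If no data value corresponds exactly, the value at the next distribution percentile above the input is returned.
Note: this function differs from PERCENTILE_CONT in how the substitute value is computed when no exact match exists
SAMPLE: in the example below, department 30 has no Cume_Dist value equal to 0.7, so the SALARY at the next distribution value, 0.83333333, is returned instead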
SELECT ename,
sal,
deptno,
percentile_disc(0.7) within GROUP(ORDER BY sal) over(PARTITION BY deptno) "Percentile_Disc",
cume_dist() over(PARTITION BY deptno ORDER BY sal) "Cume_Dist"
FROM emp
WHERE deptno IN (30, 60);
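In other words, percentile_disc picks an actual value from the data: the smallest one whose cumulative distribution is at least the requested percentile. A minimal plain-Python sketch of that rule follows; the function name and the sample salaries (department 30 of the classic SCOTT.EMP demo table) are assumptions for illustration.

```python
from fractions import Fraction


def percentile_disc(values, p):
    """Return the smallest value whose cumulative distribution is >= p."""
    ordered = sorted(values)
    n = len(ordered)
    target = Fraction(p)
    for position, value in enumerate(ordered, start=1):
        # position / n reaches the value's cume_dist at its last tied row,
        # so the first position meeting the target identifies the right value.
        if Fraction(position, n) >= target:
            return value
    raise ValueError("p must be between 0 and 1 and values must not be empty")


dept30_sal = [950, 1250, 1250, 1500, 1600, 2850]  # assumed sample data

print(percentile_disc(dept30_sal, "0.7"))  # cume_dist 4/6 < 0.7 <= 5/6
print(percentile_disc(dept30_sal, "0.5"))  # exact match at cume_dist 3/6
```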