<rt id="bn8ez"></rt>
<label id="bn8ez"></label>

  • <span id="bn8ez"></span>

    <label id="bn8ez"><meter id="bn8ez"></meter></label>

    Decode360's Blog

    業精于勤而荒于嬉 QQ:150355677 MSN:decode360@hotmail.com

      BlogJava :: 首頁 :: 新隨筆 :: 聯系 ::  :: 管理 ::
      397 隨筆 :: 33 文章 :: 29 評論 :: 0 Trackbacks
    取重復記錄中的任一條的分析
    ?
    ??? 最近一直考慮一個問題,找出表中同一字段重復記錄中的任意一條,照理來說這樣的一個需求,在Oracle內部進行實現是很方便的,而且不需要對表進行2遍的掃描。但是事實上我想了很久也向不出來有什么函數可以直接實現這一功能,基本上所有可以這樣做的方法都需要進行嵌套才能完成。不知道為什么Oracle沒有提供這個功能,也許是有什么邏輯矛盾我沒有想到。現在總結一下,模擬的環境如下:
    ?
    ??? create table t1_t(a int,b varchar2(50));
    ??? insert into t1_t(a,b) (select mod(rownum,50) rn,object_name from user_objects);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? commit;
    ?
    ??? 至此t1_t表大致上有300萬條記錄左右,其中a字段的取值都是0-49之間的integer,現在的目標就是取到50條記錄,a分別等于0~49,具體哪一條都可以,但需要同時取出b字段。
    ?
    ?
    一、方法大致有兩種:
    ?
    1、使用row_number() over函數找到首條記錄選出
    ?
    ??? select * from
    ??? (select a,b,row_number() over(partition by a order by 1) k from t1_t)
    ??? where k=1;
    ?
    ??? 實際執行時間:在軟解析的情況下需要3s左右。執行計劃如下:
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=32068 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3255 Card=3632040 Bytes=101697120)
    ?
    Statistics
    ----------------------------------------------------------
    ??????? 165? recursive calls
    ???????? 22? db block gets
    ????? 17303? consistent gets
    ?????? 1937? physical reads
    ????? 75904? redo size
    ?????? 2682? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 4? sorts (memory)
    ????????? 1? sorts (disk)
    ???????? 50? rows processed
    ?
    2、兩層關聯,用rowid別名進行匹配:
    ?
    ??? select t1.* from t1_t t1,(select min(rowid) rid from t1_t group by a) t2
    ??? where t1.rowid = t2.rid;
    ?
    ??? 實際執行時間:軟解析情況下1.5s左右。執行計劃如下,比上面的方法要好很多,主要是由于不需要進行排序的操作:
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3919 Card=50 Bytes=2000)
    ?? 1??? 0?? NESTED LOOPS (Cost=3919 Card=50 Bytes=2000)
    ?? 2??? 1???? VIEW (Cost=3869 Card=50 Bytes=600)
    ?? 3??? 2?????? HASH (GROUP BY) (Cost=3869 Card=50 Bytes=750)
    ?? 4??? 3???????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3243 Card=3632040 Bytes=54480600)
    ?? 5??? 1???? TABLE ACCESS (BY USER ROWID) OF 'T1_T' (TABLE) (Cost=1 Card=1 Bytes=28)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ????? 16278? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2638? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ?
    二、考慮使用索引
    ?
    ?
    ??? 首先增加a、b字段的索引:
    ?

    ??? create index t1_t_a on t1_t(a);

    ??? create index t1_t_b on t1_t(b);

    ???
    ??? 要使用索引,需要將字段設置為not null,或在SQL中使用not null選項才可以:
    ?

    ??? alter table t1_t modify a not null ;

    ??? alter table t1_t modify b not null ;

    ?
    ?
    1、對于第1種方法,執行計劃完全沒變,執行效率略有提高
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=32068 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3255 Card=3632040 Bytes=101697120)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 2? recursive calls
    ???????? 22? db block gets
    ????? 16228? consistent gets
    ???????? 21? physical reads
    ????????? 0? redo size
    ?????? 2682? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 1? sorts (disk)
    ???????? 50? rows processed
    ?
    2、對于第2中方法,用INDEX FAST FULL SCAN 代替了 TABLE ACCESS FULL SCAN,執行效率也略有提高
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2114 Card=50 Bytes=2000)
    ?? 1??? 0?? NESTED LOOPS (Cost=2114 Card=50 Bytes=2000)
    ?? 2??? 1???? VIEW (Cost=2064 Card=50 Bytes=600)
    ?? 3??? 2?????? HASH (GROUP BY) (Cost=2064 Card=50 Bytes=750)
    ?? 4??? 3???????? INDEX (FAST FULL SCAN) OF 'T1_T_A' (INDEX) (Cost=1439 Card=3632040 Bytes=54480600)
    ?? 5??? 1???? TABLE ACCESS (BY USER ROWID) OF 'T1_T' (TABLE) (Cost=1 Card=1 Bytes=28)

    Statistics
    ----------------------------------------------------------
    ??????? 210? recursive calls
    ????????? 0? db block gets
    ?????? 7145? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2638? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 5? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ?
    三、強制使用索引
    ?
    1、對于方法1,使用索引t1_t_a
    ?

    ??? select * from

    ??? ( select /*+INDEX(t1_t t1_t_a)*/ a,b,row_number() over( partition by a order by 1 ) k from t1_t)

    ??? where k= 1 ;


    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=844081 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=844081 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=844081 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (BY INDEX ROWID) OF 'T1_T' (TABLE) (Cost=815268 Card=3632040Bytes=101697120)
    ?
    ?? 4??? 3???????? INDEX (FULL SCAN) OF 'T1_T_A' (INDEX) (Cost=7154 Card=3632040)

    Statistics
    ----------------------------------------------------------
    ????????? 0? recursive calls
    ????????? 0? db block gets
    ???? 814027? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2817? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ??? 相比沒有使用索引時的效率還要低的多,因為是全表掃描,所以使用全索引掃描造成了額外的開銷。
    ?
    2、對于方法2:
    ?
    ??? 將INDEX FAST FULL SCAN 改成INDEX FULL SCAN 明顯沒有什么意義。
    ?
    ?
    四、a,b的聯合索引:
    ?
    ??? 首先創建a,b的聯合索引:
    ??? createindex t1_t_ab on t1_t(a,b);
    ?
    1、對方法1,使用索引t1_t_ab
    ?
    ?

    ??? select * from

    ??? ( select /*+INDEX(t1_t t1_t_ab)*/ a,b,row_number() over( partition by a order by 1 ) k from t1_t)

    ??? where k= 1 ;

    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=48327 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=48327 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=48327 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? INDEX (FULL SCAN) OF 'T1_T_AB' (INDEX) (Cost=19514 Card=3632040 Bytes=1016
    ????????? 97120)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ????? 19368? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2783? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ??? 相比沒有使用索引的時候,性能有所降低,但是比使用t1_t_a索引要好的多了,但是如果對于a字段進行限制時,row_number() over函數會自動找到t1_t_ab索引,效果會很好
    ?

    ??? select * from

    ??? ( select a,b,row_number() over( partition by a order by 1 ) k from t1_twhere a='1' )

    ??? where k= 1 ;

    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1119 Card=84004 Bytes=4452212)
    ?? 1??? 0?? VIEW (Cost=1119 Card=84004 Bytes=4452212)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=1119 Card=84004 Bytes=2352112)
    ?? 3??? 2?????? INDEX (RANGE SCAN) OF 'T1_T_AB' (INDEX) (Cost=453 Card=84004 Bytes=2352112)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ??????? 410? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ??????? 537? bytes sent via SQL*Net to client
    ??????? 504? bytes received via SQL*Net from client
    ????????? 2? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ????????? 1? rows processed
    ?
    ?
    五、總結
    ?
    ??? 基本上如果是要在全表中找到所有a的單條記錄,那么加不加索引的區別都不大,因為全表掃描是沒有必要走索引的,但是用方法2會比方法1要快速很多,資源占用也較少,原因是不需要排序的環節。
    ?
    ??? 如果只是想要取出單個a值的任意記錄,則可以選用方法1,并建立a,b字段的聯合索引,會大大提高效率。
    ?
    ?
    ?
    posted on 2009-03-04 21:57 decode360 閱讀(251) 評論(0)  編輯  收藏 所屬分類: 05.SQL
    主站蜘蛛池模板: 亚洲人成网站日本片| 国产大片91精品免费观看男同| 男人和女人高潮免费网站| 成人婷婷网色偷偷亚洲男人的天堂| 黄色三级三级免费看| 免费成人在线电影| 免费能直接在线观看黄的视频 | 免费无码黄网站在线看| 无码av免费网站| 日本不卡视频免费| 亚洲女久久久噜噜噜熟女| 亚洲免费人成在线视频观看| 国产在线精品观看免费观看| 色老头综合免费视频| 免费看黄的成人APP| 亚洲精品无码久久一线| 亚洲国色天香视频| 国内精品免费视频精选在线观看| 国产A在亚洲线播放| 日本精品久久久久久久久免费| 亚洲AV无码一区二三区| 国产日本亚洲一区二区三区| 全部一级一级毛片免费看| 亚洲乳大丰满中文字幕| 99热这里只有精品6免费| 亚洲成A人片在线观看中文| 又黄又大的激情视频在线观看免费视频社区在线 | 亚洲人成网站在线播放影院在线| 亚洲高清视频在线| 日本不卡免费新一区二区三区 | 女人18毛片a级毛片免费| 亚洲精品少妇30p| 91精品国产免费| 亚洲精品无码成人AAA片| 久久青草免费91线频观看站街| 亚洲伊人久久大香线焦| 国产一区二区三区免费在线观看 | 国产精品免费久久久久久久久| 天天看免费高清影视| 免费一级毛片在线播放视频免费观看永久| 亚洲日本韩国在线|