<rt id="bn8ez"></rt>
<label id="bn8ez"></label>

  • <span id="bn8ez"></span>

    <label id="bn8ez"><meter id="bn8ez"></meter></label>

    Decode360's Blog

    業精于勤而荒于嬉 QQ:150355677 MSN:decode360@hotmail.com

      BlogJava :: 首頁 :: 新隨筆 :: 聯系 ::  :: 管理 ::
      302 隨筆 :: 26 文章 :: 82 評論 :: 0 Trackbacks
    取重復記錄中的任一條的分析
    ?
    ?
    ??? 最近一直考慮一個問題,找出表中同一字段重復記錄中的任意一條,模擬的環境如下:
    ?
    ??? create table t1_t(a int,b varchar2(50));
    ??? insert into t1_t(a,b) (select mod(rownum,50) rn,object_name from user_objects);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? commit;
    ?
    ??? 至此t1_t表大致上有300萬條記錄左右,其中a字段的取值都是0-49之間的integer,現在的目標就是取到50條記錄,a分別等于0~49,具體哪一條都可以,但需要同時取出b字段。
    ?
    ?
    一、方法大致有兩種:
    ?
    1、使用row_number() over函數找到首條記錄選出
    ?
    ??? select * from?
    ??? (select a,b,row_number() over(partition by a order by 1) k from t1_t)
    ??? where k=1;
    ?
    ??? 實際執行時間:在軟解析的情況下需要3s左右。執行計劃如下:
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=32068 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3255 Card=3632040 Bytes=101697120)
    ?
    Statistics
    ----------------------------------------------------------
    ??????? 165? recursive calls
    ???????? 22? db block gets
    ????? 17303? consistent gets
    ?????? 1937? physical reads
    ????? 75904? redo size
    ?????? 2682? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 4? sorts (memory)
    ????????? 1? sorts (disk)
    ???????? 50? rows processed
    ?
    2、兩層關聯,用rowid別名進行匹配:
    ?
    ??? select t1.* from t1_t t1,(select min(rowid) rid from t1_t group by a) t2
    ??? where t1.rowid = t2.rid;
    ?
    ??? 實際執行時間:軟解析情況下1.5s左右。執行計劃如下,比上面的方法要好很多,主要是由于不需要進行排序的操作:
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3919 Card=50 Bytes=2000)
    ?? 1??? 0?? NESTED LOOPS (Cost=3919 Card=50 Bytes=2000)
    ?? 2??? 1???? VIEW (Cost=3869 Card=50 Bytes=600)
    ?? 3??? 2?????? HASH (GROUP BY) (Cost=3869 Card=50 Bytes=750)
    ?? 4??? 3???????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3243 Card=3632040 Bytes=54480600)
    ?? 5??? 1???? TABLE ACCESS (BY USER ROWID) OF 'T1_T' (TABLE) (Cost=1 Card=1 Bytes=28)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ????? 16278? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2638? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ?
    二、考慮使用索引
    ?
    ?
    ??? 首先增加a、b字段的索引:
    ?

    ??? create index t1_t_a on t1_t(a);

    ??? create index t1_t_b on t1_t(b);

    ???
    ??? 要使用索引,需要將字段設置為not null,或在SQL中使用not null選項才可以:
    ?

    ??? alter table t1_t modify a not null ;

    ??? alter table t1_t modify b not null ;

    ?
    ?
    1、對于第1種方法,執行計劃完全沒變,執行效率略有提高
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=32068 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3255 Card=3632040 Bytes=101697120)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 2? recursive calls
    ???????? 22? db block gets
    ????? 16228? consistent gets
    ???????? 21? physical reads
    ????????? 0? redo size
    ?????? 2682? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 1? sorts (disk)
    ???????? 50? rows processed
    ?
    2、對于第2中方法,用INDEX FAST FULL SCAN 代替了 TABLE ACCESS FULL SCAN,執行效率也略有提高
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2114 Card=50 Bytes=2000)
    ?? 1??? 0?? NESTED LOOPS (Cost=2114 Card=50 Bytes=2000)
    ?? 2??? 1???? VIEW (Cost=2064 Card=50 Bytes=600)
    ?? 3??? 2?????? HASH (GROUP BY) (Cost=2064 Card=50 Bytes=750)
    ?? 4??? 3???????? INDEX (FAST FULL SCAN) OF 'T1_T_A' (INDEX) (Cost=1439 Card=3632040 Bytes=54480600)
    ?? 5??? 1???? TABLE ACCESS (BY USER ROWID) OF 'T1_T' (TABLE) (Cost=1 Card=1 Bytes=28)

    Statistics
    ----------------------------------------------------------
    ??????? 210? recursive calls
    ????????? 0? db block gets
    ?????? 7145? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2638? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 5? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ?
    三、強制使用索引
    ?
    1、對于方法1,使用索引t1_t_a
    ?

    ??? select * from

    ??? ( select /*+INDEX(t1_t t1_t_a)*/ a,b,row_number() over( partition by a order by 1 ) k from t1_t)

    ??? where k= 1 ;


    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=844081 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=844081 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=844081 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (BY INDEX ROWID) OF 'T1_T' (TABLE) (Cost=815268 Card=3632040Bytes=101697120)
    ?
    ?? 4??? 3???????? INDEX (FULL SCAN) OF 'T1_T_A' (INDEX) (Cost=7154 Card=3632040)

    Statistics
    ----------------------------------------------------------
    ????????? 0? recursive calls
    ????????? 0? db block gets
    ???? 814027? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2817? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ??? 相比沒有使用索引時的效率還要低的多,因為是全表掃描,所以使用全索引掃描造成了額外的開銷。
    ?
    2、對于方法2:
    ?
    ??? 將INDEX FAST FULL SCAN 改成INDEX FULL SCAN 明顯沒有什么意義。
    ?
    ?
    四、a,b的聯合索引:
    ?
    ??? 首先創建a,b的聯合索引:
    ??? createindex t1_t_ab on t1_t(a,b);
    ?
    1、對方法1,使用索引t1_t_ab
    ?
    ?

    ??? select * from

    ??? ( select /*+INDEX(t1_t t1_t_ab)*/ a,b,row_number() over( partition by a order by 1 ) k from t1_t)

    ??? where k= 1 ;

    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=48327 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=48327 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=48327 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? INDEX (FULL SCAN) OF 'T1_T_AB' (INDEX) (Cost=19514 Card=3632040 Bytes=1016
    ????????? 97120)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ????? 19368? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2783? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ??? 相比沒有使用索引的時候,性能有所降低,但是比使用t1_t_a索引要好的多了,但是如果對于a字段進行限制時,row_number() over函數會自動找到t1_t_ab索引,效果會很好
    ?

    ??? select * from

    ??? ( select a,b,row_number() over( partition by a order by 1 ) k from t1_twhere a='1' )

    ??? where k= 1 ;

    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1119 Card=84004 Bytes=4452212)
    ?? 1??? 0?? VIEW (Cost=1119 Card=84004 Bytes=4452212)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=1119 Card=84004 Bytes=2352112)
    ?? 3??? 2?????? INDEX (RANGE SCAN) OF 'T1_T_AB' (INDEX) (Cost=453 Card=84004 Bytes=2352112)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ??????? 410? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ??????? 537? bytes sent via SQL*Net to client
    ??????? 504? bytes received via SQL*Net from client
    ????????? 2? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ????????? 1? rows processed
    ?
    ?
    五、總結
    ?
    ??? 基本上如果是要在全表中找到所有a的單條記錄,那么加不加索引的區別都不大,因為全表掃描是沒有必要走索引的,但是用方法2會比方法1要快速很多,資源占用也較少,原因是不需要排序的環節。
    ?
    ??? 如果只是想要取出單個a值的任意記錄,則可以選用方法1,并建立a,b字段的聯合索引,會大大提高效率。
    ?
    ?




    -The End-

    posted on 2009-03-04 21:57 decode360-3 閱讀(389) 評論(0)  編輯  收藏 所屬分類: SQL Dev
    主站蜘蛛池模板: 亚洲视频在线免费观看| 亚洲精品国产高清在线观看| 亚洲日本一区二区三区在线不卡| 免费无遮挡无码永久在线观看视频| 污视频在线免费观看| 亚洲人成中文字幕在线观看| 亚洲精品国产免费| 亚洲精品色在线网站| 一级做a爰性色毛片免费| 99爱在线精品视频免费观看9| 国产免费变态视频网址网站| 美女羞羞免费视频网站| 暖暖免费在线中文日本| 国产禁女女网站免费看| 亚洲成AV人片一区二区密柚| 亚洲五月丁香综合视频| 中文字幕不卡免费视频| 在线免费观看中文字幕| 国产亚洲AV无码AV男人的天堂| 亚洲Av永久无码精品黑人| 久久免费精品视频| 亚洲国产精品无码久久青草| 亚洲特级aaaaaa毛片| 久久精品网站免费观看| 亚洲国产精品自在线一区二区 | 亚洲AV无码一区二区二三区软件| 中文字幕乱码亚洲无线三区 | 国产精品无码亚洲一区二区三区| 国产成人免费片在线观看| 一级一看免费完整版毛片| 久久亚洲成a人片| 一级有奶水毛片免费看| 国产jizzjizz免费视频| 三年片免费高清版 | 亚洲欧美成人av在线观看| 69av免费视频| 亚洲AV综合色区无码一区爱AV| 1000部禁片黄的免费看| 亚洲国产精品久久久久久| 成人免费午夜在线观看| 国产精品一区二区三区免费|