客戶數據庫出現ORA-4031錯誤,隨后出現了大量的ORA-1003和ORA-604錯誤。
數據庫版本為10.2.0.3 RAC for HP-UX,詳細的報錯信息為:
Mon Jul 16 15:30:30 2012
Errors in file /u01/app/oracle/admin/ORCL/udump/orcl2_ora_2389.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","select name,online$,contents...","sql area","tmp")
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","select name,online$,contents...","sql area","tmp")
Mon Jul 16 15:30:32 2012
Errors in file /u01/app/oracle/admin/ORCL/udump/orcl2_ora_2878.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 144 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","coldef: qcopCreateCol")
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 144 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","coldef: qcopCreateCol")
Mon Jul 16 15:30:32 2012
Errors in file /u01/app/oracle/admin/ORCL/udump/orcl2_ora_10030.trc:
ORA-00603: ORACLE server session terminated by fatal error tb
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 144 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","coldef: qcopCreateCol")
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 144 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","coldef: qcopCreateCol")
Mon Jul 16 15:30:39 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","select o.owner#,o.name,o.nam...","sql area","kobjn : kkdcchs")
Mon Jul 16 15:30:40 2012
Trace dumping is performing id=[cdmp_20120716153040]
Mon Jul 16 15:32:19 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
Mon Jul 16 15:33:59 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
Mon Jul 16 15:35:39 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
Mon Jul 16 15:37:19 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
Mon Jul 16 15:39:00 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
Mon Jul 16 15:40:40 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQ
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: err
Mon Jul 16 15:41:59 2012
Thread 2 advanced to log sequence 61522
Current log# 7 seq# 61522 mem# 0: +ORCL_CTL/orcl/onlinelog/group_7.263.611598065
Mon Jul 16 15:42:20 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
Mon Jul 16 15:44:00 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
Mon Jul 16 15:45:40 2012
Errors in file /u01/app/oracle/admin/ORCL/bdump/orcl2_cjq0_29269.trc:
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQ
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: error occurred at recursive SQL level 2
ORA-01003: no statement parsed
ORA-00604: error occurred at recursive SQL level ORA-00604: err
可以看得,開始還是比較“正常”的ORA-4031錯誤,但是隨著CJQ進程出現了ORA-4031的錯誤,數據庫開始每1分40秒報一次ORA-1003和ORA-604的錯誤。這里CJQ進程似乎陷入了一個死循環中,持續不停的報錯。查詢MOS發現,在9.2中Oracle存在類似的BUG:Bug 3289063 ORA-1003 every 5 seconds after CJQ hits ORA-4031。這個bug已經在9.2.0.5和10.1.0.2中被FIXED,但是這個bug的現象和當前的錯誤完全一致,從這點上看,這個問題顯然在10.2.0.3中仍然存在。
Oracle給出的解決方案也很簡單,重啟CJQ進程既可。除了在操作系統去kill -9殺掉cjq0對應的進程之外,直接通過alter system set job_queue_processes=0,然后在設置回原值,也有可能解決該問題。此外,盡量避免ORA-4031錯誤的產生是最根本的問題解決之道。