Automatic UNDO Retention 特性导致Oracle10g undo tablespace表空间使用率异常

教程发布:风哥 教程分类:ITPUX技术网 更新日期:2022-02-12 浏览学习:1814

作者:

1 问题描述
近几天发现Oracle10.2.0.4数据库undo tablespace表空间使用率很高(最高时达到100%),报警系统频繁报障undo tablespace表空间使用率过高。
TABLESPACE_NAME TOTAL USED FREE PUSED PFREE
-------------------- ---------- ---------- ---------- ---------- ----------
UNDOTBS1 6141 % %
[@more@]2 问题分析
2.1 业务系统影响
undo tablespace表空间使用率达到100%时,没有业务系统用户反映系统出现表空间不能扩展的报错(ORA-30036),但曾经出现过快照过旧报错(ORA-01555)。
2.2 数据库日志
查看alert_instdb.log报警文件,发现从Tue May 18 10:51:27 2010到Sat May 22 14:55:59 2010,共出现了15次与undo tablespace有关的ORA-01555报错,这些报错都是由同一条SQL语句引起的(SQL ID: 1h54gg6shbrkh),语句执行时间都很长(在20万秒以上),最长执行时间达到355659 sec。
Tue May 18 10:51:27 2010
ORA-01555 caused by SQL statement below (SQL ID: 1h54gg6shbrkh, Query Duration=349634 sec, SCN: 0x0a1b.a93246fa):
Tue May 18 10:51:27 2010
select decode(temp_kp.mc,null,temp_hdhs.mc,temp_kp.mc) as mc,
decode(temp_kp.jzyje,null, 0,temp_kp.jzyje) as jzyje,
decode(temp_kp.jzyfs,null, 0,temp_kp.jzyfs) as jzyfs,
decode(temp_kp.bdcje,null, 0,temp_kp.bdcje) as bdcje,
decode(temp_kp.bdcfs,null, 0,temp_kp.bdcfs) as bdcfs,
decode(temp_kp.ptje,null, 0,temp_kp.ptje) as ptje,
decode(temp_kp.ptfs,null, 0,temp_kp.ptfs) as ptfs,
decode(temp_kp.dkje,null, 0,temp_kp.dkje) as dkje,
decode(temp_kp.dkfs,null, 0,temp_kp.dkfs) as dkfs,
decode(temp_kp.bgdlyje,null, 0,temp_kp.bgdlyje) as bgdlyje,
decode(temp_kp.bgdlyfs,null, 0,temp_kp.bgdlyfs) as bgdlyfs,
decode(temp_kp.gjhyyje,null, 0,temp_kp.gjhyyje) as gjhyyje,
decode(temp_kp.gjhyyfs,null, 0,temp_kp.gjhyyfs) as gjhyyfs,
decode(temp_kp.gjhyycpdlje,null, 0,temp_kp.gjhyycpdlje) as gjhyycpdlje,
decode(temp_kp.gjhyycpdlfs,null, 0,temp_kp.gjhyycpdlfs) as gjhyycpdlfs,
decode(temp_kp.gjhwysje,null, 0,temp_kp.gjhwysje) as
2.3 Undo参数
SQL> show parameter undo
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS1
2.4 Undo tablespace空间使用情况
select file_name,autoextensible,bytes/1048576 MB,increment_by,maxbytes/1048576 Max_MB
from dba_data_files
where tablespace_name = 'UNDOTBS1
FILE_NAME AUTOEXTENSIBLE MB INCREMENT_BY MAX_MB
/dev/rinstdbUNDO_lv NO 1023 0 0
/dev/rinstdbUNDO2_lv NO 3071 0 0
/dev/rinstdbUNDO3_lv NO 2047 0 0
select tablespace_name,status, sum(bytes)/1024/1024 MB, count(*)
from dba_undo_extents
group by tablespace_name,status
TABLESPACE_NAME STATUS MB COUNT(*)
TS_UNDO EXPIRED 2.375 38
UNDOTBS1 UNEXPIRED 4344.75 2687
UNDOTBS1 EXPIRED 1796.0625 3146
UNDOTBS1表空间总空间是6141MB,已分配空间达到6140.8125MB,其中UNEXPIRED EXTENT空间占4344.75MB。
2.5 V$UNDOSTAT和DBA_HIST_UNDOSTAT
查询V$UNDOSTAT视图和DBA_HIST_UNDOSTAT数据字典可以看到:
1. 从2010-05-19 PM 10:53:38到2010-5-22 04:53:38 PM,MAXQUERYLEN值都保持在20万秒以上,MAXQUERYLEN最大值是285073s (2010-5-22 04:53:38 PM),
2. TUNED_UNDORETENTION保持在20万秒以上,TUNED_UNDORETENTION最大值是345742s (2010-5-26 09:13:38 AM),TUNED_UNDORETENTION当前值是255484(2010-5-27 02:43:38 PM)
3. SSOLDERRCNT累计值大于0
4. NOSPACEERRCNT一直都是0
从以上数据可以了解到,从2010-05-19 PM 10:53:38到2010-5-22 04:53:38 PM期间出现了多次异常超长时间查询,这些长时间查询导致TUNED_UNDORETENTION参数值变大,这些查询曾经导致ORA-01555报错,但没有导致DML语句出错。
注:从oracle10g开始,oracle提供了Automatic UNDO Retention特性,即oracle会根据实际情况自动调整undo_retention值,以尽可能避免ORA-01555的报错。v$undostat视图中几个主要列的解释:
MAXQUERYLEN Identifies the length of the longest query (in seconds) executed in the instance during the period. You can use this statistic to estimate the proper setting of the UNDO_RETENTION initialization parameter. The length of a query is measured from the cursor open time to the last fetch/execute time of the cursor. Only the length of those cursors that have been fetched/executed during the period are reflected in the view.
SSOLDERRCNT Identifies the number of times the error ORA-01555 occurred.
NOSPACEERRCNT Identifies the number of times space was requested in the undo tablespace and there was no free space available. That is, all of the space in the undo tablespace was in use by active transactions. The corrective action is to add more space to the undo tablespace.
TUNED_UNDORETENTION System tuned value indicating the period for which undo is being retained
3 总结及建议
3.1 Undo tablespace使用率100%原因
从oracle10g开始,oracle提供了Automatic UNDO Retention特性,即oracle会根据实际情况自动调整undo_retention值,从2010.5.18至2010.5.22期间产生了多次超长时间查询,这段期间MAXQUERYLEN值保持在20万秒以上,为了尽可能避免ORA-01555的报错,尽可能保留更多的UNEXPIRED EXTENT,ORACLE会首先选择在undo tablespace中扩充rollback segment,所以undo tablespace使用率会不断上升,最终导致了undo tablespace空间使用率达到100%。
3.2 Undo tablespace空间监控方法
undo tablespace空间使用率达到100%,只是意味着所有的空间都已经分配给rollback segment,但并不一定会影响到业务的select和insert、delete、update操作,因为rollback segment中的空间是可以重用的。在oracle10g,不能通过监控dba_free_space数据库字典的方法去获得undo tablespace的使用情况,可以考虑通过以下方法进行监控:
1. 监控v$undostat视图,主要包括MAXQUERYLEN、SSOLDERRCNT、NOSPACEERRCNT列值,如:MAXQUERYLEN大于36000秒则报警、SSOLDERRCNT大于0次(或一定次数)则报警、NOSPACEERRCNT大于0次则严重警告等;
2. 监控平均每一秒钟所需的最少回滚表空间容量,如果超过一定范围则报警
SELECT (UR * (UPS * DBS)) + (DBS * 24) AS "Bytes"
FROM (SELECT value AS UR FROM v$parameter WHERE name = 'undo_retention'),
(SELECT (SUM(undoblks)/SUM(((end_time - begin_time)*86400))) AS UPS FROM v$undostat),
(select block_size as DBS from dba_tablespaces where tablespace_name=
(select value from v$parameter where name = 'undo_tablespace'));

本文标签:
网站声明:本文由风哥整理发布,转载请保留此段声明,本站所有内容将不对其使用后果做任何承诺,请读者谨慎使用!
【上一篇】
【下一篇】