« Exadata现场演示与锁机制-ACOUG 7月活动小记 | Blog首页 | 《循序渐进Oracle》第二版一书相关代码 »
Oracle数据恢复:SYSTEM回滚段损坏案例一则
作者:eygle | 【转载请注出处】|【云和恩墨 领先的zData数据库一体机 | zCloud PaaS云管平台 | SQM SQL审核平台 | ZDBM 数据库备份一体机】
链接:https://www.eygle.com/archives/2011/08/system_rollback_corruption.html
链接:https://www.eygle.com/archives/2011/08/system_rollback_corruption.html
前一段时间,接收到一次用户报告,用户因为断电导致了数据库故障.启动时遇到了01555的错误.
通常ORA-01555错误并不可怕,但是如果出现在SYSTEM回滚段上,则问题就严重了,因为SYSTEM回滚段无法Offline,也无法重建.以下是错误的主要信息:
Thu Jul 07 15:18:20 CST 2011ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2Thu Jul 07 15:18:20 CST 2011ORA-01555 caused by SQL statement below (SQL ID: 7bd391hat42zk, Query Duration=0 sec, SCN: 0x000a.79ed044d):Thu Jul 07 15:18:20 CST 2011select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1Thu Jul 07 15:18:20 CST 2011Errors in file /home/oracle/oracle/admin/EDB01/udump/edb01_ora_1208.trc:ORA-00604: error occurred at recursive SQL level 1ORA-01555: snapshot too old: rollback segment number 0 with name "SYSTEM" too smallError 604 happened during db open, shutting down databaseUSER: terminating instance due to error 604Instance terminated by USER, pid = 1208ORA-1092 signalled during: alter database open...
注意,以下一段SQL非常著名:
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
这是启动过程中对于undo$的递归读取,获得其中的回滚段信息.如果某个回滚段上存在活动事务,则该事务必须被读取回滚,以便保证事务的一致性.
以下是Oracle 9i的SYSTEM回滚段空间分配,通常这些数据块损坏会非常复杂:
SQL> select segment_name,block_id,blocks from dba_extents where segment_name='SYSTEM';SEGMENT_NAME BLOCK_ID BLOCKS------------------------------ ---------- ----------SYSTEM 9 8SYSTEM 17 8SYSTEM 385 8SYSTEM 393 8SYSTEM 401 8SYSTEM 409 8
对于SYSTEM回滚段,其为Oracle数据库第一个创建的回滚段,主要用于数据库的内部事务或SYS的事务信息记录。如果数据库创建了其他用户的回滚段,则SYSTEM回滚段将近用于UNDO$的信息记录,这也是为什么在出现问题时,我们看到的是在undo$读取时抛出的异常。
在sql.bsq文件中,记录了数据库创建第一个步骤中的SYSTEM回滚段信息:
create tablespace SYSTEM datafile "D_DBFN""D_DSTG" online/create rollback segment SYSTEM tablespace SYSTEMstorage (initial 50K next 50K)/
系统回滚段的作用如下:
When a database is first created using the CREATE DATABASE command, only a single rollback segment is created.This is the system rollback segment and it is created in the system tablespace.The system rollback segment has one basic difference from any other rollback segment, including any other rollback segments that are created in the system tablespace.This difference is that the system rollback segment can only be used for transactions that occur on objects inside the system tablespace.This is done because the main purpose of the system rollback segment is to handle rollback for DDL transactions - that is transactions against the data dictionary tables themselves. Making the system rollback usable only for the system tablespace was simply an easy way to enforce that.It is possible for the system rollback segment to be used for non-data dictionary tables, but only if those tables are created inside the system tablespace (which is very bad development practice).
Steve对此的重要注解:
When other rollback segments are available, the SYSTEM rollback segment is only used for the changes to UNDO$ associated with bringing other rollback segments online, or taking them offline.
至于SYSTEM回滚段损坏,你最好有备份,否则就只能通过BBED去修改相关的数据块,在《Oracle DBA手记 2》上曾经有此案例。
历史上的今天...
>> 2007-08-05文章:
>> 2005-08-05文章:
By eygle on 2011-08-05 14:14 | Comments (1) | Backup&Recovery | 2847 |
大师,我觉的解决该问题的核心思想应该是增大这个数据库的scn或者减小system回滚段对应这条递归SQL的Itl槽的commit scn来解决该问题。前者比后者处理起来简单一些。当然我还没有进行测试,如果思路上有问题的话,请大师指正。