Oracle Database Recovery: File Loss Caused by Storage and System Failures
Author: eygle | [Please credit the source when reposting]
Link: https://www.eygle.com/archives/2010/12/storage_fault_recovery.html
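Last week I helped a user recover a database. The situation was not complicated, but the process is worth writing down.
It began with a storage-level failure that corrupted data files and even caused some of them to go missing. The database first reported the following exceptions in the alert log: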
Wed Nov 1 17:12:45 2010
Errors in file /ADMIN/bdump/erp_smon_14821.trc:
ORA-00600: internal error code, arguments: [2662], [1388], [4005408990], [1388], [4005425099], [1484804288], [], []
Wed Nov 1 17:12:47 2010
Non-fatal internal error happenned while SMON was doing flushing of monitored table stats.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Wed Nov 1 17:12:47 2010
Errors in file /ADMIN/bdump/erp_smon_14821.trc:
ORA-00600: internal error code, arguments: [2662], [1388], [4005408993], [1388], [4005425099], [1484804288], [], []
Non-fatal internal error happenned while SMON was doing extent coalescing.
SMON encountered 2 out of maximum 100 non-fatal internal errors.
ORA-00474: SMON process terminated with error
Note that the ORA-600 [2662] error itself is nothing special and is not particularly hard to deal with. What deserves attention is that it was raised while SMON was flushing monitored table stats and doing extent coalescing. The latter is especially interesting, because many people once believed that this behavior no longer existed in 10g.
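As a side note on reading the trace above, the bracketed arguments can be decoded mechanically. The sketch below assumes the commonly described layout for ORA-600 [2662] (current SCN wrap and base, then dependent SCN wrap and base, then the data block address) and the usual 10-bit file / 22-bit block split of a DBA. Neither assumption is stated in this post, and `parse_ora600_2662` is a hypothetical helper name.

```python
import re

# Assumed argument layout (not stated in the post):
# [2662], [current SCN wrap], [current SCN base],
#         [dependent SCN wrap], [dependent SCN base], [DBA of the block]
def parse_ora600_2662(line):
    """Decode the numeric arguments of an ORA-600 [2662] alert log line."""
    args = [int(a) for a in re.findall(r"\[(\d+)\]", line)]
    if len(args) < 6 or args[0] != 2662:
        raise ValueError("not an ORA-600 [2662] line")
    _, cur_wrap, cur_base, dep_wrap, dep_base, dba = args[:6]
    current_scn = (cur_wrap << 32) | cur_base
    dependent_scn = (dep_wrap << 32) | dep_base
    return {
        "current_scn": current_scn,
        "dependent_scn": dependent_scn,
        # How far the block's SCN is ahead of the database's SCN.
        "scn_gap": dependent_scn - current_scn,
        # Assumed split: top 10 bits = relative file#, low 22 bits = block#.
        "file_no": dba >> 22,
        "block_no": dba & 0x3FFFFF,
    }

line = ("ORA-00600: internal error code, arguments: [2662], [1388], "
        "[4005408990], [1388], [4005425099], [1484804288], [], []")
info = parse_ora600_2662(line)
print(info)
```

Under these assumptions, the first error line decodes to an SCN gap of 16109 and a block at relative file 354, block 20672.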
These errors caused the database to crash, and after the restart things got much worse:
Errors in file /ADMIN/bdump/erp_q000_6024.trc:
ORA-01578: ORACLE data block corrupted (file # 11, block # 62612)
ORA-01110: data file 11: '/data/sysaux01.dbf'
The SYSAUX tablespace was corrupted, and a dbv check showed that the file had only 5 blocks left, 4 of which were corrupt:
Page 2 is marked corrupt
Corrupt block relative dba: 0x5ec00002 (file 379, block 2)
Bad header found during dbv:
Data in bad block:
type: 0 format: 2 rdba: 0x4e802e1b
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xc79b
computed block checksum: 0x0
Page 3 is marked corrupt
Corrupt block relative dba: 0x5ec00003 (file 379, block 3)
Bad header found during dbv:
Data in bad block:
type: 0 format: 2 rdba: 0x4e802e1c
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xc79c
computed block checksum: 0x0
Page 4 is marked corrupt
Corrupt block relative dba: 0x5ec00004 (file 379, block 4)
Bad header found during dbv:
Data in bad block:
type: 0 format: 2 rdba: 0x4e802e1d
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xc79d
computed block checksum: 0x0
Page 5 is marked corrupt
Corrupt block relative dba: 0x5ec00005 (file 379, block 5)
Bad header found during dbv:
Data in bad block:
type: 0 format: 2 rdba: 0x4e802e1e
last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000001
check value in block header: 0xc79e
computed block checksum: 0x0
DBVERIFY - Verification complete
Total Pages Examined : 5
Total Pages Processed (Data) : 0
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 1
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 0
Total Pages Marked Corrupt : 4
Total Pages Influx : 0
Highest block SCN : 0 (0.0)
A final check showed that almost all of the files were gone. The UNDO file had been wiped out as well:
DBVERIFY - Verification starting : FILE = undo01.dbf
Page 2 is marked corrupt
Corrupt block relative dba: 0x02c00002 (file 11, block 2)
Bad header found during dbv:
Data in bad block:
type: 2 format: 2 rdba: 0x5ec2b80c
last change scn: 0x056c.dc9017e1 seq: 0x35 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x17e10235
check value in block header: 0xf125
computed block checksum: 0x0
Page 3 is marked corrupt
Corrupt block relative dba: 0x02c00003 (file 11, block 3)
Bad header found during dbv:
Data in bad block:
type: 2 format: 2 rdba: 0x5ec2b80d
last change scn: 0x056c.dc9017e1 seq: 0x35 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x17e10235
check value in block header: 0xf50b
computed block checksum: 0x0
Page 4 is marked corrupt
Corrupt block relative dba: 0x02c00004 (file 11, block 4)
Bad header found during dbv:
Data in bad block:
type: 2 format: 2 rdba: 0x5ec2b80e
last change scn: 0x056c.dc9017e1 seq: 0x36 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x17e10236
check value in block header: 0xe0ef
computed block checksum: 0x0
Page 5 is marked corrupt
Corrupt block relative dba: 0x02c00005 (file 11, block 5)
Bad header found during dbv:
Data in bad block:
type: 2 format: 2 rdba: 0x5ec2b80f
last change scn: 0x056c.dc9017e1 seq: 0x35 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x17e10235
check value in block header: 0x1844
computed block checksum: 0x0
DBVERIFY - Verification complete
Total Pages Examined : 5
Total Pages Processed (Data) : 0
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 1
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 0
Total Pages Marked Corrupt : 4
Total Pages Influx : 0
Highest block SCN : 0 (0.0)
This means that after this one incident, all of the data files were lost from storage. How crazy!
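The dbv output can also be cross-checked by hand. The sketch below is not part of the original analysis. It assumes the usual 10-bit file / 22-bit block split of an rdba (which matches the "file 379, block 2" that dbv prints for 0x5ec00002) and the commonly described tail layout of low 16 bits of the SCN base, block type, then sequence. The helper names are hypothetical.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (file#, block#)."""
    return rdba >> 22, rdba & 0x3FFFFF

def expected_tail(scn_base, block_type, seq):
    """Assumed tail layout: low 16 bits of SCN base, block type, sequence."""
    return ((scn_base & 0xFFFF) << 16) | (block_type << 8) | seq

# (address dbv read the block from, rdba stored inside that block),
# both taken from the dbv output quoted in this post.
samples = [
    (0x5EC00002, 0x4E802E1B),
    (0x02C00002, 0x5EC2B80C),
]
for read_from, stored in samples:
    print(decode_rdba(read_from), "holds a block stamped", decode_rdba(stored))

# Values from the block at 0x02c00002: scn 0x056c.dc9017e1, type 2, seq 0x35.
print(hex(expected_tail(0xDC9017E1, 2, 0x35)))
```

Under these assumptions, the block read at file 379, block 2 is stamped as file 314, block 11803, and the block read at file 11, block 2 is stamped as file 379, block 178188. The tail values agree with the headers, which would be consistent with intact blocks sitting at the wrong location rather than random garbage, although the post itself does not draw that conclusion.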
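Next, when we tried to restore from tape, after several hours of waiting the tape reported errors and the files could not be read.
I have never fully trusted tape, and this time tape once again caused serious trouble.
For users whose databases are not too large, I strongly recommend fitting the host with a few extra disks and keeping backups locally: first for performance, and second to speed up recovery and guarantee the recovery time.
In the end the customer found a backup that had been temporarily set aside on a portable hard drive, and this backup, kept purely by chance, ultimately saved the database.
When it comes to data backups, one more copy is never too many!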
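One way to act on this advice is to copy each backup piece to additional local disks and confirm that every copy is readable and identical. The following is only a minimal sketch under that assumption. It is not the procedure used in this case, and the function names and any paths passed to them are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copy_and_verify(backup_file, extra_dirs):
    """Copy one backup file into each directory and verify every copy."""
    source = Path(backup_file)
    expected = sha256_of(source)
    copies = []
    for directory in extra_dirs:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)
        # Re-read the copy so an unreadable or damaged copy is caught now,
        # not during a restore.
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies
```

Calling `copy_and_verify` with a backup piece and two directories on different disks returns the list of verified copies, or raises if any copy does not match the source.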
By eygle on 2010-12-04 10:07 | Comments (3) | Backup&Recovery | Case | 2669 |
Yikes~~
Reality is cruel, but life is beautiful!
That scared me! The loss here would have been enormous!