一次简单的Oracle恢复Case实战记录

 更新时间:2018年10月30日 15:57:02   作者:Categories  
这篇文章主要给大家介绍了一次简单的Oracle恢复Case实战过程,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学习学习吧

发现问题

某网友的数据库由于坏盘了,并且存储掉电,导致数据库无法open了。单看其数据库alert log的错误来看,是非常之简单的,如下:

Fri Oct 26 10:33:53 2018
Recovery of Online Redo Log: Thread 1 Group 3 Seq 39 Reading mem 0
Mem# 0: /fs/fs/oradata/orcl/redo03.log
Block recovery stopped at EOT rba 39.77.16
Block recovery completed at rba 39.77.16, scn 0.1002048587
ORACLE Instance orcl (pid = 8) - Error 600 encountered while recovering transaction (9, 30) on object 9149.
Fri Oct 26 10:33:53 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_smon_192644.trc:
ORA-00600: internal error code, arguments: [6856], [0], [43], [], [], [], [], []
Fri Oct 26 10:33:56 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_smon_192644.trc:
ORA-00600: internal error code, arguments: [4194], [33], [36], [], [], [], [], []
Doing block recovery for file 2 block 713
Block recovery from logseq 39, block 82 to scn 1002048595

对于这种错误,很明显,屏蔽回滚段即可,屏蔽之后可顺利打开数据库,不过后面很快又会crash掉,因此重建undo也就绕过这个问题了。

打开数据库之后,再去观察数据库,会发现alert log有不少的错误,如下所示:

Fri Oct 26 11:01:46 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [17147], [0x110549070], [], [], [], [], [], []
Fri Oct 26 11:01:46 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_m001_373218.trc:
ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []
ORA-600 encountered when generating server alert SMG-4120
Fri Oct 26 11:01:47 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []
ORA-600 encountered when generating server alert SMG-4121
Fri Oct 26 11:01:48 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [KGHALO4], [0x11047F6F0], [], [], [], [], [], []
ORA-600 encountered when generating server alert SMG-4121
Fri Oct 26 11:01:50 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_m001_373218.trc:
ORA-00600: internal error code, arguments: [kdddgb5], [196650], [0], [], [], [], [], []
Fri Oct 26 11:02:22 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [17114], [0x110549070], [], [], [], [], [], []
Fri Oct 26 11:02:23 2018
Errors in file /fs/fs/oradata/admin/orcl/bdump/orcl_mmon_385148.trc:
ORA-00600: internal error code, arguments: [kebm_mmon_main_1], [39], [], [], [], [], [], []
ORA-00039: error during periodic action
ORA-00600: internal error code, arguments: [17114], [0x110549070], [], [], [], [], [], []
Fri Oct 26 11:03:30 2018
Restarting dead background process MMON

除此之外,由于之外alert log有坏块报错,因此对system进行了dbv检查,发现确实存在少量坏块,如下:

DBVERIFY: Release 10.2.0.4.0 - Production on Fri Oct 26 10:37:20 2018
 
Copyright (c) 1982, 2007, Oracle. All rights reserved.
 
DBVERIFY - Verification starting : FILE = system01.dbf
 
DBV-00200: Block, DBA 4255202, already marked corrupt
Block Checking: DBA = 4258751, Block Type = KTB-managed data block
data header at 0x11022a05c
kdbchk: fsbo(596) wrong, (hsz 4178)
Page 64447 failed with check code 6129
Block Checking: DBA = 4259386, Block Type = KTB-managed data block
**** kdxcofbo = 208 != 24
---- end index block validation
Page 65082 failed with check code 6401
Block Checking: DBA = 4269609, Block Type = Unlimited data segment header
Incorrect extent count in the extent map: 16777317
Block Checking: DBA = 4269612, Block Type = KTB-managed data block
**** kdxcofbo = 224 != 216
---- end index block validation
Page 75308 failed with check code 6401
Block Checking: DBA = 4269615, Block Type = KTB-managed data block
**** actual rows locked by itl 2 = 1 != # in trans. header = 0
---- end index block validation
Page 75311 failed with check code 6401
Page 85271 is influx - most likely media corrupt
Corrupt block relative dba: 0x00414d17 (file 1, block 85271)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00414d17
last change scn: 0x0000.3afaf495 seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xfe830601
check value in block header: 0x96c6
computed block checksum: 0x3c6b
 
Page 85383 is influx - most likely media corrupt
Corrupt block relative dba: 0x00414d87 (file 1, block 85383)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x00414d87
last change scn: 0x0000.3b6b9d19 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x970f0601
check value in block header: 0xe825
computed block checksum: 0x3c6b
 
DBVERIFY - Verification complete
 
Total Pages Examined : 640000
Total Pages Processed (Data) : 116312
Total Pages Failing (Data) : 1
Total Pages Processed (Index): 65914
Total Pages Failing (Index): 3
Total Pages Processed (Other): 64634
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 393138
Total Pages Marked Corrupt : 3
Total Pages Influx : 2
Highest block SCN : 1002028510 (0.1002028510)

这部分错误,其实处理起来也不困难,部分是业务表的index,但是其他的几乎都是AWR相关基表,有2个坏块跟是system相关的基表和索引,分别是I_H_OBJ#_COL#和COM$ ,HISTGRM$。

对于业务索引,很简单,直接drop 重建即可,对于这个sys的index,可以通过设置38003 event进行drop重建。

对于基表COM$,HISTGRM$,由于是非bootstrap$核心对象,其实也可以处理掉的。

处理方法

不过考虑到这种毕竟是存储掉电,undo异常的情况,还是重建库更稳妥一些。最后补充一点,这个库稍微有点奇葩的地方是全库1.2TB,其中有个表的LOB自动980GB,重建数据库是相对较慢的。对于大表,且有LOB自动,通常建议基于分片,否则会报ORA-01555错误的,如下是常用的一个基于rowid的分片脚本,供大家参考:

set verify off
 
undefine rowid_ranges
 
undefine segment_name
 
undefine owner
 
set head off
 
set pages 0
 
set trimspool on
 
select 'where rowid between ''' ||
 
  sys.dbms_rowid.rowid_create(1, d.oid, c.fid1, c.bid1, 0) ||
 
  ''' and ''' ||
 
  sys.dbms_rowid.rowid_create(1, d.oid, c.fid2, c.bid2, 9999) || '''' || ';'
 
 from (select distinct b.rn,
 
      first_value(a.fid) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) fid1,
 
      last_value(a.fid) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) fid2,
 
      first_value(decode(sign(range2 - range1),
 
           1,
 
           a.bid +
 
           ((b.rn - a.range1) * a.chunks1),
 
           a.bid)) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) bid1,
 
      last_value(decode(sign(range2 - range1),
 
           1,
 
           a.bid +
 
           ((b.rn - a.range1 + 1) * a.chunks1) - 1,
 
           (a.bid + a.blocks - 1))) over(partition by b.rn order by a.fid, a.bid rows between unbounded preceding and unbounded following) bid2
 
   from (select fid,
 
      bid,
 
      blocks,
 
      chunks1,
 
      trunc((sum2 - blocks + 1 - 0.1) / chunks1) range1,
 
      trunc((sum2 - 0.1) / chunks1) range2
 
     from (select /*+ rule */
 
       relative_fno fid,
 
       block_id bid,
 
       blocks,
 
       sum(blocks) over() sum1,
 
       trunc((sum(blocks) over()) / &&rowid_ranges) chunks1,
 
       sum(blocks) over(order by relative_fno, block_id) sum2
 
       from dba_extents
 
       where segment_name = upper('&&segment_name')
 
       and owner = upper('&&owner'))
 
     where sum1 > &&rowid_ranges) a,
 
    (select rownum - 1 rn
 
     from dual
 
    connect by level <= &&rowid_ranges) b
 
   where b.rn between a.range1 and a.range2) c,
 
  (select max(data_object_id) oid
 
   from dba_objects
 
   where object_name = upper('&&segment_name')
 
   and owner = upper('&&owner')
 
   and data_object_id is not null) d
 
/

总结

以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作具有一定的参考学习价值,如果有疑问大家可以留言交流,谢谢大家对脚本之家的支持。

相关文章

  • Oracle的约束介绍与约束维护

    Oracle的约束介绍与约束维护

    这篇文章介绍了Oracle的约束与约束维护,文中通过示例代码介绍的非常详细。对大家的学习或工作具有一定的参考借鉴价值,需要的朋友可以参考下
    2022-05-05
  • Oracle服务端1521端口无法telnet,服务名未开启监听的解决

    Oracle服务端1521端口无法telnet,服务名未开启监听的解决

    这篇文章主要介绍了Oracle服务端1521端口无法telnet,服务名未开启监听的解决方案,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教
    2023-12-12
  • 解决Oracle RMAN删除归档日志不释放问题的方法

    解决Oracle RMAN删除归档日志不释放问题的方法

    RMAN清除方式会自动清除磁盘上的归档日志文件,同时会释放控制文件中对应的归档日志的归档信息,但最近在使用中就遇到了一个问题,下面这篇文章主要给大家介绍了关于解决Oracle RMAN删除归档日志不释放问题的方法,需要的朋友可以参考下。
    2017-07-07
  • 常用Oracle分析函数大全

    常用Oracle分析函数大全

    本文是小编日常手机整理的有关oracle分析函数,oracle分析函数功能非常强大,工作也常用到,所以大家务必要掌握,对oracle分析函数知识感兴趣的朋友一起看看吧
    2016-08-08
  • oracle通过存储过程上传list保存功能

    oracle通过存储过程上传list保存功能

    这篇文章主要介绍了oracle通过存储过程上传list保存功能,本文通过实例代码给大家介绍的非常详细,对大家的学习或工作具有一定的参考借鉴价值,需要的朋友可以参考下
    2021-05-05
  • Oracle与Mysql自动增长列(id)的区别

    Oracle与Mysql自动增长列(id)的区别

    这篇文章主要介绍了Oracle与Mysql自动增长列(id)的区别,需要的朋友可以参考下
    2014-01-01
  • oracle远程连接服务器出现 ORA-12170 TNS:连接超时 解决办法

    oracle远程连接服务器出现 ORA-12170 TNS:连接超时 解决办法

    oracle远程连接服务器出现 ORA-12170 TNS:连接超时 解决办法,需要的朋友可以参考一下
    2013-03-03
  • Oracle中带条件插入数据的使用方法示例详解

    Oracle中带条件插入数据的使用方法示例详解

    在Oracle数据库中,INSERT WHEN语句用于在满足特定条件时插入数据,它允许您根据条件控制插入操作是否执行,本文给大家分享Oracle中带条件插入数据的使用方法,感兴趣的朋友一起看看吧
    2023-12-12
  • Oracle中查看引起Session阻塞的2个脚本分享

    Oracle中查看引起Session阻塞的2个脚本分享

    这篇文章主要介绍了Oracle中查看引起Session阻塞的2个脚本分享,本文给出了2个脚本来查询导致Session阻塞的原因,并给出Kill引起阻塞的Session方法,需要的朋友可以参考下
    2014-10-10
  • Oracle 启动例程 STARTUP参数说明

    Oracle 启动例程 STARTUP参数说明

    Oracle启动例程:STARTUP参数说明(2008-07-28 20:51:20)标签:startup shutdown 事例 it 分类:Oracle数据管理
    2009-07-07

最新评论