数据文件出现坏快的解决之四(实际诊断案例)

http://blog.chinaunix.net/u1/50863/showart_400578.html

问题：用户查询一个表时，报数据文件有坏块

目标：用户可以接受丢失这些坏块的数据，但该数据文件其它的好块应该可以查询数据。

下面是具体的步骤：

1.询问用户徐工出错的表名，收集出错信息
出错表名：
fsgazhjf.fsgazhjf_tac_20061018
trace文件中的出错信息：

***
Corrupt block relative dba: 0xb8428b33 (file 737, block 166707)
Fractured block found during user buffer read
Data in bad block -
type: 6 format: 2 rdba: 0xb8428b33
last change scn: 0x0000.0a66398d seq: 0x1 flg: 0x00
consistency value in tail: 0xbddc0601
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

2.根据出错块id，查询出该块对应的物理表，跟第一步收集的比对
select * from dba_extents
where file_id=737 and block_id <= 166707 and (block_id + blocks - 1) >= 166707;

FSGAZHJF FSGAZHJF_TAC_20061018 TABLE FSGAZHJF_GSM_10 1201 737 166665 1048576 128 737

结果：的确是该表:FSGAZHJF_TAC_20061018，用户FSGAZHJF，表空间FSGAZHJF_GSM_10

3.查询该表，看报错信息是否和第一步一致
select count(1) from fsgazhjf.fsgazhjf_tac_20061018;

结果：果然报错

4.收集该表的所有索引
select * from dba_indexes where owner='FSGAZHJF' and lower(table_name)='fsgazhjf_tac_20061018';
no rows
结果：无索引

5.用dbv工具来check bad block
SQL> select file_id||' '||file_name from dba_data_files where file_id=737;

FILE_ID||''||FILE_NAME
------------------------------------------------------------------------------
--------------------
737 K:ORADATAORA8FSGAZHJF_GSM_10_50.DBF

C:>dbv file='K:ORADATAORA8FSGAZHJF_GSM_10_50.DBF' blocksize=8192 logfile='h:dbv.log'

DBVERIFY: Release 8.1.7.4.1 - Production on 星期四 11月 9 10:57:13 2006

DBVERIFY - 检验开始：FILE = K:ORADATAORA8FSGAZHJF_GSM_10_50.DBF
标记为损坏的页面166708
***
Corrupt block relative dba: 0xb8428b34 (file 0, block 166708)
Bad header found during dbv:
Data in bad block -
type: 6 format: 2 rdba: 0xcf012b08
last change scn: 0x0000.0a91bf69 seq: 0x1 flg: 0x00
consistency value in tail: 0x0ccc0601
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

标记为损坏的页面166709
***
Corrupt block relative dba: 0xb8428b35 (file 0, block 166709)
Bad header found during dbv:
Data in bad block -
type: 6 format: 2 rdba: 0xce00e7e9
last change scn: 0x0000.0a910ce8 seq: 0x1 flg: 0x00
consistency value in tail: 0x0ce80601
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

标记为损坏的页面166710
***
Corrupt block relative dba: 0xb8428b36 (file 0, block 166710)
Bad header found during dbv:
Data in bad block -
type: 6 format: 2 rdba: 0xce00e7ea
last change scn: 0x0000.0a910ce8 seq: 0x1 flg: 0x00
consistency value in tail: 0x0ce80601
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

标记为损坏的页面166711
***
Corrupt block relative dba: 0xb8428b37 (file 0, block 166711)
Bad header found during dbv:
Data in bad block -
type: 6 format: 2 rdba: 0xce00e7eb
last change scn: 0x0000.0a910ce8 seq: 0x1 flg: 0x00
consistency value in tail: 0x39910601
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

DBVERIFY - 完成检验

检查的页面总数：262144
处理的页面总数（数据）：262010
失败的页面总数（数据）：0
处理的页面总数（索引）：0
失败的页面总数（索引）：0
处理的页面总数（其它）：9
空的页面总数：120
标记损坏的页面总数：4
汇集的页面总数：0

检查结果：4个坏块，块号是166708 ~ 166711 ，经查询，发现都在一个extent里，属于同一张表

6.开始打标记
具体过程：
C:>sqlplus sys/change_on_install

SQL*Plus: Release 8.1.7.0.0 - Production on 星期四 11月 9 12:18:08 2006

连接到:
Oracle8i Enterprise Edition Release 8.1.7.4.1 - Production
With the Partitioning option
JServer Release 8.1.7.4.1 - Production

SQL>exec dbms_repair.admin_tables('REPAIR_TABLE',1,1,'USERS');

PL/SQL 过程已成功完成。

SQL>exec dbms_repair.admin_tables('ORPHAN_TABLE',2,1,'USERS');

PL/SQL 过程已成功完成。

SQL> declare
2 cc number;
3 begin
4 dbms_repair.check_object(schema_name => 'FSGAZHJF',object_name => 'FSGAZHJF_TAC_20061018',corrupt_count => cc);
5 dbms_output.put_line(a => to_char(cc));
6 end;
7 /

PL/SQL 过程已成功完成。

SQL>
SQL> select count(1) from repair_table;

COUNT(1)
----------
5

***这里发现5个坏快，dbv发现的是4个
SQL>
***具体信息参考repair_table.xls

发现marked_corrupt列已经为true,可能不需执行
exec dbms_repair.skip_corrupt_blocks(schema_name => 'FSGAZHJF',object_name => 'FSGAZHJF_TAC_20061018',flags => 1);
了，通过下面的查询，确认不需要执行了

SQL> select count(1) from FSGAZHJF.FSGAZHJF_TAC_20061018;

COUNT(1)
----------
18804767

7.为该表建立两个索引
建立成功

证明可以对该表进行全表扫描了，问题解决，但丢失5个块的数据

posted on 2009-11-25 22:31 gdufo 阅读(305) 评论(0) 编辑收藏所属分类: Database (oracle, sqlser,MYSQL)

gdufo

数据文件出现坏快的解决之四(实际诊断案例)

导航

统计

常用链接

留言簿(6)

随笔分类

随笔档案

文章分类

文章档案

收藏夹

Hibernate

友情链接

搜索

最新评论

阅读排行榜

评论排行榜