(一)NOLOGGING操作引起的坏块(ORA-01578和ORA-26040)简介 如果只是错误ORA-01578,而没有伴随ORA-26040,那么这个坏块是由其它的原因引起的坏块,可以尝试使用RMAN的BMR(Block Media Recovery)修复。 如果数据段(表段、索引段)被定义为NOLOGGING属性,那么当NOLOGGING加APPEND、UNRECOVERABLE操作修改该数据段或者使用数据泵(DATAPUMP)impdp参数DISABLE_ARCHIVE_LOGGING:Y时,联机重做日志只会记录很少的日志信息。如果这些联机重做日志或归档日志被用来恢复数据文件,那么Oracle会将对应的数据块标志为无效(Soft Corrupt),而且下一次访问这些数据块时,会报ORA-01578和ORA-26040错误。 例如: SQL> select * from test_nologging; ORA-01578: ORACLE data block corrupted (file # 11, block # 84) ORA-01110: data file 4: '/oradata/users.dbf' ORA-26040: Data block was loaded using the NOLOGGING option 数据字典视图DBA_TABLES、DBA_INDEXES、DBA_LOBS、DBA_TAB_PARTITIONS、DBA_LOB_PARTITIONS、DBA_TAB_SUBPARTITIONS中的LOGGING列记录了NOLOGGING属性。若LOGGING='NO'则表示NOLOGGING。 数据泵DATAPUMP的impdp参数DISABLE_ARCHIVE_LOGGING:Y在执行导入时会禁止LOGGING定义,而产生NOLOGGING操作。如果相应的datafile被restored和recovered,那么接下来的涉及到目标表的查询会报错ORA-1578和ORA-26040。如果数据库是FORCE LOGGING模式,那么DISABLE_ARCHIVE_LOGGING选项不会关闭LOGGING。 impdp使用参数“DISABLE_ARCHIVE_LOGGING:Y”的一个例子: impdp scott/tiger directory=DATA_PUMP_DIR dumpfile=dp transform=disable_archive_logging:y NOLOGGING导致的坏块不会导致RMAN备份失败。一般来说soft corrupt block不会导致RMAN备份失败,不需要设置MAXCORRUPT。数据库备份中就会含有soft corrupt block,如果使用这些备份恢复数据,那么恢复的数据也含有soft corrupt block。 除ORA-26040错误之外,当还有一些其他通用信息出现时,block dump可能会被产生。如果数据块的block dump内有byte 0xff信息或者属于某个段,ORA-1578和ORA-26040会因为介质恢复了NOLOGGING的部分导致了corruption而出现。 (二)利用RMAN、DBV检测NOLOGGING导致的坏块 DBV在检测坏块时,如果RDBMS版本小于10.2.0.4,那么DBV打印错误DBV-200,如果RDBMS版本大于或等于10.2.0.4,那么DBV打印错误DBV-201: DBV-00200: Block, dba 46137428, already marked corrupted DBV-00201: Block, DBA 46137428, marked corrupt for invalid redo application RMAN的VALIDATE命令可以用来检测NOLOGGING数据块,检查结果记录在视图V$DATABASE_BLOCK_CORRUPTION(小于12c的版本)和V$NONLOGGED_BLOCK(12c及其以上)。 下面的例子中检查出DATAFILE 4有933坏块,查询V$DATABASE_BLOCK_CORRUPTION或者V$NONLOGGED_BLOCK。 RMAN> VALIDATE DATABASE; ... ..... File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 4 OK 933 1 6401 2275124 File Name: /oracle/dbs/users.dbf RMAN在检测坏块时,如果RDBMS版本小于10.2.0.5和11.1.0.7,RMAN打印如下错误: 10.2.0.4 and lower, 11.1.0.6, 11.1.0.7: RMAN validate reports it in v$database_block_corruption with CORRUPTION_TYPE=LOGICAL 如果RDBMS版本大于或等于10.2.0.5和11.2.0.1,RMAN报告,查看视图v$database_block_corruption中CORRUPTION_TYPE=NOLOGGING的记录。 10.2.0.5 and 11.2.0.1+: RMAN validate reports it in v$database_block_corruption with CORRUPTION_TYPE=NOLOGGING 在12c及以后版本中,RMAN validate的结果不在视图v$database_block_corruption中,而是在视图v$nonlogged_block。从12.2 版本开始,可以使用新的命令:“validate .. nonlogged block”去验证nologging的block。 在以下的例子中,数据文件5和6有nologged的block: RMAN> validate database nonlogged block; Starting validate at ... using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=133 device type=DISK channel ORA_DISK_1: starting validation of datafile channel ORA_DISK_1: validation complete, elapsed time: 00:00:35 List of Datafiles ================= File Status Nonlogged Blocks Blocks Examined Blocks Skipped ---- ------ ---------------- --------------- -------------- 1 OK 0 106363 0 2 OK 0 78919 0 3 OK 0 96639 0 4 OK 0 4991 0 5 OK 400 2559 0 6 OK 569 2559 0 Details of nonlogged blocks can be queried from v$nonlogged_block view 在告警日志中会更新以下信息: Started Nonlogged Block Replacement recovery(validate) on file 5 (ospid 26351 rcvid 10616970560844821494) Finished Nonlogged Block Replacement recovery(validate) on file 5. 400 blocks found Started Nonlogged Block Replacement recovery(validate) on file 6 (ospid 26351 rcvid 10616970560844821494) Finished Nonlogged Block Replacement recovery(validate) on file 6. 569 blocks found (三)监控NOLOGGING操作 若执行了NOLOGGING操作,并且之后在没有备份的情况下,RMAN命令“REPORT UNRECOVERABLE”可以查询出被影响的datafile。 RMAN> report unrecoverable; using target database control file instead of recovery catalog Report of files that need backup due to unrecoverable operations File Type of Backup Required Name ---- ----------------------- ----------------------------------- 4 full or incremental /oracle/dbs/users.dbf 当初始化参数db_unrecoverable_scn_tracking设置为true(默认值,该参数在10g中是不可用的),那么V$DATAFILE中以下列会被更新; SYS@lhr121> select UNRECOVERABLE_CHANGE# , 2 UNRECOVERABLE_TIME , 3 FIRST_NONLOGGED_SCN , 4 FIRST_NONLOGGED_TIME from v$datafile where file#=6; UNRECOVERABLE_CHANGE# UNRECOVERABLE_TIME FIRST_NONLOGGED_SCN FIRST_NONLOGGED_TIM --------------------- ------------------- ------------------- ------------------- 2878238 2018-04-10 10:53:47 2878238 2018-04-10 10:53:47 在11.2.0.4 或12.1.0.2+版本中,设置event 16490的情况下,物理备库的MRP进程会检查出NOLOGGING变化,并记录在alert log。 ORA-16490 "logging invalidated blocks on standby due to invalidation redo" "INVD_BLKS: Invalidating (file , bno )" "fname: 'Datafile name'. rdba: ..." (四)识别数据块什么时候被标志为NOLOGGING 识别数据块什么时候被标志为NOLOGGING,可以将trace文件中数据块SCN或者v$database_block_coruption视图中CORRUPTION_CHANGE#值转换为时间: ① 使用trace文件中数据块SCN,例如: Start dump data blocks tsn: 60 file#: 4 minblk 84 maxblk 84 buffer tsn: 3 rdba: 0x02c00054 (11/84) scn: 0x0771.4fa24eb5 seq: 0xff flg: 0x04 tail: 0x4eb500ff 提取SCN值0x0771.4fa24eb5,删除'.',然后转换0x07714fa24eb到十进制511453045995。 ② 使用v$database_block_coruption视图中CORRUPTION_CHANGE#值 如果运行RMAN validate命令后,v$database_block_coruption视图中corruption_type='NOLOGGING' (10.2.0.5 和 11.2.0.1+),那么CORRUPTION_CHANGE#列的值就是十进制的SCN值。可以使用下面的方法获得SCN Timestamp时间: select scn_to_timestamp(&&decimal_scn) from dual; 如果运行RMAN validate: select file#, block#, scn_to_timestamp(CORRUPTION_CHANGE#) from v$database_block_corruption where CORRUPTION_TYPE='NOLOGGING'; 在12c中: select file#, block#, scn_to_timestamp(NONLOGGED_START_CHANGE#) from v$nonlogged_block; 如果查询gv$archived_log 或 gv$log_history遇到错误ORA-08181: alter session set nls_date_format = 'DD-MON-YY HH24:MI:SS'; select first_time, next_time from gv$archived_log where &decimal_scn between first_change# and next_change#; 或 select first_time from gv$log_history where &decimal_scn between first_change# and next_change#; 如果运行RMAN validate: alter session set nls_date_format = 'DD-MON-YY HH24:MI:SS'; select file#, block#, first_time, next_time from varchivedlog,vdatabase_block_corruption where CORRUPTION_CHANGE# between first_change# and next_change# and CORRUPTION_TYPE='NOLOGGING'; 或 select file#,block#,first_time from vloghistory,vdatabase_block_corruption where CORRUPTION_CHANGE# between first_change# and next_change# and CORRUPTION_TYPE='NOLOGGING'; 12c: alter session set nls_date_format = 'DD-MON-YY HH24:MI:SS'; select file#, block#, first_time, next_time from vnonloggedblock,varchived_log where NONLOGGED_START_CHANGE# between first_change# and next_change#; 或 select file#, block#, first_time from vnonloggedblock,vlog_history where NONLOGGED_START_CHANGE# between first_change# and next_change#; (五)SYSAUX表空间、AWR、EM等出现NOARCHIVELOG和NOLOGGING问题 如果数据库版本是11.1.0.6 或 11.1.0.7 或 11.2.0.1,对NOLOGGING对象执行过DIRECT PATH操作,并且后续执行了RECOVER DATABASE命令,即使数据库FORCE LOGGING是打开的情况下,会出现ORA-1578和ORA-26040错误。这种问题经常发生在SYSAUX表空间中的AWR或EM对象。请参考Note 1071869.1。注意数据库当前版本可能已经大于11.1 或者 11.2.0.1但是问题可能是在升级之前产生的。这个约束在11.2.0.2以上版本中取消,这个问题在10g不会发生。 RDBMS版本变化: RDBMS版本 变化 10.2.0.4+ DBverify报告NOLOGGING block错误信息 "DBV-00201: Block, DBA , marked corrupt for invalid redo application" 10.2.0.5, 10.2.0.1+ RMAN validate命令检查NOLOGGING block,在v$database_block_coruption视图中记录corruption_type='NOLOGGING' 11g+ 引入db_unrecoverable_scn_tracking参数 11.1.0.6 or 11.1.0.7 or 11.2.0.1 NOARCHIVELOG模式数据库,对NOLOGGING对象执行了DIRECT PATH操作,并且以后手动恢复数据库,即使打开了FORCE LOGGING,也会报ORA-1578 和 ORA-26040。这个约束在11.2.0.2以上版本取消,这个问题在10g不会发生。 12c RMAN validate的结果不在视图v$database_block_corruption中,而是在视图v$nonlogged_block 12.2 以下RMAN命令被引入: RMAN> validate [database / datafile] nonlogged block; RMAN> recover [database / datafile] nonlogged block; -> 对于 Standby 数据库 (六)解决方法 NOLOGGING操作引起的坏块是不能修复的,比如“Media Recovery”或“RMAN blockrecover”都无法修复这种坏块。可行的方法是在NOLOGGING操作之后立刻备份对应的数据文件。 如果错误是执行RMAN DUPLICATE 或 RESTORE之后产生的,那么在源库打开FORCE LOGGING,然后再重新运行RMAN DUPLICATE 或 RESTORE。 alter database force logging; 如果错误出现在物理STANDBY数据库,那么可以从主库恢复被影响的数据文件(只有当主库没有这个问题的情况下)。参考文档Doc ID 958181.1。在Oracle 12c中可以使用RMAN选项RECOVER NONLOGGED BLOCK with DATAFILE、TABLESPACE、DATABASE。例如: RMAN> RECOVER DATABASE NONLOGGED BLOCK; 为了避免这个问题发生,在主库强制生产日志: alter database force logging; 如果同一个datafile的数据块在主库出现nologging坏块,但是备库没有,可以通过手动跳过(dbms_repair)坏块或者设置event 10231。主库出现nologging坏块可能是由于主库执行过备份恢复或者之前是备库,执行了switchover。 如果NOLOGGING数据块位于空闲数据块(dba_free_space视图可以查询到),那么DBVerify检查会发现这个问题,报错DBV-00201或者在v$database_block_corruption视图中显示。对于这种情况,可以等待到这个数据块被重用时会自动格式化或者手动强制格式化。 如果是索引,那么可以重新创建(drop/create)索引。如果是表,那么可以使用存储过程DBMS_REPAIR.SKIP_CORRUPT_BLOCKS跳过坏块,然后考虑是否重建表。 在删除有坏块的段之后,这个坏块就处于空闲状态,后续可以被分配给其他对象或段,当这个坏块被分配给其它对象或段时,这个数据块被重新格式化。如果v$database_block_corruption视图中还是显示为坏块,那么可以手动运行rman validate来清除视图中的信息。 如果是LOB,那么请参考Note 293515.1。 实验一:DISABLE_ARCHIVE_LOGGING:Y RMAN> list backupset of datafile 6; List of Backup Sets =================== BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ ------------------- 12 Full 352.78M DISK 00:03:21 2018-04-09 14:50:59 BP Key: 12 Status: AVAILABLE Compressed: YES Tag: TAG20180409T144738 Piece Name: /u04/flash_recovery_area/LHR121/backupset/2018_04_09/o1_mf_nnndf_TAG20180409T144738_fdp34bfm_.bkp List of Datafiles in backup set 12 File LV Type Ckp SCN Ckp Time Name ---- -- ---- ---------- ------------------- ---- 6 Full 2865977 2018-04-09 14:47:38 /u04/oradata/lhr121/users01.dbf [oracle@rhel6lhr env_oracle]$ impdp scott/tiger dumpfile=log.dmp tables=T_LOG TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y remap_schema=LHR:SCOTT Import: Release 12.1.0.2.0 - Production on Tue Apr 10 10:53:17 2018 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options ORA-39002: invalid operation ORA-39166: Object SCOTT.T_LOG was not found or could not be exported or imported. [oracle@rhel6lhr env_oracle]$ impdp lhr/lhr dumpfile=log.dmp tables=T_LOG TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y remap_schema=LHR:SCOTT Import: Release 12.1.0.2.0 - Production on Tue Apr 10 10:53:40 2018 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options Master table "LHR"."SYS_IMPORT_TABLE_02" successfully loaded/unloaded Starting "LHR"."SYS_IMPORT_TABLE_02": lhr/******** dumpfile=log.dmp tables=T_LOG TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y remap_schema=LHR:SCOTT Processing object type TABLE_EXPORT/TABLE/TABLE Processing object type TABLE_EXPORT/TABLE/TABLE_DATA . . imported "SCOTT"."T_LOG" 34.24 KB 9 rows Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER Job "LHR"."SYS_IMPORT_TABLE_02" successfully completed at Tue Apr 10 10:54:14 2018 elapsed 0 00:00:32 [oracle@rhel6lhr env_oracle]$ rm -rf /u04/oradata/lhr121/users01.dbf [oracle@rhel6lhr env_oracle]$ rman target / Recovery Manager: Release 12.1.0.2.0 - Production on Tue Apr 10 10:55:09 2018 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. connected to target database: LHR121 (DBID=3221842516) RMAN> restore datafile 6; Starting restore at 2018-04-10 10:55:17 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=29 device type=DISK channel ORA_DISK_1: starting datafile backup set restore channel ORA_DISK_1: specifying datafile(s) to restore from backup set channel ORA_DISK_1: restoring datafile 00006 to /u04/oradata/lhr121/users01.dbf channel ORA_DISK_1: reading from backup piece /u04/flash_recovery_area/LHR121/backupset/2018_04_09/o1_mf_nnndf_TAG20180409T144738_fdp34bfm_.bkp RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of restore command at 04/10/2018 10:55:20 ORA-19870: error while restoring backup piece /u04/flash_recovery_area/LHR121/backupset/2018_04_09/o1_mf_nnndf_TAG20180409T144738_fdp34bfm_.bkp ORA-19573: cannot obtain exclusive enqueue for datafile 6 RMAN> startup force mount Oracle instance started database mounted Total System Global Area 658505728 by