oracledba.help
TroubleShooting

Health Checks

<- TroubleShooting

Daily

  • Are the OS and system hardware functioning OK (SAN/RAID, CPU, Network, RAM)?
  • Is the database up? Was it down for any reason in the last 24 hours?
  • Can client apps connect to the database?
  • When was the last RMAN incremental backup run and what were the results?
  • Are there any database errors that you need to know about?
    • Alert Log
      SQL To determine Alert Log location:
      SELECT value FROM v$diag_info WHERE name='Diag Trace';
      Common Path:
      %ORACLE_BASE%\diag\rdbms\{DBNAME}\{DBNAME}\trace\alert_{DBNAME}.log
    • Listener Log
      To determine location:
      OS> lsnrctl status
      Common Path:
      %ORACLE_BASE%\diag\tnslsnr\{DBNAME}\listener\trace\listener.log
  • Is there ample disk space for the datafiles, archive logs, export files and RMAN files?
  • Did the daily scheduled jobs run OK?
  • Review any security provided monitoring reports for inappropriate changes (Tripwire etc.).

IMPORTANT: Never open a live listener log file from an editor!
Use tail -f or copy it first and view the copy.

Weekly

  • When was the last RMAN full backup run and what were the results?
  • When was the last RMAN database validation run and what were the results?
  • When was the last full database export and what were the results?
  • How often are redo log switches occurring?
  • Are the number of connections increasing? If so by how much?
  • Are all the blocks in your datafiles OK (not just the ones with data)?
  • Is any partition maintenance required (extending ranges etc.)?
  • Does any of your documentation need to be updated?
  • Did the weekly scheduled jobs run OK?

Monthly

  • Are all your database patches up-to-date?
  • Are your indexes OK or do they need to be rebuilt?
  • Which tablespaces are growing, and by how much?
  • Is the size of any schema increasing? If so by how much?
  • Is your SGA and PGA usage increasing? If so by how much?
  • What SQL is being run most often and can it be made more efficient?
  • Is there a significant increase in DML (INSERTS, UPDATES and DELETES)?
  • Are the segments in your tablespaces OK or do they need to be optimized?
  • Can the maintenance of any the databases be simplified?
  • Did the monthly scheduled jobs run OK?

Appendix: Using SQL to Check Alert Log (11g and later)

SELECT inst_id, originating_timestamp, message_text
FROM TABLE(gv$(cursor(SELECT inst_id, originating_timestamp, message_text 
                      FROM v$diag_alert_ext
                      WHERE originating_timestamp > (sysdate - 1)
                      AND   message_text LIKE '%ORA-%')))
ORDER BY inst_id, originating_timestamp;

<- TroubleShooting