Tuesday, March 31, 2015

RMAN jobs not working after OEM upgrade to 12.1.0.4

If you are planning to upgrade your OEM to 12.1.0.4 and you have RMAN jobs scheduled in Cloud Control, you should consider applying patch 19519190 to the OMS. I noticed that most of the RMAN jobs were having issues and, even worse, some steps were empty!

Obviously, the jobs were succeeding because the steps were empty. In other words, the jobs were doing nothing.

It looks like this patch is not part of any PSU yet, but a problem that affects hundreds of jobs, and RMAN jobs in particular, is very risky.

Take a look at EM 12c: RMAN Step Commands are Being Removed from Multi-step RMAN Script Jobs in Enterprise Manager 12.1.0.4 Cloud Control (Doc ID 1914916.1).

Thanks,

Alfredo

Thursday, March 26, 2015

Using OMS DEBUG mode to troubleshoot OEM 12c problems

This time, I want to show you how to troubleshoot OEM problems by enabling DEBUG mode in the OMS. The virtual machine (VM) running my sandbox installation of OEM 12c 12.1.0.4 crashed during the night. After restarting the VM and all the OEM components, I wasn’t able to log in using the SYSMAN account. The error from the console was not very explicit; it only said, “Authentication failed. If problem persists, contact your system administrator.”

In order to get more details about the error, I decided to enable DEBUG mode for the OMS and reproduce the error. This is what I did to enable DEBUG mode.

$ cd /u01/app/oracle/oms/oms/bin
$ ./emctl set property -name log4j.rootCategory -value "DEBUG, emlogAppender, emtrcAppender" -module logging
Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation.  All rights reserved.
SYSMAN password:
Property log4j.rootCategory has been set to value DEBUG, emlogAppender, emtrcAppender for all Management Servers
OMS restart is not required to reflect the new property value

After enabling DEBUG mode, I reproduced the error several times using the console. I also wrote down the approximate time of each attempt to make the search in the log file easier. Searching the emoms.trc file located under <EM_HOME>/em/EMGC_OMS1/sysman/log/, I found an ORA-14400 error. MOS note 1493151.1 explains how to fix the issue by adding a new audit partition.

$ cd /u01/app/oracle/gc_inst/em/EMGC_OMS1/sysman/log/
$ view emoms.trc
java.sql.SQLException: ORA-14400: inserted partition key does not map to any partition
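
When the trace file is large, paging through it with view can be slow. As a rough sketch, the search can be narrowed by first listing which ORA- codes appear in the file and how often. The function name ora_errors is hypothetical, and the sketch assumes a grep that supports the -o option (GNU and BSD grep do).

```shell
# Hypothetical helper: list the distinct ORA- error codes found in a
# trace file, most frequent first, one "CODE COUNT" pair per line.
ora_errors() {
  grep -o 'ORA-[0-9]\{5\}' "$1" | sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
}

# Example usage (path taken from the post above):
#   ora_errors /u01/app/oracle/gc_inst/em/EMGC_OMS1/sysman/log/emoms.trc
# Then show the matching lines with their line numbers:
#   grep -n 'ORA-14400' emoms.trc
```

The line numbers from grep -n make it easier to jump to the entries closest to the time the error was reproduced.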

The final step is to disable DEBUG mode for your OMS; otherwise the log files can grow very large and performance could be affected.

$ ./emctl set property -name log4j.rootCategory -module LOGGING -value "WARN, emlogAppender, emtrcAppender"
Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation.  All rights reserved.
SYSMAN password:
Property log4j.rootCategory has been set to value WARN, emlogAppender, emtrcAppender for all Management Servers
OMS restart is not required to reflect the new property value
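
Since forgetting to switch back to WARN is easy, the two emctl calls above can be wrapped in one small function. This is only a sketch: set_oms_log_level is a hypothetical name, the EMCTL variable is an assumption that defaults to the path used earlier in this post, and emctl will still prompt for the SYSMAN password.

```shell
# Path to emctl; defaults to the OMS home used in this post.
EMCTL="${EMCTL:-/u01/app/oracle/oms/oms/bin/emctl}"

# Hypothetical helper: set the OMS root log level (for example DEBUG or WARN)
# using the same emctl command shown above.
set_oms_log_level() {
  "$EMCTL" set property -name log4j.rootCategory \
    -value "$1, emlogAppender, emtrcAppender" -module logging
}

# Example usage:
#   set_oms_log_level DEBUG   # reproduce the problem, check emoms.trc
#   set_oms_log_level WARN    # restore the level when finished
```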

I hope this information is useful to you next time you are troubleshooting an OEM 12c issue.

Thanks,

Alfredo

Sunday, March 22, 2015

OEM 12c very slow after upgrade to 12.1.0.4

I noticed that the OEM 12c console became very slow a few hours after the upgrade to version 12.1.0.4.

Looking at the repository DB, I found several OMS sessions consuming significant CPU resources.
Bug 19199023 explains that some SQL queries executed against the repository consume high CPU on the servers. This bug affects the DB plug-in 12.1.0.6 and the patch 19176910 should be applied to the plug-ins.

More information is available in the MOS note 12.1.0.4 OEM: High CPU utilization on Repository Database due to SYSMAN query WITH TARGETGUID AS (SELECT target_guid, host_name FROM mgmt_targets (Doc ID 1912172.1).

Thanks,

Alfredo

Tuesday, March 10, 2015

Cannot start ASM - ORA-15063: ASM discovered an insufficient number of disks for diskgroup “DATA”

Today, I faced an issue with an ASM instance. After bouncing the server, CRS came up along with the ASM instance, but the diskgroups were offline.

$ crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  OFFLINE      hosta
ora.FRA.dg
               ONLINE  OFFLINE      hosta
ora.LISTENER.lsnr
               ONLINE  ONLINE       hosta
ora.LISTENER_1.lsnr
               ONLINE  ONLINE       hosta
ora.asm
               ONLINE  ONLINE       hosta                 Started
ora.ons
               OFFLINE OFFLINE      hosta
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       hosta
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       hosta
ora.database.db
      1        ONLINE  OFFLINE                               Instance Shutdown
ora.database1.db
      1        OFFLINE OFFLINE                               Instance Shutdown

I tried to start the ora.DATA.dg resource, but it failed.

$ crsctl start resource ora.DATA.dg
CRS-2672: Attempting to start 'ora.DATA.dg' on 'hosta'
CRS-5017: The resource action "ora.DATA.dg start" encountered the following error:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
. For details refer to "(:CLSN00107:)" in "/u01/oracle/grid/log/hosta/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.DATA.dg' on 'hosta' failed
CRS-2679: Attempting to clean 'ora.DATA.dg' on 'hosta'
CRS-2681: Clean of 'ora.DATA.dg' on 'hosta' succeeded
CRS-4000: Command Start failed, or completed with errors.

I checked the raw devices on the host and everything appeared to be properly configured. Then I checked the configuration of the ASM instance and found that ASM_DISKSTRING was empty.

$ crsctl stat resource ora.asm -f
NAME=ora.asm
TYPE=ora.asm.type
STATE=OFFLINE
TARGET=OFFLINE
ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
ASM_DISKSTRING=
AUTO_START=restore
CHECK_INTERVAL=1
CHECK_TIMEOUT=30
CREATION_SEED=11
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=asm) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%)
DEGREE=1
DESCRIPTION=Oracle ASM resource
ENABLED=1
GEN_USR_ORA_INST_NAME=+ASM
ID=ora.asm
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SPFILE=+DATA/asm/asmparameterfile/registry.123.785123625
START_DEPENDENCIES=hard(ora.cssd) weak(ora.LISTENER.lsnr)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(ora.cssd)
STOP_TIMEOUT=600
TYPE_VERSION=1.2
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_INST_NAME=+ASM
USR_ORA_OPEN_MODE=mount
USR_ORA_OPI=false
USR_ORA_STOP_MODE=immediate
VERSION=11.2.0.3.0
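
The full attribute listing is long, so a single value can be pulled out with a small filter. The sketch below is not part of the original troubleshooting; crs_attr is a hypothetical helper that reads NAME=VALUE lines, such as the output above, from standard input.

```shell
# Hypothetical helper: print the value of one attribute from
# "crsctl stat resource <name> -f" output supplied on stdin.
# Everything after the first "=" on the matching line is the value.
crs_attr() {
  awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print; exit }'
}

# Example usage:
#   crsctl stat resource ora.asm -f | crs_attr ASM_DISKSTRING
# An empty line in the output means the discovery string is not set.
```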

I updated ASM_DISKSTRING with the discovery path of the disks and then bounced the ASM instance.

$ srvctl modify asm -d '/dev/sd*'
$ srvctl stop asm
$ srvctl start asm

After this the ASM instance came up cleanly and the diskgroups were mounted.

$ crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       hosta
ora.FRA.dg
               ONLINE  ONLINE       hosta
ora.LISTENER.lsnr
               ONLINE  ONLINE       hosta
ora.LISTENER_1.lsnr
               ONLINE  ONLINE       hosta
ora.asm
               ONLINE  ONLINE       hosta                 Started
ora.ons
               OFFLINE OFFLINE      hosta
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        ONLINE  ONLINE       hosta
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       hosta
ora.database.db
      1        ONLINE  ONLINE                               Open
ora.database1.db
      1        OFFLINE ONLINE                               Open

I hope this helps you troubleshoot and fix the issue when your ASM instance is not able to find its disks.

Thanks,

Alfredo