Oracle® Application Server Administrator's Guide
10g Release 2 (10.1.2) B13995-06
This chapter describes common problems that you might encounter when using the Backup and Recovery Tool and explains how to solve them. It contains the following topics:
Receiving Missing Files Messages During restore_config Operation
Cannot Run a Cold Backup on Identity Management or J2EE Instance
Timeout Occurs While Trying to Stop Processes Using opmnctl stopall
Using the Backup and Recovery Tool to Perform a Recovery Fails Due to an Unknown Log Sequence Number
Enterprise Manager Cannot Access Restored Nodes on New Hosts
Cold Backups Do Not Shut Down All Databases in RAC Environment
Restore Operation Changes Farm Topology Leaving an Instance in Inconsistent State
Post-deployment Changes to Configuration Files Are Lost After Restoring DCM-Managed Components
A restore_config operation fails.
Problem
A restore_config operation fails with the following error:
C:\OracleAS\IM_1128/dcm/bin/dcmctl.bat applyarchiveto -archive 2004-11-29_11-23-18 -script
ADMN-906025
Base Exception:
The exception, 100999, occurred at Oracle Application Server instance "im_1128.stajx14.us.oracle.com"
"See base exception for details.See base exception for details."
Resolution:
Resolve the indicated problem at the Oracle Application Server instance where it occurred then resync the instance
java.lang.Exception: Could not delete file C:\OracleAS\IM_1128\j2ee\OC4J_SECURITY\application-deployments\wirelesssso\jazn-data.xml. Please check file permissions.
at oracle.security.jazn.smi.JAZNPlugin.commit(Unknown Source)
at oracle.ias.sysmgmt.repository.DcmPlugin.commit(Unknown Source)
Solution
If you see an error similar to "Could not delete file jazn-data.xml", perform the following steps:
Stop all the OC4J processes using the following command:
ORACLE_HOME/opmn/bin/opmnctl stopproc ias-component=OC4J
Rerun the restore_config operation.
A restore_config operation generates missing file messages.
Problem
During a restore_config operation, you receive messages indicating that files are missing, for example:
Could not copy file C:\Product\OracleAS\Devkit_1129/testdir/ to C:\Product\OracleAS\Devkit_1129\backup_restore\cfg_bkp/2004-12-01_03-26-22.
Solution
During a restore_config operation, the tool first takes a temporary configuration backup so that, if the restore fails, the temporary backup can be restored and the instance returned to its state before the restore.
If some files were deleted before the restore operation (including files or directories specified in config_misc_files.inp), the temporary backup displays messages indicating that those files are missing. You can ignore these error and warning messages, because the missing files are restored as part of the restore_config operation.
A file-based repository restoration fails.
Problem
File-based repository restoration fails with an error indicating that the dcm daemons across the farm could not be restarted:
C:\fbfhost\backup_restore>bkp_restore.bat -m restore_repos -t 2004-12-07_13-49-13
C:\fbfhost\backup_restore>echo off
Stopping dcm-daemon across the farm ...
Importing file based repository ...
Restarting dcm-daemon across the farm ...
Problem running command (Returned 150)
c:\fbfhost/opmn/bin/opmnctl @farm restartproc ias-component=dcm-daemon
The file based repository has been restored. But, dcm daemons across farm could not be restarted. Please take the appropriate action.
See c:\logs/2004-12-07_13-50-18_restore_repos.log for more info
Solution
At this point, the file-based repository has been restored successfully. Perform the following steps on the repository host:
Stop the dcm-daemon process on the file-based repository host:
ORACLE_HOME/opmn/bin/opmnctl stopproc ias-component=dcm-daemon
Start the dcm-daemon processes across the farm:
ORACLE_HOME/opmn/bin/opmnctl @farm startproc ias-component=dcm-daemon
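On UNIX, the two commands above could be wrapped in a small shell function. The following is only a sketch: the function name restart_dcm_daemons and the OPMNCTL override variable are hypothetical, and the sketch assumes that ORACLE_HOME is set in the environment and that it is acceptable to continue if the local stop reports a failure (for example, because the daemon is already down).

```shell
#!/bin/sh
# Sketch only: restart_dcm_daemons and OPMNCTL are hypothetical names,
# not part of the Backup and Recovery Tool.

restart_dcm_daemons() {
    # Default to the opmnctl under ORACLE_HOME unless OPMNCTL overrides it.
    opmnctl_cmd=${OPMNCTL:-$ORACLE_HOME/opmn/bin/opmnctl}

    # Step 1: stop the dcm-daemon on this (file-based repository) host.
    "$opmnctl_cmd" stopproc ias-component=dcm-daemon ||
        echo "warning: stopproc reported a failure; continuing" >&2

    # Step 2: start the dcm-daemon processes across the farm.
    "$opmnctl_cmd" @farm startproc ias-component=dcm-daemon
}
```

Run restart_dcm_daemons on the repository host after the restore_repos operation reports that the repository was restored.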
You cannot run a cold backup on Identity Management or a J2EE instance.
Problem
When backup_cold is attempted on Identity Management or a J2EE instance, the following error message displays:
C:\Product\OracleAS\SSO_1203\backup_restore>bkp_restore.bat -v -m backup_cold
C:\Product\OracleAS\SSO_1203\backup_restore>echo off
========================================
Running command: C:\Product\OracleAS\SSO_1203/dcm/bin/dcmctl.bat whichfarm -v -script >> C:\Product\OracleAS\SSO_1203\backups\log_path/2004-12-09_03-56-55_whichfarm.log
C:/Product/OracleAS/SSO_1203/backup_restore/config/config.inp: Invalid 'database backup_path' specified VALUE_NOT_SET - No such file or directory
Consider using '-f' to force creation of this path
Failure: backup_cold failed
Solution
The backup_cold operation should be used only on repository hosts, that is, on a Metadata Repository instance or on any instance hosting a file-based repository.
The loss or corruption of the opmn.xml file is causing a failure.
Problem
The loss or corruption of the opmn.xml file caused the following error:
ADMN-906025 Base Exception: The exception, 100999, occurred at Oracle Application Server instance "J2EE_1123.stada07.us.oracle.com"
Resolution
Perform the following steps to restore the opmn.xml file:
Run the following command:
bkp_restore.bat -m restore_config -t <timestamp>
If that command fails, stop the OC4J processes.
Rerun the command:
bkp_restore.bat -m restore_config -t <timestamp>
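On UNIX, the same sequence could be scripted along the following lines. This is a sketch under stated assumptions, not part of the tool: the function name restore_with_retry and the BKP_RESTORE and OPMNCTL override variables are hypothetical, bkp_restore.sh is assumed to be in ORACLE_HOME/backup_restore, the tool is assumed to return a nonzero exit status on failure, and the OC4J processes are stopped with the opmnctl stopproc command shown earlier in this chapter.

```shell
#!/bin/sh
# Sketch only: restore_with_retry, BKP_RESTORE, and OPMNCTL are hypothetical names.
# Assumes bkp_restore.sh exits with a nonzero status when restore_config fails.

restore_with_retry() {
    timestamp=$1
    bkp_restore_cmd=${BKP_RESTORE:-$ORACLE_HOME/backup_restore/bkp_restore.sh}
    opmnctl_cmd=${OPMNCTL:-$ORACLE_HOME/opmn/bin/opmnctl}

    if [ -z "$timestamp" ]; then
        echo "usage: restore_with_retry <timestamp>" >&2
        return 2
    fi

    # First attempt.
    if "$bkp_restore_cmd" -m restore_config -t "$timestamp"; then
        return 0
    fi

    # The first attempt failed: stop the OC4J processes, then rerun once.
    echo "restore_config failed; stopping OC4J processes and retrying" >&2
    "$opmnctl_cmd" stopproc ias-component=OC4J
    "$bkp_restore_cmd" -m restore_config -t "$timestamp"
}
```

The function returns the exit status of the last restore attempt, so a calling script can still detect a second failure.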
A restore_config operation fails or the ORACLE_HOME/j2ee/OC4J_SECURITY directory is deleted.
Problem:
The ORACLE_HOME/j2ee/OC4J_SECURITY directory is accidentally deleted or a restore_config operation fails with the following error:
ADMN-906025
Base Exception:
The exception, 806212, occurred at Oracle Application Server instance "OID.stada07.us.oracle.com"
"OPMN Request: /start?mode=sync&process-type=OC4J_SECURITY
OPMN Response: HTTP/1.1 204 No Content
Content-Length: 724
Content-Type: text/html
Response: 0 of 1 processes started.
.
<?xml version='1.0' encoding='US-ASCII'?>
<response>
<opmn id="stada07:6200" http-status="204" http-response="0 of 1 processes started.">
<ias-instance id="OID.stada07.us.oracle.com">
<ias-component id="OC4J">
<process-type id="OC4J_SECURITY">
<process-set id="default_island">
<process id="511967353" pid="956" status="Init" index="1" log="C:\Product\OracleAS\OID\opmn\logs\OC4J~OC4J_SECURITY~default_island~1"
.
operation="request" result="failure">
<msg code="-21" text="failed to start a managed process after the maximum retry limit">
Solution:
To resolve this problem, run the following command:
On UNIX systems:
bkp_restore.sh -m restore_config -F DCM-resyncforce
On Windows systems:
bkp_restore.bat -m restore_config -F DCM-resyncforce
The backup of a DCM file-based repository fails.
Problem:
The backup of a DCM file-based repository fails because of missing or corrupted files in the repository.
Solution:
If *.bom files are missing, use restore_config to restore the repository, and then back up the repository.
For all other files, use restore_repos to restore the repository, and then run any of the backup options to back up the repository.
During backup_instance_cold, backup_instance_cold_incr, and restore_instance operations, a timeout may occur while trying to stop processes using the opmnctl stopall command.
Problem:
During some operations involving the backup or restore of a server instance, a timeout may occur while trying to stop processes using the opmnctl stopall command. This can occur because of heavy machine load or a process taking a long time to shut down. Under these conditions, you may receive an error message similar to the following:
Oracle Application Server instance backup failed.
Stopping all opmn managed processes ...
Failure : backup_instance_cold_incr failed
Unable to stop opmn managed processes !!!
Solution:
Running opmnctl stopall a second time should resolve this problem.
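If you stop the processes from a script before starting a backup, the second attempt can be automated. The following sketch assumes that opmnctl stopall returns a nonzero exit status when it times out; the function name stopall_with_retry and the OPMNCTL override variable are hypothetical, and the default of two attempts simply mirrors the advice above.

```shell
#!/bin/sh
# Sketch only: stopall_with_retry and OPMNCTL are hypothetical names.
# Assumes "opmnctl stopall" exits with a nonzero status when it times out.

stopall_with_retry() {
    opmnctl_cmd=${OPMNCTL:-$ORACLE_HOME/opmn/bin/opmnctl}
    max_attempts=${1:-2}
    attempt=1

    while [ "$attempt" -le "$max_attempts" ]; do
        if "$opmnctl_cmd" stopall; then
            return 0
        fi
        echo "opmnctl stopall failed (attempt $attempt of $max_attempts)" >&2
        attempt=$((attempt + 1))
    done

    # Every attempt failed; let the caller decide what to do next.
    return 1
}
```

For example, call stopall_with_retry 2 and rerun the backup or restore operation only if the function returns 0.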
When performing a recovery using the Backup and Recovery Tool, the RMAN recovery fails due to an unknown log sequence number. Use the following command to correct the problem:
sqlplus> alter database open resetlogs;
After using Loss of Host Automation to restore the nodes to new hosts, Enterprise Manager cannot access the nodes.
Problem
The scenario is that all nodes on a farm were lost. After using Loss of Host Automation to restore the nodes to new hosts, Enterprise Manager cannot access the nodes. The cause of this problem is that the dcmCache.xml files are not updated between restores of the individual nodes.
Solution
After restoring the first node, save a copy of dcmCache.xml from the second node. After restoring the second node, copy the saved copy of dcmCache.xml to the second node. Restart all processes on both nodes.
A restore of a Portal instance fails after deleting an OC4J instance that was part of the backup being restored.
Problem
After a successful backup of an Infrastructure and a Portal with an OC4J instance, a restore of the Infrastructure succeeds, but the restore of the Portal fails. The OC4J instance was deleted before the restore.
Solution
Before running a restore on the Portal, run the following command:
dcmctl resyncInstance -force
If the Oracle Application Server Metadata Repository is installed in an existing Oracle database (RepCA database) that is configured as a Real Application Cluster (RAC), then you must shut down all the instances in the cluster database before performing a Full Cold Backup using Enterprise Manager or running backup_instance_cold or backup_cold in command-line mode. You can use Enterprise Manager to shut down the entire cluster database, run srvctl stop database to stop all the started instances, or run SQL*Plus to shut down each started instance.
Running restore_instance fails when trying to restore the database (restore_repos).
Problem
Restoring an instance fails with the following error:
unable to find archive log
archive log thread=1 sequence=3
released channel: dev1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at <time>
RMAN-06054: media recovery requesting unknown log: thread <> seq <> lowscn <>
Solution
Perform the following steps to resolve the problem:
Complete database recovery by running the following command:
sqlplus> alter database open resetlogs;
Configuration recovery: run opmnctl startall.
Configuration restore:
On UNIX:
bkp_restore.sh -m restore_config -t <timestamp>
On Windows:
bkp_restore.bat -m restore_config -t <timestamp>
Changing ORACLE_HOME from the ORACLE_HOME used to start the database may result in an error while performing backup or recovery operations.
Problem
Changing ORACLE_HOME to a different directory from the directory used to start the database may result in errors when trying to perform backup or recovery. For example, if you started the database with ORACLE_HOME set to home/foo and later try to connect with ORACLE_HOME set to private/foo, you cannot connect to the original instance.
Solution
To verify where ORACLE_HOME resides, run the following command:
$ /usr/ucb/ps -auxeww | grep pmon
If the value returned for ORACLE_HOME is different from the environment ORACLE_HOME, restart the database with the ORACLE_HOME set for the environment.
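The comparison can also be scripted. The following sketch parses one line of the ps output shown above and compares the ORACLE_HOME value it contains with the ORACLE_HOME in the current environment. The function names extract_oracle_home and check_oracle_home are hypothetical, and the parsing assumes that the Oracle home path contains no spaces.

```shell
#!/bin/sh
# Sketch only: extract_oracle_home and check_oracle_home are hypothetical helpers.
# Assumes the Oracle home path contains no spaces.

# Print the ORACLE_HOME=... value found in one line of "ps -auxeww" output.
extract_oracle_home() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^ORACLE_HOME=//p' | head -n 1
}

# Compare the ORACLE_HOME of the running process with the environment value.
check_oracle_home() {
    running_home=$(extract_oracle_home "$1")
    if [ "$running_home" = "$ORACLE_HOME" ]; then
        echo "match: $running_home"
    else
        echo "mismatch: database started with '$running_home', environment has '$ORACLE_HOME'"
        return 1
    fi
}
```

A possible invocation is check_oracle_home "$(/usr/ucb/ps -auxeww | grep '[p]mon' | head -n 1)". The pattern '[p]mon' keeps grep from matching its own process entry.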
A restore operation on one instance can change the farm topology leaving another instance on the farm in an inconsistent state.
Problem
The scenario: install core1 as a file-based repository host and take a cold backup. Install core2 and join it to core1 as a file-based repository client. Restore the file-based repository for core1. This will corrupt core2 as it was joined to core1 after the cold backup. Core2 points to core1 as the file-based repository host, but there is no record of core2 in core1 after the restore.
Resolution
Before restoring the file-based host (core1), run dcmctl leavefarm on core2. After restoring the repository, run dcmctl joinfarm on core2.
Alternatively, restore core2 with a backup taken prior to joining it to the core1 file-based repository.
Post-deployment changes to configuration files are lost after restoring DCM-managed component configurations.
Problem
After deploying Oracle Application Server, changes made to configuration files, such as web.xml (1 per application), are lost after the Backup and Recovery Tool restores DCM-managed component configurations.
Solution
After the restore operation completes, the web.xml files can be copied from the configuration backup using the following manual procedure:
1. Find the config_backup_path value in the ORACLE_HOME/backup_restore/config/config.inp file.
2. Change the current directory to the config_backup_path directory:
cd config_backup_path
3. Locate the config backup jar file containing the web.xml files with the changes.
4. Copy the config backup jar file to a temporary location:
cp config_bkp_yyyy-mm-dd_hh-mm-ss.jar /tmp
5. Unjar the config backup jar file at the temporary location:
cd /tmp
jar xvf config_bkp_yyyy-mm-dd_hh-mm-ss.jar
6. Find the web.xml files in the config backup directory:
cd config_bkp_yyyy-mm-dd_hh-mm-ss
On UNIX:
find . -name web.xml -print
./j2ee/home/applications/dms/WEB-INF/web.xml
./j2ee/home/applications/BC4J/webapp/WEB-INF/web.xml
./j2ee/home/default-web-app/WEB-INF/web.xml
7. Restore the web.xml files into the ORACLE_HOME:
cp j2ee/home/applications/dms/WEB-INF/web.xml ORACLE_HOME/j2ee/home/applications/dms/WEB-INF/web.xml
cp j2ee/home/applications/BC4J/webapp/WEB-INF/web.xml ORACLE_HOME/j2ee/home/applications/BC4J/webapp/WEB-INF/web.xml
cp j2ee/home/default-web-app/WEB-INF/web.xml ORACLE_HOME/j2ee/home/default-web-app/WEB-INF/web.xml
Alternatively, you can combine steps 6 and 7 in a script. For example, in a UNIX C shell (csh) script:
CSH> foreach i (`find . -name web.xml -print`)
CSH> cp $i $ORACLE_HOME/$i
CSH> end
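For environments that use a Bourne-style shell instead of csh, the same loop might look like the following. This is a sketch only: the function name restore_web_xml is hypothetical, the first argument is the unpacked config backup directory, and the second argument must be the absolute path of the Oracle home. Like the manual cp commands, it copies only into directories that already exist and reports any file it cannot copy.

```shell
#!/bin/sh
# Sketch only: restore_web_xml is a hypothetical helper.
# $1 = unpacked config backup directory, $2 = absolute path of the Oracle home.

restore_web_xml() {
    backup_dir=$1
    oracle_home=$2
    (
        # Work from the backup directory so that find prints relative paths.
        cd "$backup_dir" || exit 1
        find . -name web.xml -print | while IFS= read -r f; do
            rel=${f#./}
            # Copy each web.xml to the same relative location in the Oracle home.
            cp "$f" "$oracle_home/$rel" || echo "could not copy $rel" >&2
        done
    )
}
```

For example: restore_web_xml /tmp/config_bkp_yyyy-mm-dd_hh-mm-ss "$ORACLE_HOME".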