Oracle9i Recovery Manager User's Guide Release 2 (9.2) Part Number A96566-01
The primary goal of RMAN tuning is to create an adequate flow of data between disk and storage device. Tuning RMAN backup and restore operations involves the following tasks discussed in this chapter:
RMAN backup and restore operations have the following distinct components:
The slowest of these operations is called the bottleneck. RMAN tuning is the task of identifying the bottleneck (or bottlenecks) and attempting to make it more efficient by using RMAN commands, initialization parameter settings, or adjustments to physical media. The key to tuning RMAN is understanding I/O.
RMAN's backup and restore jobs use two types of I/O buffers: DISK and tertiary storage (usually tape). When performing a backup, RMAN reads input files using disk buffers and writes the output backup file by using either disk or tape buffers. When performing restores, RMAN reverses these roles.
Besides being divided into DISK and sbt, I/O is also divided into synchronous and asynchronous. Synchronous devices perform only one I/O task at a time, so you can easily determine how much time backup jobs require. In contrast, asynchronous I/O can perform more than one task at a time.
To tune RMAN effectively, you must thoroughly understand concepts such as synchronous and asynchronous I/O, disk and tape buffers, and channel architecture. When you understand these concepts, then you can learn how to use fixed views to monitor bottlenecks, and use the techniques described in "Improving RMAN Backup Performance" to solve problems.
This section contains these topics:
RMAN I/O uses two different types of buffers: disk and tape. These buffers are typically different sizes. To understand how RMAN allocates disk buffers, you must understand how RMAN multiplexing works, as described in "Multiplexed Backup Sets". Review this section before proceeding.
RMAN multiplexing is the number of files in a backup read simultaneously and then written to the same backup piece. The degree of multiplexing depends on the FILESPERSET parameter of the BACKUP command as well as the MAXOPENFILES parameter of the CONFIGURE CHANNEL or ALLOCATE CHANNEL commands.
For example, assume that you back up two datafiles with one channel. You set FILESPERSET to 3 and set MAXOPENFILES to 8. In this case, the number of files in each backup set is 2 (the lesser of FILESPERSET and the files read by each channel), and so the level of multiplexing is 2 (the lesser of MAXOPENFILES and the number of files in each backup set).
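The rule in the preceding example can be written as a short calculation. The following Python sketch is illustrative only; the function and argument names are assumptions made for this example and are not RMAN syntax.

```python
def multiplexing_level(files_per_channel: int, filesperset: int, maxopenfiles: int) -> int:
    """Return the level of multiplexing for one channel.

    The number of files in each backup set is the lesser of FILESPERSET
    and the number of files read by the channel. The level of multiplexing
    is the lesser of MAXOPENFILES and that number.
    """
    files_in_backup_set = min(filesperset, files_per_channel)
    return min(maxopenfiles, files_in_backup_set)


# Two datafiles, one channel, FILESPERSET=3, MAXOPENFILES=8
print(multiplexing_level(files_per_channel=2, filesperset=3, maxopenfiles=8))  # 2
```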
When RMAN backs up from disk, it uses the algorithm described in Table 14-1 to determine how many buffers to allocate and how large to make the buffers.
In the example shown in Figure 14-1, "Disk Buffer Allocation", one channel is backing up four datafiles on a robust striped disk configuration. MAXOPENFILES is set to 4 and FILESPERSET is set to 4. Hence, the level of multiplexing is 4, and the total size of the buffers for each datafile is 4 MB.
To calculate the total size of the buffers allocated in a backup set, multiply the total bytes for each datafile by the number of datafiles being concurrently accessed by the channel, and then multiply this number by the number of channels.
Assume that you use one channel to back up four datafiles, and use the settings shown in Figure 14-1. In this case, multiply as follows to obtain the total size of the buffers allocated for the backup:
4 MB/datafile x 1 channel x 4 datafiles/channel = 16 MB
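As a rough check of this multiplication, the following Python sketch computes the total disk buffer size from the three factors named above. The function name is an assumption for illustration; the 4 MB per datafile figure is taken from the Figure 14-1 example and applies only to that configuration.

```python
MB = 1024 * 1024


def total_disk_buffer_bytes(bytes_per_datafile: int,
                            datafiles_per_channel: int,
                            channels: int) -> int:
    """Total disk buffer memory: per-datafile size x datafiles/channel x channels."""
    return bytes_per_datafile * datafiles_per_channel * channels


# One channel reading four datafiles concurrently, 4 MB of buffers per datafile
total = total_disk_buffer_bytes(4 * MB, datafiles_per_channel=4, channels=1)
print(total // MB, "MB")  # 16 MB
```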
Set the MAXOPENFILES parameter so that the number of files read simultaneously is just enough to utilize the output device fully. This consideration is especially important when the output device is tape.
If you make a backup to an sbt device, then Oracle allocates four buffers for each channel for the tape writers (or readers, if performing a restore). Oracle allocates these buffers only if the channel is an sbt channel. Typically, each tape buffer is 256 KB. To calculate the total size of buffers used during a backup or restore, multiply the buffer size by 4, and then multiply this product by the number of channels.
As illustrated in Figure 14-2, assume that you use one tape channel and each buffer is 256 KB. In this case, the total size of buffers used during a backup is as follows:
256 KB/buffer x 4 buffers/channel x 1 channel = 1024 KB
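The same calculation for other buffer sizes and channel counts can be sketched in Python as follows. The function name is an assumption for illustration; the four buffers for each channel and the typical 256 KB buffer size come from the text above.

```python
KB = 1024
TAPE_BUFFERS_PER_CHANNEL = 4  # Oracle allocates four tape buffers for each sbt channel


def total_tape_buffer_bytes(buffer_bytes: int, channels: int) -> int:
    """Total tape buffer memory: buffer size x 4 buffers/channel x channels."""
    return buffer_bytes * TAPE_BUFFERS_PER_CHANNEL * channels


print(total_tape_buffer_bytes(256 * KB, channels=1) // KB, "KB")  # 1024 KB
print(total_tape_buffer_bytes(256 * KB, channels=2) // KB, "KB")  # 2048 KB
```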
RMAN allocates the tape buffers in the SGA or the PGA, depending on whether I/O slaves are used. If you set the initialization parameter BACKUP_TAPE_IO_SLAVES = true, then RMAN allocates tape buffers from the SGA, or from the large pool if the LARGE_POOL_SIZE initialization parameter is set. If you set the parameter to false, then RMAN allocates the buffers from the PGA.
If you use I/O slaves, then set the LARGE_POOL_SIZE initialization parameter to set aside SGA memory dedicated to holding these large memory allocations, so that the RMAN I/O buffers do not compete with the library cache for SGA memory.
When RMAN reads or writes data, the I/O is either synchronous or asynchronous. When the I/O is synchronous, a server process can perform only one task at a time. When it is asynchronous, a server process can begin an I/O and then perform other work while waiting for the I/O to complete. It can also begin multiple I/O operations before waiting for the first to complete.
You can set initialization parameters that determine the type of I/O. If you set BACKUP_TAPE_IO_SLAVES to true, then the tape I/O is asynchronous. Otherwise, the I/O is synchronous. It is recommended that you always set BACKUP_TAPE_IO_SLAVES to true.
Some operating systems support native asynchronous I/O, and Oracle takes advantage of this feature if it is available. On operating systems that do not support native asynchronous I/O, Oracle can simulate it by using special I/O slave processes that are dedicated to performing I/O on behalf of another process. You can control disk I/O slaves by setting the DBWR_IO_SLAVES parameter to a nonzero value. Oracle allocates four backup disk I/O slaves for any nonzero value of DBWR_IO_SLAVES.
Figure 14-3 shows synchronous I/O in a backup to tape. The following steps occur:
Figure 14-4 shows asynchronous I/O in a tape backup. The following steps occur:
The following factors affect the speed of the backup to tape:
The tape native transfer rate is the speed of writing to a tape without compression. This speed represents the upper limit of the backup rate. The upper limit of your backup performance should be the aggregate transfer rate of all of your tape drives. If your backup is already performing at that rate, and if it is not using an excessive amount of CPU, then RMAN performance tuning will not help.
The level of tape compression is very important for backup performance. If the tape has good compression, then the sustained backup rate is faster. For example, if the compression ratio is 2:1 and native transfer rate of the tape drive is 6 MB/s, then the resulting backup speed is 12 MB/s.
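The relationship between native transfer rate, compression ratio, and sustained backup rate is a simple product. The following Python sketch restates the example above; the function name is an assumption for illustration, and the result is an upper bound that assumes the drive is kept streaming.

```python
def compressed_backup_rate(native_mb_per_s: float, compression_ratio: float) -> float:
    """Sustained backup rate (MB/s) when data compresses at compression_ratio:1."""
    return native_mb_per_s * compression_ratio


# Native rate of 6 MB/s with 2:1 compression
print(compressed_backup_rate(6.0, 2.0))  # 12.0
```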
One of the most interesting issues for backup performance is tape streaming. Almost all tape drives currently on the market are fixed-speed, streaming tape drives. In other words, these drives can only write data at one speed. As a result, when they run out of data to write to tape, they must slow down and stop. For example, when the drive's buffer empties, the tape is moving so quickly that it actually overshoots and must rewind past the point where it stopped writing.
The physical tape block size can affect backup performance. The block size is the amount of data written by media management software to a tape in one write operation. The common rule is that a larger tape block size leads to a faster backup. Note that physical tape block size is not controlled by RMAN or the Oracle server, but by media management software.
You can set various channel limit parameters that apply to operations performed by the allocated server session in the CONFIGURE CHANNEL and ALLOCATE CHANNEL commands.
You can use these parameters to do the following:
You can specify the channel parameters described in Table 14-2.
Parameter | Description |
---|---|
MAXPIECESIZE | Specifies the maximum size of a backup piece. Use this parameter to force RMAN to create multiple backup pieces in a backup set. RMAN creates each backup piece with a size no larger than the value specified in the parameter. |
RATE | Specifies the bytes/second that RMAN reads on this channel. Use this parameter to set an upper limit for bytes read so that RMAN does not consume excessive disk bandwidth and degrade online performance. For example, set |
MAXOPENFILES | Determines the maximum number of input files that a backup or copy can have open at a given time (default value is |
The BACKUP command lets you set parameters that influence how RMAN selects files for input into backup sets. You can set these parameters to do the following:
You can specify the parameters described in Table 14-3.
Note:
Control the number of datafiles accessed by a channel by setting |
See Also:
Oracle9i Recovery Manager Reference for |
Many factors can affect backup performance. Often, finding the solution to a slow backup is a process of trial and error. To get the best performance for a backup, follow the suggested steps in this section:
Make sure that the RATE parameter is not set on the ALLOCATE CHANNEL or CONFIGURE CHANNEL commands, as described in Table 14-2. The RATE parameter is intended to slow down a backup so that you can run it in the background with as little effect as possible on OLTP operations.
The RATE parameter specifies units of bytes/second. Test to find a value that improves performance of your queries while still letting RMAN complete the backup in a reasonable amount of time. Note that RATE is not designed to increase backup throughput, but to decrease backup throughput so that more disk bandwidth is available for other database operations.
If (and only if) you are backing up to an sbt device, then set the BACKUP_TAPE_IO_SLAVES initialization parameter to true to cause the tape buffers to be allocated from the SGA. You can control the buffer size with the PARMS parameter on the ALLOCATE CHANNEL or CONFIGURE CHANNEL command.
The BACKUP_TAPE_IO_SLAVES initialization parameter simulates asynchronous tape I/O by spawning an additional process to wait for tape I/O completion, leaving the primary process free to process additional disk blocks while waiting for tape I/O to complete. If you do not set this parameter, then I/O to the tape layer is synchronous, which means no other work can occur until the tape is done writing.
The BACKUP_TAPE_IO_SLAVES parameter requires that the buffers for the respective disk or tape I/O be allocated from the shared memory (SGA), so that they can be shared between two processes. Therefore, allocate a large enough SGA size to accommodate this memory usage. If you set the BACKUP_TAPE_IO_SLAVES parameter, then also set the LARGE_POOL_SIZE parameter.
If (and only if) your disk does not support asynchronous I/O, then try setting the DBWR_IO_SLAVES initialization parameter to a nonzero value. Any nonzero value for DBWR_IO_SLAVES causes a fixed number (four) of disk I/O slaves to be used for backup and restore, which simulates asynchronous I/O. If I/O slaves are used, I/O buffers are obtained from the SGA (or the large pool, if configured).
Set the LARGE_POOL_SIZE initialization parameter only if Oracle reports an error in the alert.log stating that it does not have enough memory and that it will not start I/O slaves. The message looks something like the following:
ksfqxcre: failure to allocate shared memory means sync I/O will be used whenever async I/O to file not supported natively
When attempting to get shared buffers for I/O slaves, Oracle does the following:
- If LARGE_POOL_SIZE is set, then Oracle attempts to get memory from the large pool. If this value is not large enough, then Oracle does not try to get buffers from the shared pool.
- If LARGE_POOL_SIZE is not set, then Oracle attempts to get memory from the shared pool.
- If Oracle cannot get enough memory, then it obtains I/O buffer memory from the PGA and writes a message to the alert.log file indicating that synchronous I/O is used for this backup.
The memory from the large pool is used for many features, including the shared server (formerly called multi-threaded server), parallel query, and RMAN I/O slave buffers. Configuring the large pool prevents RMAN from competing with other subsystems for the same memory.
Requests for contiguous memory allocations from the shared pool are usually small (under 5 KB) in size. However, it is possible that a request for a large contiguous memory allocation can either fail or require significant memory housekeeping to release the required amount of contiguous memory. Although the shared pool may be unable to satisfy this memory request, the large pool is able to do so. The large pool does not have a least recently used (LRU) list; Oracle does not attempt to age memory out of the large pool.
Use the LARGE_POOL_SIZE initialization parameter to configure the large pool. To see in which pool (shared pool or large pool) the memory for an object resides, query V$SGASTAT.POOL.
The Oracle9i formula for setting LARGE_POOL_SIZE is as follows:
LARGE_POOL_SIZE = number_of_allocated_channels * (16 MB + (4 * size_of_tape_buffer))
For backups to disk, the tape buffer size is 0, so set LARGE_POOL_SIZE to 16 MB for each allocated channel. For tape backups, the size of a single tape buffer is defined by the RMAN channel parameter BLKSIZE, which defaults to 256 KB. Assume a case in which you are backing up to two tape drives, with one channel for each drive. If the tape buffer size is 256 KB, then set LARGE_POOL_SIZE to 34 MB, that is, 2 * (16 MB + 4 * 256 KB). If you increase BLKSIZE to 512 KB, then increase LARGE_POOL_SIZE to 36 MB.
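The formula can be evaluated for any channel count and tape buffer size. The following Python sketch is illustrative only; the function name is an assumption for this example. Applied to two channels, the formula gives 34 MB for 256 KB buffers and 36 MB for 512 KB buffers.

```python
KB = 1024
MB = 1024 * KB


def large_pool_size_bytes(channels: int, tape_buffer_bytes: int = 0) -> int:
    """LARGE_POOL_SIZE = channels * (16 MB + 4 * size_of_tape_buffer).

    For backups to disk, leave tape_buffer_bytes at 0.
    """
    return channels * (16 * MB + 4 * tape_buffer_bytes)


# Two tape channels with the default and an increased BLKSIZE
for blksize in (256 * KB, 512 * KB):
    size = large_pool_size_bytes(channels=2, tape_buffer_bytes=blksize)
    print(blksize // KB, "KB ->", size // MB, "MB")
# 256 KB -> 34 MB
# 512 KB -> 36 MB
```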
See Also:
Oracle9i Database Concepts for more information about the large pool, and Oracle9i Database Reference for complete information about initialization parameters |
As explained in "Multiplexed Backup Sets", the level of multiplexing is determined by the following factors:
- The lesser of the number of files read by each channel and the FILESPERSET setting
- The lesser of the MAXOPENFILES value and the number of files going into each backup set
You should adjust the level of multiplexing to account for the disk configuration on the server. Striped disk configurations involve hardware multiplexing, so that the level of RMAN multiplexing does not need to be as high. For example, consider the following disk configuration scenarios:
For example, if each channel reads fifteen datafiles, and FILESPERSET=10 and MAXOPENFILES=8, then you can calculate the level of multiplexing as follows:
min( min( 15, 10 ), 8 ) = 8
If the datafiles are striped across two disks, then the multiplexing level is too high. In this case, you should set MAXOPENFILES to a lower value such as 6.
When performing a full backup of files that are largely empty, or when performing an incremental backup when few blocks have changed, you may not be able to supply data to the tape fast enough to keep it streaming. In either case, you can improve performance by increasing the level of multiplexing.
An incremental backup is an RMAN backup in which only modified blocks are backed up. Incremental backups are not necessarily faster than full backups because Oracle still reads the entire datafile to take an incremental backup. If tape drives are not locally attached, then incremental backups can be faster. You must consider how much bandwidth exists for reading the disks compared to the bandwidth for writing to the tapes. If tape bandwidth is limited compared to disk, then incremental backups may help.
If only a few blocks have changed in an incremental backup, then you need to input many buffers from the datafile before you accumulate enough blocks to fill a buffer and write to tape. Hence, the tape drive may not stream.
If you set the level of multiplexing (as described in "Step 5: Adjust the Level of Multiplexing") to a large value, then you can scan many datafiles in parallel, the output buffers for the tape drive are filled quickly, and you can write them frequently to keep the drive streaming. The FILESPERSET value should be less than or equal to MAXOPENFILES. For example, set both parameters to 8, and raise this value if the tape drive does not stream. For an incremental backup, 50 is a good value for the level of multiplexing. For a full or incremental level 0 backup, the level of multiplexing should be a lower value such as 4 or 8.
If the tape is not streaming, but the problem is not caused by an incremental backup or by backing up empty files, then you can try adjusting the block size of the tape buffer. You can change the size of each tape buffer using the PARMS parameter of the ALLOCATE CHANNEL or CONFIGURE CHANNEL command. If the BLKSIZE parameter for PARMS is supported on your platform, then you can set it to the desired size of each buffer. For example, configure an sbt channel as follows:
CONFIGURE CHANNEL DEVICE TYPE sbt PARMS="BLKSIZE=524288";
A good rule of thumb is to set BLKSIZE to a value that is a little less than the tape block size of the media manager. What "a little less" means depends on the media manager. For example, if the tape block size is 512 KB and the media manager has a header of size 16 KB, then you can set BLKSIZE=507904 (496 KB).
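Because BLKSIZE is specified in bytes, it helps to convert the sizes explicitly. The following Python sketch assumes that "a little less" means subtracting the media manager's header from the tape block size; that interpretation, and the function name, are assumptions for illustration, so check your media manager's documentation for the actual overhead.

```python
KB = 1024


def blksize_below_tape_block(tape_block_bytes: int, header_bytes: int) -> int:
    """Largest BLKSIZE (in bytes) that fits in one tape block alongside the header."""
    return tape_block_bytes - header_bytes


# 512 KB tape block with a 16 KB media manager header
print(blksize_below_tape_block(512 * KB, 16 * KB))  # 507904
```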
Note that it is also a good idea to increase the media management physical tape block size. For example, you do not want to set the BLKSIZE parameter to 512 KB and leave the physical tape block size as 32 KB.
If none of the previous steps improves backup performance, then try to determine the exact source of the bottleneck. Use the V$BACKUP_SYNC_IO and V$BACKUP_ASYNC_IO views to determine the source of backup or restore bottlenecks and to see detailed progress of backup jobs.
V$BACKUP_SYNC_IO contains rows when the I/O is synchronous to the process (or thread on some platforms) performing the backup. V$BACKUP_ASYNC_IO contains rows when the I/O is asynchronous. Asynchronous I/O is obtained either with I/O processes or because it is supported by the underlying operating system.
This section contains these topics:
See Also:
Oracle9i Database Reference for more information about these views |
To determine whether your tape is streaming when the I/O is synchronous, query the EFFECTIVE_BYTES_PER_SECOND column in the V$BACKUP_SYNC_IO or V$BACKUP_ASYNC_IO view. Table 14-4 describes how to use this column.
With synchronous I/O, it is difficult to identify specific bottlenecks because all synchronous I/O is a bottleneck to the process. The only way to tune synchronous I/O is to compare the rate (in bytes/second) with the device's maximum throughput rate. If the rate is lower than the rate that the device specifies, then consider tuning this aspect of the backup and restore process. The DISCRETE_BYTES_PER_SECOND column in the V$BACKUP_SYNC_IO view displays the I/O rate. Note that if you see data in V$BACKUP_SYNC_IO, then the problem is that you have not enabled asynchronous I/O or you are not using disk I/O slaves.
Long waits are the number of times the backup or restore process told the operating system to wait until an I/O was complete. Short waits are the number of times the backup or restore process made an operating system call to poll for I/O completion in a nonblocking mode. Ready indicates the number of times when I/O was already ready for use, so that there was no need to make an operating system call to poll for I/O completion.
The simplest way to identify the bottleneck is to query V$BACKUP_ASYNC_IO for the datafile that has the largest ratio for LONG_WAITS divided by IO_COUNT.
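Once the rows have been fetched from V$BACKUP_ASYNC_IO, ranking them by this ratio is straightforward. The following Python sketch works on rows that are already in memory; the sample filenames and counts are hypothetical values made up for illustration, and the dictionary keys simply mirror the FILENAME, LONG_WAITS, and IO_COUNT columns.

```python
# Hypothetical sample rows; real values would come from V$BACKUP_ASYNC_IO.
rows = [
    {"filename": "/u01/oradata/system01.dbf", "long_waits": 40, "io_count": 4000},
    {"filename": "/u01/oradata/users01.dbf", "long_waits": 900, "io_count": 3000},
    {"filename": "/u01/oradata/index01.dbf", "long_waits": 150, "io_count": 2500},
]


def long_wait_ratio(row: dict) -> float:
    """LONG_WAITS divided by IO_COUNT; 0.0 when no I/O has been counted."""
    if row["io_count"] == 0:
        return 0.0
    return row["long_waits"] / row["io_count"]


# The file with the largest ratio is the most likely bottleneck.
worst = max(rows, key=long_wait_ratio)
print(worst["filename"], f"{long_wait_ratio(worst):.2f}")
# /u01/oradata/users01.dbf 0.30
```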
Note: If you have synchronous I/O but you have set |
See Also:
Oracle9i Database Reference for descriptions of the |
Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.