Oracle® Application Server Wireless Administrator's Guide
10g Release 2 (10.1.2) B13820-02
This chapter, through the following sections, discusses factors that enable application developers to optimize Oracle Application Server Wireless.
Section 13.5, "Optimizing the Performance of the Oracle HTTP Server"
Section 13.6, "Optimizing the Oracle Process Management and Notification Service (OPMN)"
Section 13.10, "Optimizing the OC4J_Wireless Server Instance"
Section 13.11, "Tuning the Performance of the Operating System"
Oracle Application Server Wireless, when installed, initializes a default setup that is appropriate for the performance of most applications. However, you may need to use additional tuning knobs to adjust performance, since applications vary in features, hardware setup, and performance requirements.
This chapter discusses the tuning options and methods available within Oracle Application Server Wireless and the performance logger utility. It also discusses JVM tuning, JDBC connection performance, and TCP/IP stack tuning.
Note: Throughout the documentation, you can substitute UNIX for Solaris in all instances except for this chapter. The tuning knobs described in this chapter are Solaris-specific.
You can view the performance statistics of the Transport system from Oracle Enterprise Manager 10g Application Server Control. First select the middle-tier node on the Farm page, and then click Wireless on the Application Server Home page to access the OracleAS Wireless system management functions described in Chapter 3, "Managing the OracleAS Wireless Server". Click the Site Performance tab of the Wireless Home page, and then, from the Component Performance section, select Messaging Servers. The Messaging Server performance page appears (Figure 13-1).
Figure 13-1 The Messaging Server Performance Metrics
This page displays the client side and server side Messaging Server performance metrics. For each of the Messaging Server performance metrics, Wireless displays performance data by process name and delivery type (for example, SMS).
The client side performance metrics include:
Average Sending Response Time
The average time of a sending method. On the client side, a sending method is called to send a message. This time is the period from when the method is called to the time the method returns. When the method returns, the message is saved in a database persistently, but is not delivered.
Total Number of Sending Requests
The total number of times that the sending method is called by the client process. A sending method called once to send a message to a set of destinations counts as a single sending request.
Total Number of Sending Requests Sent
The total number of successful calls, where a message is delivered to a proper gateway and its receipt is acknowledged. The client process can call the sending method many times to send many messages. Some of these requests fail, as in the case where a destination cannot be reached. Other requests could be undergoing processing.
Total Number of Sending Requests Failed
The total number of all calls that are known to have failed.
Average Receiving Process Time
The performance of the listener in terms of the time taken by the onMessage call-back.
The server-side performance metrics include:
Average Sending Process Time
The performance of a driver in terms of the time taken by the sending method of the driver. The driver performance is measured by delivery type (for example, SMS), process time (the time taken by a driver to send a message to the proper gateway), dequeue time, and driver process time. When you measure the performance of the transport system, you can deduct the process time, because the transport system is waiting while the driver sends a message. If the driver is fast, then the system does not wait long.
Average Receiving Response Time
Once a transport driver receives a message, the message is passed to the transport system by an onMessage method. The response time is the time taken by the onMessage method. Once onMessage returns, the received message is saved in a database for dispatching.
Total Number of Received Messages
The total number of times the transport drivers call the onMessage call-back method.
Total Number of Received Messages Dispatched
The total number of received messages which are dispatched to, and are accepted by, the listeners. Among received messages, some may be in processing. Others may not have been dispatched to listeners, or listeners may have failed to process dispatched messages.
Total Number of Received Messages Dispatch Failed
The total number of received messages which failed to dispatch to a listener.
For more information on the Site Performance tab, see Section 3.8.
This section describes the factors that affect Messaging Server performance.
The sending and receiving threads are the number of threads spawned by the Messaging Server to call a driver's send and receive methods.
Every delivery type has an associated driver queue in which the Messaging Server queues the messages that are to be sent using a specific delivery type. Every messaging client has an associated service queue in which the Messaging Server queues the messages that it receives from the gateway.
By increasing the number of sending threads, you also increase the number of threads that both dequeue the message from the driver queue and pass it to the send method that performs the send operation by submitting the message to the gateway. If the dequeue rate from the driver queue is low, then increasing the number of sending threads enhances performance.
Likewise, by increasing the number of receiving threads, you also increase the number of threads that both receive messages from the gateway and queue them to the service queue for a messaging client. The messaging client then receives messages from this queue. Increasing the number of working threads improves performance if the enqueue rate is low or if there are many pending messages in the gateway.
Tips:
The name of the table that queues the messages for the driver to send is of the form trans_t_<queue_id>. The <queue_id> for a delivery type is found in the trans_driver_queue table. For example, if the <queue_id> for the SMS delivery type is 1, then all of the messages sent using the SMS delivery type are in the table trans_t_1.
The Messaging Server receives a message from the gateway and then queues it into AQ if any client or service has been registered for the message. All messages received by a driver for a particular service or client are queued in a queue table of the form trans_t_<queue_id>. The <queue_id> for a service name is found in the trans_service_queue table. For example, if the <queue_id> for a service called ASYNCAGENT is 2, then all of the messages for this service are stored in the queue table called trans_t_2.
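You can inspect these queue tables directly from SQL*Plus as a quick check on queue depth. The following is a minimal sketch only: it assumes the table names described above (trans_driver_queue, trans_service_queue, and a queue table such as trans_t_1) and uses SELECT * because the exact column layout is not documented in this chapter.
SQL> -- Connected to the repository as the wireless user
SQL> -- List the delivery-type and service queue mappings
SQL> SELECT * FROM trans_driver_queue;
SQL> SELECT * FROM trans_service_queue;
SQL> -- Count the messages currently queued for the SMS delivery type (queue id 1 in the example above)
SQL> SELECT COUNT(*) FROM trans_t_1;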
You can increase the number of sending and receiving threads by editing a driver instance. To edit a driver instance, first select the Messaging Server process that has the driver instance whose thread values you need to edit from the Standalone Processes section of the OracleAS Wireless System Manager Home page (accessed through Oracle Enterprise Manager 10g Application Server Control, as described in Section 13.2). From the Driver Instances section of the detail page for the selected Messaging Server process, select the driver and then click Edit. The Properties page for the selected driver appears (Figure 13-2), with its fields populated by the values set for the selected driver. Change the values of the Sending Threads and Receiving Threads parameters as needed and then click OK to commit the changes.
Figure 13-2 The Driver Instance Properties Page
The Messaging Server client is a service that is registered with the Messaging Server for sending and receiving messages (for example, the Async Listener and the Notification Engine are clients of the Messaging Server). A service queue is associated with every client that is registered to receive messages from the Messaging Server. The service queue contains all of the messages intended for the client and received by the Messaging Server.
Messaging Server client threads are the number of threads that the client uses to dequeue the messages from its service queue. To find the name of the service queue table for a client, refer to Section 13.2.1.1.2. If the dequeuing rate from the service queue is low, then you can improve performance by increasing the number of Messaging Server client threads. You can increase the client threads for Messaging Server clients using the appropriate configuration page for the Messaging Server client in the OracleAS Wireless system management described in Chapter 3, "Managing the OracleAS Wireless Server". For example, increasing the thread pool size of the Messaging Server client for the Async Listener configures the Messaging Server client threads for the Async Listener. For more information, see Section 3.10.2.1.
The JDBC Connection Pool parameter defines the size of the database connection pool. If you increase the number of driver or client threads to more than 10, then increasing the maximum number of connections from 10 (the default value) to 50 ensures decent performance at peak load.
If a very high load is expected to hit the Messaging Server and if performance data logged by the Performance Monitor is not required, then stopping the Performance Monitor process will improve performance. At high loads, this process has been observed to consume considerable resources, resulting in low throughput.
AQ (Advanced Queuing) operations result in a high number of insertions into, and deletions from, the database. Hence, I/O activity on the database is high and needs careful tuning. Based on the volume of operations, you may want to increase the number of I/O controllers on the machine.
In the test environment, the following observations have been verified.
With 3 I/O controllers, a throughput of 40 messages per second was achieved with 7 sending threads.
With 12 I/O controllers, a throughput of 100 messages per second was achieved with 9 sending threads.
Cleaning database tables can improve performance if the Messaging Server or the Performance Monitor process has been running for a long time. You can delete the following tables if the Messaging Server has been running for a long time and if the listed tables contain many rows:
Trans_message
Trans_store
Trans_ids
Note: Before deleting these tables, be sure that you do not need message details about the messages received and sent by the Messaging Server.
If the Performance Monitor process has been running for a long time and if the listed tables contain many rows, then you can truncate the following tables to improve performance, as shown in the sketch after the note below:
trans_request_log
trans_handle_log
trans_process_log
trans_enqueue_log
trans_dequeue_log
Note: Before truncating these tables, be sure that you do not need any performance data.
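For example, the following SQL*Plus sketch truncates the logging tables listed above. It assumes you are connected to the repository as the wireless user; TRUNCATE cannot be rolled back, so run it only after confirming that the performance data is no longer needed.
SQL> -- Remove accumulated Performance Monitor data
SQL> TRUNCATE TABLE trans_request_log;
SQL> TRUNCATE TABLE trans_handle_log;
SQL> TRUNCATE TABLE trans_process_log;
SQL> TRUNCATE TABLE trans_enqueue_log;
SQL> TRUNCATE TABLE trans_dequeue_log;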
To recreate the transport repository from scratch, run the following scripts from SQL*Plus connected to the repository as the wireless user. Be sure to stop the middle tier before running the scripts, which are located in the wireless/repository/sql directory (a sketch of such a session follows the script list):
trans_clean.sql
trans_setup.sql
trans_setup.pls
trans_setup.plb
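For example, a session of the following form runs the scripts in the order listed above. The ORACLE_HOME prefix and the connect string are assumptions to adapt to your environment, and you should verify the intended script order for your release.
> cd $ORACLE_HOME/wireless/repository/sql
> sqlplus wireless/<password>@<repository_connect_string>
SQL> @trans_clean.sql
SQL> @trans_setup.sql
SQL> @trans_setup.pls
SQL> @trans_setup.plb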
The performance logging framework at the Web Server level collects performance-related data for an Async Listener. To view this data, you first select an Async Listener process (located in the Web-Based Processes section of the OracleAS Wireless System Manager Home page). The detail page for the selected process appears. Clicking the Performance tab of the detail page invokes the process's Performance page. The page includes the following performance metrics:
Number of Messages Received
The number of messages received, grouped by process ID.
Average Message Response Time (seconds)
The average time a message stayed on the server.
Average Message Queue Size
The average size of the message queue on an hourly basis for today.
Service Access Count
The number of times that each application was accessed today.
User Access Count
The number of messages issued by each user device.
Number of Errors
The number of errors on an hourly basis.
The following sections describe the tuning knobs available in OracleAS Wireless that affect Async Listener performance:
Section 13.3.1.1, "Tuning the Working Threads for the Async Listener"
Section 13.3.1.2, "Adjusting the Thread Pool Size of Messaging Server Client"
Section 13.3.1.3, "Adjusting the Sending and Receiving Threads"
The Async Listener Configuration page (Figure 13-3) enables you to change the number of working threads for the Async Listener. By default, the value for the Working Threads parameter is 10. You can increase this parameter to a higher value to accommodate a higher request rate.
Figure 13-3 Configuring the Async Worker Threads
Increasing the size of the thread pool enables the Messaging Server client to handle higher loads. You can adjust the size of the thread pool from the Messaging Server Client page (Figure 13-4). To access this page, select Messaging Server Client (located under Notification Engine in the Component Configuration section of the Administration tab). For more information on configuring the Messaging Server client, see Section 3.10.2.1.
Figure 13-4 Adjusting the Thread Pool Size
You can also speed up dequeuing and enqueuing by increasing the number of sending and receiving threads. For more information, see Section 13.2.1.1.3.
Parsing input is a costly operation. The performance of such operations depends largely on the amount of memory available to the Java Virtual Machine (JVM). To handle a large feed size, you can increase the heap size of the Data Feeder process. Normally, parsing XML feeds consumes more resources than parsing CSV (comma-separated value) feeds.
In the test environment, the following observations have been verified.
With a large XML feed of 25 MB, a throughput of 43 data rows per second was achieved by using a heap size of 512 MB.
For a CSV feed of the same volume, a throughput of 48 data rows per second was achieved.
The following sections describe the configuration directives that you can tune in httpd.conf (located in the ORACLE_HOME/Apache/Apache/conf/ directory) to enhance performance of the Oracle HTTP Server (OHS).
MaxClients
This is the maximum number of servers that can run. Use an optimum number based on load. A low number causes clients to be locked out; a high number of servers consumes more resources.
MaxRequestsPerChild
The number of requests that a child process handles before it expires and gets re-spawned. The default value 0 means that it never expires. As a result, you should limit this value; generally, 10000 is sufficient.
MaxSpareServers
This is the maximum number of pre-spawned processes that are available in the pool of Apache processes that handle connections. The suggested value may vary, but 10 will suffice for most requirements.
MinSpareServers
This is the minimum number of child processes that must be pre-spawned at all times. The value 5 will suffice for most requirements.
StartServers
The number of servers to start initially. If a sudden load is expected on startup, then increase this value.
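As an illustration only, an httpd.conf excerpt that applies these suggestions might look as follows. The MaxClients value of 1024 comes from the OC4J_Wireless discussion later in this chapter, and the StartServers value is an example to size for your own startup load.
MaxClients           1024
MaxRequestsPerChild  10000
MaxSpareServers      10
MinSpareServers      5
StartServers         5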
Because the default file descriptor limit per JVM is low, you should increase it to a higher value in the ORACLE_HOME/opmn/bin/opmnctl script by modifying (or adding) the following line:
> ulimit -n 2048
Note: You must bounce OPMN after you increase the value of the default file descriptor limit.
Oracle Application Server uses database connections for Single Sign-On (SSO), Oracle Internet Directory (OID), and other components. Because the default number of connections may not suffice for a high number of users, increase this number as the number of users increases. You can do so by modifying the relevant settings in the database.
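This guide does not name the specific settings; one common approach, offered here only as a hedged example, is to raise the processes initialization parameter (from which sessions is derived) in the repository database and restart it. The value 300 is illustrative.
SQL> -- Connected to the repository database as SYSDBA, using a server parameter file
SQL> ALTER SYSTEM SET processes=300 SCOPE=SPFILE;
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP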
The Web Cache capacity should be set to a high value, depending upon the load. For example, if you receive 50 requests per second, set the capacity to 1000. You can increase the Web Cache capacity by editing the properties of the Web Cache's origin server. To edit the origin server:
Click Web Cache in the Application Server Home page of the Enterprise Manager. The Web Cache Home page appears.
Click the Administration tab.
Click Origin Servers (located in the Application section under Properties). The Origin Server page appears.
Select an origin server and then click Edit. The Edit Origin Server page appears.
Increase the value in the Capacity field.
Click OK to commit the change.
Note: Depending on the incoming requests and the size of documents to be cached, you must also change the maximum incoming connections and the maximum cache size. You can change these values using the Resource Limits and Timeouts page (accessed by clicking Resource Limits and Timeouts in the Web Cache section of the Web Cache Administration tab).
Because Java applications run within the context of the JVM, some default properties of the JVM must be modified to enable the applications to run faster and consume fewer resources.
Since garbage collection (GC) was not a parallel process until Java 1.3.1, it can become a principal performance bottleneck as the number of CPUs increases. Java 1.4.2 implements the concept of generational garbage collection. Generations are memory pools that hold objects of different ages. There are two GC cycles: Minor Collection and Major Collection.
Minor Collection
Typically, young objects, which comprise the young generation, die fast. A Minor Collection occurs when the memory pool (or generation) of young objects fills up. During a Minor Collection, live objects from the young generation memory pool are eventually copied to a tenured pool. The young generation consists of Eden, where objects are initially allocated, and two survivor spaces. One of these survivor spaces remains empty and serves as the destination to which live objects in Eden and the other survivor space are copied. Objects are copied between the survivor spaces until they become old enough to be tenured (copied to the tenured generation).
Major Collection
The collection of the older (tenured) generation. Typically, a Major Collection is slower than a Minor Collection because it involves all live objects.
The first step in tuning is to observe the frequency of GC by using the following command-line option:
> java -verbose:gc classname
This command results in output similar to Example 13-1.
Example 13-1 GC Frequency
[GC 866K->764K(1984K), 0.0037943 secs]
[GC 1796K->1568K(2112K), 0.0068823 secs]
[Full GC 2080K->1846K(3136K), 0.0461094 secs]
[GC 2047K->1955K(3136K), 0.0157263 secs]
For more information on parallel GC, see Section 13.10.1.
The following knobs (available within Java 1.4.2) change the default GC behavior by modifying the heap and generation size.
-Xms and -Xmx
The total size of the heap is bounded by the -Xms and -Xmx values. -Xms is the minimum size of the heap and -Xmx is the maximum size to which the heap can grow. A larger heap reduces the frequency of collections. Increase the heap size as the number of processors increases, since allocation can be done in parallel.
-XX:NewSize and -XX:MaxNewSize
These options are specific to Sun Microsystems' HotSpot VM. The young generation size is bounded by these values. A smaller young generation, which means a faster rate of Minor Collections and a lower frequency of Major Collections, is best suited for Web applications.
By changing these parameters, you can change the frequency of collections as desired by the application.
Other knobs that improve GC performance include:
-XX:NewRatio
NewRatio controls the young generation size. For example, setting -XX:NewRatio=3 means that the ratio between the young and tenured generations is 1:3.
-XX:SurvivorRatio
Use the SurvivorRatio parameter to tune the size of the survivor spaces. For example, -XX:SurvivorRatio=6 sets the ratio between each survivor space and Eden at 1:6. In other words, each survivor space is one-eighth of the young generation (not one-seventh, because there are two survivor spaces).
-XX:+UseParallelGC
The throughput collector is a generational collector similar to the default collector, but with multiple threads that perform minor collection. Enable the throughput collector using the command-line flag -XX:+UseParallelGC.
-XX:ParallelGCThreads
Use the ParallelGCThreads command-line option to control the number of garbage collector threads.
-Xss
This is the size of the stack on a per-thread basis. Its default value varies from platform to platform. If the number of threads running in the application is high, then you can decrease the default size. If, for example, the threads require a large stack for parsing operations and recursive calls, then increasing the stack size can provide a significant performance increase.
Tune the values of these options according to the application type. Table 13-1 describes a typical setup for an E420/Solaris box with four 450 MHz processors and 4 GB of RAM to support 2000 concurrent users.
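For reference, the options described above can be combined on one line. The following value uses the example settings given in this section and in Section 13.10; treat it as a starting point rather than a definitive configuration.
-Xms512m -Xmx1024m -Xss256k -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:NewRatio=2 -XX:SurvivorRatio=16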
Modify these Java attributes in the Java Options field, located in the Command Line Options section of the Oracle Enterprise Manager's Server Properties page (Figure 13-5). To access this page, first select the OC4J_Wireless process on the Home page and then select Administration. Next, select Server Properties (located under Instance Properties). The Server Properties page appears.
The JVM heap-tuning options -Xms512m and -Xmx1024m increase the maximum memory size to 1 GB (or more) and enable the OC4J_Wireless server instance to support a large number of concurrent users.
To support higher hit rates, increase the MaxClients parameter in httpd.conf. For example, setting the MaxClients parameter to 1024 in httpd.conf allows up to 1024 concurrent HTTP requests. Consequently, increased requests result in an increased number of Oracle Application Server threads in the OC4J_Wireless server instance. To support large numbers of Oracle Application Server threads in the OC4J_Wireless server instance, reduce the thread stack size to 256k using the JVM option -Xss256k. The default thread stack size in the Solaris environment is 512k.
For more information on the JVM options, see Section 13.9.
For OC4J_Wireless server instances running on multi-CPU machines, set the JVM options to enable the Parallel Garbage Collection (GC) algorithm in JDK 1.4. Set the ParallelGCThreads parameter to the number of CPUs in the host. The following JVM options for JDK 1.4 increase the performance of the OC4J_Wireless server instance on 4-CPU Solaris machines:
-XX:+UseParallelGC -XX:ParallelGCThreads=4
The following GC tuning parameters provide improved performance for the OC4J_Wireless server instance:
-XX:NewRatio=2 -XX:SurvivorRatio=16
This section describes tuning methods for the operating system's performance of Oracle Application Server Wireless.
Correctly tuned TCP/IP settings improve performance. The indicators for changing default parameters are primarily TCP connection drops while making the three-way handshake, and the system refusing connections at a certain load.
Note: The information in this section applies only to Solaris.
Use the following UNIX command to check for TCP connection drops:
netstat -s | grep Drop
Note the values of the following counters: tcpListenDrop, tcpListenDropQ0, and tcpHalfOpenDrop. Any value other than zero suggests the need to change the TCP connection queue size. A non-zero value for tcpListenDrop suggests a bottleneck in executing the accept() call, while a non-zero value for tcpListenDropQ0 is an indication of a SYN flood or denial-of-service attack.
Use the following UNIX command to check if connections should be replenished more quickly:
netstat | grep TIME_WAIT | wc -l
Note the number of connections in the TIME_WAIT state. If the rate of establishing connections (load) is known, then you can compute the time taken to run out of connections. To ensure that new connections are readily available, you can decrease tcp_time_wait_interval to a low value of 10000 ms.
You can set most of these values using the UNIX command ndd. For example:
> ndd -set /dev/tcp tcp_time_wait_interval 10000
These parameters (described in Table 13-2) take effect after the application is restarted. Add them to the system startup file so that they are not lost after a reboot.
You must set tcp_conn_hash_size in the /etc/system file; the change takes effect only after a reboot.
Table 13-2 Operating System Performance Parameters
Parameter | Setting | Comments |
---|---|---|
tcp_time_wait_interval | 10000 | The timeout for disposing of closed-connection information. A low value makes new connections readily available. |
| 65536 | The size of the TCP transfer windows for sending and receiving data, which determines how much data can be sent without waiting for an acknowledgment. Increasing it can speed up large data transfers significantly. |
| 10240 | The size of the complete (and incomplete) connection queues. Generally, the default values are sufficient; however, increase these values to 10240 if connection drop problems are observed. |
| 4 | This setting changes the data transmission rate. Changing this value is important to work around bugs in some operating systems' implementations of the slow start algorithm. |
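To keep these settings across reboots, add the ndd commands to a system startup script. The parameter names other than tcp_time_wait_interval were not preserved in the table above; the names shown below are the standard Solaris tunables that match the table's descriptions and values, and are offered as assumptions to verify against your release.
> # Example startup-script fragment (run as root)
> ndd -set /dev/tcp tcp_time_wait_interval 10000
> ndd -set /dev/tcp tcp_xmit_hiwat 65536
> ndd -set /dev/tcp tcp_recv_hiwat 65536
> ndd -set /dev/tcp tcp_conn_req_max_q 10240
> ndd -set /dev/tcp tcp_conn_req_max_q0 10240
> ndd -set /dev/tcp tcp_slow_start_initial 4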
Solaris Kernel Recommendations
To enhance performance, change the Solaris Kernel performance parameters (described in Table 13-3) in the file /etc/system.
Table 13-3 Solaris Kernel Performance Parameters
Parameter | Value | Comment |
---|---|---|
| 8192 | The hard limit for the number of file descriptors. |
| 2048 | The soft limit for the number of file descriptors. |
| 0x4000 | The LWP stack size. |
| 0x4000 | The NFS stack size. |
| 1600 | By increasing |
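The parameter names were not preserved in this table. As a hedged example, the two file-descriptor limits are normally set in /etc/system through rlim_fd_max (hard limit) and rlim_fd_cur (soft limit); the remaining entries use the same set parameter=value form once you have confirmed the parameter names for your release.
* /etc/system excerpt: file-descriptor limits (takes effect after a reboot)
set rlim_fd_max=8192
set rlim_fd_cur=2048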