Oracle® Application Server Personalization Programmer's Guide
10g Release 2 (10.1.2) B14051-01
This chapter provides an overview of the methods that are used to manage the recommendation engine proxy, to collect data, and to obtain recommendations, followed by usage notes for some of the methods. The supporting classes for these methods are described in Chapter 3.
For detailed descriptions of these methods, see the Javadoc in the OracleAS Personalization section of the Oracle Application Server 10g Documentation Library.
For examples of how to use these classes and methods, see Chapter 5 and the complete example in Appendix A. All these methods return results in real time.
All methods described in this chapter are public.
The real-time recommendation proxy (REProxyRT) methods can be divided according to function, as follows:
Proxy creation and management (the proxy manager and related methods)
Session management (create and close)
Data collection (collect, preprocess, and store data in recommendation engine (RE) tables)
Recommendations (obtain various types of recommendations)
To use REProxyRT (and its exceptions), you must include the following statements in your Java program:
import oracle.dmt.op.re.reapi.rt.*;
import oracle.dmt.op.re.reexception.*;
All these classes reside on the system where Oracle Application Server is installed.
REProxyManager handles a pool of REProxyRT instances. Using multiple REProxyRT instances within a Web server, such as Oracle Application Server, provides the following benefits:
Fault tolerance (if one instance fails, there is another to use)
Load distribution (the load can be spread among all proxy instances)
Domain-dependent recommendations (each proxy instance is associated with a specific RE)
Multiple proxy instances can result in the following issues:
Collected data may be lost when an instance of the proxy fails and the application shifts to another instance.
A given customer must be connected to the same RE for all transactions during a session.
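One way an application might satisfy the second constraint is to pin each session to a proxy name the first time the session is seen. The following is a minimal sketch under that assumption. SessionAffinitySketch, proxyFor, end, and the proxy names "re1" and "re2" are hypothetical and are not part of the REAPI. The round-robin choice is only one possible policy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical application-side helper; it does not call the REAPI.
final class SessionAffinitySketch {
    private final List<String> proxyNames;
    private final Map<String, String> assigned = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();

    SessionAffinitySketch(List<String> proxyNames) {
        this.proxyNames = proxyNames;
    }

    /** Returns the proxy name pinned to this session, choosing one round-robin on first use. */
    String proxyFor(String sessionId) {
        return assigned.computeIfAbsent(sessionId,
            id -> proxyNames.get(Math.floorMod(next.getAndIncrement(), proxyNames.size())));
    }

    /** Forgets the assignment once the session is closed. */
    void end(String sessionId) {
        assigned.remove(sessionId);
    }

    public static void main(String[] args) {
        SessionAffinitySketch affinity = new SessionAffinitySketch(Arrays.asList("re1", "re2"));
        System.out.println(affinity.proxyFor("session-1"));
        System.out.println(affinity.proxyFor("session-1")); // same proxy name as the line above
    }
}
```

Because the assignment is remembered until end is called, every transaction in a session is routed to the same named proxy, and therefore to the same RE.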
The REProxyManager class also includes a caching mechanism that supports data collection in the recommendation engine.
The REProxyRT class includes the DataCollection cache, which supports data collection in the RE. Every time you create an REProxyRT object, the cache is built as a subcomponent of the proxy object. When data is collected using the REAPI calls addItem() and addItems(), the data is stored in the cache (in memory) and is periodically flushed to the RE schema. This "batch save" improves RE performance. The refresh rate is defined by an input parameter to REProxyManager.createProxy().
Currently, only item and user ID data in the classes DataItem and IdentificationData are cached, and they are cached as current session data.
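The "batch save" idea can be illustrated with a small stand-in class. This sketch is not the DataCollection cache itself. BatchCacheSketch and all of its members are hypothetical, items are modeled as plain strings, and the flush is triggered by hand rather than by a timer.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a batch-saving cache; not part of the REAPI.
final class BatchCacheSketch {
    // Items collected since the last flush (held in memory).
    private List<String> buffer = new ArrayList<>();
    // Stands in for the RE schema: each element is one flushed batch.
    private final List<List<String>> archived = new ArrayList<>();

    synchronized void add(String item) {
        buffer.add(item);
    }

    /** Removes an item only if it has not been flushed yet. */
    synchronized boolean remove(String item) {
        return buffer.remove(item);
    }

    /** Swaps in an empty buffer and writes the full one as a single batch. */
    synchronized int flush() {
        List<String> full = buffer;
        buffer = new ArrayList<>();
        if (!full.isEmpty()) {
            archived.add(full);
        }
        return full.size();
    }

    synchronized int batchCount() {
        return archived.size();
    }

    public static void main(String[] args) {
        BatchCacheSketch cache = new BatchCacheSketch();
        cache.add("item-1");
        cache.add("item-2");
        System.out.println("Flushed " + cache.flush() + " items in one batch");
    }
}
```

In a running application, something like a scheduled task would call flush at the configured interval, so that many collected items are written in one operation instead of one write per item.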
REProxyManager is a singleton implementation; that is, only one instance of the REProxyManager class is created in a particular JVM instance, and the class is loaded automatically.
The REProxyManager class is used to create and manage the instances of REProxyRT. REProxyManager has only static public methods. REProxyManager does not have a public constructor and hence cannot be created by the user. REProxyManager maintains an REProxyRT pool and uses proxy names to reference individual REProxyRT objects.
The following methods manage REProxyRT objects:
createProxy
getProxy
destroyAllProxies
destroyProxy
For examples of how to use the proxy manager, see Chapter 5 and the complete example in Appendix A.
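The general shape of a static-only manager that keeps a name-keyed pool can be sketched as follows. This is a generic illustration of the pattern, not the REProxyManager implementation. NamedPoolSketch and its method signatures are invented for the example, and the pooled objects are plain Object placeholders.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in showing a static-only manager with a name-keyed pool.
final class NamedPoolSketch {
    private static final Map<String, Object> POOL = new ConcurrentHashMap<>();

    // No public constructor: callers use the static methods only.
    private NamedPoolSketch() { }

    /** Registers a new pooled object under a name; fails if the name is taken. */
    static Object create(String name) {
        Object fresh = new Object();
        if (POOL.putIfAbsent(name, fresh) != null) {
            throw new IllegalStateException("Name already in use: " + name);
        }
        return fresh;
    }

    /** Looks up a pooled object by name; returns null if none exists. */
    static Object get(String name) {
        return POOL.get(name);
    }

    /** Removes one pooled object; returns true if it existed. */
    static boolean destroy(String name) {
        return POOL.remove(name) != null;
    }

    /** Removes every pooled object. */
    static void destroyAll() {
        POOL.clear();
    }

    public static void main(String[] args) {
        Object created = create("re1");
        System.out.println(get("re1") == created); // the pool returns the same object by name
        destroyAll();
    }
}
```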
All recommendation requests are processed through the REProxyRT class. Obtain an REProxyRT object using createProxy or getProxy before you perform any recommendation tasks, such as handling sessions for a sessionful application, collecting customer profile data, and getting recommendations.
The following methods manage sessions:
createCustomerSession
createVisitorSession
closeSession
The following methods collect, preprocess, and store data in RE tables. The collected data can be persisted by setting appropriate configuration parameters:
addItem
addItems
removeItem
removeItems
The following method permits you to change a visitor to a customer (registered user):
setVisitorToCustomer
This method can be used in both sessionful and sessionless applications.
The following methods obtain and manage recommendations:
rateItem
rateItems
recommendTopItems
recommendBottomItems
recommendFromHotPicks
recommendCrossSellForItem
recommendCrossSellForItems
crossSellForItemFromHotPicks
crossSellForItemsFromHotPicks
selectFromHotPicks
Communicating the returned recommendations to the end user is the responsibility of the calling Web application. The calling Web application must also decide which recommendations to pass to the user. For example, the Web application may want to check that an item is in stock before recommending the item.
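The in-stock check mentioned above might look like the following on the application side. This is a sketch under stated assumptions: recommended items are represented simply as a list of long item IDs, the stock lookup is a plain in-memory set, and StockFilterSketch and inStockOnly are hypothetical names that are not part of the REAPI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical application-side filter; it does not call the REAPI.
final class StockFilterSketch {
    /** Keeps only the recommended item IDs that are in stock, preserving their order. */
    static List<Long> inStockOnly(List<Long> recommendedItemIds, Set<Long> inStockIds) {
        List<Long> result = new ArrayList<>();
        for (Long id : recommendedItemIds) {
            if (inStockIds.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> recommended = Arrays.asList(101L, 102L, 103L);
        Set<Long> inStock = new HashSet<>(Arrays.asList(101L, 103L));
        System.out.println(inStockOnly(recommended, inStock)); // prints [101, 103]
    }
}
```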
The methods that return recommendations do not necessarily return a list of items. If you set FilteringSettings.CategoryMembership to one of the values
Enum.CategoryMembership.EXCLUDE_CATEGORIES
Enum.CategoryMembership.INCLUDE_CATEGORIES
Enum.CategoryMembership.SUBTREE_CATEGORIES
Enum.CategoryMembership.ALL_CATEGORIES
then the recommendation methods (such as recommendTopItems) return categories.
Categories are components of a taxonomy. Taxonomies are defined in the following tables in the mining table repository (MTR):
MTR_TAXONOMY
MTR_TAXONOMY_CATEGORY
MTR_TAXONOMY_CATEGORY_ITEM
MTR_CATEGORY
An appropriate taxonomy is crucial to the design of an OracleAS Personalization application. For information about how to create taxonomies, see Oracle Application Server Personalization Administrator's Guide.
Ratings in OracleAS Personalization are in "ascending order of goodness", that is, the higher the rating, the more the user prefers the item. Low rated items are items that the user does not prefer. OracleAS Personalization algorithms use these assumptions, so it is important that ratings are in ascending order of goodness. Note that sorting in ascending or descending order is possible on recommendations containing ratings.
The meaning of the value returned for recommendation instances where ItemDetailData.attribute is equal to Enum.RecommendationAttribute.PREDICTION depends on the value of interestDimension as follows:
For InterestDimension.RATING, the expected rating for the item is returned.
For InterestDimension.PURCHASING or InterestDimension.NAVIGATION, the ranking is returned. The most probable item is assigned a value of 1, and the other items are assigned integer values representing their rank in order of decreasing probability.
OracleAS Personalization uses rule tables stored in the RE to generate the recommendations requested by the recommendation methods. The rule tables are created in the MOR when a package is built and deployed to the RE. The specific rule table used depends upon the REAPI call made. In general, the antecedents of the rules are matched against the data in cache (both historical and current session data) and the probabilities of the various consequents are computed. These items are then ordered by probability, and numberOfItems (an API argument) items are returned.
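The ordering and ranking described above can be modeled in a few lines. This sketch is illustrative only and does not reproduce how the RE computes probabilities. RankingSketch and rankByProbability are hypothetical names, item identifiers are plain strings, and the probabilities are made-up inputs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models the ordering described above, not the RE's implementation.
final class RankingSketch {
    /**
     * Orders items by descending probability, keeps at most numberOfItems,
     * and assigns rank 1 to the most probable item.
     */
    static Map<String, Integer> rankByProbability(Map<String, Double> probabilities, int numberOfItems) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(probabilities.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        Map<String, Integer> ranks = new LinkedHashMap<>();
        for (int i = 0; i < entries.size() && i < numberOfItems; i++) {
            ranks.put(entries.get(i).getKey(), i + 1);
        }
        return ranks;
    }

    public static void main(String[] args) {
        Map<String, Double> probabilities = new LinkedHashMap<>();
        probabilities.put("itemA", 0.20);
        probabilities.put("itemB", 0.55);
        probabilities.put("itemC", 0.35);
        System.out.println(rankByProbability(probabilities, 2)); // prints {itemB=1, itemC=2}
    }
}
```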
For detailed descriptions of these methods, see the OracleAS Personalization Javadoc included in the OracleAS Personalization section of the Oracle Application Server 10g Documentation Library. This section provides an overview of the methods and how to use them.
For both createCustomerSession and createVisitorSession, the calling Web application must provide session IDs that are unique among currently active sessions. If either method is invoked with a session ID that is currently active at the RE, an exception is thrown. However, a session ID can be reused as long as that session ID is not already active at the RE. appSessionID is synchronized to the MTR by OracleAS Personalization. (For more information about data synchronization, see the administrator's guide.) OracleAS Personalization has no way to tell whether customerID and appSessionID are valid values; it is the responsibility of the calling Web application to verify that these values are valid.
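Because uniqueness among active sessions is the application's responsibility, the application may want to track which IDs are in use. The following is a minimal sketch of such bookkeeping. SessionIdRegistry and its methods are hypothetical and do not call the REAPI, and session IDs are modeled as strings purely for illustration.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical application-side helper; it does not call the REAPI.
final class SessionIdRegistry {
    private final Set<String> active = ConcurrentHashMap.newKeySet();

    /** Generates an ID that is not in use by any currently active session. */
    String open() {
        String id;
        do {
            id = UUID.randomUUID().toString();
        } while (!active.add(id)); // add() returns false if the ID is already active
        return id;
    }

    /** Claims a caller-supplied ID; returns false if that ID is currently active. */
    boolean tryOpen(String id) {
        return active.add(id);
    }

    /** Marks the session closed, after which its ID may be reused. */
    void close(String id) {
        active.remove(id);
    }

    public static void main(String[] args) {
        SessionIdRegistry registry = new SessionIdRegistry();
        System.out.println(registry.tryOpen("s-1")); // true: not active yet
        System.out.println(registry.tryOpen("s-1")); // false: already active
        registry.close("s-1");
        System.out.println(registry.tryOpen("s-1")); // true: reusable after close
    }
}
```

An application would consult such a registry before creating a session and release the ID after closing it, which avoids the exception raised for an ID that is still active.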
To collect data, use addItem or addItems. Use removeItem or removeItems to remove data from the local cache.
For both addItem and addItems, items are cached locally first and then written to the RE at intervals; the frequency of the writes is specified as a configuration parameter when OracleAS Personalization is installed. The data synchronization interval must be short enough to support the Web application's requirements. For more information about data synchronization, see the administrator's guide.
When an application needs to add several items at a time, it can use either several addItem calls or one addItems call. When using addItems, the application must maintain the details of the items to be added until the call is made; in other words, the application needs to keep the state. It may be simpler to issue several addItem calls.
addItem and addItems are asynchronous, so the calling application does not need to wait until either call saves the data to the RE database.
Data collected in the RE is automatically written to the MTR, if the RE is configured to do so.
removeItem and removeItems remove items that have not been written to the MTR (permanent storage). Once data is written to the MTR, you cannot use these methods to remove the data. For example, if a customer returns an item after purchasing it, the item must be removed from the MTR or from the source sales database. Similarly, if a customer abandons the purchase process before completing it, the items selected for purchase must be removed from the MTR.
In createProxy, you must specify a cache size and an interval. This section describes how to determine these values.
It takes experimentation to determine an optimum interval coupled with an appropriate cache size.
A good way to configure cache size and interval is the following:
Set cache size to approximately 3027 kilobytes.
Set interval according to the estimated data collection rate. (Example provided in Section 4.4.3.1, "Cache Size".)
Test.
Adjust the archive interval.
The cache size is the size of the cache used by the recommendation engine, in kilobytes.
There are several factors to consider when determining the cache size:
System resources: Since cache takes memory space, you must make sure that you have enough memory to do what you want.
Archive interval: The longer the interval, the larger the cache size.
Maximum VArray size: The PL/SQL procedure that performs the archive uses VArrays, and the maximum size is currently set at 5000. The archive can handle more than 5000 items, but performance worsens progressively above that size. Therefore, a cache buffer that holds more than 5000 items is not recommended. Each data item stored in the cache takes up about 340 bytes, so the maximum VArray size translates to a cache size of 3.4 MB (the actual cache buffer size is half of that, since the cache has two buffers).
Data collection rate, the most important factor: If the data collection rate is no more than 100 items per second and the archive interval is 20 seconds, then, assuming a safety factor of 1.5 to ensure that no data is dropped, each interval collects about 100 * 340 * 1.5 * 20 = 1,020,000 bytes, or roughly 1 MB. Because the cache has two buffers, a reasonable cache size is approximately 2 MB.
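The estimate in the last factor can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: CacheSizeEstimate and its methods are hypothetical names that are not part of the REAPI, and the 340-byte item size and 1.5 safety factor are the figures quoted above. The product itself evaluates to 1,020,000 bytes for one interval. Reading the 2 MB figure as twice that amount, to account for the cache's two buffers, is an assumption.

```java
// Hypothetical helper, not part of the REAPI: estimates how much collected
// data accumulates during one archive interval.
final class CacheSizeEstimate {
    // Approximate size of one cached data item, as quoted in the text.
    static final int BYTES_PER_ITEM = 340;

    /** Bytes collected in one interval, padded by a safety factor. */
    static long bytesPerInterval(int itemsPerSecond, int intervalSeconds, double safetyFactor) {
        return Math.round(itemsPerSecond * (double) BYTES_PER_ITEM * safetyFactor * intervalSeconds);
    }

    /** Number of items collected in one interval, padded by the same safety factor. */
    static long itemsPerInterval(int itemsPerSecond, int intervalSeconds, double safetyFactor) {
        return Math.round(itemsPerSecond * safetyFactor * intervalSeconds);
    }

    public static void main(String[] args) {
        long bytes = bytesPerInterval(100, 20, 1.5);
        long items = itemsPerInterval(100, 20, 1.5);
        System.out.println("Bytes per interval: " + bytes); // 1020000
        System.out.println("Items per interval: " + items); // 3000
    }
}
```

With these inputs the padded item count per interval is 3000, which stays below the 5000-item VArray limit described above.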
The interval determines how often the collected data is archived (flushed from memory to the RE schema). There are several factors to consider when determining this setting:
Data collection volume and speed: The more frequent the data collection and the larger the volume of data collected, the shorter the interval should be.
Cache size: The smaller the cache, the shorter the interval.
Use of current session data: If you want to use the current session data to improve the recommendation accuracy, the data should not be held in the cache for too long. If the volume and speed of the data collection is not a problem, an interval of 10-30 seconds may be fine.
The comments in this section apply to crossSellForItemFromHotPicks, crossSellForItemsFromHotPicks, recommendCrossSellForItem, and recommendCrossSellForItems.
For cross-sell recommendations, interest dimension must be the same as that of the data source type. Data source type must be either navigation or purchasing. No other types are supported.
The following filtering settings cannot be used with these methods:
setCategoryLevelFiltering
setCategorySubtreeFiltering
setCategoryExclusion
setCategoryFiltering(int)
setCategoryFiltering(int, long[])
Destroy proxy objects with caution. REProxyRT objects are shared by many clients; therefore, destruction of a proxy may interrupt recommendation services. For Web applications, REProxyRT objects should be treated as part of the server services; they should not be destroyed unless it is absolutely necessary. Note that there is one REProxyRT object per JVM. Like other server components, these objects only need to be destroyed when the server is shut down or taken offline for maintenance purposes.
You can destroy either a specific proxy in the pool, using destroyProxy, or all proxies in the pool, using destroyAllProxies.