Oracle® Data Mining Application Developer's Guide, 10g Release 2 (10.2), Part Number B14340-01
This chapter explains how to convert your data mining applications from the 10.1 proprietary Java API to the standards-compliant Java API available with Oracle 10g Release 2 (10.2).
The new ODM Java API available with Oracle 10g Release 2 (10.2) is standardized under the Java Community Process (JCP) and is fully compliant with the JDM 1.0 standard. Oracle supports open standards for Java and is one of the primary vendors that implements JDM.
The ODM 10.2 JDM-based API replaces the proprietary Java API for data mining that was available with Oracle 10.1.
Note: The proprietary Java API is no longer supported in ODM 10.2. If you have created applications in 10.1 and you want to use them in your Oracle 10.2 installation, you must convert them to use the 10.2 API.
Table 8-1 lists the major differences between the ODM 10.1 and ODM 10.2 Java APIs.
Table 8-1 Differences Between Oracle 10.1 and 10.2 Java APIs for Data Mining
Feature | ODM 10.1 Java API | ODM 10.2 Java API |
---|---|---|
Standards | Oracle proprietary Java API designed for accessing data mining functionality in the database. Not supported in Oracle 10.2. | Java industry-standard API defined under the Java Community Process (JCP). ODM 10.2 implements conformant subsets of the standard along with Oracle proprietary extensions. |
Interoperability | Not interoperable with models created by the PL/SQL API. | Interoperable with the PL/SQL API. All objects created using the ODM 10.2 Java API can be used with the PL/SQL API; results and values are consistent with the PL/SQL API. |
Functions and algorithms | Classification, Clustering, Regression, Association, Attribute Importance, and Feature Extraction functions. | Classification, Clustering, Regression, Association, Attribute Importance, and Feature Extraction functions. |
Object creation | Primarily designed as Java classes. Objects are instantiated using constructors or static create methods. | Uses the factory method pattern to instantiate objects. |
Task execution | Tasks are executed with the task object's execute method; completion is monitored with the task's waitForCompletion method. | Tasks are executed with Connection.execute, which returns an ExecutionHandle for asynchronous monitoring with waitForCompletion. |
Data | Supports both physical and logical data representations. Supports transactional and non-transactional formats; transactional format enables sparse data representation and wide data (more than 1000 columns). | Supports only physical data representation; logical data can be represented with database views. Supports nested tables in place of transactional format. |
Settings for model building | Settings are created with MiningFunctionSettings subclasses, such as ClassificationFunctionSettings. | Settings are created with BuildSettings subclasses obtained from object factories. Settings are saved as a table in the user's schema. |
Model | Models are represented by model classes, such as NaiveBayesModel, and restored with static restore methods. | Models are represented by model interfaces, such as ClassificationModel, and retrieved with Connection.retrieveObject. |
Cost matrix | The cost matrix for all classification algorithms is specified at build time, even though the cost matrix is used as a post-processing step to the apply operation. | The cost matrix for the decision tree algorithm is specified at build time. For all other classification algorithms, the cost matrix is specified with apply and test operations. |
Model detail | Model details are not represented as an object; they are stored with the associated model object. | Model details are represented by ModelDetail objects. |
Apply settings | Apply output is specified with MiningApplyOutput. | Apply settings are represented by ApplySettings subclasses, such as ClassificationApplySettings. |
Results object | Mining results are represented by result objects, such as ClassificationTestResult and MiningLiftResult. | Mining results are not explicit objects. Each task creates either a Java object or a database object such as a table. |
Transformations | Supports automated data preparation. Provides utility methods for external and embedded data preparation. | Does not support automated transformations; transformations must be performed explicitly with a transformation task. |
Text transformation | Supports text data types, such as CLOB and BLOB, for SVM and NMF. No explicit text transformations are provided. | Supports explicit text transformations. These can be used with any algorithm to emulate text data type support. |
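The factory-method difference in object creation can be illustrated with a self-contained sketch. The classes below (FactoryPatternSketch, Factory, MiningObject) are hypothetical stand-ins, not part of the real JDM API; they only mimic the shape of Connection.getFactory, where a factory is looked up by class name and then used to create objects instead of calling constructors directly.

```java
// Hypothetical mini-illustration of the JDM factory pattern: factories
// are obtained by class name, then used to create objects, so no public
// constructors are invoked by application code.
import java.util.HashMap;
import java.util.Map;

public class FactoryPatternSketch {
    // Stand-ins for JDM interfaces (hypothetical, not the real API)
    interface MiningObject { String name(); }
    interface Factory { MiningObject create(String name); }

    // A toy registry that hands out factories by class name, mimicking
    // Connection.getFactory("javax.datamining.data.PhysicalDataSet")
    static final Map<String, Factory> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put("PhysicalDataSet", n -> () -> "PhysicalDataSet:" + n);
        FACTORIES.put("BuildSettings",   n -> () -> "BuildSettings:" + n);
    }

    static Factory getFactory(String className) {
        return FACTORIES.get(className);
    }

    public static void main(String[] args) {
        // Same two-step idiom as the 10.2 API: get factory, then create
        Factory pdsFactory = getFactory("PhysicalDataSet");
        MiningObject data = pdsFactory.create("MINING_DATA_BUILD_V");
        System.out.println(data.name()); // prints "PhysicalDataSet:MINING_DATA_BUILD_V"
    }
}
```

The indirection lets a JDM implementation return vendor-specific objects behind standard interfaces, which is why 10.2 code never constructs PhysicalDataSet or BuildSettings directly.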
Most objects in the ODM 10.2 API are similar to the objects in the ODM 10.1 API. However, there are some major differences in class names, package structures, and object usage. Some of the primary differences are:
In 10.1, all primary objects are created using constructors or create methods. In 10.2, objects are created using object factories, as described in "Connection Factory" and "Features of a DMS Connection".

In 10.1, DMS metadata-related operations are distributed across individual classes. In 10.2, most DMS metadata-related operations are centralized in a Connection object. For example, a mining task is restored in 10.1 with the MiningTask.restore method and in 10.2 with the Connection.retrieveObject method.

In 10.1, all named objects are persisted in the database. In 10.2, PhysicalDataSet and ApplySettings are transient objects.
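The centralization of metadata operations described above can be sketched with a toy connection class. ConnectionSketch below is hypothetical (a HashMap standing in for the database); it only mirrors the shape of the 10.2 saveObject/retrieveObject idiom, where one object manages all named persistence rather than each class providing its own restore method.

```java
// Hypothetical sketch of the 10.2 pattern: metadata operations are
// centralized on a connection object. Named objects are saved under a
// name and later retrieved by name from the same connection.
import java.util.HashMap;
import java.util.Map;

public class ConnectionSketch {
    private final Map<String, Object> store = new HashMap<>();

    // Analogous in shape to Connection.saveObject(name, object, replace)
    public void saveObject(String name, Object obj, boolean replace) {
        if (!replace && store.containsKey(name))
            throw new IllegalStateException("object already exists: " + name);
        store.put(name, obj);
    }

    // Analogous in shape to Connection.retrieveObject(name, type)
    @SuppressWarnings("unchecked")
    public <T> T retrieveObject(String name) {
        return (T) store.get(name);
    }

    public static void main(String[] args) {
        ConnectionSketch conn = new ConnectionSketch();
        conn.saveObject("nbBuildSettings", "settings-payload", true);
        String settings = conn.retrieveObject("nbBuildSettings");
        System.out.println(settings); // prints "settings-payload"
    }
}
```

This is why 10.2 conversion work tends to replace scattered ClassName.restore calls with calls on a single Connection instance.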
Note: Although the ODM 10.1 Java API is incompatible with Oracle 10.2, future releases will follow the backward compatibility scheme proposed by the JDM standard.
Table 8-2 provides sample code for performing various mining operations in both 10.1 and 10.2. Refer to Chapter 6 for additional 10.2 code samples.
Table 8-2 Sample Code from 10.1 and 10.2 ODM Java APIs
Connect to the DMS (ODM 10.1 Java API):

```java
// Create a DMS object
DataMiningServer m_dms = new DataMiningServer(
    "put DB URL here",  // JDBC URL
    "user name",        // User name
    "password"          // Password
);
// Login to the DMS and create a DMS connection
m_dmsConn = m_dms.login();
```

Connect to the DMS (ODM 10.2 Java API):

```java
// Create a ConnectionFactory and a connection
OraConnectionFactory m_dmeConnFactory = new OraConnectionFactory();
ConnectionSpec connSpec = m_dmeConnFactory.getConnectionSpec();
connSpec.setURI("put DB URL here");
connSpec.setName("user name");
connSpec.setPassword("password");
m_dmeConn = m_dmeConnFactory.getConnection(connSpec);
```
Create a PhysicalDataSpecification (ODM 10.1 Java API):

```java
LocationAccessData lad = new LocationAccessData(
    "MINING_DATA_BUILD_V",  // Table/view name
    "DMUSER"                // Schema name
);
PhysicalDataSpecification pds =
    new NonTransactionalDataSpecification(lad);
```

Create and Save PhysicalDataSet (ODM 10.2 Java API):

```java
m_pdsFactory = (PhysicalDataSetFactory) m_dmeConn.getFactory(
    "javax.datamining.data.PhysicalDataSet");
m_paFactory = (PhysicalAttributeFactory) m_dmeConn.getFactory(
    "javax.datamining.data.PhysicalAttribute");
PhysicalDataSet buildData =
    m_pdsFactory.create("MINING_DATA_BUILD_V", false);
PhysicalAttribute pa = m_paFactory.create(
    "cust_id",
    AttributeDataType.integerType,
    PhysicalAttributeRole.caseId);
buildData.addAttribute(pa);
m_dmeConn.saveObject("nbBuildData", buildData, true);
```
Create and Save MiningFunctionSettings (ODM 10.1 Java API):

```java
NaiveBayesSettings nbAlgo = new NaiveBayesSettings(0.01f, 0.01f);
ClassificationFunctionSettings mfs =
    ClassificationFunctionSettings.create(
        m_dmsConn,                  // DMS connection
        nbAlgo,                     // NB algorithm settings
        pds,                        // Build data specification
        "AFFINITY_CARD",            // Target column
        AttributeType.categorical,  // Attribute type
        DataPreparationStatus.unprepared);
// Set the CUST_ID attribute as inactive
mfs.adjustAttributeUsage(new String[]{"CUST_ID"},
                         AttributeUsage.inactive);
mfs.store(m_dmsConn, "NBDemo_MFS");
```

Create BuildSettings (ODM 10.2 Java API):

```java
m_clasFactory = (ClassificationSettingsFactory) m_dmeConn.getFactory(
    "javax.datamining.supervised.classification.ClassificationSettings");
m_nbFactory = (NaiveBayesSettingsFactory) m_dmeConn.getFactory(
    "javax.datamining.algorithm.naivebayes.NaiveBayesSettings");
// Create NB algorithm settings
NaiveBayesSettings nbAlgo = m_nbFactory.create();
nbAlgo.setPairwiseThreshold(0.01f);
nbAlgo.setSingletonThreshold(0.01f);
// Create ClassificationSettings
ClassificationSettings buildSettings = m_clasFactory.create();
buildSettings.setAlgorithmSettings(nbAlgo);
buildSettings.setTargetAttributeName("affinity_card");
m_dmeConn.saveObject("nbBuildSettings", buildSettings, true);
```
Create and Execute MiningBuildTask (ODM 10.1 Java API):

```java
MiningBuildTask buildTask = new MiningBuildTask(
    pds,            // Build data specification
    "NBDemo_MFS",   // Mining function settings name
    "NBDemo_Model"  // Mining model name
);
// Store the task
buildTask.store(m_dmsConn, "NBDemoBuildTask");
buildTask.execute(m_dmsConn);
// Wait for completion of the task
MiningTaskStatus taskStatus =
    buildTask.waitForCompletion(m_dmsConn);
```

Create and Execute BuildTask (ODM 10.2 Java API):

```java
m_buildFactory = (BuildTaskFactory) m_dmeConn.getFactory(
    "javax.datamining.task.BuildTask");
BuildTask buildTask = m_buildFactory.create(
    "nbBuildData",      // Build data name
    "nbBuildSettings",  // Build settings name
    "nbModel"           // Mining model name
);
m_dmeConn.saveObject("nbBuildTask", buildTask, true);
ExecutionHandle execHandle = m_dmeConn.execute("nbBuildTask");
ExecutionStatus status =
    execHandle.waitForCompletion(Integer.MAX_VALUE);
```
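The execute-then-wait shape of the 10.2 sample above can be modeled with a small self-contained sketch. ExecutionSketch and its nested Handle class are hypothetical stand-ins (built on java.util.concurrent.Future), not the real ExecutionHandle; they only show how a connection-level execute call returns a handle that blocks in waitForCompletion.

```java
// Hypothetical sketch of the 10.2 asynchronous execution pattern:
// Connection.execute returns a handle, and waitForCompletion blocks
// until the task finishes (modeled here with a Future on a worker thread).
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutionSketch {
    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    // Analogous in shape to ExecutionHandle
    public static class Handle {
        private final Future<String> future;
        Handle(Future<String> f) { this.future = f; }
        // Analogous in shape to ExecutionHandle.waitForCompletion(timeout)
        public String waitForCompletion() {
            try {
                return future.get();  // blocks until the task completes
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }

    // Analogous in shape to Connection.execute(taskName)
    public Handle execute(String taskName) {
        return new Handle(pool.submit(() -> taskName + ":SUCCESS"));
    }

    public void shutdown() { pool.shutdown(); }

    public static void main(String[] args) {
        ExecutionSketch conn = new ExecutionSketch();
        Handle h = conn.execute("nbBuildTask");
        System.out.println(h.waitForCompletion()); // prints "nbBuildTask:SUCCESS"
        conn.shutdown();
    }
}
```

The practical conversion point: in 10.1 the task object itself executes and reports status, while in 10.2 the connection executes a saved task by name and status flows through the returned handle.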
Retrieve MiningModel (ODM 10.1 Java API):

```java
NaiveBayesModel model = (NaiveBayesModel) SupervisedModel.restore(
    m_dmsConn, "NBDemo_Model");
```

Retrieve Model (ODM 10.2 Java API):

```java
ClassificationModel model = (ClassificationModel)
    m_dmeConn.retrieveObject("nbModel", NamedObject.model);
```
Evaluate the Model (ODM 10.1 Java API):

```java
// Compute accuracy and confusion matrix
LocationAccessData lad = new LocationAccessData(
    "MINING_DATA_TEST_V", "DMUSER");
PhysicalDataSpecification pds =
    new NonTransactionalDataSpecification(lad);
ClassificationTestTask testTask = new ClassificationTestTask(
    pds, "NBDemo_Model", "NBDemo_TestResults");
testTask.store(m_dmsConn, "NBDemoTestTask");
testTask.execute(m_dmsConn);
MiningTaskStatus taskStatus =
    testTask.waitForCompletion(m_dmsConn);
ClassificationTestResult testResult =
    ClassificationTestResult.restore(m_dmsConn, "NBDemo_TestResults");
float accuracy = testResult.getAccuracy();
CategoryMatrix confusionMatrix = testResult.getConfusionMatrix();

// Compute lift
Category positiveCategory = new Category(
    "Positive value", "1", DataType.intType);
MiningLiftTask liftTask = new MiningLiftTask(
    pds,
    10,                   // Number of quantiles to be used
    positiveCategory,     // Positive target value
    "NBDemo_Model",       // Model to be tested
    "NBDemo_LiftResults"  // Lift results name
);
liftTask.store(m_dmsConn, "NBDemoLiftTask");
liftTask.execute(m_dmsConn);
taskStatus = liftTask.waitForCompletion(m_dmsConn);
MiningLiftResult liftResult =
    MiningLiftResult.restore(m_dmsConn, "NBDemo_LiftResults");
```

Evaluate the Model (ODM 10.2 Java API):

```java
// Compute accuracy, confusion matrix, lift, and ROC
PhysicalDataSet testData =
    m_pdsFactory.create("MINING_DATA_TEST_V", false);
PhysicalAttribute pa = m_paFactory.create(
    "cust_id",
    AttributeDataType.integerType,
    PhysicalAttributeRole.caseId);
testData.addAttribute(pa);
m_dmeConn.saveObject("nbTestData", testData, true);
ClassificationTestTask testTask = m_testFactory.create(
    "nbTestData", "nbModel", "nbTestMetrics");
testTask.setNumberOfLiftQuantiles(10);
testTask.setPositiveTargetValue(new Integer(1));
m_dmeConn.saveObject("nbTestTask", testTask, true);
ExecutionHandle execHandle = m_dmeConn.execute("nbTestTask");
ExecutionStatus status =
    execHandle.waitForCompletion(Integer.MAX_VALUE);
ClassificationTestMetrics testMetrics = (ClassificationTestMetrics)
    m_dmeConn.retrieveObject("nbTestMetrics", NamedObject.testMetrics);
Double accuracy = testMetrics.getAccuracy();
ConfusionMatrix confusionMatrix = testMetrics.getConfusionMatrix();
Lift lift = testMetrics.getLift();
ReceiverOperatingCharacterics roc = testMetrics.getROC();
```
Apply the Model (ODM 10.1 Java API):

```java
LocationAccessData lad = new LocationAccessData(
    "MINING_DATA_APPLY_V", "DMUSER");
PhysicalDataSpecification pds =
    new NonTransactionalDataSpecification(lad);
MiningApplyOutput mao = MiningApplyOutput.createDefault();
MiningAttribute srcAttribute = new MiningAttribute(
    "CUST_ID", DataType.intType, AttributeType.notApplicable);
Attribute destAttribute = new Attribute("CUST_ID", DataType.intType);
ApplySourceAttributeItem m_srcAttrItem =
    new ApplySourceAttributeItem(srcAttribute, destAttribute);
mao.addItem(m_srcAttrItem);
LocationAccessData outputTable = new LocationAccessData(
    "NBDemo_Apply_Output", "DMUSER");
MiningApplyTask applyTask = new MiningApplyTask(
    pds,                   // Apply data specification
    "NBDemo_Model",        // Input model name
    mao,                   // MiningApplyOutput object
    outputTable,           // Apply output table
    "NBDemo_ApplyResults"  // Apply results name
);
applyTask.store(m_dmsConn, "NBDemoApplyTask");
applyTask.execute(m_dmsConn);
MiningTaskStatus taskStatus =
    applyTask.waitForCompletion(m_dmsConn);
```

Apply the Model (ODM 10.2 Java API):

```java
PhysicalDataSet applyData =
    m_pdsFactory.create("MINING_DATA_APPLY_V", false);
PhysicalAttribute pa = m_paFactory.create(
    "cust_id",
    AttributeDataType.integerType,
    PhysicalAttributeRole.caseId);
applyData.addAttribute(pa);
m_dmeConn.saveObject("nbApplyData", applyData, true);
ClassificationApplySettings clasAS = m_applySettingsFactory.create();
m_dmeConn.saveObject("nbApplySettings", clasAS, true);
DataSetApplyTask applyTask = m_dsApplyFactory.create(
    "nbApplyData",      // Apply data name
    "nbModel",          // Model name
    "nbApplySettings",  // Apply settings name
    "nb_apply_output"   // Apply output table name
);
m_dmeConn.saveObject("nbApplyTask", applyTask, true);
ExecutionHandle execHandle = m_dmeConn.execute("nbApplyTask");
ExecutionStatus status =
    execHandle.waitForCompletion(Integer.MAX_VALUE);
```