AGGREGATE command

The AGGREGATE command calculates summary data from detail data. Use the AGGREGATE command to pre-calculate data and store it in an Oracle OLAP analytic workspace. Use the AGGREGATE function to calculate data at runtime. In either case, the aggregation is limited to the base values that are currently in status.

Syntax

AGGREGATE|AGGR var... [USING aggmap] [FROM fromspec|FROMVAR textvar]

[FORCEORDER] [FUNCDATA] [COUNTVAR countvar...]

Arguments

var

One or more variables whose data values are to be calculated. Every variable in a single AGGREGATE command must have exactly the same dimensions in exactly the same order. The variables are often numeric, but can also be TEXT, DATETIME, or DATE when the aggregation operation is FIRST, LAST, MIN, or MAX, as specified in the aggmap.

USING

This keyword indicates that the aggregation is performed using the specified aggmap. When you do not include this phrase, the command uses the default aggmap for the variable as previously specified using an AGGMAP statement or the $AGGMAP property.

aggmap

The name of a previously-defined aggmap that specifies how the data will be aggregated. For information about aggmaps, see DEFINE AGGMAP.

FROM

This keyword indicates that the detail data is obtained from a different object.

A FROM clause is only one way in which you can specify the variable from which detail data should be obtained when performing aggregation. See "Ways of Specifying Where to Obtain Detail Data for Aggregation".

fromspec

An arbitrarily dimensioned variable, formula, or relation from which the detail data for the aggregation is obtained.

FROMVAR

This keyword indicates that the detail data is obtained from different objects to perform a capstone aggregation. (For an example of using the FROMVAR clause, see Example 7-7, "Capstone Aggregation".)

A FROMVAR clause is only one way in which you can specify the variable from which detail data should be obtained when performing aggregation. See "Ways of Specifying Where to Obtain Detail Data for Aggregation".

textvar

An arbitrarily dimensioned variable used to resolve any leaf nodes. Specify NA to indicate that a node does not need detail data to calculate the value.

FORCEORDER

Specifies that the calculation must be performed in the order in which the RELATION (for aggregation) statements are listed in the aggmap. Use this option when you have changed some of the values calculated by the AGGREGATE command. Otherwise, the optimization methods used by the AGGREGATE command may cause the modified values to be ignored.

Note:

You can also set a $AGGREGATE_FORCEORDER property on a variable to specify this behavior as the default aggregation behavior. In this case, you do not need to include the FORCEORDER keyword with the AGGREGATE command.

FUNCDATA

Compiles the aggregation specification for future use by the AGGREGATE function. When you use FUNCDATA, you do not have to recompile the aggmap before using the AGGREGATE function, unless afterward you make changes to the aggmap, the relation hierarchies, or a composite.

When the variables have composite dimensions, the indexes (composite tuples) are created and saved for use by the AGGREGATE function. Otherwise, the indexes are re-created each time the AGGREGATE function is called. Refer to AGGINDEX for more information about composite indexes.

COUNTVAR countvar

Indicates that Oracle OLAP should use the user-defined variable specified by countvar to store the non-NA counts of the number of leaf nodes that contributed to aggregate values calculated for RELATION (for aggregation) statements that have an AVERAGE, HAVERAGE, HWAVERAGE, or WAVERAGE operator.

Note:

Typically, you do not use a user-defined Countvar variable to store the counts for average aggregations. Instead, you use an Oracle OLAP-created Aggcount variable. You must use an Aggcount variable when the aggregation specification includes a RELATION (for aggregation) statement with one of the average operators is for a compressed composite.

For more information on Aggcount variables, see "Aggcount Variables".

The countvar variable must be an INTEGER variable with exactly the same dimensions in exactly the same order as the dimensions of the variable specified by var. When you aggregate several variables together, you must define an INTEGER variable for each one to record the results.

Notes

Effect of Status on AGGREGATE

The current status only affects dimension values at the lowest level of the hierarchy, that is, the leaf nodes. Only leaf-node dimension values that are currently in status are aggregated. The parent values of leaf nodes in status are calculated, whether the parent values are in status or not (unless you exclude the dimension values in those levels with a RELATION PRECOMPUTE statement in the aggmap). Thus, when you want to aggregate all of the data specified in the aggmap, then be sure to set the status of the dimensions to ALL before performing the aggregation. AGGREGATE uses the parent relation to distinguish among dimension values at different levels of the hierarchy. Alternatively, you can perform a partial aggregation of the data by limiting status. However, this must be done carefully when some of the data will be aggregated at runtime by the AGGREGATE function. See the AGGREGATE function notes for more information.

For example, suppose you use the area dimension and the area.area child-parent relation that supports one hierarchy for a geography dimension as illustrated in Table 7-2, "Geography Hierarchy".

Table 7-2 Geography Hierarchy

Level	area Dimension	area.area Parent Relation
1	`TotalUS`	`NA`
2	`East`	`TotalUS`
2	`South`	`TotalUS`
3	`Boston`	`East`
3	`New York`	`East`
3	`Atlanta`	`South`

Now suppose you change the data value for New York. When you then use AGGREGATE with only New York, the calculation occurs without including the child value for South (Atlanta), but still includes level 2 as it goes from level 3 to level 1 (TotalUS). When you want all the child values included in rolling up to TotalUS, use a LIMIT TO ALL statement before you execute the AGGREGATE command.

When the data has changed for some, but not all, of the child values in a hierarchy, you can set the status to calculate just the values that have changed. For example, when your embedded-total dimension is called d2, and its parent relation is called reld2, first limit d2 to the values that have changed.

To calculate the data for every hierarchy in a dimension, limit the dimension's hierarchy dimension to ALL before you execute the AGGREGATE command.

Controlling the Amount of Data That Is Calculated

You can control how much of the variable data is calculated by using the PRECOMPUTE keyword with the RELATION statement in the aggmap. Use the limit-clause (after the PRECOMPUTE keyword) to set the status of the dimension.

When Users Modify Data

When users are able to change the data in a variable, then you should calculate aggregates on the fly using the AGGREGATE function, so that their changes are reflected in the aggregate data. See the AGGREGATE function for more information about runtime changes to the data.

Generation-Skipping Hierarchies

AGGREGATE automatically distinguishes between generations in the parent relation, even to the extent of allowing generation-skipping hierarchies. For example, you can have a four-level hierarchy (for example, neighborhoods, cities, states, and totalUS) that has a three-level branch (for example, Boston, Massachusetts, and totalUS).

Restrictions on Permissions

AGGREGATE does not work on variables that have cell-by-cell permissions; it will immediately return an error. It also ignores the PERMITERROR option. However, AGGREGATE will operate on variables with object level or dimension level permission. See the PERMIT and PERMITERROR entries.

Ways of Specifying Where to Obtain Detail Data for Aggregation

You can specify where to obtain detail data when aggregating data in the following ways:

Assign either a $AGGREGATE_FROM property or a $AGGREGATE_FROMVAR property to a variable.

Note:
You can only assign one of these properties to a variable. A variable cannot have both the $AGGREGATE_FROM and $AGGREGATE_FROMVAR properties assigned to it.
Include either a FROM or FROMVAR clause in the AGGREGATE command or AGGREGATE function that aggregates the data.

When performing an aggregation, Oracle OLAP determines where to obtain the detail data as follows:

When a location has been specified using a FROM or FROMVAR clause, Oracle OLAP uses the detail data at that location.
When a location has not been specified using a FROM or FROMVAR clause, Oracle OLAP checks to see if a location has been specified using aa $AGGREGATE_FROM property or a $AGGREGATE_FROMVAR property. When a location has been specified using one of these properties, Oracle OLAP uses the detail data at that location.
When a location has not been specified using either FROM or FROMVAR clause or a $AGGREGATE_FROM property or a $AGGREGATE_FROMVAR property, Oracle OLAP performs the aggregation using the detail data in the variable itself.

Examples

This section contains several examples of using the AGGREGATE command. For additional aggregation examples, see the examples in AGGMAP.

Example 7-2 Precalculating Data in a Batch Job

Frequently, you generate precalculated aggregates in a batch window as part of maintaining the data in your database. When you wish, you can use Job Manager to schedule batch jobs in Oracle Enterprise Manager, as described in the Oracle OLAP Application Developer's Guide.

To generate precalculated aggregates, you use the AGGREGATE command. The AGGREGATE command aggregates the data for one or more variables according to the specifications provided in the aggmap.

Your batch job should include statements like the following.

POUTFILEUNIT=FILEOPEN('userfiles/progress.txt' WRITE)
AGGREGATE sales units USING gpct.aggmap
UPDATE
COMMIT
FILECLOSE POUTFILEUNIT

Example 7-3 Aggregating One Variable

Suppose your analytic workspace contains a variable named actuals, which has the following definition.

DEFINE actuals DECIMAL <time, SPARSE <product, customer, channel>>

The next step is to define an aggmap object, whose definition has the same dimensions in the same dimension order. Suppose you define an aggmap object named act.agg using DEFINE AGGMAP.

DEFINE act.agg AGGMAP <time, SPARSE <product, customer, channel>>

Suppose that the name of the hierarchy for the time dimension is time.r, the name of the product dimension is product.r, and so on Next, you use an AGGMAP statement to add the following text in the act.agg aggmap.

AGGMAP
RELATION time.r
RELATION product.r
RELATION customer.r
RELATION channel.r
END

The preceding text specifies the name of each dimension's hierarchy for which data should be rolled up. Assuming that the current status of every dimension is ALL, data will be calculated for every dimension value of every dimension in the definition of actuals. No data will be calculated on the fly.

Use the following statements to calculate the actuals variable. (It is not necessary to compile the aggmap, because the compilation is included as part of the AGGREGATE command.)

AGGREGATE actuals USING act.agg

Example 7-4 Aggregating Multiple Variables

Suppose your analytic workspace contains a variable named actuals and a variable named forecast. As shown in the following variable definitions, these variables have the same dimensions in the same dimension order.

DEFINE actuals DECIMAL <time, SPARSE <product, customer, channel>>
DEFINE forecast DECIMAL <time, SPARSE <product, customer, channel>>

The next step is to define an aggmap object, whose definition has the same dimensions in the same dimension order. Suppose you define the same aggmap object named act.agg, as described in "Aggregating One Variable". As long as you want the data for each variable to be rolled up in exactly the same way, you can use the same aggmap to calculate both variables in a single statement.

Use the following statements to calculate the actuals and the forecast variables.

AGGREGATE actuals forecast USING act.agg

Because the aggmap specifies that all data for every dimension value in each dimension should be rolled up, this statement rolls up all of the data in actuals and all of the data in forecast.

Example 7-5 Using COUNTVAR with Multiple Variables

Suppose you plan to use one AGGREGATE command to aggregate the data for three variables: sales, units, and projected_sales. Each variable has the following dimensionality.

<month product geography>

To tally the results with COUNTVAR, you must define three INTEGER variables that have the same dimensionality as sales, units, and projected_sales.

DEFINE intsales INTEGER <month product geography>
DEFINE intunits INTEGER <month product geography>
DEFINE intprojsales INTEGER <month product geography>

You can then specify the INTEGER variables in the following statement.

AGGREGATE sales units projected_sales USING sales.agg -
  COUNTVAR intsales intunits inprojsales

Example 7-6 Performing a Partial Aggregation

This example limits the time dimension to the last two time periods, so that only newly loaded data is aggregated.

The tp2.agg aggmap specifies preaggregation for all detail data currently in status.

DEFINE TP2.AGG AGGMAP
LD Full preaggregation
AGGMAP
RELATION time.parentrel PRECOMPUTE (ALL)
RELATION product.parentrel PRECOMPUTE (ALL)
END

For the aggregation, time is limited to the last two time periods and all product values are in status.

LIMIT time TO LAST 2
STATUS time product
The current status of TIME is:
Apr02, May02
LIMIT product TO ALL

The following AGGREGATE statement calculates units using the tp2.agg aggmap.

AGGREGATE units USING tp2.agg

The results of this aggregation show that parent values are calculated, regardless of their own status, when their children are in status.

LIMIT time TO '2002' 'Q1.02' 'Q2.02' 'Jan02' to 'May02'
REPORT DOWN time units

-----------------------------------------UNITS-----------------------------------------
 ----------------------------------------PRODUCT----------------------------------------
TIME     FOOD      SNACKS    DRINKS   POPCORN   COOKIES   CAKES     SODA      JUICE
-------  --------  --------  -------- --------  --------  --------  --------  --------
2002     38        24        14       6         9         9         9         5
Q1.02    NA        NA        NA       NA        NA        NA        NA        NA
Q2.02    38        24        14       6         9         9         9         5
Jan02    NA        NA        NA       8         2         4         5         8
Feb02    NA        NA        NA       5         3         2         2         5
Mar02    NA        NA        NA       3         4         4         2         4
Apr02    21        13        8        2         7         4         6         2
May02    17        11        6        4         2         5         3         3

Example 7-7 Capstone Aggregation

Assume that your analytic workspace has the two hierarchical TEXT dimensions named geog.d and time.d with the following values.

GEOG.D
--------------
Boston
Medford
San Diego
Sunnydale
Massachusetts
California
United States

TIME.D
--------------
Jan76
Feb76
Mar76
76Q1

Assume, also, that there are four variables with the following definitions

DEFINE sales_jan76 VARIABLE INTEGER <geog.d>
DEFINE sales_feb76 VARIABLE INTEGER <geog.d>
DEFINE sales_mar76 VARIABLE INTEGER <geog.d>
DEFINE sales_capstone76 VARIABLE INTEGER <geog.d time.d>

Assume that you issue the following REPORT statements for the variables. The output of the reports show the detail data in the variables.

REPORT sales_jan76  sales_feb76 sales_mar76
REPORT DOWN geog.d sales_capstone76

GEOG.D         SALES_JAN76  SALES_FEB76  SALES_MAR76
-------------- ------------ ------------ ------------
Boston                1,000        2,000        3,000
Medford               2,000        4,000        6,000
San Diego             3,000        6,000        9,000
Sunnydale             4,000        8,000       12,000
Massachusetts            NA           NA           NA
California               NA           NA           NA
United States            NA           NA           NA

               -----------------SALES_CAPSTONE76------------------
               ----------------------TIME.D-----------------------
GEOG.D            Jan76        Feb76        Mar76         76Q1
-------------- ------------ ------------ ------------ ------------
Boston                   NA           NA           NA           NA
Medford                  NA           NA           NA           NA
San Diego                NA           NA           NA           NA
Sunnydale                NA           NA           NA           NA
Massachusetts            NA           NA           NA           NA
California               NA           NA           NA           NA
United States            NA           NA           NA           NA

Define two aggmap objects with the following definitions.

DEFINE leaf_aggmap AGGMAP
AGGMAP
RELATION geog.parentrel OPERATOR SUM
END

DEFINE capstone_aggmap AGGMAP
AGGMAP
RELATION time.parentrel OPERATOR SUM
END

Define a variable named capstone_source with the following definition to use to aggregate the data.
```
DEFINE capstone_source VARIABLE TEXT <time.d>
```
As the following output of a REPORT statement illustrates, for each value of time.d, you populate capstone_source with the name of the variable that contains the corresponding sales data.
```
TIME.D            CAPSTONE_SOURCE
-------------- ----------------------
Jan76          sales_jan76
Feb76          sales_feb76
Mar76          sales_mar76
76Q1           NA
```

Issue the following statements to aggregate the variables.

AGGREGATE sales_jan76 sales_feb76 sales_mar76 USING leaf_aggmap
AGGREGATE sales_capstone76 USING capstone_aggmap FROMVAR capstone_source

After aggregating the variables, when you issue the REPORT statements, the variables are populated with the calculated data.

REPORT sales_jan76  sales_feb76 sales_mar76
REPORT DOWN geog.d sales_capstone76

GEOG.D         SALES_JAN76  SALES_FEB76  SALES_MAR76
-------------- ------------ ------------ ------------
Boston                1,000        2,000        3,000
Medford               2,000        4,000        6,000
San Diego             3,000        6,000        9,000
Sunnydale             4,000        8,000       12,000
Massachusetts         3,000        6,000        9,000
California            7,000       14,000       21,000
United States        10,000       20,000       30,000

               -----------------SALES_CAPSTONE76------------------
               ----------------------TIME.D-----------------------
GEOG.D            Jan76        Feb76        Mar76         76Q1
-------------- ------------ ------------ ------------ ------------
Boston                1,000        2,000        3,000        6,000
Medford               2,000        4,000        6,000       12,000
San Diego             3,000        6,000        9,000       18,000
Sunnydale             4,000        8,000       12,000       24,000
Massachusetts         3,000        6,000        9,000       18,000
California            7,000       14,000       21,000       42,000
United States        10,000       20,000       30,000       60,000

Example 7-8 Aggregating a Variable with External Partitions

Assume that you have the following objects defined in your analytic workspace.

DEFINE YEAR_2003 DIMENSION TEXT
DEFINE YEAR_2002 DIMENSION TEXT
DEFINE PRODUCT DIMENSION TEXT
DEFINE SALES_2003 VARIABLE DECIMAL <YEAR_2003 PRODUCT>
DEFINE SALES_2002 VARIABLE DECIMAL <YEAR_2002 PRODUCT>
DEFINE TIME DIMENSION CONCAT (YEAR_2003 YEAR_2002) UNIQUE
DEFINE TIME_PARENTREL RELATION TIME <TIME>
DEFINE PART_TEMP_SALES_BY_YEAR PARTITION TEMPLATE <TIME PRODUCT> -
   PARTITION BY CONCAT (TIME) -
    (PARTITION PARTITION_2002 <YEAR_2002 PRODUCT> -
     PARTITION PARTITION_2003 <YEAR_2003 PRODUCT>)
DEFINE SALES VARIABLE DECIMAL <PART_TEMP_SALES_BY_YEAR <TIME PRODUCT>> -
    (PARTITION PARTITION_2002 EXTERNAL SALES_2002 -
     PARTITION PARTITION_2003 EXTERNAL SALES_2003)
DEFINE AGG_SALES AGGMAP
  AGGMAP
  RELATION time_parentrel OPERATOR SUM
  END

To aggregate sales, you issue the following statement.

AGGREGATE sales USING agg_sales

When you issue REPORT statements on sales, you can see the aggregated values in sales.

--------SALES--------
           -------PRODUCT-------
TIME         00001      00002
---------- ---------- ----------
01Jan2003       10.00      15.21
31Jan2003       10.88      13.37
01Dec2003          NA         NA
31Dec2003          NA         NA
Jan2003         20.88      28.58
Dec2003            NA         NA
2003            20.88      28.58
01Jan2002       14.44      11.03
31Jan2002       15.55      12.20
01Dec2002       11.39      12.80
31Dec2002       10.53      13.77
Jan2002         29.98      23.23
Dec2002         21.92      26.57
2002            51.91      49.80

Since sales_2002 and sales_2003 are external partitions of sales, aggregating sales effectively means that you aggregated sales_2002 and sales_2003. When you issue REPORT statements onsales_2002 and sales_2003, you can see the aggregated values in those variables.

-----SALES_2002------
           -------PRODUCT-------
YEAR_2002    00001      00002
---------- ---------- ----------
01Jan2002       14.44      11.03
31Jan2002       15.55      12.20
01Dec2002       11.39      12.80
31Dec2002       10.53      13.77
Jan2002         29.98      23.23
Dec2002         21.92      26.57
2002            51.91      49.80
 
           -----SALES_2003------
           -------PRODUCT-------
YEAR_2003    00001      00002
---------- ---------- ----------
01Jan2003       10.00      15.21
31Jan2003       10.88      13.37
01Dec2003          NA         NA
31Dec2003          NA         NA
Jan2003         20.88      28.58
Dec2003            NA         NA
2003            20.88      28.58