Skip Headers

Oracle® Database Data Warehousing Guide
10g Release 1 (10.1)

Part Number B10736-01

Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Ma
ster Index
Master Index
Go to Feedback page
Feedb ack

Go to previous page
Previous
Go to next page
Next
View PDF

10 Dimensions

The following sec tions will help you create and manage a data warehouse:

What are Dimensions?

A dimension is a structure that categorizes da ta in order to enable users to answer business questions. Commonly used dimensions are customers, products, and time. For example, ea ch sales channel of a clothing retailer might gather and store data regarding sales and reclamations of their Cloth assortment. The r etail chain management can build a data warehouse to analyze the sales of its products across all stores over time and help answer qu estions such as:

The data in the retailer's data warehouse system has t wo important components: dimensions and facts. The dime nsions are products, customers, promotions, channels, and time. One approach for identifying your dimensions is to review your refere nce tables, such as a product table that contains everything about a product, or a promotion table containing all information about p romotions. The facts are sales (units sold) and profits. A data warehouse contains facts about the sales of each product at on a dail y basis.

A typical relational implementation for such a data warehouse is a star schema. The fact information is stored in wha t is called a fact table, whereas the dimensional information is stored in dimension tables. In our example, each sales transaction r ecord is uniquely defined as for each customer, for each product, for each sales channel, for each promotion, and for each day (time) .


See Also:

Chapter 19, " Schema Modeling Techniques" for further details

In Oracle Database, the dimensional information itself is stored in a dimension table. In addition, the database object dimension helps to organize and group dimensional information into hierarchies. This represe nts natural 1:n relationships between columns or column groups (the levels of a hierarchy) that cannot be represented wi th constraint conditions. Going up a level in the hierarchy is called rolling up the data and going down a lev el in the hierarchy is called drilling down the data. In the retailer example:

Figure 10-1 Sample Rollup for a Customer Dimension

Description of dwhsg072.gif follows
Description of the illustration dwhsg072.gif

Data analysis typically starts at hig her levels in the dimensional hierarchy and gradually drills down if the situation warrants such analysis.

Dimensions do not h ave to be defined. However, if your application uses dimensional modeling, it is worth spending time creating them as it can yield si gnificant benefits, because they help query rewrite perform more complex types of rewrite. Dimensions are also beneficial to certain types of materialized view refresh operations and with the SQLAccess Advisor. They are only mandatory if you use the SQLAccess Adviso r (a GUI tool for materialized view and index management) without a workload to recommend which materialized views and indexes to cre ate, drop, or retain.


See Also:

Chapter 18, " Query Rewrite" for further details regarding query rewrite and Chapter 17, " SQLAccess Advisor" for further details regarding the SQLAccess Advisor

In spite of the benefits of dimensions, you must not create dimensions in any s chema that does not fully satisfy the dimensional relationships described in this chapter. Incorrect results can be returned from que ries otherwise.

Creating Dimensions

Before you can create a dimension object, the dimension tables must exist in the database possibly containing the d imension data. For example, if you create a customer dimension, one or more tables must exist that contain the city, state, and count ry information. In a star schema data warehouse, these dimension tables already exist. It is therefore a simple task to identify whic h ones will be used.

Now you can draw the hierarchies of a dimension as shown in Figure 10-1. For exam ple, city is a child of state (because you can aggregate city-level data up to state), and country. This hierarchical information will be stored in the database object dimension.

In the case of normalized or partially no rmalized dimension representation (a dimension that is stored in more than one table), identify how these tables are joined. Note whe ther the joins between the dimension tables can guarantee that each child-side row joins with one and only one parent-side row. In th e case of denormalized dimensions, determine whether the child-side columns uniquely determine the parent-side (or attribute) columns . If you use constraints to represent these relationships, they can be enabled with the NOVALIDATE and RELY clauses if the relationships represented by the constraints are guaranteed by other means.

You create a dimension using eithe r the CREATE DIMENSION statement or the Dimension Wizard in Oracle E nterprise Manager. Within the CREATE DIMENSION statement, use the LEVEL clause to identify the names of the dimension levels.



This customer dimension contains a single hierarchy with a geographical r ollup, with arrows drawn from the child level to the parent level, as shown in Figure 10-1.

Each arrow in this graph indicates that for any child there is one and only one parent. For example, each city must be contained in exactly one state and each state must be contained in exactly one country. States that belong to more than one country, or that belong to no cou ntry, violate hierarchical integrity. Hierarchical integrity is necessary for the correct operation of management functions for mater ialized views that include aggregates.

For example, you can declare a dimension products_dim, which contains leve ls product, subcategory, and category:

CREATE DIMENSION products_dim
       LEVEL product           IS
(products.prod_id)
       LEVEL subcategory       IS (products.prod_subcategory)
       LEVEL category          IS (products.prod_cat
egory) ...

Each level in the dimension must correspond to one or more columns in a table in the database. Thus, level pro duct is identified by the column prod_id in the products table and level subcategory is identified by a column called prod_subcategory in the same table.

In this example, the database tables are denormalized and all the columns exist in the same table. However, this is not a prerequisite for creating dimensions. "Using Normalized Dimension Tabl es" shows how to create a dimension customers_dim that has a normalized schema design using the JOIN KEY clause.

The next step is to declare the relationship between the levels with the HIERARCHY stateme nt and give that hierarchy a name. A hierarchical relationship is a functional dependency from one level of a hierarchy to the next l evel in the hierarchy. Using the level names defined previously, the CHILD OF relationship denotes that eac h child's level value is associated with one and only one parent level value. The following statement declares a hierarchy prod _rollup and defines the relationship between products, subcategory, and category.

HIERARCHY prod
_rollup 
 (product         CHILD OF
  subcategory     CHILD OF
  category)

In addition to the 1:n hierarchic al relationships, dimensions also include 1:1 attribute relationships between the hierarchy levels and their dependent, determined dimension attributes. For example, the dimension times_dim, as defined in Oracle Database Sample Schemas, has columns fiscal_month_desc, fiscal_m onth_name, and days_in_fiscal_month. Their relationship is defined as follows:

LEVE
L fis_month   IS TIMES.FISCAL_MONTH_DESC
...
ATTRIBUTE fis_month DETERMINES
      (fiscal_month_name, days_in_fiscal_month)

The ATTRIBUTE ... DETERMINES clause relates fis_month to fiscal_month_name and days_in_fiscal_month. Note that this is a unidirectional determination. It is only guaranteed , that for a specific fiscal_month, for example, 1999-11, you will find exactly one matching values for fiscal_month_name, for example, November and days_in_fiscal_month, for example, 28. You cannot de termine a specific fiscal_month_desc based on the fiscal_month_name, which is November for eve ry fiscal year.

In this example, suppose a query were issued that queried by fiscal_month_name instead of f iscal_month_desc. Because this 1:1 relationship exists between the attribute and the level, an already aggregated materialized view containing fiscal_month_desc can be joined back to the dimension information and used to identify the data.


See Als o:

Oracle Databa se SQL Reference for a complete description of the CREATE DIMENSION statement

A sample dimension definition follows:

CREATE
DIMENSION products_dim
    LEVEL product           IS (products.prod_id)
    LEVEL subcategory       IS (products.prod_subcategory)
   LEVEL category          IS (products.prod_category)
    HIERARCHY prod_rollup (
         product         CHILD OF
         subcate
gory     CHILD OF
         category)
    ATTRIBUTE product DETERMINES
     (products.prod_name, products.prod_desc,
      prod_weight
_class, prod_unit_of_measure,
      prod_pack_size, prod_status, prod_list_price, prod_min_price)
    ATTRIBUTE subcategory DETERMINE
S
     (prod_subcategory, prod_subcategory_desc)
    ATTRIBUTE category DETERMINES
     (prod_category, prod_category_desc);

Alternatively, the extended_attribute_clause could have been used instead of the attribute_clause , as shown in the following example:

CREATE DIMENSION products_dim
    LEVEL product
    IS (products.prod_id)
    LEVEL subcategory       IS (products.prod_subcategory)
    LEVEL category          IS (products.prod_ca
tegory)
    HIERARCHY prod_rollup (
         product         CHILD OF
         subcategory     CHILD OF
         category
        )
   ATTRIBUTE product_info LEVEL product DETERMINES
     (products.prod_name, products.prod_desc,
      prod_weight_class, prod_unit_o
f_measure,
      prod_pack_size, prod_status, prod_list_price, prod_min_price)
    ATTRIBUTE subcategory DETERMINES
     (prod_subcat
egory, prod_subcategory_desc)
    ATTRIBUTE category DETERMINES
     (prod_category, prod_category_desc);

The design, cre ation, and maintenance of dimensions is part of the design, creation, and maintenance of your data warehouse schema. Once the dimensi on has been created, check that it meets these requirements:

  • There must be a 1:n relationship between a parent and children. A parent can have one or more children, but a child can have only one parent.

  • There must be a 1:1 attribute relationship between hierarchy levels and their dependent dimension attributes. For example, if there is a c olumn fiscal_month_desc, then a possible attribute relationship would be fiscal_month_desc to fiscal_ month_name.

  • If the columns of a parent level and child level are in different relations, then the connection between them also requires a 1:n join relationship. Each row of the child table must join with one and only one row of th e parent table. This relationship is stronger than referential integrity alone, because it requires that the child join key must be n on-null, that referential integrity must be maintained from the child join key to the parent join key, and that the parent join key m ust be unique.

  • You must ensure (using database constraints if necessary) that the columns of each hierar chy level are non-null and that hierarchical integrity is maintained.

  • The hierarchies of a dimension can overlap or be disconnected from each other. However, the columns of a hierarchy level cannot be associated with more than one dimens ion.

  • Join relationships that form cycles in the dimension graph are not supported. For example, a hierar chy level cannot be joined to itself either directly or indirectly.


See Also:

Chapter 18, " Query Rewrite" for further details of using dimensional information

Note:

The information stored with a dimension objects is only declarative. The previously disc ussed relationships are not enforced with the creation of a dimension object. You should validate any dimension definition with the < code>DBMS_DIMENSION.VALIDATE_DIMENSION procedure, as discussed on "Validating Dimensions".

Dropping and Creating Attributes with C olumns

You can use the attribute clause in a CREATE DIMENSION statement to specify one or mu ltiple columns that are uniquely determined by a hierarchy level.

If you use the extended_attribute_clause to create multiple columns determined by a hierarchy level, you can drop one attribute column without dropping them all. Alternati vely, you can specify an attribute name for each attribute clause CREATE or ALTER DIMENSION st atement so that an attribute name is specified for each attribute clause where multiple level-to-column relationships can be individu ally specified.

The following statement illustrates how you can drop a single column without dropping all columns:

CREATE DIMENSION products_dim
LEVEL product         IS (products.prod_id)
LEVEL subcategory     IS (products.prod_
subcategory)
LEVEL category        IS (products.prod_category)
HIERARCHY prod_rollup (
          product        CHILD OF
          su
bcategory    CHILD OF category)
ATTRIBUTE product DETERMINES
         (products.prod_name, products.prod_desc,
          prod_weight_
class, prod_unit_of_measure,
          prod_pack_size,prod_status, prod_list_price, prod_min_price)
ATTRIBUTE subcategory_att DETERMI
NES
         (prod_subcategory, prod_subcategory_desc)
ATTRIBUTE category DETERMINES
         (prod_category, prod_category_desc);

A
LTER DIMENSION products_dim
DROP ATTRIBUTE subcategory_att LEVEL subcategory COLUMN prod_subcategory;

See Also:

Oracle Database SQL Reference for a complete description of the CREATE DIMENSION statement

Multiple Hierarchies

A s ingle dimension definition can contain multiple hierarchies. Suppose our retailer wants to track the sales of certain items over time . The first step is to define the time dimension over which sales will be tracked. Figure 10-2 illustrates a dimension times_dim with two time hierarchies.

Figure 10-2 times_dim Dimension with Two Time H ierarchies

Description of dwhsg075.gif follows
Description of the illustration dwhsg075.gif

From the illustration, you can construct the hierarchy of the denormalized time_dim dimension's CREATE DIMENSION statement as follows. The complete CREATE DIMENSION statement as well as the CREATE TABLE statement are shown in Oracl e Database Sample Schemas.

CREATE DIMENSION times_dim
   LEVEL day         IS times.time_id
 LEVEL month       IS times.calendar_month_desc
   LEVEL quarter     IS times.calendar_quarter_desc
   LEVEL year        IS times.cal
endar_year
   LEVEL fis_week    IS times.week_ending_day
   LEVEL fis_month   IS times.fiscal_month_desc
   LEVEL fis_quarter IS time
s.fiscal_quarter_desc
   LEVEL fis_year    IS times.fiscal_year
   HIERARCHY cal_rollup    (
             day     CHILD OF
   month   CHILD OF
             quarter CHILD OF
             year
   )
   HIERARCHY fis_rollup    (
             day         CHILD
OF
             fis_week    CHILD OF
             fis_month   CHILD OF
             fis_quarter CHILD OF
             fis_year
   ) &
lt;attribute determination clauses>;

Using Normalized Dimension T ables

The tables used to define a dimension may be normalized or denormalized a nd the individual hierarchies can be normalized or denormalized. If the levels of a hierarchy come from the same table, it is called a fully denormalized hierarchy. For example, cal_rollup in the times_dim dimension is a denormalized hierar chy. If levels of a hierarchy come from different tables, such a hierarchy is either a fully or partially normalized hierarchy. This section shows how to define a normalized hierarchy.

Suppose the tracking of a customer's location is done by city, state, and country. This data is stored in the tables customers and countries. The customer dimension customers_dim is partially no rmalized because the data entities cust_id and country_id are taken from different tables. The clause JOIN KEY within the dimension definition specifies how to join together the levels in the hierarchy. The dimensi on statement is partially shown in the following. The complete CREATE DIMENSION statement as well as the CREATE TABLE statement are shown in Orac le Database Sample Schemas.

CREATE DIMENSION customers_dim
   LEVEL customer  IS (customers.cu
st_id)
   LEVEL city      IS (customers.cust_city)
   LEVEL state     IS (customers.cust_state_province)
   LEVEL country   IS (count
ries.country_id)
   LEVEL subregion IS (countries.country_subregion)
   LEVEL region IS (countries.country_region)
   HIERARCHY geog_
rollup (
      customer        CHILD OF
      city            CHILD OF
      state           CHILD OF
      country         CHILD OF
      subregion       CHILD OF
      region
   JOIN KEY (customers.country_id) REFERENCES country);

Viewing Dimensions

Dimensions can be viewed through one of two methods:

< a id="sthref615" name="sthref615">

Using Oracle Enterprise Manager

All of the dimensions that exist in the data warehouse can be viewed using Oracle Enterprise Manager. Select the Dimension object from within the Schema icon to display all of the dimensions. Select a specific dimens ion to graphically display its hierarchy, levels, and any attributes that have been defined.


< /tr>

See Also:

Oracle Enterprise Manager Administrator's Guide for details regarding creating and using dimensions

Usin g the DESCRIBE_DIMENSION Procedure

To view the definition of a dimension, use the DESCRIBE_DIMENSION proc edure in the DBMS_DIMENSION package. For example, if a dimension is created in the sh sample schema with th e following statements:

CREATE DIMENSION channels_dim
        LEVEL channel       IS (channels.channel_
id)
        LEVEL channel_class IS (channels.channel_class)
        HIERARCHY channel_rollup (
                channel CHILD OF chann
el_class)
        ATTRIBUTE channel DETERMINES (channel_desc)
        ATTRIBUTE channel_class DETERMINES (channel_class);

Execute the DESCRIBE_DIMENSION procedure as follows:

SET SERVEROUTPUT ON FORMAT WRAPPED;
 --to improve the display of info
EXECUTE DBMS_DIMENSION.DESCRIBE_DIMENSION('SH.CHANNELS_DIM');

You then see the followin g output results:

EXECUTE DBMS_DIMENSION.DESCRIBE_DIMENSION('SH.CHANNELS_DIM'); 
  DIMENSION SH.CHANNEL
S_DIM 
    LEVEL CHANNEL IS SH.CHANNELS.CHANNEL_ID 
    LEVEL CHANNEL_CLASS IS SH.CHANNELS.CHANNEL_CLASS 

    HIERARCHY CHANNEL_ROLL
UP ( 
             CHANNEL CHILD OF 
             CHANNEL_CLASS) 

    ATTRIBUTE CHANNEL LEVEL CHANNEL DETERMINES
SH.CHANNELS.CHANNEL
_DESC 
    ATTRIBUTE CHANNEL_CLASS LEVEL CHANNEL_CLASS DETERMINES
SH.CHANNELS.CHANNEL_CLASS

Using Dimensions with Constraints

Constraints play an important role with dimensions. Full referential integrity is som etimes enabled in data warehouses, but not always. This is because operational databases normally have full referential integrity and you can ensure that the data flowing into your data warehouse never violates the already established integrity rules.

It is r ecommended that constraints be enabled and, if validation time is a concern, then the NOVALIDATE clause should be used a s follows:

ENABLE NOVALIDATE CONSTRAINT pk_time;

Primary and foreign keys should be impleme nted also. Referential integrity constraints and NOT NULL constraints on the fact tables provide informatio n that query rewrite can use to extend the usefulness of materialized views.

In addition, you should use the RELY clause to inform query rewrite that it can rely upon the constraints being correct as follows:

ALTER T
ABLE time MODIFY CONSTRAINT pk_time RELY;

This information is also used for query rewrite. See Chapter 18, " Query Rewrite" for more information.

Validating Dimens ions

The information of a dimension obje ct is declarative only and not enforced by the database. If the relationships described by the dimensions are incorrect, incorrect re sults could occur. Therefore, you should verify the relationships specified by CREATE DIMENSION using the < code>DBMS_DIMENSION.VALIDATE_DIMENSION procedure periodically.

This procedure is easy to use and has only four paramete rs:

  • Dimension owner and name.

  • Set to TRUE to check only the ne w rows for tables of this dimension.

  • Set to TRUE to verify that all columns are not null.

  • A user-supplied unique identifier to identify the result of each run of the procedure.

T he following example validates the dimension TIME_FN in the sh schema:

@utldi
m.sql 
EXECUTE DBMS_DIMENSION.VALIDATE_DIMENSION ('SH.TIME_FN', FALSE, TRUE, 
  'my 1st example');

Before running the VALIDATE_DIMENSION procedure, you need to create a local table, DIMENSION_EXCEPTIONS, by running the provided script utldim.sql. If the VALIDATE_DIMENSION procedure encounters any errors, they are placed in this table . Querying this table will identify the exceptions that were found. The following illustrates a sample:

SELECT * FROM dimension_exceptions
WHERE statement_id = 'my 1st example';

STATEMENT_ID     OWNER  TABLE_NAME  DIMENSION_NAME   RELA
TIONSHIP   BAD_ROWID
------------     -----  ----------  --------------   ------------   ---------
my 1st example   SH     MONTH
  TIME_FN          FOREIGN KEY    AAAAuwAAJAAAARwAAA

However, rather than query this table, it may be better to query the rowid of the invalid row to retrieve the actual row that has violated the constraint. In this example, the dimension TIME_FN is checking a table called month. It has found a row that violates the constraints. Using the rowid, you can see e xactly which row in the month table is causing the problem, as in the following:

SELECT * FROM month
WH
ERE rowid IN (SELECT bad_rowid
                FROM dimension_exceptions
                WHERE statement_id = 'my 1st example');

MON
TH    QUARTER   FISCAL_QTR    YEAR    FULL_MONTH_NAME    MONTH_NUMB
------   -------   ----------    ----    ---------------    -----
-----
199903     19981        19981    1998              March             3

Altering Dimensions

You can mod ify the dimension using the ALTER DIMENSION statement. You can add or drop a level, hierarchy, or attribute from the dimension using this command.

Referring to the time dimension in Figure 10-2, y ou can remove the attribute fis_year, drop the hierarchy fis_rollup, or remove the level fiscal_year< /code>. In addition, you can add a new level called f_year as in the following:

ALTER DIME
NSION times_dim DROP ATTRIBUTE fis_year;
ALTER DIMENSION times_dim DROP HIERARCHY fis_rollup;
ALTER DIMENSION times_dim DROP LEVEL fi
s_year
ALTER DIMENSION times_dim ADD LEVEL f_year IS times.fiscal_year;

If you used the extended_attribute_clau se when creating the dimension, you can drop one attribute column without dropping all attribute columns. This is illustr ated in "Dropping and Creating Attributes with Columns", which shows the following statement:

ALTER DIMENSION product_dim 
DROP ATTRIBUTE size LEVEL prod_type COLUMN Prod_TypeSize;

If you try to remov e anything with further dependencies inside the dimension, Oracle Database rejects the altering of the dimension. A dimension becomes invalid if you change any schema object that the dimension is referencing. For example, if the table on which the dimension is defin ed is altered, the dimension becomes invalid.

To check the status of a dimension, view the contents of the column invali d in the ALL_DIMENSIONS data dictionary view. To revalidate the dimension, use the COMPILE option as follows:

ALTER DIMENSION times_dim COMPILE;

Dimensions can also be modified or deleted usi ng Oracle Enterprise Manager. See Oracle Enterprise Manager Admin istrator's Guide for more information.

Deleting Dimensions

A dimension is removed using the DROP< /code> DIMENSION statement. For example:

DROP DIMENSION times_dim;