Skip Headers

Oracle® Database Data Warehousing Guide
10g Release 1 (10.1)

Part Number B10736-01
< td align="center" valign="top">Go to Index
< font size="-2">Index
Go to Documentation Home
Home
Go to Book List
Book List

Contents
Go to Master Index
Master Index
Go to Feedback page Feedback

Go to previou
s page
Previous
Go to next page
Next
View PDF
< /a>

3 Physical Design in Data Warehouses

This chapter describes the physical design of a data warehousing environment, and includes the fol lowing topics:

Moving from Logical to Physical Design

Logical design is what you draw with a pen and paper or design with Oracle Warehouse Builder or Oracle Designer before building your data warehouse. Physical design is the creation of the database with SQL statements.

During the physical design process, you convert the data gathered during the logical design phase into a description o f the physical database structure. Physical design decisions are mainly driven by query performance and database maintenance aspects. For example, choosing a partitioning strategy that meets common query requirements enables Oracle Database to take advantage of part ition pruning, a way of narrowing a search before performing it.


See Also:


Physical Design

During the logical design phase, you defined a model for your data warehouse consisting of entities, attributes, and relationships. The entities are linked together using relationships. Attributes are used to describe the entities. The unique identifier (UID) distinguishes between one instance of an entity and another.

Figure 3-1 illustrates a graphical way of distinguishing between logical and physical designs.

Figure 3-1 Logical Design Compared with Physical Design

Description of dwhsg006.gif follows
Description of the illustration dwhsg006.gif

< !-- class="figure" -->

During the physical design process, you translate the expected schemas into actual database structures. At this time, you have to map:

Physical Design Structures

On ce you have converted your logical design to a physical one, you will need to create some or all of the following structures:

Some o f these structures require disk space. Others exist only in the data dictionary. Additionally, the following structures may be create d for performance improvement:

Tablespac es

A tablespace consists of one or more datafiles, which are physical structures within the operating system you are u sing. A datafile is associated with only one tablespace. From a design perspective, tablespaces are containers for physical design st ructures.

Tablespaces need to be separated by differences. For example, tables should be separated from their indexes and smal l tables should be separated from large tables. Tablespaces should also represent logical business units if possible. Because a table space is the coarsest granularity for backup and recovery or the transportable tablespaces mechanism, the logical business design aff ects availability and maintenance operations.

You can now use ultralarge data files, a significant improvement in very large databases.


See Also:

Chapter 4, " Hardware and I/O Considerations in Data Warehouses" for information regarding tablespaces

Tables and Partitioned Tables

Tables a re the basic unit of data storage. They are the container for the expected amount of raw data in your data warehouse.

Using pa rtitioned tables instead of nonpartitioned ones addresses the key problem of supporting very large data volumes by allowing you to de compose them into smaller and more manageable pieces. The main design criterion for partitioning is manageability, though you will al so see performance benefits in most cases because of partition pruning or intelligent parallel processing. For example, you might cho ose a partitioning strategy based on a sales transaction date and a monthly granularity. If you have four years' worth of data, you c an delete a month's data as it becomes older than four years with a single, fast DDL statement and load new data while only affecting 1/48th of the complete table. Business questions regarding the last quarter will only affect three months, which is equivalent to th ree partitions, or 3/48ths of the total volume.

Partitioning large tables improves performance because each partitioned piece is more manageable. Typically, you partition based on transaction dates in a data warehouse. For example, each month, one month's wor th of data can be assigned its own partition.

Table Compression

You can save disk space by compressing heap-organized tables. A typical type of heap-organized table you should cons ider for table compression is partitioned tables.

To reduce disk use and memory use (specifically, the buffer cache), you can store tables and partitioned tables in a compressed format inside the database. This often leads to a better scaleup for read-only op erations. Table compression can also speed up query execution. There is, however, a cost in CPU overhead.

Table compression sh ould be used with highly redundant data, such as tables with many foreign keys. You should avoid compressing tables with much update or other DML activity. Although compressed tables or partitions are updatable, there is some overhead in updating these tables, and h igh update activity may work against compression by causing some space to be wasted.

Views

A view is a tailored presentation of the data contained in one or more tables or other vi ews. A view takes the output of a query and treats it as a table. Views do not require any space in the database.


< tr>

See Also:

Oracle Database Concepts

Integrity Constraints

Integrity constraint s are used to enforce business rules associated with your database and to prevent having invalid information in the tables. Integrity constraints in data warehousing differ from constraints in OLTP environments. In OLTP environments, they primarily prevent the inser tion of invalid data into a record, which is not a big problem in data warehousing environments because accuracy has already been gua ranteed. In data warehousing environments, constraints are only used for query rewrite. NOT NULL constraint s are particularly common in data warehouses. Under some specific circumstances, constraints need space in the database. These constr aints are in the form of the underlying unique index.

Indexes and Part itioned Indexes

Indexes are optional structures associated with tables or clusters. In addition to the classical B-tre e indexes, bitmap indexes are very common in data warehousing environments. Bitmap indexes are optimized index structures for set-ori ented operations. Additionally, they are necessary for some optimized data access methods such as star transformations.

Indexe s are just like tables in that you can partition them, although the partitioning strategy is not dependent upon the table structure. Partitioning indexes makes it easier to manage the data warehouse during refresh and improves query performance.



Materialized Views< /font>

Materialized views are query results that have been stored in advance so long-running calculations are not necessary w hen you actually execute your SQL statements. From a physical design point of view, materialized views resemble tables or partitioned tables and behave like indexes in that they are used transparently and improve performance.


See Also:

Chapter 6, " Indexes" and Chapter 15, " Maintaining the Data Warehouse"

See Also:

Chapter 8, " Basic Materialized Views"

Dimensions

A dimension is a schema object that defines hierarchical relationships between co lumns or column sets. A hierarchical relationship is a functional dependency from one level of a hierarchy to the next one. A dimensi on is a container of logical relationships and does not require any space in the database. A typical dimension is city, state (or pro vince), region, and country.


See Also:< /strong>

Chapter 10, " Dimensions"