| Oracle® Database Data Warehousing Guide 10g Release 1 (10.1) Part Number B10736-01 font> |
|
5 a>Parallelism and Partitioning in Data WarehousesData warehouses often contain large tables and require techniques bo th for managing these large tables and for providing good query performance across these large tables. This chapter discusses two key methodologies for addressing these needs: parallelism and partitioning. These topics are discussed:
Overview of Parallel ExecutionParallel execution d ramatically reduces response time for data-intensive operations on large databases typically associated with decision support systems (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and hybrid systems. Parallel execution is sometimes called parallelism. Simply expressed, parallelism is the idea of breaking down a task so that, instead of one proces s doing all of the work in a query, many processes do part of the work at the same time. An example of this is when four processes ha ndle four different quarters in a year instead of one process handling all four quarters by itself. The improvement in performance ca n be quite high. In this case, each quarter will be a partition, a smaller and mo re manageable unit of an index or table. When to Implement Parallel ExecutionThe most common use of parallel execution is in DSS and data wa rehousing environments. Complex queries, such as those involving joins of several tables or searches of very large tables, are often best executed in parallel. Parallel execution is useful for many types of operations th at access significant amounts of data. Parallel execution improves processing for:
You can also use parallel execution to access object types within an Oracle database. For example, use parallel execution to access LOBs (large objec ts). Parallel execution benefits systems that have all of the following characteristics:
If your system lacks any of these characteristics, parallel execution might not significantly improve performance. In fact, parallel execution can reduce system performance on overutil ized systems or systems with small I/O bandwidth.
Granules of ParallelismDifferent parallel operations use different types of parallelism. The optimal physical database layout depends on the parallel ope rations that are most prevalent in your application or even of the necessity of using partitions. The basic unit of work in pa rallelism is a called a granule. Oracle Database divides the operation being parallelized (for example, a table scan, table update, or index creation) into granules. Parallel execution processes execute the operation one granul e at a time. The number of granules and their size correlates with the degree of parallelism (DOP). It also affects how well the work is balanced across query server processes. There is no way you can enforce a specific granule strategy as Oracle Database makes this decision internally. Block Range GranulesBloc k range granules are the basic unit of most parallel operations, even on partitioned tables. Therefore, from an Oracle Database persp ective, the degree of parallelism is not related to the number of partitions. Block range granules are ranges of physical bloc ks from a table. The number and the size of the granules are computed during runtime by Oracle Database to optimize and balance the w ork distribution for all affected parallel execution servers. The number and size of granules are dependent upon the size of the obje ct and the DOP. Block range granules do not depend on static preallocation of tables or indexes. During the computation of the granul es, Oracle Database takes the DOP into account and tries to assign granules from different datafiles to each of the parallel executio n servers to avoid contention whenever possible. Additionally, Oracle Database considers the disk affinity of the granules on MPP sys tems to take advantage of the physical proximity between parallel execution servers and disks. When block range granules are u sed predominantly for parallel access to a table or index, administrative considerations (such as recovery or using partitions for de leting portions of data) might influence partition layout more than performance considerations. Partition GranulesWhen partition granules are used, a query server process works on an entire partition or subpartition of a table or index. Because partition granules are statically dete rmined by the structure of the table or index when a table or index is created, partition granules do not give you the flexibility in parallelizing an operation that block granules do. The maximum allowable DOP is the number of partitions. This might limit the utili zation of the system and the load balancing across parallel execution servers. When partition granules are used for parallel a ccess to a table or index, you should use a relatively large number of partitions (ideally, three times the DOP), so that Oracle can effectively balance work across the query server processes. Partition granules are the basic unit of parallel index range scan s and of parallel operations that modify multiple partitions of a partitioned table or index. These operations include parallel creat ion of partitioned indexes, and parallel creation of partitioned tables. Partitioning Design ConsiderationsIn conjunction with parallel execution, partitioning can improve pe rformance in data warehouses. The following are the main design considerations for partitioning: Types of PartitioningThis section describes the partitioning features that sign ificantly enhance data access and improve overall application performance. This is especially true for applications that access table s and indexes with millions of rows and many gigabytes of data. Partitioned tables and indexes facilitate administrative opera tions by enabling these operations to work on subsets of data. For example, you can add a new partition, organize an existing partiti on, or drop a partition and cause less than a second of interruption to a read-only application. Using the partitioning method s described in this section can help you tune SQL statements to avoid unnecessary index and table scans (using partition pruning). Yo u can also improve the performance of massive join operations when large amounts of data (for example, several million rows) are join ed together by using partition-wise joins. Finally, partitioning data greatly improves manageability of very large databases and dram atically reduces the time required for administrative tasks such as backup and restore. Granularity can be easily added or rem oved to the partitioning scheme by splitting partitions. Thus, if a table's data is skewed to fill some partitions more than others, the ones that contain more data can be split to achieve a more even distribution. Partitioning also allows one to swap partitions wit h a table. By being able to easily add, remove, or swap a large amount of data quickly, swapping can be used to keep a large amount o f data that is being loaded inaccessible until loading is completed, or can be used as a way to stage data between different phases o f use. Some examples are current day's transactions or online archives. < table class="notealso" summary="This is a layout table to format a tip" title="This is a layout table to format a tip" dir="ltr" bord er="1" width="80%" frame="hsides" rules="groups" cellpadding="3" cellspacing="0">
| See Also: Oracle Database Concepts for an introduction to the ideas behind partitioning |
Oracl e offers four partitioning methods:
Each partitioning method has different advantages and de sign considerations. Thus, each method is more appropriate for a particular situation.
Range partitioning maps data to partitions based on ranges of partition key values that yo u establish for each partition. It is the most common type of partitioning and is often used with dates. For example, you might want to partition sales data into monthly partitions.
Range partitioning maps rows to partitions based on ranges of column values.
Range partitioning is defined by the partitioning specification for a table or index in PARTITION BY
RANGE(column_list) and by the partitioning specifications for each individual partition in VALUES LESS THAN(value_list), where column_list is an ordered list of columns that determines the partition to which
a row or an index entry belongs. These columns are called the partitioning columns. The values in the partitioning columns of a part
icular row constitute that row's partitioning key.
value_list is an ordered list of values for the columns in the
column list. Each value must be either a literal or a TO_DATE or RPAD function with constant arguments. On
ly the VALUES LESS THAN clause is allowed. This clause specifies a non-inclusive upper bound f
or the partitions. All partitions, except the first, have an implicit low value specified by the VALUES LESS THAN literal on the previous partition. Any binary values of the partition key equal to or higher than this literal ar
e added to the next higher partition. Highest partition being where MAXVALUE literal is defined. Keyword, MAXVALUE
, represents a virtual infinite value that sorts higher than any other value for the data type, including the null value.
The following statement creates a table sales_range that is range partitioned on the sales_date field:
p>
CREATE TABLE sales_range
(salesman_id NUMBER(5),
salesman_name VARCHAR2(30),
sales_amount NUMBER(10
),
sales_date DATE)
COMPRESS
PARTITION BY RANGE(sales_date)
(PARTITION sales_jan2000 VALUES LESS THAN(TO_DATE('02/01/2000','DD/MM
/YYYY')),
PARTITION sales_feb2000 VALUES LESS THAN(TO_DATE('03/01/2000','DD/MM/YYYY')),
PARTITION sales_mar2000 VALUES LESS THAN(TO
_DATE('04/01/2000','DD/MM/YYYY')),
PARTITION sales_apr2000 VALUES LESS THAN(TO_DATE('05/01/2000','DD/MM/YYYY')));
|
Note: This table was created with theCOMPRESS keyword, thus all partitions inherit this attribute. |
|
See Also: strong> Oracle Database SQL Reference for pa rtitioning syntax and the Oracle Database Admini strator's Guide for more examples |
Hash partitioning maps data to partitions based on a hashing algorithm that Oracle applies to a partitioning ke y that you identify. The hashing algorithm evenly distributes rows among partitions, giving partitions approximately the same size. H ash partitioning is the ideal method for distributing data evenly across devices. Hash partitioning is a good and easy-to-use alterna tive to range partitioning when data is not historical and there is no obvious column or column list where logical range partition pr uning can be advantageous.
Oracle Database uses a linear hashing algorithm and to prevent data from clustering within specific partitions, you should define the number of partitions by a power of two (for example, 2, 4, 8).
The following statement crea
tes a table sales_hash, which is hash partitioned on the salesman_id field:
C REATE TABLE sales_hash (salesman_id NUMBER(5), salesman_name VARCHAR2(30), sales_amount NUMBER(10), week_no NUMBER(2)) PA RTITION BY HASH(salesman_id) PARTITIONS 4;
|
See Also: Oracle D atabase SQL Reference for partitioning syntax and the Oracle Database Administrator's Guide for more examples |
|
Note: You cannot define alternate hashing algorithms for partitions. |
Lis< /a>t partitioning enables you to explicitly control how rows map to partitions. You do this by specifying a list of discrete values for the partitioning column in the description for each partition. This is different from range partitioning, where a range of values is associated with a partition and with hash partitioning, where you have no control of the ro w-to-partition mapping. The advantage of list partitioning is that you can group and organize unordered and unrelated sets of data in a natural way. The following example creates a list partitioned table grouping states according to their sales regions:
CREATE TABLE sales_list
(salesman_id NUMBER(5),
salesman_name VARCHAR2(30),
sales_state VARCHAR2(20),
sales_amo
unt NUMBER(10),
sales_date DATE)
PARTITION BY LIST(sales_state)
(PARTITION sales_west VALUES('California', 'Hawaii') COMPRESS,
PARTITION sales_east VALUES('New York', 'Virginia', 'Florida'),
PARTITION sales_central VALUES('Texas', 'Illinois'));
Pa
rtition sales_west is furthermore created as a single compressed partition within sales_list. For details a
bout partitioning and compression, see "Partitioning and Table Compression".
An additional capability with list partitioning is that you can use a default partition, so that all rows that do not map to any other partition do not genera te an error. For example, modifying the previous exampl e, you can create a default partition as follows:
CREATE TABLE sales_list
(salesman_id NUMBER(5),
sal
esman_name VARCHAR2(30),
sales_state VARCHAR2(20),
sales_amount NUMBER(10),
sales_date DATE)
PARTITION BY LIST(sales_state)
(P
ARTITION sales_west VALUES('California', 'Hawaii'),
PARTITION sales_east VALUES ('New York', 'Virginia', 'Florida'),
PARTITION sale
s_central VALUES('Texas', 'Illinois'),
PARTITION sales_other VALUES(DEFAULT));
|
|
Composite partitioning combines range and hash or list partitioning. Oracle Database firs t distributes data into partitions according to boundaries established by the partition ranges. Then, for range-hash partitioning, Or acle uses a hashing algorithm to further divide the data into subpartitions within each range partition. For range-list partitioning, Oracle divides the data into subpartitions within each range partition based on the explicit list you chose.
You can choose whether or not t o inherit the partitioning strategy of the underlying tables. You can create both local and global indexes on a table partitioned by range, hash, or composite methods. Local indexes inheri t the partitioning attributes of their related tables. For example, if you create a local index on a composite table, Oracle automati cally partitions the local index using the composite method. See Chapter 6, " Indexes" for more in formation.
This section des cribes performance issues for:
Range partitioning is a convenient method for partitioning historical data. The boundaries of range partitions define the ordering of the partitions in the tables or indexes.
Range part
itioning organizes data by time intervals on a column of type DATE. Thus, most SQL statements accessing range partitions
focus on timeframes. An example of this is a SQL statement similar to "select data from a particular period in time." In such a scen
ario, if each partition represents data for one month, the query "find data of month 98-DEC" needs to access only the December partit
ion of year 98. This reduces the amount of data scanned to a fraction of the total data available, an optimization method called part
ition pruning.
Range partitioning is also ideal when you periodically load new data and purge old data. It is easy to add or d rop partitions.
It is common to keep a rolling window of data, for example keeping the past 36 months' worth of data online. R
ange partitioning simplifies this process. To add data from a new month, you load it into a separate table, clean it, index it, and t
hen add it to the range-partitioned table using the EXCHANGE PARTITION statement, all while the original ta
ble remains online. Once you add the new partition, you can drop the trailing month with the DROP PARTITION
statement. The alternative to using the DROP PARTITION statement can be to archive the partition and make
it read only, but this works only when your partitions are in separate tablespaces.
In conclusion, consider using range partitioning when:
Very large ta
bles are frequently scanned by a range predicate on a good partitioning column, such as ORDER_DATE or PURCHASE_DAT
E. Partitioning the table on that column enables partition pruning.
You want to maintain a rollin g window of data.
You cannot complete administrative operations, such as backup and restore, on large ta bles in an allotted time frame, but you can divide them into smaller logical pieces based on the partition range column.
The following example creates the table salestable for a period of two years, 1999 and 2000, and partitions it by r
ange according to the column s_salesdate to separate the data into eight quarters, each corresponding to a partition.
CREATE TABLE salestable
(s_productid NUMBER,
s_saledate DATE,
s_custid NUMBER,
s_totalp
rice NUMBER)
PARTITION BY RANGE(s_saledate)
(PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
PARTITION
sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-M
ON-YYYY')),
PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')),
PARTITION sal00q1 VALUES LESS THAN (TO_DA
TE('01-APR-2000', 'DD-MON-YYYY')),
PARTITION sal00q2 VALUES LESS THAN (TO_DATE('01-JUL-2000', 'DD-MON-YYYY')),
PARTITION sal00q3
VALUES LESS THAN (TO_DATE('01-OCT-2000', 'DD-MON-YYYY')),
PARTITION sal00q4 VALUES LESS THAN (TO_DATE('01-JAN-2001', 'DD-MON-YYYY')
));
The way Oracle Data base distributes data in hash partitions does not correspond to a business or a logical view of the data, as it does in range partiti oning. Consequently, hash partitioning is not an effective way to manage historical data. However, hash partitions share some perform ance characteristics with range partitions. For example, partition pruning is limited to equality predicates. You can also use partit ion-wise joins, parallel index access, and parallel DML. See "Partition-Wise Joins" for more information.
As a general rule, use hash partitioning for these purposes:
To improve the availability and managea bility of large tables or to enable parallel DML in tables that do not store historical data.
To avoid d ata skew among partitions. Hash partitioning is an effective means of distributing data because Oracle hashes the data into a number of partitions, each of which can reside on a separate device. Thus, data is evenly spread over a sufficient number of devices to maxi mize I/O throughput. Similarly, you can use hash partitioning to distribute evenly data among the nodes of an MPP platform that uses Oracle Real Application Clusters.
If it is important to use partition pruning and partition-wise joins a ccording to a partitioning key that is mostly constrained by a distinct value or value list.
Note:
In hash partitioning, partition pruning uses only equality orIN-li
st predicates.If you add or merge a hashed partition , Oracle automatically rearranges the rows to reflect the change in the number of partitions and subpartitions. The hash function tha t Oracle uses is especially designed to limit the cost of this reorganization. Instead of reshuffling all the rows in the table, Orac les uses an "add partition" logic that splits one and only one of the existing hashed partitions. Conversely, Oracle coalesces a part ition by merging two existing hashed partitions.
Although the hash function's use of "add partition" logic dramatically improv es the manageability of hash partitioned tables, it means that the hash function can cause a skew if the number of partitions of a ha sh partitioned table, or the number of subpartitions in each partition of a composite table, is not a power of two. In the worst case , the largest partition can be twice the size of the smallest. So for optimal performance, create a number of partitions and subparti tions for each partition that is a power of two. For example, 2, 4, 8, 16, 32, 64, 128, and so on.
The following example creat
es four hashed partitions for the table sales_hash using the column s_productid as the partition key:
Specify partition names if you want to choose the names of the par
titions. Otherwise, Oracle automatically generates internal names for the partitions. Also, you can use the STORE
IN clause to assign hash partitions to tablespaces in a round-robin manner.
|
See Also: Oracle Database SQL Reference for partitioning syntax and the Oracle Database Administrator's Guide for more examples |
| Subpartition | Tablespace |
|---|---|
sal99q1_sp1 |
tbs1 |
sal99q1_s
p2 |
tbs2 |
sal99q1_sp3 |
tbs3 |
sal99q1_sp4 |
tbs4 |
sal99q1_sp5 |
tbs5 |
sal99q1_sp6 |
|
sal99q1_sp7
td>
| tbs7 |
sal99q1_sp8 |
tbs8 |
Composite range-list partitioning offers the benefits of both range and list partitioning. With composite range-list partitioning, Oracle first partitions by range. Then, within each range, Oracle creates subpartitions and distributes data within them to organize sets of data in a natural way as assigned by the list.
Data placed in composite partitions is logically ordered only by the boundaries that define the range level partitions.
Use the composite ra nge-list partitioning method for tables and local indexes if:
Subpartitions have a logical grouping defi ned by the user.
The contents of a partition can be spread across multiple tablespaces, devices, or node s (of an MPP system).
You require both partition pruning and partition-wise joins even when the pruning and join predicates use different columns of the partitioned table.
You require a degree of parallelism that is greater than the number of partitions for backup, recovery, and parallel operations.
Most large tables in a data warehouse should use range partitioning. Composite partitioning should be used for very large tables or for data warehouses with a well-defined need for these conditions. When using the composite method, Oracle stores each subpartition on a different segment. T hus, the subpartitions may have properties that differ from the properties of the table or from the partition to which the subpartiti ons belong.
This statement creates a table quarterly_regional_sales that is range partitioned on the txn_da
te field and list subpartitioned on state.
CREATE TABLE quarterly_regional_sales
(d
eptno NUMBER, item_no VARCHAR2(20),
txn_date DATE, txn_amount NUMBER, state VARCHAR2(2))
PARTITION BY RANGE (txn_date)
SUBPARTITION
BY LIST (state)
(
PARTITION q1_1999 VALUES LESS THAN(TO_DATE('1-APR-1999','DD-MON-YYYY'))
(SUBPARTI
TION q1_1999_northwest VALUES ('OR', 'WA'),
SUBPARTITION q1_1999_southwest VALUES ('AZ', 'UT', 'NM'),
SUBPARTITION q1_1999_northeast
VALUES ('NY', 'VM', 'NJ'),
SUBPARTITION q1_1999_southeast VALUES ('FL', 'GA'),
SUBPARTITION q1_1999_northcentral VALUES ('SD', 'WI'
),
SUBPARTITION q1_1999_southcentral VALUES ('NM', 'TX')),
PARTITION q2_1999 VALUES LESS THAN(TO_DA
TE('1-JUL-1999','DD-MON-YYYY'))
(SUBPARTITION q2_1999_northwest VALUES ('OR', 'WA'),
SUBPARTITION q
2_1999_southwest VALUES ('AZ', 'UT', 'NM'),
SUBPARTITION q2_1999_northeast VALUES ('NY', 'VM', 'NJ'),
SUBPARTITION q2_1999_southeast
VALUES ('FL', 'GA'),
SUBPARTITION q2_1999_northcentral VALUES ('SD', 'WI'),
SUBPARTITION q2_1999_southcentral VALUES ('NM', 'TX')),
<
/pre>
PARTITION q3_1999 VALUES LESS THAN (TO_DATE('1-OCT-1999','DD-MON-YYYY'))
(SUBPARTITION q3_1999_northwest VALUES ('OR', 'WA'),
SUBPARTITION q3_1999_southwest VALUES ('AZ', 'UT', 'NM'),
SUBPARTITION q
3_1999_northeast VALUES ('NY', 'VM', 'NJ'),
SUBPARTITION q3_1999_southeast VALUES ('FL', 'GA'),
SUBPARTITION q3_1999_northcentral VAL
UES ('SD', 'WI'),
SUBPARTITION q3_1999_southcentral VALUES ('NM', 'TX')),
PARTITION q4_1999 VALUES
LESS THAN (TO_DATE('1-JAN-2000','DD-MON-YYYY'))
(SUBPARTITION q4_1999_northwest VALUES('OR', 'WA'),
SUBPARTITION q4_1999_southwest VALUES('AZ', 'UT', 'NM'),
SUBPARTITION q4_1999_northeast VALUES('NY', 'VM', 'NJ'),
SUBPARTITION q4_19
99_southeast VALUES('FL', 'GA'),
SUBPARTITION q4_1999_northcentral VALUES ('SD', 'WI'),
SUBPARTITION q4_1999_southcentral VALUES ('NM
', 'TX')));
You can create subpartitions in a composite partitioned table using a subpartition template. A subpartition te
mplate simplifies the specification of subpartitions by not requiring that a subpartition descriptor be specified for every partition
in the table. Instead, you describe subpartitions only once in a template, then apply that subpartition template to every partition
in the table. The following statement illustrates an example where you can choose the subpartition name and tablespace locations:
CREATE TABLE quarterly_regional_sales
(deptno NUMBER, item_no VARCHAR2(20),
txn_date DATE, txn_amount N
UMBER, state VARCHAR2(2))
PARTITION BY RANGE (txn_date)
SUBPARTITION BY LIST (state)
SUBPARTITION TEMPLATE(
SUBPARTITION northwest VALUES ('OR', 'WA') TABLESPACE ts1,
SUBPARTITION southwest VALUES ('AZ', 'UT', 'NM') TABLESPACE ts2,
SUBPARTITION northeast VALUES ('NY', 'VM', 'NJ') TABLESPACE ts3,
SUBPARTITION southeast VALUES ('FL', 'GA') TABLESPACE ts4,
SUBPARTI
TION northcentral VALUES ('SD', 'WI') TABLESPACE ts5,
SUBPARTITION southcentral VALUES ('NM', 'TX') TABLESPACE ts6)
(
PARTITION q1_1999 VALUES LESS THAN(TO_DATE('1-APR-1999','DD-MON-YYYY')),
PARTITION q2_1999 VALUES LESS THAN(TO_DAT
E('1-JUL-1999','DD-MON-YYYY')),
PARTITION q3_1999 VALUES LESS THAN(TO_DATE('1-OCT-1999','DD-MON-YYYY')),
PARTITION q4_1999 VALUES LES
S THAN(TO_DATE('1-JAN-2000','DD-MON-YYYY')));
You can compress several partitions or a complete partitioned heap-organize d table. You do this by either defining a complete partitioned table as being compressed, or by defining it on a per-partition level. Partitions without a specific declaration inherit the attribute from the table definition or, if nothing is specified on table level, from the tablespace definition.
To decide whet her or not a partition should be compressed or stay uncompressed adheres to the same rules as a nonpartitioned table. However, due to the capability of range and composite partitioning to separate data logically into distinct partitions, such a partitioned table is an ideal candidate for compressing parts of the data (partitions) that are mainly read-only. It is, for example, beneficial in all ro lling window operations as a kind of intermediate stage before aging out old data. With data segment compression, you can keep more o ld data online, minimizing the burden of additional storage consumption.
You can also change any existing uncompressed table p
artition later on, add new compressed and uncompressed partitions, or change the compression attribute as part of any partition maint
enance operation that requires data movement, such as MERGE PARTITION, SPLIT PARTITION
code>, or MOVE PARTITION. The partitions can contain data or can be empty.
The access and maintenanc e of a partially or fully compressed partitioned table are the same as for a fully uncompressed partitioned table. Everything that ap plies to fully uncompressed partitioned tables is also valid for partially or fully compressed partitioned tables.
|
See Also: Chapter 3, " Physical Design in Data Warehouses" for a generic discussion of table compression, Chapter 15, " Maintaining the Data Warehouse" for a sample rolling window operation with a range-partitioned table, and Oracle Database Performance Tuning G uide for an example of calculating the compression ratio |
Th
e following statement moves and compresses an already existing partition sales_q1_1998 of table sales:
ALTER TABLE sales MOVE PARTITION sales_q1_1998 TABLESPACE ts_arch_q1_1998 COMPRESS;
If you use
the MOVE statement, the local indexes for partition sales_q1_1998 become unusable. You have to rebuild them
afterward, as follows:
ALTER TABLE sales MODIFY PARTITION sales_q1_1998 REBUILD UNUSABLE LOCAL INDEXES ;
The following statement merges two existing partitions into a new, compressed partition, residing in a separate tablesp ace. The local bitmap indexes have to be rebuilt afterward, as follows:
ALTER TABLE sales MERGE PARTITI ONS sales_q1_1998, sales_q2_1998 INTO PARTITION sales_1_1998 TABLESPACE ts_arch_1_1998 COMPRESS UPDATE INDEXES;
|
See Also: Oracle Database Performance Tuning Guide for details regardin g how to estimate the compression ratio when using table compression |
Partition pruning is an essential performance feature for data warehouses. In partition pruning, the optimizer analyzes FROM and WHERE clauses in SQL statements t
o eliminate unneeded partitions when building the partition access list. This enables Oracle Database to perform operations only on t
hose partitions that are relevant to the SQL statement. Oracle prunes partitions when you use range, LIKE, equality, and
IN-list predicates on the range or list partitioning columns, and when you use equality and IN-list predic
ates on the hash partitioning columns.
Partition pruning dramatically reduces the amount of data retrieved from disk and short ens the use of processing time, improving query performance and resource utilization. If you partition the index and table on differe nt columns (with a global, partitioned index), partition pruning also eliminates index partitions even when the partitions of the und erlying table cannot be eliminated.
On composite partitioned objects, Oracle can prune at both the range partition level and a
t the hash or list subpartition level using the relevant predicates. Refer to the table sales_range_hash earlier, partit
ioned by range on the column s_salesdate and subpartitioned by hash on column s_productid, and consider the
following example:
SELECT * FROM sales_range_hash
WHERE s_saledate BETWEEN (TO_DATE('01-JUL-1999', 'DD
-MON-YYYY')) AND
(TO_DATE('01-OCT-1999', 'DD-MON-YYYY')) AND s_productid = 1200;
Oracle uses the predicate on the partit ioning columns to perform partition pruning as follows:
When using range partitioning, Oracle accesses o
nly partitions sal99q2 and sal99q3.
When using hash subpartitioning, Oracle ac
cesses only the one subpartition in each partition that stores the rows with s_productid=1200. The mapping between the s
ubpartition and the predicate is calculated based on Oracle's internal hash distribution function.
In the earlier partitioning pruning example, the date value was fully specified as
four digits for the year using the TO_DATE function, just as it was in the underlying table's range partitioning descrip
tion. While this is the recommended format for specifying date values, the optimizer can prune partitions using the predicates on
SELECT * FROM sales_rang
e_hash
WHERE s_saledate BETWEEN TO_DATE('01-JUL-99', 'DD-MON-RR') AND
TO_DATE('01-OCT-99', 'DD-MON-RR') AND s_productid = 1200;
Although this uses the DD-MON-RR format, which is not the same as the base partition, the optimizer can still
prune properly.
If you execute an EXPLAIN PLAN statement on the query, the PARTITION_START and PARTITION_STOP columns of the output table do not specify which partitions Oracle is accessing. Instead, you se
e the keyword KEY for both columns. The keyword KEY for both columns means that partition pruning occurs at
run-time. It can also affect the execution plan because the information about the pruned partitions is missing compared to the same
statement using the same TO_DATE function than the partition table definition.
Partition-wise joins reduce query response time by minimizing the amount of data exchange d among parallel execution servers when joins execute in parallel. This significantly reduces response time and improves the use of b oth CPU and memory resources. In Oracle Real Application Clusters environments, partition-wise joins also avoid or at least limit the data traffic over the interconnect, which is the key to achieving good scalability for massive join operations.
Partition-wis e joins can be full or partial. Oracle decides which type of join to use.
A full partition-wise join divides a larg e join into smaller joins between a pair of partitions from the two joined tables. To use this feature, you must equipartition both t ables on their join keys. For example, consider a large join between a sales table and a customer table on the column customerid. The query "find the records of all customers who bought more than 100 articles in Quarter 3 of 1999" is a typical example of a SQL state ment performing such a join. The following is an example of this:
SELECT c.cust_last_name, COUNT(*)
FRO
M sales s, customers c
WHERE s.cust_id = c.cust_id AND
s.time_id BETWEEN TO_DATE('01-JUL-1999', 'DD-MON-YYYY') AND
(TO_DATE('0
1-OCT-1999', 'DD-MON-YYYY'))
GROUP BY c.cust_last_name HAVING COUNT(*) > 100;
This large join is typical in data wareh
ousing environments. The entire customer table is joined with one quarter of the sales data. In large data warehouse applications, th
is might mean joining millions of rows. The join method to use in that case is obviously a hash join. You can reduce the processing t
ime for this hash join even more if both tables are equipartitioned on the customerid column. This enables a full partit
ion-wise join.
When you execute a full partition-wise join in parallel, the granule of parallelism, as described under "Granules of Parallelism", is a partition. As a result, the degree of parallelism is limited to the number of partit ions. For example, you require at least 16 partitions to set the degree of parallelism of the query to 16.
You can use various
partitioning methods to equipartition both tables on the column customerid with 16 partitions. These methods are descri
bed in these subsections.
This is the simplest method: the customers and sales tables are
both partitioned by hash into 16 partitions, on the s_customerid and c_customerid columns. This partitioni
ng method enables full partition-wise join when the tables are joined on c_customerid and s_customerid, bot
h representing the same customer identification number. Because you are using the same hash function to distribute the same informati
on (customer ID) into the same number of hash partitions, you can join the equivalent partitions. They are storing the same values.
p>
In serial, this join is performed between pairs of matching hash partitions, one at a time. When one partition pair has been jo ined, the join of another partition pair begins. The join completes when the 16 partition pairs have been processed.
|
Note: A pair of matching hash partitions is defined as one partition with the same partition number from each table. For example, with full partition-wise jo ins we join partition 0 ofsales with partition 0 of customers, partition 1 of sales with part
ition 1 of customers, and so on. |
Parallel ex ecution of a full partition-wise join is a straightforward parallelization of the serial execution. Instead of joining one partition pair at a time, 16 partition pairs are joined in parallel by the 16 query servers. Figure 5-1 illustrates the parallel execution of a full partition-wise join.
Figure 5-1 Parallel Execution of a Full Partition-wise J oin

In Figure 5-1, assume that the degree of parallelism and the number of partitions are the same, in other wor ds, 16 for both. Defining more partitions than the degree of parallelism may improve load balancing and limit possible skew in the ex ecution. If you have more partitions than query servers, when one query server completes the join of one pair of partitions, it reque sts that the query coordinator give it another pair to join. This process repeats until all pairs have been processed. This method en ables the load to be balanced dynamically when the number of partition pairs is greater than the degree of parallelism, for example, 64 partitions with a degree of parallelism of 16.
|
|
In Oracle Real App lication Clusters environments running on shared-nothing or MPP platforms, placing partitions on nodes is critical to achieving good scalability. To avoid remote I/O, both matching partitions should have affinity to the same node. Partition pairs should be spread ov er all nodes to avoid bottlenecks and to use all CPU resources available on the system.
Nodes can host multiple pairs when the re are more pairs than nodes. For example, with an 8-node system and 16 partition pairs, each node receives two pairs.
|
See Also: Oracle Real Application Clusters Deployment and Performance Guide for more information on data affinity |
This method is a variation of the hash-hash method. The sales table is a typical
example of a table storing historical data. For all the reasons mentioned under the heading "When to Use Range Pa
rtitioning ", range is the logical initial partitioning method.
For example, assume you want to partition the sales<
/code> table into eight partitions by range on the column s_salesdate. Also assume you have two years and that each part
ition represents a quarter. Instead of using range partitioning, you can use composite partitioning to enable a full partition-wise j
oin while preserving the partitioning on s_salesdate. Partition the sales table by range on s_salesda
te and then subpartition each partition by hash on s_customerid using 16 subpartitions for each partition, for a
total of 128 subpartitions. The customers table can still use hash partitioning with 16 partitions.
When you use
the method just described, a full partition-wise join works similarly to the one created by the hash-hash method. The join is still d
ivided into 16 smaller joins between hash partition pairs from both tables. The difference is that now each hash partition in the
Hash partitions are implicit in a compo site table. However, Oracle does not record them in the data dictionary, and you cannot manipulate them with DDL commands as you can range partitions.
(Composite-Hash)-Hash partitioning
is effective because it lets you combine pruning (on s_salesdate) with a full partition-wise join (on customerid<
/code>). In the previous example query, pruning is achieved by scanning only the subpartitions corresponding to Q3 of 1999, in other
words, row number 3 in Figure 5-2. Oracle then joins these subpartitions with the customer table, using a ful
l partition-wise join.
All characteristics of the hash-hash partition-wise join apply to the composite-hash partition-wise joi n. In particular, for this example, these two points are common to both methods:
The degree of paralleli
sm for this full partition-wise join cannot exceed 16. Even though the sales table has 128 subpartitions, it has only 16
hash partitions.
The rules for data placement on MPP systems apply here. The only difference is that a
hash partition is now a collection of subpartitions. You must ensure that all these subpartitions are placed on the same node as the
matching hash partition from the other table. For example, in Figure 5-2, store hash partition 9 of the sales table shown by the eight circled subpartitions, on the same node as hash partition 9 of the customers tabl
e.
The (Composite-List)-List method resembles that for (Composit e-Hash)-Hash partition-wise joins.
If needed, you can
also partition the customer table by the composite method. For example, you partition it by range on a postal code colum
n to enable pruning based on postal code. You then subpartition it by hash on customerid using the same number of partit
ions (16) to enable a partition-wise join on the hash dimension.
You c an also join range partitioned tables with range partitioned tables and list partitioned tables with list partitioned tables in a par tition-wise manner, but this is relatively uncommon. This is more complex to implement because you must know the distribution of the data before performing the join. Furthermore, if you do not correctly identify the partition bounds so that you have partitions of eq ual size, data skew during the execution may result.
The basic principle for using range-range and list-list is the same as fo
r using hash-hash: you must equipartition both tables. This means that the number of partitions must be the same and the partition bo
unds must be identical. For example, assume that you know in advance that you have 10 million customers, and that the values for sales on s_customerid and customers on c_customer
id. You should define partition bounds for both tables in order to generate partitions of the same size. In this example, part
ition bounds should be defined as 625001, 1250001, 1875001, ... 10000001, so that each partition contains 625000 rows.
Oracle can perform partial partition-wise joins only in parallel. Unlike full partition-wise joins, partial partition-wise joins require you to partition only one table on the join key, not both tables. The partitioned table is referred to as the reference table. The other table may or may not be partitione d. Partial partition-wise joins are more common than full partition-wise joins.
To execute a partial partition-wise join, Orac le dynamically repartitions the other table based on the partitioning of the reference table. Once the other table is repartitioned, the execution is similar to a full partition-wise join.
The performance advantage that partial partition-wise joins have over joins in non-partitioned tables is that the reference table is not moved during the join operation. Parallel joins between non-partit ioned tables require both input tables to be redistributed on the join key. This redistribution operation involves exchanging rows be tween parallel execution servers. This is a CPU-intensive operation that can lead to excessive interconnect traffic in Oracle Real Ap plication Clusters environments. Partitioning large tables on a join key, either a foreign or primary key, prevents this redistributi on every time the table is joined on that key. Of course, if you choose a foreign key to partition the table, which is the most commo n scenario, select a foreign key that is involved in many queries.
To illustrate partial partition-wise joins, consider the pr
evious sales/customer example. Assume that sales is not partitioned or is partitioned on a column other than s_cus
tomerid . Because sales is often joined with customers on customerid, and because this
join dominates our application workload, partition sales on s_customerid to enable partial partition-wise j
oin every time customers and sales are joined. As in full partition-wise join, you have several alternative
s:
The simplest method to enable a partial partition-wise join is to partition sales by hash on
s_customerid. The number of partitions determines the maximum degree of parallelism, because the partition is the smallest gra
nule of parallelism for partial partition-wise join operations.
The parallel execution of a partial partition-wise join is ill
ustrated in Figure 5-3, which assumes that both the degree of parallelism and the number of partitions of customers table in parallel. The granule of parallelism for the scan operation is a range of block
s.
Rows from customers that are selected by the first set, in this case all rows, are redistributed to the second
set of query servers by hashing customerid. For example, all rows in customers that could have matching ro
ws in partition P1 of sales are sent to query server 1 in the second set. Rows received by the second set o
f query servers are joined with the rows from the corresponding partitions in sales. Query server number 1 in the second
set joins all customers rows that it receives with partition P1 of sales.
|
N ote: This section is based on range-hash, but it also applies for range-list partial partition-wise join s. |
Considerations for full partition-wise joins also appl y to partial partition-wise joins:
The degree of parallelism does not need to equal the number of partit ions. In Figure 5-3, the query executes with two sets of 16 query servers. In this case, Oracle assigns 1 par tition to each query server of the second set. Again, the number of partitions should always be a multiple of the degree of paralleli sm.
In Oracle Real Application Clusters environments on shared-nothing platforms (MPPs), each hash parti
tion of sales should preferably have affinity to only one node in order to avoid remote I/Os. Also, spread partitions ov
er all nodes to avoid bottlenecks and use all CPU resources available on the system. A node can host multiple partitions when there a
re more partitions than nodes.
|
See Also: Oracle Real Application Clusters Deployment and P erformance Guide for more information on data affinity |
As with full partition-wise joins, the prime partitioning method for the sales table is to use the range method on column s_salesdate. This is because sales is a typical ex
ample of a table that stores historical data. To enable a partial partition-wise join while preserving this range partitioning, subpa
rtition sales by hash on column s_customerid using 16 subpartitions for each partition. Pruning and partial
partition-wise joins can be used together if a query joins customers and sales and if the query has a sele
ction predicate on s_salesdate.
When sales is composite, the granule of parallelism for a partial pa
rtition-wise join is a hash partition and not a subpartition. Refer to Figure 5-2 for an illustration of a ha
sh partition in a composite table. Again, the number of hash partitions should be a multiple of the degree of parallelism. Also, on a
n MPP system, ensure that each hash partition has affinity to a single node. In the previous example, the eight subpartitions composi
ng a hash partition should have affinity to the same node.
|
Note: This section is based on range-hash, but it also applies for range-list part ial partition-wise joins. |
Finally, you can use range partitioning on s_customerid to enable a partial partition-wise join. This works similar
ly to the hash method, but a side effect of range partitioning is that the resulting data distribution could be skewed if the size of
the partitions differs. Moreover, this method is more complex to implement because it requires prior knowledge of the values of the
partitioning column that is also a join key.
Partition-wise joins offer benefits described in this section:
Partition-wis e joins require less memory than the equivalent join operation of the complete data set of the tables being joined.
In the cas e of serial joins, the join is performed at the same time on a pair of matching partitions. If data is evenly distributed across part itions, the memory requirement is divided by the number of partitions. There is no skew.
In the parallel case, memory requirem ents depend on the number of partition pairs that are joined in parallel. For example, if the degree of parallelism is 20 and the num ber of partitions is 100, 5 times less memory is required because only 20 joins of two partitions are performed at the same time. The fact that partition-wise joins require less memory has a direct effect on performance. For example, the join probably does not need to write blocks to disk during the build phase of a hash join.
The optimizer weighs the advantages and disadvantages when deciding whether or not to use partition-wise joins.
In range partitioning where partition sizes differ, data skew increases response time; some parallel execution servers take longer than others to finish their jo ins. Oracle recommends the use of hash (sub)partitioning to enable partition-wise joins because hash partitioning, if the number of p artitions is a power of two, limits the risk of skew.
The number of partitions used for partition-wise j oins should, if possible, be a multiple of the number of query servers. With a degree of parallelism of 16, for example, you can have 16, 32, or even 64 partitions. If there is an even number of partitions, some parallel execution servers are used less than others. For example, if there are 17 evenly distributed partition pairs, only one pair will work on the last join, while the other pairs will have to wait. This is because, in the beginning of the execution, each parallel execution server works on a different partition pair . At the end of this first phase, only one pair is left. Thus, a single parallel execution server joins this remaining pair while all other parallel execution servers are idle.
Sometimes, parallel joins can cause remote I/Os. For example , on Oracle Real Application Clusters environments running on MPP configurations, if a pair of matching partitions is not collocated on the same node, a partition-wise join requires extra internode communication due to remote I/O. This is because Oracle must transfe r at least one partition to the node where the join is performed. In this case, it is better to explicitly redistribute the data than to use a partition-wise join.
The partitioning columns (or subpartitioning columns) of a table or index consist of an ordered list of columns whose values determine how the data is partitioned or subpartitioned. This list can include up to 16 columns, and cannot include any of the follo wing types of columns:
A LEVEL or ROWID pseudocolumn
A column of the ROWID datatype
A nested table, VARRAY, object type, or
A LOB column (BLOB, CLOB, NCLOB, or BFILE datatype)
A row's partit ioning key is an ordered list of its values for the partitioning columns. Similarly, in composite partitioning a row's subpartitionin g key is an ordered list of its values for the subpartitioning columns. Oracle applies either the range, list, or hash method to each row's partitioning key or subpartitioning key to determine which partition or subpartition the row belongs in.
In a range-partitioned table or index, the partitioning key of each row is compared with a set of upper and lower bounds to determin e which partition the row belongs in:
Every partition of a range-partitioned table or index has a noninclusive upper bound, which is specified by the VAL
UES LESS THAN clause.
Every partition except the first partition also h
as an inclusive lower bound, which is specified by the VALUES LESS THAN on the next-lower part
ition.
The partition bounds collectively define an ordering of the partitions in a table or index. The first partiti
on is the partition with the lowest VALUES LESS THAN clause, and the last or highest partition
is the partition with the highest VALUES LESS THAN clause.
If you attempt to insert a row into a table and the row's partitioning key is greater than or equal to the partition bound for the highest partition in the table, the insert will fail.
When compar
ing character values in partitioning keys and partition bounds, characters are compared according to their binary values. However, if
a character consists of more than one byte, Oracle compares the binary value of each byte, not of the character. The comparison also
uses the comparison rules associated with the column data type. For example, blank-padded comparison is done for the ANSI CHAR
data type. <
/a>The NLS parameters, specifically the initialization
parameters NLS_SORT and NLS_LANGUAGE and the environment variable NLS_LANG, have no effect on
the comparison.
The binary value of character data varies depending on which character set is being used (for example, ASCII o r EBCDIC). For example, ASCII defines the characters A through Z as less than the characters a through z, whereas EBCDIC defines A th rough Z as being greater than a through z. Thus, partitions designed for one sequence will not work with the other sequence. You must repartition the table after importing from a table using a different character set.
You can specify the keyword MAXVALUE for any value in the partition bound value
_list. This keyword represents a virtual infinite value that sorts higher than any other value for the data type, including the
NULL value.
For example, you might partition the OFFICE table on STATE (a CHAR(10
) column) into three partitions with the following partition bounds:
NULL cannot be specified as a value in a partition bound value_list. A
n empty string also cannot be specified as a value in a partition bound value_list, because it is treated as NULL within
the database server.
For the purpose of assigning rows to partitions, Oracle Database sorts nulls greater than all other valu
es except MAXVALUE. Nulls sort less than MAXVALUE.
This means that if a table is partitioned on a nu
llable column, and the column is to contain nulls, then the highest partition should have a partition bound of MAXVALUE
for that column. Otherwise the rows that contain nulls will map above the highest partition in the table and the insert will fail.
If the partition key includes a column that has the D
ATE datatype and the NLS date format does not specify the century with the year, you must specify partition bounds using the <
code>TO_DATE function with a 4-character format mask for the year. Otherwise, you will not be able to create the table or inde
x. For example, with the sales_range table using a DATE column:
CREATE TABLE
sales_range
(salesman_id NUMBER(5),
salesman_name VARCHAR2(30),
sales_amount NUMBER(10),
sales_date DATE)
COMPRESS
PARTITION
BY RANGE(sales_date)
(PARTITION sales_jan2000 VALUES LESS THAN(TO_DATE('02/01/2000','DD/MM/YYYY')),
PARTITION sales_feb2000 VALUES L
ESS THAN(TO_DATE('03/01/2000','DD/MM/YYYY')),
PARTITION sales_mar2000 VALUES LESS THAN(TO_DATE('04/01/2000','DD/MM/YYYY')),
PARTITI
ON sales_apr2000 VALUES LESS THAN(TO_DATE('05/01/2000','DD/MM/YYYY')));
When you query or modify data, it is recommended that you use the
TO_DATE function in the WHERE clause so that the value of the date information can be determined at compil
e time. However, the optimizer can prune partitions using a selection criterion on partitioning columns of type DATE whe
n you use another format, as in the following examples:
SELECT * FROM sales_range
WHERE sales_date BE
TWEEN TO_DATE('01-JUL-00', 'DD-MON-YY')
AND TO_DATE('01-OCT-00', 'DD-MON-YY');
SELECT * FROM sales_range
WHERE sales_date BE
TWEEN '01-JUL-2000' AND '01-OCT-2000';
In t
his case, the date value will be complete only at runtime. Therefore you will not be able to see which partitions Oracle is accessing
as is usually shown on the partition_start and partition_stop columns of the EXPLAIN PL
AN statement output on the SQL statement. Instead, you will see the keyword KEY for both columns.
When a table or index is partitioned by range on multiple columns, each partition bound and parti tioning key is a list (or vector) of values. The partition bounds and keys are ordered according to ANSI SQL2 vector comparison rules . This is also the way Oracle orders multicolumn index keys.
To compare a partitioning key with a partition bound, you compare the values of their corresponding columns until you find an unequal pair and then that pair determines which vector is greater. The values of any remaining columns have no effect on the comparison.
|
See Also: Oracle Database Administrator's Guide for more information regarding multicolumn partitioning keys |
The rules for partitionin g indexes are similar to those for tables:
An index can be partitioned unless:
The index is a cluster index
The index is defined on a clustered table.
You can mix partitioned and nonpartitioned indexes with partitioned and nonpartitioned tables:
A partitioned table can have partitioned or nonpartitioned indexes.
A nonpartitioned table can have partitioned or nonpartitioned B-tree indexes.
Bitmap indexes on nonpartitioned tables ca nnot be partitioned.
A bitmap index on a partitioned table must be a local index.
Howev er, partitioned indexes are more complicated than partitioned tables because there are three types of partitioned indexes:
Local prefixed
Local nonprefixed
Global prefixed
These types are described in the following section. Oracle supports all three types.
In a local index, al
l keys in a particular index partition refer only to rows stored in a single underlying table partition. A local index is created by
specifying the LOCAL attribute.
Oracle constructs the local index so that it is equipartitioned with the underlying table. Oracle partitions the index on the same columns as the underlying table, creates the same number of partitions or subpartitions, and gives them the same partition bounds as corresponding partitions of the underlying table.
Oracle also maintains the index partitioni ng automatically when partitions in the underlying table are added, dropped, merged, or split, or when hash partitions or subpartitio ns are added or coalesced. This ensures that the index remains equipartitioned with the table.
A local index can be created
Local indexes have the following advantages:
Only one index partition needs to be rebuilt when a maintenance operation other than
SPLIT PARTITION or ADD PARTITION is performed on an underlying table partition.
The duration of a partition maintenance operation remains proportional to partition size if the partitioned tab le has only local indexes.
Local indexes support partition independence.
Lo cal indexes support smooth roll-out of old data and roll-in of new data in historical tables.
Oracle can take advantage of the fact that a local index is equipartitioned with the underlying table to generate better query access plans.
Local indexes simplify the task of tablespace incomplete recovery. In order to recover a partition or subp artition of a table to a point in time, you must also recover the corresponding index entries to the same point in time. The only way to accomplish this is with a local index. Then you can recover the corresponding table and index partitions or subpartitions togethe r.
|
See Also: < a class="olinkSRC ARPLS" href="../../appdev$101/b10802/toc.htm">PL/SQL Packages and Types Reference for a description of theDBMS_PCLXUTIL package |
A local index is prefixed if it is partition
ed on a left prefix of the index columns. For example,
if the sales table and its local index sales_ix are partitioned on the week_num column, then i
ndex sales_ix is local prefixed if it is defined on the columns (week_num, xaction_num). On th
e other hand, if index sales_ix is defined on column product_num then it is not prefixed.
Local pref ixed indexes can be unique or nonunique.
Figure 5-4 illustrates another example of a local prefixed in dex.
A local index is nonprefixed if it is not partitioned on a left prefix of the index columns.
You cannot have a unique local nonprefixed index unless the partitioning key is a subset of the index ke y.
Figure 5-5 illustrates an example of a local nonprefixed index.
In a global partitioned index, the keys in a particular index partition may refer to rows stored in more than one underlying table p artition or subpartition. A global index can be range or hash partitioned, though it can be defined on any type of partitioned table.
A global index is created by specifying the GLOBAL attribute. The database administrator is responsible for defi
ning the initial partitioning of a global index at creation and for maintaining the partitioning over time. Index partitions can be m
erged or split as necessary.
Normally, a global index is not equipartitioned with the underlying table. There is nothing to pr
event an index from being equipartitioned with the underlying table, but Oracle does not take advantage of the equipartitioning when
generating query plans or executing partition maintenance operations. So an index that is equipartitioned with the underlying table s
hould be created as LOCAL.
A global partitioned index contains a single B-tree with entries for all rows in all p artitions. Each index partition may contain keys that refer to many different partitions or subpartitions in the table.
The hi
ghest partition of a global index must have a partition bound all of whose values are MAXVALUE. This insures that all ro
ws in the underlying table can be represented in the index.
A global partitioned index is prefixed if it is partitioned on a left prefix of the index columns. See Figure 5-6 for an example. A global partitioned index is nonprefixed if it is not partitioned on a left prefix of the index columns. Oracle does not support global nonprefixed partitioned indexes.
Global prefixed partitioned indexes can be unique or nonunique.
Nonpartitioned indexes are treated as global prefixed nonpartitioned indexes.
Global partitioned indexes are harder to manage than local indexes:
When the data in an underlying table partition is moved or removed (SPLIT, MOVE, DROP, or
When an underlying table partition or subpartition is recovered to a point in time, all corresponding e ntries in a global index must be recovered to the same point in time. Because these entries may be scattered across all partitions or subpartitions of the index, mixed in with entries for other partitions or subpartitions that are not being recovered, there is no wa y to accomplish this except by re-creating the entire global index.
< a href="#g1010065">Table 5-2 summarizes the types of partitioned indexes that Oracle supports.
If an index is local, it is equipartitioned with the underlying table. Otherwise, it is global.
A prefixed in dex is partitioned on a left prefix of the index columns. Otherwise, it is nonprefixed.
Ta ble 5-2 Types of Partitioned Indexes
| Ty pe of Index | Ind ex Equipartitioned with Table | Index Partitioned on Left Prefix of Index Columns | UNIQUE Attribute Allowed | Example: Table Partitioning Key |
Example: Index Columns |
Example: Index Partitioning Ke y |
|---|---|---|---|---|---|---|
| Local Pr efixed (any partitioning method) | Yes | Yes | Yes | A | A, B | A |
| Local Nonprefixed (any partitioning method) | Yes | No | Yes (Note1) | B, A | A | |
| Global Prefixed (range partitioning only) | No (Note2) | Yes | Yes | A | B | B |
Note 1: For a unique local nonprefixed index, the partitioning key must be a subset of the index key.
Note 2
: Although a global partitioned index may be equipartitioned with the underlying table, Oracle does not take advantage of the partiti
oning or maintain equipartitioning after partition maintenance operations such as DROP or SPLIT PARTI
TION.
Nonprefixed indexes are particularly useful in historical databases. In a table containing historica l data, it is common for an index to be defined on one column to support the requirements of fast access by that column, but partitio ned on another column (the same column as the underlying table) to support the time interval for rolling out old data and rolling in new data.
Consider a sales table partitioned by week. It contains a year's worth of data, divided into 13 partiti
ons. It is range partitioned on week_no, four weeks to a partition. You might create a nonprefixed local index sal
es_ix on sales. The sales_ix index is defined on acct_no because there are queries that
need fast access to the data by account number. However, it is partitioned on week_no to match the sales t
able. Every four weeks, the oldest partitions of sales and sales_ix are dropped and new ones are added.
It is more expensive to probe into a nonprefixed index than to probe into a prefixed index. p>
If an index is prefixed (either local or global) and Oracle is presented with a predicate involving the index columns, then partition pruning can restrict application of the predicate to a subset of the index partitions.
For examp
le, in Figure 5-4, if the predicate is deptno=15, the optimizer knows to apply the predicate onl
y to the second partition of the index. (If the predicate involves a bind variable, the optimizer will not know exactly which partiti
on but it may still know there is only one partition involved, in which case at run time, only one index partition will be accessed.)
When an index is nonprefixed, Oracle often has to apply a predicate involving the index columns to all N index p
artitions. This is required to look up a single key, or to do an index range scan. For a range scan, Oracle must also combine informa
tion from N index partitions. For example, in Figure 5-5, a local index is partitioned on
chkdate with an index key on acctno. If the predicate is acctno=31, Oracle probes all 12 index parti
tions.
Of course, if there is also a predicate on the partitioning columns, then multiple index probes might not be necessary.
Oracle takes advantage of the fact that a local index is equipartitioned with the underlying table to prune partitions based on the
partition key. For example, if the predicate in Figure 5-4 is chkdate<3/97, Oracle only has t
o probe two partitions.
So for a nonprefixed index, if the partition key is a part of the WHERE clause but not of
the index key, then the optimizer determines which index partitions to probe based on the underlying table partition.
When ma ny queries and DML statements using keys of local, nonprefixed, indexes have to probe all index partitions, this effectively reduces the degree of partition independence provided by such indexes.
Table 5-3 Comparing Prefixed Local, Non prefixed Local, and Global Indexes
| Index Characteristics | Prefixed Local | Nonprefixed Local | Global |
|---|---|---|---|
| Unique possible? | Yes | Yes | Yes. Must be global if using indexes on columns other than the partitioning columns |
| Manageability | Easy to manage | Easy to manage | Harder to manage |
| OLTP | Good | Bad | Good |
| Long Running (DSS) | Good | Good | Not Good |
When deciding how to partition indexes on a table, consider the mix of applicati ons that need to access the table. There is a trade-off between performance on the one hand and availability and manageability on the other. Here are some of the guidelines you should consider:
Global indexes and local prefixed indexes provide better performance than local non prefixed indexes because they minimize the number of index partition probes.
Local indexes support mor e availability when there are partition or subpartition maintenance operations on the table. Local nonprefixed indexes are very usefu l for historical databases.
For DSS applications, local nonprefixed indexes can improve performance because many index partitions can be scanned in parallel by range queries on the index k ey.
For example, a query using the predicate "acctno between 40 and 45" on the table checks of Figure 5-4 causes parallel scans of all the partitions of the nonprefixed index ix3. On the other han
d, a query using the predicate "deptno BETWEEN 40 AND 45" on the table deptno of Figure
5-5 cannot be parallelized because it accesses a single partition of the prefixed index ix1.
For historical tables, indexes should be local if possible. This limits the impact of regularly scheduled drop partition opera tions.
Unique indexes on columns other than the partitioning columns must be global because unique local nonprefixed indexes whose key does not contain the partitioning key are not supported.
Default physical attributes are initially specified when a CRE
ATE INDEX statement creates a partitioned index. Because there is no segment corresponding to the partitioned ind
ex itself, these attributes are only used in derivation of physical attributes of member partitions. Default physical attributes can
later be modified using ALTER INDEX MODIFY DEFAULT ATTRIBUTES.
CREATE INDEX are determined as follows:
Values of physical attributes specified (explicitly or by default) for the index are used whenever the value of a correspond
ing partition attribute is not specified. Handling of the TABLESPACE attribute of partitions of a LOCAL ind
ex constitutes an important exception to this rule in that in the absence of a user-specified TABLESPACE value (at both
partition and index levels), that of the corresponding partition of the underlying table is used.
Physic
al attributes (other than TABLESPACE, as explained in the preceding) of partitions of local indexes created in the cours
e of processing ALTER TABLE ADD PARTITION are set to the default physical attribu
tes of each index.
Physical attributes (other than TABLESPACE) of index partitions created by ALT
ER TABLE SPLIT PARTITION are determined as follows:
Values of physical attributes of the index partition being split are used.
Physical attributes of an existing index partit
ion can be modified by ALTER INDEX MODIFY PARTITION and ALTER INDEX
REBUILD PARTITION. Resulting attributes are determined as follows:
Values of p
hysical attributes of the partition before the statement was issued are used whenever a new value is not specified. Note that A
LTER INDEX REBUILD PARTITION can be used to change the tablespace in which a partition resides.
Physical attributes of global index partitions created by ALTER INDEX SPLIT PARTITION are determined as follows:
Values of physical attributes of the partition being split are used when ever a new value is not specified.
Physical attributes of all partitions of an index (along with default
values) may be modified by ALTER INDEX, for example, ALTER INDEX indexname
NOLOGGING changes the logging mode of all partitions of indexname to NOLOGGING.
|
See Also: < a class="xlinkSRC ADMIN017" href="../../server$101/b10739/partiti.htm#ADMIN017">Oracle Database Administrator's Guide fo r more detailed examples of adding partitions and examples of rebuilding indexes |