A COMPARATIVE STUDY OF PARTITIONING TECHNIQUES: INTRODUCTION

INTRODUCTION

Modern enterprises frequently run mission-critical databases containing upwards of several hundred gigabytes and, in many cases, several terabytes of data. These enterprises are challenged by the support and maintenance requirements of very large databases (VLDB), and must devise methods to meet those challenges. It is an environment or storage space managed by a relational database management system (RDBMS) consisting of vast quantities of information. It is sometimes used to describe databases occupying magnetic storage in the terabyte range and containing billions of table rows. Typically, these are decision support systems or transaction processing applications serving large numbers of users.

ISSUES WITH VLDB

Backing up the database. With a VLDB, a daily backup of everything via RMAN or Hot Backup is simply not possible, as it can’t run the backup in 24 hours. It is required to use hardware such as mirror splitting or deltas.

Performance. It is important to consider radical changes such as removing RI or designing around full table scans and ignoring the block buffer cache for the largest tables.

The number or size of objects starts causing bits of Oracle to break or work less efficiently (so many tables it takes 2 minutes to select them all or it will hit an unexpected limit like the 2TB disk size in ASM, because it is need to use bigger disc sizes).

Maintenance tasks become a challenge in their own right. This could be stats gathering, it could be adding columns to a table, it could be recreating global indexes, all of which now take more time than it can schedule the maintenance windows {so part of the definition of a VLDB could be down to how active a database is and how small the windows are – 1TB could be a VLDB).

PARTITIONING

Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions, which are entirely transparent to an application. SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, after partitions are defined, DDL statements can access and manipulate individual partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects.

Each partition of a table or index must have the same logical attributes, such as column names, data types, and constraints, but each partition can have separate physical attributes such as compression enabled or disabled, physical storage settings, and tablespaces.

Partitioning is useful for many different types of applications, particularly applications that manage large volumes of data. OLTP systems often benefit from improvements in manageability and availability, while data warehousing systems benefit from performance and manageability.

Partitioning offers these advantages

It enables data management operations such as data loads, index creation and rebuilding, and backup/recovery at the partition level, rather than on the entire table. This results in significantly reduced times for these operations.

It improves query performance. In many cases, the results of a query can be achieved by accessing a subset of partitions, rather than the entire table. For some queries, this technique (called partition pruning) can provide order-of-magnitude gains in performance.

It significantly reduces the impact of scheduled downtime for maintenance operations. Partition independence for partition maintenance operations lets you perform concurrent maintenance operations on different partitions of the same table or index. You can also run concurrent SELECT and DML operations against partitions that are unaffected by maintenance operations.

It increases the availability of mission-critical databases if critical tables and indexes are divided into partitions to reduce the maintenance windows, recovery times, and impact of failures.

Parallel execution provides specific advantages to optimize resource utilization, and minimize execution time. Parallel execution against partitioned objects is key for scalability in a clustered environment. Parallel execution is supported for queries as well as for DML and DDL.

Partitioning enables faster data access within an Oracle database. Whether a database has 10 GB or 10 TB of data, partitioning can improve data access by orders of magnitude. Partitioning can be implemented without requiring any modifications to your applications. For example, for converting a nonpartitioned table to a partitioned table without needing to modify any of the SELECT statements or DML statements which access that table. It is not necessary to rewrite the application code to take advantage of partitioning.