A new technical white paper on Exadata Hybrid Columnar Compression (EHCC) explains what Exadata compression is and how it works. Considered an important feature for data warehouse customers, Exadata compression increases performance while reducing the overall cost of storage.
Warehouse compression and archive compression are two new Oracle Exadata Storage Server features highlighted in this eight-page Oracle white paper, dated December 2009. Oracle claims average storage savings with EHCC range from 10x to 15x, depending on which feature is implemented; customer benchmarks have shown storage savings of up to 204x, the company reports.
Oracle’s EHCC technology is a new method for organizing data within a database block. As the name implies, it combines row and columnar methods for storing data. This hybrid approach achieves the compression benefits of columnar storage while avoiding the performance shortfalls of a pure columnar format, the authors write. A logical construct called the compression unit stores a set of Exadata Hybrid Columnar-compressed rows. When data is loaded, column values are detached from the set of rows, ordered and grouped together, and then compressed. Once the column data for a set of rows has been compressed, it is fitted into the compression unit.
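The loading steps described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Oracle's implementation: the function names, the dictionary layout of the "unit", and the use of JSON and zlib are all assumptions made for the example, and column values are simply kept in row order so that rows can be reassembled.

```python
import json
import zlib


def build_compression_unit(rows):
    """Pivot a set of rows into columns and compress each column on its own.

    Hypothetical layout: the article does not describe the real on-disk format.
    """
    if not rows:
        return {"columns": [], "row_count": 0, "data": []}
    columns = list(rows[0].keys())
    data = []
    for name in columns:
        # Detach this column's values from the rows and group them together.
        values = [row[name] for row in rows]
        # zlib stands in for whatever compression algorithm is really used.
        data.append(zlib.compress(json.dumps(values).encode("utf-8")))
    return {"columns": columns, "row_count": len(rows), "data": data}


def read_row(unit, index):
    """Rebuild one complete row using only the contents of a single unit."""
    row = {}
    for name, blob in zip(unit["columns"], unit["data"]):
        values = json.loads(zlib.decompress(blob).decode("utf-8"))
        row[name] = values[index]
    return row


def unit_size(unit):
    """Total compressed size of the unit's column data, in bytes."""
    return sum(len(blob) for blob in unit["data"])


# Made-up sample data with repetitive column values.
rows = [
    {"order_id": i, "region": ["EAST", "WEST"][i % 2], "status": "SHIPPED"}
    for i in range(1000)
]
unit = build_compression_unit(rows)
raw_size = len(json.dumps(rows).encode("utf-8"))
```

Because every column of the row set lives in the same unit, `read_row(unit, 7)` needs nothing outside that unit, which is the property the paper relies on for row-level access.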
The paper asserts that a key benefit of the EHCC approach is that it provides both the compression and performance benefits of columnar storage without sacrificing the feature set of the Oracle Database. For example, although the format is optimized for scan-level access, row data is self-contained within compression units, so Oracle can still provide efficient row-level access, with entire rows typically retrieved in a single I/O. In contrast, pure columnar formats require at least one I/O per column for row-level access.
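The I/O comparison can be made concrete with a toy model that counts reads. Everything here is hypothetical: `CountingStore`, the block keys, and the 25-row unit size are invented for the sketch, and each `read` call is treated as one I/O. A real compression unit may not map to exactly one I/O, which is why the paper says "typically".

```python
class CountingStore:
    """A toy block store that counts how many reads (I/Os) it serves."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0

    def write(self, key, value):
        self.blocks[key] = value

    def read(self, key):
        self.reads += 1
        return self.blocks[key]


NUM_COLUMNS = 3
UNIT_ROWS = 25  # assumed number of rows per compression unit

rows = [(i, f"cust{i}", i * 10) for i in range(100)]

# Pure columnar layout: each column is stored in its own block.
columnar = CountingStore()
for c in range(NUM_COLUMNS):
    columnar.write(("col", c), [r[c] for r in rows])

# Hybrid layout: each unit holds every column for one set of rows.
hybrid = CountingStore()
for start in range(0, len(rows), UNIT_ROWS):
    chunk = rows[start:start + UNIT_ROWS]
    unit = [[r[c] for r in chunk] for c in range(NUM_COLUMNS)]
    hybrid.write(("cu", start // UNIT_ROWS), unit)


def fetch_columnar(store, i):
    """Row lookup touches one block per column."""
    return tuple(store.read(("col", c))[i] for c in range(NUM_COLUMNS))


def fetch_hybrid(store, i):
    """Row lookup touches only the unit that contains the row."""
    unit = store.read(("cu", i // UNIT_ROWS))
    return tuple(column[i % UNIT_ROWS] for column in unit)


row_a = fetch_columnar(columnar, 42)
row_b = fetch_hybrid(hybrid, 42)
```

After these two lookups, `columnar.reads` equals the number of columns while `hybrid.reads` is 1, and the gap widens as tables get wider.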
Warehouse Compression
Warehouse compression provides two levels of compression: low and high. According to the paper, warehouse compression high typically provides a 10x reduction in storage, while warehouse compression low typically provides a 6x reduction. Both levels have been optimized to increase scan query performance by taking advantage of the smaller number of blocks on disk.
Archive Compression
Archive compression typically achieves a compression ratio of 15:1 (15x), the paper states. That is, an uncompressed table or partition would require 15x more storage than a table or partition using archive compression.
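To show what these ratios mean in practice, here is a small arithmetic sketch. The ratios are the typical figures quoted from the paper; the helper names and the 1,500 GB table size are made up for illustration, and actual results depend on the data.

```python
# Typical ratios as quoted from the white paper; real results vary with the data.
TYPICAL_RATIOS = {
    "warehouse low": 6,
    "warehouse high": 10,
    "archive": 15,
}


def compressed_size_gb(uncompressed_gb, ratio):
    """Storage needed after compression at the given ratio (e.g. 15 for 15x)."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return uncompressed_gb / ratio


def savings_percent(ratio):
    """Share of the original storage that is no longer needed."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return (1 - 1 / ratio) * 100


# Hypothetical 1,500 GB uncompressed table.
sizes = {
    level: compressed_size_gb(1500, ratio)
    for level, ratio in TYPICAL_RATIOS.items()
}
```

At the quoted typical ratios, the hypothetical 1,500 GB table would need 250 GB with warehouse compression low, 150 GB with warehouse compression high, and 100 GB with archive compression.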
In contrast to warehouse compression, archive compression is a pure storage-saving technology. Tables or partitions that use archive compression will typically experience a decrease in performance, a consequence of the compression algorithm being optimized for maximum storage savings. Archive compression is therefore intended for tables or partitions that store rarely accessed data. Databases supporting any application workload, including OLTP and data warehouses, can use archive compression to reduce the storage requirements of historical data.
More Information
Exadata Hybrid Columnar Compression - Oracle white paper
Sun Oracle Database Machine and Exadata Storage Server
What's New in Oracle Exadata V2?
Exadata Partner Program
Sun Oracle Database Machine with Sun FlashFire Technology
The Value of Sun Oracle Database Machine
Technical Overview of the Exadata Product Family