System News
Testing File System Compression on the Sun Unified Storage System 7000
It's Free and Part of ZFS So Why Not Use It?
March 26, 2009
Volume 133, Issue 4

ZFS and the Sun Unified Storage System 7000: Compression, free and easy to use
 

While built-in filesystem compression might not be the greatest thing since sliced bread in blogger Dave Pacheco's view, he does assert that it is far superior to placing an external appliance between storage and clients to do the job. In his blog post, "Compression on the Sun Storage 7000," he presents the results of an experiment that demonstrates the utility of the built-in compression feature of ZFS.

Enabling compression on the Sun Storage 7000 series is simple, Pacheco explains: create a share and modify its properties. From then on, all new data written to the share is compressed with the specified algorithm, no matter its source. Turning compression off is just as simple: select "Off" from the drop-down menu.

Pacheco concedes that compression trades CPU utilization for disk space. Because it reduces the space used, however, it also reduces disk I/O, so the net result may even be a performance improvement. In any case, with the four quad-core 2GHz Opteron processors in a server like the 7410, there is plenty of room for both compression and performance, he writes.

The blog presents the results of tests Pacheco performed using the 7410 and a basic workload consisting of 10 clients, each of them writing 3GB to a share and then reading it back for a total of 30GB in each direction. "This fits entirely in the client's DRAM, but it's about twice the size of the server's total memory. While each client has its own share, they all use the same compression level for each run, so only one level is tested at a time," he points out.

The test included each of the compression levels supported on the 7000 series: lzjb, gzip-2, gzip (which is gzip-6), gzip-9, and none. Pacheco also used two data sets: 'text' (copies of /usr/dict/words, which is fairly compressible) and 'media' (copies of the Fishworks code swarm video, which is not very compressible).
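To get a feel for how the gzip levels and the two kinds of data differ, the following is a minimal Python sketch, not Pacheco's test harness. It uses the standard zlib module at levels 2, 6, and 9 as a rough stand-in for gzip-2, gzip, and gzip-9; lzjb is not available in the Python standard library and is left out. The two sample buffers are assumptions chosen only to mimic compressible text and incompressible media.

```python
import os
import time
import zlib


def compare_levels(data: bytes, levels=(2, 6, 9)) -> dict:
    """Return {level: (compression_ratio, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        results[level] = (len(data) / len(compressed), elapsed)
    return results


# Stand-ins for the two data sets (assumptions, not the files used in the blog):
# repetitive word-like text compresses well; random bytes do not.
text_like = b"".join(b"word%d\n" % (i % 5000) for i in range(200_000))
media_like = os.urandom(1_000_000)

for name, data in (("text", text_like), ("media", media_like)):
    for level, (ratio, seconds) in compare_levels(data).items():
        print(f"{name:5s} gzip-{level}: ratio {ratio:.2f}, {seconds * 1000:.1f} ms")
```

Timings from a script like this depend entirely on the machine running it and say nothing about appliance performance; the point is only that higher levels cost more CPU time and that incompressible input gains little at any level.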

Pacheco saw similar results with anywhere between 3 and 30 clients (at the same total write/read throughput, so the amount of data handled per client varied). The results were also similar whether or not each client had its own share.

The "unsurprising" results showed that, at the NFS and network levels, the experiments basically appear the same, except that the writes are spread out over a longer period for higher compression levels, while read times are pretty much unchanged across all compression levels. The total NFS and network traffic should be the same for all runs. Pacheco also found that CPU usage increases with higher compression levels, but caps out at about 50%, which he is at a loss to explain.

The author asks his readers to keep in mind that the disk throughput rate is twice that of the data actually being read and written, because the storage is mirrored. He expected the disk bytes written and read to decrease as the compression level increases, because the data being written and read is more highly compressed. Having collected similar data for the media (largely incompressible) data set, he found several important differences, among them that at higher compression levels each workload took less time than the corresponding text one.
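The relationship between logical data, mirroring, and compression can be sketched as simple arithmetic. The Python function below is an illustrative model only: the name disk_bytes and the compression ratios are hypothetical, and the model ignores metadata and any other overhead.

```python
def disk_bytes(logical_bytes: float, compression_ratio: float,
               mirror_copies: int = 2) -> float:
    """Estimate physical bytes hitting disk for a mirrored, compressed share.

    compression_ratio is uncompressed size divided by compressed size,
    so 1.0 means no compression. Metadata overhead is ignored.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return logical_bytes * mirror_copies / compression_ratio


GB = 10**9
logical = 30 * GB  # total written by the 10 clients in the test

# The ratios below are made-up examples, not measured results.
for label, ratio in (("none", 1.0), ("hypothetical 2:1", 2.0), ("hypothetical 3:1", 3.0)):
    print(f"{label:18s} {disk_bytes(logical, ratio) / GB:.0f} GB to disk")
```

With no compression, the 30GB written by the clients becomes 60GB on mirrored disks; under this model a 2:1 ratio would bring that back down to 30GB.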

He summarizes his results as follows:

  • Read performance is generally unaffected by compression.
  • lzjb can deliver decent space savings, and it performs well whether or not it is able to generate much savings.
  • Even modest gzip imposes a noticeable performance hit, whether or not it reduces I/O load.
  • gzip-9 in particular can spend a lot of extra time for marginal gain.

He concludes that "compression is free, built-in, and very easy to enable on the 7000 series. The performance effects vary based on the workload and compression algorithm, but powerful CPUs allow compression to be used even on top of serious loads. Moreover, the appliance provides great visibility into overall system performance and effectiveness of compression, allowing administrators to see whether compression is helping or hurting their workload."

More Information

OpenSolaris Community: ZFS

Sun Storage 7000 Unified Storage Systems



