Curious about how MySQL would perform on the Sun Storage 7110 and 7210, David Lutz found that MySQL performance on the Sun Storage 7000 line is excellent, even on the entry-level Sun Storage 7110, adding that, for MySQL on Linux over NFS, performance is great right out of the box. Lutz also praises the capacious addressable memory available on the Sun Storage 7000 product line, most of which is used as cache.
Lutz found that with a database working set (the subset of data that clients are actively using) that fits in the cache of the Unified Storage appliance, MySQL read-only performance on Linux and Solaris NFS was primarily bound by network bandwidth. In his testing, the sustained data throughput was running very near network line rate for the 1Gb NIC.
He writes that this resulted in 2x to 3x the Sysbench throughput for the MySQL/InnoDB/sysbench server using the Sun Storage 7110, compared to the same server when it was attached via Fibre Channel to a traditional HW RAID array with 2GB memory (providing enough cache for less than 1/3 of the working set). The read-only results were strong across the board for Linux and Solaris over both NFS and iSCSI, he notes.
Even more impressive were the results on the 7210, where Lutz observed ~90% linear scaling for 1 to 6 MySQL/sysbench servers over NFS, when configured with a 10Gb NIC on the Unified Storage appliance and 1Gb NICs in each DB server. Analytics showed that CPU and network in the 7210 were both at approximately 50% utilization during this test, and the working set used 36GB out of the available ~60GB of cache, so it is likely that there could have been similar scaling to 8 or 10 DB servers.
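The headroom estimate for the 7210 can be reproduced with some simple arithmetic. The following is only a rough sketch: it assumes that CPU, network, and cache consumption all grow linearly with the number of DB servers and that each server contributes an equal share of the 36GB working set. Neither assumption is stated by Lutz, and the function and variable names are hypothetical.

```python
# Rough headroom estimate from the figures reported for the 7210 test.
# Assumption (not from the article): every resource scales linearly
# with the number of DB servers.
servers_tested = 6
cpu_utilization = 0.50      # approximate, as reported by Analytics
network_utilization = 0.50  # approximate, as reported by Analytics
working_set_gb = 36
cache_gb = 60               # approximate available cache


def max_servers(used_fraction, servers):
    """Servers supportable if utilization grows linearly to 100%."""
    return int(servers / used_fraction)


# Assumption: each server contributes an equal share of the working set.
per_server_working_set_gb = working_set_gb / servers_tested

limits = {
    "cpu": max_servers(cpu_utilization, servers_tested),
    "network": max_servers(network_utilization, servers_tested),
    "cache": int(cache_gb // per_server_working_set_gb),
}

# The resource that runs out first bounds the estimate.
bottleneck = min(limits, key=limits.get)
print(limits, bottleneck)
```

Under these assumptions the cache would be exhausted first, at about 10 servers, which is consistent with the "8 or 10 DB servers" estimate above.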
The author points out that the only tuning applied was at the Unified Storage appliance end, where the record size of the NFS share used for InnoDB tablespace storage was set to 16KB. The record size was left at its default of 128KB for the NFS share that contained the InnoDB logs, he adds, and no special network or NFS tuning was applied to either the DB server or the appliance.
Lutz infers from his results that, if users have an active working set that exceeds the cache capacity in the Unified Storage appliance, their random read performance will eventually be bound by the IOPS rate of the underlying storage. For the Sun Storage 7110 and 7210, which do not have Read Flash Accelerator options like the 7310 and 7410, that means the IOPS rate of the disks. In a MySQL/InnoDB/sysbench test on the 7110, with an aggregate working set of 36GB (180GB aggregate table space), resulting in a 40% cache hit rate in the appliance, Lutz writes that he observed roughly 4000 NFS reads per second and roughly 2700 resulting disk IOPS. That translates to 230 IOPS for each of the 12 data drives. This test used only about 15% of the drive capacity, so the disks were "short stroked" to deliver better IOPS rates than would have been the case with longer average seeks. For example, at 80% capacity results might have been on the order of 150 to 180 IOPS per drive.
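The per-drive figures follow directly from the reported totals. The sketch below simply divides the reported disk IOPS across the 12 data drives and scales the 150 to 180 IOPS per-drive estimate back up to an aggregate. The variable names are illustrative, and the inputs are the approximate values quoted above, so the result lands slightly below the 230 quoted above.

```python
# Per-drive arithmetic from the approximate figures reported for the 7110 test.
disk_iops_total = 2700  # "roughly 2700 resulting disk IOPS"
data_drives = 12

iops_per_drive = disk_iops_total / data_drives
print(f"observed: about {iops_per_drive:.0f} IOPS per drive")

# Aggregate disk IOPS implied by the 150 to 180 IOPS per-drive estimate
# given for drives filled to 80% capacity.
low_per_drive, high_per_drive = 150, 180
aggregate_range = (low_per_drive * data_drives, high_per_drive * data_drives)
print(f"at 80% capacity: {aggregate_range[0]} to {aggregate_range[1]} disk IOPS")
```

In other words, the same 12 drives at 80% capacity might deliver roughly 1800 to 2160 disk IOPS in aggregate, rather than the roughly 2700 observed with short-stroked disks.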
Lutz writes of his pleasant surprise in discovering that, despite the lack of Write Flash Accelerators in the 7110, he observed over 2300 reads plus 400 writes per second on the 7110, with a single MySQL/InnoDB/sysbench server accessing a 6GB working set (20GB total table space) via NFS over a 1Gb NIC. This resulted in 1.5x the Sysbench throughput for the MySQL/InnoDB/sysbench server using the Sun Storage 7110, he writes, compared to the same server when it was attached via Fibre Channel to a traditional HW RAID array with 2GB memory (providing enough cache for less than 1/3 of the working set).
Lutz speculates that, as with the read-only test, the large read cache in the 7110 compared to the HW RAID array probably played a big role here. The 85% cache hit rate in the appliance provided an advantage on read performance, which offset the potential advantage of the battery-backed write cache in the HW RAID array. In addition, the multi-threaded MySQL/InnoDB/sysbench workload appears to have benefited from group commit on writes, since the 7110 started with lower throughput at 1 or 2 active threads, but began to outperform the HW RAID array at 4 threads and higher.
The author notes that he was sufficiently surprised by the MySQL read-write performance over Linux NFS that he felt compelled to confirm that fsync() calls in Linux were actually resulting in correct NFS data commits. Based on a combination of DTrace Analytics and Wireshark analysis of NFS traffic, he confirms that the data were correctly going to disk.
Lutz observes that, while MySQL read-only performance on Solaris NFS is currently excellent, read-write performance is impacted by the lack of VMODSORT support in the Solaris NFS client (CR 6213799). This affects fsync() performance for large files using a default, buffered NFS mount. The normal workaround is to eliminate client-side buffering of file data by mounting the file system with the forcedirectio option, or by enabling directio on a per-file basis. For example, the MySQL/InnoDB option "innodb_flush_method = O_DIRECT" enables directio on the InnoDB tablespace files. That is likely to work fine on an appliance that includes write flash accelerators, like the Sun Storage 7210 and higher, but write flash accelerators are not currently available on the Sun Storage 7110. For the 7110 without write flash accelerators, there was no MySQL read-write performance gain from using directio instead of default, buffered NFS file access.
For Solaris, he adds, MySQL read-write performance on ZFS over iSCSI currently exceeds its performance over buffered or directio-enabled NFS on the Sun Storage 7110, provided that fast-write-ack is enabled on the iSCSI LUNs.
Lutz points out that a Solaris system running ZFS over iSCSI can realize performance gains by enabling fast-write-ack on the iSCSI LUNs in the storage appliance, because ZFS is known to correctly issue SCSI cache flush commands when fsync() is called (that is not currently known to be true for any Linux file system).
The fast-write-ack option can be activated by enabling the "write cache enabled" option in the appliance for a given iSCSI LUN. However, due to CR 6843533, the write cache enabled setting on an iSCSI LUN will be silently disabled any time an iSCSI login occurs for a target, although the Unified Storage BUI and CLI will still indicate that it is enabled. Examples of iSCSI target login triggers include a reboot of the Unified Storage appliance, a reboot of the client, a client "iscsiadm modify target-param -p ..." command to modify negotiated parameters, or an "iscsiadm remove ..." followed by an "iscsiadm add ..." for the affected target. The workaround for CR 6843533 is to manually disable and then re-enable the write cache following an iSCSI target login, he writes.
Lutz recommends that, for an NFS share that will be used to store MySQL tablespace files, users should match the record size of the share to the block size of the storage engine. For example, this should be 16KB for InnoDB tablespace files. This can be configured on a per-share basis by setting the "Database record size" (in the BUI) or "recordsize" (in the CLI) for the share. This must be done before creating the tablespace files in the share. For an NFS share that will store transaction logs or other non-tablespace files, on the other hand, the record size of the share can be left at its default of 128KB.
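The record size guidance amounts to a simple lookup, sketched below. This is purely illustrative: the function names and the share-content labels are hypothetical and are not part of any appliance or MySQL interface. Only the 16KB and 128KB values come from the recommendation above.

```python
# Illustrative encoding of the record size guidance; all names are hypothetical.
INNODB_BLOCK_SIZE_KB = 16     # InnoDB tablespace block size cited above
DEFAULT_RECORD_SIZE_KB = 128  # appliance default share record size cited above


def recommended_record_size_kb(share_contents):
    """Return the share record size (in KB) suggested for the given contents."""
    if share_contents == "innodb_tablespace":
        return INNODB_BLOCK_SIZE_KB
    # Transaction logs and other non-tablespace files keep the default.
    return DEFAULT_RECORD_SIZE_KB


def needs_change(current_record_size_kb, share_contents):
    """True if the share's record size differs from the suggested value."""
    return current_record_size_kb != recommended_record_size_kb(share_contents)


print(recommended_record_size_kb("innodb_tablespace"))
print(recommended_record_size_kb("innodb_logs"))
print(needs_change(128, "innodb_tablespace"))
```

Because the record size has to be set before the tablespace files are created, a check like this is only useful at share provisioning time, not after the data has been loaded.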
More Information
Introducing the Sun Storage 7000 Series
Sun Storage 7000 Performance invariants
Analyzing the Sun Storage 7000
Compared Performance of Sun 7000 Unified Storage Array Line
Sun Storage 7310 perf limits
The Fishworks Team
Other articles in the MySQL section of Volume 137, Issue 4:
Evaluating MySQL Performance on Sun Storage 7110 and 7210
(this article)