Dave blogs, "The other day we noticed that a number of guests running on our Solaris OpenStack cloud were very lethargic, with even simple operations like listing boot environments with beadm taking a long time, let alone more complex things like installing packages. It was easy to rule out CPU load on the compute nodes, as prstat quickly showed me that load averages were not at all high on the compute nodes hosting my guests.
The next suspect was disk I/O. When you're running on bare metal with local disks, this is pretty easy to check; things like zpool status, zpool iostat, and iostat provide a pretty good high-level view of what's going on, and tools like prstat or truss might help identify the culprit. In an OpenStack cloud it's not nearly that simple. It's all virtual machines, the "disks" are ZFS volumes served up as LUNs by an iSCSI target, and with a couple hundred VMs running on a bunch of compute nodes, you're now searching for the needle(s) in a haystack. Where to start?..."
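For context, the bare-metal checks Dave mentions would look something like the following on a Solaris host. The command names come from the post; the specific flags and intervals shown here are typical usage, not taken from the original.

```shell
# Illustrative sketch: quick disk-I/O triage on a bare-metal Solaris box.
# (Solaris-specific commands; flags here are common choices, not from the post.)

zpool status            # pool health: errors, degraded vdevs, resilvers in progress
zpool iostat -v 5 3     # per-vdev read/write ops and bandwidth, sampled every 5s, 3 samples
iostat -xnz 5 3         # extended per-device stats (-x), logical names (-n), skip idle devices (-z)
prstat -mL 5 3          # per-thread microstate accounting; time in SLP/LAT can hint at I/O waits
```

None of this maps cleanly onto the cloud case, which is exactly the problem the post goes on to describe: the guest's "disk" is a ZFS volume exported over iSCSI, so the interesting counters live on a different machine than the one that feels slow.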