Dr. Marshall Kirk McKusick takes a critical look at disk-based storage, examining the shortcuts that disks take and the remedies file systems must employ to overcome them.
First among the culprits that McKusick cites is the track cache controller, whose large buffer accumulates data that has been written only to the cache itself rather than to the disk. The drive reports a write as complete as soon as the data reaches the cache, so that the next write request arrives quickly enough to write the next sequential block without losing a disk revolution. Because acknowledged data may not yet be on the media, the file system can fail to deliver the integrity promised to user applications that use the fsync system call.
Tag queueing offered a partial solution to this situation, allowing applications to be accurately notified when their data has reached stable storage without incurring the penalty of lost disk revolutions when writing contiguous blocks. This works for SCSI disks but not for ATA disks, which lack tag queueing capability. SATA disks overcame this shortcoming with a new specification called NCQ (Native Command Queueing), which includes a bit in the write command that tells the drive whether to report completion when the data has been written to the media or when it has only reached the cache.
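From the application's side, the promise at stake is the one made by fsync. The following is a minimal Python sketch of a write that asks the operating system to push data to stable storage before returning. The helper name durable_write is hypothetical, and the sketch assumes the drive honors the flush request: a drive that acknowledges writes from its cache can still break the guarantee.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> int:
    """Write data to path, then request a flush to stable storage.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
        # Ask the kernel to flush file data and metadata to the device.
        # Whether the data is truly on the media depends on the drive
        # honoring the flush, which is the concern the article raises.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "journal.bin")
        print(durable_write(target, b"committed record"))
```

The call only returns after fsync has returned, so any early-completion behavior in the drive is invisible to the application unless the file system and the drive cooperate.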
Increasing disk sector size has been a mixed blessing, writes McKusick, since the error rate per bit has also increased, necessitating an error-correcting code with enough redundancy in each sector to handle a high correction rate even though most sectors will not require it. It has also led to performance degradation: to remain compatible with old applications, the disk controllers on new disks with 4,096-byte sectors must emulate the old 512-byte-sector disks.
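One way to see where the emulation cost comes from: when a write covers only part of a 4,096-byte physical sector, the drive generally has to read the whole sector, merge in the new bytes, and write it back. The sketch below counts how many physical sectors a write covers only partially. The function name and the read-modify-write cost model are illustrative assumptions, not taken from McKusick's article.

```python
PHYSICAL_SECTOR = 4096  # bytes per physical sector on the newer disks


def partial_physical_sectors(offset: int, length: int,
                             physical: int = PHYSICAL_SECTOR) -> int:
    """Count physical sectors only partly covered by [offset, offset + length).

    Each such sector is assumed to need a read-modify-write cycle on a
    drive that emulates 512-byte sectors (hypothetical cost model).
    """
    if length <= 0:
        return 0
    end = offset + length
    first = offset // physical
    last = (end - 1) // physical
    partial = 0
    # The head sector is partial when the write starts mid-sector.
    if offset % physical != 0:
        partial += 1
    # The tail sector is partial when the write ends mid-sector. Avoid
    # counting the same sector twice when head and tail coincide.
    if end % physical != 0 and (last != first or offset % physical == 0):
        partial += 1
    return partial
```

Under this model, a file system that issues 4,096-byte writes on 4,096-byte boundaries touches no partial sectors, while the same size write shifted by one 512-byte logical sector touches two.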
The upshot, the author concludes, is that file systems need to be aware of the disk technology on which they are running to ensure that they can reliably deliver the semantics they have promised. Users, in turn, need to be aware of the constraints that different disk technologies place on file systems and select a technology that will not result in poor performance for the type of file-system workload they will be running.