Know your data and its value to the business of the day, Gaetano Bisaz seems to be saying in his blog series on backup practices. If data doesn't affect current operations, maybe it should be stored differently from data that does. In Part 3 of the series, entitled "Classic backup is a dead-end", he considers where to put data that can no longer claim space on primary storage and how best to back it up.
Users who have come to appreciate the inutility of NDMP backup or filesystem-oriented backup will find useful recommendations in Part 3, where Bisaz points out the pointlessness of making multiple copies of the same stale files and offers more sensible solutions.
Some choose optical media, but it has limitations in the lifespan of both the media and the drives. Bisaz argues that it is an outmoded technology.
Writing data to two storage boxes is another alternative, but not a good one in the author's view. If one box quits, what does the user do? Take the time to deploy a replacement box, and then more time to copy, copy, copy until the two boxes are synchronized. This approach puts unaccustomed stress on SATA disks, and there is always the risk that the healthy box will fail before the operation is complete.
What about deduplication in storage? Well, Bisaz reasons, unless you are using ZFS, the odds are very good that some bytes will go missing in the course of making the necessary second copy on another medium.
A better option is the SAM-FS solution, which accepts data written over the network into its cache while a copy is written to tape in another room, delivering both archive and data protection.
Bisaz describes SAM-FS as a filesystem that offers a kind of hierarchical storage management (HSM). Using daemons, SAM-FS watches closely what happens in the filesystem. When a new file is written to it and then closed, the SAM-FS daemons write a copy of the file to another medium, often tape, resulting in two copies of the file and a completed backup.
He notes that the archive step writes one copy (or up to four) to other media, and each copy can be on any supported medium, such as disk, tape, optical, honeycomb, or a combination of these. The copies can be distributed over the planet through WAN connections.
The result is a near-continuous data protection arrangement that produces a huge saving in hardware. In addition, space on the filesystem can be released once a file has been archived, further saving money.
Once the data is released, Bisaz continues, only a representation of the file is left in the cache. The user still sees all attributes as if the data still resided in the cache in full.
If primary storage fails, one need only rebuild a view of all the files from the SAM-FS metadata, which can be dumped at any time using the samfsdump command. The data, already copied to other media, can be recalled on request. If a released file is recalled, a daemon writes the data back into the cache or, as an option, directly to the requestor, bypassing the cache.
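The archive, release, and recall cycle described above can be illustrated with a small toy model. This is only a sketch of the idea, not SAM-FS code or its API: the names ToyArchive, CachedFile, write_and_close, release, and recall are hypothetical, and in-memory dictionaries stand in for tape and remote disk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MAX_COPIES = 4  # the article mentions up to four archive copies


@dataclass
class CachedFile:
    """A file as seen in the disk cache (toy model, not a real SAM-FS inode)."""
    name: str
    size: int                    # attribute that stays visible after release
    data: Optional[bytes]        # None once the cache space has been released
    copies: List[str] = field(default_factory=list)  # media holding a copy


class ToyArchive:
    """Hypothetical stand-in for the archive, release, and recall steps."""

    def __init__(self, media: List[str]):
        if not 1 <= len(media) <= MAX_COPIES:
            raise ValueError("expected between one and four archive media")
        # Each medium is modeled as a dict: file name -> bytes.
        self.media: Dict[str, Dict[str, bytes]] = {m: {} for m in media}
        self.cache: Dict[str, CachedFile] = {}

    def write_and_close(self, name: str, data: bytes) -> CachedFile:
        """Closing a newly written file triggers a copy to every medium."""
        entry = CachedFile(name, len(data), data)
        for medium, store in self.media.items():
            store[name] = data
            entry.copies.append(medium)
        self.cache[name] = entry
        return entry

    def release(self, name: str) -> None:
        """Free cache space; only allowed once an archive copy exists."""
        entry = self.cache[name]
        if not entry.copies:
            raise RuntimeError("refusing to release a file with no archive copy")
        entry.data = None  # the attributes (name, size) remain visible

    def recall(self, name: str, bypass_cache: bool = False) -> bytes:
        """Return the data, reading from the first archive copy if released."""
        entry = self.cache[name]
        if entry.data is not None:
            return entry.data
        data = self.media[entry.copies[0]][name]
        if not bypass_cache:
            entry.data = data  # stage the file back into the cache
        return data


if __name__ == "__main__":
    archive = ToyArchive(["tape", "remote-disk"])
    archive.write_and_close("report.dat", b"quarterly numbers")
    archive.release("report.dat")
    stub = archive.cache["report.dat"]
    print(stub.name, stub.size, stub.data is None)
    print(archive.recall("report.dat", bypass_cache=True))
```

In this sketch the released entry keeps its name and size while its data is gone, which mirrors the point that the user still sees the file's attributes. The bypass_cache flag mimics the option of delivering recalled data directly to the requestor.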
Bisaz is sold on SAM-FS.
More Information
Classic backup is a dead-end, Part 1
Classic backup is a dead-end, Part 2