System News
Treating Storage as If It Were a Utility
The Celeste Project: Failed Backups? Not Anymore.
December 10, 2007
Volume 118, Issue 2

We treat the storage as a utility just like power.

-- Glenn Scott
 

What good is a backup system that cannot produce a particular file on demand? Or, more to the point, what can developers do to eliminate that possibility? Those are the questions Glenn Scott, Sun Labs senior researcher, and his team asked as they launched the Celeste Project. The deeper question, Scott told Al Riske, was: What would it take to make a storage system that didn't need to be backed up? The answer, in a mere 61,000 lines of Java code, was the Celeste Project, software that can create a secure, distributed storage system from millions of disparate, untrusted computers.

Object stores such as Honeycomb can store data across many machines. The Celeste Project does the same, but it also lets users modify the data they have stored. In addition, it works around failures and around malicious behavior stemming from viruses and bad actors.

But, wait, there's more: "We made a storage system out of it, but really what we have underneath the storage system is a potentially large distributed system that has some interesting properties," Scott says. "For one, no node is trusted. That's very different than most distributed systems. The state of the art is you have a master, and if the master fails there's an algorithm that's used by the remaining systems to elect a new master. But in the underpinnings of Celeste there is no master. Well, this is interesting. So part of what's come out of this project is some new ideas about how to do distributed systems."

In the Celeste Project, users have a masterless system that cannot elect a malicious node as master. In fact, malicious nodes are ostracized and then allowed to wither away from the inattention of healthy nodes that expect their fellows to respond according to protocol.
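The ostracism idea can be illustrated with a minimal sketch. This is not Celeste's code or protocol: the `PeerTable` class, the `max_strikes` threshold, and the strike-counting rule are all hypothetical, chosen only to show how a healthy node might stop paying attention to a peer that repeatedly fails to respond according to protocol.

```python
from dataclasses import dataclass, field


@dataclass
class PeerTable:
    """Hypothetical per-node record of how peers have behaved."""

    max_strikes: int = 3  # assumed threshold, not taken from Celeste
    strikes: dict = field(default_factory=dict)

    def record(self, peer: str, followed_protocol: bool) -> None:
        # A correct response clears the count; a bad one adds a strike.
        if followed_protocol:
            self.strikes[peer] = 0
        else:
            self.strikes[peer] = self.strikes.get(peer, 0) + 1

    def is_ostracized(self, peer: str) -> bool:
        return self.strikes.get(peer, 0) >= self.max_strikes

    def usable_peers(self, peers):
        # Ostracized peers are simply skipped; no master is needed to ban them.
        return [p for p in peers if not self.is_ostracized(p)]
```

Because each node keeps its own table, a misbehaving peer loses attention one neighbor at a time, with no central authority involved.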

The Celeste Project also performs automatic load balancing, relieving overburdened nodes and then apportioning additional work to those nodes when they exhibit further capacity.
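As a rough illustration only, load balancing of this kind can be sketched as moving work from nodes above their capacity to nodes with room to spare. The `rebalance` function, the unit-of-work model, and the greedy strategy below are assumptions made for the example; the article does not describe how Celeste actually balances load.

```python
def rebalance(load, capacity):
    """Return a new load map with excess work shifted to nodes that have room.

    `load` and `capacity` map node names to units of work (hypothetical model).
    """
    load = dict(load)  # work on a copy so the caller's map is untouched
    for src in sorted(load):
        excess = load[src] - capacity[src]
        for dst in sorted(load):
            if excess <= 0:
                break  # this node is no longer overburdened
            room = capacity[dst] - load[dst]
            if dst == src or room <= 0:
                continue
            moved = min(excess, room)
            load[src] -= moved
            load[dst] += moved
            excess -= moved
    return load
```

Calling `rebalance({"a": 10, "b": 2, "c": 1}, {"a": 5, "b": 5, "c": 5})` shifts five units away from node `a`, and the total amount of work is unchanged.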

There's a further interesting angle in the Celeste Project, which involves public utility computing. As Scott describes it, again in a question: "What if all the PCs out there contributed a little storage to a collective? So your data is encrypted and hidden but it's spread around the country, so to speak, on everybody else's PC (and vice versa). You've effectively made a storage system out of nothing. Out of borrowed pieces. Or a service provider could own all the storage equipment and we'd just store stuff in there. We could be competitors, you and I. I can't see your data; you can't see mine. But we treat the storage as a utility just like power."

Riske writes that a big part of what makes Celeste work is lots of redundancy, which doesn't come cheap.

"But it's cheaper in other respects," Scott retorts. "Humans are probably the most expensive IT component. So the goal of Celeste is to never require human intervention. That means a bunch of things. There are no backups. There's no restore. Adding to the system of course has to be done by somebody. They have to tote it in on a dolly or whatever. So there is some human intervention, but a child should be able to augment the system with no futzing around."
