Ultimately, do object storage plays displace file systems or are they absorbed?

NAB kept me totally away from all the interesting online discussions last week. It’s too late to respond to @JoinToigo’s tweet (we’d call this Figs after Easter in Dutch), but I thought I’d share my thoughts in a bit more than 140 characters.

The short answer is no … but a better answer is very much *yes*.

The first file systems were not designed with the thought of petabytes of data. I don’t know what the exact projections were back then, but gigabytes must have sounded pretty sci-fi. Bytes and kilobytes were a lot more common. We didn’t think that we’d soon all be creating tens if not hundreds of multi-megabyte files per day.

File systems have of course evolved a lot and some have become so popular you could actually say they have a fan base (I’d need to do research on ZFS fan clubs). It is clear that the file system has played a very important role in the evolution of the computer industry. In my list of features that helped to make computers a commodity, the file system would probably be in the top three (with the windows-style GUI and the mouse). The file system enables the use of directories, which have been the most important tool to keep our data organized.

But as Robin Harris says, “entropy refers to the inherent tendency for any organized system to disorder”. Especially with the amounts of data we are dealing with today, we have to put a lot of energy into keeping our data organized. We have come to a point where our directories are not that organized anymore because we have too much data. But that doesn’t matter all that much, since there are so many applications out there (and a lot more coming) that can do the organizing for us.

Take Google Docs for example. Docs lets you star and share your documents, and organize them in collections. And no matter how you organize your stuff, Docs will find it again for you. Docs has a great search function (it’s Google after all) that is a lot more powerful than the search in Windows Explorer or OS X’s Finder (although Spotlight is actually pretty good). Picasa and iTunes are just two more examples of applications that help us keep our data organized with hardly any role for the file system. Eventually the applications will make the file system obsolete. Many of the applications we are using today are cloud-based and run on object storage with no file system involved; the application just masks the lack of a file system.

For businesses the situation is the same but different. Applications in the cloud are increasingly popular, so a lot of business data is already stored in a public or private object store. But a lot of business applications simply need a file system interface. For now, that is. If the current data growth continues, a lot of file systems will hit their scalability limits. And here object storage will play a very important role, as object storage platforms (at least the good ones) have been designed to scale out big.

One interesting example is the media and entertainment industry. If there is one industry where data is big, it’s there: think of the 4K and soon 8K movies. Movies have become multi-petabyte projects (tens of petabytes). Companies in this industry understand they need more efficient storage, and tape is no longer an option. All major studios are running object storage projects right now (mostly with file systems on top). This frees them from worrying about “how many files fit into a directory before it slows down”, “how many directories can I have” and “how deep can my file system tree become”, especially as it relates to access performance.

So, expect object storage systems to become more and more popular. As long as needed, object storage will be implemented with some file system gateway on top, but eventually, when the applications are ready, we will see fewer and fewer file systems. It just makes more sense to have the application talk directly to the storage. REST makes it all very simple. And fast. And economically feasible.
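To make “talking directly to the storage” a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `ToyObjectStore` is a hypothetical in-memory stand-in, not any vendor’s API, and its `put`, `get` and `list_keys` methods merely play the role that REST PUT, GET and a prefix listing would play against a real object store. The keys in the usage lines are made up as well.

```python
from __future__ import annotations


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store (not a real API)."""

    def __init__(self) -> None:
        # One flat namespace: every object lives under a single string key.
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Plays the role of an HTTP PUT on the key.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        # Plays the role of an HTTP GET on the key.
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list[str]:
        # "Directories" are only a naming convention: a listing is a prefix filter.
        return sorted(k for k in self._objects if k.startswith(prefix))


# Usage with made-up keys: slashes are part of the key, there is no tree to walk.
store = ToyObjectStore()
store.put("movie-x/reel-01/frame-000001.dpx", b"frame one")
store.put("movie-x/reel-01/frame-000002.dpx", b"frame two")
store.put("movie-x/reel-02/frame-000001.dpx", b"frame three")

reel_one = store.list_keys("movie-x/reel-01/")
```

Because the namespace is flat, questions such as how deep the tree can become simply do not arise in this model; whether listings stay fast at scale depends on the actual storage platform, which this sketch does not attempt to model.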

And now, anticipating the next question: shouldn’t there be some standard REST API? I used to strongly believe so. But while doing research for this piece, I stumbled across Wikipedia’s list of file systems. Compared with that list, having hardly a dozen object storage REST APIs on the market is not all that bad in my opinion. Still, I believe object storage vendors all agree that standardization is good. It’s just a matter of waiting to see which API will eventually become the most popular with the applications that use it.

~ by tomleyden on April 25, 2012.

One Response to “Ultimately, do object storage plays displace file systems or are they absorbed?”

  1. Hey Tom,

    [I know you know me but for everyone else, as a Disclaimer, I’m part of Scality :)]

    I think that we can state that the most popular REST API nowadays is S3. That’s not far-fetched. Doesn’t make it a standard obviously, but let’s not be naive, others have tried to push their own API, and most of them in the end are rolling out an S3-compatible interface (Atmos, OpenStack…). Even though their API may have been superior, there’s just no traction for it.

    But I agree with you too that democratization of object-based storage is what counts for now, and a standard (a real one, not just the most popular protocol) will have to be worked on (CDMI maybe?) sometime soon. Object-based storage is a real disruption in the world of storage, but also a real need for Service Providers, Telcos and Enterprises and their unstructured data (emails, documents, media, archives, backups).

    To respond to the blog post’s question, I strongly believe Object Storage will be the next-gen storage. Of course, right now it’s in its infancy, it has limitations, and its optimal use cases are very dissimilar from those of file/block-based storage systems (structured vs unstructured), but it will evolve and offer new ways to access it. Some of us storage vendors have already been working on offering true POSIX-compatible file access to our object storage. And I’m not talking about simply putting translating gateways in front. That’s just not scalable enough, and all the advantages of object storage are lost. True object storage scalability with the legacy access of NFS, CIFS and block: that’s the future of object storage, at least the way I see it.

    So Object stores displacing filesystems? Probably, yes. But the more interesting thing that I think will happen in the next decade is that Object Storage will not achieve it by offering legacy protocol access, but by motivating application developers/providers to embrace the object model and get rid of legacy access protocols (NFS, CIFS, iSCSI). Those will still be available for end-user processes and access, but applications should be able to, and hopefully will, move away from those limited standards to the next generation.

    Darwinian evolution in the computing world. Survival of the fittest, as he said. I feel like next-generation storage, with all its features (commodity-based, feature-rich, adaptive, cost-efficient), is just better fitted to last in the long run.

    Oh, and good quote about entropy. Good analogy to file systems becoming disorderly over time. As a physics geek, I was reminded of the exceptional lectures on computation by Richard Feynman, which I would recommend to anyone: http://www.amazon.com/Feynman-Lectures-On-Computation-Richard/dp/0738202967

