April 13, 2012

Is the cloud good value for money?

There are three flavours of companies that provide cloud storage for sequencing data: generalist providers such as Amazon, Cycle Computing, or Microsoft; sequencing technology providers such as Illumina (whose BaseSpace uses Amazon's cloud infrastructure) or Life Technologies; and genomics-only cloud providers, of which DNAnexus is the only one I have come across.

Insert your cloud pun here

How good is the value for money? Consider this simple calculation: DNAnexus charges academic users $10 to store one gigabase of raw sequence data for two years. A 3-gigabase human genome at decent quality (25-fold coverage) comes to 75 gigabases, or $750 at DNAnexus prices. For comparison, an external hard drive that would easily hold the same data costs $89.99.
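For anyone who wants to plug in their own numbers, the calculation above can be sketched in a few lines of Python. The prices and genome figures are the ones quoted in this post; the function and variable names are made up for illustration, and the sketch deliberately ignores upload time, backups and analysis costs.

```python
# Back-of-the-envelope storage cost comparison.
# The figures come from the post; all names here are hypothetical.

PRICE_PER_GIGABASE_USD = 10.0  # quoted academic price, covers two years of storage
GENOME_SIZE_GIGABASES = 3.0    # approximate size of a human genome
COVERAGE = 25                  # "decent quality" fold coverage
HARD_DRIVE_USD = 89.99         # external hard drive used for comparison


def raw_gigabases(genome_size_gigabases, coverage):
    """Amount of raw sequence generated, in gigabases."""
    return genome_size_gigabases * coverage


def cloud_cost_usd(gigabases, price_per_gigabase=PRICE_PER_GIGABASE_USD):
    """Two-year storage cost at a flat per-gigabase price."""
    return gigabases * price_per_gigabase


sequenced = raw_gigabases(GENOME_SIZE_GIGABASES, COVERAGE)
cloud = cloud_cost_usd(sequenced)
ratio = cloud / HARD_DRIVE_USD

print(f"Raw sequence: {sequenced:.0f} gigabases")
print(f"Cloud storage for two years: ${cloud:.2f}")
print(f"External hard drive: ${HARD_DRIVE_USD:.2f}")
print(f"The cloud costs roughly {ratio:.1f} times as much")
```

Changing `COVERAGE` or `GENOME_SIZE_GIGABASES` shows how quickly the bill shrinks for smaller data sets.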

Whether that is good value for money depends on what you want to do with your sequencing data. If all you want is a place to store it, DNAnexus may be a hard sell. It takes a long time to upload your files, and the example above shows that it is more expensive than storing them locally.

The picture looks different if, instead of whole-genome data as in the example above, you are dealing with other types of sequencing data. For example, DNAnexus provides ways to store exome or ChIP-Seq data, which take up less space and are therefore less expensive to store.

You probably also want to analyse your data. DNAnexus offers software tools for genomic analysis that are designed to be intuitive to use for standard tasks like variant calling. For anyone looking for an integrated solution, cloud services like DNAnexus may therefore be worth a try.

Please remember that as always, the views in this blog are my own and are not endorsed by the Wellcome Trust or the Sanger Institute.

1 comment:

  1. I've heard this argument a lot: 'It's cheaper just to buy a terabyte of external storage'.

    If we leave aside the issue of analysis: when you pay for storage, whether from the IT department at your local university or from Amazon or whoever, it always looks expensive until you start to factor in the costs of the RAID/Isilon or whatever storage systems make sure you don't lose the data on the disk, the cost of duplicating the data between two different sites, the costs of the fire safes and the backup tapes that need to be shuttled between sites, and the cost of having a competent systems administrator looking after all this for you.

    The reason for all this is that when your cheap 1TB drive grinds to a complete and utter halt after a year, you haven't got backup procedures in place, and all of a sudden you need to pull the data off that drive, which could easily be storing £50k worth of exome data, your losses start to outweigh the cost of the storage.

    To be honest, I'm with you on the argument: cloud storage is slow to use over the available bandwidth (we're not all on JANET with super-fast connectivity), but $89.99 is not the bottom line if you care about the data you've generated.
