January 6, 2012

What does the history of computing teach us about the future of sequencing?

"I think there is a world market for maybe five computers", goes an infamous quote from the early days of computing, often attributed to IBM CEO Thomas J. Watson. That turned out to be wrong, and today pretty much everyone carries a programmable computer in the format of a smartphone in their pocket and has at least one more at home.
Genome sequencing is an emerging technology that has a number of similarities to computing. One is exponential growth in performance. In computing, the computing power that can be bought for a fixed amount of money doubles roughly every 18 months (a popular formulation of Moore's law). In DNA sequencing, the number of bases that can be sequenced for a fixed amount of money doubles at an even faster pace.
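
To make the doubling arithmetic concrete, here is a minimal sketch. The 18-month period is the figure quoted above; the 9-month period for sequencing is purely a hypothetical stand-in for "an even faster pace", not a measured value, and `growth_factor` is just an illustrative helper name.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Factor by which output per fixed budget grows after `years`,
    assuming it doubles once every `doubling_months` months."""
    return 2 ** (years * 12 / doubling_months)


# Computing: doubling every 18 months, as quoted in the text.
computing = growth_factor(years=6, doubling_months=18)

# Sequencing: HYPOTHETICAL 9-month doubling period, chosen only
# to illustrate what "faster than Moore's law" means in practice.
sequencing = growth_factor(years=6, doubling_months=9)

print(f"Computing after 6 years:  {computing:.0f}x")
print(f"Sequencing after 6 years: {sequencing:.0f}x")
```

Under these assumptions, halving the doubling period squares the growth factor over the same time span, which is why even a modest difference in pace opens a large gap within a few years.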

Another similarity is the occurrence of disruptive innovation in both fields. Initially, computers called mainframes filled whole rooms. In later years they were replaced by minicomputers and later still by PCs, which were smaller, cheaper, and easier to use. A similar trend can be observed for DNA sequencing. A possible comparison is:

Mainframes - e.g. Illumina HiSeq, PacBio
Minicomputers - e.g. Ion Torrent, Illumina MiSeq
PCs - possibly third-generation sequencers

It's possible to continue this game, predicting that eventually there'll be sequencers that are equivalent in size to laptops or smartphones, and that everyone will carry one in their pocket.

Of course this is possible, but just because it happened with computers does not mean that the same thing will happen with sequencers. The reason is that there are important differences between the two technologies. One is that, whilst even in the early days of computing IBM sold computers to a diverse range of companies, DNA sequencers go to a more restricted set of customers in the life sciences. That may change, but currently it's not obvious how an accountancy firm or an apparel retailer would benefit from sequencing technology. I'm not claiming that this couldn't happen, but I don't see why it should be more likely just because it also happened in the computer industry.

After having thought about the issue, I'm sceptical that the computing market can teach us anything concrete about the sequencing market. Except maybe not to indulge in bold predictions.


  1. If you try to predict when people will carry sequencers in their briefcases, or when biometric identification at border crossings will be based on whole genome sequencing, the computer industry's history does not help. I fully agree with your conclusion.

    On the other hand, we can learn a lot if we look at the disruptive dynamics of the industry (read Christensen's The Innovator's Dilemma). Under what circumstances do disruptive companies win, and when does continuous innovation by the incumbent prevail? The computer industry teaches a lot in this regard.

    Ion Torrent is a disruptive new entrant, while MiSeq is a continuous innovation of the incumbent. But both companies are entering a new segment, the desktop market. Which one will win this race? Based on the history of disruptive industrial changes, if Ion Torrent tries to compete on the same attributes that distinguish MiSeq from HiSeq (size, runtime, upfront cost), it is unlikely to win. To introduce a disruptive change successfully, Ion Torrent must have a unique selling point that lies outside the value network of current NGS users. In my opinion, Ion's chance is that it can offer economies of scale across three orders of magnitude of throughput by using different chips. This attribute will be important for those who sequence small targets with a varying number of samples and highly varying target sizes. Many diagnostic labs fall into this category.

    Another useful analogy from computers helps to settle the whole genome debate. Will all human sequencing be replaced by cheap whole genome sequencing? This is similar to asking whether all application-specific integrated circuits (ASICs) will be replaced by general-purpose CPUs. While we see CPUs entering (smart)phones, typical ASIC territory for decades, the number of devices using ASICs is also increasing. On the basis of this analogy, no matter how inexpensive whole genome sequencing gets, there will be opportunities for targeted sequencing (even SNP genotyping) for many years to come.

  2. Hi Attila,

    I really like your analogy between general-purpose CPUs and sequencing.

    As for the lessons on disruptive innovation that we can learn from computing history, you may be interested in this post.

    Best wishes,

  3. I would like your opinion. Thanks!

    Using CFAR-m for ranking biomarker genes?

    I am not a professional in biology or genetics at all, and I would like to know whether our CFAR-m algorithm could be of use in this field. Thanks for your help.

    We have researchers who can do analytics in these fields. If you have any ideas about how to use this algorithm and implement it in industry, please contact me at remi@cfar-m.com

    CFAR-m features
    For example, large-scale environmentally based alternative energy projects and similar infrastructure projects are extremely complex challenges involving several interacting phenomena from different fields; this requires skilled aggregation techniques.
    Aggregation is a way to combine several single indicators representing different components (dimensions) of the same concept to form a single aggregate. The result leads to a single score, called a composite indicator, which has the ability to summarize a large amount of information in a comprehensible form. Aggregation requires the determination of a weighting scheme of the different components. This task is extremely difficult and is one of the central problems in the construction of composite indicators. Weights must take into account all existing forms of interaction between the components aggregated and have a significant effect on the result. However, there is no universally agreed methodology and the arbitrary nature of the weighting process by which components are combined constitutes the main weakness of composite indicators which CFAR-m overcomes.
    - CFAR-m is an original method of aggregation based on neural networks which can summarize with great objectivity the information contained in a large number of variables emanating from many different fields.
    - Its contribution lies in determining, from the database itself, a weighting scheme of variables specific to each individual. CFAR-m solves the major problem of fixing the subjective importance of each variable in the aggregation.
    - It avoids the adoption of an equal weighting or a weighting based on exogenous criteria. The weightings for CFAR-m emanate only from the information content of the variables themselves and their own internal dynamics.
    - Objectivity: no handling of weightings. The weighting is resolutely objective and emanates from the informational content of the variables themselves and their internal dynamics.
    - Specificity: a specific equation is used for each individual to calculate the indicator.
    - Decision support: ability to run simulations and propose to decision makers plans of action and optimal sequences of reforms.
    In addition:
    - It can provide the contribution of each variable to the ranking.
    - It keeps all the variables during the calculation, so it is helpful for extracting what is happening within the noise. This is very interesting for predictive models.


  4. ASICs are now at the heart of the new sequencers from Oxford Nanopore.