January 5, 2020

Favorite books 2019

Of the non-fiction books I read in 2019, here are the ones I recommend:
  • Nick Chater: The mind is flat. Our subconscious minds analyze vast amounts of information, and once they reach a conclusion, the conscious part of our mind is notified. At least that's what I used to think before reading this book, which proposes a radical and well-argued departure from the way most people assume our minds work. It argues that there is no subconscious of which the conscious is the surface. Instead, the conscious may be all there is.
  • Paul Schneider: Brutal Journey. This is an account of the Narváez expedition, which departed Spain in 1527 to colonize Florida. Things went wrong and only four out of 600 men made it back to Europe after trekking through what today is Texas, New Mexico, Arizona and Mexico for eight years. When it comes to physical hardship, nothing today approaches what they went through.
  • Temple Grandin: Animals in Translation, reviewed here.
  • Jonathan B. Losos: Improbable Destinies, reviewed here.
  • Ben Blatt: Nabokov's Favorite Word Is Mauve, reviewed here.
The best fiction book I read this year is Jonathan Franzen's 2010 novel Freedom. Unlike other books, it has stayed with me in the months since I read it.

Happy New Year!

December 17, 2019

Ben Blatt's Nabokov's Favorite Word Is Mauve

This is an unusual book. I can't think of any other attempt to analyze literature using big data and statistics.

For example: Ernest Hemingway, together with the world's English teachers, warns against the use of adverbs. He argues that adverbs, especially those that end in -ly, are a sign of lazy writing. The reader should be able to tell whether a character is sleepy without having to specify that they are doing something sleepily.

Did Hemingway follow his own advice? In Nabokov's Favorite Word Is Mauve, data journalist Ben Blatt analyzes Hemingway's novels and finds that he indeed used -ly adverbs less than other authors. In his novels, they appear at a rate of 80 per 10,000 words. Other writers vary widely, with e.g. 140 per 10,000 in J. K. Rowling's Harry Potter novels.
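To make the metric concrete, here is a minimal sketch of how such a rate could be computed. It is not Blatt's method, which the book describes in more detail than I can reproduce here. The function name ly_rate_per_10k and the sample sentence are my own, and the heuristic is crude: it counts every word ending in -ly, so it over-counts words like family or reply.

```python
import re


def ly_rate_per_10k(text: str) -> float:
    """Return the number of words ending in -ly per 10,000 words of text.

    Assumption: any word ending in "ly" is treated as an adverb, which
    over-counts words such as "family" or "reply".
    """
    # Split into lowercase words, keeping contractions like "don't" together.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not words:
        return 0.0
    ly_words = [w for w in words if w.endswith("ly") and len(w) > 2]
    return 10_000 * len(ly_words) / len(words)


sample = "He walked slowly to the door and quietly opened it."
print(ly_rate_per_10k(sample))
```

A single sentence gives an inflated number, of course. The rate only becomes meaningful over the length of a whole novel.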

Blatt also analyzes other stylistic choices in hundreds of novels, such as the use of thought verbs (thinks, knows, understands, realizes, believes, wants, remembers, imagines, desires, loves, hates), exclamation points, the word suddenly, and qualifiers like rather, very or little. Hemingway, Mark Twain, Toni Morrison and Chuck Palahniuk appear near the top of several of the resulting style rankings, confirming what everyone already knew: They are good at what they do.

Unlike other popular science books, Blatt's consists of original research. The bibliography lists 1,500 literary fiction or bestselling popular fiction novels, but they're only used as the raw material for Blatt's data analysis. One side effect of presenting original research is that there are more figures than in other books. Most pages contain at least one graph or table. While they're clear and informative, the text that links them together sometimes appears redundant.

Nabokov's Favorite Word Is Mauve left me with enough questions to hope for a sequel. For example, Blatt describes an approach that can determine a book's author by comparing the frequency of common words like and, the, then or these. It works well on novels, but what about shorter texts? How much better would a machine learning approach perform if it used not only word frequencies but also word co-occurrence, sentence structure and other metrics? Could it tell this blog post was written by me when given my previous posts? If so, I need to remember to never pen an anonymous op-ed if I ever enter politics.
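The word-frequency idea is simple enough to sketch in a few lines. The following is my own toy version, not Blatt's implementation: the word list FUNCTION_WORDS, the function names and the choice of a sum of absolute differences as the distance are all assumptions made for illustration.

```python
import re
from collections import Counter

# Illustrative list of common words; a real analysis would use many more.
FUNCTION_WORDS = ["and", "the", "then", "these", "of", "to", "a", "in", "but", "that"]


def profile(text: str) -> list[float]:
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(p: list[float], q: list[float]) -> float:
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def closest_author(unknown_text: str, known_texts: dict[str, str]) -> str:
    """Return the author whose known text has the most similar profile."""
    target = profile(unknown_text)
    return min(
        known_texts,
        key=lambda author: distance(target, profile(known_texts[author])),
    )


known = {
    "A": "the the the cat and the dog",
    "B": "then these then these birds then",
}
print(closest_author("the dog and the cat", known))
```

With texts this short the result is meaningless, which is exactly the open question: how much text such a method needs before it becomes reliable.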

But what I really want - data journalists take note - is a website where I can explore different metrics like adverb abuse, Flesch-Kincaid readability, gender-specific pronoun use and so on in interactive graphs.

November 12, 2019

Jonathan B. Losos' Improbable Destinies

Is there such a thing as destiny? How resilient are outcomes to changed starting conditions?

This was the question that Stephen Jay Gould asked in his 1989 book Wonderful Life. He thought that if we went back in time to the Precambrian period before animals were a thing and restarted the tape of life, the kind of creatures that would evolve would be very different to what we have now. Maybe while you're back there you step on a worm that turns out to be your ancestor, and voilà, no humans. Think of it as an evolutionary Butterfly Effect.

In 2003, Simon Conway Morris published Life's Solution, in which he made the opposite argument. He thinks that because of the pervasiveness of convergent evolution, the outcome is preordained. Flight has evolved multiple times in vertebrates: Birds fly, bats fly, pterodactyls flew. No matter how many times we go back to the Precambrian and trample the local primitive wildlife, flying will evolve.

Because I'm interested in evolutionary theory and because at the time I was at the same Cambridge college as Conway Morris, I read his book, but I don't remember much. That's because apart from its central thesis, it consists of a long and thorough list of examples of convergent evolution.

Jonathan B. Losos' new book Improbable Destinies is different. It asks the same question as Gould and Conway Morris did, and their views get a fair hearing. What has changed since their books appeared is that we now have more data.

We still can't go back in time, but we can and have run evolution experiments where bacterial populations are subjected to selection pressures over thousands of generations. The results suggest that starting conditions make a big difference to the way the bacteria evolve.

An orthogonal approach is to study how geographically isolated animal populations evolve. Again, while convergent evolution is pervasive, there are examples of unique adaptations. In Australia, the wombat is basically a convergently evolved groundhog but the kangaroo is unique.

Overall, Losos' book is a satisfying summary of developments in one of the central questions in evolution over the last 40 years. It doesn't provide definite answers but offers a roadmap for how we may be able to reach them. The writing is good and I like the illustrations too.

I still wonder: How transferable are the insights gained from studying the role of destiny in evolution to other areas? For example, Philip Tetlock (his book Superforecasting is excellent) argues that it is valuable to think about historical counterfactuals. Would Hitler not existing have prevented Nazi Germany? Maybe or maybe not, but data is lacking either way. Could it be that the methods biologists have used will also prove useful to historians?

March 14, 2019

Rutger Bregman's Utopia for Realists

The idea of a universal basic income (UBI) has been around for some time. It's a bold idea that has the potential to fundamentally change the way our society works, and there aren't enough of those. Whether it'd be a positive change is a different question.

The UBI is one of three big ideas historian Rutger Bregman discusses in Utopia for Realists. The other two are open borders and radically shorter work weeks. The essence of the book is interesting and it inspired me to read more about President Nixon, who had a few things going on that were at least as interesting as the Watergate scandal. One of them was that he came close to introducing a UBI in the United States.

Unfortunately, it's hard to appreciate the book's tone, which at times feels like it was written with an audience of 12-year-old schoolchildren in mind. Neither can I recommend it if you're interested in a balanced discussion of the disadvantages as well as the advantages associated with the UBI and the other topics it covers. Bregman is a polemicist more than he is a scholar, as will be obvious if you look for some of his recent media appearances.

Finally, for your enjoyment here is the photo of Rupert Murdoch that first made me aware of the book's existence when I came across it on Twitter:

Her husband chose Utopia for Realists: And How We Can Get There by Rutger Bregman

February 9, 2019

Temple Grandin's Animals in Translation

Why are we able to consciously perceive so little? Every moment, our senses collect a large amount of information including sounds, smells, visual input and pressure, temperature and proprioception readings from all over our bodies. However, we're only consciously aware of a tiny and heavily filtered part of that information. Why is that? Why can't our consciousness deal with a larger fraction of the input data? Why can't we pay attention to many things at the same time? Why is our consciousness pointy instead of wide?

There are plenty of books trying to define consciousness, but there is much less literature on why human consciousness is the way it is. Temple Grandin's Animals in Translation addresses this question indirectly by considering different kinds of minds, like those of people with autism and those of animals. As it turns out, people with autism and animals both have access to more unfiltered sensory input, which in both cases can be overwhelming under circumstances that are fine for non-autistic people.

Unlike much of popular science writing, Grandin's books are firmly based on her own experience of working with animals and being autistic. As a result, they avoid being overly theoretical. Her writing style reflects this: She cares about her subjects without being sentimental. I haven't come across any science writer who combines extensive personal experience with published research as well as she does.

Animals in Translation came out in 2006 and some of the research may benefit from an update, but the vast majority of insights this book delivers are timeless.

April 9, 2018

How to get from variants to genes?

Genome-wide association studies are great at identifying genetic variants associated with diseases and other traits of interest. However, for most of these variants there is neither a clear candidate gene nor an alternative mechanistic explanation for how they exert their effect.

Some thoughts on this:
  1. The vast majority of significant haplotypes (blocks of variants) reported by a given GWAS are causative for the trait of interest, assuming that the GWAS followed best practices
  2. Only a small proportion of those variants are coding or otherwise likely to affect protein function
  3. The proportion of coding variants decreases even further after fine-mapping to exclude variants that are not likely to be causative
  4. Some of the remaining noncoding variants act by changing gene expression, i.e. they're eQTLs
  5. Since many eQTLs are cell type and condition specific, and since data is not available for all cell types and conditions, it's unclear for what proportion of GWAS variants this applies
  6. There is a lack of understanding of how noncoding non-eQTL GWAS variants act mechanistically
  7. Some variants may not directly act through protein coding genes at all. Instead, they may act through noncoding RNAs (e.g. lncRNAs) or some other unknown mechanism
  8. Software tools for identifying causative genes from noncoding non-eQTL GWAS hits have been proposed. Here's one
  9. Experimental follow-up for those variants is hardly ever (never?) done, making it uncertain how well these tools work
  10. A lot of people are putting a lot of thought into how to approach this problem, and I expect some best practices to crystallize in the next few years

April 6, 2018

What have I been reading?

One author I keep returning to is Richard Russo: Nobody’s Fool, Everybody’s Fool, Empire Falls, The Risk Pool and Bridge of Sighs are among my favorite novels. Nobody’s Fool and Empire Falls have been turned into first-rate movies. Yet there is also something about Russo's books that makes me uneasy. It's not that Russo could be called the Norman Rockwell of fiction, because that'd be unfair. If my livelihood depended on it – for example, if it were my job to review books – I’d have to come up with a post-hoc rationalization for that feeling. Fortunately, it doesn’t and therefore I won’t.

Ready Player One by Ernest Cline is an entertaining science fiction novel, but not a particularly good one. The whole concept feels like the author looked for a way to cash in on his extensive knowledge of 1980s pop culture. The only reason I'm still going to watch the movie is that it's directed by Steven Spielberg.

Finally, here's a list of my favorite science fiction novels (inspired by this): 

 1. Vernor Vinge: Rainbows End
 2. Greg Egan: Permutation City
 3. Jack McDevitt: The Engines of God
 4. Andy Weir: The Martian
 5. Ken MacLeod: Intrusion
 6-8. Kim Stanley Robinson: Red Mars, Blue Mars, Green Mars
 9. Mikael Niemi: Astrotruckers