Torturing The Data Til It Screams


We all have at least one in our office. The guy who is quick to let you know that selling that call against Whole Foods Market (WFMI) may be a mistake, because supermarket stocks go higher on Thursdays after it rains in Akron, Ohio 77% of the time. The Data Miners.

I’m cool with statistics junkies. I love to hear stats of all kinds, and the market produces so much data that there is no end to the number of regressions and backtests that can be run. On a purely intellectual level, these stats can be enjoyable and intriguing to hear, no matter how far-fetched the conclusions drawn from them may be.

That said, it is a major mistake to confuse correlation with causation, which some quantitative analysts tend to do in many of these cases. In fact, some quants have built whole funds or careers on doing exactly this: finding relationships between data sets that don’t really mean anything.
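To see how easy it is to "find" a relationship that means nothing, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the function names, the 12 data points, the 1,000 candidate series and the seed are mine, not anything from Zweig's column or Leinweber's book. It generates one series of pure noise as the "target," then tests many unrelated noise series against it and keeps the one with the strongest correlation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_fit(n_candidates=1000, n_points=12, seed=0):
    """Return the strongest correlation found between a random target
    series and n_candidates unrelated random series (all hypothetical)."""
    rng = random.Random(seed)
    # The "target": pure noise standing in for, say, 12 monthly returns.
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        # Each candidate is independent noise, unrelated to the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    print("Best correlation after 1 try:     ", best_spurious_fit(1))
    print("Best correlation after 1000 tries:", best_spurious_fit(1000))
```

None of the series has anything to do with the target, yet the more of them you test, the better the best "fit" looks. That is the data miner's trick in miniature: run enough regressions and something will always line up.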

Jason Zweig, the personal finance columnist for the Wall Street Journal, sat down with the author of a new book that, among other topics, touches on the subject of “stupid” data mining. One example given is the belief that the output of Bangladesh’s butter industry may have some sort of predictive power over the S&P 500. Yeah, I told you this stuff could get ridiculous.

Meanwhile, dozens — probably hundreds — of Web sites hawk “proprietary trading tools” and analytical “models” based on factors with cryptic names like McMillan oscillators or floors and ceilings.

There is no end to such rules. But there isn’t much sense to most of them either. An entertaining new book, “Nerds on Wall Street,” by the veteran quantitative money manager David Leinweber, dissects the shoddy thinking that underlies most of these techniques.

I’ll be interested to read this book. Regardless, I will still be a sucker for interesting correlations…but that doesn’t mean I’ll feel compelled to invest based on them!

Sources:

Nerds on Wall Street Author (WSJ)