One of the biggest challenges I had when my sportswriting students were doing their analytics assignment last week was keeping them from going too far down a sports-reference.com rabbit hole (many of them were discovering that site for the first time).
About midway through the class, I heard one of my students, who is a Tampa Bay Buccaneers fan, say “that stat is bullsh*t.” He was looking at Jameis Winston’s stats.
“How can a stat be bullsh*t?” I asked him. After all, it’s just a collection of numbers that reflect what he did on the field last year. Numbers are what they are.
My student wouldn’t bite. He said that the stats didn’t reflect the kind of player he is.
My radar went up, as it always does when that sentence is uttered. Oh?
“It doesn’t take into account the pressure he faced, how many times he was knocked down, all that,” my student said.
To be honest, it’s not a bad point. One of the problems with football analytics, as I understand them, is that it’s harder to quantify an individual’s performance than in other sports. Football is such an intricate game, with so many dependencies on a given play, that it’s hard to isolate any one player’s contribution. Did the receiver run the wrong route, or did the QB throw the ball too early? Did the running back miss the hole, or did the left guard miss his block?
But it points to a larger issue with sports journalism and analytics: the hesitation of so many to accept statistics-based thinking. There’s the I was told there’d be no math crowd, and that’s a perfectly understandable sentiment (even if they can compute a batting average and a shooting percentage, which is, you know, math). There’s the You’re sucking the fun out of sports crowd, which is also understandable but still assumes there is only one correct way to enjoy sports. There’s the I don’t understand what all these fancy stats mean crowd, and they have a point. One of our jobs as sports journalists is to explain what these stats mean, to provide the benchmarks so that fans can understand a 9.0 WAR as easily as a .300 batting average.
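To make that benchmark idea concrete, here’s a minimal sketch in Python. The tier cutoffs are my own ballpark assumptions (roughly the conventional rules of thumb), not official definitions:

```python
# Rough translation of raw stats into plain-English benchmarks.
# The WAR tier cutoffs below are ballpark conventions, not official ones.

def batting_average(hits, at_bats):
    return hits / at_bats  # e.g., 180 hits in 600 at-bats = .300

def describe_war(war):
    """Map a season WAR to a rough descriptive tier."""
    if war >= 8:
        return "MVP-caliber"
    if war >= 5:
        return "All-Star"
    if war >= 2:
        return "solid starter"
    return "bench/role player"

print(f"{batting_average(180, 600):.3f}")  # 0.300 -- the classic "good hitter" line
print(describe_war(9.0))                   # MVP-caliber
```

That’s all a benchmark is: a translation layer between a number and a judgment a fan can feel.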
But there’s a larger issue here too, one my student was getting at. It’s the This stat is bullsh*t crowd. It’s the idea that the statistics we’re looking at are flawed in some way, because they don’t “tell the whole story.” This is a very real, very big thing. There’s this notion that advanced analytics always contradict what we see with our eyes. That they either lift up a player who isn’t very good by traditional measures and make him into a story, or (more typically) take a player we all assume is very good and show how he’s not. This is how you get the great Derek Jeter arguments of the early 2000s, or Tim Tebow a few years ago. This is how you get praise for a player’s “heart” and “guts” and “intangibles” and how “they do things that don’t show up in a stat sheet.” The stats suggest one thing. Our eyes suggest another.
Which one do we trust?
The problem, of course, is that it’s a false dichotomy. We don’t have to pick one. They’re really not mutually exclusive. Joe Posnanski and Keith Law — both of whom are very analytics-minded — discussed this on a recent podcast, and it’s striking how much even stat heads will look at a player’s attitude, his personality, the things you can’t measure. In fact, in an era when data is everywhere, that can matter even more.
Also, stats are neutral. They’re just numbers, reflections of performance. One of the things my students learned is that when you compare traditional measures of success to advanced ones (think points allowed vs. DVOA), oftentimes they reflect each other almost perfectly. When you look at the Player Efficiency Rating leaders for the NBA, you see the best player this season is Russell Westbrook, which, yep.
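You can check that agreement yourself in a few lines once you have the two columns side by side. A minimal sketch in Python; the team numbers here are made up purely for illustration, so substitute real points-allowed and defensive DVOA figures:

```python
from statistics import correlation  # Python 3.10+

# Hypothetical per-team numbers, for illustration only.
points_allowed = [284, 310, 377, 401, 352, 296]
defensive_dvoa = [-12.4, -6.1, 4.8, 9.3, 2.0, -9.7]  # lower = better defense

# Pearson's r: if the traditional stat and the advanced one
# really do "reflect each other," this should land near +1.
r = correlation(points_allowed, defensive_dvoa)
print(f"r = {r:.2f}")
```

When the old-school number and the new-school number line up like that, the argument isn’t stats vs. eyes at all.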
The issue is, when the stats don’t reflect what we expect, how do we react? It’s funny to think about this in a sports context, in terms of whether Jameis Winston is better than his numbers or whether Derek Jeter was clutch. It’s less funny in the real world, where people see data about climate change and don’t believe it, or hear that President Obama deported more people than any other president and simply reject it.
The proper thing to do is not to blame the stat. It’s to rethink our assumptions. It’s to ask ourselves “Is there something this stat isn’t measuring?” but also “Am I looking for confirmation of what I want to believe rather than what’s really true?” We interview the data, and we interview ourselves.
That’s how we can start figuring out what’s truly bullsh*t.