“The numbers don’t lie?: The problem of emergence in baseball and basketball statistics”

The role of statistics in sports can be generally stated as providing more objective and sophisticated evaluations of an athlete’s performance. At their heart, statistics are tools that can be used to increase a team’s chance of winning a game. In this sense, much as counting cards can help a player win at blackjack, keeping track of a variety of individual statistics can in theory help a team win. The logic is to first objectively determine which aspects are important to winning and then to build a team around athletes with those skills. The allure of objective and measurable data is that it levels the evaluative playing field by providing tools to measure aspects of the game that were previously thought to be subjective and noticeable only to individuals, such as scouts, with special visual access. Indeed, the appeal of statistics has grown to the extent that many casual fans have dusted off high school math textbooks to both understand and contribute to statistical evaluations.

For example, in his recent ESPN The Magazine article, the sometime sports journalist Bill Simmons documents a recent conference at MIT on the future of statistics in the NBA. Simmons’s article is notable because it reflects a relatively new idea about the use of statistics, specifically that they can be used to measure a player’s impact on the team as a whole. Unlike baseball, where statistics measure individual players and their talents, basketball is thought to be more of a team sport. Accordingly, analyzing a player’s individual performance does not necessarily help teams win. Instead, it is necessary to develop statistics that capture how individual players contribute to the team’s emergent identity, which is what leads to victories. Given the complexity behind building a team, Simmons implores basketball teams to release the data they have so that statisticians can develop new statistical measurements for a diverse range of basketball plays. He writes:

“I want “mega-assists” (passes that create a layup or a dunk) and “half-assists” (for each made foul shot). I want “unforced turnovers,” like in tennis (Tony Allen would be Wilt Chamberlain in this category), and “nitty-gritties” (some combination of charges taken, deflections, balls saved from going out of bounds and rebounds tipped to teammates). I want “Unselds” (a long outlet pass that leads to an assist for a layup or a dunk) and “Russells” (a blocked shot directed to a teammate).”

This quote reflects a general idea underlying the use of statistics, namely that it is possible to completely catalogue the different interactive or team-related talents of players. By doing so, one could then presumably draw upon an index of players (mega-assist guys, nitty-gritties, scorers, defenders, etc.) to construct teams with the best chance of winning. Obviously, putting together a team where different players’ talents contribute to the team’s chance of winning is the goal of every General Manager as well as every would-be General Manager (which we all secretly are).

The use of statistics to measure performance and build teams in sports is nothing new. In his popular book Moneyball, Michael Lewis details the use of statistics by the Oakland A’s General Manager Billy Beane in drafting and trading for baseball players. Beane, who was greatly constrained by the limited payroll of the A’s, used statistics to help find players who had underappreciated talents. The most famous of these statistics is On Base Percentage (OBP), a simple measure of how many times per plate appearance a player gets on base. His argument was that measurements such as OBP (as well as Slugging Percentage) were better indices of successful players (specifically, those who score and drive in runs) than traditional descriptive statistics such as Batting Average (BA), Home Runs (HR) and Runs Batted In (RBI). Relying primarily on OBP, Beane successfully built teams around players who were undervalued by other teams. His approach of using statistics to identify undervalued talents has served as a model for other teams, and as a result baseball statisticians have become increasingly in vogue. Currently there is a plethora of statistics used to evaluate players, ranging from the relatively simple, e.g. OPS (on base plus slugging) and WHIP (walks and hits per inning pitched), to the fabulously complicated, e.g. VORP (value over replacement player) and WARP (wins above replacement player).
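To make the difference between these measures concrete, here is a minimal sketch in Python of how the simpler ones are usually computed. The formulas are the standard published definitions (OBP, for instance, counts walks and hit-by-pitches and divides by slightly more than at-bats), but the two hitters and all of their numbers are invented for illustration, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season totals for one hitter (all figures below are hypothetical)."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def batting_average(b: BattingLine) -> float:
    # Hits per at-bat; walks do not count for or against the hitter.
    return b.hits / b.at_bats


def on_base_percentage(b: BattingLine) -> float:
    # Times on base divided by the chances the hitter had to reach base.
    times_on_base = b.hits + b.walks + b.hit_by_pitch
    chances = b.at_bats + b.walks + b.hit_by_pitch + b.sacrifice_flies
    return times_on_base / chances


def slugging_percentage(b: BattingLine) -> float:
    # Total bases per at-bat; extra-base hits are weighted by bases gained.
    singles = b.hits - b.doubles - b.triples - b.home_runs
    total_bases = singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs
    return total_bases / b.at_bats


def ops(b: BattingLine) -> float:
    # "On base plus slugging" is literally the sum of the two.
    return on_base_percentage(b) + slugging_percentage(b)


def whip(walks_allowed: int, hits_allowed: int, innings_pitched: float) -> float:
    # Walks and hits allowed per inning pitched (a pitcher's statistic).
    return (walks_allowed + hits_allowed) / innings_pitched


# Two made-up hitters: one hits for average, the other draws walks.
slap_hitter = BattingLine(at_bats=500, hits=150, doubles=20, triples=2,
                          home_runs=5, walks=20, hit_by_pitch=0,
                          sacrifice_flies=5)
patient_hitter = BattingLine(at_bats=450, hits=117, doubles=25, triples=1,
                             home_runs=15, walks=90, hit_by_pitch=5,
                             sacrifice_flies=5)

for name, line in [("slap hitter", slap_hitter), ("patient hitter", patient_hitter)]:
    print(f"{name}: BA {batting_average(line):.3f}, "
          f"OBP {on_base_percentage(line):.3f}, OPS {ops(line):.3f}")
```

In this invented example the first hitter has the higher batting average (.300 against .260) but the lower OBP (roughly .324 against .385), which is the kind of gap between the traditional number and the newer one that the Moneyball argument turns on.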

The use of statistics in basketball is much more recent. In a recent NY Times article, Michael Lewis describes the unique case of Shane Battier, the Houston Rockets’ starting forward. According to the Rockets’ General Manager Daryl Morey (who also attended the statistical conference covered by Simmons and is highlighted in his article), Shane Battier, despite a lack of visible statistical evidence, makes his team significantly better when he plays than when he does not. Moreover, he has made every team he has played on significantly better. According to Morey, this is because Battier plays such effective defense against the opposing team’s best player and is himself an unselfish offensive player. As a result, Battier limits the other team’s scoring while not detracting from his own team’s scoring.

Much like Beane, who used statistics to find underappreciated players, Morey has worked to develop statistics to help him find undervalued players. While the secrecy about the nature of these statistics (as well as the data behind them) is what infuriates Simmons, what is of note in the case of Battier is that Morey, by his own account, recognized Battier’s effectiveness but not why he was effective. The recognition, according to Lewis, involved a variation of a relatively simple measurement known as “plus-minus”. Simply put, it “measures what happens to the score when any given player is on the court.” Though not perfect, it is in many ways a descriptive statistic similar to OBP, as it expands the analysis to include a wider range of behaviors.
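The basic, unadjusted form of plus-minus is easy to sketch. The Python below is only an illustration of that raw version, not the variation Morey’s staff actually uses (which, as noted, is not public). The stints, the player labels, and the function name are all hypothetical.

```python
from collections import defaultdict


def raw_plus_minus(stints):
    """Add up the score margin over every stint a player was on the court.

    Each stint is a tuple: (players on the court, points scored by their
    team, points scored by the opponent) for one stretch of play.
    """
    totals = defaultdict(int)
    for on_court, team_points, opponent_points in stints:
        margin = team_points - opponent_points
        for player in on_court:
            totals[player] += margin
    return dict(totals)


# Hypothetical stretches of one game for a six-man rotation.
stints = [
    ({"A", "B", "C", "D", "E"}, 12, 8),   # team wins this stretch by 4
    ({"A", "B", "C", "D", "F"}, 6, 11),   # team loses this stretch by 5
    ({"A", "B", "C", "E", "F"}, 10, 7),   # team wins this stretch by 3
]

for player, margin in sorted(raw_plus_minus(stints).items()):
    print(f"{player}: {margin:+d}")
```

Nothing in this calculation looks at who scored, rebounded, or blocked a shot. Player E ends up with the best number in the toy data (+7) simply because the team outscored its opponent whenever E was on the floor, which is why a statistic of this kind can flag a player whose box score looks unremarkable.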

The fascinating part of the Battier story is the eventual recognition of what he was doing that made him effective. Morey describes it as an accumulation of different abilities Battier had developed that minimized the abilities of the other team’s best scorer. One of these was Battier’s intelligence and capacity to assimilate a large amount of information given prior to the game about the tendencies of the other player and then to use it to play the odds. Lewis traces how Battier developed his abilities as a result of growing up between two cultures, a privileged white private school culture and a more street-oriented black culture. As a result, he was forced to develop his game as a flexible hybrid of the two. It was his ability to stick (in some senses literally: he is referred to as a “lego” in the article and a “glue guy” by others) between these two games that made him successful.

Battier’s unique “sticking” abilities add intangible elements to the Rockets, such that when he plays they become greater than the sum of their parts. In order to discover this, Morey used a statistic that looked at more than an individual’s box score performance (e.g. field goals, shots blocked). Understanding what Battier was doing to be successful, however, necessitated close observation, which in turn allowed for the development of new statistics. It is exactly this process that Simmons hopes to use in order to quantify the talents and impact of other players and so construct a team that wins.

The question to me is whether statistics can ever really be effectively used to create a team where the whole is greater than the sum of its parts. The phenomenon here is one of emergence. From watching sports we know that over the course of a season a team can develop something like a personality that it did not necessarily have at the beginning of the season. This personality becomes greater than the individual players themselves and is often credited with helping them win. While each team’s emergent traits are unique, there have certainly been similarities in the narratives of championship teams. For instance, there are “underdog teams”, “goofy teams”, “professional teams”, etc. The goal outlined by Simmons and Morey is to use statistics to construct a team with one of these emergent personalities.

The paradox here is that statistics themselves are necessarily reductive: by definition they measure the probability of an event happening given a set of circumstances. Their use restricts events to what is normative and fails to capture the events that lie outside of the norm. This is why in his article Simmons lists a variety of statistical categories, because any one category only looks at some aspect of behavior. In order to get at all of behavior, one needs an increasingly diverse range of statistics. Indeed, almost all behavior can be subjected to statistical analysis, especially if there is a trend that the researcher is looking for. What is less clear is whether such analysis can be successfully applied to the construction of a team, such that the whole becomes greater than the sum of its parts.

Interestingly, this problem mirrors the historical tension surrounding the use of statistics in studying social phenomena. In my opinion this tension has been described most clearly by the social psychologist Kurt Lewin (1932) as the difficulty of developing general laws in science. Lewin recognized that any attempt at establishing general laws (or abstracting out from individual cases) made it difficult if not impossible to go back to the individual cases. He relates this problem to the traditional method of classification in science, where phenomena were grouped according to similarity. According to Lewin, this only ended when genetic/constructive accounts became viable, as they allowed for the grouping of phenomena according to the way they can be produced or derived from each other. The focus moved from forming classificatory categories (making it difficult to look at specifics) to genetic accounts that provided laws for understanding idiographic cases.

In his article, Simmons’s discussion of the individual nature of baseball statistics (i.e. the ability for any baseball trait to be measured) maps nicely onto the classificatory use of statistics in social science. Essentially, the view is that if there is a baseball-related activity, there can be a statistic for it. The event-driven nature of a baseball game (a pitch, an at-bat, etc.) lends itself nicely to this sort of approach. As a result, statistics often highlight the individualized nature of baseball.

The brilliance of Billy Beane was not so much his use of statistics as a whole, but rather his use of a particular statistic, OBP, which allowed for a more genetic view of the player. Specifically, players with a high OBP generally did many things better than players with just a high batting average. Some of these things, such as taking more pitches, even help the team as a whole, by letting other players see more pitches or tiring out the pitcher quickly. Beane’s use of OBP in effect switched lenses, from one that only looked at hits to one that examined at-bats as a whole. It is also crucial to keep in mind why Beane was looking at OBP, namely what he needed relative to what he could afford. In other words, OBP was great not only because it was more holistic but also because it was undervalued. Relatively speaking, it provided greater bang for the buck.

Looking at Beane’s effectiveness in constructing a winning team using OBP, the early “Moneyball” Oakland Athletics teams on the one hand consistently made the playoffs, while on the other they never advanced to a World Series. Indeed, mixed results are the norm for other teams that have focused on building around statistics, such as the Dodgers and Blue Jays. The explanation for this, at least in part, is that statistics, being inherently reductive, only capture some aspects of the game. What is not captured is often minimized. The classic example of this is how Beane’s teams stole relatively few bases because stolen bases are thought to be statistically inadvisable. Yet it was a stolen base by the 2004 Boston Red Sox that is credited with propelling them to winning the World Series. While statisticians point out all the other things that happened during the year and the playoffs, without that single event the Red Sox would have lost to the Yankees.

This matters because Beane’s use of OBP, while more genetic, still reduced the whole of a player to certain aspects of worth. This reduction was far from an objective measurement; instead it was defined in relationship to the more traditional and highly valued statistics. Indeed, since OBP has become more popular it has become less cost-effective for Beane to build a team around OBP players. Instead he has switched to looking at more holistic defensive statistics as well as base running statistics. These skills, which he undervalued in the 1990s, have become more valuable as the market has changed.

To Simmons and Morey, the goal for basketball statisticians is to balance the tension between measuring a player’s individual effectiveness and their impact on the team. Morey described how in baseball individual statistics almost always benefit the team, whereas in basketball good individual statistics do not always equate to team success. It is this line of reasoning that has them looking for measures to capture the emergent properties that make the whole greater than the sum. To Morey, basketball needs “to measure the right things … meaningful statistics” about how certain players help their teams. It is this idea that led Morey to identify Battier as a valuable player, as he had a high plus/minus and a relatively low salary. Accordingly, this theoretically represents a shift from thinking about a player’s statistics as directly helping the team to thinking about how their overall play fits in with and helps the team.

At first glance, this appears to be a move towards exactly the genetic type of thinking endorsed by Lewin. However, the question remains to what extent statistics are able to assist with this matter. Identifying Shane Battier as a significantly better player than popular statistics reflected involved using a different lens (specifically, plus/minus) than the one commonly used by General Managers. Understanding why Battier was so effective required intense observation and knowledge of his developmental history. To Lewin, such idiographic study is at the heart of science. He sees two possible paths. One is to attempt to understand genetically how Battier does what he does in order to develop certain rules that can be applied to other players. Much like understanding why he is effective, this approach emphasizes subjective observation and comparison. By observing how he interacts with and influences the play on the court, it is possible to qualitatively describe his play in such a way that rules can be generated and applied to others. Indeed, this is very much what Morey does when talking about Battier with Michael Lewis. Alternatively, we could abstract out the specific traits (e.g. getting his body in front of players, or using his hands in a certain manner) that make him effective, give them names and then statistically measure them in others. This process will inevitably reduce Battier to certain traits and, more importantly, limit the extent of observation in other players. Just as OBP excluded stolen bases, the defining act of categorizing in basketball would exclude whatever falls outside its categories.

While it would certainly be interesting and likely illuminating to attempt to develop statistics that are more sensitive to how players interact with each other, and thus to their impact on the team, the nature of sports is such that this can only capture parts of the whole. This is not a reflection of one sport being more team-oriented than another but instead a question of the locus of analysis. Any attempt at statistically analyzing an action, interactive or otherwise (and I would argue that even the more individual aspects of baseball are still interactive and team-oriented), limits the field of study. Though this analysis can generate useful and insightful data, it is at best only complementary to other subjective and qualitative methods of inquiry.

Lewin, K. (1932). The Conflict Between Aristotelian and Galileian Modes of Thought in Contemporary Psychology. Contemporary Psychoanalysis, 23, 517-554.

The Art of the Commercial Break

Now that I’ve switched from Comcast to AT&T, and seem to only watch pre-recorded programs off the DVR, I’ve become a little more attuned to the importance of the commercial break. We may say we hate it, but when it’s gone, or artificially eliminated rather, it’s hard to shake the feeling that something important is missing. And sure enough, in two new studies reviewed by The New York Times, researchers who study consumer behavior argue that, in fact, “interrupting an experience, whether dreary or pleasant, can make it significantly more intense”.

“The punch line is that commercials make TV programs more enjoyable to watch. Even bad commercials,” said Leif Nelson, an assistant professor of marketing at the University of California, San Diego, and a co-author of the new research. “When I tell people this, they just kind of stare at me, in disbelief. The findings are simultaneously implausible and empirically coherent.”

Whether or not Nelson’s findings are intuitive on some level — our anticipation for the commercial break to end does, after all, reinforce its existence — it’s the interpretation that really matters. Now, one could of course read these results as evidence of some ‘deep’ psychological function that thrives on “interruption”, which seems to be Nelson’s perspective, but even if this is the case, generally speaking, it shouldn’t be forgotten that the TV drama itself is written for commercial breaks — which is to say, the commercial’s interruption is less random and disruptive than strategic and productive.

As a matter of fact, every guide to writing for TV spends considerable time discussing how to account for the commercial break in the structure of the plot and timing of events. (Which perhaps explains why the researchers found little variation in results across all kinds of content. In this regard, the interruption is hardly an interruption.) So, technically speaking, the commercial break itself is built into, on a deeper level than otherwise suggested, the story it’s supposed to ‘interrupt’.

In this guide, for instance, Evan Smith devotes a chapter to negotiating the distinction between ‘dramatic structure’ and ‘broadcast format’. The tension between them, irreconcilable to be sure, in many ways reproduces the familiar struggle between commercial conditions and artistic endeavors (which, despite their presumed animosity, must ultimately come together). As Smith puts it, rather succinctly:

“Here is where things get confusing. The timing of commercial breaks does not necessarily coincide with the transitions between three dramatic acts in a typical episode.” (99)

As TV watchers, we know this intuitively. How many times have we seen an old movie written in anticipation of its eventual consignment to TV, only to find that the commercial break structure they expected has since changed? The awkward flash-fades to black peppered throughout (almost unnoticeably), followed by cuts to a resumed action that the writers clearly wanted us to have to anticipate, are in this regard strange, archaeological signs of a former age, not evidence of a permanent, evolutionary form of ‘interruption’.

It’s this kind of subtle, almost meaningless, dissonance that reveals just how conventional, and non-interruptive, the commercial break really is. There’s an art to it, after all. So much so that it’s practically a credit and a compliment to the writers that their work is now attributed to our brains instead of theirs.