Occam's Kangaroo: Let's Make Broad Generalizations!

This is only going to cause me trouble.

I'm about to make what is probably a huge mistake, by diving into a subject that I find interesting, but which I suspect is only going to cause me a fair bit of annoyance. Fortunately, I can be a bit of an idiot, and am not very good at avoiding foolish behavior. So, here's the question that's currently rattling around in my head.

What should we realistically expect out of a draft pick? Or, to what degree does being picked highly influence the path of a player's career, for better or for worse?

I realize that a number of people have tinkered around with this sort of question. A lot of the attention nowadays seems to be directed towards running regressions that compare CarAV (Career Approximate Value) to where a player was selected in the draft. The results of some of these examinations can be interesting...but I just can't fully jump aboard the CarAV train. I'm not saying that CarAV is a bad statistic, it's just that I have some qualms about committing to it at this point in time. As these sorts of geeky stats go, I just think it might be trying to do a bit too much.

I think my main problem with CarAV is that many people seem to be using it as a measure of a player's performance or contribution to a team. Some people seem to be using CarAV as a qualitative measure, when I think it is probably more of a quantitative statistic. It does a fine job of saying whether a player is getting on the field, but I suspect it doesn't say much about what someone does once they are allowed to play. I'm just not sure we are at the point where we have a "one ring to rule them all" statistic, to easily sum up such a potentially complex question as to whether a player is "good" or not. Instead, I think CarAV might be more of a 'perceived performance' or 'perceived contribution' estimate, which is a very different sort of idea. It might tell us who the coaches have some confidence in, but not whether the coaches are right to feel this way.

A player who is on a particularly effective unit (offensive line, defensive line or whatever) probably has a clear advantage in building up their CarAV results. It tends to be a very end result oriented statistic. If your team/unit does well, you get graded well, even if you might have been the weakest link in the chain. Getting selected for Pro Bowls and All Pro teams also has a significant effect on CarAV, and those selections can often be popularity contests that might favor high draft picks over a comparable player who was drafted later. The single biggest factor in CarAV is probably a player's ability to simply get on the field in the first place, where high draft picks will always have the edge, almost regardless of their actual performance. I just feel that CarAV might be weighted with some confirmation bias regarding high draft picks.

Over the first 6 years of his career, Evan Mathis' average annual Approximate Value result was a 2. That's not a very good result, in the CarAV world. Opportunity clearly plays a significant role in all of this, as he (a 3rd round pick) was only asked to start in 22 out of a potential 96 (22.92%) games during this time period. In the last 3 years, when he was finally made a regular starter, his average annual Approximate Value was 8.67, a much more impressive result. On the other hand, the generally mediocre Michael Oher, who was selected in the 1st round, had an annual average Approximate Value result of 8 in his first five seasons, and his results never dipped below 7 in any year. Of course, Oher started every single one of the 80 games from the day he was drafted, despite wavering between an average and a below-average level of play.
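For anyone who wants to check the arithmetic, here is a minimal sketch of the "percentage of potential games started" calculation, using only the start counts quoted above. The 16-game regular season is an assumption (it is what makes 6 seasons come out to 96 games and 5 seasons to 80), and the function name is just an illustrative choice.

```python
GAMES_PER_SEASON = 16  # assumption: 16-game regular seasons, matching the 96 and 80 game totals above


def pct_of_potential_starts(starts, seasons, games_per_season=GAMES_PER_SEASON):
    """Share of possible regular-season games a player started, as a percentage."""
    potential_games = seasons * games_per_season
    return 100.0 * starts / potential_games


# Start counts and season spans are the ones quoted in the paragraph above.
mathis_pct = pct_of_potential_starts(starts=22, seasons=6)
oher_pct = pct_of_potential_starts(starts=80, seasons=5)

print(f"Mathis, first 6 seasons: {mathis_pct:.2f}% of potential games started")
print(f"Oher, first 5 seasons:   {oher_pct:.2f}% of potential games started")
```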

Now, some people might argue that perhaps Evan Mathis improved over time, and I'm sure that his coaches would love to take credit for this improvement, but I just can't buy into that idea. I think there were some fairly solid reasons to suspect that Mathis was always the more gifted player, as you can see in the post on the Lobotomy Line. Does anybody believe that Michael Oher is a comparable/superior talent to Evan Mathis? Or, is it just more likely that the challenge of exceeding/altering people's expectations of a player is a more difficult proposition when a team has invested less in a particular player in the first place? Hope is a flame that constantly burns bright when it comes to high draft picks. Hell, there are people still waiting for a Tim Tebow resurgence.

Now, I realize that this is just one peculiar example of a possible shortcoming in CarAV, and criticizing CarAV really isn't my goal. CarAV has its uses. All I'm trying to say is that there are some issues which cause me to have concerns about whether the best players are consistently being put on the field. I have little doubt that the majority of high draft picks are reasonably talented, but assuming that the talent is there simply because they are high draft picks is a very different question.

Instead of looking at 'perceived performance' or hype, relative to where a player is selected in the draft, Reilly proposed that we ignore the talent/quality of performance issue altogether. He was just curious about the degree to which a player's draft position relates to a team's willingness to give someone a starting role on the team. More specifically, he suggested that we chart the percentage of potential games started during a player's first four years in the league. A four year period was chosen since, with the exception of some 1st round draft picks, that is the typical length of a rookie contract, which is part of the real issue we're eventually going to try to figure out. So, without further jibber-jabber, this is basically what the results look like.

We're going logarithmic, like an ALPS Blue Velvet potentiometer!

This chart is based on the Ravens' draft picks from 1996-2010. Beyond being my home team, and the organization I am most familiar with, they really are a great team to run these experiments on. Having one GM, Ozzie Newsome, who has overseen such an extensive period of time running the team, eliminates some of the fickle fluctuations that might come with a more volatile organization. For better or worse, their behavior should be relatively consistent, at least compared to teams with more turnover.

During this period of time 118 players were selected by the Ravens, though we did remove five of them from the discussion. Since we are only doing this to look at the team's tendencies, behavior and biases, Reilly removed Sergio Kindle and Dan Cody from the list, because they were relatively high draft picks who never played due to injuries that were factors before a single game had even been played. Remember, we're not trying to judge the team's performance or luck in the draft (not yet, but we will later), just their tendency to give "starts" to higher draft picks. If a player was never healthy enough to play, the team never gets to make much of a decision in those cases. We also removed any kickers, punters (Dave Zastudil) or fullbacks (Le'Ron McClain and Ovie Mughelli) who were selected before the 5th round. These positions generally don't get credited with "starts" in the first place, so they just create a bit of chaos if they are included. In reality, only Le'Ron McClain would have had a significant effect if he had been allowed to remain in this list, at least compared to Zastudil and Mughelli. We also obviously can't include results from more recent drafts, because the four year time period for those players hasn't run out yet, though they will eventually be included.
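The exclusion rules above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Pick` record, its field names, and the placeholder players are all hypothetical, and the injury rule here is applied to any pick who never played, which is slightly broader than the "relatively high draft picks" the post actually removed.

```python
from dataclasses import dataclass


@dataclass
class Pick:
    # Hypothetical record layout, not the post's actual spreadsheet.
    name: str
    round: int
    position: str
    never_played_injured: bool = False


# Positions that rarely get credited with "starts".
SPECIALIST_POSITIONS = {"K", "P", "FB"}


def keep_in_sample(pick):
    """Apply the two exclusion rules described above."""
    if pick.never_played_injured:
        return False  # the team never got to make a playing-time decision
    if pick.position in SPECIALIST_POSITIONS and pick.round < 5:
        return False  # kicker, punter, or fullback taken before the 5th round
    return True


# Placeholder picks for illustration only; these are not real draft data.
draft = [
    Pick("Player A", 1, "LB"),
    Pick("Player B", 2, "LB", never_played_injured=True),
    Pick("Player C", 4, "FB"),
    Pick("Player D", 6, "P"),
]

sample = [p for p in draft if keep_in_sample(p)]
print([p.name for p in sample])
```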

The data could be tightened up and manipulated a little to produce a better R^2 value, but really, 0.5697 is rather adequate for our humble purposes. I prefer simplicity over tidiness and perfection, since a better data fit could spark criticisms of nudging the results too much. We're not trying to predict what percentage of potential games a player will start in their first 4 seasons. Instead, we just want to figure out what an average result might be, relative to where a player was selected. Why we're curious about this is something we'll get into at a later point in time. For now, we're just going to refer to the data from this chart as PETARD (because it makes me giggle like a little girl). Eventually, we'll have to turn this into an amusing acronym to justify naming it this. It should also be noted that once we get above the 5th pick in the first round, the trendline asks for more than 100% of games started, which makes it impossible to meet or exceed expectations, so we actually capped everybody at an average expectation of 100% of games started in their first four years, regardless of how insane that may be.
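For readers curious how a logarithmic trendline, its R^2, and the 100% cap fit together, here is a rough sketch. It assumes the trendline has the usual spreadsheet form pct = a + b * ln(pick), with pick measured as overall selection number; the post doesn't spell out either detail. The sample points are made up purely so the code runs, and they will not reproduce the 0.5697 figure.

```python
import math


def fit_log_trendline(picks, pct_started):
    """Least-squares fit of pct = a + b * ln(pick). Returns (a, b, r_squared)."""
    xs = [math.log(p) for p in picks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(pct_started) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, pct_started))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, pct_started))
    ss_tot = sum((y - mean_y) ** 2 for y in pct_started)
    return a, b, 1.0 - ss_res / ss_tot


def expected_pct(pick, a, b):
    """Trendline value, capped at 100% as described above."""
    return min(100.0, a + b * math.log(pick))


# Made-up (overall pick, % of potential games started) pairs, for illustration only.
sample_picks = [3, 10, 26, 50, 88, 120, 160, 200, 240]
sample_pct = [95.0, 90.0, 60.0, 70.0, 20.0, 45.0, 5.0, 15.0, 0.0]

a, b, r2 = fit_log_trendline(sample_picks, sample_pct)
print(f"a = {a:.1f}, b = {b:.1f}, R^2 = {r2:.3f}")
print(f"Capped expectation for the 1st overall pick: {expected_pct(1, a, b):.1f}%")
```

The cap only matters at the very top of the draft, where the uncapped curve climbs past 100%.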

Like I said though, this isn't really about judging whether a player is good or not. There are clearly numerous data points, particularly for mid-round players, that wildly diverge from the trendline, so individual outcomes can be fairly unpredictable. Despite that, it seems reasonable to suspect that players who exceeded the average expectation for where they were selected were possibly doing something right, and vice versa, for players who failed to meet expectations. You could say that we are simply looking to identify who the outliers are. In the future we will explore the stories behind these players who produced surprising and unexpected results, in an attempt to better understand them. I fully realize that many people aren't going to like the idea of using "percentage of potential games started" as a metric for making judgments (we're still working on the Moxie-Meter), but...you can't please everyone. At the end of the day, I do think being a "starter" is more valuable than not being a starter, so we're not going to over-think this. This is just meant to be a simple and reasonably objective way of gauging things, and some shortcomings are naturally going to exist in this process.
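Spotting the outliers amounts to comparing each player's actual percentage with the trendline's expectation for his draft slot. The sketch below shows one way that could look. The coefficients `A` and `B` are hypothetical stand-ins, not the fitted PETARD values, and the three players are placeholders rather than real Ravens picks.

```python
import math

# Hypothetical trendline coefficients, NOT the values behind the chart above.
A, B = 120.0, -20.0


def expected_pct(pick):
    """Assumed log trendline, capped at 100% of games started."""
    return min(100.0, A + B * math.log(pick))


def rank_by_surprise(players):
    """Return (name, actual minus expected) pairs, biggest overperformers first."""
    rows = [(name, actual - expected_pct(pick)) for name, pick, actual in players]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder (name, overall pick, actual % of potential games started) rows.
players = [
    ("Player A", 10, 50.0),
    ("Player B", 150, 75.0),
    ("Player C", 60, 40.0),
]

for name, diff in rank_by_surprise(players):
    print(f"{name}: {diff:+.1f} points versus expectation")
```

A large positive number flags a player who started far more than his draft slot would suggest, and a large negative number flags the opposite.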

Really, there's probably not much in this post that should matter to anybody, but some of the issues I've been considering will refer back to this. It was just easier to throw this out there in advance, rather than trying to fit it in later. I think the more interesting discussion will come somewhere further down the road.

Tuesday, July 1, 2014
