RSOB Chapter 1: How the Stats are Out to Get You
From Ron Shandler’s Other Book 2016: The Manual of Baseball Roster Construction.
“This is a very simple game. You throw the ball, you catch the ball, you hit the ball. Sometimes you win, sometimes you lose, sometimes it rains.” — Nuke Laloosh, Bull Durham
The structure of the game of baseball lends itself to analysis. The result of each at-bat is an individual event that can be measured. But this measurement is always after the fact. We can count how many home runs a player hits, but that is only after he’s hit them. The problem comes when we try to take the next apparently logical step. If a specific event chronicles a real, measurable skill and we can count it and track it over time, then can’t we also predict it?
No, not really, at least not with the level of precision necessary to have meaningful control over building a fantasy baseball team. But every year, the quest continues to create, enhance and fine-tune predictive models.
Again, are you dissing all the work we’ve put into advanced baseball analysis over the years?
No, there is nothing wrong with more and better data. The metrics in my other other book and at BaseballHQ.com, the now-mainstream sabermetric gauges like WAR and wOBA, the advanced granular data from PITCHf/x, Statcast and heat maps – all of these are very, very important. The better we can describe the elements of performance, the better we can assess skill.
Then we often take the next step and try to use those methods to validate statistical output. That’s a reasonable exercise too. Yes, Bryce Harper hit 42 home runs, but when we deconstruct events into granular components such as contact rate, exit velocity and batted ball distance, we can get a sense of how “real” those 42 HRs were. We can determine whether Harper’s skill set supported that home run output in general terms.
But then we take it a step too far; we try to attach a number to it. He should have hit 3 more HRs, or 2 fewer HRs, all things being equal. Here’s the problem: all things are never equal. You can never replicate one season’s performance in another season. So while this is an interesting exercise, it provides little actionable information when it comes to subsequent years.
Tell me that the indicators point to an increase or decrease in power skills, show me the areas of growth or erosion, even go out on a limb and tell me that a player is going to fall off a cliff – but don’t tell me that Albert Pujols is going to hit 31 HRs. Don’t tell me that Dee Gordon is going to steal 55 bases. Don’t even tell me that Jake Arrieta is going to have an ERA somewhere between 2.29 and 2.54.
For more than 30 years, we’ve been told that we need these numbers to play the game. We need a set of projections, and we need to convert them into dollar values or ranking positions. We need to build budgets and roster plans, and set statistical targets based on all this data.
But no matter how exhaustive a job we do in assembling our draft prep materials, the numbers we use to plan out our rosters are always wrong. Pujols never hits exactly 31 HRs and his eventual output might not be anywhere close to that number. Gordon will not steal exactly 55 bases. And Arrieta’s ERA – even with a range to work with – is almost as likely to end up somewhere outside that range as inside it.
Yes, no projection is going to be exact. But can’t we expect that the over-projections and under-projections are going to even out across an entire roster?
No, not at all. In fact, your league’s winners and losers will most likely be determined by a basic report card of overs and unders. The team with the most or biggest over-performers will always have the best odds of winning, regardless of how close your projections were overall. To wit:
Last year in the FSTA experts league, my overall draft report card was pretty damning. I had five on-par picks, nine profitable picks and 15 outright losers, including six in the first eight rounds. By all rights, this team should have been a disaster. But my nine winners were big winners, including Jake Arrieta (9th round), J.D. Martinez (14), Manny Machado (15), Xander Bogaerts (16) and Dallas Keuchel (19). I finished one day short of a title, even though my overall prognosticating prowess was nothing to write home about.
So we really can’t rely on the projections getting us to where we need to go. Yet every spring we go back through the same process all over again.
Well, of course. What else can we do?
But isn’t that the definition of insanity? Doing the same thing over and over, and expecting a different result?
I don’t really see it that way. I see it as we’re using the best methodology that we have. Until someone finds a better way…
Challenge accepted.
You wouldn’t know it from all this extreme analysis going on, but baseball is a simple game. Even fantasy baseball tends to dig far deeper into the minutiae than is necessary.
Here is a rundown of many of the lessons, truisms and proclamations we’ve been following over the years. The research findings are all valid; the cited authors are from the Baseball Forecaster and other sources (if no author is cited, it’s my own research). Our application of these findings is where things go off the rails. You can’t really assimilate hundreds of pieces of input and cull it all down to a single projected stat line that has any real value.
You’ve read all the following before, as individual facts, at different times. Now it’s time to read them again, together in one place, to reach one inescapable conclusion.
The Baseline
With the tools currently available to us, the maximum projective accuracy we can hope to achieve is 70 percent. This is a number that we’ve been throwing around for a long, long time.
But what that means is, the best we can hope to be is 30 percent wrong. Thirty percent is a lot! It means being off by nine HRs for a 30-HR hitter, 60 strikeouts for a 200-K pitcher or 12 saves for a 40-save closer. That’s the best level of wrongness we can reasonably expect to achieve. And few of us will ever achieve “best.”
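The arithmetic behind that "best level of wrongness" is worth seeing once. A few lines of Python, using the same hypothetical stat lines as above:

```python
# A 30 percent error band applied to three typical stat lines.
# This is just the arithmetic behind "the best we can hope to be
# is 30 percent wrong."
projections = {
    "HR for a 30-HR hitter": 30,
    "K for a 200-K pitcher": 200,
    "saves for a 40-save closer": 40,
}
for label, value in projections.items():
    print(f"{label}: off by as many as {round(value * 0.30)}")
```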
Seriously? Is this true?
Eh, I don’t know. That’s the number we’ve been using, and frankly, I’m not sure how they arrived at 70. It’s possible there could be a better system out there – one that exceeds 70 percent – but I don’t know that you’d be able to prove it.
Why?
Because the proof comes after the baseball season is over, but one season represents only a single data point for analysis. We’d need to see a system that produces forecast data over multiple seasons to provide a large enough sample. You can’t reasonably do that for future seasons, but I suppose you could go back and regenerate forecasts for past seasons. However, when the actuals from past seasons are already available, the temptation is to create a model that over-fits the historical data. I’m not sure you could really produce something actionable.
In the end, that 70 percent figure is too “macro” to have any real impact on a 23-man roster. This is a point I will be coming back to over and over again.
Maybe you can’t evaluate an entire season of projections on a macro basis, but what about individual players? That’s all that matters anyway.
Sure, we can try. There are overall skills metrics that are considered good evaluators of talent, like on-base plus slugging (OPS). But let’s say that I project a player to have an OPS of .838 and he ends up with an OPS of .838.
Um, that would be great!
Except, this:
              HR  SB    BA   OBP   Slg   OPS
              --  --  ----  ----  ----  ----
Lorenzo Cain  16  28  .307  .361  .477  .838
Lucas Duda    27   0  .244  .352  .486  .838
If I projected Cain numbers and he produced like Duda, I’d hardly call that a successful projection. But OPS thinks so.
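Since OPS is nothing more than OBP plus slugging, the collision is easy to verify. A couple of lines of Python, using the Cain and Duda stat lines, shows two wildly different fantasy profiles collapsing to the same number:

```python
# OPS = OBP + SLG, so it is blind to the shape of the production.
# Stat lines taken from the Cain/Duda comparison above.
players = {
    "Lorenzo Cain": {"HR": 16, "SB": 28, "OBP": 0.361, "SLG": 0.477},
    "Lucas Duda":   {"HR": 27, "SB": 0,  "OBP": 0.352, "SLG": 0.486},
}
for name, s in players.items():
    ops = round(s["OBP"] + s["SLG"], 3)
    print(f"{name}: {s['HR']} HR, {s['SB']} SB -> OPS {ops:.3f}")
```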
Baseball analysts use statistical measures like “mean squared error” to compare the accuracy of one set of metrics to another. You’ll see this method used for projections too. There are studies that involve a group of forecasters, often compared to a control group – like a simple age-adjusted, weighted three-year average (the Marcel Method) – and to each other.
Using these studies to determine the best system has little value. The test groups typically cover hundreds, or thousands, of players. The variance between any one system and another usually amounts to percentage points over the entire study group. It’s not something that’s going to provide much benefit for a tiny sample of 23 players on a fantasy roster. There is no way that you can cover your risk of volatility over a roster size of just 23 players. (See? I told you we’d be coming back to this.) So you can pick almost any system and have just as good a chance of winning as with any other.
Statistical Volatility
According to the research of Patrick Davitt of BaseballHQ.com, normal production volatility varies widely over any particular 150-game span. A .300 career hitter can hit anywhere from .250 to .350, a 40-HR hitter from 30-50, and a 3.70/1.15 pitcher from 2.60/0.95 to 6.00/1.55. All of these represent normal ranges.
So if a batter hits 31-.250 one year, 36-.280 the next year and 40-.310 the third year, you don’t know whether that is growth or normal volatility. In fact, the low-end and/or high-end performances could be isolated outliers. But nearly all analysts will call it growth. Their projection for year #4 will either continue this perceived trend or show some regression. And any one of them could be right. Or wrong.
It actually would be a lot easier if every player performed like Alcides Escobar:
Year  SB    BA   OBP   Slg
----  --  ----  ----  ----
2010  10  .235  .288  .326
2011  26  .254  .290  .343
2012  35  .293  .331  .390
2013  22  .234  .259  .300
2014  31  .285  .317  .377
2015  17  .257  .293  .320
I love Alcides. He doesn’t hide his volatility. It’s all-clothes-off, out there in the Kansas City sun. He trumpets the fact that there’s no way to pin him down. But while this data set is impossible to project into 2016, it’s perfectly consistent within a normal range. You probably couldn’t convince many people, but this is the same player every year.
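If you want to see how much scatter a perfectly stable skill can produce on its own, a coin-flip simulation makes the point. This is purely illustrative (my toy model, not Davitt’s methodology): hold a hitter’s “true” average fixed and deal out six 550-AB seasons.

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable

TRUE_AVG, AB = 0.280, 550  # a "true" .280 hitter with a full-season workload
for year in range(2010, 2016):
    # each AB is an independent coin flip at the player's true rate
    hits = sum(random.random() < TRUE_AVG for _ in range(AB))
    print(year, f"{hits / AB:.3f}")
```

Run it and the same fixed-skill player will typically swing by 30 or more points of batting average from his best season to his worst, Escobar-style, without his underlying talent changing at all.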
I’m starting to pull my hair out.
Completely understandable. But there’s more.
Research has shown that 150 games, or about the length of a single baseball season, is not enough of a sample size to be a reliable indicator of skill for some statistics. For instance, a stat like batting average doesn’t stabilize until about 910 AB, according to Russell Carleton. So we definitely cannot draw conclusions after one season. You cannot look at a batter who hits .230 one year and .270 the next and call that “growth.” What you’d more likely call that is a .250 hitter.
My friend Alcides? He’s your basic .260s hitter, even though he’s never actually had a batting average in the .260s.
But what does .260 mean anyway? Or .300? Or .250 or .200?
The line we draw in skills benchmarks is incredibly grey.
We’ll chase a .300 hitter as being significantly better than a .250 hitter; however, over 550 AB, the difference is fewer than 5 hits per month. The difference between a .272 average and a .249 average – still perceptibly different – is two hits per month, or one hit every other week. We’ll opt for a pitcher with a 3.95 ERA, passing over one with a 4.05 ERA. But what’s the real difference? A pitcher who allows 5 runs in 2 1/3 innings will see a different ERA impact than one who allows 9 runs in 3 innings, even though, for all intents and purposes, both got rocked. That could be your 0.10 variance in ERA right there.
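The arithmetic behind both examples is quick to check. The ERA half needs a baseline season to attach the bad outing to, so I’ve assumed an otherwise identical 180 IP / 80 ER season (my numbers, purely for illustration):

```python
# Batting average: what a 50-point or 23-point gap means in actual hits.
AB, MONTHS = 550, 6
for hi, lo in [(0.300, 0.250), (0.272, 0.249)]:
    per_month = (hi - lo) * AB / MONTHS
    print(f"{hi:.3f} vs {lo:.3f}: about {per_month:.1f} extra hits per month")

# ERA: two "got rocked" outings appended to the same assumed baseline.
BASE_IP, BASE_ER = 180, 80  # assumed otherwise-identical season
for ip, er in [(2 + 1/3, 5), (3, 9)]:
    era = 9 * (BASE_ER + er) / (BASE_IP + ip)
    print(f"{er} ER in {ip:.2f} IP -> season ERA {era:.2f}")
```

The gap between those two final ERAs comes out larger than the 0.10 difference we agonize over at the draft table.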
The line we draw between success and failure is also incredibly grey.
A batter whose HR output drops might have had a concurrent increase in doubles and triples (see: Xander Bogaerts). A pitcher whose ERA spikes may have seen no degradation in skills but was backed by a poor defense and a bullpen that allowed more inherited runners to score (see: Chris Sale). A speedster may have seen his SB total plummet only because he was traded to a team that didn’t run (see: Ben Revere). A closer may have been as effective as ever but lost the 9th inning role as a result of a trade or a manager with a quick hook (see: Drew Storen, Joakim Soria).
It’s like nothing is real anymore.
Oh, it’s real. The issue is how you interpret these realities. I’m trying to make a case that our trusted, comfortable statistics are not the place to find “real.” This becomes more problematic when we try to project the future. Garbage in, garbage out.
And honestly, beyond the volatility in the numbers, there is too much uncertainty for many players to pin down a stat line anyway. How do you handle Giancarlo Stanton? Will the wrist injury sap his power? Can you reasonably pro-rate Carlos Correa’s 2015 stat line to a full season? Is Jake Arrieta really now in the same class as Clayton Kershaw?
I don’t know. You don’t know. Nobody knows. But someone is going to have to slap a bunch of numbers on these guys in order for you to draft, right?
Um, right. Well, won’t they?
They will, but you don’t have to buy into any of it.
Rotisserie Earnings/Fantasy Rankings Volatility
Trying to find some stability within Rotisserie dollar earnings or Average Draft Position rankings (ADPs) is no less frustrating.
There is only a 65% chance that a player projected for a certain dollar value will finish the season within plus-or-minus $5 of that projection. That means, if you project a player will earn $25 and you agonize when bidding hits $27, there is really about a 2-in-3 shot of him finishing anywhere between $20 and $30.
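One way to visualize that 2-in-3 band: if you assume the misses are roughly bell-shaped (my assumption for illustration, not anything from the underlying research), a 65% chance of landing within plus-or-minus $5 pins down the implied spread, and you can read the tail odds right off it:

```python
from statistics import NormalDist

# Assume projection misses are normally distributed (illustration only).
# 65% of outcomes inside +/- $5 implies a particular standard deviation:
z = NormalDist().inv_cdf(0.5 + 0.65 / 2)  # half-width of the band, in std units
sigma = 5 / z
print(f"implied standard deviation: ${sigma:.2f}")

# For a $25 projection, the odds of earning $20 or less:
print(f"P(earns $20 or less): {NormalDist(25, sigma).cdf(20):.0%}")
```

Under that assumption, roughly one projected-$25 player in six ends up earning $20 or less.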
So I shouldn’t worry about those extra few bucks?
In most cases, no. But auction pricing is going to be market-driven anyway. So, if you are convinced that Jason Kipnis is worth $25 and land him for $21, you will have overpaid if the rest of your league sees him as no more than a $19 player. Even if he is really worth $30.
Arrrgh! I give up. Are you saying I should just pay whatever for whoever and not worry about budgets or bargains or value or anything?!
You still need to follow the market, but in general, yes. Kipnis might go for $25 in your league. He might be a bit inflated in Cleveland. But you don’t know whether he’s going to bat .300 again, or .240. You don’t know whether he is going to steal 12 bases or 30. At age 29, it’s entirely possible that double-digit power is still in his skill set. In fact, any of those ranges are within his current skill set. So what will it be?
Prognosticators will give you a stat line that will likely split the difference all around; we have no choice but to hedge. And if all of baseball’s top analysts don’t know what the heck Kipnis is going to do, clearly the other owners in your league have no clue either. So you need to decide whether his assets offset the risk of owning him and then just follow the market. I’ll get more into that much later.
Nice guy. Tease me with all this stuff and then put me off until later.
You’re not ready. There’s more.
I’ve said this often: the two most powerful forces known to man are regression and gravity. If you’re ever faced with the question of whether to project a player to IMPROVE or DECLINE, the better percentage play will always be DECLINE.
But that runs counter to what we want to see in our players. That’s why so many of us are infatuated with upwardly mobile rookies and anything in a data set that even remotely looks like improvement. But, facts:
FACT: Players who earn $30 in a season are only a 34 percent bet to repeat or improve the following season. (Matt Cederholm)
FACT: Pitchers who earn less than $24 in a season retain only 52 percent of their value the following year. More expensive pitchers do retain 80 percent of their value. (Michael Weddell)
That 80 percent is nice but it still means your ace pitcher’s value is going to decline.
If you are looking for value retention or a reasonable return on your investment in this game, you’re playing the wrong game. This is no less evident in snake draft leagues when it comes to the very best players. One would think baseball’s elite stars are the most projectable commodities. One would be wrong.
FACT: The success rate of ADP rankings correctly identifying each season’s top 15 players (in any order) is only 34 percent. (Study period: 2004-2015)
In fact, this meager success rate has been trending downward over time. So here’s the takeaway:
When you sit down at the draft table (or your computer, whatever) and start agonizing over who is going to fall to you in the first round, there is a 66 percent chance that whoever you end up drafting will be wrong. Ten of the first 15 players taken in your draft will not earn back their owner’s investment.
That’s ridiculous. You’re lying.
Seems that way, right? But last March’s Top 15 included Andrew McCutchen, Giancarlo Stanton, Miguel Cabrera, Jose Abreu, Carlos Gomez… need I go on?
I guess not.
Felix Hernandez, Adam Jones, Troy Tulowitzki…
All right, I get it.
Over the last 12 years, it was all the same thing.
A great exercise to establish some perspective is to look at 2016’s ADPs and try to identify which five of the top 15 players will earn back their draft slot. Mike Trout’s a lock, right? But his earnings ranks over the past four years have been 1, 2, 4 and 10. Clayton Kershaw has finished 5, 6, 3, 2 and 3 but this record-breaking string has to end sometime, doesn’t it? Paul Goldschmidt and Bryce Harper are gimmes, right? Well, if you commit to those four, then you get to pick only one more player among the next 11. Who are you going to bet on? Miguel Cabrera or Carlos Correa? Giancarlo Stanton or Kris Bryant? Andrew McCutchen or Nolan Arenado?
It’s not easy, but I’m going to give it a shot in a few chapters.
Playing Time
You can do all the skills assessment you want, but the bane of our existence is getting a handle on playing time. Back when this game was invented, AB and IP projections were just another unknown element of the forecasting process. Here is a brief history of how far we’ve come:
Year  Milestone event                     Playing time projections were...
----  ----------------------------------  ------------------------------------------
1984  Rotisserie Baseball book published  pretty easy, but we were clueless.
1989  Several analytical books published  getting harder as we got smarter.
1996  Internet goes mainstream            easier to crunch, but still a bitch.
2004  First high stakes leagues           now ridiculously hard with $$ on the line.
2007  Disabled list days spike to 28,000  Holy crap, this is getting nuts!
2012  DL days lost hits 30,000            marked by anarchy, chaos, ice cream binges.
2014  Daily games take over the planet    done. Screw it, I'll just play for one day.
As I noted in the 2016 Baseball Forecaster, the number of players making an appearance on each Major League roster has increased significantly over the past 30 years. Each additional body stakes a claim to playing time, but the availability of plate appearances and innings hasn’t increased. There are still just 162 games in a season.
All of that boils down to more challenges projecting at-bats and innings:
In any given year, of the ADP’s top 300 players, between 45 and 50 percent will lose playing time to the disabled list, demotion, suspension or release. Since playing time is a zero-sum proposition, those lost AB and IP have to go somewhere, and in fact, more than 70 percent of the most profitable players are typically driven by unexpected increases in playing time. The opportunity for those playing time increases is largely dependent on external events, virtually none of which are predictable on Draft Day. And so, more than 70 percent of each season’s most profitable players cannot be predicted on Draft Day.
As you would expect, these most profitable players have a disproportionately large impact on who is going to win your league. Research shows that 25 percent of the teams owning one or more of the most profitable players will win their league. More than 50 percent of those teams with the most profitable players will finish no lower than third place. The biggest driving force behind all that – changes in playing time – is unpredictable on Draft Day.
I think my head is going to explode.
I said you weren’t ready to hear the truth, and I meant it. But there’s one more variable.
Performance Enhancing Drugs
For more than a decade, I have written extensively about the impact of PEDs on the statistics that drive our game. I am not going to rehash the old arguments now. While there remains disagreement among analysts about how real or measurable the impact is, there are certain logical truths that are tough to deny.
– People are generally honest, except if it’s a choice between honesty and survival.
– For pro athletes, survival often equates to maintaining an edge to stay gainfully employed.
– If PEDs did not improve or sustain performance in order to give athletes an edge, why would they accept the risk of using them?
– You can’t dismiss the possibility that any radical swing in productivity could be caused by a player’s use or discontinuance of PEDs.
Ugh. I hate talk about PEDs. Are you trying to say that all players are motivated to cheat?
No, not all of them. But it’s yet one more variable that puts the “realness” of all statistics at risk. And unfortunately, it’s naïve to think that the lack of daily PED headlines means the problem has been contained. The above truths don’t change; only the effort to cover up PED use does.
But what about all those minor leaguers in the Mitchell Report? Aren’t they proof that PEDs don’t work?
For any alleged PED users who fell short of a real Major League career, it’s possible that they never would have made it out of rookie ball without that help. We don’t know. The impact of PEDs is relative to each player’s actual skill level. That means we need to question the legitimacy of performance stats throughout every level of pro ball. Probably college and high school too.
So, all in all, are you telling me that, despite all the massive effort we’ve been expending to construct elaborate systems to project player performance, none of the numbers can be trusted?
Well, we can a little, but not enough for it to matter. About five years ago, I asked 12 of the most prolific fantasy champions in high stakes leagues and national experts competitions to rank six variables based on how important they were to winning consistently. “More accurate player projections” came in dead last.
What did they say were the most important variables to winning consistently?
Here were the results:
1. Better in-draft strategy/tactics
2. Better sense of value
3. Better luck
4. Better grasp of contextual elements that affect players
5. Better in-season roster management
6. More accurate player projections
There was actually a seventh variable brought up by Larry Schechter – better use and access to TIME. He said that the more time invested in the entire process, the better the results. There is a good deal of truth to that.
But here is the question: can you build a successful team without statistical player projections at all? That is what this book is going to try to answer. But first, we need to discuss some more obstacles to success.
NEXT: How psychology is out to get you
I love this stuff. Can’t wait for the next chapter.
I couldn’t agree more that playing time is the great variable than can make or break a draft/auction. How much do you stick your neck out for that minor leaguer that may be called up on May 1st or may not be called up until September? And what’s to say that the stud rookie will stick? One can bake in some health issues for Adam Wainwright, but who can foresee him getting hurt while batting? Luck has a lot to do with it, but so does shying away from the risk at certain prices. But on the other hand, no guts / no glory.
Feed me more, Ron!
This is going to be good! A nice new spin on some topics the Forecaster has touched on, but with an update, new spin and a “Ron feel” to it all.
Looking forward to the next chapter.
Ron, of course baseball is unpredictable. That is a feature of the game, not a flaw. Of course we love the stats and the projections to three decimal places. But we know going in that we will be stymied, thwarted, stumped and befuddled after we make our prognostications. That is the fun of it! What can go wrong will go wrong. I picked G Stanton w my first round pick last year.
Mark – When I go into a season, I want to think that I have a good chance to win. I don’t want to think, “what can go wrong will go wrong.” I want to be able to plan and manage around that. Otherwise, aren’t you just throwing your hands up and leaving your fates to random chance?
Loving it so far, your writing style for this reminds me of the old Alex Patton books. He wrote to himself a lot too!
That was a deliberate nod to Alex, a new member of the Fantasy Sports Writers Association Hall of Fame. That conversational writing style is what made a dry topic so incredibly engaging. I’ve adopted it in this book because, well, I like it too! I will be giving him a shout-out in my Acknowledgments once the book is finished.
One thing I have disagreed with and/or misunderstood ever since I first heard an expert say it, is “Once a player exhibits a skill, he owns it.” HUH?!?!?! So if a player hits a curve ball, he now owns the skill of hitting a curve ball, even if he might have just gotten a lucky swing? Or on a larger scale, if Brady Anderson hits 50 home runs in a season, but hadn’t hit more than 21 in any other season up until then, does that mean he is now a power threat? (The most he ever hit in a single season after that was 24, and other than the 21, 50, and 24 home run seasons, all of his other single season totals during his career were below 20). Or how about the first basemen stealing 20+ bases in a season that never repeat it? How does that phrase make sense, let alone help someone understand a player’s skills, all things being equal? (But as you stated, they never are). Just a thought I’ve been holding in for many years, and just now got around to posting to an expert.
Greg – The original intent of the adage is as an attempt to explain surprise performances. While we only hear “once a player displays a skill, he owns it,” the full description appears in the Baseball Forecaster as this:
“Once a player displays a skill, he owns it. That display could occur at any time—earlier in his career, back in the minors, or even in winter ball play. And while that skill may lie dormant after its initial display, the potential is always there for him to tap back into that skill at some point, barring injury or age. That dormant skill can reappear at any time given the right set of circumstances.
Caveats:
1. The initial display of skill must have occurred over an extended period of time. An isolated 1-hit shut-out in Single-A ball amidst a 5.00 ERA season is not enough. The shorter the display of skill in the past, the more likely it can be attributed to random chance. The longer the display, the more likely that any re-emergence is for real.
2. If a player has been suspected of using performance enhancing drugs at any time, all bets are off.
Corollaries:
1. Once a player displays a vulnerability or skills deficiency, he owns that as well. That vulnerability could be an old injury problem, an inability to hit breaking pitches, or just a tendency to go into prolonged slumps.
2. The probability of a player correcting a skills deficiency declines with each year that deficiency exists.”
And just like every other projective measure or explanation, it’s never 100%.
Projections and ADP are out to get you. I totally get that. You can spend days, weeks, and even months projecting a stat line of what a player can possibly do. At the end of the day, how accurate can you be? As you stated, we can only project 70 percent of what a player can do, and even that is questionable. What we can do, and you have stated this well, is determine a player’s skill. I will gladly go into drafts this season with the knowledge of what skills a player possesses over some projection that will likely be wrong in the end.
[…] the only way you can overbid is if you know what a player is actually worth… and we don’t (Chapter 1). So I just focused on Assets and Liabilities (Chapter 4), and drafting the most balanced roster I […]