Thursday, February 21, 2019

CBS: Predicting Pitcher Performance -- A Fool's Errand?


This post was an initial pass at digging into predicting pitching performance. The full enhanced version is on CBS: https://www.cbssports.com/fantasy/baseball/news/fantasy-baseball-draft-prep-how-much-can-pitching-projections-tell-us/

__________________________________________

Recently, there's been a lot of chatter on Twitter about the value of projections. They're certainly far from perfect -- the creators would admit as much -- but they're among the best we've got.

Many anti-projection arguments put more weight on past stats than they should. Go dig up Kyle Freeland argument threads on Twitter if you don't believe me.

So what should we be using? Just projections? A combination of both?

To find out, let's start by appreciating how wildly stats can fluctuate year to year, particularly those that many fantasy gamers play for in traditional 5x5 leagues.

To quantify this, here are the r-squared figures between a metric in one season ("season Y") and the same metric in the following season ("season Y+1"). This uses 1,114 player seasons from 2007 to 2018 for the four traditional starting pitching categories (excluding saves), with a minimum of 150 innings pitched in each season:

Metric R^2
K 0.526
WHIP 0.208
IP 0.155
ERA 0.108
W 0.045

Basically, outside of strikeouts and maybe WHIP, previous-season stats are terrible at predicting future performance in those areas. I also included innings, given that it's a key driver of performance, for strikeouts and wins in particular. On the whole, the relationship between these stats year to year is weak.
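
For anyone who wants to reproduce a number like these, here's a minimal sketch of the calculation in Python. It assumes the r-squared is simply the squared Pearson correlation between a stat in season Y and the same stat in season Y+1, which matches the R^2 of a one-variable linear fit. The row layout, the `pair_consecutive_seasons` helper and the sample rows are all made up for illustration; they are not the data behind the table.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


def pair_consecutive_seasons(rows, stat, min_ip=150):
    """Pair each qualifying season Y with the same pitcher's season Y+1.

    Both seasons must clear the innings minimum (assumed here to be 150).
    """
    by_key = {(r["player"], r["year"]): r for r in rows if r["ip"] >= min_ip}
    xs, ys = [], []
    for (player, year), row in by_key.items():
        nxt = by_key.get((player, year + 1))
        if nxt is not None:
            xs.append(row[stat])
            ys.append(nxt[stat])
    return xs, ys


# Made-up rows for illustration only; not real pitcher data.
rows = [
    {"player": "A", "year": 2016, "ip": 180, "era": 3.10},
    {"player": "A", "year": 2017, "ip": 175, "era": 3.60},
    {"player": "A", "year": 2018, "ip": 160, "era": 3.30},
    {"player": "B", "year": 2016, "ip": 200, "era": 4.20},
    {"player": "B", "year": 2017, "ip": 190, "era": 3.90},
    {"player": "C", "year": 2017, "ip": 120, "era": 2.90},  # below the innings cutoff
    {"player": "C", "year": 2018, "ip": 170, "era": 3.80},
]

xs, ys = pair_consecutive_seasons(rows, "era")
print(len(xs), "paired seasons, r-squared =", round(r_squared(xs, ys), 3))
```

Note that a pitcher with three straight qualifying seasons contributes two pairs (Y to Y+1, then Y+1 to Y+2). Whether the original sample counted seasons that way is my assumption.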

Now, let's see if projections fared any better -- taking an average of Steamer, Depth Charts, THE BAT and ZiPS for 2018 starting pitchers only (those who were projected for at least 75 innings by all systems):

Metric   R^2     Diff vs. Season Y
K        0.282   -46%
WHIP     0.238   14%
ERA      0.223   107%
W        0.188   316%
IP       0.113   -27%

That's a surprisingly mixed bag, BUT projections did fare significantly better in measuring talent -- that is, ERA and WHIP. They were much worse at projecting innings, a big factor in their worse projection of strikeouts. The giant surprise was projections beating the previous year's stats in wins by more than 300% (an r-squared roughly four times as high), despite being nearly 30% worse at projecting innings. That goes to show how volatile -- and, dare I say, useless -- wins can be at measuring a pitcher's talent. The damn things fluctuate wildly each season!
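
Here's a rough sketch of the two mechanical steps behind that table: blending several projection systems into one consensus line, and turning two r-squared values into the Diff column. The `consensus_projection` and `pct_diff` helpers and the projection lines are hypothetical, and I'm assuming a simple unweighted average plus the "at least 75 innings in every system" filter described above. The r-squared inputs at the bottom are the rounded figures from the two tables, so ERA and wins come out at 106% and 318% rather than the 107% and 316% shown, presumably because the table was built from unrounded values.

```python
def consensus_projection(systems, stat, min_ip=75):
    """Average one stat across projection systems, per pitcher.

    Keeps only pitchers that every system projects for at least min_ip innings.
    """
    shared = set.intersection(*(set(proj) for proj in systems.values()))
    result = {}
    for player in sorted(shared):
        lines = [proj[player] for proj in systems.values()]
        if all(line["ip"] >= min_ip for line in lines):
            result[player] = sum(line[stat] for line in lines) / len(lines)
    return result


def pct_diff(new, old):
    """Relative change of new versus old, as a percentage."""
    return (new / old - 1.0) * 100.0


# Made-up projection lines for illustration only.
systems = {
    "Steamer": {"A": {"ip": 180, "era": 3.50}, "B": {"ip": 70, "era": 4.10}},
    "ZiPS": {"A": {"ip": 170, "era": 3.70}, "B": {"ip": 90, "era": 4.30}},
}
consensus = consensus_projection(systems, "era")
print(consensus)  # pitcher B drops out: one system has him under 75 innings

# Rounded r-squared values copied from the two tables above.
season_y = {"K": 0.526, "WHIP": 0.208, "ERA": 0.108, "W": 0.045, "IP": 0.155}
projected = {"K": 0.282, "WHIP": 0.238, "ERA": 0.223, "W": 0.188, "IP": 0.113}
for metric in projected:
    print(f"{metric:5s} {pct_diff(projected[metric], season_y[metric]):+.0f}%")
```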

For those volume stats -- strikeouts and wins -- what if we stripped away the innings projection and assessed the differing abilities to project future performance on a rate basis instead? Let's take a look at how the previous season and projections fared at predicting future strikeout and win rates:

Metric   Season Y   Proj.
K/9      0.615      0.470
W/IP     0.017      0.134
W/GS     0.042      0.169

Hey, not bad! Our ability to project strikeouts per inning (or K/9 in this case) is anywhere from roughly 17-66% better than our ability to project raw strikeouts. Wins didn't fare the same -- somehow wins per inning and wins per game started are less sticky than raw wins. Oddly, when excluding ZiPS, projections actually predict W/GS roughly 5% better than raw wins. I'm not sure what's going on with ZiPS, but I'd rather not waste too much time with wins.
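
If you want to strip the innings out of your own numbers, the conversions are simple. This is a small sketch with hypothetical helper names. The one real gotcha is that box scores write innings as 180.1 and 180.2 to mean 180 and one-third and 180 and two-thirds, so the decimal has to be converted before dividing.

```python
def innings_to_float(ip):
    """Convert box-score innings notation ("180.1" = 180 and 1/3) to a real number."""
    whole, _, outs = str(ip).partition(".")
    if outs not in ("", "0", "1", "2"):
        raise ValueError(f"unexpected innings value: {ip!r}")
    return int(whole) + int(outs or 0) / 3.0


def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings."""
    return 9.0 * strikeouts / innings


def wins_per_inning(wins, innings):
    """Wins per inning pitched (W/IP)."""
    return wins / innings


def wins_per_start(wins, games_started):
    """Wins per game started (W/GS)."""
    return wins / games_started


# Made-up season line for illustration only.
ip = innings_to_float("180.0")
print(k_per_9(200, ip))         # 200 strikeouts in 180 innings -> 10.0
print(wins_per_inning(15, ip))
print(wins_per_start(15, 30))   # 15 wins in 30 starts -> 0.5
```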

Okay, so far we've seen a mixed bag, but I'd argue this is a big win for projections. They are much better at predicting future ERA and WHIP than the previous season's often-cited stats. They're also significantly better at predicting wins. If you must use the previous season's stats, it appears the best place to do so would be with strikeouts (K/9 or K%) and potentially innings.

Let's wrap this up by assessing which metrics are best for predicting the four primary "roto" starting pitching categories, plus innings.

ERA

Metric R^2
Proj. 0.223
SIERA 0.193
xFIP 0.183
K-BB% 0.176
FIP 0.174
K% 0.167
ACES 0.164

WHIP

Metric R^2
K-BB% 0.294
SIERA 0.251
Proj. 0.238
K% 0.218
xFIP 0.213
WHIP 0.208
ACES 0.205
FIP 0.200

STRIKEOUTS (K%)

Metric R^2
K% 0.615
K-BB% 0.514
Proj. 0.470
SwStr% 0.464
Contact% 0.456
SIERA 0.360
Z-Contact% 0.343
ACES 0.337

WINS

Metric R^2
Proj. 0.188
FIP 0.104
SIERA 0.100
xFIP 0.094
K 0.093
K-BB% 0.088
K% 0.074

INNINGS

Metric R^2
IP 0.155
TBF 0.121
Proj. 0.113
Pitches 0.103
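
Each of the tables above boils down to the same recipe: take a handful of season Y metrics, check how much of the season Y+1 target each one explains, and sort. Here's a minimal sketch of that ranking step, again assuming r-squared means the squared Pearson correlation. The `rank_predictors` helper, the field names and the sample pairs are made up for illustration and will not reproduce the published figures.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def rank_predictors(pairs, target, candidates):
    """Rank season Y metrics by how well each explains the season Y+1 target."""
    next_values = [nxt[target] for _, nxt in pairs]
    scores = {
        metric: r_squared([cur[metric] for cur, _ in pairs], next_values)
        for metric in candidates
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up (season Y, season Y+1) pairs for illustration only.
pairs = [
    ({"era": 3.10, "siera": 3.40, "k_bb_pct": 20.0}, {"era": 3.50}),
    ({"era": 4.60, "siera": 4.00, "k_bb_pct": 12.0}, {"era": 4.10}),
    ({"era": 3.90, "siera": 4.30, "k_bb_pct": 10.0}, {"era": 4.40}),
    ({"era": 2.80, "siera": 3.20, "k_bb_pct": 22.0}, {"era": 3.20}),
]

ranking = rank_predictors(pairs, "era", ["era", "siera", "k_bb_pct"])
for metric, score in ranking:
    print(f"{metric:10s} {score:.3f}")
```

A projected stat can be dropped into the same function as just another season Y column, which is roughly how I'd expect a "Proj." row to be compared against the rest.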

LEAGUE-WINNING TAKEAWAYS

This piece wouldn't be complete without mentioning this -- projection testing is based on one season. I'd love to test multiple years -- similar to the actual stats -- but that will be for a different day. Additionally, we didn't even mention the new Statcast data -- "expected" stats, exit velocities, etc. That will also be for a different day. We could find a near-infinite number of metrics to test, but this should get us most of the way there.

With those caveats out of the way, how should we apply this information? 
  • Projections absolutely should be used over the actual stat itself from previous seasons, particularly for ERA, WHIP and wins
  • When evaluating pitchers, bet on strikeouts -- among traditional 5x5 categories, that's the category that we're far-and-away best equipped to predict
  • Use an array of metrics when evaluating and projecting pitchers: projections, ERA estimators (SIERA, xFIP, FIP and DRA, which wasn't tested here), K-BB%, K% and ACES.
  • Here's what I'll be looking at to assess and predict a pitcher's performance across the various categories:
    • ERA: Nearly all of the above -- projected ERA, ERA estimators, K-BB%, K% and ACES 
    • WHIP: K-BB%, projected WHIP, SIERA
    • Strikeouts: K%, K-BB%, projected K% or K/9
    • Wins: Projected wins
    • Innings: Previous season's IP/TBF and projected IP
There you have it -- the best tools we have to project pitchers. When the hitting portion of this series is complete, we'll come to see that hitters are more projectable than pitchers. Hopefully this analysis gives you the edge so you know what to look at -- and what not to look at -- while your competitors are poking around in the wrong places.

A NOTE ON SECOND HALF SPLITS

Read enough fantasy analysis and you're sure to come across someone citing second half splits. Maybe there's good reason for it -- injury, change in talent, etc. But more often than not, it's a case of recency bias.

I tested this using data from FanGraphs for starting pitchers who threw at least 30 second-half innings and then 150 innings the next season. I looked at the r-squared between their second-half numbers in season one and the same stats over the full season two.

In essentially every case, you're significantly better off using the full season numbers over the cherry-picked second half numbers when the pitcher "figured it out." Outside of relatively obvious cases like injuries, I'd rather bet on the averages (i.e., full season numbers) while others try and find the outliers.

R-Squared of Full Season vs. 2nd Half Numbers

Metric   Full    2nd Half   Diff
IP       0.155   0.129      20%
TBF      0.129   0.121      6%
HR/9     0.163   0.086      90%
K%       0.615   0.553      11%
BB%      0.455   0.346      31%
K-BB%    0.553   0.469      18%
WHIP     0.208   0.135      54%
BABIP    0.039   0.031      27%
LOB%     0.025   0.018      38%
FIP      0.312   0.213      46%
xFIP     0.444   0.389      14%
LD%      0.027   0.021      27%
GB%      0.627   0.585      7%
FB%      0.617   0.581      6%
Soft%    0.006   0.010      -42%
Med%     0.065   0.023      183%
Hard%    0.067   0.034      97%
ERA      0.108   0.063      70%
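
For the curious, here's a minimal sketch of how a full-season versus second-half comparison like this could be set up, using the innings cutoffs described above (at least 30 second-half innings, then 150 innings the next season). The `split_vs_full` helper, the field names and the sample rows are hypothetical, and the sketch scores both versions of a stat on one shared sample. That is my assumption, and the published full-season column may come from a different sample.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def split_vs_full(rows, stat, min_half_ip=30, min_next_ip=150):
    """Compare full-season and second-half versions of a stat as predictors
    of the same stat in the following full season."""
    sample = [
        r for r in rows
        if r["half_ip"] >= min_half_ip and r["next_ip"] >= min_next_ip
    ]
    next_values = [r["next_" + stat] for r in sample]
    full = r_squared([r["full_" + stat] for r in sample], next_values)
    half = r_squared([r["half_" + stat] for r in sample], next_values)
    return len(sample), full, half


# Made-up rows for illustration only; not real pitcher data.
rows = [
    {"half_ip": 80, "next_ip": 190, "full_k_pct": 27.0, "half_k_pct": 30.0, "next_k_pct": 26.5},
    {"half_ip": 75, "next_ip": 170, "full_k_pct": 19.0, "half_k_pct": 17.5, "next_k_pct": 20.0},
    {"half_ip": 60, "next_ip": 155, "full_k_pct": 22.5, "half_k_pct": 25.0, "next_k_pct": 21.0},
    {"half_ip": 25, "next_ip": 180, "full_k_pct": 24.0, "half_k_pct": 31.0, "next_k_pct": 23.5},  # too few second-half innings
    {"half_ip": 85, "next_ip": 110, "full_k_pct": 21.0, "half_k_pct": 20.0, "next_k_pct": 18.0},  # too few innings next season
    {"half_ip": 70, "next_ip": 200, "full_k_pct": 24.5, "half_k_pct": 22.0, "next_k_pct": 25.0},
]

n, full_r2, half_r2 = split_vs_full(rows, "k_pct")
print(f"n={n}  full={full_r2:.3f}  second half={half_r2:.3f}")
```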

Year-to-Year Stickiness

For reference, I've also included the year-to-year stickiness of all metrics I tested. This measures the relationship between a metric in one season and the same metric in the next season.

Metric R^2
ACES 0.764
GB% 0.627
SwStr% 0.622
FB% 0.617
K% 0.615
K/9 0.615
Contact% 0.608
O-Contact% 0.556
Z-Contact% 0.554
K-BB% 0.553
Ks 0.526
O-Swing% 0.501
BB% 0.455
SIERA 0.455
xFIP 0.444
Z-Swing% 0.433
FIP 0.312
WHIP 0.208
HR/9 0.163
IP 0.155
Pitches 0.148
TBF 0.129
HR/FB 0.111
ERA 0.108
Hard% 0.067
Med% 0.065
W 0.045
BABIP 0.039
LD% 0.027
LOB% 0.025
GS 0.018
Soft% 0.006
G 0.003

Aaron Sauceda Web Developer
