SkyKing162's Baseblog

A fan of the Yankees, Red Sox, and large sample sizes.

Finding Actual Ability Level

Because of Win Shares (and loss shares, don't forget) and TangoTiger's Win Probability Added, I've been thinking a lot about measuring value (past contribution towards trying to win). The cousin of value is actual ability, of which we get a glimpse every time a player does something on the field. The general method for determining ability (from the sabermetric angle) is to collect a player's stats, adjust them for environment (ballpark, mostly), and regress towards the mean, since actual performance is only a sample of a player's ability.

Let's keep things simple, and say that for hitting, the important measures of ability are home run rate, strikeout rate, walk rate, BABIP, and XBIP. These ratios are pretty basic and, taken together, are fairly complete. You can calculate OBP and SLG from them, and there's not much you want to know about a hitter beyond his OBP and SLG abilities.
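Here's a rough sketch of how OBP and SLG fall out of those five rates. The definitions are assumptions for illustration only: home run, strikeout, and walk rates are per plate appearance, BABIP is hits per ball in play, XBIP is treated as extra bases per hit in play, and hit-by-pitches and sacrifices are ignored. The function name `obp_slg` is made up.

```python
def obp_slg(hr_rate, k_rate, bb_rate, babip, xbip):
    """Approximate OBP and SLG from five component rates.

    Assumed definitions (not from any official source):
      hr_rate, k_rate, bb_rate : events per plate appearance
      babip                    : hits per ball in play
      xbip                     : extra bases per hit in play
    Hit-by-pitches and sacrifices are ignored.
    """
    # Every PA that is not a HR, K, or BB puts a ball in play.
    balls_in_play = 1.0 - hr_rate - k_rate - bb_rate
    hits_in_play = babip * balls_in_play

    # Times on base per PA: walks, home runs, and hits in play.
    obp = bb_rate + hr_rate + hits_in_play

    # At-bats per PA exclude walks; total bases count 4 per home run
    # and (1 + extra bases) per hit in play.
    at_bats = 1.0 - bb_rate
    total_bases = 4.0 * hr_rate + hits_in_play * (1.0 + xbip)
    slg = total_bases / at_bats
    return obp, slg


# Completely made-up hitter.
obp, slg = obp_slg(hr_rate=0.03, k_rate=0.17, bb_rate=0.10, babip=0.300, xbip=0.30)
print(f"OBP {obp:.3f}, SLG {slg:.3f}")
```

If your definition of XBIP is different (extra-base hits per ball in play, say), only the `total_bases` line needs to change.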

The one fault I find in the raw-stats to ability-stats conversion is the lack of consideration of opponent. Analysts smartly remove ballpark effects, but I think the pitchers and fielders that hitters face, and the hitters that pitchers face, should also be considered. If Raul Mondesi and Derek Jeter have the same walk rate, most people assume they have the same ability to take walks, since both play in the same home ballpark. But Mondesi has missed time to injuries. What if all the games he missed were at a park that decreases walks? Then Jeter's walk rate is artificially low relative to Mondesi's. And what if during those games the Yankees had to hit against Pedro Martinez, Nate Cornejo, and other pitchers who don't walk anyone? Again, Jeter's walk rate would be lower than it should be relative to Mondesi's.

My point is that the translation from raw-stats to ability-stats should consider play-by-play data. For example (completely made up), Derek Jeter in 2003 has 8 PA against Pedro Martinez, 12 against Nate Cornejo, 9 against Bob Stanley, etc. Weighting these pitchers' walk rates by the number of times they faced Jeter, we could come up with the average walk rate of pitchers facing Derek Jeter. Then, since we know Jeter's raw walk rate, we could calculate his theoretical actual walk rate (regressing appropriately and considering park factors, naturally). Currently, analysts just assume that opposition evens out over the long run. It doesn't, especially when getting down to the smaller sample sizes of relief pitchers.
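A minimal sketch of that weighting in Python. The plate appearance counts are the made-up ones above; the pitcher walk rates, Jeter's raw rate, and the league rate are equally made up. The adjustment step is just one simple option (scale the raw rate by league rate over opponent rate) and it skips the regression and park factors entirely.

```python
def opponent_walk_rate(matchups):
    """PA-weighted average walk rate of the pitchers a hitter faced.

    matchups: list of (plate_appearances, pitcher_walk_rate) pairs.
    """
    total_pa = sum(pa for pa, _ in matchups)
    return sum(pa * rate for pa, rate in matchups) / total_pa


def adjust_for_opposition(raw_rate, opp_rate, league_rate):
    """Scale a raw walk rate to what it might be against league-average pitching.

    Assumption: a simple ratio adjustment. Facing stingier-than-average
    pitchers (opp_rate < league_rate) pushes the estimate up.
    """
    return raw_rate * league_rate / opp_rate


# (PA against Jeter, pitcher's walk rate) -- all hypothetical numbers.
jeter_matchups = [
    (8, 0.05),   # Pedro Martinez
    (12, 0.06),  # Nate Cornejo
    (9, 0.09),   # Bob Stanley
]

opp = opponent_walk_rate(jeter_matchups)
adjusted = adjust_for_opposition(raw_rate=0.10, opp_rate=opp, league_rate=0.085)
print(f"opponents' walk rate {opp:.4f}, adjusted walk rate {adjusted:.4f}")
```

Since these particular pitchers walk fewer hitters than the assumed league rate, the adjusted figure comes out higher than the raw one.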

Of course, in order to compute Derek Jeter's actual ability walk rate, you need to know Pedro's and Nate's, which requires you to know the walk rates of everyone they faced. The whole thing is a big dependent web of hitters v. pitchers v. hitters v. pitchers... However, I'm sure there's a way to calculate these things while minimizing error.
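One possible way to untangle the web, sketched in Python, is to guess everyone's rate, then alternate between re-estimating hitters from the current pitcher rates and pitchers from the current hitter rates until the numbers settle. This is only an illustration under a big assumption: that the walk rate in a matchup equals hitter rate times pitcher rate divided by league rate. The function name, the data layout, and the players are all hypothetical, and there is no regression or park adjustment.

```python
def solve_walk_rates(pa, bb, iterations=200):
    """Alternately re-estimate hitter and pitcher walk rates.

    pa[(hitter, pitcher)] = plate appearances in that matchup
    bb[(hitter, pitcher)] = walks in that matchup
    Assumed model: matchup walk rate = hitter * pitcher / league.
    Every player needs at least one walk for the divisions to be safe.
    """
    total_pa = sum(pa.values())
    league = sum(bb.values()) / total_pa
    hitters = sorted({h for h, _ in pa})
    pitchers = sorted({p for _, p in pa})
    h_rate = {h: league for h in hitters}
    p_rate = {p: league for p in pitchers}

    for _ in range(iterations):
        # Hitters: actual walks versus what the pitchers faced would allow.
        for h in hitters:
            walks = sum(bb.get((h, p), 0) for p in pitchers)
            exposure = sum(pa.get((h, p), 0) * p_rate[p] for p in pitchers)
            h_rate[h] = league * walks / exposure

        # Pin the scale: the PA-weighted hitter average equals the league rate.
        mean_h = sum(h_rate[h] * n for (h, _), n in pa.items()) / total_pa
        for h in hitters:
            h_rate[h] *= league / mean_h

        # Pitchers: same idea, using the updated hitter rates.
        for p in pitchers:
            walks = sum(bb.get((h, p), 0) for h in hitters)
            exposure = sum(pa.get((h, p), 0) * h_rate[h] for h in hitters)
            p_rate[p] = league * walks / exposure

    return league, h_rate, p_rate


# Completely made-up matchups between two hitters and two pitchers.
pa = {("A", "X"): 50, ("A", "Y"): 30, ("B", "X"): 20, ("B", "Y"): 60}
bb = {("A", "X"): 4, ("A", "Y"): 5, ("B", "X"): 1, ("B", "Y"): 4}
league, h_rate, p_rate = solve_walk_rates(pa, bb)
print(league, h_rate, p_rate)
```

The rescaling step is there because the model only pins down the product of hitter and pitcher rates; without it, the hitter numbers could drift up while the pitcher numbers drift down by the same factor.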

I'll try a sample calculation and report it. Let me know if anyone has any ideas.
