A fan of the Yankees, Red Sox, and large sample sizes.
Beating the Ratio Conversion Horse
Motivated by my recent interest in win shares, loss shares, and game shares, I had an idea concerning the conversion of ratio stats into counting stats when valuing fantasy baseball players. Traditionally, a baseline ratio is chosen, and the counting stat becomes how many hits/runs/whatever a player is better than that baseline would be, given the same playing time opportunity. For example, if Derek Jeter hits .280 in 600 ABs, he has 12 extra hits compared to a .260 hitter with 600 ABs: (.280-.260)*600 = 12.
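This baseline conversion is easy to sketch in code (the `extra_hits` function name is mine, purely for illustration):

```python
# Convert a ratio stat (AVG) into a counting stat (extra hits)
# relative to a chosen baseline AVG, given the same number of ABs.
def extra_hits(avg: float, ab: int, baseline_avg: float) -> float:
    return (avg - baseline_avg) * ab

# Jeter: .280 over 600 ABs vs. a .260 baseline
print(round(extra_hits(0.280, 600, 0.260), 1))  # 12.0
```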
There's nothing magical about the baseline - it could be anything. Many people choose the anticipated AVG of the last-place roto team, so extra hits becomes a measure of how many hits better than last place each player makes your team. I think this choice is popular because players are so often compared to replacement level in baseball analysis. But since the next step in the valuation process is to find the replacement level number of extra hits, using the replacement level AVG to find extra hits is not a requirement. It may turn out to be the best choice, but it's not the best by definition.
Win shares compares a player to replacement level. Loss shares compares a player to ideal level. Both are needed to get the full picture of a player's performance. So why not do the same for roto values? Let's compare a batter's AVG to both a replacement level (anticipated last place AVG) and an ideal level (anticipated first place AVG). This would measure how much a player helps pull you up from the dregs of last place, but also how much he's preventing your team from finishing in first. Here's an example:
Derek Jeter AVG: .280 in 600 ABs
Erubiel Durazo AVG: .270 in 450 ABs
last place AVG: .260
first place AVG: .290
DJ netHits = (.280-.260)*600 - (.290-.280)*600 = 12-6 = 6
Ruby netHits = (.270-.260)*450 - (.290-.270)*450 = 4.5-9 = -4.5
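The two netHits examples above can be reproduced with a small sketch (the `net_hits` helper is illustrative, not any standard function):

```python
# netHits: reward for beating the replacement (last place) AVG,
# minus the penalty for falling short of the ideal (first place) AVG.
def net_hits(avg: float, ab: int, rep_avg: float, ideal_avg: float) -> float:
    return (avg - rep_avg) * ab - (ideal_avg - avg) * ab

print(round(net_hits(0.280, 600, 0.260, 0.290), 1))  # Jeter: 6.0
print(round(net_hits(0.270, 450, 0.260, 0.290), 1))  # Durazo: -4.5
```

Note that Durazo comes out negative even though he beats the replacement AVG, because his shortfall from the ideal AVG outweighs it.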
What does this number really mean? Well, let's do some algebra (oooooh):
netHits = (plAVG-repAVG)*AB-(idealAVG-plAVG)*AB
netHits = [(plAVG-repAVG)-(idealAVG-plAVG)]*AB
netHits = [2*plAVG-(repAVG+idealAVG)]*AB
netHits = [2*plAVG-2*(meanAVG)]*AB where meanAVG is the mean of repAVG and idealAVG
netHits = 2*(plAVG-meanAVG)*AB
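As a quick numeric check of that algebra, both forms agree for the two player lines above (code and names are my own, just to verify the identity):

```python
# netHits computed directly vs. as twice the extra hits over meanAVG.
def net_hits(avg, ab, rep_avg, ideal_avg):
    return (avg - rep_avg) * ab - (ideal_avg - avg) * ab

rep_avg, ideal_avg = 0.260, 0.290
mean_avg = (rep_avg + ideal_avg) / 2  # .275

for avg, ab in [(0.280, 600), (0.270, 450)]:  # Jeter, Durazo
    direct = net_hits(avg, ab, rep_avg, ideal_avg)
    via_mean = 2 * (avg - mean_avg) * ab
    assert abs(direct - via_mean) < 1e-9
```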
This is simply twice the number of extra hits when using meanAVG as the baseline. It would have to be tested, but I'm pretty sure the mean of repAVG and idealAVG is pretty close to the mean AVG of the draftable player pool. Since multiplying a counting stat by a constant doesn't affect value, using this method is the same as using extra hits compared to meanAVG.
This exercise is just one more thing that makes me think using meanAVG as the baseline has some merit over repAVG. I remember Todd Zola claiming repAVG is better because the empirical results turn out "better," but I wonder if that isn't more an issue of compensating for faulty projections than of being theoretically correct. It should be tested using year-end stats and values.