SkyKing162's Baseblog



A fan of the Yankees, Red Sox, and large sample sizes.


6.28.2003
 
NETruns

I've been thinking a lot lately about the correct theoretical concepts behind win shares. Here's the state of the union:

Claim points for hitting win shares are runs created above replacement. I call them extra runs created. ExtraRC = (RC/out - .5*lgRC/out) * outs.

Claim points for hitting loss shares are runs created below ideal. I call them lacking runs created. LackRC = (1.5*lgRC/out - RC/out) * outs.

Claim points for hitting game shares are expected runs produced. That is, the number of runs expected to be produced given a player's outs and a league-average RC/out rate. I call them total runs created. TotRC = lgRC/out * outs.

Note that extraRC + lackRC = totRC. This is good. This means that win shares + loss shares = game shares.

What's the best metric to use to value players? It should reward players for extraRC and punish them for lackRC. So what better calculation than simply win shares - loss shares, or extraRC - lackRC? I call these net runs created. NETruns = (RC/out - lgRC/out) * outs * 2. That 2 pops into the equation when you run through the algebra of subtracting lackRC from extraRC. It doesn't make a difference when comparing players; it just changes the scale a little bit. The issue of leaving the 2 in the equation versus taking it out is comparable to a team being 10 games over .500, but only being 5 games ahead of a .500 team. Both are correct; they just present the information in slightly different ways.
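For anyone who wants to play with these definitions, here is a minimal Python sketch of the four formulas above. The function name `net_runs` and the sample hitter (60 RC in 200 outs against a league rate of .18 RC/out) are made up for illustration; they are not taken from the tables.

```python
def net_runs(rc, outs, lg_rc_per_out):
    """Return (extraRC, lackRC, totRC, NETruns) for one hitter."""
    rate = rc / outs
    # Runs above replacement, where replacement = half the league rate.
    extra = (rate - 0.5 * lg_rc_per_out) * outs
    # Runs below ideal, where ideal = 1.5 times the league rate.
    lack = (1.5 * lg_rc_per_out - rate) * outs
    # Expected runs from a league-average hitter using the same outs.
    total = lg_rc_per_out * outs
    # Win shares minus loss shares, in run terms.
    net = extra - lack
    return extra, lack, total, net


# Hypothetical hitter and league rate, for illustration only.
extra, lack, total, net = net_runs(rc=60, outs=200, lg_rc_per_out=0.18)
print(round(extra, 1), round(lack, 1), round(total, 1), round(net, 1))
```

Whatever inputs you try, extraRC + lackRC should equal totRC, and NETruns should equal twice the runs above league average.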

So, we've got our metric: NETruns. It rewards players for quantity and quality compared to average. The tables use XR as the run estimator, so the columns read extraXR, lackXR, netXR, and totalXR. Let's see some 2003 numbers through June 26:

Top 10 Overall:


Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
pujols,albert     77   181     .42       60     -28     88       33
delgado,carlos    74   201     .37       56     -19     75       36
bonds,barry       64   152     .42       50     -23     73       28
sheffield,gary    66   189     .35       49     -15     63       34
mora,melvin       56   142     .39       43     -17     60       26
helton,todd       65   211     .31       45      -7     52       38
edmonds,jim       55   179     .31       39      -7     46       32
giambi,jason      59   201     .29       41      -4     45       36
garciaparra,nom   64   230     .28       43      -1     44       42
boone,bret        62   223     .28       42      -2     44       40


Bottom 10 Overall:

Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
tatis,fernando    11   149     .07       -2      30    -32       27
infante,omar      14   168     .08       -1      32    -33       30
dye,jermaine       6   123     .05       -6      28    -33       22
inge,brandon      10   150     .07       -3      31    -34       27
cora,alex         17   188     .09        0      34    -34       34
ausmus,brad       17   185     .09        0      34    -34       34
matthews,gary     22   216     .10        2      37    -35       39
izturis,cesar     20   208     .10        1      37    -36       38
phillips,brando   16   204     .08       -3      40    -42       37
konerko,paul      11   184     .06       -6      39    -45       33


Top 10 Firstbasemen

Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
delgado,carlos    74   201     .37       56     -19     75       36
helton,todd       65   211     .31       45      -7     52       38
giambi,jason      59   201     .29       41      -4     45       36
sweeney,mike      49   157     .31       34      -6     40       28
palmeiro,rafael   49   190     .26       32       2     30       34
thome,jim         51   212     .24       32       6     26       38
durazo,erubiel    47   191     .25       30       5     25       35
johnson,nick      27    90     .30       19      -3     22       16
sexson,richie     53   231     .23       32      10     21       42
lee,derrek        50   220     .23       30      10     19       40


Top 10 Secondbasemen

Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
boone,bret        62   223     .28       42      -2     44       40
vidro,jose        49   186     .27       32       1     31       34
soriano,alfonso   60   253     .24       37       9     28       46
kent,jeff         47   193     .24       30       6     24       35
giles,marcus      46   202     .23       28       9     19       37
durham,ray        38   155     .24       23       5     19       28
cintron,alex      26    99     .26       17       1     16       18
hart,bo           10    17     .60        9      -6     14        3
young,michael     44   213     .21       25      14     10       39
belliard,ronnie   31   148     .21       17      10      8       27

(continued in next post... stupid blogger errors)


 
Top 10 Shortstops

Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
garciaparra,nom   64   230     .28       43      -1     44       42
rodriguez,alex    57   219     .26       37       3     34       40
furcal,rafael     55   226     .24       35       6     28       41
gonzalez,alex     48   190     .25       31       3     28       34
cabrera,orlando   52   219     .24       32       8     25       40
renteria,edgar    51   215     .24       31       8     23       39
martinez,ramon    20   103     .19       10       8      2       19
berroa,angel      37   197     .19       19      17      2       36
relaford,desi     34   183     .19       17      16      1       33
mendez,donaldo     8    39     .20        4       3      1        7


Top 10 Thirdbasemen

Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
koskie,corey      55   195     .28       37      -2     38       35
lowell,mike       59   232     .25       37       5     33       42
ensberg,morgan    34   105     .33       25      -6     31       19
rolen,scott       51   195     .26       33       2     31       35
blalock,hank      46   184     .25       29       4     25       33
glaus,troy        43   193     .23       26       9     17       35
boone,aaron       46   221     .21       26      14     12       40
mueller,bill      37   172     .22       22       9     12       31
stynes,chris      35   167     .21       20      11      9       30
chavez,eric       43   216     .20       23      16      7       39


Top 20 Outfielders and Designated Hitters

Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
pujols,albert     77   181     .42       60     -28     88       33
bonds,barry       64   152     .42       50     -23     73       28
sheffield,gary    66   189     .35       49     -15     63       34
mora,melvin       56   142     .39       43     -17     60       26
edmonds,jim       55   179     .31       39      -7     46       32
thomas,frank      55   183     .30       38      -5     43       33
bradley,milton    51   161     .31       36      -7     43       29
ramirez,manny     59   215     .28       40      -1     41       39
walker,larry      51   170     .30       35      -4     40       31
byrnes,eric       48   167     .29       33      -3     36       30
martinez,edgar    48   164     .29       33      -3     36       30
suzuki,ichiro     56   213     .27       37       2     36       39
wells,vernon      62   247     .25       40       5     35       45
gonzalez,luis     56   212     .26       37       2     35       38
jones,chipper     50   183     .28       34      -1     34       33
wilkerson,brad    46   162     .29       32      -2     34       29
anderson,garret   56   220     .26       36       4     32       40
giles,brian       41   137     .30       28      -4     32       25
guillen,jose      41   140     .29       28      -3     31       25
berkman,lance     50   193     .26       33       2     31       35


Top 10 Catchers

Player            XR  OUTS  XR/OUT  extraXR  lackXR  netXR  totalXR
lopez,javy        45   149     .30       32      -5     36       27
myers,greg        35   116     .30       24      -3     27       21
posada,jorge      45   185     .25       29       5     24       34
lieberthal,mike   39   157     .25       25       4     21       28
wilson,tom        27    99     .28       18       0     19       18
piazza,mike       24    80     .29       16      -2     18       15
varitek,jason     37   161     .23       22       7     15       29
rodriguez,ivan    39   177     .22       23       9     14       32
loduca,paul       41   197     .21       23      12     11       36
larue,jason       32   155     .21       18      10      7       28


For the full list of players and their NETruns, click the link on the sidebar.


6.27.2003
 
Beating the Ratio Conversion Horse

Motivated by my recent interest in win shares, loss shares, and game shares, I had an idea concerning the conversion of ratio stats into counting stats when valuing fantasy baseball players. Traditionally, a baseline ratio is chosen, and the counting stat becomes how many more hits/runs/whatever a player has than the baseline would produce, given the same playing time opportunity. For example, if Derek Jeter hits .280 in 600 ABs, he has 12 extra hits compared to a .260 hitter with 600 ABs: (.280-.260)*600 = 12.

There's nothing magical about the baseline - it could be "anything." Many people choose the anticipated AVG of the last place roto team. Thus, extra hits becomes a measure of how many hits each player is helping your team do better than last place. I think this is because players are often compared to replacement level during baseball analysis. But since the next step in the valuation process is to find the replacement level number of extra hits, using the replacement level AVG to find extra hits is not a requirement. It may turn out to be the best, but it's not necessarily the best just by definition.

Win shares compares a player to replacement level. Loss shares compares a player to ideal level. Both are needed to get the full picture of a player's performance. So why not do the same for roto values? Let's compare a batter's AVG to both a replacement level (anticipated last place AVG) and an ideal level (anticipated first place AVG). This would measure how much a player helps pull you up from the dregs of last place, but also how much he's preventing your team from finishing in first. Here's an example:

Derek Jeter AVG: .280 in 600 ABs
Erubiel Durazo AVG: .270 in 450 ABs
last place AVG: .260
first place AVG: .290

DJ netHits = (.280-.260)*600 - (.290-.280)*600 = 12-6 = 6
Ruby netHits = (.270-.260)*450 - (.290-.270)*450 = 4.5-9 = -4.5

What does this number really mean? Well, let's do some algebra (oooooh):

netHits = (plAVG-repAVG)*AB-(idealAVG-plAVG)*AB
netHits = [(plAVG-repAVG)-(idealAVG-plAVG)]*AB
netHits = [2*plAVG-(repAVG+idealAVG)]*AB
netHits = [2*plAVG-2*(meanAVG)]*AB where meanAVG is the mean of repAVG and idealAVG
netHits = 2*(plAVG-meanAVG)*AB

This is simply twice the number of extra hits when using meanAVG as the baseline. It would have to be tested, but I'm pretty sure the mean of the repAVG and idealAVG is pretty close to the mean AVG of the draftable player pool. Since multiplying counting stats by constants doesn't affect value, using this method is the same as using extra hits compared to meanAVG.
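As a sanity check on the algebra, here is a small Python sketch that computes netHits both ways for the Jeter and Durazo example: directly from the replacement and ideal baselines, and as twice the extra hits over meanAVG. The function names are mine, chosen for illustration.

```python
def extra_hits(avg, ab, baseline_avg):
    """Hits above what a baseline hitter would get in the same ABs."""
    return (avg - baseline_avg) * ab


def net_hits(avg, ab, rep_avg, ideal_avg):
    """Extra hits over replacement minus hits short of ideal."""
    return extra_hits(avg, ab, rep_avg) - (ideal_avg - avg) * ab


REP_AVG = 0.260    # anticipated last place AVG from the example
IDEAL_AVG = 0.290  # anticipated first place AVG from the example
MEAN_AVG = (REP_AVG + IDEAL_AVG) / 2

for name, avg, ab in [("Jeter", 0.280, 600), ("Durazo", 0.270, 450)]:
    direct = net_hits(avg, ab, REP_AVG, IDEAL_AVG)
    shortcut = 2 * extra_hits(avg, ab, MEAN_AVG)
    print(name, round(direct, 1), round(shortcut, 1))
```

Both columns should agree for each player, which is all the derivation claims.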

This exercise is just one more thing that makes me think using meanAVG as the baseline has some merit over repAVG. I remember Todd Zola claiming repAVG is better because the empirical results turn out "better," but I wonder if that isn't more a matter of making up for faulty projections than of being theoretically correct. I guess it should be tested using year-end stats and values.


6.25.2003
 
Win Shares + Loss Shares = Game Shares

This was originally a post at RotoJunkie. It's been modified to seem more like an article for my blog. If you want to comment, head on over to the Sabrmetrics forum. Enjoy...

As a burgeoning sabermetrics groupie a few years ago when Win Shares hit the market, I ate it up. I thought it was the coolest ranking method. Well, almost the coolest. You see, there were things that just bugged me about Bill James' Win Shares system, mostly the (many) parts where James made decisions more subjectively than objectively. For example, those weird 40/30/20/10 weighted scales for valuing individual defense, the 52/48 argument (I think pitchers are undervalued, so I'll give 'em more points), and the fact that a team's Win Shares are directly proportional to wins, when there is a lot of variability in wins given a certain ability level. But it wasn't until I really started doing lots of my own baseball analysis that something else started to bug me. I couldn't explain it until I read a pdf document put together by TangoTiger and Rob Wood from Baseball Primer. Plain and simple...

Win Shares just aren't a complete, useful metric without their counterpart, Loss Shares.

Take two pitchers, Bob and Nolan. They don't seem like pitchers of equal ability, except that Bill James (via Win Shares) says they are. So, are they? Oh, you want some stats...ok, here you go:

Bob: 200 IP, 3.00 ERA, 0 BB, 0 SO, 0 HR (yes, every batter puts the ball in play - I love extreme examples)
The common assumption in the post-DIPS world is that half the credit for the results of balls in play goes to the fielders and half to the pitcher, so I'll stick with that. Thus, since all of the batters Bob faced put the ball in play, Bob gets half the credit (half of all is half) for runs saved during his time on the mound: (6.75 ERA - 3.00 ERA)/9*200*.5 = 41.7 runs prevented, where 6.75 is the replacement level ERA of 1.5 times the league average ERA (4.50).

Nolan: 200 IP, 4.00 ERA, with peripheral stats such that Nolan garners 75% of the credit for runs prevented (this doesn't mean half the batters he faces put the ball in play - it only means that half the runs scored are a result of balls put in play and the other half are a result of balls not put in play)
Nolan's credit = (6.75 - 4.00)/9*200*.75 = 45ish runs prevented.

Let's assume that both Bob and Nolan pitch in front of fielding teams with the same ability. Win Shares says Nolan deserves more credit, because he prevented more runs, even though he pitched the same number of innings with an ERA a full run higher than Bob. Seriously, would you want Nolan as your pitcher, or Bob? Do you want 200 IP with a 4.00 ERA or a 3.00 ERA? Seems like there's something fishy going on...

The idea behind Win Shares is that it assigns credit (Win Shares) on the basis of responsibility. Nolan's responsible for preventing more runs than Bob, thus he receives more Win Shares. But he's also responsible for allowing more runs than Bob, which Win Shares ignores. What's needed is Loss Shares. Bill James even alludes to the fact that he thought about Loss Shares, but left them out because he couldn't figure out how to calculate them. But they're critical if you want to look at the whole picture. In order to compare Bob and Nolan, we need to know not only how many runs each was responsible for preventing, but also how many runs each was responsible for allowing.

I don't claim to have come up with a way to calculate Loss Shares (although I believe Tango and Rob did in their pdf file), but let's do a little calculation that's on the right track.

Let's define "runs allowed^" as runs allowed above the "ideal pitcher" (the positive version of the replacement pitcher). Where the replacement pitcher has an ERA of 1.5*lgERA, the ideal pitcher has an ERA of .5*lgERA = 2.25 in our example. (Yes, pitchers often have better ERAs than this ideal ERA, but the point will still get across - plus, pitchers can, and do, have ERAs worse than replacement level.)

Bob gets charged with allowing^ (3.00-2.25)*200/9*.5 = 8.3 runs worse than ideal.
Nolan gets charged with allowing^ (4.00-2.25)*200/9*.75 = 29.2 runs worse than ideal.

Hmmm, Bob's responsible for allowing way fewer runs below ideal than Nolan. Let's combine runs prevented and runs allowed^:

Bob NET = 41.7 - 8.3 = 33.4
Nolan NET = 45.8 - 29.2 = 16.6
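The whole Bob-versus-Nolan calculation fits in a few lines of Python. This is just a sketch of the formulas as written above, using the league ERA of 4.50 implied by the 6.75 replacement level; the function name `pitcher_net` is invented for illustration, and because nothing is rounded along the way the last digit may differ slightly from the hand arithmetic.

```python
LG_ERA = 4.50
REP_ERA = 1.5 * LG_ERA    # replacement pitcher: 6.75
IDEAL_ERA = 0.5 * LG_ERA  # ideal pitcher: 2.25


def pitcher_net(era, ip, pitcher_share):
    """Return (runs prevented, runs allowed^, NET) for one pitcher.

    pitcher_share is the fraction of responsibility assigned to the
    pitcher rather than to his fielders (.5 for Bob, .75 for Nolan).
    """
    prevented = (REP_ERA - era) / 9 * ip * pitcher_share
    allowed = (era - IDEAL_ERA) / 9 * ip * pitcher_share
    return prevented, allowed, prevented - allowed


for name, era, share in [("Bob", 3.00, 0.50), ("Nolan", 4.00, 0.75)]:
    prevented, allowed, net = pitcher_net(era, 200, share)
    print(f"{name}: prevented {prevented:.1f}, allowed^ {allowed:.1f}, NET {net:.1f}")
```

Changing `pitcher_share` is a quick way to see how much of the gap comes from the extra responsibility Nolan carries.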

Thus, while Win Shares gives roughly equal credit to Bob and Nolan (a slight edge to Nolan, in fact), "NET Shares" would say Bob's performance was about twice as valuable as Nolan's. Why? Because while Nolan prevents more runs, he's also given more responsibility (aka opportunity) to prevent runs. And with more opportunity, Nolan's also allowing more runs, so many more that his advantage in Win Shares gets negated. It's like saying the Braves are better than the Tigers because they won 94 games to the Tigers' 68, without knowing how many games both teams played. If both played 162 games, the Braves are better, but if the Braves played 200 games and the Tigers played 100, aren't the Tigers more impressive? The same idea holds for Win Shares. Nolan uses a bigger chunk of the defensive opportunities (Game Shares) than Bob and thus should be held accountable for it. So in addition to Win Shares, we need Loss Shares. Together the two imply an all-encompassing stat - Game Shares. Because Nolan's more responsible than Bob for runs while he pitches, Nolan has more Game Shares.


6.23.2003
 
DIPS Numbers Through June 15

I've finally figured out how to create and post html at an actual website, so now my "daily" DIPS numbers will be available for all to see. Here's a quick rundown of what each number is:

ERA: tried and true Earned Run Average
XERA: Extrapolated Earned Run Average - what you'd expect a pitcher's ERA to be based on his actual component stats (e.g. singles, walks, strikeouts) and league average ER/R rate
$ERA: expected ERA using a pitcher's unadjusted $BB, $SO, and $HR rates combined with his team's average $H, $2B, and $3B rates
dERA: expected ERA using a pitcher's adjusted (to a neutral park) $BB, $SO, and $HR rates combined with MLB-average $H, $2B, and $3B rates. This is the "DIPS ERA."
rdERA: same thing as dERA, but with a pitcher's $BB, $SO, and $HR rates regressed "appropriately" towards the mean. Currently, appropriately means about .3 for $BB, .2 for $SO, and .5 for $HR. Only the current season is considered. In the future, I hope to work in a regression rate that's a function of BFP for the current season. And, in the grand scheme of things, the goal is also to incorporate past season performance with current performance to get a good predictor of future performance.
$H=(H-HR)/(AB-SO-HR)
$BB=BB/PA
$SO=SO/AB
$HR=HR/(AB-SO)
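
The four rate definitions above translate directly into code. Here is a minimal Python sketch; the function name `dips_rates` and the sample pitcher line are hypothetical, and the park adjustments and regression behind dERA and rdERA are not attempted.

```python
def dips_rates(ab, pa, h, hr, bb, so):
    """Compute the four $-rates from a pitcher's counting stats allowed."""
    return {
        "$H": (h - hr) / (ab - so - hr),  # non-HR hits per ball in play
        "$BB": bb / pa,                   # walks per plate appearance
        "$SO": so / ab,                   # strikeouts per at-bat
        "$HR": hr / (ab - so),            # home runs per at-bat with contact
    }


# Hypothetical pitcher line, for illustration only.
rates = dips_rates(ab=400, pa=440, h=100, hr=10, bb=35, so=80)
for label, value in rates.items():
    print(f"{label}: {value:.3f}")
```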


6.20.2003
 
Bret Boone Versus Alfonso Soriano 2003

Player    G   AB   R   H  2B  3B  HR  RBI  BB  SO  SB  CS   AVG   OBP   SLG   OPS  BB/PA  HR/AB
Soriano  71  316  56  90  12   3  18   44  22  64  19   3  .285  .341  .513  .854   .065   .057
Boone    70  277  55  89  22   0  19   59  27  47   6   1  .321  .381  .606  .987   .089   .069


Ok, so make me an argument that says Soriano should start the All-Star Game over Boone.

The only possible things I see in favor of Soriano are these:
- Soriano's had more plate appearances (about 35 it looks like)
- Soriano's stealing many more bases at the same success rate

My rebuttal (after laughing profusely) is thus:
- 35 PA isn't very many and, compared to the quality difference, is insignificant
- The stolen base difference equals about 3 runs according to linear weights - not a big deal.
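
The stolen base claim can be roughed out in Python. The post doesn't say which linear weights it uses, so the values below (+.30 runs per steal, -.60 per caught stealing) are an assumption on my part, chosen as a commonly cited pair; other weights will move the answer a bit.

```python
# Assumed linear weights; the post does not specify which values it used.
SB_RUNS = 0.30
CS_RUNS = -0.60


def sb_runs(sb, cs):
    """Net runs from base stealing under the assumed weights."""
    return sb * SB_RUNS + cs * CS_RUNS


soriano = sb_runs(sb=19, cs=3)
boone = sb_runs(sb=6, cs=1)
print(f"Soriano {soriano:.1f}, Boone {boone:.1f}, difference {soriano - boone:.1f}")
```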

Points in favor of Boone, put simply:
- .040 OBP advantage
- .090 SLG advantage (for a .130 difference in OPS for those scoring at home)
- Half the time Boone hits at Safeco
- Boone plays kick-ass defense, whereas Soriano merely holds his own.

Let's do a quick runs created analysis (OBP*SLG*PA):
Soriano: .341*.513*338 = 59
Boone: .381*.606*304 = 70

Definitely an advantage for Boone, but here's the kicker - consider outs (AB-H+CS) and RC/27 outs:
Soriano: 229 outs yields 7.0 runs/game
Boone: 189 outs yields 10 runs/game
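
For anyone who wants to rerun the numbers, here is a short Python sketch of the same rough analysis: runs created as OBP*SLG*PA, outs as AB-H+CS, and runs per 27 outs. The function names are mine; the inputs are the stat lines from the table at the top of the post, with the 338 and 304 PA used above.

```python
def runs_created(obp, slg, pa):
    """Quick runs created estimate: OBP * SLG * PA."""
    return obp * slg * pa


def rc_per_27(rc, ab, h, cs):
    """Runs created per 27 outs, counting outs as AB - H + CS."""
    outs = ab - h + cs
    return rc / outs * 27


lines = {
    "Soriano": dict(obp=0.341, slg=0.513, pa=338, ab=316, h=90, cs=3),
    "Boone": dict(obp=0.381, slg=0.606, pa=304, ab=277, h=89, cs=1),
}

results = {}
for name, s in lines.items():
    rc = runs_created(s["obp"], s["slg"], s["pa"])
    results[name] = (rc, rc_per_27(rc, s["ab"], s["h"], s["cs"]))
    print(f"{name}: RC {rc:.1f}, RC/27 {results[name][1]:.1f}")
```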

Ok, it's an extremely rough analysis, but much more accurate than the general argument of "Soriano's such a great athlete and can hit any pitch." Sure, he can probably do athletic things that Boone (or most others) can't, but that's not the issue - the issue is which player is doing more to help his team win. And it should be obvious that it's Boone.