SkyKing162's Baseblog
A fan of the Yankees, Red Sox, and large sample sizes.
6.28.2003
NETruns

I've been thinking a lot lately about the correct theoretical concepts behind win shares. Here's the state of the union:

Claim points for hitting win shares are runs created above replacement. I call them extra runs created: extraRC = (RC/out - .5*lgRC/out) * outs.

Claim points for hitting loss shares are runs created below ideal. I call them lacking runs created: lackRC = (1.5*lgRC/out - RC/out) * outs.

Claim points for hitting game shares are expected runs produced - that is, the number of runs a league-average hitter (lgRC/out) would be expected to produce given the player's outs. I call them total runs created: totRC = lgRC/out * outs.

Note that extraRC + lackRC = totRC. This is good. It means win shares + loss shares = game shares.

What's the best metric for valuing players? It should reward players for extraRC and punish them for lackRC. So what better calculation than win shares minus loss shares, or extraRC - lackRC? I call these net runs created: NETruns = (RC/out - lgRC/out) * outs * 2.

That 2 pops into the equation when you run through the algebra of subtracting lackRC from extraRC. It doesn't make a difference when comparing players; it just changes the scale a bit. Leaving the 2 in versus taking it out is like a team being 10 games over .500 but only 5 games ahead of a .500 team: both are correct, they just present the information in slightly different ways.

So, we've got our metric, NETruns. It rewards players for both quantity and quality compared to average. Let's see some 2003 numbers through June 26:

Top 10 Overall:
Bottom 10 Overall:
Top 10 Firstbasemen
Top 10 Secondbasemen
(continued in next post... stupid blogger errors) Top 10 Shortstops
Top 10 Thirdbasemen
Top 20 Outfielders and Designated Hitters
Top 10 Catchers
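The claim-point formulas above can be sketched in a few lines of code. This is a minimal illustration with made-up RC/out numbers, not the full win shares machinery:

```python
# Sketch of the NETruns framework described above.
# Inputs: a player's runs created per out, the league RC/out, and his outs.

def extra_rc(rc_per_out, lg_rc_per_out, outs):
    """Runs created above replacement (replacement = half the league rate)."""
    return (rc_per_out - 0.5 * lg_rc_per_out) * outs

def lack_rc(rc_per_out, lg_rc_per_out, outs):
    """Runs created below ideal (ideal = 1.5 times the league rate)."""
    return (1.5 * lg_rc_per_out - rc_per_out) * outs

def tot_rc(lg_rc_per_out, outs):
    """Expected runs for a league-average hitter using the same outs."""
    return lg_rc_per_out * outs

def net_runs(rc_per_out, lg_rc_per_out, outs):
    """extraRC - lackRC, which the algebra reduces to 2*(RC/out - lgRC/out)*outs."""
    return (rc_per_out - lg_rc_per_out) * outs * 2

# Hypothetical player: 0.21 RC/out over 200 outs in a 0.18 RC/out league.
e = extra_rc(0.21, 0.18, 200)   # 24.0 (win-share claim points)
l = lack_rc(0.21, 0.18, 200)    # 12.0 (loss-share claim points)
t = tot_rc(0.18, 200)           # 36.0 (game-share claim points)

assert abs((e + l) - t) < 1e-9                              # extraRC + lackRC = totRC
assert abs((e - l) - net_runs(0.21, 0.18, 200)) < 1e-9      # NETruns = extraRC - lackRC
```

The asserts confirm the two identities claimed above: the win-share and loss-share claim points sum to the game-share claim points, and NETruns is just their difference.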
For the full list of players and their NETruns, click the link on the sidebar.

6.27.2003
Beating the Ratio Conversion Horse

Motivated by my recent interest in win shares, loss shares, and game shares, I had an idea concerning the conversion of ratio stats into counting stats when valuing fantasy baseball players. Traditionally, a baseline ratio is chosen, and the counting stat becomes how many hits/runs/whatever a player is better than that baseline would be, given the same playing time. For example, if Derek Jeter hits .280 in 600 ABs, he has 12 extra hits compared to a .260 hitter with 600 ABs: (.280-.260)*600 = 12.

There's nothing magical about the baseline - it could be anything. Many people choose the anticipated AVG of the last-place roto team, so that extra hits measures how many hits a player moves your team ahead of last place. I think this choice is popular because players are so often compared to replacement level in baseball analysis. But since the next step in the valuation process is to find the replacement-level number of extra hits anyway, using the replacement-level AVG to compute extra hits is not a requirement. It may turn out to be the best choice, but it isn't the best by definition.

Win shares compare a player to replacement level. Loss shares compare a player to an ideal level. Both are needed to get the full picture of a player's performance. So why not do the same for roto values? Let's compare a batter's AVG to both a replacement level (anticipated last-place AVG) and an ideal level (anticipated first-place AVG). This would measure how much a player pulls you up from the dredges of last place, but also how much he's preventing your team from finishing first. Here's an example:

Derek Jeter: .280 AVG in 600 ABs
Erubiel Durazo: .270 AVG in 450 ABs
last-place AVG: .260
first-place AVG: .290

DJ netHits = (.280-.260)*600 - (.290-.280)*600 = 12 - 6 = 6
Ruby netHits = (.270-.260)*450 - (.290-.270)*450 = 4.5 - 9 = -4.5

What does this number really mean?
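Before the algebra, the example can be checked with a quick sketch (Python, with the post's numbers hard-coded):

```python
def net_hits(avg, ab, rep_avg, ideal_avg):
    """Hits above the replacement baseline minus hits short of the ideal baseline."""
    return (avg - rep_avg) * ab - (ideal_avg - avg) * ab

# Baselines from the example: last-place AVG .260, first-place AVG .290.
jeter = net_hits(0.280, 600, 0.260, 0.290)   # 12 - 6 = 6
durazo = net_hits(0.270, 450, 0.260, 0.290)  # 4.5 - 9 = -4.5

print(round(jeter, 1), round(durazo, 1))  # 6.0 -4.5
```

Jeter beats the last-place baseline by more than he trails the first-place one, so he nets positive; Durazo, despite also beating last place, nets negative.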
Well, let's do some algebra (oooooh):

netHits = (plAVG - repAVG)*AB - (idealAVG - plAVG)*AB
netHits = [(plAVG - repAVG) - (idealAVG - plAVG)]*AB
netHits = [2*plAVG - (repAVG + idealAVG)]*AB
netHits = [2*plAVG - 2*meanAVG]*AB, where meanAVG is the mean of repAVG and idealAVG
netHits = 2*(plAVG - meanAVG)*AB

This is simply twice the number of extra hits when using meanAVG as the baseline. It would have to be tested, but I'm pretty sure the mean of repAVG and idealAVG is pretty close to the mean AVG of the draftable player pool. Since multiplying counting stats by a constant doesn't affect relative value, this method is equivalent to using extra hits compared to meanAVG. This exercise is one more thing that makes me think using meanAVG as the baseline has some merit over repAVG. I remember Todd Zola claiming repAVG is better because the empirical results turn out "better," but I wonder if that isn't more an issue of compensating for faulty projections than of being theoretically correct. I guess it should be tested using year-end stats and values.

6.25.2003
Win Shares + Loss Shares = Game Shares

This was originally a post at RotoJunkie. It's been modified to read more like an article for my blog. If you want to comment, head on over to the Sabrmetrics forum. Enjoy...

As a burgeoning sabermetrics groupie a few years ago when Win Shares hit the market, I ate it up. I thought it was the coolest ranking method. Well, almost the coolest. You see, there were things that just bugged me about Bill James' Win Shares system, mostly the many places where James made decisions more subjectively than objectively: the weird 40/30/20/10 weighted scales for valuing individual defense, the 52/48 split ("I think pitchers are undervalued, so I'll give 'em more points"), and the fact that a team's Win Shares are directly proportional to its wins, even though there's a lot of variability in wins for a given level of ability. But it wasn't until I really started doing lots of my own baseball analysis that something else started to bug me. I couldn't articulate it until I read a pdf document put together by TangoTiger and Rob Wood from Baseball Primer. Plain and simple: Win Shares just aren't a complete, useful metric without their counterpart, Loss Shares.

Take two pitchers, Bob and Nolan. They don't seem like pitchers of equal ability, except that Bill James (via Win Shares) says they are. So, are they? Oh, you want some stats... ok, here you go:

Bob: 200 IP, 3.00 ERA, 0 BB, 0 SO, 0 HR (yes, every batter puts the ball in play - I love extreme examples)

The common assumption in the post-DIPS world is that half the credit for the results of balls in play goes to the fielders and half to the pitcher, so I'll stick with that. Since all of the batters Bob faced put the ball in play, Bob gets half the credit for the runs saved during his time on the mound: (6.75 - 3.00)/9 * 200 * .5 = 41.7 runs prevented, where 6.75 is the replacement-level ERA of 1.5 times the league-average ERA.
Nolan: 200 IP, 4.00 ERA, with peripheral stats such that Nolan garners 75% of the credit for runs prevented (this doesn't mean that only a quarter of the batters he faces put the ball in play - it means half the runs while he pitches trace to balls in play, for which he gets half credit, and half to walks, strikeouts, and homers, for which he gets full credit: .5*.5 + .5*1 = .75)

Nolan's credit: (6.75 - 4.00)/9 * 200 * .75 = 45.8 runs prevented.

Let's assume Bob and Nolan pitch in front of fielding units of equal ability. Win Shares says Nolan deserves more credit, because he prevented more runs, even though he pitched the same number of innings with an ERA a full run higher than Bob's. Seriously, would you rather have Nolan as your pitcher, or Bob? Do you want 200 IP with a 4.00 ERA or a 3.00 ERA? Seems like there's something fishy going on...

The idea behind Win Shares is that it assigns credit (Win Shares) on the basis of responsibility. Nolan's responsible for preventing more runs than Bob, so he receives more Win Shares. But he's also responsible for allowing more runs than Bob, which Win Shares ignores. What's needed is Loss Shares. Bill James even alludes to the fact that he thought about Loss Shares, but left them out because he couldn't figure out how to calculate them. They're critical if you want the whole picture. To compare Bob and Nolan, we need to know not only how many extra runs Nolan was responsible for preventing, but also how many extra runs he was responsible for allowing.

I don't claim to have come up with a way to calculate Loss Shares (although I believe Tango and Rob did in their pdf file), but let's do a little calculation that's on the right track. Define "runs allowed^" as runs allowed above the "ideal pitcher" (the positive mirror of the replacement pitcher). Where the replacement pitcher has an ERA of 1.5*lgERA, the ideal pitcher has an ERA of .5*lgERA = 2.25 in our example.
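The whole Bob-versus-Nolan comparison can be sketched in code. This is a minimal illustration of the calculation above, not Tango and Rob's actual method; `credit_share` is the pitcher's share of responsibility (.5 for Bob, .75 for Nolan):

```python
# Sketch of the runs-prevented / runs-allowed^ comparison described above.
# League ERA is 4.50, so replacement ERA = 1.5 * 4.50 = 6.75
# and ideal ERA = 0.5 * 4.50 = 2.25, matching the post's numbers.

LG_ERA = 4.50
REP_ERA = 1.5 * LG_ERA    # 6.75
IDEAL_ERA = 0.5 * LG_ERA  # 2.25

def runs_prevented(era, ip, credit_share):
    """Runs saved vs. a replacement pitcher, scaled by the pitcher's credit share."""
    return (REP_ERA - era) / 9 * ip * credit_share

def runs_allowed_above_ideal(era, ip, credit_share):
    """Runs allowed beyond an ideal pitcher ("runs allowed^"), same scaling."""
    return (era - IDEAL_ERA) / 9 * ip * credit_share

def net(era, ip, credit_share):
    """Runs prevented minus runs allowed^ - the 'NET Shares' idea."""
    return runs_prevented(era, ip, credit_share) - runs_allowed_above_ideal(era, ip, credit_share)

bob = net(3.00, 200, 0.5)     # 41.7 - 8.3  = 33.3
nolan = net(4.00, 200, 0.75)  # 45.8 - 29.2 = 16.7
# Nolan prevents more runs, but his NET is about half of Bob's.
```

Even though Nolan's runs prevented edge out Bob's, charging both pitchers for runs above the ideal flips the comparison decisively toward Bob.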
(Yes, pitchers often post better ERAs than this ideal ERA, but the point still gets across - and pitchers can, and do, post ERAs worse than replacement level.)

Bob gets charged with allowing^ (3.00 - 2.25)/9 * 200 * .5 = 8.3 runs worse than ideal. Nolan gets charged with allowing^ (4.00 - 2.25)/9 * 200 * .75 = 29.2 runs worse than ideal. Hmmm - Bob's responsible for far fewer runs above ideal than Nolan. Let's combine runs prevented and runs allowed^:

Bob NET = 41.7 - 8.3 = 33.4
Nolan NET = 45.8 - 29.2 = 16.6

Thus, while Win Shares gives Nolan slightly more credit than Bob, "NET Shares" would say Bob's performance was about twice as valuable as Nolan's. Why? Because while Nolan prevents more runs, he's also given more responsibility (aka opportunity) to prevent runs. And with more opportunity, Nolan also allows more runs - so many more that his Win Shares advantage gets negated. It's like saying the Braves are better than the Tigers because they won 94 games to the Tigers' 68, without knowing how many games each team played. If both played 162 games, the Braves are better; but if the Braves played 200 games and the Tigers played 100, the Tigers were actually more impressive. The same idea holds for Win Shares. Nolan uses a bigger chunk of the defensive opportunities (Game Shares) than Bob and should be held accountable for it. So in addition to Win Shares, we need Loss Shares. Together the two imply an all-encompassing stat - Game Shares. Because Nolan's responsible for more of what happens while he pitches, Nolan has more Game Shares.

6.23.2003
DIPS Numbers Through June 15

I've finally figured out how to create and post html at an actual website, so now my "daily" DIPS numbers will be available for all to see. Here's a quick rundown of what each number is:

ERA: tried and true Earned Run Average
XERA: Extrapolated Earned Run Average - what you'd expect a pitcher's ERA to be based on his actual component stats (e.g. singles, walks, strikeouts) and the league-average ER/R rate
$ERA: expected ERA using a pitcher's unadjusted $BB, $SO, and $HR rates combined with his team's average $H, $2B, and $3B rates
dERA: expected ERA using a pitcher's adjusted (to a neutral park) $BB, $SO, and $HR rates combined with MLB-average $H, $2B, and $3B rates. This is the "DIPS ERA."
rdERA: same as dERA, but with a pitcher's $BB, $SO, and $HR rates regressed "appropriately" towards the mean. Currently, appropriately means about .3 for $BB, .2 for $SO, and .5 for $HR, considering only the current season. In the future, I hope to work in a regression rate that's a function of BFP for the current season. And, in the grand scheme of things, the goal is to incorporate past-season performance with current performance to get a good predictor of future performance.

$H = (H-HR)/(AB-SO-HR)
$BB = BB/PA
$SO = SO/AB
$HR = HR/(AB-SO)

6.20.2003
Bret Boone Versus Alfonso Soriano 2003
Ok, so make me an argument that says Soriano should start the All-Star game over Boone. The only things I see in favor of Soriano are these:

- Soriano's had more plate appearances (about 35 more, it looks like)
- Soriano's stealing many more bases at the same success rate

My rebuttal (after laughing profusely) is thus:

- 35 PA isn't very many and, compared to the quality difference, is insignificant
- The stolen base difference equals about 3 runs according to linear weights - not a big deal

Points in favor of Boone, put simply:

- a .040 OBP advantage
- a .090 SLG advantage (for a .130 difference in OPS, for those scoring at home)
- Half the time, Boone hits at Safeco
- Boone plays kick-ass defense, whereas Soriano merely holds his own

Let's do a quick runs created analysis (RC roughly equals OBP * SLG * AB):

Soriano: .341 * .513 * 338 = 59
Boone: .381 * .606 * 304 = 70

Definitely an advantage for Boone, but here's the kicker - consider outs (AB - H + CS) and RC/27 outs:

Soriano: 229 outs yields 7.0 runs/game
Boone: 189 outs yields 10 runs/game

Ok, it's an extremely rough analysis, but it's much more accurate than the general argument of "Soriano's such a great athlete and can hit any pitch." Sure, he can probably do athletic things that Boone (or most others) can't, but that's not the issue - the issue is which player is doing more to help his team win. And it should be obvious that it's Boone.
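The rough analysis above can be reproduced in a few lines, using the RC ≈ OBP * SLG * AB shorthand and the through-June stat lines quoted in the post:

```python
def runs_created(obp, slg, ab):
    """Rough runs created: OBP * SLG * AB (the shorthand used above)."""
    return obp * slg * ab

def rc_per_game(rc, outs):
    """Runs created per 27 outs - a rate stat to go with the counting stat."""
    return rc / outs * 27

soriano_rc = runs_created(0.341, 0.513, 338)  # ~59
boone_rc = runs_created(0.381, 0.606, 304)    # ~70

print(round(rc_per_game(soriano_rc, 229), 1))  # ~7.0 runs/game
print(round(rc_per_game(boone_rc, 189), 1))    # ~10.0 runs/game
```

Boone creates more total runs while using 40 fewer outs, which is why the per-game gap is even wider than the raw RC gap.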