SkyKing162's Baseblog
A fan of the Yankees, Red Sox, and large sample sizes.
12.25.2003
Merry Christmas I hope everyone out there enjoyed the 25th of December. For those of you that I know, get ready for some Sky-cooked food. My family loaded me up with kitchen equipment, including a food processor and ice cream maker. What kind of ice cream should I make first - strawberry? Mango-pineapple? Peppermint? Peach-raspberry? Chocolate-espresso? Mmmm. My family also pushed me back a little bit towards the Yankees side of things with a fitted cap and framed photo of Lou Gehrig and Babe Ruth. Although, if you check out this DM sim of the 2004 season, the good money's on Boston. Also note that this sim study (which uses the ZiPS projections) agrees with my point of view that the AL Central will be awful next year, and that the A's will run away from the overrated Mariners and Angels (and appropriately rated Rangers). 12.24.2003
Schuylkill Kings v. Hell Raisers
The Hell Raisers defeated the Silicon Valley Technophobes 4 games to 2, advancing to the Peoria Strat League World Series. The Hell Raisers finished second to the Kings in the Joe Jackson division during the regular season. In an ironic twist of fate (ok, it's not that big of a deal), the Kings GM (me) drafted for the Hell Raisers during the most recent player draft. It was fun getting to decide which players would go to Hell, but here's hoping Schuylkill wallops the Wannabes. World Series next week.
12.23.2003
World Series Bound
The Schuylkill Kings are headed to the Big Show after a 4-2 series victory in the semis of the Peoria Strat League. I don't have any stats handy (should be getting them within a few days - and next year we'll have a website), but rumor has it Wiki Gonzalez and Junior Spivey deserve co-MVP honors for their hand in bashing Guam's lefties. The other piece of info that drifted my way was about a 17 strikeout performance by Guam's Johan Santana in his first start. He followed it up with a 2 IP, 6 ER outing. What would Baseball Prospectus' flakiest starters list say about that? More details as I get 'em. The World Series will be played next week.
12.15.2003
ESPN.com - MLB - Stark: Momentum building: And the trading of Garciaparra could expand into a three-way blockbuster in which the Red Sox would get one of the Dodgers' starting pitchers -- most likely Odalis Perez -- then spin him elsewhere for a left fielder to replace Ramirez. I'd love for this thing to go down. I don't really know how much it would help the Sox, but it'd still be pretty damn cool. I wonder for whom (yes, this blog attempts to be grammatically correct) the Sox would spin Perez... The Cards need pitching, but the only OF of note is Edmonds. Not likely. The Braves are actually in the market for pitching - but I doubt Chipper would leave. The ChiSox might want pitching - would they give up Magglio? Probably not, but maybe Carlos Lee. How about Houston for Hidalgo? How about Seattle for Ibanez or McCracken? Or... and forget about this 10 seconds after you read it... how about they spin Perez for prospects, and use the money saved to sign Vlad for 4 years/$60 mil? That's quite tasty and would definitely push the Sox past the Yankees. ARod, Vlad, Ortiz, Nixon, Varitek Millar, Mueller, Damon - ok, 10 seconds is up. Peoria Strat League Playoffs Well, my Schuylkill Kings finished the regular season with 100 wins and ran away with the division. Led all 12 teams in AVG and ERA, according to the league secretary. And due to a less than stellar 2002 (my first year in the league), I was awarded the league's Most Improved Team award. My reward? A showdown with the two-time defending champion Guam Islanders, led by Barry Bonds and Alex Rodriguez. My team's better, but due to some unique playoff rules, I figure I've got only about a 50/50 shot to win the best of 7 series. The league, instead of allowing a percentage (usually about 8%) of actual PAs and IPs to be used in the playoffs, allows ALL PAs and IPs to be reused. 
Now, I hate claiming ignorance as an excuse, and I'm not - it's more that I'm pissed at myself for not finding out all the rules and taking advantage of them. Guam's got full use of Keith Ginter, Johan Santana, and Mark Hendrickson for this series. His quality lefties will neutralize my right-bashing lineup. I need to get to his bullpen ASAP, and hope that my starters and fielding can hold his offense in check.

Against his lefty rotation of Hendrickson, Santana, and Rueter, my lineup will look like this (remember, 2002 stats are used):
Kearns CF
Martinez DH
Spivey 2B
Hammonds LF
Lee 1B
Guerrero RF
Williams 3B
Gonzalez C
Vizquel SS

Not quite as nice as the lineup I'll be able to use once, against Jeff Weaver:
Guerrero RF
Giles LF
Durazo DH
Kearns CF
Ortiz 1B
Lowell 3B
Alomar 2B
Larue C
Vizquel SS

My rotation stacks up as:
Lowe
Millwood
Leiter relieved by Burnett

If it goes 7, I'll be able to pitch Lowe 3 times. Here's how I see it - split games 1 and 2 (Lowe v. Hendrickson, Millwood v. Santana). Guam takes game 3 (Leiter/Burnett v. Rueter) and Schuylkill comes back in game 4 (Lowe v. Weaver). Guam takes 5 (Hendrickson v. Millwood) and 6 (Santana v. Leiter/Burnett). But if I can sneak out either 5 or 6, I'll have the game 7 advantage of Lowe v. Rueter. So, call it even. If I make it through, the World Series would present a much easier opponent.

The ARod and Manny Show
Manny's probably a slightly better hitter than ARod. Add in the fact that ARod plays SS very well and Manny plays LF average at best, and ARod's the better player. Don't know if he's $5 million better, but I doubt that $5 million is the last $5 million available to the Red Sox. What will make this deal a success or failure for the Sox is what they do with Nomar Garciaparra. If they trade him for Jarrod Washburn, bad move. If they trade him for Bobby Abreu, good move. If they trade him for a league-average 2B, and spend the rest of his salary on an above-average LF, good move.
If they trade him for Raul Ibanez (which they won't), bad move. If they convince him to play 2B next to ARod (which they won't), good move. Of all the possible deals, I'd guess more than half of them are potentially bad. That's why it's nice to know Theo Epstein's in charge. I trust him about 80% to do something productive with Nomar if they pull off the Manny for ARod switch. Because it's not just Manny for ARod that will affect the Sox; it's Manny and Nomar for ARod and ______. That blank better be damn good.
12.14.2003
CBS.SportsLine.com - Orioles land Tejada with six-year, $54M offer
It had seemed as though the Mariners were going to land Tejada, but the Orioles pulled it out. Word on the street (aka blogging community/news sources) is that the O's will also sign Pudge today. The Tejada light-cone will influence:

The Mariners - I'll refrain from mocking the new Quinton McCracken/Raul Ibanez outfield - or does that count as mocking? Regardless, the Mariners failed - again - to make a quality move, and it's looking more and more like they won't be division favorites next year. They could still make a surprise run at Vlad, but for now, I'm favoring the A's. The AL West could be rather disappointing in 2004.

The Orioles - no, they won't compete with the Yankees and Red Sox, even if they sign Pudge. The O's have some quality young position players in Chris Richard, Jerry Hairston, Jack Cust, Brian Roberts, Jay Gibbons and Luis Matos, but I don't see any of them being stars, and the chances that most, let alone all, of them turn out to be league-average aren't so hot. Then there's the pitching staff to consider. Assuming the Birds re-sign Ponson, the starting five shapes up to be Ponson, Damian Moss, Kurt Ainsworth, Jason Johnson, and Rodrigo Lopez. All have shown flashes of brilliance, but I think it's much more likely that this staff has a 5.50 ERA than a 3.50 ERA. Ponson's slightly above average, Moss will be making money off his 2002 season for many years, Johnson and Lopez are league-average at best, and Ainsworth's still figuring things out. If I had to bet, I'd bet Ainsworth has the best chance of pulling out a 3.50 ERA in 2004. And the bullpen - eh.

The Red Sox - no, the Red Sox were not involved in signing any players for the Orioles. But the O's now have an extra middle infielder. I'd love it if the Sox made a trade for Jerry Hairston. He's got a good glove, and could definitely hit .280/.370/.420 for the Sox. The AL East will be the best division in 2004.
ESPN.com - MLB - Roundup: Rays acquire Blum, Hendrickson Blum and Hendrickson are just the pieces of the puzzle the DRays were looking for. Sorry, nevermind. The interesting part about these little deals is that Geoff Blum is no longer a member of the Astros. Which means that Morgan Ensberg should play full time at third. Well, he should have played full time at third last year, but now he's extremely likely to rack up 600 PAs. I wonder if this was a case of the Astros GM thinking, "Gosh darn, Ensberg should have the starting job, but Jimy Williams keeps giving at-bats to Geoff Blum. If I could just get rid of Geoff Blum, Jimy can't make that mistake anymore." 2003 Final DIPS Numbers Update They're done. They're really interesting. They're on my other computer. My goal is to transfer all my baseball spreadsheets and website templates to my new computer sometime this week, and then post the 2003 DIPS numbers shortly afterwards. Thanks to all of you who sent me emails saying you actually use the numbers. Oakland Athletics News Out of the options that the article lists (Alfonseca, Benitez, DeJean, Jimenez, Mesa, Urbina), Benitez would definitely be my choice. Not only is he the best pitcher (Ugie's not far behind), but he'll come relatively cheaply because of his perceived psychological short-comings. If I were the A's, I'd sign Benitez to a two-year contract worth $3 million per year. Let him kick butt the first year, then trade him either during the off-season or before the trade-deadline next year. Most importantly, the A's shouldn't go out and spend a lot of money on a closer - not that I think they will. Keeping Foulke at the price he signed with the Sox would definitely have been a mistake. Any money they have to spend this year should go towards scoring runs. ESPN.com - MLB - Mets sign Cameron to three-year deal I like Mike Cameron. He'll make the Mets better. But the Mets are at the point where getting a little better shouldn't be their main goal. 
They should own up to the fact that they're bad, save some money for a couple years, reload with young guys, and then go out and sign a couple free agents to move them from slightly above average to playoff contender. All in all, Cameron's not that expensive, and he'll be fun for New York fans to watch for a few years. It could have been worse - the Mets could have blown $15 million a year on Sheffield or Vlad. That really would have put a cramp in rebuilding. I guess I'm a little disappointed because I'd hoped Cameron would be signed by a team ready to contend over the next few years, and he could push them over the edge. Cameron'd be a great fit (with a little creativity) for the Yankees, A's, Red Sox, Astros, yada yada. Here's hoping Shea doesn't hurt Cameron as much as Safeco, and Mike can show the world how good a player he really is. 12.13.2003
ESPN.com - MLB - Deal closed: Foulke chooses Red Sox over A's
Actually, posting this is really just an excuse to test out the new Google Toolbar. I installed it to block all the annoying pop-up ads, but this benefit might turn out to be just as fun. I wouldn't say that the Red Sox got a deal, but considering money doesn't seem to be a big issue, there really isn't any reliever I'd want more over the next three years. The Sox bullpen v.2004 appears to be much much stronger than last year's. I'm curious as to whether they'll use the best pitcher (probably Foulke, now) as a fireman or closer. I'd guess closer, but fireman would be way cool, and wouldn't rock the boat as much as last year's fiasco since Williamson/Kim are bona fide closers. The rumor is that if the A's didn't sign Foulke, they'd put that money towards Cameron. In my view, Cameron's a much better investment, especially considering the suckiness of the A's outfield. The Mets might have something to say about Cameron, though.
11.08.2003
Gimme Gimme Mike Cameron
Mr. Steinbrenner (or shall I talk to your puppet known as Mr. Cashman) - Just because you might have realized that Alfonso Soriano is overrated does NOT mean you should go ahead and give him away for Carlos Beltran. I've got a better idea. Keep Soriano, and sign... Mike Cameron - that's right, the Yankees need to get the guy who sucked it up in the second half in 2003, hasn't hit at all at his home ballpark, and has been a general disappointment since being traded for Ken Griffey. And these are the arguments for signing Cameron. Let me explain. Well, I'll let Mr. Gleeman explain in extreme, drawn-out, mind-numbing yet convincing detail, and I'll simply summarize. Here are the stats of Mike Cameron and Carlos Beltran over the past three years away from their home ballparks. For those not aware, Beltran plays at the extremely hitter-friendly Kauffman Stadium, while Cameron drags himself to work at the spacious Safeco Field.

Beltran away from Kauffman: .271/.337/.498
Cameron away from Safeco: .278/.364/.510

Hmm, pretty similar. Now, I'm not guaranteeing Beltran and Cameron will have similar careers from here on out. Beltran's younger, and Cameron's complete lack of competence during his home games should be considered, but here, you pick between these two choices:

CHOICE A
Mike Cameron - CF
Alfonso Soriano - 2B
Extra $$$

CHOICE B
Carlos Beltran - CF
Sammy Scrub - 2B
No extra $$$

George, here's a hint. Pick Choice A.
9.17.2003
Barry Bonds for Dummies
Ok, I'm starting to come to grips with the fact that there are tons of people out there who just don't get the value of a walk. You want RBIs. And Runs. And HRs. How pushy. Let's do something just for the fun of it. It's not really a solid analysis, but maybe those folks who don't pay attention to solid analysis will take something away from this. Let's remove walks from the Barry-Pujols debate. For every walk, we'll let each of them have another at-bat.

Barry: 364 AB + 143 BB + 9 HBP = 516 PAs
Pujols: 551 AB + 70 BB + 10 HBP = 631 PAs

So let's give Barry 516 ABs and Pujols 631 ABs and pro-rate the rest of their stats:

Barry: .118 HR/AB * 516 ABs = 61 HRs
Pujols: .076 HR/AB * 631 ABs = 48 HRs
(For every 2 homeruns Pujols hits, Barry hits 3. Wow.)

Barry: .288 R/AB * 516 ABs = 149 Runs
Pujols: .234 R/AB * 631 ABs = 148 Runs

Barry: .236 RBI/AB * 516 ABs = 121 RBIs
Pujols: .223 RBI/AB * 631 ABs = 141 RBIs

Ok, so Barry has 13 more HRs than Pujols, the same number of Runs, and 20 fewer RBIs... in 115 fewer at-bats!!! Yes, that's a point in Barry's FAVOR in this discussion. Because even if you let a scrub take those 115 at-bats, he'll still create some runs for your team.

My Peoria League Strat Team
So far I've resisted writing much about my Strat-o-Matic baseball teams. I'm not resisting any more. Through 74 games, the Schuylkill Kings in the 12 team Peoria SOM League are currently sitting in second place in their division, 2 games behind first place. The good news is that the Kings also have the second best record in the entire league, so if winning the division doesn't quite work out, I should have a good shot at the Wild Card. Since the 2003 MLB season is starting to wind down, I've started to think about my keeper list for next year. We all get to keep 16 players, more if we want to sacrifice draft picks - which generally isn't worth it.
Here are the definite keepers: Brian Giles, Vlad Guerrero, Hank Blalock, Mike Lowell, Edgar Martinez, David Ortiz, Kevin Millwood, Josh Beckett, Guillermo Mota.

Other options to round out the 16 include: Erubiel Durazo, Derrek Lee, Richard Hidalgo, Austin Kearns, Junior Spivey, Jason Larue, Eddie Guardado, Joe Borowski.

Borowski and Guardado are great, but closers with 65 IP are overrated in my opinion. Ideally, I'd like to trade one or both for a starting pitcher or draft pick. Stud hitters are way more consistent than stud pitchers, which is why my strategy is to stock up on stud hitters, keep them year to year, and scramble for pitching in the draft - both studs in the early rounds, and load up on sleepers for the following year in the later rounds. Last year, my pitching prospects included a bunch of scrubs that didn't pan out, except for Beckett, and I guess Suppan until he crashed. I just missed out on Kevin Brown, though.

Going into next year, I'll have my outfield set with Giles, Vlad, and a Kearns/Hidalgo combo in center. 3B will be absolutely stocked with Lowell and Blalock - I may try to move Lowell, however. I'd only play him against lefties since I have Blalock, and I'd much rather have a stud starter than 150 PAs from Lowell. I'd have nothing in the middle infield except Spivey to play second against lefties. 1B would be a platoon of Derrek Lee and David Ortiz. Edgar Martinez would play all games except when normal righties were on the mound - hopefully I can draft another all-hit, no-field type. Catcher's wide open - maybe Larue against righties, but that's a really weak keeper.

Pitching-wise, I'll have 200 IP from Millwood, but he's having an off year. He's probably even below average in an all-star Strat league. 125 IP from Beckett's nice, but that leaves me needing another 675 IP - 3.5 starters minimum. Mota and his 100+ IP at a 1.75 ERA are awesome.
The key to relief pitching is finding the guys with quality and quantity so you don't have to use up tons of roster spots for relief pitchers with 50 IP. I like to keep those spots for prospects. Borowski and Guardado are nice, but again, hopefully I'll find someone else who values them more than me. If I can get a 200 IP, 3.00 ERA starter for both of them, I'd jump at it. Assuming I can get that, here's what I'll be lacking going into the draft:
- a SS platoon
- a 2B against righties
- a lefty DH/1B that kills righties
- a catcher platoon
- 2.5 starters
- 400 relief IP, with one guy that has a 3 or 4 closer rating. This should be about 5-6 guys.

Out of 9 offensive positions, I have two positions without platoons - LF and RF. That leaves 9 pitching spots on the 25 man roster: 5 for starters and 4 for relievers. That's plenty. I just have to make sure I rotate my 6/7 relievers each month so they all get used up. A couple months I'll have to send down Vlad or Giles since they won't have enough PAs to play the whole season, and that'll leave me with another platoon in LF or RF, most likely cutting my starters down to 4 or my relievers down to 3. If I can fill SS, 2B, CF, or C with a full-timer, that'd be wonderful. I'll keep you posted on how things go this year, and I'm happy knowing I'll be in the running again next year.
9.15.2003
Fantasy Football Strategy
I'm not a big fantasy football fan, for a number of reasons. Football just doesn't quite do it for me like baseball does, number one. But more than that, the fantasy football format is a little weak. Head to head total points makes the game take a huge amount of luck, and the skill part doesn't involve much game theory or math (which I think some people like, but not me). I have found one application of math, though. And much of the credit should go to Bob Lung over at RotoJunkie who came up with the idea. I just came up with the proof. His theory is basically this: if you're going to win a head to head, 16 game schedule, you need to score an above average number of points for the whole year. Given that fact, you're better off with a consistent-scoring team than a team that scores a wildly fluctuating number of points. Bob calls it his "Consistent Games Theory." Go read his articles if you want applications of it. Or read on, if you want the proof...

Take two teams, A and B. A's weekly average is Ma points and B's is Mb. A's standard deviation is Sa, while B's is Sb. We need to find the probability that a random point from distribution A is greater than a random point from distribution B. One "easy" way to do this is to create a new distribution, A-B.

Mean of A-B is the difference of the means = Ma-Mb
SD of A-B is the square root of the sum of the variances = sqrt((Sa)^2+(Sb)^2)

So, in the new distribution, A beats B if A-B is greater than zero. We need to find the z-score that tells us how far the mean of A-B sits above zero. The z-score is:

(Ma-Mb)/sqrt((Sa)^2+(Sb)^2)

What does this mean? Well, assuming team A averages more points than team B, the z-score will be positive, yielding a win probability greater than .500. The greater the difference in means, the greater the z-score, which makes sense. But also notice that as Sa decreases, the denominator decreases, making the z-score larger.
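To put numbers on that z-score, here's a minimal Python sketch. It assumes each team's weekly score is normally distributed and independent of the opponent's, which is what the A-B construction leans on anyway. The function name and the sample means and standard deviations are made up for illustration.

```python
import math


def win_probability(mean_a, sd_a, mean_b, sd_b):
    """Chance that a random week from team A outscores a random week from team B.

    Assumes both teams' weekly scores are independent and normally distributed.
    """
    z = (mean_a - mean_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical teams: A averages 110 points a week, B averages 100 with an SD of 20.
steady = win_probability(110, 10, 100, 20)   # A is consistent (SD 10)
streaky = win_probability(110, 30, 100, 20)  # A is erratic (SD 30)
print(f"steady A: {steady:.3f}, streaky A: {streaky:.3f}")
```

Both versions of team A come out above .500, and the steadier one comes out higher. Note the flip side: if A were the lower-scoring team, the same formula says consistency would hurt it.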
Thus, a lower standard deviation for team A (a team that we're assuming will already be over .500) will raise its expected winning percentage. High scoring teams are good. Consistently high scoring teams are better. How much better? I've done some preliminary research, and so far the results aren't very significant. So, for now, don't worry about it. But, Bob, your theory is sound. 7.20.2003
Finding Actual Ability Level Because of Win Shares (and loss shares, don't forget) and TangoTiger's Win Probability Added, I've been thinking a lot about measuring value (past contribution towards trying to win). The cousin of value is actual ability, of which we get a glance every time a player does something on the field. The general method for determining ability (from the sabrmetric angle) is to collect a player's stats, adjust them for environment (ballpark, mostly), and regress towards the mean since actual performance is only a sample of a player's ability. Let's keep things simple, and say that for hitting, the important measures of ability are homerun rate, strikeout rate, walk rate, BABIP, XBIP. These ratios are pretty basic and taken together are fairly complete. You can calculate OBP and SLG from these, and there's not much else you want to know about a hitter than his OBP and SLG abilities. The one fault I find in the raw-stats to ability-stats conversion is the lack of consideration of opponent. Analysts smartly remove ballpark effects, but I think the pitchers and fielders that hitters face, and the hitters that pitchers face should also be considered. If Raul Mondesi and Derek Jeter have the same walk rate, most people assume they have the same ability to take walks since both play in the same home ballpark. But Mondesi's missed time to injuries - what if all those games were at a park that decreases walks - Jeter's walk rate is artificially lower relative to Mondesi's. And what if during those games the Yankees had to hit against Pedro Martinez, Nate Cornejo and other pitchers who don't walk anyone - again, Jeter's walk rate would be lower than it should be relative to Mondesi's. My point is that the translation from raw-stats to ability-stats should consider play-by-play data. For example (completely made up), Derek Jeter in 2003 has 8 PA against Pedro Martinez, 12 against Nate Cornejo, 9 against Bob Stanley, etc... 
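In code, that kind of matchup log might be weighted like this. It's a rough sketch only: the PA counts are the made-up ones from the example, while the pitcher walk rates, the league rate, Jeter's raw rate, and the simple ratio adjustment at the end are all placeholder assumptions of the sketch. Regression and park factors are left out entirely.

```python
def opponent_walk_rate(matchups):
    """PA-weighted average walk rate of the pitchers a hitter faced.

    matchups: list of (plate_appearances, pitcher_walk_rate) pairs.
    """
    total_pa = sum(pa for pa, _ in matchups)
    return sum(pa * rate for pa, rate in matchups) / total_pa


# PA counts come from the made-up example; the walk rates are placeholders.
jeter_matchups = [
    (8, 0.06),   # Pedro Martinez
    (12, 0.07),  # Nate Cornejo
    (9, 0.08),   # Bob Stanley
]
faced = opponent_walk_rate(jeter_matchups)

# Naive adjustment (an assumption of this sketch, not an established method):
# scale the raw rate by how stingy the opposing pitchers were versus the league.
league_walk_rate = 0.085  # placeholder
jeter_raw_rate = 0.090    # placeholder
adjusted = jeter_raw_rate * league_walk_rate / faced
print(f"walk rate of pitchers faced: {faced:.4f}, adjusted rate: {adjusted:.4f}")
```

The pitcher rates fed in here are themselves raw numbers that depend on the hitters those pitchers faced, so a fuller version would have to loop until the hitter and pitcher estimates stop changing.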
Weighting these pitchers' walk rates by the number of times they faced Jeter, we could come up with the average walk rate of pitchers facing Derek Jeter. Then, since we know Jeter's raw walk rate, we could calculate his theoretical actual walk rate (regressing appropriately and considering park factors, naturally). Currently, analysts just assume that opposition evens out over the long run. It doesn't, especially when getting down to the smaller sample sizes of relief pitchers. Of course, in order to compute Derek Jeter's actual ability walk rate, you need to know Pedro's and Nate's, which requires you to know the walk rates of everyone they faced. The whole thing is a big dependent web of hitters v. pitchers v. hitters v. pitchers... However, I'm sure there's a way to calculate these things while minimizing error. I'll try a sample calculation and report it. Let me know if anyone has any ideas.
6.28.2003
NETruns I've been thinking a lot lately about the correct theoretical concepts behind win shares. Here's the state of the union: Claim points for hitting win shares are runs created above replacement. I call them extra runs created. ExtraRC = (RC/out - .5*lgRC/out) * outs. Claim points for hitting loss shares are runs created below ideal. I call them lacking runs created. LackRC = (1.5*lgRC/out - RC/out) * outs. Claim points for hitting game shares are expected runs produced. That is, the number of runs expected to be produced given a player's outs and a league-average RC/out rate. I call them total runs created. TotRC = lgRC/out * outs. Note that extraRC + lackRC = totRC. This is good. This means that win shares + loss shares = game shares. What's the best metric to use to value players? It should reward players for extraRC and punish them for lackRC. So what better calculation than simply win shares - loss shares, or extraRC - lackRC. I call these net runs created. NETruns = (RC/out - lgRC/out) * outs * 2. That 2 pops into the equation when you run through the algebra of subtracting lackRC from extraRC. It doesn't make a difference when comparing players, just changes the scale a little bit. The issue of leaving the 2 in the equation versus taking it out is comparable to a team being 10 games over .500, but only being 5 games ahead of a .500 team. Both are correct; they just present the information in slightly different ways. So, we've got our metric of NETShares. It rewards players for quantity and quality compared to average. Let's see some 2003 numbers through June 26: Top 10 Overall:
Bottom 10 Overall:
Top 10 Firstbasemen
Top 10 Secondbasemen
(continued in next post... stupid blogger errors) Top 10 Shortstops
Top 10 Thirdbasemen
Top 20 Outfielders and Designated Hitters
Top 10 Catchers
For the full list of players and their NETruns, click the link on the sidebar. 6.27.2003
Beating the Ratio Conversion Horse
Motivated by my recent interest in win shares, loss shares, and game shares, I had an idea concerning the conversion of ratio stats into counting stats when valuing fantasy baseball players. Traditionally, a baseline ratio is chosen, and the counting stat becomes how many hits/runs/whatever a player is better than that baseline would be, given the same playing time opportunity. For example, if Derek Jeter hits .280 in 600 ABs, he has 12 extra hits compared to a .260 hitter with 600 ABs: (.280-.260)*600 = 12. There's nothing magical about the baseline - it could be "anything." Many people choose the anticipated AVG of the last place roto team. Thus, extra hits becomes a measure of how many hits each player is helping your team do better than last place. I think this is because players are often compared to replacement level during baseball analysis. But since the next step in the valuation process is to find the replacement level number of extra hits, using the replacement level AVG to find extra hits is not a requirement. It may turn out to be the best, but it's not necessarily the best just by definition.

Win shares compares a player to replacement level. Loss shares compares a player to ideal level. Both are needed to get the full picture of a player's performance. So why not do the same for roto values? Let's compare a batter's AVG to both a replacement level (anticipated last place AVG) and an ideal level (anticipated first place AVG). This would measure how much a player helps pull you up from the dregs of last place, but also how much he's preventing your team from finishing in first. Here's an example:

Derek Jeter AVG: .280 in 600 ABs
Erubiel Durazo AVG: .270 in 450 ABs
last place AVG: .260
first place AVG: .290

DJ netHits = (.280-.260)*600 - (.290-.280)*600 = 12-6 = 6
Ruby netHits = (.270-.260)*450 - (.290-.270)*450 = 4.5-9 = -4.5

What does this number really mean?
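Before the algebra, here's the same arithmetic as a small Python sketch, in case you'd rather check it by machine. The function names are mine; the averages, at-bats, and baselines are the ones from the example. The second function is the midpoint shortcut that the algebra works toward, included so the two can be compared.

```python
def net_hits(avg, ab, rep_avg, ideal_avg):
    """Hits above the last-place baseline minus hits short of the first-place one."""
    extra_hits = (avg - rep_avg) * ab
    lacking_hits = (ideal_avg - avg) * ab
    return extra_hits - lacking_hits


def net_hits_from_mean(avg, ab, rep_avg, ideal_avg):
    """Same quantity, written as twice the extra hits over the midpoint baseline."""
    mean_avg = (rep_avg + ideal_avg) / 2
    return 2 * (avg - mean_avg) * ab


REP_AVG, IDEAL_AVG = 0.260, 0.290  # anticipated last place and first place AVG

for name, avg, ab in [("Jeter", 0.280, 600), ("Durazo", 0.270, 450)]:
    direct = net_hits(avg, ab, REP_AVG, IDEAL_AVG)
    shortcut = net_hits_from_mean(avg, ab, REP_AVG, IDEAL_AVG)
    print(f"{name}: netHits = {direct:.1f} (midpoint form: {shortcut:.1f})")
```

The two forms agree for both players, up to floating-point noise.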
Well, let's do some algebra (oooooh):

netHits = (plAVG-repAVG)*AB-(idealAVG-plAVG)*AB
netHits = [(plAVG-repAVG)-(idealAVG-plAVG)]*AB
netHits = [2*plAVG-(repAVG+idealAVG)]*AB
netHits = [2*plAVG-2*(meanAVG)]*AB, where meanAVG is the mean of repAVG and idealAVG
netHits = 2*(plAVG-meanAVG)*AB

This is simply twice the number of extra hits when using meanAVG as the baseline. It would have to be tested, but I'm pretty sure the mean of the repAVG and idealAVG is pretty close to the mean AVG of the draftable player pool. Since multiplying counting stats by constants doesn't affect value, using this method is the same as using extra hits compared to meanAVG. This exercise is just one more thing that makes me think using meanAVG as the baseline has some merit over repAVG. I remember Todd Zola claiming repAVG is better because the empirical results turn out "better," but I wonder if that isn't more an issue of making up for faulty projections than of being theoretically correct. I guess it should be tested using year-end stats and values.
6.25.2003
Win Shares + Loss Shares = Game Shares This was originally a post at RotoJunkie. It's been modified to seem more like an article for my blog. If you want to comment, head on over to the Sabrmetrics forum. Enjoy... As a burgeoning Sabrmetrics groupie a few years ago when Win Shares hit the market, I ate it up. I thought it was the coolest ranking method. Well, almost the coolest. You see, there were things that just bugged me about Bill James' Win Shares system, mostly the (many parts) where James made decisions more subjectively than objectively. For example, those weird 40/30/20/10 weighted scales for valuing individual defense, the 52/48 argument (I think pitchers are undervalued, so I'll give 'em more points), and the fact that a team's Win Shares are directly proportional to wins, when there is a lot of variability in wins given a certain ability level. But it wasn't until I really started doing lots of my own baseball analysis that something else started to bug me. I couldn't explain it until I read a pdf document put together by TangoTiger and Rob Wood from Baseball Primer. Plain and simple... Win Shares just aren't a complete, useful metric without their counterpart, Loss Shares. Take two pitchers, Bob and Nolan. They don't seem like pitchers of equal ability, except that Bill James (via Win Shares) says they are. So, are they? Oh, you want some stats...ok, here you go: Bob: 200 IP, 3.00 ERA, 0 BB, 0 SO, 0 HR (yes, every batter puts the ball in play - I love extreme examples) The common assumption in the post-DIPS world is to say that half the credit for the results of balls in play go to the fielders and half to the pitchers, so I'll stick with that. Thus, since all of the batters Bob faced put the ball in play, Bob gets half the credit (half of all is half) for runs saved during his time on the mound: (6.75 ERA - 3.00 ERA)/9*200*.5 = 41.7 runs prevented, where 6.75 is the replacement level ERA of 1.5 times league average ERA. 
Nolan: 200 IP, 4.00 ERA, with peripheral stats such that Nolan garners 75% of the credit for runs prevented (this doesn't mean half the batters he faces put the ball in play - it only means that half the runs scored are a result of balls put in play and the other half are a result of balls not put in play)

Nolan's credit = (6.75 - 4.00)/9*200*.75 = 45ish runs prevented.

Let's assume that both Bob and Nolan pitch in front of fielding teams with the same ability. Win Shares says Nolan deserves more credit, because he prevented more runs, even though he pitched the same number of innings with an ERA a full run higher than Bob. Seriously, would you want Nolan as your pitcher, or Bob? Do you want 200 IP with a 4.00 ERA or a 3.00 ERA? Seems like there's something fishy going on...

The idea behind Win Shares is that it assigns credit (Win Shares) on the basis of responsibility. Nolan's responsible for preventing more runs than Bob, thus he receives more Win Shares. But he's also responsible for allowing more runs than Bob, which Win Shares ignores. What's needed is Loss Shares. Bill James even alludes to the fact that he thought about Loss Shares, but left them out because he couldn't figure out how to calculate them. But they're critical if you want to look at the whole picture. In order to compare Bob and Nolan, we need to know not only how many extra runs Nolan was responsible for preventing, but also how many more runs than Bob he was responsible for allowing.

I don't claim to have come up with a way to calculate Loss Shares (although I believe Tango and Rob did in their pdf file), but let's do a little calculation that's on the right track. Let's define "runs allowed^" as runs allowed above the "ideal pitcher" (the positive version of the replacement pitcher). Where the replacement pitcher has an ERA of 1.5*lgERA, the ideal pitcher has an ERA of .5*lgERA = 2.25 in our example.
(Yes, pitchers often have better ERAs than this ideal ERA, but the point will still get across - plus, pitchers can, and do, have ERAs worse than replacement level.)

Bob gets charged with allowing^ (3.00-2.25)*200/9*.5 = 8.3 runs worse than ideal.
Nolan gets charged with allowing^ (4.00-2.25)*200/9*.75 = 29.2 runs worse than ideal.

Hmmm, Bob's responsible for allowing way fewer runs worse than ideal than Nolan. Let's combine runs prevented and runs allowed^:

Bob NET = 41.7 - 8.3 = 33.4
Nolan NET = 45.8 - 29.2 = 16.6

Thus, while Win Shares gives roughly equal credit to Bob and Nolan (a bit more to Nolan, in fact), "NET Shares" would say Bob's performance was about twice as valuable as Nolan's. Why? Because while Nolan prevents more runs, he's also given more responsibility (aka opportunity) to prevent runs. And with more opportunity, Nolan's also allowing more runs, so many more that his advantage in Win Shares is more than negated.

It's like saying the Braves are better than the Tigers because they won 94 games to the Tigers' 68, without knowing how many games both teams played. If both played 162 games, the Braves are better, but if the Braves played 200 games and the Tigers played 100, aren't the Tigers more impressive? The same idea holds for Win Shares. Nolan uses a bigger chunk of the defensive opportunities (Game Shares) than Bob and thus should be held accountable for it.

So in addition to Win Shares, we need Loss Shares. Together the two imply an all-encompassing stat - Game Shares. Because Nolan's more responsible than Bob for runs while he pitches, Nolan has more Game Shares.

6.23.2003
DIPS Numbers Through June 15

I've finally figured out how to create and post html at an actual website, so now my "daily" DIPS numbers will be available for all to see. Here's a quick rundown of what each number is:

ERA: tried and true Earned Run Average
XERA: Extrapolated Earned Run Average - what you'd expect a pitcher's ERA to be based on his actual component stats (e.g. singles, walks, strikeouts) and league average ER/R rate
$ERA: expected ERA using a pitcher's unadjusted $BB, $SO, and $HR rates combined with his team's average $H, $2B, and $3B rates
dERA: expected ERA using a pitcher's adjusted (to a neutral park) $BB, $SO, and $HR rates combined with MLB-average $H, $2B, and $3B rates. This is the "DIPS ERA."
rdERA: same thing as dERA, but with a pitcher's $BB, $SO, and $HR rates regressed "appropriately" towards the mean. Currently, appropriately means about .3 for $BB, .2 for $SO, and .5 for $HR. Only the current season is considered. In the future, I hope to work in a regression rate that's a function of BFP for the current season. And, in the grand scheme of things, the goal is also to incorporate past season performance with current performance to get a good predictor of future performance.

$H=(H-HR)/(AB-SO-HR)
$BB=BB/PA
$SO=SO/AB
$HR=HR/(AB-SO)

6.20.2003
Bret Boone Versus Alfonso Soriano 2003
Ok, so make me an argument that says Soriano should start the All-Star game over Boone. The only possible things I see in favor of Soriano are these:
- Soriano's had more plate appearances (about 35 it looks like)
- Soriano's stealing many more bases at the same success rate

My rebuttal (after laughing profusely) is thus:
- 35 PA isn't very many and, compared to the quality difference, is insignificant
- The stolen base difference equals about 3 runs according to linear weights - not a big deal.

Points in favor of Boone, put simply:
- .040 OBP advantage
- .090 SLG advantage (for a .130 difference in OPS for those scoring at home)
- Half the time Boone hits at Safeco
- Boone plays kick-ass defense, whereas Soriano merely holds his own.

Let's do a quick runs created analysis (OBP*SLG*AB):
Soriano: .341*.513*338 = 59
Boone: .381*.606*304 = 70

Definitely an advantage for Boone, but here's the kicker - consider outs (AB-H+CS) and RC/27 outs:
Soriano: 229 outs yields 7.0 runs/game
Boone: 189 outs yields 10 runs/game

Ok, it's an extremely rough analysis, but much more accurate than the general argument of "Soriano's such a great athlete and can hit any pitch." Sure, he can probably do athletic things that Boone (or most others) can't, but that's not the issue - the issue is which player is doing more to help his team win. And it should be obvious that it's Boone.

5.23.2003
My 2003 All-Playing Better than Expected and Deserve to be Mentioned Team:
C Jason LaRue
1B Travis Lee
2B Jerry Hairston Jr.
SS Rey Ordonez
3B Bill Mueller
OF Rocco Baldelli (better than any Sabrmetrician would have predicted)
OF Geoff Jenkins
OF Jose Guillen
P Esteban Loaiza (ERA is for real - let's see if his skills stay this good all year, though)

My 2003 All-Fantasy Value Much Higher than Actual Value Team:
C Ivan Rodriguez
1B Todd Helton (b/c Coors inflation) or Derrek Lee (lots of SB)
2B Alfonso Soriano (Luis Castillo honorable mention)
SS Desi Relaford
3B Mike Lowell
OF Vernon Wells
OF David Roberts
OF Juan Pierre
P Barry Zito

My 2003 All-Finally Fulfilling Potential Team:
C Jason Kendall
1B Nick Johnson
2B Marcus Giles
SS Edgar Renteria
3B Hank Blalock
OF Raul Mondesi
OF Milton Bradley
OF Jose Cruz Jr.
P Javier Vazquez

Just picked up my official 2003 All-Star ballot (ok, ballots) from Subway this evening. Went through and discussed my traditional All-Star selections with a friend, but then also had some fun thinking up other "All-____" teams, such as the All-Surprise Team and the All-Overrated Team. Here are a few (with comments) for your viewing pleasure:

All-Star Team (which I define as the overall best player so far during the 2003 season)

AL:
C Jorge Posada
1B Carlos Delgado (no one else close)
2B Bret Boone (better defense and SLG% than Soriano, and hits in a more pitcher-friendly home park)
SS Alex Rodriguez
3B Hank Blalock (only a couple others are having better than decent years)
OF Raul Mondesi (something clicked - knowing why would earn you millions)
OF Carl Everett (listed as DH, but, well, he plays OF)
OF Aubrey Huff (listed as 3B, but he plays OF)
DH Edgar Martinez (yup, age 42 is a rebound year)
P Mike Mussina

NL:
C Mike Piazza (best OPS by a large margin - he's lucky I'm not voting a month from now, though)
1B Richie Sexson (Helton's better numbers aren't better when considering Coors - what's up with Thome?)
2B Marcus Giles (it's not just hype: AVG: .345, OBP: .400, SLG: .580, and good defense - and the 17:6 2B:HR ratio says more homers are coming...)
SS Alex Gonzalez (slugging over .600 - saw that one coming...)
3B Scott Rolen (but if Vinny Castilla keeps up last week's pace, look out...)
OF Gary Sheffield
OF Barry Bonds
OF Austin Kearns
P Mark Prior

Overall observations: 3B is extremely weak in both leagues. The NL goes deeper in quality SS. The AL outfield has caught some sort of suck disease. Who would have picked Mondesi, Everett, and Huff as their top three OF so far?

Honorable mention pitchers: Curt Schilling (look for a crazy-good rest of year), Pedro Martinez, Roger Clemens, Kevin Millwood, Jason Schmidt, Kevin Brown, Odalis Perez, Javier Vazquez.

First of all, I'd like to correct myself. I made a big error in my last post, at least theoretically. After determining the difference between "typical" league-best and league-worst pitching and fielding, and warning that these numbers didn't necessarily correlate to how credit should be divided, I went ahead and assumed that defense and offense deserve to split credit 50/50, and from that concluded how much difference there was in all three phases of the game between league-worst and league-best. Now, that 50/50 split is probably accurate for dividing up credit, but since that's not what I was doing, it's theoretically wrong.

What I should have done (and will do now) is find the difference between "typical" league-worst and league-best hitting, as measured by runs scored. Again, I'll take the range from the third worst to third best runs scored totals. And yes, this is chock full of park effects, unbalanced schedules, and you-don't-have-to-hit-against-your-own-pitchers issues. I'm fine with a rough estimate, though. Last year's range: 856 (CHW) - 641 (PIT) = 215 runs.
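To put that season-long range on the same per-game footing as the pitching and fielding figures, divide by the number of games. Here's a minimal sketch, assuming a 162-game schedule; the function and constant names are made up for illustration.

```python
GAMES_PER_SEASON = 162  # assumption: standard MLB schedule length


def range_per_game(best_runs, worst_runs, games=GAMES_PER_SEASON):
    """Spread between two season run totals, expressed in runs per game."""
    return (best_runs - worst_runs) / games


# Third-best (CHW, 856) versus third-worst (PIT, 641) runs scored last year.
offense_range = range_per_game(856, 641)
print(round(offense_range, 1))  # 1.3
```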
So here's our chart:

Offense: 1.3 runs/game
Pitching: 1.1 runs/game
Fielding: .4 runs/game

That's interesting - pitching and fielding combined give a range basically equal to the offense range, which shows a pretty even split between offense and defense. But, similar to how gravitational mass doesn't HAVE to equal inertial mass, even if offense is 50% of the game and defense is 50% of the game, these numbers didn't have to come out like they did. And just because they came out this way doesn't mean hitting is 50% of the game. (And heck, the error bars on these calculations are huge.)

5.16.2003
Ok, the quick answer to the question posed in the last entry is .55 runs. The difference between the third best defense-neutral pitching staff (Florida) and the third worst (San Diego) is 1.1 runs. Assuming the average team is in the middle of the two (it's not), that gives a .55 run difference between an average pitching staff and the best/worst. Of course, this doesn't deal with park effects, either, but it shouldn't be TOO far off.

Thus, if you had a choice between having average fielders and a kick-ass pitching staff or kick-ass fielders and an average pitching staff, you'd choose the pitchers, but not by as much as I had thought. About 58% of the difference between teams' defenses should be attributed to pitching (.55 out of .55 + .4), and 42% to fielding. Assuming offense is 50% and defense is 50% of baseball, that means position players as a group show more variability than pitching staffs as a group (71% of the difference is attributable to position players: the 50% from hitting plus the 21% from fielding).

Of course, it's not often the best hitters are also the best fielders. And, of course, credit distribution can change when you take into account the fact that Curt Schilling does a lot more on his own than Kirk Rueter does. A crappy defensive team with Schilling on the mound has a good chance of winning (with the credit mostly going to Curt Schilling), while the same team with Rueter on the mound has a good chance of losing (with the blame mostly going to the fielders). And there's definitely a distinction between distribution of credit and differences between the best and worst fielding and pitching teams.

I've been playing around with a spreadsheet to compute expected ERAs based on DIPS, and wondered exactly how much effect fielding actually has. Taking a league average pitching staff ($BB=.090, $SO=.182, $HR=.037), I computed Xtrapolated ERA for a range of $H values around the MLB mean of .287.
Here are the results:

[Table: $H vs. XERA]

So it looks like the difference between league average fielding and typical best/worst fielding is .4 runs per game. Over an entire season, that's a difference of 65 runs. Definitely not chump change. The difference between the best and worst fielding teams is about 130 runs. Maybe there is something to defense winning championships.

The next step is to see how much of an effect a good pitching staff has over a league average pitching staff. My guess is .9 runs per game, thus making defense, on average, 70% pitching and 30% fielding.

5.15.2003
Some quick DIPS trivia:

Who's played the best defense so far in baseball, in terms of turning balls in play into outs? Oakland, by a landslide. Their $H is .239. The number two team, Minnesota, is at .261. I guess Chris Singleton actually is helping the team in some way.

Which league has the lower $H? Neither - both the AL and NL are turning 71.27% of balls in play into outs. Considering the AL has to deal with the DH, I find that interesting. I'll have to take a look at the past few years' values.

Who are the bottom five teams at turning balls in play into outs?
NY Yankees - but Jeter's back to save the day...
Florida - fast doesn't equal range, evidently...
Colorado - I'll forgive them, though...
Milwaukee - at least they can pitch and hit; oh wait...
Texas - and for Exhibit A in favor of park effects we have Chan Ho Park...

5.14.2003
Back to baseball, like I promised...

People are slowly catching on to the whole DIPS thing. Even mainstream columns are using BB rates, SO rates, and HR rates to analyze pitchers. Two bad things, though - they're also using H rates, which are often meaningless, and they're using 9 IP as the denominator in the rates (BB/9, SO/9, HR/9). I want to rant about the second problem right now.

Innings Pitched is not the best measure of a pitcher's playing time - batters facing pitcher (what we call plate appearances for hitters) is the best. IP is dependent on how many of the batters a pitcher faces get out, which is dependent on the pitcher AND the defense AND the ballpark. If a pitcher has a lousy defense behind him, or an extraordinary number of hits drop in just by chance, the IP total for that outing will be lower than it "should" be.

If Bob the Pitcher faces 24 batters, strikes out 8, walks 0, gives up 0 homers, induces 0 GIDP and has 8 of those other 16 batters get out, he'll have pitched 5 1/3 innings and struck out 8 for a ratio of 13.5 K/9. If, however, he's a little luckier and retires 12 of the 16 other batters, he'll pitch 6 2/3 innings and have a ratio of 10.8 K/9. If you buy into DIPS, in which game did the pitcher pitch better? Neither - the differences are a result of defense and luck. So why would we want to represent his performance with the stat K/9 when it can fluctuate for no good reason? I sure wouldn't.

The worse the defense playing behind a pitcher, and the unluckier a pitcher is with hits falling in, the higher his BB/9, K/9, and HR/9 stats will appear, artificially improving K/9, and artificially worsening both BB/9 and HR/9. Instead, everyone should be using the raw DIPS ratios: $BB, $SO, and $HR, which are NOT dependent upon defense and luck.

$BB = BB/BFP
$SO = SO/(BFP-BB-HBP)
$HR = HR/(BFP-BB-HBP-SO)

Sure, they take a little getting used to, but they're well worth it. And you should be glad I didn't get started on H/9...

5.13.2003
In addition to being an amateur baseball analyst, I'm also a mathematician and extremely amateur philosopher. So here's a taste of something random: How many holes does a pair of underwear (ignoring tears and the fly) have? Don't worry, I'll skip the part of the story that explains how the question came up. My friends mostly claim three - at least they did until they started refusing to answer - but I claim two. I'm a math guy, I've taken topology, and I'm darn sure undies are a two-holed torus.

So I was feeling all smart until my friend Amanda, who's more of a writer than a math person, chimed in with, "Underwear out of the package has no holes. It needs the leg and waist openings in order to be called underwear. Would you buy a package of boxers without the 'holes'?" Ok, good point. Moral of the story - underwear has zero holes or two holes, but definitely not three holes.

Back to baseball next time...

5.08.2003
Hey, a spot on the web for my baseball ideas. I think that's a good thing. I'll start off with a small pat on my own back, and then try really hard not to do it again soon. I TOLD you the Atlanta Braves weren't just going to give way to the Phillies. Sure, their pitching is merely average, but their offense is going to carry them for the first time in a while. Chipper, Andruw, Sheffield, Fick, Giles, and Furcal are a force to be reckoned with. Add in an always solid bullpen and you've got yourself a contender.