Back in January, I did a two-part series that took the Bill James and PECOTA projections for the likely Sox and Yank starters, plugged those lineups into a run calculator and determined via the Pythagorean theorem how James and BP thought the Sox and Yanks would do.
Time for a check-up.
Boston Red Sox
We'll start with the nuts and bolts and work our way outward. Here's a rundown of the Sox' lineup with the James, PECOTA and actual 2007 numbers:
Kevin Youkilis
BJ: .283/.395/.433 14 HR, 165 H, 101 R
BP: .271/.376/.456 18 HR, 142 H, 90 R
Act: .288/.390/.453 16 HR, 152 H, 85 R
Dustin Pedroia
BJ: .284/.355/.418 10 HR, 72 RBI, 47 2B, 67 BB, 43 K
BP: .294/.360/.431 9 HR, 60 RBI, 36 2B, 47 BB, 39 K
Act: .317/.380/.442 8 HR, 50 RBI, 39 2B, 47 BB, 42 K
David Ortiz
BJ: .285/.391/.592 47 HR, 138 RBI, 42 2B, 103 BB
BP: .289/.406/.577 41 HR, 130 RBI, 37 2B, 109 BB
Act: .332/.445/.621 35 HR, 117 RBI, 52 2B, 111 BB
Manny Ramirez
BJ: .305/.414/.590 37 HR, 118 RBI, 33 2B, 90 BB
BP: .297/.400/.567 33 HR, 105 RBI, 32 2B, 84 BB
Act: .296/.388/.493 20 HR, 88 RBI, 33 2B, 71 BB
Mike Lowell
BJ: .273/.341/.452 18 HR, 77 RBI, 36 2B
BP: .273/.333/.441 15 HR, 74 RBI, 33 2B
Act: .324/.378/.501 21 HR, 120 RBI, 37 2B
J.D. Drew
BJ: .283/.398/.493 24 HR, 82 RBI, 27 2B
BP: .285/.392/.476 15 HR, 61 RBI, 27 2B
Act: .270/.373/.423 11 HR, 64 RBI, 30 2B
Jason Varitek
BJ: .259/.343/.434 17 HR, 69 RBI
BP: .274/.357/.453 14 HR, 55 RBI
Act: .255/.367/.421 17 HR, 68 RBI
Coco Crisp
BJ: .284/.337/.419 11 HR, 54 RBI, 30 2B, 23 SB
BP: .310/.361/.452 13 HR, 63 RBI, 27 2B, 21 SB
Act: .268/.330/.382 6 HR, 60 RBI, 28 2B, 28 SB
Julio Lugo
BJ: .277/.343/.399 11 HR, 156 H, 85 R, 26 SB
BP: .284/.347/.406 8 HR, 134 H, 74 R, 19 SB
Act: .237/.294/.349 8 HR, 135 H, 71 R, 33 SB
On the one hand, give the projectors credit. They did a heck of a job, but I think we can all agree the individual components of the Red Sox lineup did about the exact opposite of what we thought they would. The players we thought might struggle -- Dustin Pedroia, Mike Lowell -- excelled. The ones we thought would provide some offensive spark -- Julio Lugo, J.D. Drew, Manny Ramirez -- struggled mightily. The projections for Kevin Youkilis and Jason Varitek were very close to their year-end stats, and David Ortiz just had an odd season, underperforming his usual power numbers while still having a career year in many categories.
The Bill James lineup projections equaled about 5.93 runs per game, while PECOTA's equaled 5.95. Based on their 2007 stats alone, the Sox should have scored 5.81 runs per game (note that all the lowest-scoring permutations of that lineup have Lugo at the top, where he batted for 82 games). The Sox actually scored 5.35 runs a game. At 5.93-5.95 runs per game, the Sox would have been basically tied with the Yankees for best offense in baseball. Instead, they sat a distant third behind New York and Detroit. The Sox scored nearly 100 runs fewer than either James or BP expected. That's about 10 wins off. Oops.
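For anyone who wants to check that arithmetic, here is a rough sketch in Python. The 162-game schedule is standard, but the 10-runs-per-win conversion is only a rule of thumb, and the function names are made up for illustration.

```python
GAMES = 162
RUNS_PER_WIN = 10.0  # assumption: a common rule of thumb, not an exact constant


def season_runs(runs_per_game, games=GAMES):
    """Scale a runs-per-game rate up to a full-season run total."""
    return runs_per_game * games


def win_gap(projected_rpg, actual_rpg):
    """Approximate wins lost (or gained) from a runs-per-game shortfall."""
    run_gap = season_runs(projected_rpg) - season_runs(actual_rpg)
    return run_gap / RUNS_PER_WIN


actual_rpg = 5.35  # what the Sox actually scored per game
for label, rpg in [("Bill James", 5.93), ("PECOTA", 5.95)]:
    run_gap = season_runs(rpg) - season_runs(actual_rpg)
    print(f"{label}: {season_runs(rpg):.0f} projected runs, "
          f"{run_gap:.0f} more than actual, "
          f"about {win_gap(rpg, actual_rpg):.1f} wins")
```

By this reckoning the shortfall comes out a little under 100 runs for either system, which is where the "about 10 wins" figure comes from.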
Now, the pitchers:
Josh Beckett
BJ: 13-10, 3.68 ERA, 2.55 K/BB
BP: 11-10, 4.47 ERA, 2.27 K/BB
Act: 20-7, 3.27 ERA, 4.85 K/BB
Curt Schilling
BJ: 12-8, 3.50 ERA, 5.9 K/BB
BP: 13-9, 4.02 ERA, 4.24 K/BB
Act: 9-8, 3.87 ERA, 4.39 K/BB
Daisuke Matsuzaka
Composite of various systems: 15-10, 3.55 ERA
BP: 12-9, 4.01 ERA, 3.18 K/BB
Act: 15-12, 4.40 ERA, 2.51 K/BB
Tim Wakefield
BJ: 8-8, 4.14 ERA, 1.96 K/BB
BP: 9-9, 4.77 ERA, 1.69 K/BB
Act: 17-12, 4.76 ERA, 1.72 K/BB
Jon Lester
BJ: 4-5, 4.38 ERA, 1.80 K/BB
BP: 6-7, 5.20 ERA, 1.64 K/BB
Act: 4-0, 4.57 ERA, 1.61 K/BB
Julian Tavarez
BJ: 62 G, 4.56 ERA, 1.39 K/BB
BP: 43 G, 5.05 ERA, 1.22 K/BB
Act: 34 G (23 GS), 5.15 ERA, 1.51 K/BB
Mike Timlin
BJ: 72 G, 3.86 ERA, 2.56 K/BB
BP: 36 G, 4.24 ERA, 2.30 K/BB
Act: 50 G, 3.42 ERA, 2.21 K/BB
Manny Delcarmen
BJ: 53 G, 3.88 ERA, 2.19 K/BB
BP: 47 G, 4.29 ERA, 2.23 K/BB
Act: 44 G, 2.05 ERA, 2.93 K/BB
Jonathan Papelbon
BJ: 14-6, 2.98 ERA, 3.77 K/BB
BP: 9-7, 4.28 ERA, 2.82 K/BB
Act: 1-3 (59 G), 1.85 ERA, 5.6 K/BB
Of course, when the Bill James and Baseball Prospectus handbooks were published, Papelbon was still telling everyone he really, really wanted to start. We all know how that ended up. Likewise, no one really expected Julian Tavarez to start as much as he did. Curt Schilling and Tim Wakefield wound up right in line with their projections. Daisuke Matsuzaka struggled in his transition to American baseball more than was expected, while Josh Beckett blew everyone away. In the bullpen, Mike Timlin and Manny Delcarmen were both tremendous assets, and neither book even thought to project Hideki Okajima's stats for '07, so lightly regarded was he.
As I noted in January, the BIS projections for pitchers in the James handbook were ridiculously optimistic, to the point of unusability. They called for the Sox pitchers to give up 587 runs, a tremendously low number. PECOTA projected 710 runs. The Sox staff actually gave up 657 -- 53 runs (about five wins) better than the BP projection.
When I first did this, I made the following calculation, which got a fair bit of coverage on the interwebs:
Using James' Pythagorean theorem and PECOTA's seemingly more realistic projections, 964 runs scored and 710 runs allowed would project roughly (I used the power of two instead of the power of 1.83) to a 105-57 record. Um, wow.
Seeing that the Sox actually were 10 wins worse on offense and five better on "defense," the team should have finished about 100-62. Still not shabby. Not coincidentally, that happens to be almost exactly what the Sox' final Pythag record was this year (101-61). A poor 21-28 record in one-run games explains the difference between that and their real record of 96-66.
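The Pythagorean math behind those records can be sketched in a few lines of Python. This is a minimal version of the usual formula (runs scored to a power, divided by the sum of runs scored and runs allowed to that same power). The function name and its defaults are my own choices, and the actual runs-scored total is derived from the 5.35 runs per game cited earlier rather than looked up.

```python
GAMES = 162


def pythag_wins(runs_scored, runs_allowed, exponent=2.0, games=GAMES):
    """Expected wins from the Pythagorean formula: RS^x / (RS^x + RA^x)."""
    rs_pow = runs_scored ** exponent
    ra_pow = runs_allowed ** exponent
    return games * rs_pow / (rs_pow + ra_pow)


# PECOTA-based projection for the Sox: 964 runs scored, 710 allowed.
print(round(pythag_wins(964, 710)))                 # power of two
print(round(pythag_wins(964, 710, exponent=1.83)))  # power of 1.83

# Actual 2007 Sox: 5.35 runs per game over 162 games, 657 runs allowed.
actual_runs_scored = 5.35 * GAMES
print(round(pythag_wins(actual_runs_scored, 657, exponent=1.83)))
```

The power of two gives the 105 wins quoted above, while 1.83 trims the same projection by a couple of wins. Run on the actual totals with 1.83, it lands on the 101-61 Pythag record.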
So, in all, the projections were off by four or five wins. That's not bad on the surface. Of course, they got there by drastically overprojecting the offense and underprojecting the pitching, and even more specifically by missing the mark widely in either direction on six or seven members of the starting lineup and two or three of the most heavily used pitchers. It all evens out, I guess. After all, would we do any better?
New York Yankees
As I don't know the ins and outs of the Yankee lineup and rotation as well, please bear with me.
I only used Baseball Prospectus' PECOTA projections for the Yankees study because, in the Red Sox' case, the James projections proved to be almost identical to PECOTA's on offense and terribly off on pitching.
Johnny Damon
BP: .289/.362/.458 18 HR, 158 H, 94 R, 14 SB
Act: .270/.351/.396 12 HR, 144 H, 93 R, 27 SB
Derek Jeter
BP: .322/.390/.452 12 HR, 71 RBI, 189 H, 23 SB
Act: .322/.388/.452 12 HR, 73 RBI, 206 H, 15 SB
Bobby Abreu
BP: .277/.389/.447 16 HR, 65 RBI, 28 2B
Act: .283/.369/.445 16 HR, 101 RBI, 40 2B
Alex Rodriguez
BP: .288/.385/.531 34 HR, 104 RBI, 30 2B, 14 SB
Act: .314/.422/.645 54 HR, 156 RBI, 31 2B, 24 SB
Jason Giambi
BP: .252/.413/.518 29 HR, 81 RBI, 18 2B
Act: .236/.356/.433 14 HR, 39 RBI, 8 2B
Hideki Matsui
BP: .288/.376/.474 15 HR, 60 RBI, 21 2B
Act: .285/.367/.488 25 HR, 103 RBI, 4 SB
Jorge Posada
BP: .259/.365/.433 17 HR, 65 RBI
Act: .338/.426/.543 20 HR, 90 RBI
Robinson Cano
BP: .308/.345/.472 16 HR, 80 RBI, 36 2B
Act: .306/.353/.488 19 HR, 97 RBI, 41 2B
Doug Mientkiewicz
BP: .251/.328/.382 5 HR, 30 RBI
Act: .277/.349/.440 5 HR, 24 RBI
No, those Jeter and Abreu comparisons are not misprints. PECOTA really was that accurate, particularly on Jeter's line, off by just two on-base points and two RBI among the main stats. Impressive. BP actually did a much better job here overall than with Boston's lineup. No one projects the kind of years A-Rod and Posada had, while only Damon and Giambi really struggled. As we all know, the Yankee lineup needed about three months to get going, but once it did, it didn't look back. As you can tell, guys who looked like they couldn't hit a Wiffle ball in June (Matsui, Cano, Abreu) wound up with great-looking numbers.
The thing is, these PECOTA projections represented fairly significant drops for many of these players -- particularly Matsui and Abreu. So BP clearly was on to something. The projections said the Yankees should have scored 957 runs, 5.91 runs per game, in 2007. Based on the stats they produced, New York should have scored 6.1 runs per game, or 988 runs. Instead, the team actually split the difference -- 5.98 RPG, making it the only team to top 900 runs (968). Overall, good work from PECOTA on the Yankee projections, off by just one win, and good work from the Yankee offense itself, obviously, scoring the most runs in a season since the 2000 White Sox.
Now the pitching.
Chien-Ming Wang
BP: 11-9, 4.35 ERA, 1.42 WHIP, 48 BB, 74 Ks
Act: 19-7, 3.70 ERA, 1.29 WHIP, 59 BB, 104 Ks
Andy Pettitte
BP: 12-9, 4.21 ERA, 1.34 WHIP, 51 BB, 131 Ks
Act: 15-9, 4.05 ERA, 1.43 WHIP, 69 BB, 141 Ks
Mike Mussina
BP: 12-9, 4.27 ERA, 1.31 WHIP, 47 BB, 146 Ks
Act: 11-10, 5.15 ERA, 1.47 WHIP, 35 BB, 91 Ks
Roger Clemens
BP*: 11-6, 3.02 ERA, 1.15 WHIP, 50 BB, 140 Ks
Act: 6-6, 4.18 ERA, 1.31 WHIP, 31 BB, 68 Ks
Philip Hughes
BP: 10-7, 3.70 ERA, 1.25 WHIP, 32 BB, 138 Ks
Act: 5-3, 4.46 ERA, 1.28 WHIP, 29 BB, 58 Ks
Kei Igawa
BP: 11-9, 4.46 ERA, 1.40 WHIP, 59 BB, 128 Ks
Act: 2-3, 6.25 ERA, 1.67 WHIP, 15 BB, 37 Ks
Luis Vizcaino
BP: 3-3, 3.89 ERA, 1.40 WHIP, 23 BB, 49 Ks
Act: 8-2, 4.30 ERA, 1.45 WHIP, 43 BB, 62 Ks
Brian Bruney
BP: 2-2, 5.01 ERA, 1.62 WHIP, 15 BB, 25 Ks
Act: 3-2, 4.68 ERA, 1.62 WHIP, 37 BB, 39 Ks
Ron Villone
BP: 3-2, 4.90 ERA, 1.53 WHIP, 27 BB, 45 Ks
Act: 0-0, 4.25 ERA, 1.28 WHIP, 18 BB, 25 Ks
Kyle Farnsworth
BP: 3-3, 3.85 ERA, 1.34 WHIP, 5 Sv, 27 BB, 44 Ks
Act: 2-1, 4.80 ERA, 1.45 WHIP, 0 Sv, 27 BB, 48 Ks
Mariano Rivera
BP: 4-4, 2.78 ERA, 1.18 WHIP, 27 Sv, 13 BB, 47 Ks
Act: 3-4, 3.15 ERA, 1.12 WHIP, 30 Sv, 12 BB, 74 Ks
*Clemens' projection with Houston Astros
Needless to say, the Yankees' pitching staff, aside from Wang and Pettitte, was disappointing in 2007. Mussina tanked, Clemens was average, Hughes was injured, Igawa was a flop, and never mind that in January, Carl Pavano was a part of the rotation. Projected to give up 738 runs in '07, the Yanks instead gave up 777 (although just 724 were earned). That's a four-loss difference.
Using their projected run differential as a basis, I again used the Pythag theorem to figure the Yankee projected record as 102-60 -- three games behind Boston. With their stats being three games worse than the projections (one win better with the bats, four losses worse with the arms), the Yankees should have finished 99-63, which is again very close to their Pythag record of 97-65. New York actually finished 94-68, also a victim of poor play in one-run games (18-21).
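Here is a rough Python sketch of that bookkeeping for the Yankees: start from the projected Pythag record, then adjust for how far the actual runs scored and allowed landed from the projections. As before, the 10-runs-per-win conversion is a rule-of-thumb assumption and the function names are illustrative only.

```python
GAMES = 162
RUNS_PER_WIN = 10.0  # assumption: rule of thumb, not an exact constant


def pythag_wins(runs_scored, runs_allowed, exponent=2.0, games=GAMES):
    """Expected wins from the Pythagorean formula."""
    rs_pow = runs_scored ** exponent
    return games * rs_pow / (rs_pow + runs_allowed ** exponent)


def adjusted_wins(projected_wins, proj_rs, proj_ra, actual_rs, actual_ra):
    """Shift a projected win total by the run-scoring and run-prevention misses."""
    offense_wins = (actual_rs - proj_rs) / RUNS_PER_WIN   # extra runs scored help
    pitching_wins = (proj_ra - actual_ra) / RUNS_PER_WIN  # extra runs allowed hurt
    return projected_wins + offense_wins + pitching_wins


# Yankees: projected 957 scored / 738 allowed, actual 968 scored / 777 allowed.
projected = pythag_wins(957, 738)
adjusted = adjusted_wins(projected, 957, 738, 968, 777)

print(f"Projected: {round(projected)}-{GAMES - round(projected)}")
print(f"Adjusted:  {round(adjusted)}-{GAMES - round(adjusted)}")
```

The offense adds about a win and the pitching costs about four, which is how a 102-60 projection turns into roughly 99-63.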
To recap, Bill James and PECOTA missed on the Sox' projections by four or five games, a fairly significant margin. PECOTA was perhaps a game better in projecting the Yankees. Now, at least, we have a margin of error to consider when we project the 2008 teams -- something that I'm sure we're all looking forward to.
Great post Paul thanks for that! I'll need to keep this all in mind for my fantasy team next year. When does that start by the way?
Posted by: sam-YF | Thursday, November 08, 2007 at 10:59 PM
I'm too distracted by the collusion accusation to focus on PECOTA right now. This Arod thing gets weirder and weirder by the day.
Posted by: no sleep til brooklyn SF | Friday, November 09, 2007 at 01:16 AM
Great post - fascinating stuff, makes me want to look a bit more into the inner workings of how they do these projections.
I'm a little curious as to why they bother projecting W-L records for the pitchers. I suppose you can make some guesses for relief pitchers based on the situations you think they'll be used in and the total runs you think they'll give up, but for starters W-L is so dependent on the opposing pitcher, and therefore a complete crapshoot.
And where are the projections of IP?!
But enough whining from me - great post.
Posted by: Jackie (SF) | Friday, November 09, 2007 at 01:22 AM
NSTB - collusion? what? I haven't heard anything about this, and it doesn't seem to be in the headlines...?
Posted by: Jackie (SF) | Friday, November 09, 2007 at 01:29 AM
The players' union released a two paragraph statement warning the owners against colluding because they got nervous about Theo's idea to have all the GMs in one room talk about what players they were interested in at one time (as opposed to the inefficient method of making 29 different phone calls).
In it, the union mentions it believes the commissioner is trying to hold down the salary of at least one player. Obviously, that's referring to A-Rod, but it all sounds like rumor, unfounded speculation and silliness to me.
Posted by: Paul SF | Friday, November 09, 2007 at 01:33 AM
My favorite comment from both threads...
Igawa will be in the conversation for rookie of the year. i know he throws in the high 80s and he has weak offspead pitches but i think with some major league coaching he will fair better then matsuzaka because matsuzaka has immense pressure to succede
hahahah. Sometimes it's funny to read what we write in hindsight.
Posted by: Brad | Friday, November 09, 2007 at 07:55 AM
brad, you're saying DiceK had a better year than Igawa? Well i guess we are each entitled to our own opinions...
Posted by: sam-YF | Friday, November 09, 2007 at 08:06 AM
Incredible work.
Posted by: YF | Friday, November 09, 2007 at 08:59 AM
Amazing post. It's always fun to look back on predictions and expectations, and it's a pleasure to take an in-depth look at the numbers you've gathered.
Posted by: Atheose | Friday, November 09, 2007 at 09:03 AM
thanks Paul, awesome stuff.
Posted by: SF | Friday, November 09, 2007 at 09:39 AM
Wow man, wow. While we all appreciate the work, you need to get out of the house man.
Posted by: LocklandSF | Friday, November 09, 2007 at 09:49 AM
Again, great stuff, Paul!
Posted by: Nick-YF | Friday, November 09, 2007 at 10:06 AM
Nice. Love that number-crunchin'.
Heh, I say the Sox fell short on win projections almost entirely because of...Eric Gagne.
Posted by: Devine | Friday, November 09, 2007 at 10:31 AM