A MajorWager exclusive analysis by poster Spraguer
Here, I will walk you through one method of baseball handicapping. Plenty is left unsaid, but there are a few nuggets of information here on both methodology and the current playoff series. This is not precisely how I handicap baseball, but it is very close in many ways and the underlying principles are the same.
First, we will go through a handicapping of the Red Sox vs. Angels series. I will explain each "move", or number manipulation, in the context of this series, yielding win probabilities for each game and the series as a whole. For the other three series, I will simply list the relevant numbers and conclusions and you, the reader, can refer back to the Red Sox vs. Angels analysis for explanation/clarification of what the numbers mean and why they are there.
In addition to series picks, you will find win probabilities for each game of each series. These presume that no major injuries occur between now and then and that the posted pitching matchups remain the same.
If you just want the picks... Cleveland and Philly offer some value in their respective first round series (keep in mind, of course, that this is baseball and anything can happen).
Red Sox vs. Angels
The Red Sox and Angels are a study in contrasts on offense, with the former displaying the tried and true combination of power and patience and the latter deriving much of its prowess from batting average and speed.
The Red Sox scored 867 runs this season, which adjusts to 842 after accounting for the effects of Fenway Park. The Angels scored 822 runs, or 853 in a neutral run environment. The temptation, therefore, is to presume that we are dealing with two equally potent offenses, but that statement can be misleading for a couple of reasons.
First, runs scored are not the best indicator of the ability to score runs - they are merely evidence that runs were scored in the past. Run estimators such as Linear Weights, Runs Created or Base Runs (which break runs down into their component events and assign values to those events) are better than actual past runs scored at predicting future runs scored. This is somewhat counter-intuitive and best appreciated by considering the difference between process and results - if certain processes (x number of doubles, for example) have historically correlated strongly enough with certain results (y number of runs, for example), then we can best predict future results by researching current processes. At any rate... I'm going to use Runs Created per 27 outs rather than simple runs to measure the offenses.
Second, depth is much less of an asset in the playoffs than in the regular season, as days off for the recently fragile likes of Ramirez, Ortiz, Drew, Anderson and Guerrero will no longer be an issue. In order to correct for this - that is, to correct regular season offensive output to the playoff context - we have to discount the contributions of players like Hinske and Willits. Pinch-hitting will still occur, but for a few reasons (the negligible impact of AL pinch-hitting being chief among them) we will focus on the starting lineups.
For the Red Sox (Runs Created per 27 outs in parentheses): Varitek (5.55), Youkilis (6.67), Pedroia (6.47), Lowell (6.68), Lugo (3.69), Ramirez (6.53), Crisp (4.52), Drew (5.71), and Ortiz (9.97), for a team average of 6.20 runs per 27 outs, or 5.85 per 25.5 outs (as there are in an average game because the bottom of the ninth is only played half the time). After correcting for park effects, we get 5.69, which is a reasonable runs per game expectation for these Red Sox. And now the Angels: Mathis (3.05), Kotchman (5.81), Kendrick (4.99), Izturis (5.42), Cabrera (5.15), Anderson (5.92), Matthews (4.75), Figgins (6.68), Guerrero (7.73), for a team average of 5.50 runs per 27 outs, or 5.19 per 25.5 outs. After correcting for park effects, we get 5.39.
That was a dense paragraph, but the point is that the Red Sox should be about a third of a run better than the Angels on offense.
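For anyone who wants to check the lineup arithmetic, here is a minimal Python sketch. The RC/27 figures are the ones quoted above; the function name `runs_per_game` is just illustrative, every hitter is weighted equally, and the park adjustment is left out because the park factors used are not given here.

```python
# Runs Created per 27 outs for each starting lineup, as quoted in the text.
RED_SOX = [5.55, 6.67, 6.47, 6.68, 3.69, 6.53, 4.52, 5.71, 9.97]
ANGELS = [3.05, 5.81, 4.99, 5.42, 5.15, 5.92, 4.75, 6.68, 7.73]

OUTS_PER_GAME = 25.5  # the bottom of the ninth is played about half the time


def runs_per_game(rc27_by_hitter):
    """Average the lineup's RC/27, then scale from 27 outs to 25.5 outs."""
    team_rc27 = sum(rc27_by_hitter) / len(rc27_by_hitter)
    return team_rc27 * OUTS_PER_GAME / 27


print(round(runs_per_game(RED_SOX), 2))  # 5.85 before the park adjustment
print(round(runs_per_game(ANGELS), 2))   # 5.19 before the park adjustment
```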
To measure starting pitcher performance, we are going to focus on those statistics which fall within the control of the pitcher, specifically homeruns, strikeouts, walks and groundball rates. The other events - non-homerun balls in play - we will ignore. There is pretty extensive research on the idea that batting average on balls in play for any given pitcher is a function of defense and randomness - googling Voros McCracken will get you started if you are so inclined - and I don't wish to rehash that here, except to say that I subscribe to the research and its more contemporary iterations.
There is also research that indicates that the percentage of a pitcher's flyballs allowed that go for homeruns will regress toward the league average (usually around ten percent) over time.
With these two thoughts in mind, we are going to use a statistic called xFIP to form our expectations for the pitchers. FIP stands for Fielding Independent Pitching, an ERA-style number derived solely from homeruns, walks, hit batsmen, strikeouts and innings pitched. xFIP is the same as FIP, except that it corrects the pitcher's homeruns per flyball to 0.1, helping to correct for both short-term fluke and home park influence. The numbers are put on an ERA scale, so they are familiar to look at as well as translatable to actual runs. xFIP numbers are available at hardballtimes.com.
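To show the shape of the calculation, here is a rough Python sketch of FIP and xFIP. The 13/3/2 weights follow the commonly published form of FIP. The additive constant (3.20 here) changes by season and source, so it and the function names are assumptions, not the exact recipe behind the numbers quoted in this article.

```python
def fip(hr, bb, hbp, k, ip, constant=3.20):
    """Fielding Independent Pitching on an ERA-like scale.

    constant is an assumed league adjustment; it changes from year to year.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(flyballs, bb, hbp, k, ip, lg_hr_per_fb=0.10, constant=3.20):
    """Same as FIP, but with home runs replaced by flyballs * league HR/FB."""
    expected_hr = flyballs * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)
```

A pitcher who has allowed home runs on fewer than ten percent of his flyballs will show an xFIP higher than his FIP, and vice versa.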
We are also going to need to know how deep into the game each starting pitcher is likely to go, so here are the xFIP numbers for the Red Sox starting pitchers with average innings per start in parentheses: Beckett 3.56 (6.7), Matsuzaka 4.42 (6.4), Schilling 4.48 (6.3). And for the Angels: Lackey 4.09 (6.8), Escobar 4.31 (6.5), Weaver 4.83 (5.8).
The biggest difference between bullpen usage in the regular season and bullpen usage in the playoffs is that the better pitchers are used more often and in higher leverage situations than are the lesser pitchers.
In the regular season, the import of each pitcher's contribution to the overall success of the bullpen goes something like this, in descending order from the closer to the sixth man: 31%, 23%, 16%, 13%, 10%, 7%. These numbers are derived from my own research, which combines leverage (a stat available at places like fangraphs.com and Baseball Prospectus) and innings distribution. Without adequate playoff leverage data, we have to guess a little at how these numbers would look for the playoffs. My educated guess is a five-man scale closely resembling 37%, 28%, 20%, 10%, 5%.
Remember, we don't want cumulative bullpen numbers - those tend to deceive us by giving innings by mop-up men the same weight as innings by closers and other high-leverage relief aces.
So, applying our playoff-leverage bullpen weights to the Red Sox, and using the same xFIP measure that we used for the starting pitchers, we get the following (weight first, xFIP in parentheses): Papelbon 37% (3.03), Gagne 28% (4.21), Okajima 20% (3.72), Delcarmen 10% (4.07), Timlin 5% (4.88). So, our weighted Red Sox bullpen xFIP is 3.69. After re-adjusting this number upward to get it back onto the proper scale for runs allowed per game, necessary because the number is artificially deflated by the weighting process, we get 3.94.
For the Angels, we will use the same weights and the following xFIP numbers: Rodriguez (3.53), Shields (4.12), Speier (4.19), Oliver (4.46), Bootcheck (4.52). So, our weighted Angels bullpen xFIP is 3.97. Re-adjusted as mentioned in the last paragraph, we get 4.14.
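The weighted bullpen figures can be reproduced with a short Python sketch. The weights are the playoff guesses given above and the xFIP values are the ones quoted for each reliever; `weighted_xfip` is an illustrative name. The upward re-adjustment to 3.94 and 4.14 is not reproduced, because the method behind it is not spelled out.

```python
# Assumed playoff leverage weights, closer first (the article's educated guess).
PLAYOFF_WEIGHTS = [0.37, 0.28, 0.20, 0.10, 0.05]

# xFIP per reliever, in the order listed in the text.
RED_SOX_PEN = [3.03, 4.21, 3.72, 4.07, 4.88]  # Papelbon ... Timlin
ANGELS_PEN = [3.53, 4.12, 4.19, 4.46, 4.52]   # Rodriguez ... Bootcheck


def weighted_xfip(xfips, weights=PLAYOFF_WEIGHTS):
    """Leverage-weighted bullpen xFIP; the weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * x for w, x in zip(weights, xfips))


print(round(weighted_xfip(RED_SOX_PEN), 2))  # 3.69
print(round(weighted_xfip(ANGELS_PEN), 2))   # 3.97
```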
Defense is an important consideration, especially when we are using defense independent pitching measures. I use Davenport Translations (Baseball Prospectus) for the defense of each player in each lineup, and I regress them toward the mean somewhat. Without getting into specifics, I have the Red Sox defense as being worth 0.23 runs per game (very good) and the Angels at 0.13 runs per game (also good).
Bringing it all Together
Red Sox with Beckett starting: 6.7 innings of 3.56 xFIP, or 2.65 earned runs, leaving 1.8 innings for the bullpen, at a 3.94 xFIP, or 0.79 earned runs, for a total of 3.44. Next, we have to multiply the number by 1.08 to put us on a runs scale, rather than an earned runs scale (1.08 is the league ratio of runs to earned runs). So, now we are at 3.72. Finally, we subtract 0.23 runs to account for the Sox' strong defense, and we end up at 3.49.
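The same steps can be written as one small Python function. This is only a sketch: the function and constant names are made up for illustration, and the 8.5 innings per game is implied by the 6.7 + 1.8 split above. The text rounds at each step, so carrying full precision lands about a hundredth of a run below the 3.49 quoted.

```python
INNINGS_PER_GAME = 8.5       # 25.5 outs, matching the offensive scale
RUNS_PER_EARNED_RUN = 1.08   # league ratio of runs to earned runs


def expected_runs_allowed(starter_xfip, starter_ip, pen_xfip, defense):
    """Starter plus bullpen earned runs, scaled to runs, minus defensive credit."""
    starter_er = starter_xfip * starter_ip / 9
    pen_er = pen_xfip * (INNINGS_PER_GAME - starter_ip) / 9
    return (starter_er + pen_er) * RUNS_PER_EARNED_RUN - defense


# Beckett: 3.56 xFIP over 6.7 innings, 3.94 bullpen, 0.23 runs of defense.
beckett = expected_runs_allowed(3.56, 6.7, 3.94, 0.23)
print(round(beckett, 2))
```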
And we know from our analysis of the hitting above that the Sox are good for 5.69 runs per game on offense. When we plug these numbers into Bill James' expected wins formula (also known as "Pythagorean" record), we see that the Red Sox, in the playoffs, with Josh Beckett pitching, are a .710 team against league average competition (interesting to note: the Sox went 21-9 in Beckett's 30 starts this year, which is a .700 winning percentage).
Angels with Lackey starting: 6.8 innings of 4.09 is 3.09 earned runs. 1.7 innings of the bullpen at 4.14 is 0.78 earned runs. That is 3.87 total earned runs multiplied by 1.08 for a total of 4.18. After adjusting for the Angels defense, we are left with 4.05.
Our offensive expectation for the Angels is 5.39 runs per game, so again by using James' formula we can determine that, in the playoffs, with John Lackey pitching, the Angels are a .628 team.
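Here is a sketch of the expected wins formula in Python. The exponent is not stated above; 1.83 is an assumption, chosen because it reproduces the .710 and .628 figures from the quoted runs scored and runs allowed. The classic exponent of 2 would give a higher number for the Red Sox, roughly .727.

```python
def pythagorean(runs_scored, runs_allowed, exponent=1.83):
    """Expected winning percentage against league-average competition.

    exponent=1.83 is an assumption; the article does not state the value used.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


print(round(pythagorean(5.69, 3.49), 3))  # Red Sox with Beckett: 0.71
print(round(pythagorean(5.39, 4.05), 3))  # Angels with Lackey: 0.628
```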
In game one, then, wherein Lackey will be facing Beckett, we have a .628 team playing a .710 team. Home advantage in baseball is historically around four percent, so to cap game one at Fenway Park, we have to adjust the Red Sox upward to .750.
Another Bill James formula, Log5, is used here to determine the probability of a .628 team beating a .750 team. Feel free to google Log5 if you like, but the answer is that the .628 team will beat the .750 team 36% of the time.
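For those who would rather not google it, Log5 is short enough to sketch in Python. The formula is the standard one; the variable names and the `HOME_EDGE` constant are illustrative, with the four percent home advantage simply added to the home team's rating as described above.

```python
HOME_EDGE = 0.04  # historical home advantage cited in the text


def log5(a, b):
    """Probability that a team of quality a beats a team of quality b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


angels_road = 0.628               # Angels with Lackey
red_sox_home = 0.710 + HOME_EDGE  # Red Sox with Beckett, at Fenway

print(round(log5(angels_road, red_sox_home), 2))  # 0.36
```

Swapping the arguments gives the Red Sox side of the same game, and the two results always add up to 1.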
Now, we can repeat the methodology for the other four starting pitchers in the series. When we do, we get: Matsuzaka 4.16, Schilling 4.20, Escobar 4.23, Weaver 4.57. With those numbers, we can now measure the quality of each team with each starting pitcher, like we did with Beckett and Lackey above.
Red Sox with Beckett: .710
Red Sox with Matsuzaka: .639
Red Sox with Schilling: .635
Angels with Lackey: .628
Angels with Escobar: .609
Angels with Weaver: .575
Now, we can use Log5, our offensive projections, pitching matchups and home advantage to come up with winning percentages for each team in each game.
10/3 LAA (Lackey) @ Bos (Beckett): Red Sox have a .640 probability of winning.
10/5 LAA (Escobar) @ Bos (Matsuzaka): Red Sox have a .576 probability of winning.
10/7 Bos (Schilling) @ LAA (Weaver): Red Sox have a .522 probability of winning.
10/8 Bos (Beckett) @ LAA (Lackey): Red Sox have a .549 probability of winning.
10/10 LAA (Escobar) @ Bos (Matsuzaka): Red Sox have a .576 probability of winning.
We can use those numbers to come up with series prices. For example, the probability of a Red Sox sweep is .640*.576*.522, or 19.2%. The probability of the Red Sox winning games 1 and 2, losing game 3 and winning game 4 is 9.7%. In total, there are ten different win/loss sequences by which a team can win a best three-out-of-five series. After adding up the probabilities of the ten sequences that result in a Red Sox series win, we get a .635 probability of the Red Sox beating the Angels in the series.
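The ten sequences are tedious to add up by hand, so here is a Python sketch that enumerates them. The per-game probabilities are the Red Sox figures listed above; `winning_paths` is an illustrative helper name. It assumes the games are independent and stops each sequence as soon as one side reaches three wins.

```python
RED_SOX_GAME_PROBS = [0.640, 0.576, 0.522, 0.549, 0.576]  # games 1-5, from the text


def winning_paths(game_probs, wins_needed=3):
    """Map each win/loss sequence that clinches the series to its probability."""
    paths = {}

    def walk(game, wins, losses, seq, prob):
        if wins == wins_needed:
            paths[seq] = prob
            return
        if losses == wins_needed:
            return  # series lost; not a winning path
        p = game_probs[game]
        walk(game + 1, wins + 1, losses, seq + "W", prob * p)
        walk(game + 1, wins, losses + 1, seq + "L", prob * (1 - p))

    walk(0, 0, 0, "", 1.0)
    return paths


paths = winning_paths(RED_SOX_GAME_PROBS)
print(len(paths))                     # 10
print(round(paths["WWW"], 3))         # 0.192
print(round(paths["WWLW"], 3))        # 0.097
print(round(sum(paths.values()), 3))  # 0.635
```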
The (juice-free) line for a .635 probability would be -174. The Red Sox are currently favored -175 at The Greek.
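Converting a probability into a juice-free moneyline is a one-liner in each direction. This Python sketch uses the usual American odds convention; `fair_moneyline` is an illustrative name.

```python
def fair_moneyline(p):
    """Convert a win probability into a no-vig American moneyline."""
    if p >= 0.5:
        return -round(100 * p / (1 - p))  # favorite: amount risked to win 100
    return round(100 * (1 - p) / p)       # underdog: amount won on 100 risked


print(fair_moneyline(0.635))  # -174
```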
Yankees vs. Cleveland
I am using the exact same methods here as in the Red Sox vs. Angels preview. Please refer to the above to learn how these numbers were arrived at and what they mean.
Yankees Runs Created, Park and Lineup Adjusted: 6.26 expected runs per game. Cleveland Runs Created, Park and Lineup Adjusted: 5.50 expected runs per game.
Yankees Weighted Bullpen: 4.22
Cleveland Weighted Bullpen: 4.14
Yankees Defense: 0.12
Cleveland Defense: 0.16
Yankees expected runs allowed per game (based on xFIP, methodology described above) with each starting pitcher, after adjusting for bullpen and defense:
And for Cleveland:
Now we can bring this all back together and handicap the games.
10/4 NYY (Wang) @ Cle (Sabathia): Cleveland has a .565 probability of winning.
10/5 NYY (Pettitte) @ Cle (Carmona): Cleveland has a .533 probability of winning.
10/7 Cle (Westbrook) @ NYY (Clemens): Yankees have a .589 probability of winning.
10/8 Cle (Byrd) @ NYY (Mussina): Yankees have a .627 probability of winning.
10/10 NYY (Wang) @ Cle (Sabathia): Cleveland has a .565 probability of winning.
The Yankees have a .520 probability of winning the series, which would make for a (juice-free) line of -108.
Cleveland is currently +170 at The Greek.
Cubs vs. Diamondbacks
Cubs Runs Created, Park and Lineup Adjusted: 4.91 expected runs per game. Diamondbacks Runs Created, Park and Lineup Adjusted: 4.43 expected runs per game.
Cubs Weighted Bullpen: 4.24
Diamondbacks Weighted Bullpen: 4.37
Cubs Defense: 0.15
Diamondbacks Defense: 0.22
Cubs expected runs allowed per game (based on xFIP, methodology described above) with each starting pitcher, after adjusting for bullpen and defense:
And for the Diamondbacks:
Now we can bring this all back together and handicap the games.
10/3 ChC (Zambrano) @ Arz (Webb): Diamondbacks have a .624 probability of winning.
10/4 ChC (Lilly) @ Arz (Davis): Cubs have a .528 probability of winning.
10/6 Arz (Hernandez) @ ChC (Hill): Cubs have a .684 probability of winning.
10/7 Arz (Owings) @ ChC (Zambrano): Cubs have a .599 probability of winning.
10/9 ChC (Lilly) @ Arz (Webb): Diamondbacks have a .603 probability of winning.
The Cubs have a .532 probability of winning the series, which would make for a (juice-free) line of -114.
The line is currently Chicago -125/Arizona +110 at The Greek, so no value to be had.
Rockies vs. Phillies
Rockies Runs Created, Park and Lineup Adjusted: 4.90 expected runs per game. Phillies Runs Created, Park and Lineup Adjusted: 5.63 expected runs per game.
Rockies Weighted Bullpen: 4.17
Phillies Weighted Bullpen: 4.34
Rockies Defense: 0.30
Phillies Defense: 0.10
Rockies expected runs allowed per game (based on xFIP, methodology described above) with each starting pitcher, after adjusting for bullpen and defense:
And for the Phillies:
Now we can bring this all back together and handicap the games.
10/3 Col (Francis) @ Phi (Hamels): Phillies have a .644 probability of winning.
10/4 Col (Morales) @ Phi (Kendrick): Phillies have a .537 probability of winning.
10/6 Phi (Moyer) @ Col (Jimenez): Rockies have a .522 probability of winning.
10/7 Phi (Lohse) @ Col (Fogg): Phillies have a .519 probability of winning.
10/9 Col (Francis) @ Phi (Hamels): Phillies have a .644 probability of winning.
The Phillies have a .620 probability of winning the series, which would make for a (juice-free) line of -163.
The Phillies are currently -125 at The Greek.
So, to Recap:
That's Cleveland +170 and Philadelphia -125, with a distinct possibility that Arizona and Boston drift into the "value zone" with even a small movement because both are close.
Remember, of course, that this is baseball and anything can happen. These are not guarantees or locks... these are pronouncements of value. In fact, Cleveland, which I have picked, is more likely to lose its series than win it (52%/48%), but +170 is ludicrous. One sobering thought to keep things in perspective: despite the value, there is nearly a one in five chance that both Cleveland and Philadelphia lose and I'll go 0-2 here (of course there is a 30% chance both will win...).
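To put a number on "value", here is a Python sketch of expected profit per unit staked at the posted prices, using the series probabilities from this article. The function names are illustrative, and the both-win and both-lose figures treat the two series as independent, which is an assumption.

```python
def profit_per_unit(moneyline):
    """Profit on a winning 1-unit stake at an American moneyline."""
    return moneyline / 100 if moneyline > 0 else 100 / -moneyline


def expected_value(p_win, moneyline):
    """Expected profit per unit staked, given our own win probability."""
    return p_win * profit_per_unit(moneyline) - (1 - p_win)


cleveland = expected_value(0.480, +170)
phillies = expected_value(0.620, -125)
print(round(cleveland, 3))  # 0.296
print(round(phillies, 3))   # 0.116

# Treating the two series as independent events (an assumption).
both_lose = (1 - 0.480) * (1 - 0.620)
both_win = 0.480 * 0.620
print(round(both_lose, 3))  # 0.198
print(round(both_win, 3))   # 0.298
```

By the same arithmetic, Boston at -175 and Arizona at +110 come out slightly negative, which is why they sit just outside the value zone.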
MajorWager.com poster at-large
Please direct any private comments on this analysis to firstname.lastname@example.org.
If you would like to make or read comments about this article, you may do so by visiting the Mess Hall forum at MajorWager.com where a thread has been started.