Thursday, May 31, 2007

Odds of Making the Playoffs

In this space, I pretend to use and understand sabermetrics.

In actuality, while I would love to know what BABIP really means and how it affects what we can expect from a batter or a pitcher, I truly have no idea.

I often want to apply mathematics to my analysis, but I am never quite sure how to begin. I sometimes have ideas about how to proceed, but they are usually half-baked. For example, when I wanted to figure out if Joe Torre has too quick of a hook with his starters, I had an idea of how to start, but once I had my pretty little table, I didn't quite know how to analyze it.

I try to look up the stats (such as VORP), understand them, and pass on the information in a meaningful way, but ultimately, it never happens to the extent that it could.

So when I want to figure out the odds of the Yankees, with their 22-29 record, making the playoffs, I can go to Coolstandings.com and see they have a 10% chance, or I can go to Baseball Prospectus and see they have a 19.5% chance.

The question is: is looking at how the Yankees have done to this point in 2007 the best way of judging how they will do in the remaining 111 games this season?

To better answer that question, I turn to the email I received from sometime commenter and Knower-Of-All-Things, Anthony. He, unlike me, understands stats, and math, and can come up with some sort of formula to discover all things.

His interest in the Yankees' playoff hopes was piqued by this conversation on Tango Tiger's website, and specifically by a few comments made by one MGL:
No matter how much we (analysts) show how current performance affects performance going forward (not much as compared to past performance, unless we have little past performance of course), EVERYONE, including people like Neyer and Law, and of course, every mainstream fan and commentator on the planet, attaches WAY too much importance to a team’s performance thus far in the season. People cannot imagine that the Yankees may be a true .575 team...That just boggles their mind.
It is MGL's belief that although 51 games seems like a large enough sample to support some sort of projection moving forward, it is still too small to completely ignore preseason projections for individual players. Ultimately, MGL thinks that actual 2007 numbers (at this point in the season) should account for only 10-15% of a player's expected value going forward.

With that thought process, according to Anthony:
The correct way of predicting how the Yankees will do in the final 111 games is to look at the players, not the team. That is, figure out each player's true talent level, figure out how much playing time each will get, then add them up for the team totals.
At this point, I have to completely turn to the Knower-Of-All-Things, as the math leaves me slightly confused. That, and there is no real way to summarize this information any better than Anthony's original statement.
(Slightly-involved math - Should we use MGL's suggested 85/15 split? Tango uses a 5/4/3/2 split for the Marcels. That means he weights 2006 by 5, 2005 by 4, 2004 by 3, and league average by 2. Someone in the above discussion suggested a 6.5 weight for 2007. Let's say we bump it up to 7. A 7/5/4/3/2 split weights this year as 33% of the projection. If we assume 600 PA in all previous years, and 200 PA this year, that makes this year's weight just over 14%. Looks like MGL's guess was pretty darn close.

Considering most players don't yet have 200 PA, and we'll be using a "smarter" projection system than Marcel, that 14% should really be the absolute max. Instead, what I'm doing is adding 1000 PA of the preseason PECOTA projections to every batter's actual 2007 stats. For pitchers, I'm making it 1500 BFP worth of PECOTA. Derek Jeter leads the Yankees in playing time thus far; he gets an 81/19 regression. Andy Pettitte tops out the pitchers: he gets an 84/16 regression. If anything, I'm actually overweighting 2007.)
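
For anyone who wants to check that arithmetic, here is a rough Python sketch. The function names are made up for illustration, the plate-appearance counts are the round numbers assumed in the text, and reading "81/19" as 81% PECOTA and 19% 2007 is my interpretation rather than something Anthony spelled out.

```python
# Marcel-style weights with 2007 bumped to 7, as floated in the discussion.
weights = {"2007": 7, "2006": 5, "2005": 4, "2004": 3, "league_avg": 2}
# Round-number playing time assumed in the text: 200 PA so far, 600 otherwise.
pa = {"2007": 200, "2006": 600, "2005": 600, "2004": 600, "league_avg": 600}


def share_of_2007(weights, pa=None):
    """Fraction of the projection that comes from 2007, optionally scaled by PA."""
    if pa is None:
        pa = {season: 1 for season in weights}
    total = sum(weights[season] * pa[season] for season in weights)
    return weights["2007"] * pa["2007"] / total


def implied_sample(prior_size, prior_share):
    """Sample size that leaves `prior_share` of the weight on a prior of `prior_size`."""
    return prior_size * (1 - prior_share) / prior_share


raw_share = share_of_2007(weights)         # 7 / 21
pa_share = share_of_2007(weights, pa)      # 1400 / 9800
jeter_pa = implied_sample(1000, 0.81)      # PA implied by an 81/19 split
pettitte_bfp = implied_sample(1500, 0.84)  # BFP implied by an 84/16 split

print(f"2007 share by weight alone: {raw_share:.1%}")
print(f"2007 share after scaling by PA: {pa_share:.1%}")
print(f"Implied Jeter PA: {jeter_pa:.0f}, implied Pettitte BFP: {pettitte_bfp:.0f}")
```

The implied playing-time figures are only a sanity check on the quoted splits, not actual 2007 totals.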

Some interesting things come out when we do this. Check out what happens to Bobby Abreu:

2007: .228/.313/.289
PECOTA: .277/.389/.447
Updated: .267/.374/.417

Now here's Posada:

2007: .357/.414/.560
PECOTA: .259/.365/.443
Updated: .275/.372/.462
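
As far as I can tell, the "Updated" lines are just a playing-time-weighted average of the two lines above them. Here is a minimal sketch of that idea in Python. The 225 PA figure is a placeholder I picked, since the actual 2007 playing time is not given here, and weighting AVG and SLG by PA instead of AB is a simplification, so the output should only land in the neighborhood of the quoted numbers.

```python
def blend(actual, projected, actual_pa, prior_pa=1000):
    """Playing-time-weighted average of an observed rate and a projected rate."""
    return (actual * actual_pa + projected * prior_pa) / (actual_pa + prior_pa)


# Slash lines (AVG/OBP/SLG) as quoted above for Abreu.
abreu_2007 = (0.228, 0.313, 0.289)
abreu_pecota = (0.277, 0.389, 0.447)

ASSUMED_PA = 225  # hypothetical placeholder, not Abreu's real 2007 total

abreu_updated = tuple(
    round(blend(actual, projected, ASSUMED_PA), 3)
    for actual, projected in zip(abreu_2007, abreu_pecota)
)
print("Abreu, blended:", "/".join(f"{rate:.3f}" for rate in abreu_updated))
```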

Repeat for every player, add together, wipe hands on pants. For offense, we'll use a simple Runs Created formula. So far the Yankees have scored 268 runs versus 268 runs created. I'd say we can live with that accuracy. For pitchers we'll use Component ERA. Regular ERA won't take into account luck and FIP won't take into account defense, so Component ERA is the best available option.
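
Anthony doesn't say which Runs Created version he used. The sketch below assumes the basic Bill James form, (H + BB) x TB / (AB + BB), and the team totals in the example are invented purely for illustration. Component ERA is left out because I am not confident of its exact formula.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    opportunities = at_bats + walks
    if opportunities <= 0:
        raise ValueError("need at least one at-bat or walk")
    return (hits + walks) * total_bases / opportunities


# Hypothetical team totals, NOT the Yankees' actual 2007 numbers.
example_rc = runs_created(hits=480, walks=200, total_bases=780, at_bats=1750)
print(f"Runs created for the made-up team: {example_rc:.0f}")
```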

So that's the method. What's the result, then? What is the Yankees' true talent level?

.614.

Seriously. .614. Even with the awful start, this is a true 99-win team.
So how do the Yankees go from being a .431 team to a 99-win team? Well, Anthony answered that as well, looking at three specific areas:
Luck - 13 wins. Believe it or not, the team has scored more runs than they've allowed, 268-244. They should be 28-23. A lot of this is due to a dreadful 2-10 record in one-run games. Even if you hate "luck" as an explanation, merely positing that Mariano Rivera will even vaguely resemble his old self from now on will dramatically improve the Yankees' fortunes in close games. Banning Luis Vizcaino from the park will help, too. That their performance thus far more closely reflects a .545 team than a .431 team is worth 13 wins the rest of the way ([.545-.431]*111).

Pitching - 3 wins. Two words: Roger Clemens. Replacing DeSalvo/Wright/Igawa with arguably the best pitcher in baseball is easily worth three wins by itself. Two more words: Phil Hughes. Even if likely improvements by Rivera and Mussina are canceled out by Pettitte and Bruney coming down to earth, the additions of Hughes and Clemens easily make the pitching far, far better. I was surprised this analysis only saw the team's runs allowed per game falling from 4.78 to 4.49, to be honest.

Hitting - 5 wins. The offense has scored 5.25 runs per game this year. This analysis says they'll score 5.76 the rest of the way. Essentially, while Rodriguez, Jeter and Posada are playing over their heads, everyone else is likely to improve. We know Cano and Abreu and Mientkiewicz and Damon are better than they've shown, to various extents. Giambi is old and chubby and may be on his way out; but there's still a very good shot his power returns. And Wil Nieves couldn't hit any worse if he used a whiffle ball bat.
Essentially, Anthony took reasonable offense and pitching improvements and the all-important luck improvement ("the luck improvement, while huge, makes sense when you consider all the in-game pitching injuries and Rivera's late-game failures") and came to the conclusion that this Yankee team is not only capable of winning 90 games, but quite likely to do so.
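
To convince myself the three pieces hang together, I tried a quick Python sketch. Two things in it are my assumptions, not Anthony's: the classic Pythagorean exponent of 2 (he may well have used something else to get .545), and the rule of thumb that roughly 10 runs buy one win.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expectation; the exponent of 2 is an assumption."""
    scored = runs_scored ** exponent
    return scored / (scored + runs_allowed ** exponent)


GAMES_PLAYED, GAMES_LEFT = 51, 111
RUNS_PER_WIN = 10.0  # rule-of-thumb assumption, not taken from the email

expected_pct = pythag_win_pct(268, 244)
expected_wins = expected_pct * GAMES_PLAYED

# Each component expressed as extra wins over the remaining 111 games.
luck_wins = (0.545 - 0.431) * GAMES_LEFT
hitting_wins = (5.76 - 5.25) * GAMES_LEFT / RUNS_PER_WIN
pitching_wins = (4.78 - 4.49) * GAMES_LEFT / RUNS_PER_WIN

print(f"Pythagorean record so far: {expected_wins:.1f} wins in {GAMES_PLAYED} games")
print(f"Luck {luck_wins:.1f}, hitting {hitting_wins:.1f}, pitching {pitching_wins:.1f}")
```

Under those assumptions the components come out close to the 13, 5 and 3 wins quoted above.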

How likely? Well, using a binomial distribution and the aforementioned true talent level, here are the odds of the Yankees finishing with at least each win total:
  • 81 wins: 96.9%
  • 90 wins: 55.4%
  • 95 wins: 19.9%
  • 100 wins: 3.3%
So while it sucks watching the Yankees fumble games like this, it is still far too early to give up hope on playoff chances.
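
Here is a minimal Python sketch of the binomial calculation as I understand it: treat each of the remaining 111 games as an independent coin flip that comes up "win" 61.4% of the time, then add up the chances of getting enough wins on top of the 22 already banked. The independence assumption and the function name are mine.

```python
from math import comb


def prob_at_least(total_wins, current_wins=22, games_left=111, p=0.614):
    """Chance of finishing with at least `total_wins`, assuming independent games."""
    needed = max(total_wins - current_wins, 0)
    return sum(
        comb(games_left, k) * p**k * (1 - p) ** (games_left - k)
        for k in range(needed, games_left + 1)
    )


for target in (81, 90, 95, 100):
    print(f"{target} or more wins: {prob_at_least(target):.1%}")
```

Changing `p`, `current_wins` or `games_left` shows how sensitive the odds are to the true talent estimate.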

The question now is: does Giambi going on the DL for at least 3 weeks in favor of Kevin Thompson have a significant effect on any of the above?

10 Comments:

Anonymous Anthony said...

Worst-case scenario: Giambi is out for the year, his playing time is split among Phelps, Minky and Cairo. Works out to about 15 runs, or one-to-two wins. That's at worst, mind you.

BABIP is easy. Perfectly logical when it's explained properly. Most people get it wrong though.

10:37 PM  
Blogger Tangotiger said...

Great post!

BABIP is batting average on balls in play.

The numerator is H-HR, those I, among the minority, also include reaching base on errors.

The denominator is PA minus "unfieldable" balls (BB+HBP+SO+HR+interference+SacBunts).

Ideally, you would remove all sac attempts, if the data was available.
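
A minimal Python sketch of the definition above. The argument names and the sample numbers are hypothetical, and reached-on-error is left optional since including it is the minority choice.

```python
def babip(h, hr, pa, bb, hbp, so, interference=0, sac_bunts=0, roe=0):
    """Batting average on balls in play, per the definition above.

    Pass roe (reached on error) to include it in the numerator.
    """
    numerator = h - hr + roe
    denominator = pa - (bb + hbp + so + hr + interference + sac_bunts)
    if denominator <= 0:
        raise ValueError("no fieldable balls in play")
    return numerator / denominator


# Made-up season line, for illustration only.
example = babip(h=150, hr=20, pa=650, bb=70, hbp=5, so=100, sac_bunts=5)
with_roe = babip(h=150, hr=20, pa=650, bb=70, hbp=5, so=100, sac_bunts=5, roe=5)
print(f"BABIP: {example:.3f}, with ROE counted: {with_roe:.3f}")
```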

1:01 PM  
Blogger Tangotiger said...

those I = though I

The equivalent to BABIP is 1 minus DER, where DER is defensive efficiency ratio, and is outs made per fieldable ball.

It's a fairly good, though not great, measure of team fielding.

1:03 PM  
Anonymous Sam said...

Awesome post, ^^ and Anf.

I would suggest distinguishing "luck" from "unlikely event". Luck is Willie Bloomquist getting a horseshit call. Rivera pitching poorly is really not luck, it was something under his control. I agree that it is unlikely to continue, but that does not make it "luck", per se. Ditto Vizcaino, and he is unlikely to improve, so that is managerial incompetence, which, again, is unlikely to improve unless the GM becomes more competent, making it more of a likely event.

Secondly, I am skeptical about Roger Clemens. If he gives the Yankees what Schilling is currently giving the Red Sox, I would be ecstatic. Phil Hughes should help, though.

Thirdly, I am not sure Bobby Abreu's decline will be similar to the typical player's. His power decline has been precipitous, and his patience seems to have abandoned him in recent days. I am less optimistic about his resurgence. Mientkiewicz should be better, but his upside is low as well, so that does not project to much of an improvement. Damon in his current state of health is also not the usual PECOTA-projected Damon.

All these would reduce the likelihoods projected, but are hard to quantify. Thus, I would probably qualify these as upper bounds.

9:28 AM  
Anonymous Anthony said...

The thing with Damon is that he's always banged up. His injury woes now are no worse than last year's broken bone in his foot.

PECOTA already predicted a huge dropoff for Abreu. I remember a big debate about that projection in the spring. His Marcel was .290/.406/.480, so using the PECOTA projection already takes into account his huge decline in power the last couple of years.

Now, no projection system can account for his bunting with runners on. Repeatedly. That's a terrible sign, and maybe that means his 2007 is more telling than most. Or maybe he'll turn it around like Jeter in 2004. We don't know, but I think this method finds the happy medium, even if it is more an estimate than a rigorous statistical analysis.

If I had more time and mathematical chops, I'd love to do a better version for all 30 teams.

11:31 AM  
Anonymous Sam said...

You do a great job regardless. Keep it up!

Does any participant in this thread know if "random effects" or "fixed effects" panel data (essentially, time series on a large number of cross-sectional units, which are the players) regressions are employed by PECOTA or others? I think this will be particularly meaningful, and could actually do a good job of quantifying how variables unique to players (or "intangibles") contribute to the overall conditional mean.

6:51 PM  
Anonymous Anthony said...

Weeeelll...that's a bit over my head. I don't believe PECOTA regresses at all in creating the baseline, though I guess technically the method is regressing towards similar players. Not sure of the specifics there; you might want to drop Nate a line at nsilver@baseballprospectus.com.

Marcel and other projection systems regress a lot, though I'm not sure if it's the specific type you mention. Visit Tango's site (http://insidethebook.com/ee/) and look around through the very cool archives. This thread might be a decent place to start: http://www.insidethebook.com/ee/index.php/site/comments/groups_of_players_and_regression_toward_the_mean/

8:17 PM  
Anonymous Anthony said...

Link cut off. Second attempt:

http://www.insidethebook.com/ee/index.php/site/
comments/groups_of_players_and_regression_toward_the_mean/

8:19 PM  
Anonymous Sam said...

Ok, I checked that one. MGL seems to use regression in a way that I am not entirely familiar with. I was using it more in a multivariate regression context, where you have a dependent or "explained" variable (BA, for instance) on the left hand side, and the explanatory variables (age, weight, some measure of athletic ability, contact rate against fastball, curveball etc., GB-FB data) on the right hand side. The R-squared is the ratio of the explained sum of squares to the total sum of squares. R-squared is bounded above by 1, in which case the fit between the model and data is perfect.

So, regression there is a methodology, and not a number, as MGL uses it.
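
To make the R-squared definition concrete, here is a minimal Python sketch with a single explanatory variable instead of the multivariate case, fitted by ordinary least squares. The ages and batting averages are invented for illustration and the function name is hypothetical.

```python
def r_squared(xs, ys):
    """R-squared of a one-variable least-squares fit: explained SS / total SS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]
    explained = sum((f - mean_y) ** 2 for f in fitted)
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total


# Made-up data: age as the explanatory variable, BA as the explained variable.
ages = [24, 26, 28, 30, 32, 34]
averages = [0.270, 0.285, 0.290, 0.280, 0.265, 0.250]
fit_quality = r_squared(ages, averages)
print(f"R-squared: {fit_quality:.2f}")
```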

11:09 PM  
Anonymous Anthony said...

Updated odds, reflecting the 33-32 record as well as lots less Giambi and Malphabet, and lots more Cairo and Melky:

81: 99.2%
90: 71.4%
95: 31.7%
100: 6.2%

10:38 AM  
