Wednesday, May 02, 2007

the five back rule revisited

this page has taken a fairly firm stance that, at least as far as the playoffs go, this year is already over. falling so far behind the division leader so early in the season, as has been documented here and here, is highly indicative of clubs that are going to fail not only to make the playoffs but to breach .500 at year end. as to the disparity of runs scored to runs allowed, one need only observe the weakness of the correlation between april run differential and final fates to understand that what has thus far transpired in these terms means little.

the cubs' pythagorean projection at this time is .593 -- implying a 96-win pace. that is the third-highest such figure in the lot, and the peer group is a somewhat happier one. of the top 15, fully 8 broke .500 even though only two continued to post run differential at as high a pace as they did in april. six of this sample even managed to break a .525 winning percentage (corresponding to 85 wins). though the mean winning percentage of these fifteen is .491, the standard deviation is .064 -- implying a probable range of .427 to .555.
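for anyone who wants to check the arithmetic, here is a minimal sketch in python. the function names are made up for illustration, and the exponent of 2 is the classic pythagorean form rather than anything this page states -- with the 112 runs scored and 91 allowed from the 2007 line of the table, an exponent of 2 gives roughly .602, so the .593 quoted above evidently rests on a somewhat lower exponent.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    exponent=2.0 is the classic form and an assumption here; the post
    does not say which exponent its own figures use.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def win_pace(pct, season_games=162):
    """Convert a winning percentage into wins over a full season."""
    return round(pct * season_games)


def one_sd_range(mean, sd):
    """Range of one standard deviation either side of the mean."""
    return round(mean - sd, 3), round(mean + sd, 3)


if __name__ == "__main__":
    print(pythagorean_pct(112, 91))      # classic exponent, 2007 cubs through 4/30
    print(win_pace(0.593))               # the 96-win pace quoted above
    print(win_pace(0.525))               # the 85-win threshold quoted above
    print(one_sd_range(0.491, 0.064))    # the .427 to .555 range quoted above
```

passing a different `exponent` is the obvious knob to turn if one wants to get closer to the table's own pythag column.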


indeed, as much can be confirmed by a presentation of previous data regarding the five back rule, updated, expanded and newly tabular.

year  5 back  10 back  at year end                          through 4/30
            game     game  playoff  final gb    w    l   win%    w   l   rs   ra  pythag
1984       n        n        y         0   96   65  0.596   12   8   98   70   0.648
1989       n        n        y         0   93   69  0.574   12  11   85   95   0.450
2001     139        n        n         5   88   74  0.543   15   9  107   91   0.573
1977     115      132        n        20   81   81  0.500    7   9   56   77   0.359
1973     111        n        n         5   77   84  0.478   12   8   85   75   0.557
2003     100        n        y         0   88   74  0.543   15  12  150  109   0.641
1985      80      106        n      23.5   77   84  0.478   12   6   60   46   0.619
1998      76      131       wc      12.5   90   73  0.552   14  13  124  112   0.546
2004      75       93        n        16   89   73  0.549   13   9  115   96   0.581
1978      75      156        n        11   79   83  0.488   11   9   70   84   0.418
1987      60       86        n      18.5   76   85  0.472   10  10   90   81   0.548
1975      59       77        n      17.5   75   87  0.463   12   5   93   73   0.608
1996      52      157        n        12   76   86  0.469   13  14  136  128   0.528
1995      42       73        n        12   73   71  0.507    4   1   29   24   0.585
1974      33      101        n        22   66   96  0.407    7  11   72  109   0.320
1991      32       67        n        20   77   83  0.481   10  11   87   82   0.527
1999      31       89        n        30   67   95  0.414   10  10   88   98   0.451
1980      29       80        n        27   64   98  0.395    9   6   77   77   0.500
2006      27       42        n      17.5   66   96  0.407   13  10  109  112   0.488
2005      25       80        n        21   79   83  0.488   12  11  119  107   0.548
1976      25       38        n        26   75   87  0.463    9  10   99  112   0.444
1993      20       50        n        13   84   78  0.519   11  11   88   91   0.485
2007      19        -        -         -    -    -      -   10  14  112   91   0.593
1990      18       52        n        18   77   85  0.475    8  11   62   82   0.375
2000      17       44        n        30   65   97  0.401   10  17  145  175   0.415
1988      16       39        n        24   77   85  0.475   10  12   97  104   0.468
2002      16       39        n        24   67   95  0.414    8  16   86  110   0.390
1982      13       49        n        19   73   89  0.451    7  14   85   92   0.464
1992      12      114        n        18   78   84  0.481    7  13   61   89   0.335
1994      12       24        n      16.5   49   64  0.434    6  15   95  128   0.368
1986       9       26        n        37   70   90  0.438    7  12   75   96   0.390
1979       8      135        n        18   80   82  0.494    8   9   57   69   0.414
1981       8       14        n      21.5   38   65  0.369    2  13   36   73   0.216
1997       6       18        n        16   68   94  0.420    6  19   95  133   0.352
1983       5      116        n        19   71   91  0.438    6  14   65   94   0.338


from this presentation, it doesn't take a mathematician to see that the 2007 cubs are in very deep trouble. if they go on to win their 82nd game later this season, they will be only the second cub club in the last 25 seasons to do so in spite of falling five games back of the division leader before their 70th game was played. (a third, the 1995 club, might have done so had that season not been shortened by a labor dispute.)
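the claim can be checked against the table with a short filter. this is only a sketch: the tuples below are transcribed by hand from the table (year, game at which the club fell five back, final wins), `None` stands in for the table's "n", the unfinished 2007 season is left out, and the function name is hypothetical.

```python
# (year, game number at which the club fell five back, final wins).
# None means the club never fell five back. Transcribed from the table;
# 2007 is omitted because the season is not finished.
SEASONS = [
    (1984, None, 96), (1989, None, 93), (2001, 139, 88), (1977, 115, 81),
    (1973, 111, 77), (2003, 100, 88), (1985, 80, 77), (1998, 76, 90),
    (2004, 75, 89), (1978, 75, 79), (1987, 60, 76), (1975, 59, 75),
    (1996, 52, 76), (1995, 42, 73), (1974, 33, 66), (1991, 32, 77),
    (1999, 31, 67), (1980, 29, 64), (2006, 27, 66), (2005, 25, 79),
    (1976, 25, 75), (1993, 20, 84), (1990, 18, 77), (2000, 17, 65),
    (1988, 16, 77), (2002, 16, 67), (1982, 13, 73), (1992, 12, 78),
    (1994, 12, 49), (1986, 9, 70), (1979, 8, 80), (1981, 8, 38),
    (1997, 6, 68), (1983, 5, 71),
]


def early_laggards_who_won(seasons, before_game=70, min_wins=82):
    """Years in which a club fell five back before `before_game`
    and still finished with at least `min_wins` wins."""
    return sorted(
        year
        for year, fell_back_game, wins in seasons
        if fell_back_game is not None
        and fell_back_game < before_game
        and wins >= min_wins
    )


if __name__ == "__main__":
    print(early_laggards_who_won(SEASONS))  # [1993]
```

run over every season in the table, not just the last 25, the filter still returns only 1993.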

looking at the plight of early laggards from the other side, espn's jayson stark offered this analysis.

We looked at every full season since 1982. Here's what we found:

• Of the 144 teams that made it to the postseason in that span, only eight (or 5.6 percent) came out of April more than three games under .500. Clubs that need to worry most about that history lesson: the Yankees (9-14), Astros (10-14), Cardinals (10-14), Cubs (10-14) and Rangers (10-15).

• Just six of those 144 playoff teams (or 4.2 percent) found themselves more than 4½ games out of a playoff spot after April. Clubs that ought to get nervous about that trend: the Cubs, Cardinals and Astros (all five games out).

• And you wouldn't think the standings would mean much this time of year. But more than half of the 120 teams that found themselves in first place after April (66 of 120) wound up finishing first. And 98 of the 120 (81.7 percent) of the teams that finished the season in first place either led their division or were within 2½ games of the lead at the end of April.


going into today's games, the cubs had fallen further still, to six games behind the milwaukee brewers. again, it must be said -- the greatest of a handful of barriers now is the inertia of the record itself. with the brewers at 17-9, should they play just .500 ball henceforth -- something they're likely to do, or better -- they'll finish at 85-77. this cubs team simply isn't going to be that good barring a miracle. matching that record means going 75-63 the rest of the way -- 12 over .500, or .543 baseball.
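the arithmetic behind those figures is simple enough to sketch in python. the helper names are hypothetical, and a 162-game schedule is assumed.

```python
def finish_at_pace(wins, losses, pace=0.5, season_games=162):
    """Final record if a club plays at `pace` over its remaining games."""
    remaining = season_games - wins - losses
    added_wins = round(remaining * pace)
    return wins + added_wins, losses + (remaining - added_wins)


def record_needed(wins, losses, target_wins, season_games=162):
    """Wins, losses and winning percentage required over the remaining
    games to finish with `target_wins`."""
    remaining = season_games - wins - losses
    need_wins = target_wins - wins
    need_losses = remaining - need_wins
    return need_wins, need_losses, need_wins / remaining


if __name__ == "__main__":
    brewers_final = finish_at_pace(17, 9)             # (85, 77)
    need_w, need_l, need_pct = record_needed(10, 14, brewers_final[0])
    print(brewers_final, need_w, need_l, round(need_pct, 3))
```

the same two functions can be rerun with any later standings to see how the required pace moves as the season wears on.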

this page would not dispute that there are worse ways to be 10-14 and six back. the cubs are chasing only one good club -- the brewers -- and not three or four, which would all but nail the coffin shut. and having scored more than they've allowed is certainly a more promising indicator than its opposite -- after all, the 1984 and 2003 playoff sides both posted positive run differentials through april that only modestly exceed this one's.

however, so did the 1985 and 1975 clubs -- teams which finished with 77 and 75 wins, respectively. furthermore -- indicative of the tremendous variance of just a month's worth of play -- the 1989 club that finished with 93 wins had allowed ten more (95) than they had scored (85) through the same date. as can be seen in the following chart, the correlation between pythagorean figures and actual ones over large sets of data is good -- the ratio of theoretical to actual records should approach one, as it does here -- but the quality of the correlation in individual data points is poor.

not as tightly scattered as one would like
