Tuesday, May 01, 2007

slow starts and run differential, part two

this writer has often found it instructive to revisit fruitful lines of thought with the blessing of greater data to see if new conclusions can be wrung from the rag. as the cubs continue to struggle, much continues to be made of the positive run differential in spite of which the team is losing. the club has scored 112 runs and allowed just 91, but owns a 10-14 record.

how strange is this? what does it say about the club? does it say anything about matters going forward? and, if so, what?

updating this study of some two weeks ago, what follows is a listing of every major league team since 1973 to have posted, as of april 30, both a losing record and a positive run differential. there have been 76 such teams, not including the six newly minted this season. they are sorted by disparity -- maximal to minimal -- between their actual record and the record implied by runs scored and allowed, commonly called the pythagorean record.


            as of 4/30                                     after 4/30
year  team   w   l   win%   rs   ra     rd  pythag    a-t  w after  l after  win% after  win% final
2007  chc   10  14  0.417  112   91  10.3%   0.593  0.177
1998  min   11  16  0.407  137  116   8.3%   0.575  0.168       59       76       0.437       0.432
1975  nyy    9  10  0.474   99   72  15.8%   0.641  0.167       74       67       0.525       0.519
1986  pit    7  10  0.412   87   74   8.1%   0.573  0.161       57       88       0.393       0.395
1990  min    7  12  0.368   88   83   2.9%   0.527  0.158       67       76       0.469       0.457
1980  phi    6   9  0.400   75   67   5.6%   0.551  0.151       85       62       0.578       0.562
1984  mil    9  11  0.450   91   73  11.0%   0.599  0.149       58       83       0.411       0.416
1974  kcr    8  11  0.421   98   86   6.5%   0.559  0.138       69       74       0.483       0.475
1991  atl    8  10  0.444   75   63   8.7%   0.579  0.134       86       58       0.597       0.580
1977  laa    9  13  0.409  117  107   4.5%   0.541  0.131       65       75       0.464       0.457
2007  nyy    9  14  0.391  131  125   2.3%   0.521  0.130
1997  sdp    9  15  0.375  100   99   0.5%   0.505  0.130       67       71       0.486       0.469
1994  phi    9  14  0.391  117  112   2.2%   0.520  0.129       45       47       0.489       0.470
1985  phi    8  11  0.421   80   73   4.6%   0.542  0.121       67       76       0.469       0.463
1997  nym   12  14  0.462  116   97   8.9%   0.581  0.119       76       60       0.559       0.543
1982  lad   10  11  0.476   94   77   9.9%   0.590  0.114       78       63       0.553       0.543
1980  mil    7   8  0.467   80   67   8.8%   0.580  0.113       79       68       0.537       0.531
2005  hou    9  13  0.409   84   80   2.4%   0.522  0.113       80       60       0.571       0.549
1985  stl    8  11  0.421   84   78   3.7%   0.534  0.113       93       50       0.650       0.623
1978  cle    8  11  0.421   85   79   3.7%   0.533  0.112       61       79       0.436       0.434
1977  cin    9  10  0.474   99   82   9.4%   0.585  0.111       79       64       0.552       0.543
1976  bos    6   7  0.462   63   54   7.7%   0.570  0.108       77       72       0.517       0.512
1994  lad   11  12  0.478  129  107   9.3%   0.584  0.106       47       44       0.516       0.509
1985  mil    8  11  0.421   76   72   2.7%   0.525  0.104       63       79       0.444       0.441
2001  fla   10  14  0.417  119  114   2.1%   0.520  0.103       66       72       0.478       0.469
1987  det    9  12  0.429   91   85   3.4%   0.531  0.102       89       52       0.631       0.605
1973  nyy    9  10  0.474   85   72   8.3%   0.575  0.101       71       72       0.497       0.494
1983  det    8   9  0.471   89   76   7.9%   0.571  0.101       84       61       0.579       0.568
1977  nym    8   9  0.471   69   59   7.8%   0.571  0.100       56       89       0.386       0.395
2001  sdp   10  15  0.400  128  128   0.0%   0.500  0.100       69       68       0.504       0.488
2002  tex   10  15  0.400  120  120   0.0%   0.500  0.100       62       75       0.453       0.444
1982  cle    8  10  0.444   97   88   4.9%   0.544  0.100       70       74       0.486       0.481
1978  mil    9  11  0.450  117  105   5.4%   0.549  0.099       84       58       0.592       0.574
1990  sdp    9  10  0.474   90   77   7.8%   0.571  0.097       66       77       0.462       0.463
1999  laa   11  12  0.478  132  112   8.2%   0.574  0.096       59       80       0.424       0.432
1976  sdp    9  10  0.474   85   73   7.6%   0.569  0.095       64       79       0.448       0.451
1999  kcr    9  11  0.450  109  100   4.3%   0.539  0.089       55       86       0.390       0.398
1988  mil    9  11  0.450   82   76   3.8%   0.535  0.085       78       64       0.549       0.537
1989  lad   11  13  0.458   79   72   4.6%   0.542  0.084       66       70       0.485       0.481
2006  atl   10  14  0.417  115  115   0.0%   0.500  0.083       69       69       0.500       0.488
1979  stl    9  10  0.474   86   76   6.2%   0.556  0.082       77       66       0.538       0.531
1983  chw    8  10  0.444   91   86   2.8%   0.526  0.081       91       53       0.632       0.611
1989  min   10  12  0.455  108  100   3.8%   0.535  0.080       70       70       0.500       0.494
2005  mil   10  13  0.435  102   99   1.5%   0.514  0.079       71       68       0.511       0.500
1978  stl    9  12  0.429   90   89   0.6%   0.505  0.077       60       81       0.426       0.426
1979  chw    9  11  0.450  106  100   2.9%   0.526  0.076       64       76       0.457       0.456
1999  cin    9  12  0.429   97   96   0.5%   0.505  0.076       87       55       0.613       0.589
2006  lad   12  13  0.480  122  108   6.1%   0.555  0.075       76       61       0.555       0.543
1989  phi   11  12  0.478  113  101   5.6%   0.551  0.073       56       83       0.403       0.414
2000  oak   12  13  0.480  135  121   5.5%   0.550  0.070       79       57       0.581       0.565
1975  bal    7   9  0.438   66   65   0.8%   0.507  0.069       83       60       0.580       0.566
1979  det    7   9  0.438   80   79   0.6%   0.506  0.068       78       67       0.538       0.528
2007  phi   11  14  0.440  123  121   0.8%   0.507  0.067
2004  phi   10  11  0.476   88   80   4.8%   0.543  0.067       76       65       0.539       0.531
2006  ari   12  13  0.480  124  112   5.1%   0.546  0.066       64       73       0.467       0.469
2002  bal   12  14  0.462  121  114   3.0%   0.527  0.066       55       81       0.404       0.414
1994  sea   10  13  0.435  115  115   0.0%   0.500  0.065       39       50       0.438       0.438
2000  sdp   11  14  0.440  137  136   0.4%   0.503  0.063       65       72       0.474       0.469
2003  laa   13  14  0.481  140  127   4.9%   0.544  0.063       64       71       0.474       0.475
1998  sea   12  15  0.444  161  159   0.6%   0.506  0.061       64       70       0.478       0.472
1981  atl    9  10  0.474   68   63   3.8%   0.535  0.061       41       46       0.471       0.472
2002  hou   11  14  0.440  121  121   0.0%   0.500  0.060       73       64       0.533       0.519
1998  oak   12  14  0.462  136  130   2.3%   0.521  0.059       62       74       0.456       0.457
1983  sdp   10  12  0.455  107  104   1.4%   0.513  0.058       71       69       0.507       0.500
1989  bos   10  12  0.455  114  112   0.9%   0.508  0.054       73       67       0.521       0.512
1991  chc   10  11  0.476   87   82   3.0%   0.527  0.051       67       72       0.482       0.481
1985  atl    9  10  0.474   77   73   2.7%   0.524  0.051       57       86       0.399       0.407
1994  kcr    9  11  0.450  117  117   0.0%   0.500  0.050       55       40       0.579       0.557
2007  oak   12  13  0.480   95   89   3.3%   0.530  0.050
1996  chc   13  14  0.481  136  128   3.0%   0.528  0.046       63       72       0.467       0.469
2005  nym   11  13  0.458  110  109   0.5%   0.504  0.046       72       66       0.522       0.512
1974  cle   10  11  0.476  100   96   2.0%   0.519  0.042       67       74       0.475       0.475
2005  sdp   11  13  0.458  105  105   0.0%   0.500  0.042       71       67       0.514       0.506
1996  pit   12  14  0.462  131  131   0.0%   0.500  0.038       61       75       0.449       0.451
1987  kcr    9  10  0.474   84   82   1.2%   0.511  0.037       74       69       0.517       0.512
1986  tex    9  10  0.474  103  101   1.0%   0.509  0.035       78       65       0.545       0.537
1997  kcr   11  12  0.478  112  109   1.4%   0.512  0.034       56       82       0.406       0.416
1986  kcr    9  10  0.474   73   72   0.7%   0.506  0.033       67       76       0.469       0.469
1974  phi   10  11  0.476   89   88   0.6%   0.505  0.029       70       71       0.496       0.494
2007  fla   12  13  0.480  143  141   0.7%   0.506  0.026
1993  atl   12  13  0.480   83   82   0.6%   0.506  0.026       92       45       0.672       0.642
2007  cin   12  13  0.480  109  109   0.0%   0.500  0.020


from this, what can we learn?

the first and most obvious thing is the rarity of the 2007 cubs' current situation. in 35 seasons, this team shows the greatest early season disparity between actual and theoretical record. this must be seen as a consequence of one of the stranger splits you'll likely see, dear reader -- the cubs are both 0-6 in one-run games, and 6-1 in blowout games (that is, those decided by five or more runs). both of these trends figure to revert to the mean over time.

the second and less enchanting bit of information that can be gleaned is that, despite that rare and massive difference, the outlook for the cubs going forward is perhaps not nearly so positive as one might hope based on run differential to date. as one can see, of the next 15 clubs ranked by how far they underperformed their pythagorean projection (setting aside this year's yankees, whose season is unfinished), only six were able to reach .500 by year-end -- and further, just three of the ten with the largest gaps. the average year-end winning percentage of these 15 is .487 with a standard deviation of .056. it would seem that a brisk pace of positive run differential through april 30 is hardly a guarantor of future success.

the data can be otherwise sorted to yield further unwelcome conclusions. this writer has noted that the largest barrier to success is now the simple fact of being 10-14. it is hard to make up four games against .500 for most clubs, particularly for one that is thought to be relatively mediocre.

indeed, when this sample is sorted by actual winning percentage through april 30, the cubs place 12th-worst of the 82 teams at .417 -- and of the worst 15 by that measure, just three have finished the year over .500. the average winning percentage of this lot is .483 with a standard deviation of .056. as it happens, the large majority of clubs in the entire sample who did finish at or over break-even in the end were no more than two games under .500 at this point.

there is also, however, some reason for hope.

the cubs' pythagorean projection at this time is .593 -- implying a 96-win pace. that is the third-highest such figure in the lot, and the peer group is a somewhat happier one. of the top 15, fully 8 broke .500 even though only two continued to post run differential at as high a pace as they did in april. six of this sample even managed to break a .525 winning percentage (corresponding to 85 wins). though the mean winning percentage of these fifteen is .491, the standard deviation is .064 -- implying a probable range of .427 to .555.
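
the conversions in that paragraph are simple enough to check by hand, but a short sketch may save the reader the trouble. it assumes a 162-game schedule and reads "probable range" as the mean plus or minus one standard deviation; the function names are this sketch's own.

```python
GAMES_IN_SEASON = 162  # assumption: a full modern schedule


def win_pace(win_pct, games=GAMES_IN_SEASON):
    """wins over a full season at a given winning percentage."""
    return round(win_pct * games)


def one_sd_range(mean, sd):
    """the band from one standard deviation below the mean to one above."""
    return (round(mean - sd, 3), round(mean + sd, 3))


cubs_pace = win_pace(0.593)               # the cubs' pythagorean projection
threshold_wins = win_pace(0.525)          # the .525 mark cited for six clubs
peer_range = one_sd_range(0.491, 0.064)   # the top-15 peer group

print(cubs_pace, threshold_wins, peer_range)  # 96 85 (0.427, 0.555)
```

if year-end records were roughly normally distributed, about two-thirds of such clubs would land inside that band; with only fifteen teams in the group, that is a loose guide at best.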

how then to view the data? the hopeful may take away elements of the last point while ignoring the rest, but the balance would seem to indicate that the cubs have made in april a rather negative statement in regard to where they will be in september, in spite of posting a losing record in one of the more encouraging ways imaginable. that further comports with other estimations of what has transpired in april. with the team due to lose more to mean reversion in pitching than it stands to gain in offense, and with improvement in one-run games likely to be offset by a higher loss rate in blowouts, it is hard to see how the promise of this early-season run differential will materialize into a dramatically better ballclub.

given further that the club is now chasing a decent milwaukee club at a five-and-a-half game disadvantage, it seems here that what was indicated a week ago was probably no accident.
