Posts Tagged “Runs & RBI”

Building on my last post about which statistics correlate well with Runs and RBI, I decided to test out runs created. Using eXtrapolated Runs, I calculated the correlation between runs created and actual Runs to be .978, while the correlation between runs created and RBI is .956 (using 2005 data). Evidently Runs is a slightly better measure of actual performance than RBI — more proof of the media’s foolish love affair with the Run Batted In. The best correlation, however, belonged to XR and Runs/RBI combined (.987). More information yields more accuracy, yet again.
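For anyone who wants to reproduce this kind of number, here is a minimal sketch of the correlation calculation. It assumes plain Pearson correlation (the post does not name the method), and the sample figures are made up for illustration; they are not real 2005 data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical players: (runs created, actual Runs, actual RBI).
sample = [(110, 105, 100), (85, 90, 70), (60, 55, 65), (40, 45, 38)]
xr = [s[0] for s in sample]
runs = [s[1] for s in sample]
rbi = [s[2] for s in sample]
runs_rbi_avg = [(r + b) / 2 for r, b in zip(runs, rbi)]

print(round(pearson(xr, runs), 3))          # XR vs. Runs
print(round(pearson(xr, rbi), 3))           # XR vs. RBI
print(round(pearson(xr, runs_rbi_avg), 3))  # XR vs. Runs/RBI combined
```

Whether "combined" means the sum or the average of Runs and RBI makes no difference here, since scaling one variable by a constant leaves the correlation unchanged.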

I then computed the difference between each player’s runs created and the average of his Runs and RBI (XR minus that average). Let’s call this the Fortunate Number, the idea being that players with more XR than Runs/RBI were unfortunate because of their spot in the lineup, the quality of the lineup they hit in, or other uncontrollable factors. A positive Fortunate Number means that a player deserved better, while a negative number implies good luck.
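The calculation is simple enough to sketch in a few lines. The player names and stat lines below are hypothetical, chosen only to show the sign convention.

```python
def fortunate_number(xr, runs, rbi):
    """Runs created minus the average of actual Runs and RBI.

    Positive: the player produced more than his Runs/RBI show (unfortunate).
    Negative: his Runs/RBI outran his production (fortunate).
    """
    return xr - (runs + rbi) / 2

# Hypothetical stat lines, not real 2005 numbers: name -> (XR, Runs, RBI).
players = {
    "slugger,sam": (120, 100, 96),
    "tablesetter,tom": (70, 88, 60),
}

# Sort ascending so the most fortunate (most negative) player comes first.
ranked = sorted(players, key=lambda name: fortunate_number(*players[name]))
for name in ranked:
    print(f"{fortunate_number(*players[name]):+.0f} {name}")
```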

The Top 10 Fortunate Players:
-14 cantu,jorge
-13 ramirez,manny
-12 anderson,garret
-12 rivera,juan
-11 rios,alex
-11 hernandez,jose
-11 lamb,mike
-11 ginter,keith
-11 feliz,pedro
-10 teahen,mark

The Top 10 Unfortunate Players:
32 lee,derrek
27 winn,randy
25 helton,todd
23 roberts,brian
23 giles,brian
23 bay,jason
19 tracy,chad
19 lugo,julio
18 giambi,jason
18 counsell,craig

Next, I found the correlation between runs created and the Fortunate Number. It turned out to be .466, meaning the better the player, the less fortunate they tended to be. In other words, in a batting lineup the worse players owe credit for some of their Runs and RBI to the better players. The better players are givers and the worse players are receivers.

Crude Models For Predicting Runs and RBI

Let me say up front that I think the following models are severely lacking in quality. I’ve mixed parts of a theoretical model with an empirical regression analysis and made more than a couple subjective decisions. However, the investigation produces some intriguing observations.

I wanted to find a way to predict a player’s Runs and RBI totals as a function of his individual statistics. For Runs, I decided to use 1B, 2B, 3B, HR, BB, HBP, SB, and CS as the independent variables. I first tried using Outs as well, but then decided that the contribution of a batting out to scoring a run should be zero by definition (not reaching base doesn’t do anything towards scoring). Also, since a HR accounts for exactly one run every time, I took it out of the regression calculations. Here are the coefficients for “modeling” Runs:

1b 0.325
2b 0.289
3b 1.010
bb 0.208
hbp 0.194
sb 0.448
cs -0.14
hr 1.000
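Here is a rough sketch of how a regression with the HR value pinned at 1 could be set up: subtract HR from Runs, fit the remaining stats, then add HR back with a coefficient of exactly 1. It assumes ordinary least squares with no intercept, which the post does not specify, and the function names and synthetic players are illustrative only.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; some stats are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_no_intercept(rows, targets):
    """Least-squares weights via the normal equations (X'X) w = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def fit_runs_model(players, features):
    """Fit Runs minus HR on the given stats, then pin the HR weight at 1."""
    rows = [[p[f] for f in features] for p in players]
    targets = [p["runs"] - p["hr"] for p in players]
    model = dict(zip(features, fit_no_intercept(rows, targets)))
    model["hr"] = 1.0
    return model

# Synthetic players built from known weights (0.3 per single, 0.2 per walk,
# 1 per HR), so the fit should recover those weights.
synthetic = [
    {"1b": 100, "bb": 50, "hr": 20, "runs": 60},
    {"1b": 80, "bb": 90, "hr": 10, "runs": 52},
    {"1b": 120, "bb": 30, "hr": 35, "runs": 77},
]
model = fit_runs_model(synthetic, ["1b", "bb"])
print({stat: round(weight, 3) for stat, weight in model.items()})
```

With real data the feature list would be the full set named above (1B, 2B, 3B, BB, HBP, SB, CS), and there would need to be at least as many players as features.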

Why is a double worth less than a single, and why is a triple worth more than a home run? Because of the whole correlation-is-not-causation phenomenon. Leadoff hitters tend to be singles hitters with a lot of speed, whereas an increase in doubles (and power in general) will get a player moved to the heart of the order. Because players with a lot of triples tend to lead off, and score runs because they lead off, the value of a triple is exaggerated here. Same thing for SB.

As far as RBI go, I removed 1 RBI from the value of a HR, ran the regression, then added it back in. I also included Outs, as there are some types of batting outs that will produce RBI, and removed SB and CS, which don’t influence RBI. Here are the coefficients:

out 0.020
1b 0.175
2b 0.417
3b -0.517
bb 0.047
hbp -0.144
hr 1.832

As expected, Outs have a small influence on RBI. Triples are experiencing the reverse phenomenon from the Runs regression analysis — they should have an RBI value exactly 1 less than HR (all baserunners score in both cases, but not the batter for a triple). The value of a double relative to a single is likely exaggerated because power hitters tend to hit in the middle of the lineup and get a boost in RBI because of it.
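Applying either set of coefficients to a stat line is just a weighted sum. The sketch below uses the coefficients posted above and assumes no intercept term (none is listed); the stat line is hypothetical, and the sign convention (expected minus actual) is chosen so that a negative number means a fortunate player.

```python
# Coefficients as posted above; no intercept term is assumed.
RUNS_COEF = {"1b": 0.325, "2b": 0.289, "3b": 1.010, "bb": 0.208,
             "hbp": 0.194, "sb": 0.448, "cs": -0.14, "hr": 1.000}
RBI_COEF = {"out": 0.020, "1b": 0.175, "2b": 0.417, "3b": -0.517,
            "bb": 0.047, "hbp": -0.144, "hr": 1.832}

def expected(stats, coef):
    """Weighted sum of a player's stats; stats missing from the line count as 0."""
    return sum(weight * stats.get(stat, 0) for stat, weight in coef.items())

def luck(stats, coef, actual):
    """Expected minus actual: positive = unfortunate, negative = fortunate."""
    return expected(stats, coef) - actual

# Hypothetical stat line, not a real player.
line = {"out": 400, "1b": 100, "2b": 30, "3b": 2, "hr": 25,
        "bb": 60, "hbp": 5, "sb": 10, "cs": 4}

print(round(expected(line, RUNS_COEF), 1))      # expected Runs
print(round(expected(line, RBI_COEF), 1))       # expected RBI
print(round(luck(line, RUNS_COEF, 95), 1))      # if he actually scored 95
```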

Using these “models”, I calculated each player’s expected number of Runs and RBI and compared them to their actual totals. Here are some Top 10 lists:

Top 10 Fortunate Run-Scorers:
-24 damon,johnny
-22 renteria,edgar
-21 crosby,bobby
-21 dellucci,david
-17 giles,marcus
-17 jeter,derek
-15 matsui,hideki
-15 walker,larry
-14 green,nick
-14 hinske,eric

Four out of the top 7 are Yankees and Red Sox.

Top 10 Unfortunate Run-Scorers
19 hall,toby
16 clark,tony
16 burrell,pat
13 lecroy,matthew
13 burroughs,sean
13 piazza,mike
12 encarnacion,jua
12 ausmus,brad
12 glaus,troy
12 larue,jason

Top 10 Fortunate RBI Hackers
-27 atkins,garrett
-27 ramirez,manny
-26 matsui,hideki
-25 cantu,jorge
-24 anderson,garret
-23 holliday,matt
-23 teixeira,mark
-22 johnson,reed
-22 sheffield,gary
-21 burrell,pat

Wow, Reed Johnson only had 55 RBI in the first place. Guess he was clutch!

Top 10 Unfortunate RBI Sufferers
24 lee,derrek
19 winn,randy
15 biggio,craig
15 podsednik,scott
14 blake,casey
14 clark,brady
14 taveras,willy
12 counsell,craig
11 tracy,chad
11 walker,todd

Surprisingly few leadoff types on that list. Hmm, how many more MVP votes would Derrek Lee have received with 24 more RBI (131 total)?

The MFP and MUP

How about a few awards to wrap up? The first I’ll call the MFP (Most Fortunate Player). The 2005 MFP is Hideki Matsui, who accumulated 41 more Runs and RBI than his individual stats deserved. The second is the MUP (Most Unfortunate Player), won in 2005 by Randy Winn with 29 fewer Runs and RBI than he deserved. Randy Winn did have a surprisingly good year last year.
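The award totals are just the Runs difference plus the RBI difference. As a small worked sketch, here are the two players who show up on both a Runs list and an RBI list above, so their totals can be checked from the published numbers alone (the variable names are mine).

```python
# Differences copied from the Top 10 lists above (negative = fortunate).
run_diff = {"matsui,hideki": -15, "burrell,pat": 16}
rbi_diff = {"matsui,hideki": -26, "burrell,pat": -21}

# Combined Runs + RBI difference per player.
combined = {name: run_diff[name] + rbi_diff[name] for name in run_diff}
most_fortunate = min(combined, key=combined.get)

for name, total in sorted(combined.items(), key=lambda item: item[1]):
    print(f"{total:+d} {name}")
```

Matsui’s -15 and -26 add up to the 41 quoted for the MFP. Burrell is a nice illustration of the two models disagreeing: unfortunate as a run-scorer, fortunate as an RBI man.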


Dave Pinto points out that Jim Thome failed to score in a game for the first time today. That’s right, he had scored in every one of the ChiSox’ first 17 games this season. Mister Pinto explains an often overlooked baseball truth:

We like to think of run scorers as the fast, leadoff men, but six of the top ten in runs scored were known more for their power than for their speed. In fact, most had the deadly combination of a high OBA and power. The OBA means they’re on base when their teammates hit. The power means they score on their own (home runs) or get themselves on base in scoring position.

Obviously, hitting at the top of the lineup leads to more runs than batting at the bottom of the order, but a study that controlled for lineup position would without a doubt show that the Albert Pujols and Derrek Lees of the world do the most towards getting themselves home. As Dave points out, getting on base and hitting for power to advance yourself around the bases are the two best ways to make sure you score a bunch of runs. Here are some correlations from 2005 between player runs and some other individual stats:

OB .980
TB+ .979
TB .976
HIT .973
PA .969
AB .964
XBH .941
RBI .920
BB .885
SO .865
HR .821
SB .556
CS .552

TB+ is total bases plus walks and HBP. Obviously, any counting stat will correlate decently with runs scored because more playing time leads to more of everything. I need to figure out a way to control for lineup position. If you assume that players with better on-base and stolen base skills tend to bat higher in the order and batters with more power get dropped a little, then these numbers overinflate the importance of OBP and SB (which is damn low anyways) and underestimate XBH and HR.
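A minimal sketch of TB+ as defined above, plus a per-plate-appearance helper as one possible way to strip out playing time before correlating. The helper is an illustration only and does nothing about lineup position; the stat line is hypothetical.

```python
def total_bases(singles, doubles, triples, homers):
    """Standard total bases: 1 per single, 2 per double, 3 per triple, 4 per HR."""
    return singles + 2 * doubles + 3 * triples + 4 * homers

def tb_plus(singles, doubles, triples, homers, walks, hbp):
    """TB+ as defined above: total bases plus walks and hit-by-pitches."""
    return total_bases(singles, doubles, triples, homers) + walks + hbp

def per_pa(count, plate_appearances):
    """Convert a counting stat into a per-plate-appearance rate."""
    if plate_appearances <= 0:
        raise ValueError("plate appearances must be positive")
    return count / plate_appearances

# Hypothetical stat line: 100 1B, 30 2B, 2 3B, 25 HR, 60 BB, 5 HBP in 662 PA.
tbp = tb_plus(100, 30, 2, 25, 60, 5)
print(tbp, round(per_pa(tbp, 662), 3))
```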

Here are the correlations for RBI:
TB .964
TB+ .964
OB .933
HR .927
PA .924
RUN .922
AB .920
HIT .919
XBH .898
SO .873
BB .860
CS .353
SB .326

The big changes in this list are that HR shoots up, XBH falls, and both SB/CS fall. The fact that doubles and triples correlate better with RUNs than RBI is interesting — sure, they advance other players towards home, but people tend to ignore that they also do a lot to advance yourself towards home. And boy do managers sure like to use stolen bases as a determinant for batting order position, huh?

Here are the top 20 finishers in runs scored last year:
129 pujols,albert
124 rodriguez,alex
122 jeter,derek
120 lee,derrek
119 ortiz,david
117 damon,johnny
115 rollins,jimmy
114 young,michael
113 figgins,chone
112 ramirez,manny
112 teixeira,mark
111 sizemore,grady
111 suzuki,ichiro
110 bay,jason
108 matsui,hideki
107 dunn,adam
106 cabrera,miguel
104 abreu,bobby
104 giles,marcus
104 sheffield,gary

I count only about six guys that are traditional leadoff-type hitters. The rest are bashers. Imagine if Albert Pujols hit first or second for the Cardinals with Rolen and Edmonds healthy. Is it unrealistic to think he could score 150 runs? 160? 170? The all-time record is 192 runs, accomplished in 1894. The post-1900 record is 177 by Babe Ruth in 1921. The post-1950 record is 152 by Jeff Bagwell in 2000. Bagwell batted in front of Moises Alou and Richard Hidalgo, who both slugged over .620.

Hat Tips…

… to Dave Pinto for the Jim Thome observation.
… to Baseball-Reference.com for the historical information.
… to Doug’s Stats for last year’s stats in easily downloadable format.
