It’s been a while since I’ve thrown a sop to my baseball-oriented readers and the season is under way, so I’m gonna make it up to you with a new statistic, because the one thing baseball suffers from is not enough statistics.

I was trying to explain the game to an Icelandic friend of mine the other day. What’s with guys charging the mound? he wanted to know. (This from a hockey fan.) Well, they get upset when pitchers throw at them, I said. So why do the pitchers throw at them? he asked. To instill fear, I said. It’s a lot harder to hit when you’re worrying that the next pitch might come at your head. Don’t pitchers get thrown out for doing that? he asked. Yes and no, I explained. It’s complicated. Can’t they at least keep track of the pitchers who do it all the time and punish them later? he asked. Why yes, I mused. Yes they can. And then and there I conceived the VI, or Viciousness Index.

VI relies on the premise that a pitcher’s true wildness can be roughly judged by the number of walks he allows. The fewer he allows, the better idea he has of where the ball is going most of the time. So if he allows very few walks and still hits a lot of batters, the way Pedro Martinez does, one can assume that it’s not entirely or even mostly by accident. Therefore VI = HBP/BB. I submit this will prove an excellent index to pitcher viciousness.
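For anyone who wants to play along at home, here is a minimal sketch of the index in Python. The function name and the two stat lines are made up for illustration; only the formula VI = HBP/BB comes from the paragraph above.

```python
def viciousness_index(hbp: int, bb: int) -> float:
    """Return VI = HBP / BB for one pitcher (a season or a career)."""
    if bb <= 0:
        # The ratio is undefined for a pitcher who has walked nobody.
        raise ValueError("VI is undefined for a pitcher with no walks")
    return hbp / bb


# Hypothetical stat lines, for illustration only (not real pitchers).
control_artist = viciousness_index(hbp=12, bb=40)   # few walks, many hit batters
wild_thing = viciousness_index(hbp=12, bb=120)      # same hit batters, many walks

print(f"control artist: {control_artist:.3f}, wild thing: {wild_thing:.3f}")
```

Both pitchers plunk the same number of batters, but the one who rarely misses the strike zone gets the higher VI, which is the whole idea.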

I’d like to oblige you with some actual numbers, but HBP pitcher data turns out to be scarce. It’s not in the Lahman database, Baseball Reference doesn’t have it, and that means I don’t have it either. In lieu of numbers, I offer two hypotheses. First, pitchers with headhunting reputations, like Bob Gibson and Don Drysdale, will have high VIs. Second, the VI leaders, seasonally and career, will be a better set of pitchers than the VI trailers. (This is of course largely because the trailers walk more hitters. A stronger version is that if you match pitchers with similar walk/inning ratios, the ones with the higher VIs will tend to be better.) If somebody out there has HBP data for pitchers and wants to share it with me so I can confirm or deny, I pledge that I will not only publish the lifetime and 2002 leaders for the Viciousness Index, but I will add the data to my pitching search engine. Now is that a deal or what?

(**Update:** I’ve mentioned before how impressed I am with my commenters — it’s a regular little salon around here — but Greg Padgett has outdone himself. He actually grabbed the HB numbers from the Yahoo MLB database and posted the VI leaders and trailers, both raw and adjusted, for 2002 in the comments. He also extracted a few VI comparisons for pitchers with similar walk rates. I’ll have more to say about this later, but on casual inspection the results are inconclusive. There are some excellent pitchers at the top, like Pedro Martinez, Brad Radke, Derek Lowe, and Mark Mulder, but there are some pretty good pitchers at the bottom too, like Bartolo Colon and Jason Schmidt. Each end of the list has its share of washouts too. I suspect career results will be more conclusive, since we’re dealing with a relatively rare occurrence. Adjusted VI doesn’t range much beyond plus or minus 5 in a single season. But go read Greg’s comment.)

For every Pedro, Drysdale and Gibson there are Koufax, Walter Johnson and Christy Mathewson. These guys were as good as or better than the ones you name and were known for NOT being headhunters.

In general I think the idea has merit, but the evidence must be statistical, not anecdotal.

Now here is another angle. I heard Joe Morgan say the other day that Willie Mays had more balls thrown at his head than any other player of his era. If it is true that the better a player you are the more you are thrown at, then can that be quantified? Unlikely, since better players have better reflexes and so might not be hit as much. Then again, they may take one for the team sometimes. If it could be quantified, there should be a VR Index: Viciousness Received. The higher the VR, the better the player.

I doubt that being thrown at can be quantified. The modern single-season HBP record of 50 is held by Ron Hunt, who was no kind of hitter and was known for diving into pitches to get on base.

When I mentioned Drysdale and Gibson I did not mean to imply that anecdotal evidence would be decisive. But if guys with nasty reputations turned out to have normal VIs while supposed pussycats had high ones, that would argue against the index’s efficacy in measuring what it’s supposed to measure.

I think that I see a problem with the stat. I would assume two things. First, if two pitchers are equally vicious and pitch an equal number of innings, you would expect them to hit an equal number on purpose. Second, I would assume that unintentional HBPs vary directly with BBs. HBP/BB = (Intentional HBP + Unintentional HBP)/BB, so you have a numerator in which one part varies with the denominator and the other does not. I think that will lead to some odd figures.

Let’s assume for example pitchers A & B who are equally vicious and pitch the same amount. I would guess that they would intentionally hit the same number, so let’s say each hits 15 batters intentionally. Let’s also assume that for every 20 walks, a pitcher hits a batter unintentionally. If A walks 100 in a season, while B walks 40, their viciousness ratings will be (15+5)/100 = .2 and (15+2)/40 = .425. Thus we get quite different figures for pitchers of equal viciousness.
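The arithmetic above can be checked with a few lines of Python. This is only a sketch of the hypothetical: the function name is invented, and the 15 intentional HBPs and the one-accident-per-20-walks rate are the assumed figures from the example, not real data. Algebraically, (I + k·BB)/BB = I/BB + k, so the intentional part gets divided by walks and is inflated for the pitcher who walks fewer batters.

```python
def raw_vi(intentional_hbp: float, walks: float,
           accidental_per_walk: float = 1 / 20) -> float:
    """VI = (intentional HBP + accidental HBP) / BB under the example's assumptions."""
    accidental_hbp = walks * accidental_per_walk  # accidents scale with walks
    return (intentional_hbp + accidental_hbp) / walks


vi_a = raw_vi(intentional_hbp=15, walks=100)  # (15 + 5) / 100
vi_b = raw_vi(intentional_hbp=15, walks=40)   # (15 + 2) / 40

print(f"A: {vi_a:.3f}  B: {vi_b:.3f}")
```

Two equally vicious pitchers come out at roughly .200 and .425, so the raw ratio rewards B simply for walking fewer hitters.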

Does this look right?

Just for clarity, a pitcher hitting a batter is statistically known as HB. I, too, am quite surprised that you can’t find this stat on baseball-reference. You can find it on ESPN’s player stats page (extended pitching stats) but that is going to limit you to current players.

Here are some of note (HB/BB):

Clemens 14/63

Martinez 15/40

R. Johnson 13/71

Maddux 4/45

The first "pussycat" that came to my mind would be Shawn Estes, for obvious reasons:

S. Estes 9/83

Eddie: Good point. I could rejigger the statistic to compensate for this effect. Calculate the league average HBP/BB ratio. Multiply that ratio by a particular pitcher’s BBs to get his expected HBs, then subtract that number from his actual HBs. If the remaining number is positive, he’s more vicious than average; if negative, less. Adjust for number of innings pitched and you’re done.
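Here is a rough sketch of that adjustment in Python, run on Eddie’s hypothetical pitchers A and B. Several things are assumptions of mine: the function names, treating Eddie’s one-in-twenty accident rate as the league ratio, the 200 innings for each pitcher, and expressing the innings adjustment as a per-nine-innings rate.

```python
def expected_hb(bb: float, league_ratio: float) -> float:
    """Hit batters we'd expect from a pitcher's walks at the league HBP/BB ratio."""
    return league_ratio * bb


def adjusted_vi(hb: float, bb: float, ip: float, league_ratio: float) -> float:
    """Excess hit batters (actual minus expected) per nine innings pitched."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return 9 * (hb - expected_hb(bb, league_ratio)) / ip


LEAGUE_RATIO = 1 / 20  # assumption: Eddie's accident rate stands in for the league ratio
INNINGS = 200          # hypothetical workload, the same for both pitchers

adj_a = adjusted_vi(hb=20, bb=100, ip=INNINGS, league_ratio=LEAGUE_RATIO)
adj_b = adjusted_vi(hb=17, bb=40, ip=INNINGS, league_ratio=LEAGUE_RATIO)

print(f"A: {adj_a:.3f}  B: {adj_b:.3f}")
```

Under those assumptions both pitchers show the same 15 excess hit batters and the same adjusted figure, which is what you’d want from two equally vicious pitchers. With a real league ratio the baseline would include the average amount of headhunting, so zero would mean average viciousness rather than none.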

It’s an interesting theory. Using Yahoo’s MLB database, I was able to compute VI for 2002. (I modified it to include only non-intentional walks: HBP/(BB-IBB).) There are 17 pitchers topping 20% (min. 100 IP):

| VI | Pitcher |
|---|---|
| 0.385 | Martinez, Pedro |
| 0.350 | Radke, Brad |
| 0.313 | Padilla, Vicente |
| 0.291 | Kennedy, Joe |
| 0.276 | Astacio, Pedro |
| 0.270 | Pavano, Carl |
| 0.250 | Lowe, Derek |
| 0.250 | Weaver, Jeff |
| 0.239 | Lawrence, Brian |
| 0.238 | Franklin, Ryan |
| 0.235 | Hernandez, Orlando |
| 0.231 | Reed, Rick |
| 0.224 | Park, Chan Ho |
| 0.214 | Lilly, Ted |
| 0.212 | Mulder, Mark |
| 0.200 | Wilson, Paul |
| 0.200 | Bernero, Adam |

There’s no clear cutoff at the other end, so here’s the bottom 15:

| VI | Pitcher |
|---|---|
| 0.039 | Haynes, Jimmy |
| 0.039 | Ishii, Kazuhisa |
| 0.036 | Halama, John |
| 0.034 | Anderson, Brian |
| 0.032 | Ponson, Sidney |
| 0.031 | Colon, Bartolo |
| 0.028 | Schmidt, Jason |
| 0.024 | Beckett, Josh |
| 0.023 | Penny, Brad |
| 0.021 | Rueter, Kirk |
| 0.021 | May, Darrell |
| 0.021 | Nomo, Hideo |
| 0.020 | Santana, Johan |
| 0.012 | Sabathia, C.C. |
| 0.000 | Trachsel, Steve |

It doesn’t appear particularly skewed by walk rates, since Chan Ho Park (4.82 BB/9) shows up at the top, while Brian Anderson (1.85 BB/9) helps bring up the rear.

To address the second part of your theory (that, in comparing pitchers with similar walk rates, a higher VI correlates with better performance), here are some matches from up and down the BB/9 spectrum:

| BB/9 | Pitcher | VI |
|---|---|---|
| 2.16 | Valdes, Ismael | 0.196 |
| 2.16 | Weaver, Jeff | 0.250 |
| 2.18 | Williams, Woody | 0.174 |
| 2.18 | Thomson, John | 0.057 |
| 2.32 | Padilla, Vicente | 0.313 |
| 2.33 | Halladay, Roy | 0.125 |
| 2.34 | Hudson, Tim | 0.151 |
| 2.39 | Rueter, Kirk | 0.021 |
| 2.39 | Mulder, Mark | 0.212 |
| 2.39 | Oswalt, Roy | 0.086 |
| 2.81 | Johnson, Jason | 0.154 |
| 2.81 | Wakefield, Tim | 0.184 |
| 3.06 | Appier, Kevin | 0.113 |
| 3.06 | Zito, Barry | 0.118 |
| 3.58 | Trachsel, Steve | 0.000 |
| 3.58 | Sturtze, Tanyon | 0.103 |

It seems to work at least for some, particularly with the lower walk rates. Like I said before, it’s definitely interesting, and something to think about. Maybe with compensation for park/league effects the index might match up even better, but that would be difficult without something like the PECOTA database at one’s fingertips (Hello, Nate Silver?).

Now, using the modifications that you suggested above, I think this is what you were talking about: I computed the league average HBP/(BB-IBB) rate (0.116), and, using each pitcher’s rate of (BB-IBB)/IP, figured what his expected number of HBP should be. I then found the amount each pitcher was over/under this estimate and the % difference of same. (I’m using the same crop of 100+ IP)
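As I read the method just described, it boils down to something like the Python sketch below. The 0.116 league rate is the figure quoted above; the function names and the sample stat line are hypothetical. Note that multiplying a pitcher’s (BB-IBB)/IP rate by his innings just gives back BB-IBB, so innings drop out of the expected count. I’m also assuming the percentage is actual HBP as a share of expected HBP, which would fit a pitcher with no hit batters landing at 0.0%.

```python
LEAGUE_RATE = 0.116  # league HBP/(BB-IBB) for 2002, as quoted in the comment


def expected_hbp(bb: int, ibb: int, league_rate: float = LEAGUE_RATE) -> float:
    """Expected hit batters given a pitcher's unintentional walks."""
    return league_rate * (bb - ibb)


def diff_and_pct(hbp: int, bb: int, ibb: int,
                 league_rate: float = LEAGUE_RATE) -> tuple[float, float]:
    """Return (actual - expected HBP, actual as a percentage of expected)."""
    expected = expected_hbp(bb, ibb, league_rate)
    if expected <= 0:
        raise ValueError("need at least one unintentional walk")
    return hbp - expected, 100 * hbp / expected


# Hypothetical stat line, for illustration only.
diff, pct = diff_and_pct(hbp=10, bb=55, ibb=5)
print(f"diff: {diff:+.1f}  pct of expected: {pct:.1f}%")

# A pitcher who hits nobody sits at 0% of expected, whatever his walk total.
zero_diff, zero_pct = diff_and_pct(hbp=0, bb=55, ibb=5)
```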

The top group (sorted by %diff):

| Pitcher | Diff | %Diff |
|---|---|---|
| Park, Chan Ho | 12 | 323.5% |
| Miller, Justin | 7 | 298.0% |
| Astacio, Pedro | 9 | 231.4% |
| Kennedy, Joe | 9 | 225.5% |
| Martinez, Pedro | 8 | 208.6% |
| Wood, Kerry | 8 | 207.6% |
| Pavano, Carl | 5 | 203.8% |
| Padilla, Vicente | 8 | 201.9% |
| Wilson, Paul | 6 | 186.1% |
| Sparks, Steve | 5 | 176.0% |
| Neagle, Denny | 4 | 168.7% |
| Driskill, Travis | 3 | 167.2% |
| Lilly, Ted | 2 | 166.3% |
| Prior, Mark | 3 | 166.3% |
| Radke, Brad | 3 | 164.0% |
| Bernero, Adam | 2 | 163.6% |
| Chacon, Shawn | 3 | 162.6% |
| Estes, Shawn | 3 | 155.3% |
| Wakefield, Tim | 3 | 152.8% |
| Weaver, Jeff | 4 | 152.7% |

And the bottom:

| Pitcher | Diff | %Diff |
|---|---|---|
| Schmidt, Jason | -5 | 29.9% |
| Halama, John | -3 | 27.4% |
| Tomko, Brett | -5 | 27.1% |
| Beckett, Josh | -3 | 25.7% |
| Santana, Johan | -3 | 25.6% |
| Nomo, Hideo | -6 | 25.2% |
| Colon, Bartolo | -6 | 23.8% |
| Penny, Brad | -4 | 21.4% |
| May, Darrell | -4 | 21.1% |
| Lieber, Jon | -4 | 19.7% |
| Anderson, Brian | -5 | 17.8% |
| Rueter, Kirk | -6 | 13.6% |
| Sabathia, C.C. | -7 | 13.2% |
| Trachsel, Steve | -6 | 0.0% |

I’ll leave the interpreting up to someone else.

In light of recent events, it would be interesting to calculate how Pedro’s VI in Montreal compares with his VI in Boston.