Arm Stress from Consecutive Long Innings
By: Brendan Burke, James Burke Jr & Ernie Smith
An understudied source of stress on a pitcher’s arm over the course of a game is pitching consecutive long innings, which leaves the pitcher limited time to recover from the additional workload. This paper presents some initial research toward quantifying that stress and discusses future research opportunities.
Creating a Discrete Model of Stress
Although there are innumerable complicating factors that can stress a pitcher’s arm over the course of the game—pitch mix, pitch velocity, recovery time between innings, to name a few—this first study will look solely at pitch count per inning to develop an initial metric. Future models will take many of these factors into account.
The dataset covers every pitch thrown by a starting pitcher (here defined as the pitcher who pitched the first inning) during the regular season from the start of the 2000 season through the end of the 2022 season, a collection of more than 650,000 innings. Over this period, the average number of pitches per inning was approximately 15.525, with a standard deviation of approximately 6.171 pitches. Using these values, the pitch data can be normalized: a pitcher who throws an average number of pitches in an inning earns a score of zero, a below-average number of pitches receives a negative value, and an above-average number of pitches receives a positive value. The further a pitching performance is from the norm, the larger the magnitude of the (positive or negative) value. We will use this value as a proxy for stress.
Min Pitches | 0 | 3 | 6 | 9 | 12 | 15 | 18 | 21 | 24 | 27 | 30 | 33 | 36 | 39 | 42 | 45 | 48 | 51 | 54 |
Max Pitches | 2 | 5 | 8 | 11 | 14 | 17 | 20 | 23 | 26 | 29 | 32 | 35 | 38 | 41 | 44 | 47 | 50 | 53 | 56 |
Value | -4 | -3 | -2 | -1 | 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
Stress values can be assigned by mapping them from discrete ranges of pitches per inning. The size of the ranges can affect the stress values; a range that is too wide risks assigning the same stress value to innings that saw significantly different pitch counts. A continuous model of stress will be explored later in this paper.
A sample mapping with a bucket size of 3 is shown above. Innings of 12 to 17 pitches, the two buckets surrounding the approximate average of 15 pitches, are assigned a stress value of zero; lower counts receive a negative score and higher counts receive a positive score. The sample chart stops at the 54-to-56 bucket because 54 pitches was the maximum thrown in a single inning by a single pitcher during the study period (by Daniel Norris, Paul Maholm, and Victor Zambrano), although nothing prevents the chart from being extended.
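The bucket lookup can be sketched in a few lines of Python. This is only an illustration of the sample mapping with 3-pitch buckets; the function name discrete_stress is our own label for this sketch and is not taken from the original analysis.

```python
def discrete_stress(pitches: int) -> int:
    """Base stress value for one inning under the sample 3-pitch buckets."""
    if pitches < 0:
        raise ValueError("pitch count cannot be negative")
    bucket = pitches // 3  # 0-2 -> 0, 3-5 -> 1, ..., 54-56 -> 18
    # Buckets 4 (12-14 pitches) and 5 (15-17 pitches) both map to zero,
    # so the offset changes from 4 to 5 above the 12-14 bucket.
    return bucket - 4 if bucket <= 4 else bucket - 5


if __name__ == "__main__":
    for p in (2, 11, 14, 15, 18, 54):
        print(p, discrete_stress(p))
```

Because the two middle buckets share the value zero, the mapping is not a single linear function of the bucket index, which is why the sketch uses two offsets.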
This works well for the first inning, but does not model the cumulative effect of consecutive long innings. For innings after the first, a pitcher’s stress value for that inning is the base stress value, calculated as it was for the first inning, plus half of the base stress value of the previous inning. There is no special significance to the value of one half; it is an arbitrary starting point.
Here are the 25 starting pitchers with the lowest arm stress for their career in our study. Some had injury issues in their career, but none had Tommy John surgery:
Meanwhile, according to our formula, below are the pitchers with the most arm stress:
Few pitchers on this list had Tommy John surgery, but most of them had long histories of injuries.
As an illustrative example, we’ll look at Paul Maholm’s start on May 9, 2010. On that day, Maholm made a routine start for the Pittsburgh Pirates against the division-rival St. Louis Cardinals. He pitched four innings, including a historically long third inning.
Inning | Pitches | Stress | Cumulative Stress |
1 | 18 | 1 | 1 |
2 | 11 | -0.5 | 0.5 |
3 | 54 | 12.5 | 13 |
4 | 12 | 6.5 | 19.5 |
In the first inning, Maholm threw a slightly above-average 18 pitches. Looking up this value in the table, it earns a stress value of 1, as 18 falls in the 18-to-20 range. In the second inning, Maholm threw a below-average 11 pitches, which earns a base stress value of -1. Reflecting the cumulative stress of pitching, half of the stress value from the first inning is then added to the base value for the second inning. Maholm earned a stress value of 1 in the first inning, half of which is 0.5, yielding a total stress value for the second inning of -0.5. This below-average inning decreases Maholm’s cumulative stress over the course of the game from 1 to 0.5.
In the third inning, it took Maholm 54 pitches to get all three outs, earning him a base stress value of 13. However, his much shorter previous inning diminishes this stress somewhat. Half of the second inning’s base stress value of -1 is -0.5, which lowers the 13 to 12.5. Cumulative stress increases by 12.5, from 0.5 to 13, and the process repeats for Maholm’s fourth and final inning of the day. He throws 12 pitches, earning a base stress value of 0, but his historically long third inning weighs heavily, adding 13 × ½ = 6.5 stress, for a total value of 6.5. This brings Maholm’s total for the day to 19.5.
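The worked example can be reproduced with a short Python sketch. It assumes the 3-pitch bucket mapping from the sample chart and, as the numbers in the table imply, carries over half of the previous inning’s base stress value. The function names and the carry parameter are our own labels for this illustration.

```python
from itertools import accumulate


def discrete_stress(pitches: int) -> int:
    """Base stress value for one inning under the sample 3-pitch buckets."""
    bucket = pitches // 3
    return bucket - 4 if bucket <= 4 else bucket - 5


def game_stress(pitch_counts, carry=0.5):
    """Per-inning stress: base value plus `carry` times the previous base value."""
    base = [discrete_stress(p) for p in pitch_counts]
    stress = []
    for i, b in enumerate(base):
        carried = carry * base[i - 1] if i > 0 else 0.0
        stress.append(b + carried)
    return stress


# Pitch counts per inning from Maholm's start, as given in the table.
maholm = [18, 11, 54, 12]
per_inning = game_stress(maholm)
cumulative = list(accumulate(per_inning))

for inning, (p, s, c) in enumerate(zip(maholm, per_inning, cumulative), start=1):
    print(inning, p, s, c)
```

Leaving carry as a parameter makes it easy to experiment with values other than one half.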
A Continuous Model of Stress
A discrete model of stress is useful for introducing the general concepts and mechanics, but it lacks some precision. Throwing 16 pitches is generally less stressful than throwing 17 pitches, if only marginally, but a marginal difference is still a difference. A continuous model of stress can reflect this difference, although the value of such precision in a game as chaotic as baseball is debatable.
The formula for the stress value S(n) during inning n is
S(1) = (P(1) - μ) / σ
S(n) = (P(n) - μ) / σ + ½ × (P(n-1) - μ) / σ, for n > 1
where P(n) is the number of pitches in inning n, μ is the sample average, and σ is the number of pitches per unit of stress.
Paul Maholm’s start can once again be used as an example. The same process is followed, but more precise stress values can now be calculated. For this example, μ = 15.525, the aforementioned average number of pitches per inning for the entire sample, and σ = 3.085, half of the standard deviation of pitches per inning for the entire sample. The choice of σ is made without loss of generality: in this linear model it only rescales every stress value by the same constant factor.
Inning | Pitches | Stress | Cumulative Stress |
1 | 18 | 0.802 | 0.802 |
2 | 11 | -1.065 | -0.263 |
3 | 54 | 11.737 | 11.474 |
4 | 12 | 5.093 | 16.566 |
For the first inning, the number of pitches is normalized and then scaled to arrive at the stress value: S(1) = (18 - 15.525) / 3.085 ≈ 0.802.
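For the following innings, half of the previous inning’s base stress value is added to the normalized and scaled base value. For the second inning, S(2) = (11 - 15.525) / 3.085 + ½ × 0.802 ≈ -1.467 + 0.401 ≈ -1.066, which matches the table to within rounding.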
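The continuous calculation can be sketched in Python as follows. The sketch uses the rounded constants quoted in the text (μ = 15.525, σ = 3.085), so its output agrees with the table only to about two decimal places. The function names and the carry parameter are assumptions of this illustration, and the carry-over of half the previous inning’s base value is inferred from the table.

```python
from itertools import accumulate

MU = 15.525    # average pitches per inning quoted in the text
SIGMA = 3.085  # pitches per unit of stress (half the quoted standard deviation)


def base_stress(pitches, mu=MU, sigma=SIGMA):
    """Normalized and scaled stress for a single inning."""
    return (pitches - mu) / sigma


def continuous_stress(pitch_counts, carry=0.5, mu=MU, sigma=SIGMA):
    """Per-inning stress: base value plus `carry` times the previous base value."""
    base = [base_stress(p, mu, sigma) for p in pitch_counts]
    return [
        b + (carry * base[i - 1] if i > 0 else 0.0)
        for i, b in enumerate(base)
    ]


maholm = [18, 11, 54, 12]
stress = continuous_stress(maholm)
running_total = list(accumulate(stress))

for inning, (s, c) in enumerate(zip(stress, running_total), start=1):
    print(inning, round(s, 3), round(c, 3))
```

Small differences from the table in the third decimal place are expected, since the table was presumably computed from unrounded values of the mean and standard deviation.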
Future Research
The purpose of developing this algorithm for quantifying pitching stress is to examine a potential correlation between repeated long innings and arm injuries. Future research could refine what is considered an average inning and examine the extent to which stress compounds. This could mean changing the arbitrarily selected value of 2 by which the previous inning’s stress value is divided, or having the quantified stress grow exponentially or logarithmically with pitch count, rather than linearly.
More refined models could also introduce additional confounding factors, such as pitch mix, pitch velocity, or recovery time between innings. Throwing more curveballs, or throwing harder, or having less time to recover between innings because the opposing pitcher had a quick inning, could very well affect the level of stress experienced by a pitcher.
References
Pitch data was collected from Retrosheet.org’s records, specifically their Events file (https://www.retrosheet.org/eventfile.htm#8). The 6th field of a “play” record, called the “pitches” field, records all of the throws made by the pitcher. The codes for pickoff attempts and a few extra markers have to be removed to get an actual count of game-level, stressful pitches.
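A rough Python sketch of counting pitches from a single “play” record is shown below. The set of codes treated as non-pitches (pickoff throws, the catcher-pickoff, blocked-pitch, and runner-going markers, the “.” separator, and “N” for no pitch) is our assumption for illustration and may not match the exact filter used in the study. The function names and the sample record, including its player ID, are hypothetical. The sketch also does not handle plate appearances that are spread across several play records.

```python
# Codes assumed NOT to be pitches thrown to the batter (illustrative choice):
# '+' pickoff throw by catcher follows, '*' next pitch blocked by catcher,
# '.' play not involving the batter, '>' runner going on the pitch,
# '1'/'2'/'3' pickoff throws to a base, 'N' no pitch.
NON_PITCH_CODES = set("+*.>123N")


def count_pitches(pitches_field: str) -> int:
    """Count pitch codes in a Retrosheet pitches field, skipping assumed non-pitches."""
    return sum(1 for code in pitches_field if code not in NON_PITCH_CODES)


def pitches_from_play_record(line: str) -> int:
    """Extract the 6th field of a 'play' record and count its pitches."""
    fields = line.strip().split(",", 6)
    if len(fields) < 7 or fields[0] != "play":
        raise ValueError("not a complete play record")
    return count_pitches(fields[5])


if __name__ == "__main__":
    # Hypothetical record: inning 1, visiting team, count 2-2, five pitches.
    sample = "play,1,0,hypot001,22,CBFBX,S7"
    print(pitches_from_play_record(sample))
```

Summing these counts by pitcher and inning would give the pitches-per-inning figures used in the stress calculations.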
In the larger study, we are examining the relationship between stressful innings and the future need for Tommy John surgery (TJS), specifically the length of time until TJS. For now, we therefore filter out starts by pitchers who had already had TJS before the date of the start, since those starts would yield negative durations. This removes only approximately 7,000 innings (1.1%) out of a total of 672,000 innings thrown by starting pitchers, leaving the 665,644 innings in this study. The filter may affect the maximums and minimums mentioned in this article, but it should not change the conclusions, since so few innings are removed. Ideally, we might want to study only starting pitchers who have had two TJS, because then we would know when the pitcher’s arm is at a cumulative stress of 0, but that might not yield a large enough sample of innings.
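The filter described above could look something like the following Python sketch. The data layout (tuples of pitcher ID and start date, plus a dictionary of first TJS dates), the function name, and the pitcher IDs are all hypothetical and chosen only to illustrate the rule.

```python
from datetime import date


def filter_pre_tjs_starts(starts, first_tjs):
    """Keep starts by pitchers with no TJS, or made no later than their first TJS date."""
    kept = []
    for pitcher_id, start_date in starts:
        tjs_date = first_tjs.get(pitcher_id)
        # Drop the start only if the surgery happened before the start date,
        # which would give a negative time-until-TJS.
        if tjs_date is None or tjs_date >= start_date:
            kept.append((pitcher_id, start_date))
    return kept


# Hypothetical example data.
starts = [
    ("hypot001", date(2010, 5, 9)),
    ("hypot001", date(2013, 4, 1)),
    ("hypot002", date(2012, 6, 1)),
]
first_tjs = {"hypot001": date(2011, 8, 1)}

kept = filter_pre_tjs_starts(starts, first_tjs)
print(kept)
```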
We needed a list of players who have had TJS. We’d like to thank Jon Roegele, who goes by “@MLBPlayerAnalys” on X (Twitter), for his meticulous curation of such a list. He keeps a current version on Google Drive (https://docs.google.com/spreadsheets/d/1gQujXQQGOVNaiuwSN680Hq-FDVsCwvN-3AazykOBON0/edit#gid=0). The list includes players’ official MLB ID numbers and FanGraphs IDs, which makes merging data from multiple sources much easier. Merging data from different sources by name is a slow, painful, manual process, since websites typically differ in spellings, full names versus nicknames, use of hyphens for foreign players, use of “Jr.”, etc.
Unfortunately, Retrosheet uses a different player ID than Jon Roegele’s list, so a third database was needed to bridge the gap. Sean Lahman’s baseball database (https://sabr.org/lahman-database/) includes both the Retrosheet ID and the Baseball Reference ID for each pitcher. We used the version distributed as the Lahman package for the R programming language (https://cran.r-project.org/web/packages/Lahman/Lahman.pdf).