A lot of my job involves spinning a story I don’t completely believe in. I know, I know, you’re shocked! You mean I don’t actually think that the four to five players I highlight every week are each breaking out by doing something they’ve never done before? And I don’t think that each of them is doing it sustainably? What are the odds?
Some of that comes with the territory. If you’re looking across the universe of major league players for something interesting, some portion of what you find interesting will have happened by random chance. That pitcher who’s striking everyone and their mother out? He might just be on a hot streak. The hitter who’s currently smashing high fastballs? There’s some chance he just felt really good for a week and then will stub his toe when walking out of the clubhouse tomorrow.
I know all that. One thing I wasn’t sure about, though, was how often false signals pop up. Even without searching them out, you might end up seeing a breakout around every corner. There’s a famous quote from Nobel Prize-winning economist Paul Samuelson: “The stock market has predicted nine out of the last five recessions.” Is the same general idea true of batted ball data? I came up with a simple experiment to investigate. What follows is a breakdown of the exact method I used, but if you’re just interested in the conclusion, it won’t surprise you: When hitters put up hot streaks of a reasonable length, it’s a good but not infallible sign that they will finish the year as above-average hitters.
I took every batted ball from the 2022 season and broke it out by player. From there, I put each player’s batted balls in chronological order and calculated his best stretch of 50 consecutive batted balls. I calculated it for a variety of “advanced” metrics: average exit velocity, xwOBA on contact, and barrels per batted ball. Those are some of the most commonly used underlying statistics – if I’m citing someone who’s really hitting, I’d likely use batted ball outputs like these to assess the validity of their performance, so I excluded things like batting average on contact or wOBA on contact, which might be quite noisy in 50-ball samples.
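For readers who want to try this at home, here is a minimal sketch of the "best stretch" calculation. It is not the exact code behind this article; the function name `best_window_mean` and the toy numbers are made up for illustration, and it assumes each player's batted balls are already sorted chronologically. Barrel rate works the same way as exit velocity if each batted ball is coded 1 for a barrel and 0 otherwise.

```python
from typing import Optional, Sequence


def best_window_mean(values: Sequence[float], window: int = 50) -> Optional[float]:
    """Return the highest mean over any run of `window` consecutive values.

    `values` must already be in chronological order. Returns None when the
    player has fewer than `window` batted balls.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return None
    running = sum(values[:window])
    best = running
    for i in range(window, len(values)):
        # Slide the window forward one batted ball: add the newest, drop the oldest.
        running += values[i] - values[i - window]
        best = max(best, running)
    return best / window


# Hypothetical toy data: exit velocities (mph) and barrel flags (1 = barrel).
exit_velos = [88, 95, 101, 79, 104, 92]
barrels = [0, 1, 1, 0, 1, 0]

# A window of 3 keeps the toy example readable; the article uses 50.
best_ev = best_window_mean(exit_velos, window=3)
best_barrel_rate = best_window_mean(barrels, window=3)
```

The sliding sum keeps the work proportional to the number of batted balls, which matters if you run this over a full season of data for every player.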
Next, I looked at every hitter’s overall batting line in 2022. I grabbed every batter who ended the season with a below-average wOBA, then removed hitters from Colorado as a sort of crude park adjustment. I chose to use wOBA rather than wRC+ because the batted ball data I collected doesn’t do any park adjusting, and I didn’t want to mix two unlike things.
With that accomplished, I asked a simple question: How often did these bad hitters have good underlying data at some point during the year? Let’s take Chad Pinder as an example. On the season as a whole, he wasn’t very good; he hit .235/.263/.385, good for a .281 wOBA and an 86 wRC+. But that doesn’t mean he never looked good. His best batted ball streaks made him look like a dangerous hitter. At one point or another, he racked up a 90.7 mph average exit velocity, .469 xwOBA on contact, and 16% barrel rate across a stretch of 50 batted balls.
Those are all elite numbers. For comparison’s sake, Austin Riley posted a 92.5 mph average exit velocity, .468 xwOBA on contact, and 15.7% barrel rate in 2022; he finished sixth in MVP voting. Those batted ball numbers, the ones Pinder put up at his best, are legitimately excellent. Pinder’s season as a whole certainly wasn’t, but if you’d searched his numbers for a breakout throughout the year, you might have found one anyway. “Chad Pinder is Store Brand Austin Riley” basically writes itself when you look at the batted ball data.
That’s one specific example, but here are the broad takeaways: 57% of hitters who finished the season with a below-average batting line had streaks of 50 batted balls where they produced an xwOBA on contact of at least .450; 71% had a 50 batted-ball stretch with a barrel rate of 12% or higher. These are very solid numbers, produced by players who were by definition not very solid.
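A rough sketch of how a percentage like that could be computed is below. The record layout (`team`, `woba`, `values`), the league-average figure, and the choice to skip players with fewer batted balls than the window are all assumptions made for the example, not details taken from the actual study.

```python
def best_window_mean(values, window):
    """Highest mean over any `window` consecutive values, or None if too few."""
    if len(values) < window:
        return None
    return max(sum(values[i:i + window])
               for i in range(len(values) - window + 1)) / window


def share_with_hot_streak(players, league_woba, threshold, window=50,
                          excluded_teams=("COL",)):
    """Share of below-average hitters with at least one streak >= threshold."""
    eligible = [
        p for p in players
        if p["woba"] < league_woba              # below-average season line
        and p["team"] not in excluded_teams     # crude park adjustment
        and len(p["values"]) >= window          # assumption: need one full window
    ]
    if not eligible:
        return None
    hot = sum(1 for p in eligible
              if best_window_mean(p["values"], window) >= threshold)
    return hot / len(eligible)


# Hypothetical players; `values` holds per-batted-ball xwOBA on contact.
players = [
    {"name": "A", "team": "OAK", "woba": 0.281, "values": [0.2, 0.5, 0.6, 0.1]},
    {"name": "B", "team": "SEA", "woba": 0.290, "values": [0.3, 0.3, 0.4, 0.2]},
    {"name": "C", "team": "COL", "woba": 0.280, "values": [0.9, 0.9]},
    {"name": "D", "team": "ATL", "woba": 0.370, "values": [0.6, 0.6, 0.6]},
]

# Window of 2 for readability; the article uses 50 and a .450 threshold.
share = share_with_hot_streak(players, league_woba=0.310,
                              threshold=0.450, window=2)
```

In the toy data, only players A and B are eligible, and only A clears the threshold, so `share` comes out to one half.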
On one hand, that’s a pretty good argument in favor of reading batted ball statistics with a grain of salt. A full 57% of hitters who were objectively bad had a stretch where they looked objectively good, even at the granular level. On the other hand, the endpoints I used are arbitrary. Fifty batted balls? A .450 xwOBA cutoff? What exactly does a 12% barrel rate mean?
Let’s cut it up differently. What about a .500 xwOBA on contact, a number that would be a top-10 mark if sustained for a full season? Of the below-average hitters, 23.7% managed at least one streak of 50 batted balls with a .500 xwOBA on contact. Twenty percent of below-average hitters had at least one stretch of a 20% barrel rate, roughly Aaron Judge’s career mark. We can seemingly make these numbers say anything we want. We need some rigor.
To handle that, you could do a split-half reliability test. But I’ll be honest with you, that’s not really the question I’m interested in answering. That’s an overly abstract question, and the answers to it don’t always click intuitively. It’s also a lot of math. I’m interested in answering one specific question: When I look at solid batted ball data, what are the chances that I’m seeing a player who is good rather than the hot streak of someone subpar?
To look at that and that specifically, I asked a slightly different question: What percentage of hot hitting streaks were produced by good hitters? That’s still not the right question, but it gets closer to what I’m looking for, and it’s easier. So let’s answer that!
I went back to the same data and added a filter. I threw out every batter who didn’t reach 400 plate appearances. That let me tag every single remaining streak of 50 batted balls with either “good hitter streak” (accomplished by a hitter who posted an above-average wOBA in 2022) or “bad hitter streak” (the opposite).
With that accomplished, I looked for every streak of 50 batted balls that produced an xwOBA of .450 or higher. I then counted how many of those were accomplished by good hitters. Great news – 81.3% of those high-quality hitting streaks were accomplished by hitters who ended the season with above-average batting lines, while 79.3% of hot barrel streaks (12% or higher barrel rate) were accomplished by that group. Maybe seeing really should be believing.
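Here is one way the streak-tagging step might look in code. Treat it as a sketch under a few assumptions: every overlapping window counts as its own streak, the field names (`pa`, `woba`, `values`) are invented for the example, and "good" means a season wOBA above a league-average number that you supply.

```python
def window_means(values, window):
    """Mean of every run of `window` consecutive values, in order."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def good_hitter_share_of_hot_streaks(players, league_woba, threshold,
                                     window=50, min_pa=400):
    """Fraction of hot streaks that belong to above-average hitters."""
    good = bad = 0
    for p in players:
        if p["pa"] < min_pa:
            continue  # drop part-time players
        n_hot = sum(1 for m in window_means(p["values"], window)
                    if m >= threshold)
        if p["woba"] > league_woba:
            good += n_hot   # "good hitter streak"
        else:
            bad += n_hot    # "bad hitter streak"
    total = good + bad
    return good / total if total else None


# Hypothetical players; `values` holds per-batted-ball xwOBA on contact.
players = [
    {"name": "G", "pa": 600, "woba": 0.360, "values": [0.5, 0.5, 0.5, 0.5]},
    {"name": "B", "pa": 450, "woba": 0.290, "values": [0.2, 0.6, 0.4, 0.1]},
    {"name": "S", "pa": 100, "woba": 0.400, "values": [0.9, 0.9]},
]

# Window of 2 for readability; the article uses 50 and a .450 threshold.
share = good_hitter_share_of_hot_streaks(players, league_woba=0.310,
                                         threshold=0.450, window=2)
```

In the toy data, the good hitter contributes three hot windows and the bad hitter one, so three quarters of the hot streaks are "good hitter streaks." Counting overlapping windows means one long hot stretch shows up many times, which is worth keeping in mind when reading a percentage like this.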
Only no, this is still the wrong way of looking at things. Let me give you an example. In the stretch of 50 batted balls that ended on May 17 last year, Paul Goldschmidt produced a .490 xwOBA on contact and a barrel rate of 16%. No one wrote an article about him or wondered whether he was now a good hitter. That’s because the question was never in doubt. Obviously, most of the hot batting stretches are produced by good hitters, and we don’t think twice about it. They’re good hitters! Of course they hit well.
The question I’m really trying to answer – or at least, a close variant of it – is this: When a hitter who I don’t perceive as being particularly good turns in a stretch of robust batted ball data, how likely are they to end the year as a good hitter? To proxy this, I took an even smaller group of hitters, those who accumulated at least 400 plate appearances in both 2021 and 2022. Then I looked at the subset of hot streaks produced by hitters who were below average in 2021.
In other words, the hot streaks we’re left with are basically what I’m looking for. They’re good stretches of hitting, and the hitters producing them played frequently but not particularly well in 2021. That’s quite close to the data you might look at to say that someone is breaking out. So how many of them actually did?
If you define breaking out as posting an above-average wOBA for the year, the numbers look good. Exactly two thirds of the time — 66.7% — a hitter who a) was below average in 2021 and b) produced a streak of 50 batted balls with an xwOBA of .450 or higher ended the 2022 season with an above-average batting line. That number is slightly higher – 69.7% – if you focus on barrel rate instead.
Increase the streak length to 75 batted balls, and the odds don’t improve as much as you might think: the hit rate works out to 74% using xwOBA and 76% using barrel rate. At 100 batted balls, both criteria produce a roughly 80% hit rate. I tried higher cutoffs for xwOBA and barrel rates as well, but they ran into sample size issues; there simply weren’t enough hitters who were bad in 2021 posting xwOBA stretches above .500 in 2022 to say much about that data.
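The breakout calculation can be sketched along the same lines. This version counts hitters rather than individual streaks, which is only one possible reading of the method, and the field names (`pa_prev`, `woba_prev`, and so on) and league-average inputs are hypothetical. The `window` parameter is what changes when moving from 50 to 75 to 100 batted balls.

```python
def best_window_mean(values, window):
    """Highest mean over any `window` consecutive values, or None if too few."""
    if len(values) < window:
        return None
    return max(sum(values[i:i + window])
               for i in range(len(values) - window + 1)) / window


def breakout_rate(players, lg_woba_prev, lg_woba_curr, threshold, window,
                  min_pa=400):
    """Among regulars who were below average last year and had a hot streak
    this year, the fraction who finished this year above average."""
    candidates = []
    for p in players:
        if p["pa_prev"] < min_pa or p["pa_curr"] < min_pa:
            continue  # must be a regular in both seasons
        if p["woba_prev"] >= lg_woba_prev:
            continue  # only hitters who were below average last year
        best = best_window_mean(p["values"], window)
        if best is not None and best >= threshold:
            candidates.append(p)
    if not candidates:
        return None
    hits = sum(1 for p in candidates if p["woba_curr"] > lg_woba_curr)
    return hits / len(candidates)


# Hypothetical players; `values` holds this year's per-batted-ball xwOBA on contact.
players = [
    {"pa_prev": 500, "pa_curr": 500, "woba_prev": 0.290, "woba_curr": 0.340,
     "values": [0.5, 0.6, 0.2, 0.3]},
    {"pa_prev": 450, "pa_curr": 480, "woba_prev": 0.300, "woba_curr": 0.295,
     "values": [0.1, 0.5, 0.5, 0.5]},
    {"pa_prev": 600, "pa_curr": 600, "woba_prev": 0.350, "woba_curr": 0.360,
     "values": [0.6, 0.6, 0.6, 0.6]},
    {"pa_prev": 420, "pa_curr": 410, "woba_prev": 0.280, "woba_curr": 0.330,
     "values": [0.3, 0.3, 0.3, 0.3]},
]

# Tiny windows for readability; the article uses 50, 75, and 100.
rates = {w: breakout_rate(players, 0.315, 0.315, threshold=0.450, window=w)
         for w in (2, 3)}
```

With the toy data, a window of 2 leaves two candidates, one of whom finished above average, while a window of 3 leaves a single candidate who did not. Real data would need far more hitters per bucket before the rates mean anything, which is the same sample size problem described above for the higher cutoffs.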
That was all a big jumble of numbers, but let’s draw some conclusions and maybe make a pretty table to wrap things up. If you see a hitter who used to be bad doing good things, it’s quite reasonable to ask yourself whether they are now good. Are they? Maybe! Baseball is a probabilistic sport, which means that some bad hitters who look good are still bad, while some have actually become good. There’s a good shot that you’re seeing something at least somewhat real, though. Here’s that wishy-washy sentence in table form. I didn’t have space to title it “Odds That a Bad Hitter in Year One Will End Year Two as a Good Hitter, Based on Batted Ball Streaks of Various Lengths,” but you get the idea:
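Bad to Good Hitters by Streak Length
Streak Length (Batted Balls) | xwOBA on Contact .450+ | Barrel Rate 12%+
50 | 66.7% | 69.7%
75 | 74% | 76%
100 | ~80% | ~80%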
Let me leave you with a few caveats – after all, this is a Ben Clemens article, so I don’t want to suggest any overly strong conclusions. There are overlapping data issues: a hitter with 50 batted balls worth of excellence is more likely to be an above-average hitter, because those batted balls count towards being an above-average hitter. Focusing only on batted balls ignores other things hitters could be doing to get better or worse. I’m also ignoring opposition; some of these hot streaks might be less about talent and more about facing the bullpen shuttle for a week straight. And heck, there’s a small selection bias to contend with. If you batted 400 times in 2021 and put up a subpar batting line, but your team still gave you another 400 plate appearances in 2022, there’s a strong chance that they expected you to improve in the first place, maybe because you underperformed in 2021.
That’s all true, but I think the central point stands: You can believe batted ball data up to a point. It’s true that hitters, even bad ones, who sustain loud contact for a stretch tend to be good hitters in that year. It’s also not a given. Feel free to dream on your team’s fourth outfielder turning into a bona fide starter, because it might happen. But finding why things have changed is still important – otherwise you’ll be getting false signals a third of the time.