2014 MLB Regression Candidates

Was Yasiel Puig's rookie season lucky? The numbers say yes.

Yesterday, I wrote about the players who were likely to rebound in 2014 based on their BABIP and line-drive percentages in 2013. It was all sunshine and daisies, and everybody was happy! You get a breakout, and you get a breakout!

Today is a bit different. Today is about the spiders and dead puppies: regression in the negative sense rather than the positive. These are the players who outperformed what they should have done last year and are going to come crashing violently back to the troposphere.

In case you forgot the method to our madness, here’s a refresher. The equation used for this was batting average on balls in play (BABIP) divided by line-drive percentage. Usually, line-drive percentage is expressed as a percentage (21.2 percent, for example), but for scaling’s sake, I used it in the same format as BABIP (as in .212).
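For anyone following along at home, the ratio boils down to one division. Here is a minimal sketch in Python. The function name `luck_ratio` is my own label (it isn't from any stats package), and the .300 BABIP in the example is a made-up figure paired with the 21.2 percent line-drive rate mentioned above.

```python
def luck_ratio(babip: float, ld_pct: float) -> float:
    """BABIP divided by line-drive rate, with LD% rescaled to BABIP format.

    ld_pct is passed as a percentage (e.g. 21.2) and divided by 100 so it
    sits on the same scale as BABIP (.212). The name luck_ratio is a
    hypothetical label for this sketch.
    """
    ld_rate = ld_pct / 100.0
    return babip / ld_rate


# Illustrative only: a hypothetical hitter with a .300 BABIP and a 21.2 LD%.
print(round(luck_ratio(0.300, 21.2), 3))  # 1.415
```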

The reasoning behind this was pretty simple. You would think that the more line drives you hit, the higher your BABIP would be. After compiling the BABIPs and line-drive percentages for each of the players in the league with 300-plus plate appearances, the correlation between the two sets of data was .455. This certainly isn't a perfect correlation, but it is significant enough to show that there is a relationship between the two.
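If you want to run the same kind of check yourself, a correlation like that can be computed with the standard Pearson formula. The sketch below is not the spreadsheet calculation itself, and the sample numbers are placeholders rather than the real 2013 data, so don't expect it to spit out .455.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder values only, NOT the actual 300-plus PA data set.
ld_rates = [0.150, 0.180, 0.200, 0.220, 0.250]
babips = [0.270, 0.310, 0.290, 0.330, 0.320]
print(round(pearson_r(ld_rates, babips), 3))
```

A value near 1 would mean BABIP rises in lockstep with line-drive rate, while a value near 0 would mean the two are basically unrelated.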

If you want to see where various players ended up, you can click here to view the Google doc I used to compile the data. Sheet 1 is a list of each player by team. Sheet 2 (you can change between sheets by clicking which sheet you want in the lower left-hand corner of the screen) shows each of the players sorted by their luck with the "unlucky" players being at the top and the "lucky" players at the bottom. Sheet 3 is the same data except by team, again with the "unlucky" teams at the top and the "lucky" teams at the bottom.

The average BABIP-to-line-drive ratio throughout the league was 1.423. If a player’s ratio was lower than that, they could be looking at an uptick in success this year. If it was higher, the opposite is true.
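As a rough sketch, that sorting rule looks like this in Python. The function name is hypothetical, and the "lucky"/"unlucky" labels simply mirror the ones used in the spreadsheet.

```python
LEAGUE_AVG_RATIO = 1.423  # league-wide average ratio quoted above


def regression_direction(ratio: float, league_avg: float = LEAGUE_AVG_RATIO) -> str:
    """Label a BABIP/LD ratio relative to the league average.

    "lucky" means the ratio is above average (a candidate to decline);
    "unlucky" means it is below average (a candidate to rebound).
    """
    if ratio > league_avg:
        return "lucky"
    if ratio < league_avg:
        return "unlucky"
    return "neutral"


print(regression_direction(1.60))  # lucky
print(regression_direction(1.25))  # unlucky
```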

The equation for the “line of best fit” for the data is BABIP = (LD%)*.5534 + .1837, with line-drive percentage in decimal form (.212 rather than 21.2). I’ll be using this below to show where a player’s BABIP would have been if it had followed the league-wide trend for their line-drive rate, as opposed to what they ended up with.
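Plugging a line-drive percentage into that equation is simple enough to sketch in a few lines of Python. The slope and intercept come straight from the line of best fit; the function names and the example hitter (a .330 BABIP on a 20.0 line-drive percentage) are hypothetical.

```python
SLOPE = 0.5534      # from the line of best fit
INTERCEPT = 0.1837  # from the line of best fit


def expected_babip(ld_pct: float) -> float:
    """Expected BABIP for a line-drive percentage given as e.g. 20.0."""
    return SLOPE * (ld_pct / 100.0) + INTERCEPT


def babip_gap(actual_babip: float, ld_pct: float) -> float:
    """Actual BABIP minus expected BABIP; positive suggests good fortune."""
    return actual_babip - expected_babip(ld_pct)


# Hypothetical hitter: .330 BABIP on a 20.0 line-drive percentage.
print(round(expected_babip(20.0), 3))    # 0.294
print(round(babip_gap(0.330, 20.0), 3))  # 0.036
```

A positive gap means the player's BABIP sat above the trend line, which is the profile of everyone on the list that follows.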

Now, let’s crush some dreams and lay down the law (of averages).

Regression Candidates

1. Yasiel Puig, Dodgers

Troll troll troll troll. But really. At times last year, it seemed as if Puig was livin’ on a prayer and feasting on magic. Well, that’s because he was. Puig had the highest BABIP/line-drive percentage ratio of anybody with at least 300 plate appearances in the majors.

The runner-up for NL Rookie of the Year finished with an ungodly .383 BABIP. This would have been tied with Joe Mauer for the second highest in the majors if Puig had recorded enough plate appearances. The only problem is that Mauer finished with a 27.7 line-drive percentage and Puig was at 19.1.

Based on our less-than-scientific line of best fit, Puig should have finished the year with a .251 average and a .331 OBP. Instead, those totals were at .319 and .391, respectively. Part of this is due to Puig’s speed, so he may not regress all the way back to .251. That said, it would be foolish to believe that he would continue to produce at the rate he did last year. His output will likely end up somewhere between his bipolar numbers in August (.320/.405/.515) and September (.214/.333/.452).

2. Michael Cuddyer, Rockies

This one is a bit more painful because, as a life-long Twins fan, I still love Cuddy. He’s a good man, but the 2013 NL Batting Champ had a bit of help from Ol’ Lady Luck in getting there.

In his second year with Colorado, Cuddyer finished with a .383 BABIP, more than 16 percent higher than his previous career high of .328 all the way back in 2006. This was despite having only his fourth-highest line-drive percentage in a season where he had at least 250 plate appearances.

When he was not dealing with an abdominal injury in 2012, Cuddyer had a comparable line-drive percentage at 20.4. However, his BABIP that year was a full 95 points lower than it was in 2013.

Based on his 20.2 line-drive percentage, Cuddyer’s BABIP should have been at .295, making his average .264 and his OBP .328 (as opposed to .331 and .389, respectively). These are both a stitch below his career averages, which makes more sense for a guy in his age-34 season. Expect Cuddyer to play more like a 35-year-old next year than a 28-year-old when doing your fantasy drafts this spring.
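For the curious, here is one way an adjusted BABIP can be turned back into a batting average. It is only a sketch: it leans on the standard BABIP definition, (H - HR) / (AB - K - HR + SF), solved for hits, which may not be exactly how the numbers in this article were produced. The stat line in the example is invented for illustration and is not Cuddyer's.

```python
def average_from_babip(babip: float, ab: int, hr: int, so: int, sf: int) -> float:
    """Batting average implied by a given BABIP.

    Uses BABIP = (H - HR) / (AB - K - HR + SF) rearranged for hits, then
    divides by at-bats. Home runs and strikeouts are held fixed.
    """
    balls_in_play = ab - so - hr + sf
    hits = hr + babip * balls_in_play
    return hits / ab


# Hypothetical stat line: 500 AB, 20 HR, 100 K, 5 SF.
print(round(average_from_babip(0.383, ab=500, hr=20, so=100, sf=5), 3))
print(round(average_from_babip(0.295, ab=500, hr=20, so=100, sf=5), 3))
```

Because home runs and strikeouts stay put, every point of lost BABIP only drags down the hits on balls in play, so the average falls by less than the BABIP does.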

3. Kelly Johnson, Yankees

The Yankees don’t care about your gosh durn Sabermebullcrap numbers, nerd!

Well, they probably should. The man who may be tabbed to replace Robinson Cano had the second-lowest line-drive percentage of players with at least 400 PAs last year (keep on doing you, Dan Uggla).

Johnson, despite a poor 15.2 line-drive percentage, finished with a .276 BABIP. According to our line of best fit, that number should have been .268. This would have lowered his average to .227 from .235 and his OBP to .297 from .305. The difference may not seem like a lot, but when your rates are already that low, every point counts. Something tells me that increasing Johnson’s plate appearances from 407 isn’t a good idea if his OBP is going to be sub-.300.

There’s a reason Johnson is no longer with the Rays. Do you think Andrew Friedman and Joe Maddon, two of the best in the biz, want to pay $3 million for a guy who can’t get on base? No siree, Bob. The Yankees infield still has a lot of work left to do if it wants to return to relevance this year.

4. Oswaldo Arcia, Twins

No! Why? Why? Let me live in a delusional world where I believe Arcia will lead the Twins to only 90 losses this year! Let the ignorance live!

Arcia’s rookie season in Twinkie Town had its highs and lows. He finished with a gross 31.0 strikeout percentage that warranted a demotion back to Triple-A Rochester in July. Sure, the 14 home runs and 17 doubles in just 351 at-bats were nice, but Arcia could see more struggles this year.

The 22-year-old posted a .336 BABIP while being limited to a 17.1 line-drive percentage. His final rates were a .251 average and a .304 OBP. Using our equation, those numbers should have been .214 and .270, respectively. That doesn’t exactly spell resurgence, I guess. Arcia needs to cut down on the strikeouts and find some patience quickly if he wants to lock down a corner outfield spot. Until then, I’ll be sobbing in the bathtub clutching my Shannon Stewart bobblehead.

Other candidates: Wil Myers, Jean Segura, Juan Lagares, Yan Gomes