Choose Your Own Adventure: 2025 MLB GM Edition

Dive into how you could use standard analytics (t-tests and ANOVAs) to compare groups & tackle real-world MLB GM challenges going into the 2025 season.

Mar 24, 2025

READ TIME: ∼6 MINUTES | WORDS: 1,286

New York Yankees: After a whirlwind offseason, the Yankees insist they're in a better position to win compared to last year following their bevy of signings and trades that project a more complete and all-around better roster than 2024.
1. However, a significant question remains about who will play third base. So, let's explore an analysis a Yankees GM might conduct when considering players to bring in to fill their needs at third base.
Boston Red Sox: Speaking of third base, the Red Sox made a surprising move this offseason when they signed star third baseman Alex Bregman to allegedly play second base.
1. This quickly caused a stir in Red Sox Spring Training, as team officials and media members hinted that the plan was not 100% set on Bregman playing second base.
2. Meanwhile, current star third baseman Rafael Devers insisted he wouldn’t be switching positions. Thus, we will explore an analysis a Red Sox GM might conduct when considering which is the better option at third base.
Miami Marlins: Starting with the Fish, we'll shift our focus from roster construction and winning to roster construction and revenue.
1. A playoff participant just two years ago, the Marlins have elected to go with an entire team of replacement-level players (0 WAR - Wins Above Replacement) — at least once they trade away their final valuable pitchers.
2. Why don't we explore an analysis a Marlins GM might conduct when considering this approach to building a team?
Los Angeles Dodgers: In a seemingly alternate universe lives the 2025 Dodgers — putting the entire league to shame as they spare no expense on players.
1. Going all-in on a roster stacked with stars sure seems fun (MLB The Show, anyone?), but we can sketch a down-to-earth analysis a Dodgers GM might conduct on whether there's actual viability in spending literally billions of dollars on on-field talent.

WHY IT MATTERS

Neither winning nor making money in the MLB is guesswork anymore.

Proper research questions and statistical tests (like t-tests and ANOVAs) give GMs the evidence to determine whether roster moves genuinely move the needle on performance and profits.

➡︎ Yankees

Given Yankee Stadium's lefty-friendly dimensions, a responsible Yankees GM should always treat left-handed hitters as our offensive preference, especially when our current power production predominantly comes from our monster righties, Judge and Stanton.

Research Question: Does signing a left-handed third baseman yield a significantly different impact on the Yankees' winning percentage compared to a right-handed third baseman, given Yankee Stadium's park effects?
T-Test: With two distinct groups to compare, left-handed and right-handed third basemen, a t-test works perfectly. It's explicitly designed to compare the difference in means between two groups. By running a t-test on the two groups (lefty vs. righty 3B), we can see whether a difference in projected production, adjusted for Yankee Stadium, indeed translates to more wins.
- Data Setup: Gather MLB third-base candidates' annual offensive metrics (e.g., wRC+, OPS) over multiple seasons, adjusted for Yankee Stadium. Label each candidate "Lefty" or "Righty." Then, estimate how each candidate's production translates into additional runs and wins created for the Yankees.
- Model: Compare winPct ~ 3BHandedness (winning percentage for two groups: lefty vs. righty).
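As a rough sketch of what that comparison could look like, the snippet below computes Welch's t statistic (a t-test variant that does not assume equal variances) using only the Python standard library. The function name `welch_t` and every number in the two lists are made up for illustration; they are not real projections. To get a p-value, the same two lists could be passed to `scipy.stats.ttest_ind(lefty_win_pct, righty_win_pct, equal_var=False)`.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard error of each group's mean
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical projected team winning percentages, one value per candidate.
# These numbers are invented for illustration only.
lefty_win_pct = [0.580, 0.574, 0.586, 0.571, 0.590]
righty_win_pct = [0.568, 0.572, 0.561, 0.575, 0.566]

t_stat, df = welch_t(lefty_win_pct, righty_win_pct)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A large absolute t value relative to the degrees of freedom would suggest the lefty/righty gap is unlikely to be noise; a small one would suggest handedness alone is not moving the projected win rate.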

➡︎ Red Sox

As the Red Sox GM, we need to at least consider whether our team would be better off, from a winning percentage perspective, with a defense-first third baseman like Bregman or an offense-first third baseman like Devers.

Research Question: Does a defense-heavy third baseman produce a significantly different win rate than a bat-first third baseman for the Red Sox?
T-Test: Once again, we compare two distinct groups (defense-first and offense-first third basemen), so a t-test is the best way to analyze whether run prevention (defensive metrics) vs. run creation (offensive metrics) makes a meaningful difference in projected runs scored and runs allowed.
- This would allow you to calculate and compare the Red Sox's projected winning percentage for the different scenarios using Bill James' Pythagorean Winning Percentage — read more about it here.
- Data Setup: Compile defensive metrics (DRS, OAA) and offensive metrics (wOBA, HR, wRC+) for available 3B. Split them into "Defense-First" or "Offense-First" categories based on thresholds.
- Model: Compare winPct ~ 3BType (winning percentage for two groups: defense-first vs. offense-first).
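To make the Pythagorean step concrete, here is a minimal sketch of the classic Bill James formula, runs scored squared divided by the sum of runs scored squared and runs allowed squared. The baseline run totals and the run adjustments for each type of third baseman are hypothetical placeholders, not projections for any real player.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Bill James' Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team baseline and per-scenario run adjustments (illustration only)
baseline_rs, baseline_ra = 750, 720

# Defense-first 3B: assumed small offensive gain, larger cut in runs allowed
defense_first_pct = pythagorean_win_pct(baseline_rs + 5, baseline_ra - 15)

# Offense-first 3B: assumed larger offensive gain, a few extra runs allowed
offense_first_pct = pythagorean_win_pct(baseline_rs + 20, baseline_ra + 5)

print(f"Defense-first: {defense_first_pct:.3f}")
print(f"Offense-first: {offense_first_pct:.3f}")
```

Running this once per candidate would produce the winPct values that feed the two groups in the t-test.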

Roster Construction & Revenue

➡︎ Marlins

It's no secret that our Marlins leadership group has decided that yet another rebuild is the best option for the organization's future. As the GM leading the rebuild, it is up to us to find an approach that reconstructs the roster without intentionally throwing away entire seasons, with zero intention of being competitive and no regard for our fans (or so you would think).

Research Question: Does a roster of replacement-level players, instead of a standard mix or even league-average players, get penalized for lack of competitiveness by way of generating significantly less revenue than other teams?
ANOVA: Since we need to compare average revenue (the outcome) across multiple levels of one factor (roster type), an ANOVA is a better fit for our testing. It allows us to compare the effects of our three roster types (replacement level, standard mix, and league average) on revenue generation.
- Data Setup: Look at team-level revenue (ticket sales, local TV ratings, merchandise) from past seasons while categorizing each team's roster into "Replacement," "League Average," or "Standard Mix" based on team makeup of player all-star appearances, WAR > 3, and salary.
- Model: Compare Revenue ~ RosterType (revenue for three roster types: Replacement, Standard, Average).
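A one-way ANOVA boils down to comparing the variation between group means with the variation inside each group. The sketch below computes that F statistic by hand with the standard library; the helper name `one_way_anova_f` and the revenue figures (in millions of dollars) are invented for illustration. With SciPy available, `scipy.stats.f_oneway` applied to the same lists would return the F statistic along with a p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a dict of label -> values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Variation of each group mean around the grand mean
    ss_between = sum(
        len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values()
    )
    # Variation of each observation around its own group mean
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical team-season revenue in $ millions (illustration only)
revenue_by_roster = {
    "Replacement": [210, 225, 198, 240],
    "Standard": [290, 310, 275, 305],
    "Average": [260, 280, 255, 270],
}

f_stat, df_between, df_within = one_way_anova_f(revenue_by_roster)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A significant F only says that at least one roster type differs in average revenue; it does not say which one, so a follow-up pairwise comparison would still be needed.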

➡︎ Dodgers

An even worse-kept secret across the MLB landscape is that our Dodgers leadership group has no ceiling on our budget for player acquisition. However, that doesn't inherently mean we should look to load up on as many stars as possible. In our proper due diligence as the GM designing the organization's path forward, we should determine if investing heavily in talent directly boosts revenue as well.

Research Question: Does going 'all-in' on a superstar roster produce significantly higher revenue compared to other roster approaches of a standard mix, league-average players, or replacement-level players?
ANOVA: Now that we are comparing four groups on one factor, an ANOVA is still our best-fitting test. Assessing the average revenue generated by each of our roster types can inform us whether there is any downside to our all-in strategy that we've been overlooking.
- Data Setup: Collect annual revenue data for teams with different roster compositions (star-laden, league-average, replacement-level, standard mix). Classify them by team makeup: player All-Star appearances, players with WAR > 3, salary, and team total WAR.
- Model: Revenue ~ RosterType (revenue for four roster types: Stars, Replacement, Standard, Average).
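Before any ANOVA can run, each team-season needs a roster-type label. One possible way to do that labeling is sketched below. The `TeamSeason` fields, the `roster_type` function, and every cutoff in it are hypothetical choices made for illustration; a real analysis would need to justify its own thresholds.

```python
from dataclasses import dataclass


@dataclass
class TeamSeason:
    team: str
    total_war: float
    players_over_3_war: int
    all_star_count: int
    payroll_millions: float


def roster_type(s):
    """Label a team-season using hypothetical, illustrative cutoffs."""
    if s.players_over_3_war >= 6 and s.payroll_millions >= 250:
        return "Stars"
    if s.players_over_3_war == 0 and s.total_war < 10:
        return "Replacement"
    if s.all_star_count == 0:
        return "Average"
    return "Standard"


# Made-up team-seasons, one per category (illustration only)
seasons = [
    TeamSeason("Team A", 55.0, 7, 5, 320.0),
    TeamSeason("Team B", 6.0, 0, 0, 70.0),
    TeamSeason("Team C", 30.0, 2, 0, 140.0),
    TeamSeason("Team D", 38.0, 4, 2, 180.0),
]

labels = {s.team: roster_type(s) for s in seasons}
print(labels)
```

Grouping revenue by these labels gives the four samples that the Revenue ~ RosterType comparison needs.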

THE BOTTOM LINE

These four GM scenarios highlight how analytics—through t-tests and ANOVAs—can guide real decisions on the field and in the books.

But I hope to have also highlighted a broader competitive backdrop: are some teams realizing they can forget about winning and just cash in on revenue sharing?

MLB super-agent Scott Boras says it's time to leverage the data-driven methods used across the game to develop a new team competitiveness metric that holds owners accountable for using revenues to build competitive rosters.
That would allow the fans and the entire baseball world to see who's truly playing to win and who's gaming the system.

GO DEEPER…

Watch Boras discuss this issue in this recent clip from Foul Territory:

BAR | brilliance above replacement
