The Mathematical Improbability of Parity

July 28, 2020

Here’s something you hear a lot that also has the benefit of being true: baseball is a sport of haves and have nots. There are super teams scattered around the league, the Dodgers and Yankees and Astros of the world. There are plenty of teams that aren’t trying to compete this year; the Tigers and Royals spring to mind, but it’s not like there aren’t others.

In a technical sense, however, the league achieved a rare degree of parity over the weekend, and as we all know, being technically correct is the best kind of correct. After each team played three games, the entire league stood at either 1-2 or 2-1, with fifteen teams in each camp. In that odd, specific sense, this is one of the best years ever for parity in baseball.

Come again? Per no less an authority than MLB.com, this is the first time in the last 66 years that no team started 3-0 in their first three games. In that contrived sense, then, this is the most parity since 1954. Given that it was far easier to have no team start 3-0 then (there were only 16 teams), you would even be justified in saying that this was the most balanced start of all time.

That sounds, without putting too much thought into it, very impressive. 1954! Man hadn’t landed on the moon. The LOOGY hadn’t been invented, or the personal computer. It was a very different time.

As fun as it would be to leave it at “Wow, that was crazy,” I thought I’d spoil the fun with a little math. First things first — what if we think every team is evenly matched? Let’s leave home field advantage out for now — we’re just approximating anyway, and that makes the math cleaner. The math for a single series is easy; if each game is a coin flip, all we need to do is find the odds of getting either three heads or three tails in a row.

That works out to 0.5*0.5*0.5, or 0.125, for each outcome. In other words, any given three-game series between evenly matched teams ends in a sweep 25% of the time, with each team sweeping the other 12.5% of the time. From there it’s a snap to work out the odds of it happening to the whole league.

The odds of neither team sweeping are 75%. We need that to happen in 15 straight series. Easy, peasy: 0.75 raised to the 15th power will do it. That’s 1.3%, which is indeed unlikely. To get to a 50% chance of that happening at least once, you’d need to play 52 seasons. To get to a 90% chance, you’d need to play 172 seasons. It’s unlikely, is my point.
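
If you want to check the coin-flip arithmetic yourself, here is a minimal Python sketch. The helper name `seasons_needed` is made up for illustration, and it assumes each season's opening weekend is an independent trial with the same odds.

```python
import math

def seasons_needed(p_event: float, target: float) -> int:
    """Smallest number of independent seasons that gives at least `target`
    probability of seeing the event at least once."""
    # P(at least once in n seasons) = 1 - (1 - p)^n >= target
    return math.ceil(math.log(1 - target) / math.log(1 - p_event))

# Either team sweeping a three-game coin-flip series.
p_sweep = 2 * 0.5 ** 3

# All 15 opening series avoiding a sweep.
p_no_sweeps = (1 - p_sweep) ** 15

print(f"sweep in one series: {p_sweep:.1%}")
print(f"no sweeps anywhere:  {p_no_sweeps:.1%}")
print("seasons for 50%:", seasons_needed(p_no_sweeps, 0.5))
print("seasons for 90%:", seasons_needed(p_no_sweeps, 0.9))
```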

“It’s unlikely” is always going to be our answer. But let’s get a little more precise in our estimation. Every team isn’t a coin flip against every other team. The Dodgers, for example — the juggernaut Dodgers, who added no less a presence than Mookie Betts to an already loaded squad — faced off against the rebuilding Giants. Even ignoring home field advantage, that’s hardly an even matchup.

To estimate each team’s chances in each series, we can use a neat little mathematical trick called the odds ratio. Odds ratio is a mathematical way of answering, essentially, this question: what happens when a .600 team plays a .400 team? The answer clearly isn’t a 60% chance at winning; a .600 team is, in theory, 60% to beat an average team, and their opponent is worse than average. The formula looks like this:

Team A Win Percentage = (A – A * B) / (A + B – 2 * A * B)

A is the first team’s winning percentage, B is the second team’s winning percentage, and the output will give you Team A’s chances of beating Team B in a single game. We’re ignoring starting pitching, naturally, which is just the cost of doing business at such a high level; if you want to zoom in and take each matchup into account, you no doubt can, but that’s more detail than we’re getting into today.
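
As a quick sanity check, the formula can be written as a small Python function. The name `win_probability` is a placeholder of my choosing, and the inputs are assumed to be winning percentages against neutral opposition.

```python
def win_probability(a: float, b: float) -> float:
    """Chance that team A beats team B in a single game, given each
    team's winning percentage against neutral opposition."""
    return (a - a * b) / (a + b - 2 * a * b)

# The .600 team against the .400 team from the question above.
print(round(win_probability(0.600, 0.400), 3))  # 0.692

# Against a .500 opponent, a team simply keeps its own winning percentage.
print(round(win_probability(0.600, 0.500), 3))  # 0.6
```

So the .600 team is roughly a 69% favorite over the .400 team, comfortably more than 60%.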

Okay — let’s use the real Dodgers and Giants win percentages to work out the odds. Not their actual records this year, of course — that’s wrong on, well, tons of levels. First of all, they’re both .500. Second of all, those games were against each other. Third of all, that doesn’t reflect their true talent. I don’t have time to explain why I don’t have time to explain the rest of the reasons it won’t work, but just trust me on this one; we’re going to need a projection.

Luckily, our very own website has the goods. On our projected standings page, you can see a “rest of season win percentage” column driven by our Depth Charts playing time and skill projections. Handily enough for us, those win percentages are against neutral opposition, which means we can plug them right into our formula.

The Dodgers are a .606 true talent team. The Giants are a .447 true talent team. Plug them into the formula, and the Dodgers are 65.6% likely to win a random game against the Giants. Armed with that, we can work out the probabilities of a sweep. The Dodgers have a 28.2% chance of sweeping, while the Giants check in at a mere 4.1%. Add them up, and that’s a 32.3% chance of a sweep occurring.
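
Here is a rough sketch of that Dodgers-Giants calculation in Python. The function names are hypothetical, each game is assumed to be independent with the same win probability, and because the inputs are rounded to three decimals the printed values can differ from quoted figures in the last digit.

```python
def win_probability(a: float, b: float) -> float:
    """Odds ratio formula: chance that team A beats team B in one game."""
    return (a - a * b) / (a + b - 2 * a * b)

def sweep_odds(p_home: float, games: int = 3):
    """Return (home sweep, away sweep, either sweep) for a series in which
    the home team wins each game independently with probability p_home."""
    home = p_home ** games
    away = (1 - p_home) ** games
    return home, away, home + away

p_dodgers = win_probability(0.606, 0.447)
home, away, either = sweep_odds(p_dodgers)

print(f"Dodgers single-game win: {p_dodgers:.1%}")
print(f"Dodgers sweep: {home:.1%}")
print(f"Giants sweep:  {away:.1%}")
print(f"Any sweep:     {either:.1%}")
```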

You know what’s coming next. We’re going to do the same thing for each of the 15 matchups that took place over the weekend. Away winning percentage, home winning percentage, and Shazam! We have our odds of a sweep:

Sweep Odds By Series

Away	Home	Home WP	Away Sweep %	Home Sweep %	Sweep %
Giants	Dodgers	65.6%	4.1%	28.2%	32.3%
Yankees	Nationals	49.0%	13.3%	11.8%	25.0%
Pirates	Cardinals	56.4%	8.3%	17.9%	26.2%
Braves	Mets	48.3%	13.8%	11.3%	25.1%
Tigers	Reds	61.3%	5.8%	23.0%	28.8%
Blue Jays	Rays	58.4%	7.2%	20.0%	27.1%
Marlins	Phillies	56.7%	8.1%	18.2%	26.3%
Brewers	Cubs	51.7%	11.3%	13.8%	25.1%
Royals	Indians	58.9%	6.9%	20.5%	27.4%
Orioles	Red Sox	62.9%	5.1%	24.8%	30.0%
Rockies	Rangers	49.1%	13.2%	11.8%	25.0%
Twins	White Sox	46.3%	15.5%	9.9%	25.4%
Diamondbacks	Padres	52.3%	10.9%	14.3%	25.2%
Mariners	Astros	65.9%	4.0%	28.6%	32.6%
Angels	Athletics	52.2%	10.9%	14.2%	25.1%

From there, it’s easy. The odds of not having a sweep in a given series are just one minus the odds of a sweep. We multiply the odds of each series ending without a sweep together to figure out the chances of no one being 3-0 after three games. It works out to a 0.863% chance. To put it in the same terms as earlier, you’d need to play 80 seasons to have a 50% chance of seeing this at least once. To get to a 90% chance of seeing it at least once, you’d need to play 266 seasons. It’s not very likely, is the point.
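
For the curious, the whole-league calculation can be approximated from the table. This is only a sketch: the list `HOME_WP` is copied from the Home WP column, which is rounded to one decimal place, so the result may differ slightly from the figure above and the season counts can land one off.

```python
import math

# Home team single-game win probabilities, in table order
# (Dodgers, Nationals, Cardinals, ..., Astros, Athletics).
HOME_WP = [0.656, 0.490, 0.564, 0.483, 0.613, 0.584, 0.567, 0.517,
           0.589, 0.629, 0.491, 0.463, 0.523, 0.659, 0.522]

def no_sweep(p_home: float) -> float:
    """Chance a three-game series ends 2-1 one way or the other."""
    return 1 - (p_home ** 3 + (1 - p_home) ** 3)

def seasons_needed(p_event: float, target: float) -> int:
    """Seasons required to see the event at least once with `target` odds."""
    return math.ceil(math.log(1 - target) / math.log(1 - p_event))

# Multiply the no-sweep odds of all 15 series together.
p_no_3_0 = math.prod(no_sweep(p) for p in HOME_WP)

print(f"no team at 3-0: {p_no_3_0:.3%}")
print("seasons for 50%:", seasons_needed(p_no_3_0, 0.5))
print("seasons for 90%:", seasons_needed(p_no_3_0, 0.9))
```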

You might notice something interesting about the odds of each series ending in a sweep. The lowest odds anywhere were 25%. That also happens to be the chances of an evenly matched series ending in a sweep, and that’s no coincidence. Take a look at the range of sweep probabilities for a given home team winning percentage:

The graph is symmetrical, of course, because we don’t care which team is doing the sweeping. The key is that closer matchups lead to lower chances of a sweep. That’s intuitive, but it’s still neat to me. It might seem like the added prospects of a sweep by a better underdog would offset the lost sweep percentage from a less dominant favorite, but it simply doesn’t work that way.
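
One way to see why the curve bottoms out at 25%: call the home team's single-game win probability p (a symbol introduced here for illustration, with every game assumed independent) and expand the odds of a three-game sweep.

```latex
\begin{aligned}
P(\text{sweep}) &= p^3 + (1-p)^3 \\
                &= 1 - 3p(1-p) \\
                &= \tfrac{1}{4} + 3\left(p - \tfrac{1}{2}\right)^2
\end{aligned}
```

The squared term is zero only at p = 1/2 and grows equally in both directions, which accounts for both the 25% floor and the symmetry.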

Think of it like this: imagine a 10-by-10 grid, with each square representing a possible outcome of a two-game series. The first game will run along the x axis at the bottom, and the second along the y axis along the left side. The left and lower sides represent home team wins, the upper and right sides away team wins. It looks like so:

Odds of a Two-Game Sweep

Odds	10%	20%	30%	40%	50%	60%	70%	80%	90%	100%
100%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
90%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
80%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
70%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
60%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
50%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
40%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
30%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
20%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
10%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1

I’ve shaded in the squares that represent sweeps by either team (the cells marked 2-0 for the home team and 0-2 for the away team). Next, let’s make the home team 60% to win the second game:

Odds of a Two-Game Sweep

Odds	10%	20%	30%	40%	50%	60%	70%	80%	90%	100%
100%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
90%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
80%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
70%	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2	0-2
60%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
50%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
40%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
30%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
20%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1
10%	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1	1-1

As you can see, the same number of cells remain shaded. You could move this to 80%, or 40%, and get the same result. Here, we added five cells to the home team’s sweeps while subtracting five from the away team’s. Perfectly balanced, you might say.
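
A small Python sketch of that cell counting, assuming the same 10-by-10 grid. The function `sweep_cells` is hypothetical; it takes how many of the ten slices the home team wins in each game.

```python
def sweep_cells(game1_home: int, game2_home: int, n: int = 10):
    """Count (home sweep cells, away sweep cells) on an n-by-n grid where
    the home team wins game1_home columns and game2_home rows."""
    home = game1_home * game2_home
    away = (n - game1_home) * (n - game2_home)
    return home, away

# Hold the first game at 50% and vary the second game.
for game2 in (4, 5, 6, 8):
    home, away = sweep_cells(5, game2)
    print(f"second game {game2 * 10}%: home {home}, away {away}, total {home + away}")
```

With the first game pinned at 50%, the total comes out to 50 shaded cells every time.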

But watch what happens when we match the first game’s winning percentage with the second game:

Odds of a Two-Game Sweep

Odds	10%	20%	30%	40%	50%	60%	70%	80%	90%	100%
100%	1-1	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2
90%	1-1	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2
80%	1-1	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2
70%	1-1	1-1	1-1	1-1	1-1	1-1	0-2	0-2	0-2	0-2
60%	2-0	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1
50%	2-0	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1
40%	2-0	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1
30%	2-0	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1
20%	2-0	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1
10%	2-0	2-0	2-0	2-0	2-0	2-0	1-1	1-1	1-1	1-1

We just got rid of four sweep cells for the visiting team while adding six for the home team. That’s more total sweeps, and since the move is symmetrical, any move away from a 50% winning percentage does some version of this, even if it’s to a lower number. Move from a 50% to 51% winning percentage? It’s this same thing, only with a 100-by-100 grid. A three-game series? That’s a three-dimensional grid, but it’s a grid all the same, and a move off of 50% will always increase the proportion of shaded parts.
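
The grid argument generalizes neatly, and a few lines of Python can illustrate it. This is a sketch under the same assumptions as the grids: the home team wins the same share of every game, and `shaded_fraction` is a name made up for the example.

```python
def shaded_fraction(home_cells: int, n: int = 10, games: int = 2) -> float:
    """Fraction of the grid (one axis per game, n slices per axis) covered
    by sweeps when the home team wins home_cells of the n slices."""
    away_cells = n - home_cells
    return (home_cells ** games + away_cells ** games) / n ** games

print(shaded_fraction(5))            # 0.5, the 50/50 grid
print(shaded_fraction(6))            # 0.52, 36 home cells plus 16 away cells
print(shaded_fraction(4))            # 0.52, the mirror image
print(shaded_fraction(51, n=100))    # 0.5002, the 100-by-100 grid
print(shaded_fraction(6, games=3))   # 0.28, the three-dimensional version
```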

What does any of this matter in real life? Oh, it doesn’t. It doesn’t at all. You don’t get to know teams’ winning percentages before each series, and you don’t get to draw these perfect boxes that show win probabilities. Different pitchers muddle the math. There’s nothing really actionable about this discovery. But it’s neat, and I spent a bit thinking about it, and I like neat math tricks. So here we are! With a neat math trick about the shocking parity in this weird season.
