The Erdos number measures how closely associated you are with the late number theorist Paul Erdos, who collaborated with hundreds or thousands of other people and thereby sort of sits at the center of the mathematical universe. If you wrote a paper with Erdos, your Erdos number is 1; if you wrote a paper with someone with an Erdos number of 1, your number is 2, and so on.
The show business equivalent of Erdos is Kevin Bacon. Supposedly every actor can connect to Kevin Bacon in a small number of steps. For example, Paul Newman has a Bacon number of 2 because he was in Fort Apache, The Bronx with Clifford David, who was in Pyrates with Kevin Bacon.
A person's Erdos-Bacon number is the sum of his Erdos number and Bacon number. Not many people have an Erdos-Bacon number. You have to have done something in both math and show business.
I have a shaky claim to an Erdos-Bacon number of just 7. That is not bad; even the legendary Carl Sagan could only manage a 6. How did I get a 7?
My Erdos number is an indisputable 4, because I wrote a paper with Greg Forest of UNC who has the link Forest>Richard Montgomery>Persi Diaconis>Erdos.
But I can conceivably claim a Bacon number of 3. I have to stretch it a little here. When I was in the 7th grade, I was in a school production of Macbeth with Tammy Pescatelli. (She was Lady Macbeth; I was Banquo.) It does too count as a movie, because the AV club taped it. Tammy has a Bacon number of 2 (Pescatelli>Dan Cortese>Bacon.) So that gives me 3, and my EB number is 3 + 4 = 7. That ties me with radical motormouth Noam Chomsky!
Now, if by chance I were to appear in a real production with Tammy and not just a school play, I could really cement my claim to an epically low E-B number. I just need a little more luck. God knows I was lucky to get an Erdos number at all, let alone a low one. I don't want to game it just to get the number. As my son said, "Anybody who games their Erdos-Bacon number, there's something wrong with."
I can't remember if I ever collaborated on a math or science project with Tammy, but I probably did at some point, because we were in school together for 12 years. Tammy, if you can dig one up, I will back your claim to an EB number of 8. [Correction: 7]
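Both numbers are just shortest-path distances in a collaboration graph, so the arithmetic above can be checked with a breadth-first search. This is a minimal sketch: the two tiny graphs contain only the chains named in this post, the node label "me" and the helper names are made up for illustration, and the school-play link is taken at face value.

```python
from collections import deque

def undirected(pairs):
    """Build an adjacency map from a list of (person, person) collaborations."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def collaboration_distance(graph, start, target):
    """Breadth-first search: fewest links from start to target, or None if unconnected."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

# Only the links mentioned in the post; a real graph would have many more edges.
papers = undirected([("me", "Forest"), ("Forest", "Montgomery"),
                     ("Montgomery", "Diaconis"), ("Diaconis", "Erdos")])
films = undirected([("me", "Pescatelli"), ("Pescatelli", "Cortese"),
                    ("Cortese", "Bacon")])

erdos = collaboration_distance(papers, "me", "Erdos")
bacon = collaboration_distance(films, "me", "Bacon")
print(erdos, bacon, erdos + bacon)
```

With a fuller graph, breadth-first search would still return the shortest chain, which is what makes the number hard to game.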
Tuesday, October 10, 2017
Thursday, September 14, 2017
Letdowns After Streaks
I've been telling people for a few days now that the Indians need to lose a game and break their streak, so that they have time to get through the letdown and get back to normal before the playoffs.
But is there really a letdown after a streak? It feels like there is, but maybe that's just an effect of elevated expectations.
I checked the won-lost records of teams in the 11 games following each of the 10 longest winning streaks since 1900. Ignoring the game that ended the streak, which by definition has to be a loss, what was the overall record? (Two of the streaks ended very late in the season and there weren't 11 games left.)
It turns out to be .471 (41-46), which is not great baseball, especially for a team strong enough to pull off a long streak. But it's not quite as dire as I imagined.
For the record, the streaks I looked at were
1916 Giants (both the famous 26-game record streak and an earlier 17-game streak that same season)
1935 Cubs
2002 A's
1906 White Sox
1947 Yankees
1904 Giants
1953 Yankees
1907 Giants
1912 Senators
This is a small sample, so take it for what it is.
As I write this the Indians are down 2-1 to KC. We can only hope...
Streaks Part 2
In my last post I estimated the odds of winning streaks of various lengths by simulating a large number of seasons. I came up with a 0.75% chance per season of a win streak of 19 games or longer, but the actual history is 8 such streaks in 137 seasons (6%). That is significantly more than my estimate.
One obvious correction would be to tweak my uniformly distributed team strengths to fatten up the tails. An "outlier" good team would be more likely to have a long win streak. But my distribution was already uniform. A .450 team is as likely as a .500 team, which is to say that my distribution has very fat tails. (I verified that this reasoning was true with a simulation, because I never trust my statistical intuition.) If I fattened up the tails any more, you'd have teams winning 120 and 130 games a season, which never happens.
So I did two things, both based on the fact that a season is not made up of 162 random matchups as my model originally assumed, but of about 50 3- and 4-game series with each series being either all home games or all away games against the same team. That seems like it would increase the likelihood of a streak, because you could line up a bunch of home series against weak teams.
First, I changed the season from 162 individual matchups to 54 3-game series between the same teams. That had basically no effect on the likelihood of a 19-game win streak. Then, I gave the home team a slight edge by increasing its strength 5% and decreasing the visiting team's strength by 5%. This is based on the average home record being about .550 compared to .450 on the road, which I got here. That barely moved the needle. The likelihood of a 19-game win streak was still a little less than 1%, compared to historical experience of 6%.
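Here is a rough sketch of the series-plus-home-edge version, not the code actually used for this post. It assumes the win rule from the earlier luck post (Team A wins with probability A's strength / (2 * B's strength)), applies the 5% boost and cut to the strengths before plugging them in, and caps the probability at 1; the function name, the cap and the random pairing of series are all assumptions. For evenly matched teams the home side wins with probability 1.05 / (2 * 0.95), about 0.553, which is in line with the .550 home record.

```python
import random

def series_season(strengths, rng, n_series=54, series_len=3, home_edge=0.05):
    """Play n_series rounds; each round pairs teams at random for one series.

    Returns (wins per team, longest win streak by any team, total home wins).
    """
    n = len(strengths)
    wins = [0] * n
    current = [0] * n          # each team's active win streak
    longest = 0
    home_wins = 0
    order = list(range(n))
    for _round in range(n_series):
        rng.shuffle(order)
        for i in range(0, n, 2):
            home, away = order[i], order[i + 1]
            boosted = strengths[home] * (1 + home_edge)
            reduced = strengths[away] * (1 - home_edge)
            # Assumption: cap lopsided matchups at certainty.
            p_home = min(1.0, boosted / (2.0 * reduced))
            for _game in range(series_len):
                if rng.random() < p_home:
                    winner, loser = home, away
                    home_wins += 1
                else:
                    winner, loser = away, home
                wins[winner] += 1
                current[winner] += 1
                current[loser] = 0
                longest = max(longest, current[winner])
    return wins, longest, home_wins

rng = random.Random(2017)
equal = [81.0] * 30            # 30 evenly matched teams, to isolate the home edge
wins, longest, home_wins = series_season(equal, rng)
total_games = sum(wins)
print(total_games, longest, round(home_wins / total_games, 3))
```

Swapping in unequal strengths and repeating over many seasons gives the streak frequencies to compare against the single-game model.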
I didn't include the effects of home stands or road trips; that is, the fact that teams usually play three or four series in a row at home or away instead of cycling between the two. But I don't see that being a big player.
A couple of other possible explanations are that streaks either psychologically build on themselves (probably impossible to verify with any rigor, due to the luck factor) or that team strength waxes and wanes during the season instead of being a fixed value throughout. This second effect seems promising because many injuries take a few weeks to heal. The worst teams at any given time probably include good teams that have a lot of injured players. When those players get better, all of a sudden the team is good again. Then there's the streakiness of individual players. I speculate that many slumps are due to players being injured but functional and not telling anyone.
At some point I'll build up my team strengths from player stats, instead of assigning them randomly. Then we'll be cookin' with gas, as they say.
One correction: I said that the 1916 26-game winning streak of the Giants was interrupted by a tie. That is not exactly correct. The "tie" was actually a suspended game that, by the rules of the day, had to be replayed from the start instead of picked up from where it left off as it would be today. They did replay the game (I didn't know this when I made my original remarks) and the Giants won. That's not a tie in my book. So the record really is 26 wins in a row and there should be no asterisk by it.
Tuesday, September 12, 2017
Streak Odds
The simulation I developed to find the effect of luck in baseball can be used to estimate the odds of various streaks. As it happens, the Cleveland Indians are currently sitting on a 19-game winning streak, which is the sixth-longest winning streak since 1880.
In an earlier post I said the all-time longest winning streak was 26 by the Giants in 1916, but it turns out that streak was 27 wins interrupted between wins 15 and 16 by a tie with Pittsburgh. (A tie? According to Retrosheet they finished the top of the 9th tied at 1-1, but the Giants didn't bat in the bottom of the 9th, for unrecorded reasons. I'm guessing it started raining, and then they never completed the game because neither team contended that year.)
The 1916 Giants also had a 17-game winning streak earlier in the season, but they only came in fourth!
What are the odds of any team getting a 19-game win streak or better in a given season? I set up my team strengths as shown in the scatter plot on the left, and then ran 1000 simulated 162-game seasons. The histogram of longest streaks is shown on the right. There were 15 win or loss streaks of 19 or more, so that would be 15/2/1000 = 0.75% chance per season.
The actual number of streaks of 19 or more since 1880 (137 seasons but most were fewer than 162 games) is 8 (6%). So there's a fat tail effect, or something, going on that I'm not accounting for.
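For anyone who wants to try this, here is a stripped-down sketch of the streak count. It is not my actual code: it assumes uniformly distributed strengths between 53 and 109 expected wins (a guess at the range), the win rule from the luck post (Team A wins with probability A / (2 * B), capped at 1 here), and it counts seasons with at least one win streak of the target length rather than halving a count of win and loss streaks.

```python
import random

def longest_streak_in_season(strengths, games, rng):
    """Longest win streak by any team in one season of random daily pairings."""
    n = len(strengths)
    current = [0] * n          # each team's active win streak
    longest = 0
    order = list(range(n))
    for _day in range(games):
        rng.shuffle(order)
        for i in range(0, n, 2):
            a, b = order[i], order[i + 1]
            p_a = min(1.0, strengths[a] / (2.0 * strengths[b]))
            winner, loser = (a, b) if rng.random() < p_a else (b, a)
            current[winner] += 1
            current[loser] = 0
            if current[winner] > longest:
                longest = current[winner]
    return longest

def streak_chance(min_len, seasons, games=162, teams=30,
                  low=53.0, high=109.0, seed=0):
    """Fraction of simulated seasons containing a win streak of min_len or more."""
    rng = random.Random(seed)
    hits = 0
    for _season in range(seasons):
        strengths = [rng.uniform(low, high) for _team in range(teams)]
        if longest_streak_in_season(strengths, games, rng) >= min_len:
            hits += 1
    return hits / seasons

# 200 seasons keeps the run short; rare events need many more for a stable value.
print(streak_chance(19, seasons=200))
```

Setting games=58 gives the shortened-season case; a probability this small needs thousands of seasons before the estimate settles down.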
A window company in Cleveland offered free window jobs to anyone who bought windows in July, if the Indians had a 15-game winning streak. You got the deal if you bought by July 31, at which time the Tribe had 58 games left to play. What are the odds of a 15-or-better winning streak by one of the 30 teams in 58 games?
I calculated it by looking at the longest streak for "Team 1" of my ensemble over 10,000 58-game "seasons". (It took 10,000 simulated seasons to get a stable value.) That streak was 15 or greater just 14 times, so the chance of a 15-game winning streak was 14/2/10,000 = 0.07%. The figure below shows on the left the strength and actual wins over the 58 games for the 30 teams, and on the right the histogram of longest streaks by Team 1 in each season. The caption should read 10,000 seasons, not 1000 seasons.
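As a cross-check on the simulation, the single-team case can be computed exactly with a small recursion instead of random draws. This is a different technique from the one I used, and it assumes the team is a plain coin flip (win probability 0.5 in every game), which ignores team strength entirely; the function name is made up.

```python
def prob_streak_at_least(k, n, p):
    """Exact chance of at least one run of k straight wins in n games.

    state[j] is the probability of being on a j-game win streak (j < k)
    without having completed a k-game streak so far.
    """
    state = [0.0] * k
    state[0] = 1.0
    for _game in range(n):
        new = [0.0] * k
        for j, mass in enumerate(state):
            new[0] += mass * (1 - p)       # a loss resets the streak
            if j + 1 < k:
                new[j + 1] += mass * p     # a win extends it
            # otherwise the streak reaches k and leaves the "not yet" pool
        state = new
    return 1.0 - sum(state)

p15 = prob_streak_at_least(15, 58, 0.5)
print(f"{100 * p15:.3f}%")
```

Under the coin-flip assumption this lands close to the simulated 0.07%, which suggests 10,000 seasons was enough to pin the number down.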
I didn't use all the information available. I could have only looked at teams that happened to have 57 wins in the first 104 games (as the Indians did), which would have taken a lot more simulations but probably wouldn't have changed the results much because 57 wins out of 104 is not much better than average.
As is typical of these kinds of promotions, the window company itself didn't take on the risk of having to pay out. They paid a promotion company, which took the risk. What would have been a fair price to pay the promotion company? They sold about $2 million worth of windows, so the expected payout would be 0.07% x $2 million or $1400. Even if they paid $10,000, that promotion company had to eat a very spicy meatball when the Tribe won their 15th game.
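The fair-price arithmetic, spelled out as a quick sketch. The $2 million and $10,000 figures are the rough numbers from the paragraph above, and the variable names are mine.

```python
chance = 14 / 2 / 10_000          # simulated chance of a 15-game streak (0.07%)
sales = 2_000_000                 # approximate window sales, in dollars
expected_payout = chance * sales  # what the promotion company expects to pay out

premium = 10_000                  # the hypothetical fee mentioned above
implied_chance = premium / sales  # streak chance at which that fee just breaks even

print(round(expected_payout), implied_chance)
```

A $10,000 fee would only break even if the chance of the streak were 0.5%, about seven times the simulated estimate.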
Now, a philosophical excursion. It only makes sense to talk about probability and odds when there is some degree of ignorance. On July 31, everyone was ignorant of how the Indians would actually play, but there were varying degrees of knowledge about their record so far, their injuries, which teams they were scheduled to play, how many home versus away games and other information that a sophisticated model could use to estimate the odds. Given what we know today, what are the odds the Indians would have won their 19th straight last night?
100%.
Monday, September 4, 2017
Songwriter
I trained a neural network on the lyrics of a certain popular songwriter and then had it generate a short song:
Got in a little favor for him.
I wanna find one place, I wanna find one face that ain't looking through me.
Down in the U.S.A.
Born in the shadow of the refinery.
I'm a cool rocking Daddy in the face of these....
Whoa whoa whoa badlands!. Whoa whoa whoa badlands!. Whoa whoa whoa whoa whoa badlands!. Whoa whoa whoa.
For the ones who had a woman he loved in Saigon.
I was born in the shadow of the penitentiary.
I was born in the night, with a fear so real, you spend your life just covering up.
Learned real good right now, you better listen to me, baby.
I'm a long gone Daddy in the shadow of the penitentiary.
If you can't figure out whose lyrics I trained the network on, you must not be between the ages of 30 and 80. I used this guy's code.
Are You Ready For Some Football?
tl;dr: The small number of games in the NFL season strongly exaggerates the differences between teams. The NFL rule of scheduling six of a team's games against division rivals would have no effect on the actual results of a season determined by coin-flips. But division scheduling very slightly exaggerates differences when real differences already exist.
Major League Baseball teams play 162 games a season, which is clearly enough to separate the truly good teams from the merely lucky ones. In the NFL it's only 16 games. Is that enough to separate the great from the lucky?
First, I ran the same simulation I used in my last two posts but set the number of teams to 32 and the number of games per season to 16. With each game decided by the flip of a fair coin (therefore, no ties), here is one example season (sorry about the formatting, Blogger has a fixed column width):
AFC
North | W-L | South | W-L | East | W-L | West | W-L
Cincinnati | 12-4 | Indianapolis | 9-7 | NY Jets | 14-2 | Denver | 8-8
Cleveland | 12-4 | Tennessee | 9-7 | New England | 9-7 | LA Chargers | 8-8
Pittsburgh | 9-7 | Jacksonville | 8-8 | Miami | 5-11 | Oakland | 7-9
Baltimore | 7-9 | Houston | 6-10 | Buffalo | 7-9 | Kansas City | 3-13
NFC
North | W-L | South | W-L | East | W-L | West | W-L
Minnesota | 11-5 | Atlanta | 11-5 | Philadelphia | 8-8 | Arizona | 10-6
Chicago | 10-6 | Tampa Bay | 8-8 | Washington | 7-7 | Seattle | 9-7
Detroit | 7-9 | Carolina | 7-9 | Dallas | 7-7 | LA Rams | 6-10
Green Bay | 6-10 | New Orleans | 4-12 | NY Giants | 6-10 | San Francisco | 6-10
Two things you notice right away are that there seem to be too many teams within a game of .500 (7-9, 8-8 or 9-7), and that there isn't enough separation between the teams in most of the divisions. There are 17 teams within one game of .500, but in 2016 there were actually only 11 teams like that. And in two divisions, no team is more than two games from .500. That's unusual.
Obviously, if I assigned unequal strengths to the teams, this would tend to create some separation. But there is another thing that might work. In my simulation, the schedule ignores divisions. That is, each of the 16 games a team plays is a random matchup with one of the other 31 teams. The Browns are as likely to play the Saints as they are the Steelers. But in the real NFL,
1. A team plays its division rivals twice
2. A team plays all four teams in another division in its conference once
3. A team plays all four teams in another division in the other conference once
4. A team plays its remaining two games against teams from the two remaining divisions in its conference.
Rule 1 seems like it might be important in creating separation within a division. In effect, 3/8 of the season is played between just four teams, and each of those games separates two teams in a division by one game. There is a 100% chance of creating a one-game separation. In contrast, when two teams play opponents outside the division, there's a 50% chance of a one-game separation (one team wins, one loses) and a 50% chance of no separation (both win or both lose).
I almost bought that argument. But when the games are decided by coin flips, the expectation value of separation per game is still zero regardless of the number of teams. If that doesn't convince you, consider that in a simulation of 1000 seasons, the coefficient of variation of wins per team was 0.4065 for a 6-game, 4-team season and 0.4093 for a 6-game, 32-team season - not a statistically significant difference.
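The comparison can be reproduced with a few lines. This is a minimal sketch rather than my original code: every game is a fair coin flip, teams are paired at random each game day, and the coefficient of variation is the standard deviation of wins divided by the mean, pooled over all teams and seasons. The function name and seed are arbitrary.

```python
import random
import statistics

def coin_flip_cv(teams, games, seasons, seed=0):
    """CV of wins per team when every game is decided by a fair coin."""
    rng = random.Random(seed)
    all_wins = []
    order = list(range(teams))
    for _season in range(seasons):
        wins = [0] * teams
        for _day in range(games):
            rng.shuffle(order)                     # random pairings for the day
            for i in range(0, teams, 2):
                winner = order[i] if rng.random() < 0.5 else order[i + 1]
                wins[winner] += 1
        all_wins.extend(wins)
    return statistics.pstdev(all_wins) / statistics.mean(all_wins)

cv_small = coin_flip_cv(teams=4, games=6, seasons=1000)
cv_large = coin_flip_cv(teams=32, games=6, seasons=1000)
print(round(cv_small, 3), round(cv_large, 3))
```

Both values should sit near the theoretical sqrt(6 * 0.25) / 3, about 0.408, because each team's six results are independent coin flips no matter how many opponents are in the pool.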
Anyway, I re-ran the simulation continuing to decide games by coin flips but taking into account Rule 1. Here's how it came out:
AFC
North | W-L | South | W-L | East | W-L | West | W-L
Cincinnati | 12-4 | Jacksonville | 11-5 | Miami | 12-4 | Oakland | 10-6
Baltimore | 9-7 | Houston | 10-6 | New England | 11-5 | Kansas City | 8-8
Pittsburgh | 6-10 | Indianapolis | 8-8 | NY Jets | 7-9 | Denver | 5-11
Cleveland | 5-11 | Tennessee | 6-10 | Buffalo | 8-8 | LA Chargers | 4-12
NFC
North | W-L | South | W-L | East | W-L | West | W-L
Detroit | 9-7 | Tampa Bay | 11-5 | Philadelphia | 10-6 | San Francisco | 9-7
Green Bay | 7-9 | Carolina | 10-6 | NY Giants | 7-9 | Seattle | 8-8
Chicago | 7-9 | Atlanta | 8-8 | Washington | 7-9 | LA Rams | 7-9
Minnesota | 6-10 | New Orleans | 7-9 | Dallas | 6-10 | Arizona | 5-11
It made very little difference. There are now only 15 teams within one game of .500, but there are still two tightly bunched divisions.
What happens if we assign random team strengths instead of just flipping a coin? I'll just base it on the CV. For uniformly distributed team strengths between 4 wins/season and 12 wins/season, the CV of wins per team for a league without divisions (no Rule 1) was 0.46 in 1000 simulated seasons. With Rule 1, it was 0.49. So Rule 1 does exaggerate the differences between teams when a real difference already exists. But it's a weak effect.
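A sketch of that comparison follows, with several simplifications that are mine and not necessarily what the original run did: divisions are consecutive blocks of four teams, the ten non-division games are plain random matchups (Rules 2 through 4 are not modeled, so rivals can occasionally meet a third time), the win rule is the A / (2 * B) formula from the baseball posts with the probability capped at 1, and which team plays the "A" role is chosen at random.

```python
import random
import statistics

# Round-robin pairings inside a four-team division.
DIVISION_ROUNDS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]

def play(a, b, strengths, wins, rng):
    """One game; the 'Team A' role in the win formula is assigned at random."""
    if rng.random() < 0.5:
        a, b = b, a
    p_a = min(1.0, strengths[a] / (2.0 * strengths[b]))
    wins[a if rng.random() < p_a else b] += 1

def season(strengths, rng, division_rule):
    """Wins per team over 16 games, with or without the six division games."""
    teams = len(strengths)
    wins = [0] * teams
    random_days = 16
    if division_rule:
        for _repeat in range(2):                   # each rival twice = 6 games
            for pairs in DIVISION_ROUNDS:
                for base in range(0, teams, 4):
                    for x, y in pairs:
                        play(base + x, base + y, strengths, wins, rng)
        random_days = 10
    order = list(range(teams))
    for _day in range(random_days):
        rng.shuffle(order)
        for i in range(0, teams, 2):
            play(order[i], order[i + 1], strengths, wins, rng)
    return wins

def wins_cv(division_rule, seasons=1000, seed=0):
    """CV of wins per team, pooled over all teams and seasons."""
    rng = random.Random(seed)
    all_wins = []
    for _season in range(seasons):
        strengths = [rng.uniform(4.0, 12.0) for _team in range(32)]
        all_wins.extend(season(strengths, rng, division_rule))
    return statistics.pstdev(all_wins) / statistics.mean(all_wins)

print(round(wins_cv(False), 2), round(wins_cv(True), 2))
```

Because of those simplifications the two values need not match the 0.46 and 0.49 quoted above, but the with/without comparison is the same experiment.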
The range of 4-12 wins/season for team true strengths seems about right. So now you want to see the cloud plot of wins for the NFL. Here it is for 100 simulated seasons:
The scatter in wins per season is huge. An average team wins anywhere from 3 to 13 games a season. And the plot of "luck ratio" on the right is cleaner than it was for baseball and clearly shows that there's more random variation in wins for weaker teams than for stronger ones.
Sunday, September 3, 2017
More Baseball Simulations
One question that was raised from my post yesterday is what shape the distribution of true strengths is. Is it a bell curve, a uniform distribution, or something in between?
We can't answer that question directly, because we can never observe the true strengths, only the actual win-loss records. But the shape of the true strength distribution might have an effect on the shape of the actual distributions of wins per season, which we can observe.
If I assume the following bell curve for true strength
then I get the following distribution of wins per season (this was over 137 seasons for reasons I'll explain later):
But if I assume the following flat distribution of strengths:
then I get this distribution of wins per season:
This example looks "blockier" than the one from the bell curve, but in fact its coefficient of variation is 0.12, compared to 0.13 for the bell curve result. So it's not really possible to tell from the actual outcomes whether the distribution of true strengths is bell-shaped or flat - and if you can't tell, then it doesn't matter, at least for the purpose of predicting the distribution of wins.
The histogram of actual wins for the last six MLB seasons is
and its CV is 0.135. You could probably do some more sophisticated tests, but in my experience doing this kind of modeling, if a result isn't apparent to the eye, no fancy test is going to be convincing. One thing that's interesting about the actual MLB histogram is the "dip" in the middle. This could just be random chance and might go away if more seasons were included, but it could be that as the season goes on, talent tends to drain from the weaker teams and go to the stronger teams, which could make the win histogram bimodal. Teams that have big payrolls but are out of the playoff hunt by August are often looking to unload what talent they have to the teams that are going to make a run for October, so bad teams get worse and good teams get better.
I ran 137 seasons because I wanted to get some statistics on win streaks. Here's a histogram of the longest win streak by any team during each of the 137 simulated seasons:
This distribution is definitely skewed. Its mode (the commonest value) is 12, but in no season was the longest streak less than 10. There are 22 streaks of 16 or longer, and the longest streak of all the 137 seasons is 26. This is not far from reality. In the past 137 seasons, there are 30 MLB streaks of 16 games or longer, and the all-time longest streak during that span of time was by the New York Giants of John McGraw, who won 26 in a row in a ridiculous September 101 years ago.
Saturday, September 2, 2017
The Role of Luck in Baseball
In Major League Baseball, there is decent parity. The span between the worst and best teams in baseball right now is the 51-83 (.381) Phillies to the 92-41 (.692) Dodgers. In contrast, the worst and best teams in the NBA last year were .244 (Brooklyn) and .817 (Golden State), and the worst and best teams in the NFL were .063 (Cleveland, eeegh) and .875 (New England).
There is an element of luck in every game. When the Phillies play the Dodgers, the Dodgers will probably win, but nobody is really shocked if the Phillies pull one out. Maybe the Dodgers stayed out too late the night before, or had a rough flight to Philly.
But in the long run, the "better" team will beat the "worse" team more often than not. I put better and worse in quotes because I haven't exactly defined a team's true strength yet. Here is my definition: the true strength of a baseball team is the average number of wins it would get over an infinite number of seasons. That way, the effect of luck washes out completely. For example, an average team would get 81 wins per 162 games, if they played forever. By forever, I mean the same roster, at the same age and skill level, playing hypothetical repeated seasons forever. Obviously, they aren't getting older and older in these hypothetical seasons, as they would in real life.
Considering the effect of luck, you can see how the shortness of the NFL season (16 games) might tend to exaggerate differences between teams. The Browns clearly suck, but over a large set of seasons they might average 2 or 3 wins instead of the single win they got last year.
How does luck affect the number of wins a baseball team gets in one season, compared to its true strength? The baseball season has 10 times as many games as the NFL season, so the effect of luck should be a lot less than in the NFL. I ran some simulations to find out.
I ran 100 full seasons where 30 teams play each other in random matchups for 162 games. At the beginning of each season, I assign true strengths to the teams from a normal distribution with a coefficient of variation of 0.2. That results in true strengths running from about 40 to about 120 expected wins per season. Then I run through all 162 x 15 = 2,430 games per season. (Remember that on each game day, 30 teams play a total of 15 games.)
Each game goes like this: I draw a number from a uniform distribution between 0 and 1. If that number is less than
Team A's strength / (2 * Team B's strength)
then Team A wins. Otherwise, Team B wins. From this formula you can verify that if Team A has strength 90, and plays average teams (strength of 81) over and over, then Team A will win an average of 90 games per season in the long run. So this satisfies my definition of the team's true strength.
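In code, the rule and the sanity check look something like this. It is a sketch, not the original program: the function names are made up, and the cap at 1 is an added assumption for the case where Team A is more than twice as strong as Team B.

```python
import random

def team_a_wins(strength_a, strength_b, rng):
    """One game: Team A wins if a uniform draw falls below A / (2 * B)."""
    p = strength_a / (2.0 * strength_b)
    p = min(p, 1.0)   # assumption: treat anything above 1 as a certain win
    return rng.random() < p

def average_wins(strength_a, strength_b=81.0, games=162, seasons=2000, seed=1):
    """Average wins per season for Team A against a fixed-strength opponent."""
    rng = random.Random(seed)
    total = sum(team_a_wins(strength_a, strength_b, rng)
                for _game in range(games * seasons))
    return total / seasons

avg = average_wins(90.0)
print(round(avg, 1))
```

For a strength-90 team against strength-81 opponents the win probability is 90 / 162, about 0.556 per game, so the long-run average should come out close to 90 wins per 162 games.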
But the outcome of each game has an element of chance. Drumroll, please...
From left: Distribution of team strengths, actual wins versus strength for all teams and seasons, and ratio of actual to expected wins for all teams and seasons
When the team strengths are normally distributed, an average team (average 81 wins per season over infinity seasons) won as few as 65 and as many as 95 games during the 100 simulated seasons. That's the difference between first and last place. The plot of actual divided by expected wins was a check. It should average to 1 for all strength values, which it does, except for the very weak teams (not sure what's going on there; maybe a problem with my random number generator). But it shows that the scatter is bigger for weak teams. That is, it's more likely for a weak team to do unexpectedly well or unexpectedly poorly than for a strong team. That is good - it means luck plays the least role for the strongest teams, which are the ones that get the glory. If a really crappy team gets lucky, it probably still won't be enough to affect a championship.
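A bare-bones version of the season loop is sketched below. It is not the code behind the plots: to isolate the "average team" it pins team 0 at a strength of exactly 81 and redraws the other 29 teams each season, it floors strengths at 1 so a freak draw cannot go negative, and it caps the win probability at 1. All three choices are assumptions.

```python
import random

def simulate_season(strengths, rng, games=162):
    """Each game day, pair the teams at random and play; return wins per team."""
    n = len(strengths)
    wins = [0] * n
    order = list(range(n))
    for _day in range(games):
        rng.shuffle(order)
        for i in range(0, n, 2):
            a, b = order[i], order[i + 1]
            p_a = min(1.0, strengths[a] / (2.0 * strengths[b]))
            wins[a if rng.random() < p_a else b] += 1
    return wins

rng = random.Random(42)
average_team_wins = []
for _season in range(100):
    # Team 0 is the average team; the rest get normal strengths with CV 0.2.
    others = [max(1.0, rng.gauss(81.0, 0.2 * 81.0)) for _team in range(29)]
    season_wins = simulate_season([81.0] + others, rng)
    average_team_wins.append(season_wins[0])

print(min(average_team_wins), max(average_team_wins))
```

The spread between the minimum and maximum is the luck band for a team whose true strength never changes.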
I then repeated the simulation, but instead of choosing normally distributed strengths, I chose them from a uniform distribution on an interval. I set the interval width such that the standard deviation of the uniform distribution matched that of the normal distribution used previously.
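The width matching can be worked out directly, since a uniform distribution of width w has a standard deviation of w / sqrt(12). The short Python sketch below assumes the same mean of 81 and standard deviation of 0.2 * 81 = 16.2 as in the normal case; the variable names are illustrative only. It gives an interval of roughly 53 to 109 wins.

```python
import math
import random
import statistics

MEAN_STRENGTH = 81.0
STRENGTH_SD = 0.2 * MEAN_STRENGTH  # 16.2 wins, same as the normal case

# SD of a uniform distribution is width / sqrt(12), so half the width is SD * sqrt(3).
half_width = STRENGTH_SD * math.sqrt(3)
low, high = MEAN_STRENGTH - half_width, MEAN_STRENGTH + half_width

# Sanity check by sampling: the sample SD should land close to 16.2.
rng = random.Random(0)
sample = [rng.uniform(low, high) for _ in range(20_000)]
sample_sd = statistics.pstdev(sample)
print(f"interval: {low:.1f} to {high:.1f}, sample SD: {sample_sd:.2f}")
```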
Uniformly random draw of team strengths and the resulting actual wins and "luck ratio" versus team strength
In this simulation, the scatter was a little smaller, as might be expected. A team of average true strength (81 wins expected) got between maybe 68 and 90 wins over the 100 simulated seasons. It looks like the actual/expected plot shows the same narrowing of the scatter as team strength increases, but it's hard to say.
When all the strengths are set equal to 81 (average), the outcome of each game is essentially decided by a coin flip. If a team won more than 81 games, it would solely be due to luck. In this case I found that on average, the winningest team had 90-95 wins per season, which is a very solid year. This would suggest, for instance, that probably every season, one of the division champions is a complete fluke. It took a large number of seasons (more than 10,000) to get a stable value for this number and I didn't have the patience to narrow it down further. The type of distribution used for team strengths didn't seem to matter.
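The coin-flip case is simple enough to sketch on its own. The Python below is an illustration under the same assumed pairing scheme (shuffle and pair off 30 teams each game day), not the original code, and it runs only 200 seasons to keep the run time short, so its average will be noisier than a 10,000-season run.

```python
import random

NUM_TEAMS = 30
GAMES_PER_SEASON = 162

def best_record_all_equal(rng):
    """Simulate one season of pure coin-flip games and return the top win total."""
    wins = [0] * NUM_TEAMS
    teams = list(range(NUM_TEAMS))
    for _ in range(GAMES_PER_SEASON):
        rng.shuffle(teams)
        for i in range(0, NUM_TEAMS, 2):
            # Equal strengths make the threshold 81 / (2 * 81) = 0.5.
            winner = teams[i] if rng.random() < 0.5 else teams[i + 1]
            wins[winner] += 1
    return max(wins)

def average_best_record(num_seasons, seed=0):
    """Average the league-best win total over many simulated seasons."""
    rng = random.Random(seed)
    total = sum(best_record_all_equal(rng) for _ in range(num_seasons))
    return total / num_seasons

average_best = average_best_record(200)
print(f"Average best record over 200 seasons: {average_best:.1f} wins")
```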
You could do all kinds of things with this simulation - and I'm sure serious gamblers do. For example, you could estimate the likelihood of a 10-game winning streak and then try to find someone to bet against who underestimated the true odds. With a lot of bets like that, I suspect you could make money consistently. But that's a suspicion I probably shouldn't pursue until my kids are out of college.
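As a starting point for the streak idea, here is a hedged Python sketch that estimates the chance of at least one 10-game winning streak in a 162-game season. It makes a simplifying assumption that is not in the simulation above: every game is an independent draw with the same win probability, rather than depending on the opponent. The function names and the number of trials are arbitrary choices.

```python
import random

def has_streak(results, length):
    """Return True if the sequence of win/loss results contains `length` wins in a row."""
    run = 0
    for won in results:
        run = run + 1 if won else 0
        if run >= length:
            return True
    return False

def streak_probability(win_prob, streak_len=10, games=162, trials=4000, seed=0):
    """Monte Carlo estimate of the chance of at least one streak in a season."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        season = (rng.random() < win_prob for _ in range(games))
        if has_streak(season, streak_len):
            hits += 1
    return hits / trials

p_average = streak_probability(81 / 162)   # an average team
p_strong = streak_probability(100 / 162)   # a 100-win team
print(f"average team: {p_average:.3f}, 100-win team: {p_strong:.3f}")
```

The strong team should come out far more likely to put together a streak than the average team, which is the kind of gap a bettor would be looking for.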
Saturday, August 26, 2017
The Great American Eclipse
Yes! We went to the path of totality for the great eclipse of 2017. This trip was five years in the making, but we only seriously
started planning about a year ago. Our first idea was to drive to the nearest
location of totality, which would have been in maybe Kentucky or Tennessee. But
the humidity of the eastern US made me worry about cloud cover, so we decided
to go west. As it turned out, most places in the East had good weather on eclipse day, but I have no regrets about the extra travel to get west. We figured that if we were going to take time off work and shell out for a trip, it was worth a few dollars more to maximize the chance of a good view.
It came down to either eastern Oregon or Wyoming. Eastern Oregon had slightly better weather odds, but it looked hard to get to without a whole lot of driving. So we went with Wyoming, specifically eastern Wyoming, away from the mountains, which tend to generate and trap clouds. We could get there in about three hours from the Denver airport, which is easily reachable for us on Southwest.
We ended up staying in Guernsey, Wyoming, in a very nice, new hotel at reasonable rates. But that took some doing, and it’s a good thing we planned ahead. Shortly before the eclipse, rooms were going for $500-$1,000 a night with multi-day minimum stays, and rental cars at the Denver airport were $1,500 a day. That is not an exaggeration. There were thousands of tents in campgrounds and on ranchland, all along the roads. One place in Guernsey was asking $150 for a parking space (overnight, I assume).
Tents pitched in a grassy area between a hotel parking lot and the North Platte River on the morning of the eclipse (Guernsey, Wyoming)
The National Weather Service had a cloud cover
prediction that was updated twice a day or so. Guernsey was well within the path
of totality, but 48 hours before the eclipse, the NWS was calling for over 30% cover
versus 5% in Casper, about an hour and a half west. So we planned on getting up
really early on Monday and driving up to Casper. But on Sunday night the
prediction changed, to 15% around Guernsey and 25% in Casper, but with a wide
band of 5% in between. We decided to go to Glendo State Park, in the 5% zone, keeping open the
option of moving around if there was a reason to. It's a good thing we didn't try to go to Casper, because we'd never have made it. We would have watched the eclipse from our car on the side of I-25. By 8 a.m., I-25 west of Glendo was at a standstill, filled with people from Denver who'd left at 3 or 4 a.m. That was too late; we talked to people who'd left at 2 a.m. and they'd reached their viewing spots just as the traffic was locking up. People who didn't leave Denver until 7 a.m. never made it out of Colorado.
Local totality time was 11:45 a.m. When we pulled out of Guernsey at 7 a.m., the roads were moving well. There was some congestion near the park, but no real line at the ticket booth (the state of Wyoming charged a very reasonable $6 a car, compared to some private viewing sites in Casper that were asking $50 or more).
View from Wyoming Route 319 just south of Glendo, parallel to I-25, which is where the line of traffic is sitting. This was about 7:30 am.
Glendo State Park was set up really well – I’d estimate there were 5,000 cars there and space was available for many more, although there was not enough road to accommodate entry and exit (more on that later). Several college astronomy departments had tents up, and the University of Wyoming gave us free t-shirts. Some people had large (10-inch or bigger) fancy-looking telescopes. We parked and walked over to a pavilion where we struck up a conversation with a guy from Italy. He invited us to observe some sunspots using filtered binoculars on a tripod. We also met a space weather specialist from the Johns Hopkins Applied Physics Lab and I had a long talk with her about the ins and outs of government-funded research. But most importantly – there was not a cloud in the sky!
Eclipse watchers near Bennett Hill, Glendo State Park. The tripod holds the filtered binoculars of our new Italian friend.
Panoramic view at the foot of Bennett Hill
Near the pavilion was flat-topped, rocky Bennett Hill with a path leading up. We trekked up the hill around 10:00, and found about 200 people at the top, some in folding chairs, some standing around, and some sitting on the bare ground. The view from up there was tremendous, eclipse or none – it must have been 50 miles from horizon to treeless horizon. The atmosphere was very slightly hazy from some wildfires several states away, but that was only noticeable along the ground. The overhead sky was clear and blue. We had heard that if you have a wide enough view, you can see the shadow of totality advancing across the ground at something like twice the speed of sound. I venture to say that the only way to improve on this viewing spot would have been to go airborne, which some people did in a helicopter and a hot-air balloon we saw overhead. Many videos taken from the hill can be found on YouTube.
View from atop Bennett Hill
On Bennett Hill, pre-eclipse
Eclipse watchers on Bennett Hill, Glendo State Park, Wyoming
The boys bought these t-shirts at the Casper Eclipse Festival the day before.
One lady had a big white sheet spread on the ground to
capture the shadow bands that are supposed to happen just before totality. But
the buildup to totality had us all a little bored. You could see the moon
covering the sun using eclipse glasses, but it was so clear and sunny that it
didn’t get noticeably darker until about 15 minutes before totality. We listened
to “Brain Damage/Eclipse” by Pink Floyd, just as I'd planned it five years ago. Then the light became…the
only word is unworldly. An antelope was spotted near the top of the hill within
a couple of minutes of totality, and it drew everyone’s attention. I wanted to
yell out, “Forget the antelope!”
Totality came on very suddenly, which is characteristic of
being right in the middle of the path, which we were. It happened too fast to
look for shadow bands, and I didn’t see the approaching shadow or any Baily’s Beads. (In fact, I'm skeptical that the approach of a distinct shadow is ever visible, because we didn't see it under these near-perfect conditions.) The temperature dropped but I didn’t notice any change in the wind. The
sky darkened as if someone had quickly turned down the knob on an adjustable
room light. The stars came out, and the horizon took on the pink-orange of a
sunset at all azimuths, not just in the west. We had rehearsed the taking of a
single picture of my boys and me with the corona in the background, and got
that out of the way quickly. Then we
just looked at the corona.
I can only partially capture it in words. There was an illusion of the sun only being a few thousand feet high. There were three very long white streamers from the corona, much longer than you see in pictures. A high airplane crossed the corona, leaving a faint contrail. The corona looked like a bright, white, round fire with a perfectly circular hole in the middle of it. The boundary between the moon's disk (the dark circle in the center) and the corona was very slightly dynamic, not like a flame. The corona streamers were stable. It could easily be viewed without eclipse glasses; the brightness was not harsh on the eyes. I could have watched it for hours, but of course it ended after two and a half minutes. Then we were treated to a very distinct “diamond ring” before the sun’s photosphere was uncovered again, and the lights went back on. There was an artificial quality to it – like a very high-quality planetarium show, only it covered the entire goddamned sky. It is no exaggeration to say that a Siberian tiger could have waltzed through the crowd during totality and nobody would have noticed.
After totality, people started down from the hill. The rest
of the eclipse was anticlimactic and only the real astronomy buffs continued to
observe it. Everything had gone perfectly, just as planned, up to this point.
Then…
It took about half an hour to walk back to our car, and I
foolishly started the engine as if we were going to just drive off. But the
cars were at a standstill on the only road out. So I turned off the engine and
we waited another half hour. The next two hours were short periods of driving
down the exit road interrupted by long periods of standstill waiting. It was only about 80 degrees, but inside the rental car it got hot quickly with the engine and A/C
off. The fun was only starting.
We intended to go west on I-25, then cut south at Casper to
Independence Rock and thence on to Steamboat Springs, Colorado, normally about
a five-hour drive. But we didn’t even exit the park onto Wyoming Route 319 for
two hours. We were moving so slowly that we were able to get out and visit the
porta-johns between movements of the vehicle line. Kids were selling water and
popsicles from wagons, and they were moving a lot faster than we were. Once on
the road, nothing sped up. There was another line of jammed cars coming in the
opposite direction, which we soon figured out were eclipse viewers leaving from
Casper who had hit a huge traffic jam on I-25 south and had exited thinking the
state route would be faster. There were trucks off-roading it, driving on the
dirt path along the railroad that ran between I-25 and the state highway. People
were hanging out windows and sunroofs, sitting on top of campers, and walking
along the roadside. I got out and walked for a while myself, to stretch my
back. Usually it was the car that had to catch up to me, not the other way
around. It was like the traffic jam scene from Woodstock.
Three hours and fifteen miles later we came to US 20, which cut us over to I-25 north. At the intersection of Wyoming 319 and US 20 there was a stop sign, with nobody directing traffic. There must have been three or four thousand cars in that line of traffic, and every one of them was stopping at the stop sign. Assume the stop takes five seconds, multiply by 3,000, and you get 15,000 seconds, or more than four hours for the last car in line. You quickly understand the cause of the delay.
When we finally got onto I-25, we could see stopped traffic on the southbound side stretching for twenty or thirty miles. It was the biggest traffic jam I’ve ever seen, and I used to live in Los Angeles. I-25 north was clear, but our plan of going to Steamboat Springs was in the trash. We made it to Casper by 6 p.m., having spent seven hours in the car already, and called it quits. There was no way I was going to drive hundreds more miles of unlit two-lane Wyoming state highway after that much car fatigue.
The only problem was, we had no room reserved in Casper and
there was no possible way to get south back towards Cheyenne with the traffic. There were only three towns of any size between Glendo and Casper, and they didn't look like they had any hotels. The distances in the western Great Plains are orders of magnitude longer than in other parts of
the country, and there can be sixty miles between cities that have even basic
services.
With the eclipse crowd not completely out of Casper, we
ended up paying $250 for a smoking room at a low-end Days Inn. It may well have
been the last room in town. We also had to forfeit a night’s room charge in
Steamboat Springs because it was a nonrefundable reservation. We headed to
Independence Rock and Colorado the next day, but the traffic cost us an entire day out of our vacation.
Verdict: Worth every penny and every iota of hassle. People
have lived their whole lives without seeing a total solar eclipse and that's
almost tragic.
Lessons learned: Stay near a big city if the path permits it. They’re set up to accommodate hundreds of thousands of tourists; Wyoming isn’t. You can keep your location flexible, to avoid cloud cover, until the day of the event, but don’t expect to be mobile on eclipse day. You're going to have to just hunker down and hope the sky is clear. Thus the importance of getting to an area with good overall weather odds. (If everyone in a large city ever had to leave suddenly due to some kind of calamity, and there was no special traffic pattern set up, the scene would be very, very bad. I have new respect for the people who do this type of disaster planning.) Reserve your room and car at least a year in advance, and try not to tack on side trips for a couple of days on either side of the eclipse. But most of all…do it if you possibly can.