
Ballpark figures: Analyzing MLB baseball attendance

May 15, 2022


It is springtime in the U.S., which means something as American as apple pie is back: baseball. With so much good data surrounding one of the country’s great pastimes, we decided to use this week’s post to look at Major League Baseball (MLB) attendance statistics from the last 20 years. These statistics are published on many websites, including ESPN.com, the source of the data in the charts below.

To collect the attendance data from ESPN, we used Jupyter Workspaces (currently in beta in Domo) and the Python package Beautiful Soup to parse the HTML. And since Domo can now run code in Jupyter Workspaces on a regular schedule, this page will continue to update as the 2022 data comes in.
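As a rough illustration of the parsing step, here is a minimal sketch that pulls team names and average attendance out of an HTML table. It makes two loud assumptions: the sample markup is hypothetical (the real ESPN page layout is not shown in this post and may differ), and it uses Python's built-in `html.parser` rather than Beautiful Soup so that it runs without extra packages. Beautiful Soup would do the same job with less code. The names `TableRows`, `SAMPLE_HTML`, and `parse_attendance` are invented for this example.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


# Hypothetical sample markup; the real page structure is an assumption.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>Avg</th></tr>
  <tr><td>Team A</td><td>40,123</td></tr>
  <tr><td>Team B</td><td>18,456</td></tr>
</table>
"""


def parse_attendance(html):
    """Return one dict per team row, skipping the header row."""
    parser = TableRows()
    parser.feed(html)
    _header, *body = parser.rows
    return [
        {"team": team, "avg_attendance": int(avg.replace(",", ""))}
        for team, avg in body
    ]


records = parse_attendance(SAMPLE_HTML)
print(records)
```

In a scheduled notebook, the same function would be fed the downloaded page text instead of the sample string.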

The first thing you’ll probably notice when looking at the data is that 2020 is missing. That’s because, due to the pandemic, baseball was played without fans that year. There was a bit of a return to normalcy in 2021, but it wasn’t until this season that all spectating restrictions were lifted, so it will be interesting to watch how attendance rebounds. (In full transparency, we only have full-year data right now, so we are not capturing anything related to seasonality, such as how weather or a team’s place in the playoff race affects ticket sales.)

One good way to review this data is with an old favorite of many data scientists: a box and whisker plot. The whiskers (the top and bottom lines) show the minimum and maximum average attendance for each team. I have sorted the chart so that the team with the highest peak attendance year is on the left and the team with the lowest is on the right:

Where the visualization gets more interesting for me is with the box elements. Each box shows the space between the 25th and 75th percentiles, which is meant to reflect how much a team’s attendance has swung over the years. The bigger boxes tell me those teams (such as Philadelphia and Detroit) have had some great years for attendance and some not so great years. Smaller boxes (such as Boston) say that a team has been very consistent in its attendance numbers. We have also filtered the chart to pre-pandemic years only, since 2021 (and, to a lesser extent, partial 2022 data) skews the results.
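To make the box and whisker numbers concrete, here is a small sketch that computes them for two made-up teams. The attendance figures are hypothetical, not taken from the ESPN data, and the percentile method (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption; the charting tool may interpolate percentiles differently.

```python
from statistics import quantiles


def box_stats(values):
    """Five-number summary plus the box height (interquartile range)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),      # bottom whisker
        "q1": q1,                # bottom of the box (25th percentile)
        "median": median,
        "q3": q3,                # top of the box (75th percentile)
        "max": max(values),      # top whisker
        "iqr": q3 - q1,          # bigger value means a taller box
    }


# Hypothetical yearly average attendance for two invented teams.
steady_team = [35000, 36000, 36500, 37000, 37500]
swingy_team = [20000, 25000, 32000, 40000, 45000]

steady = box_stats(steady_team)
swingy = box_stats(swingy_team)
print(steady["iqr"], swingy["iqr"])
```

The team whose attendance swings more ends up with the larger `iqr`, which is exactly what a taller box signals in the chart.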

An alternative approach to understanding how teams rank in attendance is to create indexes of where a team’s attendance stands relative to the total MLB average—which is what we’ve done directly below. Dark blue boxes mean that a team is well above the average, while dark orange boxes mean that a team is well below the average. You can use the filters to look at whatever league, division, team(s), or year(s) you’re interested in:
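The idea of an index can be sketched in a few lines. This example assumes a base-100 scale (100 means exactly average), which is a common convention but is not stated in the post, and it uses hypothetical team names and attendance values.

```python
def index_to_average(attendance_by_team):
    """Express each team's attendance as a percentage of the overall average."""
    overall_avg = sum(attendance_by_team.values()) / len(attendance_by_team)
    return {
        team: round(100 * value / overall_avg, 1)
        for team, value in attendance_by_team.items()
    }


# Hypothetical average attendance per team for a single season.
season = {"Team A": 40000, "Team B": 30000, "Team C": 20000}

indexes = index_to_average(season)
print(indexes)
```

Values above 100 would map to the blue end of the color scale and values below 100 to the orange end.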

Long-time Domo users may be looking at these indexes and thinking that I did some pre-calculation in a Magic ETL or a Dataset View. It’s true that doing calculations on such total levels typically requires pre-calculation. But if I did that, it would be hard to allow for the year filter. So, the secret is out: With Domo’s new FIXED beast modes (currently in beta), you can do FIXED level of detail functions right in a beast mode. For the above “Index to League Avg”, this is the calculation:

You can see there are two things happening here. First, the SUM is FIXED by League, so it sums across all rows that share the same league as the current row. That gives me the league total we need for the denominator of the index. Second, it uses FILTER ALLOW to tell Domo that filters on Year can affect the FIXED function. The available options are FILTER ALLOW, FILTER DENY, and FILTER NONE.
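For readers who think in code, the behavior described above can be emulated in plain Python. This is not Domo syntax and not the post's actual calculation; it is a sketch of the semantics under stated assumptions: the rows are hypothetical, and the function name `share_of_league` is invented here. The year filter is applied before the per-league totals are computed, which is what allowing the filter amounts to.

```python
from collections import defaultdict

# Hypothetical rows: one per team per year.
ROWS = [
    {"team": "Team A", "league": "AL", "year": 2018, "attendance": 30000},
    {"team": "Team B", "league": "AL", "year": 2018, "attendance": 20000},
    {"team": "Team C", "league": "NL", "year": 2018, "attendance": 40000},
    {"team": "Team A", "league": "AL", "year": 2019, "attendance": 10000},
    {"team": "Team B", "league": "AL", "year": 2019, "attendance": 30000},
    {"team": "Team C", "league": "NL", "year": 2019, "attendance": 40000},
]


def share_of_league(rows, years=None):
    """Attach a per-league total to every row, honoring the year filter."""
    # The year filter is allowed to shrink the rows feeding the totals.
    visible = [r for r in rows if years is None or r["year"] in years]

    # Sum "fixed" by league: every row in a league sees the same total.
    league_total = defaultdict(int)
    for r in visible:
        league_total[r["league"]] += r["attendance"]

    return [
        {
            **r,
            "league_total": league_total[r["league"]],
            "share": r["attendance"] / league_total[r["league"]],
        }
        for r in visible
    ]


only_2018 = share_of_league(ROWS, years={2018})
all_years = share_of_league(ROWS)
print(only_2018[0]["league_total"], all_years[0]["league_total"])
```

With the filter set to 2018, the AL total covers only that year; with no filter, it covers both years. A pre-calculated total could not react to the filter this way.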

Here’s one last example of how useful FIXED with FILTER DENY can be. The bar charts below default to the New York Yankees (my boss’ favorite team). The first chart does not use FIXED, so when I filter for the Yankees, the Min, Max, and Median fields become meaningless: they are filtered down to the selected team and all show its value. The second chart uses FIXED with FILTER DENY on team name, so the Min, Max, and Median ignore the team filter and remain useful references for the main average, which is for the Yankees.
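The difference between the two charts can be sketched in Python as well. Again, this emulates the behavior rather than reproducing Domo's syntax; the team names and values are hypothetical, and `summary` with its `deny_team_filter` flag is an invented helper.

```python
from statistics import median

# Hypothetical average attendance per team.
AVG_BY_TEAM = {"Team A": 42000, "Team B": 30000, "Team C": 15000}


def summary(avg_by_team, selected, deny_team_filter):
    """Return the selected team's average plus min/median/max references."""
    visible = {t: v for t, v in avg_by_team.items() if t == selected}

    # With the filter denied, the references are computed over all teams;
    # otherwise they only see the rows left after filtering.
    source = avg_by_team if deny_team_filter else visible
    refs = list(source.values())

    return {
        "selected_avg": visible[selected],
        "min": min(refs),
        "median": median(refs),
        "max": max(refs),
    }


without_fixed = summary(AVG_BY_TEAM, "Team A", deny_team_filter=False)
with_deny = summary(AVG_BY_TEAM, "Team A", deny_team_filter=True)
print(without_fixed)
print(with_deny)
```

In the first result all four numbers collapse to the same value, while in the second the min, median, and max still describe the whole set of teams.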

One of the things I love—and also at times find maddening—about exploring new data is that there is always more to explore. As I worked on this post, I realized that it would be quite interesting to bring in teams’ win/loss records as well as information on stadium capacity. But then I thought: Let’s maybe save that for a future post.





