It’s supposed to be hard. If it wasn’t hard, everyone would do it. The hard…is what makes it great.
Tom Hanks’ inspiring words in the award-winning 1992 film “A League of Their Own” perfectly portray the appeal of baseball to so many Americans. A complicated game blending technique, physicality, and mental capacity, baseball makes consistent success difficult to achieve. Reaching the collegiate and professional levels is harder still because of roster restrictions: teams must recruit and acquire players shrewdly to maximize roster talent in hopes of on-field success. However, the process of finding such players can be extremely time-consuming and inefficient, with many miles put on cars and many charges put on credit cards.
One way to maximize productivity in scouting and recruiting might be for teams to set their sights on specific geographical areas that produce an abundance of talented players. Narrowing the scouting and recruiting focus geographically can allow for more cost-effective use of both time and financial resources. Part one of this two-part series examines potential geospatial factors that affect an area’s production of baseball players who play beyond the high school level.
Hypotheses
Research has identified low income as a barrier to sports participation (Offord, et al., 1998; Kremarik, 2000; Kokolakakis, et al., 2014). Building on this finding, higher median per-capita income is expected to positively impact baseball participation within a county.
H1: As median per-capita income increases, the number of college and professional players provided by that region increases.
A 2000 General Social Survey analysis finds no significant difference in participation based on family structure, with similar participation rates among children living in two-parent versus one-parent families (Kremarik, 2000). However, a later analysis of a similar Canadian survey finds that children from single-parent families have lower odds of participating in organized sport than those from traditional families (McMillan, et al., 2016). This might imply that children from married-couple households have a higher likelihood of participating in sports. Given these findings, married household percentage is expected to positively impact baseball participation for a given region.
H2: As marriage percentage increases, the number of college and professional players provided by that region increases.
In theory, a higher county population might indicate a larger pool of potential athletes. Following this logic, county population is expected to positively correlate with baseball participation within the county.
H3: As population increases, the number of college and professional players provided by that region increases.
Data
To test these hypotheses, data was pulled from two sources: The Baseball Cube (TBC) and the Census Bureau’s American Community Survey (ACS). TBC’s “High School Baseball” section contains information on over 109,000 players from over 13,000 high schools across the United States. This study focuses on players who graduated from high school between 2009 and 2019, in an attempt to capture only “modern” trends or indicators of participation. This filtering reduced the dataset to 38,489 player records.
The second set of data came from the Census Bureau’s ACS, an ongoing survey that provides a broad range of economic and demographic data on the U.S. population. Data was collected from the 5-year ACS at the county level to obtain family household counts, married household counts, per-capita median income, and population size.
Using the R statistical package, the two datasets (TBC and ACS) were joined to obtain, for each county with at least one baseball participant, the number of players who participated past the high school level, along with marriage percentage (married household count divided by total family household count), per-capita median income, and population size. The merged dataset also included each player’s height and weight, to be used as controls in the study, as well as the highest level each player reached in baseball.
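The county-level merge described above can be sketched roughly as follows. The study itself used R; this is a Python illustration only, and the county identifiers, field names, and every number in it are hypothetical placeholders rather than values from TBC or the ACS. It also assumes that the two sources share a common county key, which the article does not specify.

```python
from collections import Counter

# Hypothetical player records: one row per player, keyed by a county id.
players = [
    {"name": "Player 1", "county_id": "A"},
    {"name": "Player 2", "county_id": "A"},
    {"name": "Player 3", "county_id": "B"},
]

# Hypothetical ACS county rows (all values are made up for illustration).
acs = {
    "A": {"family_households": 160000, "married_households": 108000,
          "per_capita_income": 31000, "population": 660000},
    "B": {"family_households": 230000, "married_households": 150000,
          "per_capita_income": 45000, "population": 1050000},
    "C": {"family_households": 5000, "married_households": 4000,
          "per_capita_income": 25000, "population": 20000},
}


def build_county_table(players, acs):
    """Count players per county and attach the ACS measures for that county."""
    counts = Counter(p["county_id"] for p in players)
    table = []
    for county_id, n_players in sorted(counts.items()):
        row = acs.get(county_id)
        if row is None:
            continue  # no matching ACS row, so the county is dropped
        table.append({
            "county_id": county_id,
            "players": n_players,
            # Married household count divided by total family household count.
            "marriage_pct": 100.0 * row["married_households"] / row["family_households"],
            "per_capita_income": row["per_capita_income"],
            "population": row["population"],
        })
    return table


county_table = build_county_table(players, acs)
for row in county_table:
    print(row)
```

County "C" has no players in the toy data, so it does not appear in the result, mirroring the restriction to counties with at least one baseball participant.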
Results
Using Ordinary Least Squares (OLS) regression analysis, the number of successful baseball players produced at the county level was regressed on the independent variables (median income, population, and marriage percentage). The table below illustrates the OLS regression results. The critical t-value (1.645) is based on a one-sided t-test at the 5% significance level with 1,715 degrees of freedom. The adjusted R-squared is 0.7007, which means that, after adjusting for the number of predictors, the model explains about 70.07% of the variance in county-level player counts.
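For readers who want to check the test setup, the short Python sketch below (not the author's R code) shows where a one-sided 5% critical value of about 1.645 comes from and how adjusted R-squared is computed from ordinary R-squared. It uses the standard normal distribution as an approximation to the t distribution, which is reasonable with 1,715 degrees of freedom. The function names are illustrative, and the inputs in the test cases are arbitrary rather than taken from the study.

```python
from statistics import NormalDist


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept not counted in k)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def is_significant_one_sided(t_stat, critical=1.645):
    """One-sided test for a positive effect: reject H0 when t exceeds the critical value."""
    return t_stat > critical


# Normal approximation to the one-sided 5% critical value. With 1,715 degrees
# of freedom the t distribution is very close to the standard normal.
z_crit = NormalDist().inv_cdf(0.95)
print(round(z_crit, 3))
```

A coefficient supports its hypothesis in this setup when its t-statistic is larger than the critical value, since each hypothesis predicts a positive effect.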
In accordance with H1, median per-capita income was found to be significantly positively associated with baseball participation past the high school level. For each $1,724 increase in per-capita median income, one additional successful baseball participant is expected for that county. This result is consistent with the aforementioned research that posits lower income as a barrier to sport participation (Offord, et al., 1998; Kremarik, 2000; Kokolakakis, et al., 2014).
Turning to H2, marriage percentage was found to be significantly positively related to baseball participation beyond the high school level. For each increase in marriage percentage of approximately 4%, one additional successful baseball participant is expected for that county. This result contradicts Kremarik’s (2000) finding that household structure has no significant impact on sport participation. However, it agrees with the finding of McMillan, et al. (2016) that children from single-parent families have lower odds of participating in organized sport than those from two-parent households.
As for H3, population was found to have a positive relationship with baseball participation past the high school level. For each increase in county population of around 8,333 people, one additional baseball participant is expected for that county. This follows prevailing logic, as a larger population should indicate a larger talent pool.
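The three "increase per additional player" figures can be read as reciprocals of the regression slopes. The Python sketch below makes that arithmetic explicit under two assumptions that are not stated in the article: that the reported 4% means four percentage points of marriage percentage, and that the effects add up linearly with the other variables held fixed. The comparison between two counties is hypothetical.

```python
# Increase in each predictor associated with one additional player, as reported in the text.
UNITS_PER_PLAYER = {
    "per_capita_income": 1724.0,  # dollars
    "marriage_pct": 4.0,          # assumed to be percentage points (approximate)
    "population": 8333.0,         # people (approximate)
}


def implied_slopes(units_per_player):
    """Slope = expected players per one-unit increase = 1 / (units per player)."""
    return {name: 1.0 / units for name, units in units_per_player.items()}


def expected_player_difference(deltas, slopes):
    """Expected difference in player count for the given predictor differences."""
    return sum(slopes[name] * delta for name, delta in deltas.items())


slopes = implied_slopes(UNITS_PER_PLAYER)

# Hypothetical comparison: one county has $3,448 more per-capita income,
# 2 more percentage points of married households, and 16,666 more residents.
deltas = {"per_capita_income": 3448.0, "marriage_pct": 2.0, "population": 16666.0}
difference = expected_player_difference(deltas, slopes)
print(difference)
```

In this made-up comparison the income and population gaps each contribute about two players and the marriage gap about half a player, for roughly 4.5 additional expected players in total.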
Conclusion
Player evaluation is a complicated process. Many factors, both physical and mental, go into recruiting and scouting the right player for a program. Finding that county per-capita median income, population, and marriage percentage have a significant positive impact on the number of collegiate and professional players produced in a given area might help add another piece to the ever-evolving puzzle of consistently acquiring baseball talent at the collegiate and professional levels.
To get the best return on their efforts to recruit and acquire talented baseball players, teams might focus their scouting and recruiting on wealthier and more populous counties. These results might also portray baseball as a sport for the wealthier or more fortunate. Continued efforts to grow the sport in less-affluent areas, in addition to encouraging increased general interest, might lead talented but underprivileged youth to choose baseball over basketball or football. Major League Baseball’s Reviving Baseball in Inner Cities (RBI) program seems to be an important and logical step in that direction.
About the Author
Mathew Bennett is currently an MBA student and baseball player at Samford University. After completing his MBA, Mathew hopes to work in analytics in some capacity within the sports industry.
Twitter: @mattyice_126
LinkedIn: https://www.linkedin.com/in/mathew-bennett-118714131/ or "Mathew Bennett"
References
Kremarik, Frances. 2000. “A family affair: Children’s participation in sports”. Statistics Canada Catalogue 11-008, 20-24.
Kokolakakis, Themis., Lera-Lopez, Fernando & Castellanos, Pablo. May 2014. “Regional differences in sport participation: the case of local authorities in England”. International Journal of Sport Finance Vol. 9, Issue 2.
McMillan, Rachel., McIsaac, Michael. & Janssen, Ian. 2016. “Family Structure as a Correlate of Organized Sport Participation among Youth”. PLoS One. Vol 11, Issue 2.
Offord, D., E. Lipman & E. Duku. 1998. “Sports, The Arts and Community Programs: Rates and Correlates of Participation”. Ottawa: Human Resources Development Canada. 19.