Research Paper
Analyzing Team Performance and Winning Trends in Baseball
A comprehensive statistical analysis of MLB data (1990-2023) using multiple linear regression to identify the most significant offensive and defensive predictors of team success.
Multiple Regression, R (ggplot2), Welch's t-test, Lahman Database, Sports Analytics
- Offensive R²: 0.59 (model fit)
- Runs coefficient: +0.108 wins per run
- Teams analyzed: 30 (1990-2023)
- Top finding: offense outperforms defense
Regression Coefficients
Horizontal bar chart displaying the coefficient values for the significant predictors. Note the unexpected negative coefficients for Home Runs and Triples.
Key Interpretations
- Runs Scored (0.1076): Strongest predictor. Each additional run scored raises expected wins by about 0.11, holding the other predictors constant.
- Walks Allowed (0.1189): Positive coefficient, suggesting a strategic component to walks in defensive schemes.
- Home Runs (-0.0315): The unexpected negative coefficient suggests that relying solely on power hitting (without base runners) may yield diminishing returns compared with a balanced offense.
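To make the coefficient readings above concrete, here is a minimal Python sketch of the arithmetic. The coefficient values are the ones reported on this page. The function names and the 50-run scenario are illustrative assumptions, not part of the paper, and the paper's own analysis was done in R.

```python
# Coefficients as reported on this page (predicted wins per one-unit change).
COEFFICIENTS = {
    "runs_scored": 0.1076,
    "walks_allowed": 0.1189,
    "home_runs": -0.0315,
}


def expected_win_change(stat: str, delta: float) -> float:
    """Change in predicted wins when `stat` moves by `delta`,
    with every other predictor in the model held fixed."""
    return COEFFICIENTS[stat] * delta


def runs_per_win() -> float:
    """Additional runs needed to add one predicted win."""
    return 1.0 / COEFFICIENTS["runs_scored"]


# Hypothetical scenario: a team scores 50 more runs over a season.
print(round(expected_win_change("runs_scored", 50), 2))  # 5.38
print(round(runs_per_win(), 1))                          # 9.3
```

Read this way, the runs coefficient implies roughly nine to ten extra runs per additional predicted win.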
Paper Details
Author: Assaf Bitton
Methodology: Multiple Linear Regression, LOESS Smoothing, Welch's t-test
Dataset: Lahman "Teams" Database
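Of the methods listed above, Welch's t-test is the easiest to sketch in a few lines. The Python example below computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. The two samples are made-up win totals for hypothetical groups, and this page does not say which groups the paper actually compared, so treat it as an illustration of the technique rather than a reproduction of the analysis.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on samples a and b."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up win totals for two hypothetical groups of team-seasons.
group_a = [95, 88, 92, 101, 85]
group_b = [78, 82, 75, 90, 80]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

Unlike Student's t-test, this version does not pool the two variances, which makes it the safer choice when the groups differ in spread or size.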
Regression Stats
- Offensive R²: 0.591
- Defensive R²: 0.539
- Key Predictors (p < 0.05): Runs, Walks, Errors, Stolen Bases
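An R² of 0.591 means the offensive model accounts for about 59% of the variance in team wins. For readers unfamiliar with the statistic, here is a minimal Python sketch of how R² is computed from actual and predicted values. The win totals are invented for illustration and do not come from the Lahman data or the paper's models.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total sum of squares: variation around the mean of the actual values.
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Made-up season win totals and model predictions, for illustration only.
actual_wins = [95, 81, 70, 88, 76]
predicted_wins = [90, 84, 75, 85, 80]

print(round(r_squared(actual_wins, predicted_wins), 3))  # 0.782
```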