Research Paper

Analyzing Team Performance and Winning Trends in Baseball

A comprehensive statistical analysis of MLB data (1990-2023) using multiple linear regression to identify the most significant offensive and defensive predictors of team success.

Multiple Regression, R (ggplot2), Welch's t-test, Lahman Database, Sports Analytics

Offensive R²: 0.59 (model fit)

Runs Coefficient: +0.108 (wins per run)

Teams Analyzed: 30 (1990-2023)

Top Finding: Offense outperforms defense

Regression Coefficients

Horizontal bar chart of the coefficient values for the significant predictors. Note the surprising negative coefficients for Home Runs and Triples.

Key Interpretations

  • Runs Scored (0.1076): Strongest predictor. Each additional run scored raises expected wins by ~0.11, holding the other predictors constant.
  • Walks Allowed (0.1189): Positive coefficient, which may point to a strategic component to walks in defensive schemes.
  • Home Runs (-0.0315): Unexpected negative coefficient. It suggests that relying solely on power hitting (without base runners) may yield diminishing returns compared with a balanced offense.
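
To make the Runs Scored coefficient concrete, here is a minimal Python sketch of the arithmetic (the paper itself used R). It relies only on the 0.1076 value reported above. The function name `expected_win_change` and the 10- and 50-run scenarios are illustrative assumptions, and the calculation assumes every other predictor is held fixed.

```python
# Coefficient reported above for Runs Scored (wins per additional run).
RUNS_COEF = 0.1076


def expected_win_change(extra_runs: float, coef: float = RUNS_COEF) -> float:
    """Change in expected wins for a change in runs scored,
    holding every other predictor in the model constant."""
    return coef * extra_runs


if __name__ == "__main__":
    # Hypothetical scenarios, not results from the paper.
    for extra in (10, 50):
        print(f"{extra:+d} runs -> {expected_win_change(extra):+.2f} expected wins")
```

Read the other way, a coefficient of 0.1076 implies that roughly 9.3 additional runs correspond to one additional expected win under this model.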

Paper Details

Author

Assaf Bitton

Methodology

Multiple Linear Regression, LOESS Smoothing, Welch's t-test
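
As a rough illustration of the Welch's t-test named above, the following Python sketch computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. This page does not say which groups the paper compared, so the function name `welch_t` and the two win-total samples are made up for illustration. In R, `t.test()` performs this unequal-variance test by default.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)
    for two independent samples with possibly unequal variances."""
    n_a, n_b = len(a), len(b)
    # Squared standard error of each group mean (sample variance uses n - 1).
    se_a = variance(a) / n_a
    se_b = variance(b) / n_b
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a**2 / (n_a - 1) + se_b**2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Made-up season win totals for two hypothetical groups of teams.
    group_a = [95, 88, 91, 84, 99]
    group_b = [78, 85, 72, 81, 90, 76]
    t, df = welch_t(group_a, group_b)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

Unlike Student's t-test, this version does not pool the two variances, which makes it the safer choice when the groups differ in size or spread.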

Dataset

Lahman "Teams" Database

Regression Stats

Offensive R²: 0.591
Defensive R²: 0.539
Key Predictors (p < 0.05): Runs, Walks, Errors, Stolen Bases
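
An R² of 0.591 means the offensive model accounts for about 59% of the variance in wins in the fitted data, and the defensive model about 54%. The Python sketch below shows how that statistic is defined. The helper name `r_squared` and the toy win totals are invented for illustration and are not taken from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares: variation around the observed mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy win totals and model predictions, invented for illustration.
    wins = [90, 81, 72, 95, 67]
    fitted = [86, 83, 75, 90, 71]
    print(f"R^2 = {r_squared(wins, fitted):.3f}")
```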