The Kelly Criterion Meets AI: Optimal Betting Strategies for Sports Prediction Models
The Kelly Criterion, developed by Bell Labs researcher John L. Kelly Jr. in 1956, remains one of the most elegant solutions to a fundamental question: given an edge, how much should you bet?
In the age of AI-powered sports prediction, this question has evolved. Now we’re asking: how do we optimally size positions when our edge comes from machine learning models with probabilistic outputs and time-varying confidence levels?
After three years building enterprise sports prediction systems, I’ve learned that combining Kelly with AI isn’t just about applying a formula—it’s about understanding uncertainty, managing model risk, and building systems that survive the inevitable periods when your models are wrong.
The classical Kelly formula is deceptively simple:
f = (bp - q) / b
Where:
- f = fraction of bankroll to bet
- b = odds received on the bet (decimal odds - 1)
- p = probability of winning
- q = probability of losing (1 - p)
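For example, at decimal odds of 2.00 (so b = 1) with p = 0.55, Kelly gives f = (1 × 0.55 − 0.45) / 1 = 0.10: bet 10% of bankroll.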
Mapping the Kelly fraction across probabilities and odds reveals a key insight: the optimal position size grows linearly with edge (f = edge / b), but uncertainty in the probability estimate can shift position sizing dramatically. For sports betting, where model confidence varies significantly, this uncertainty propagation becomes critical.
But in AI-driven sports analytics, each variable becomes a complex system:
- p comes from ML models with confidence intervals
- b represents live odds that change continuously
- f must be calculated in real-time across thousands of markets
The Challenge: AI Models Aren’t Oracles
Traditional Kelly assumes you know the true probability p. AI models give us probability estimates with uncertainty. This uncertainty propagation is crucial for practical implementation.
Real Example: AFL Match Prediction
Consider predicting Richmond vs. Collingwood. Our ensemble model outputs:
- Point estimate: Richmond 65% chance to win
- 95% confidence interval: [58%, 72%]
- Model calibration score: 0.87 (from historical validation)
Key insight from our production systems: model predictions become less reliable as game time approaches due to late-breaking information (injuries, weather, lineups). Encouragingly, the confidence intervals typically widen rather than narrow, showing that our models correctly flag when they’re less certain.
In our AFL example, if the model estimates Richmond at 65% (±7%), the Kelly calculation varies dramatically:
- At 58% probability: Kelly suggests 0% position (no bet)
- At 65% probability: Kelly suggests 12% of bankroll
- At 72% probability: Kelly suggests 25% of bankroll
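These figures check out if we assume a decimal price of about 1.66 on Richmond (the article doesn’t quote the odds, so this is an inferred assumption) together with the 25% position cap used later in the framework. A quick verification sketch:

```python
# Worked check of the figures above. The 1.66 decimal odds are an
# assumption (the match price isn't quoted); the 0.25 cap matches the
# max_kelly_fraction used in the framework below.
def kelly_fraction(p, decimal_odds, cap=0.25):
    b = decimal_odds - 1          # net odds
    f = (b * p - (1 - p)) / b     # classical Kelly
    return min(max(f, 0.0), cap)  # no negative bets, cap the maximum

for p in (0.58, 0.65, 0.72):
    print(f"p = {p:.0%}: Kelly fraction = {kelly_fraction(p, 1.66):.1%}")
# p = 58%: Kelly fraction = 0.0%
# p = 65%: Kelly fraction = 12.0%
# p = 72%: Kelly fraction = 25.0%
```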
This sensitivity to probability estimates highlights why uncertainty quantification isn’t optional in production Kelly systems—it’s fundamental to avoiding catastrophic position sizing errors.
The naive approach plugs 65% straight into the Kelly formula. The sophisticated approach also accounts for:
- Estimation uncertainty in the 65%
- Model calibration (is 65% actually 65%?)
- Temporal stability (how does confidence change approaching game time?)
Practical Implementation Framework
Here’s the production-ready framework we use for Kelly-based position sizing:
```python
import numpy as np
from scipy import stats


class KellyAIBetting:
    """
    Kelly Criterion implementation for AI-powered sports betting
    with uncertainty quantification and risk management.
    """

    def __init__(self, bankroll=10000, max_kelly_fraction=0.25,
                 min_edge=0.02, confidence_threshold=0.8):
        self.bankroll = bankroll
        self.max_kelly_fraction = max_kelly_fraction  # Never bet more than 25%
        self.min_edge = min_edge                      # Minimum edge to place a bet
        self.confidence_threshold = confidence_threshold

    def calculate_model_uncertainty(self, model_proba, historical_calibration):
        """
        Adjust model probabilities based on historical calibration
        and estimate prediction uncertainty.
        """
        # Calibration adjustment (linear recalibration)
        calibrated_proba = self._apply_calibration(model_proba, historical_calibration)

        # Estimate uncertainty from model ensemble variance or bootstrap.
        # For simplicity, use a Beta distribution approximation.
        alpha = calibrated_proba * historical_calibration['effective_sample_size']
        beta = (1 - calibrated_proba) * historical_calibration['effective_sample_size']
        uncertainty_bounds = stats.beta.interval(0.95, alpha, beta)

        return {
            'calibrated_proba': calibrated_proba,
            'lower_bound': uncertainty_bounds[0],
            'upper_bound': uncertainty_bounds[1],
            'uncertainty': uncertainty_bounds[1] - uncertainty_bounds[0]
        }

    def _apply_calibration(self, model_proba, calibration_data):
        """Apply a linear calibration map fitted on historical performance."""
        # Simplified calibration; in practice use sklearn.calibration
        slope = calibration_data.get('slope', 1.0)
        intercept = calibration_data.get('intercept', 0.0)
        calibrated = slope * model_proba + intercept
        return np.clip(calibrated, 0.01, 0.99)  # Avoid extreme probabilities

    def conservative_kelly(self, odds, uncertainty_info, risk_multiplier=0.5):
        """
        Conservative Kelly implementation accounting for model uncertainty.
        """
        # Use the lower bound of the probability estimate
        conservative_prob = uncertainty_info['lower_bound']

        # Implied probability from decimal odds
        implied_prob = 1.0 / odds

        # Edge relative to the market price
        edge = conservative_prob - implied_prob

        # Only bet with sufficient edge and confidence
        if (edge < self.min_edge or
                uncertainty_info['uncertainty'] > (1 - self.confidence_threshold)):
            return 0.0

        # Kelly calculation
        b = odds - 1  # Net odds
        p = conservative_prob
        q = 1 - p
        kelly_fraction = (b * p - q) / b

        # Safety constraints
        kelly_fraction = max(0.0, kelly_fraction)                      # Never negative
        kelly_fraction = min(kelly_fraction, self.max_kelly_fraction)  # Cap maximum
        kelly_fraction *= risk_multiplier                              # Conservative multiplier
        return kelly_fraction

    def dynamic_position_sizing(self, market_data, model_predictions,
                                bankroll_history, time_to_event):
        """
        Dynamic Kelly sizing that adjusts for market conditions
        and temporal factors.
        """
        results = []
        for market in market_data:
            market_id = market['id']
            odds = market['odds']

            # Model prediction with uncertainty
            prediction = model_predictions[market_id]
            uncertainty_info = self.calculate_model_uncertainty(
                prediction['probability'],
                prediction['calibration_data']
            )

            # Time-based adjustment
            time_decay_factor = self._calculate_time_decay(time_to_event)

            # Bankroll momentum factor (reduce sizing after losses)
            momentum_factor = self._calculate_momentum_factor(bankroll_history)

            # Base Kelly fraction from the conservative estimate
            base_kelly = self.conservative_kelly(odds, uncertainty_info)

            # Apply adjustments
            adjusted_kelly = base_kelly * time_decay_factor * momentum_factor

            # Final position size
            position_size = adjusted_kelly * self.bankroll

            results.append({
                'market_id': market_id,
                'kelly_fraction': adjusted_kelly,
                'position_size': position_size,
                'edge': uncertainty_info['calibrated_proba'] - (1 / odds),
                'confidence': 1 - uncertainty_info['uncertainty'],
                'factors': {
                    'time_decay': time_decay_factor,
                    'momentum': momentum_factor
                }
            })
        return results

    def _calculate_time_decay(self, hours_to_event):
        """Reduce position sizes as the event approaches (higher volatility)."""
        if hours_to_event < 1:
            return 0.5   # 50% reduction close to the event
        elif hours_to_event < 24:
            return 0.8   # 20% reduction same day
        else:
            return 1.0   # Full size for advance betting

    def _calculate_momentum_factor(self, recent_results):
        """Reduce sizing after losses to manage bankroll drawdowns."""
        if len(recent_results) < 10:
            return 1.0
        # Look at the last 20 bets
        recent_pnl = recent_results[-20:]
        win_rate = sum(1 for r in recent_pnl if r > 0) / len(recent_pnl)
        if win_rate < 0.4:    # Below 40% win rate recently
            return 0.7        # Reduce sizing by 30%
        elif win_rate > 0.6:  # Above 60% win rate
            return 1.1        # Slight increase (the cap still applies upstream)
        else:
            return 1.0


# Example usage with AFL data
def example_afl_betting():
    """Example implementation for AFL match betting."""
    kelly_system = KellyAIBetting(
        bankroll=50000,
        max_kelly_fraction=0.15,   # Conservative 15% max
        min_edge=0.03,             # Require 3% edge minimum
        confidence_threshold=0.75
    )

    # Simulated model predictions
    market_data = [
        {'id': 'AFL_RIC_COL', 'odds': 2.10},  # Richmond vs Collingwood
        {'id': 'AFL_GEE_HAW', 'odds': 1.65}   # Geelong vs Hawthorn
    ]
    model_predictions = {
        'AFL_RIC_COL': {
            'probability': 0.52,  # 52% chance Richmond wins
            'calibration_data': {
                'slope': 0.98,
                'intercept': 0.01,
                'effective_sample_size': 100
            }
        },
        'AFL_GEE_HAW': {
            'probability': 0.68,  # 68% chance Geelong wins
            'calibration_data': {
                'slope': 0.96,
                'intercept': 0.02,
                'effective_sample_size': 120
            }
        }
    }

    # Recent bankroll performance (1 = win, -1 = loss)
    bankroll_history = [1, -1, 1, 1, -1, 1, -1, -1, 1, 1]

    # Calculate optimal positions
    positions = kelly_system.dynamic_position_sizing(
        market_data,
        model_predictions,
        bankroll_history,
        time_to_event=48  # 48 hours to the match
    )

    print("Optimal Position Sizing:")
    print("=" * 50)
    for position in positions:
        print(f"\nMarket: {position['market_id']}")
        print(f"Kelly Fraction: {position['kelly_fraction']:.1%}")
        print(f"Position Size: ${position['position_size']:,.0f}")
        print(f"Edge: {position['edge']:.1%}")
        print(f"Confidence: {position['confidence']:.1%}")
        print(f"Time Factor: {position['factors']['time_decay']:.2f}")
        print(f"Momentum Factor: {position['factors']['momentum']:.2f}")


# Run example
example_afl_betting()
```
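Worth noting: run as-is, the conservative filter rejects both example markets, because the lower-bound probabilities (roughly 42% and 59%) fall short of the implied probabilities (about 48% and 61%) plus the 3% minimum edge. A system that declines marginal bets is behaving exactly as designed.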
Advanced Considerations for Production Systems
1. Multi-Model Ensemble Kelly
When using ensemble models, we need to aggregate not just predictions but uncertainties:
```python
import numpy as np

def ensemble_kelly_calculation(model_outputs, correlation_matrix):
    """
    Calculate the ensemble probability and uncertainty for Kelly sizing,
    accounting for correlations between models.
    """
    # Weight models by historical performance
    weights = np.array([model['historical_accuracy'] for model in model_outputs])
    weights = weights / weights.sum()

    # Aggregate probabilities
    ensemble_prob = sum(w * model['probability']
                        for w, model in zip(weights, model_outputs))

    # Aggregate uncertainties, portfolio-style: weight the standard
    # deviations, then combine them through the model correlation matrix
    sigmas = np.sqrt(np.array([model['variance'] for model in model_outputs]))
    weighted_sigmas = weights * sigmas
    ensemble_variance = weighted_sigmas @ correlation_matrix @ weighted_sigmas

    return ensemble_prob, np.sqrt(ensemble_variance)
```
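As a sanity check, here’s a hypothetical two-model call; the probabilities, variances, and accuracies are illustrative, not from any real system:

```python
# Two correlated models predicting the same match (illustrative numbers)
models = [
    {'probability': 0.64, 'variance': 0.0025, 'historical_accuracy': 0.71},
    {'probability': 0.60, 'variance': 0.0036, 'historical_accuracy': 0.66},
]
corr = np.array([[1.0, 0.6],
                 [0.6, 1.0]])

prob, sd = ensemble_kelly_calculation(models, corr)
print(f"ensemble p = {prob:.3f}, sd = {sd:.3f}")  # ~0.621, ~0.049
```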
2. Real-Time Odds Movement Integration
Sports betting odds move continuously. Our Kelly calculation must adapt:
```python
def streaming_kelly_update(current_position, new_odds, new_prediction):
    """
    Update the Kelly position as odds and predictions change.
    calculate_kelly_fraction and optimize_adjustment are helpers defined
    elsewhere (a sketch of the latter follows below).
    """
    # New optimal position at the latest odds and prediction
    new_kelly = calculate_kelly_fraction(new_odds, new_prediction)

    # Transaction cost consideration: only adjust on a meaningful change
    adjustment_threshold = 0.02  # Only adjust if >2% change
    if abs(new_kelly - current_position) > adjustment_threshold:
        # Optimal partial adjustment (minimise transaction costs)
        return optimize_adjustment(
            current_position, new_kelly, transaction_cost=0.005
        )
    return 0  # No adjustment needed
```
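The text leaves `optimize_adjustment` undefined. One minimal interpretation, offered as an assumption rather than the actual production implementation, shrinks each adjustment by the round-trip cost so that small oscillations in the odds never get traded:

```python
import numpy as np

def optimize_adjustment(current_position, target_position, transaction_cost=0.005):
    """Hypothetical helper: move toward the target Kelly fraction,
    netting the transaction cost off the size of the move."""
    gap = target_position - current_position
    if abs(gap) <= transaction_cost:
        return 0.0  # The move would cost more than it's worth
    # Shrink the adjustment by the cost, preserving its direction
    return float(np.sign(gap)) * (abs(gap) - transaction_cost)
```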
3. Bankroll Management Under Uncertainty
Kelly’s guarantees are asymptotic: they assume a long sequence of independent bets. In practice we run many simultaneous, correlated positions, so we need explicit bankroll preservation:
```python
import numpy as np
from scipy.optimize import minimize

def adaptive_bankroll_management(kelly_fractions, correlation_estimates):
    """
    Adjust Kelly fractions to account for portfolio risk
    across multiple simultaneous bets.
    """
    # Portfolio Kelly objective: maximise total exposure,
    # penalised by portfolio variance
    def portfolio_kelly_objective(weights):
        portfolio_variance = weights @ correlation_estimates @ weights
        return -np.sum(weights) + 0.5 * portfolio_variance  # Risk penalty

    constraints = [
        {'type': 'ineq', 'fun': lambda x: 0.25 - np.sum(x)},  # Max 25% total exposure
        {'type': 'ineq', 'fun': lambda x: x}                  # Non-negative positions
    ]
    result = minimize(portfolio_kelly_objective, kelly_fractions,
                      constraints=constraints)
    return result.x
```
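A hypothetical call with three simultaneous bets (fractions and correlations are illustrative):

```python
# Per-market Kelly fractions before portfolio adjustment
kelly_fractions = np.array([0.08, 0.05, 0.06])
correlations = np.array([[1.0, 0.3, 0.2],
                         [0.3, 1.0, 0.4],
                         [0.2, 0.4, 1.0]])

sized = adaptive_bankroll_management(kelly_fractions, correlations)
print(sized, sized.sum())  # adjusted fractions; total exposure respects the 25% cap
```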
Risk Management: When Kelly Goes Wrong
Kelly assumes your edge estimate is correct. With AI models, it often isn’t. Key risk management principles:
1. Fractional Kelly Implementation
Never use full Kelly. Use 25-50% of the calculated Kelly fraction:
```python
# Conservative approach: quarter-Kelly
position_size = 0.25 * kelly_fraction * bankroll
```
2. Stop-Loss Mechanisms
```python
def implement_stop_loss(current_bankroll, initial_bankroll, stop_loss_threshold=0.2):
    """Halt betting if the bankroll drops below the threshold."""
    if current_bankroll < initial_bankroll * (1 - stop_loss_threshold):
        return True  # Stop all betting
    return False
```
3. Model Confidence Gating
```python
def confidence_gate(prediction_confidence, min_confidence=0.8):
    """Only bet when the model is highly confident."""
    return prediction_confidence >= min_confidence
```
Performance Monitoring: Metrics That Matter
Traditional Kelly metrics don’t capture AI-specific risks. Key metrics to track:
- Calibration Score: How well do probability estimates match outcomes?
- Edge Decay: How quickly does model edge deteriorate?
- Volatility-Adjusted Returns: Sharpe ratio adapted for betting
- Maximum Drawdown Duration: How long do losing streaks last?
- Model Sensitivity: How much do results change with model updates?
Production monitoring insights: We track six critical metrics across all Kelly+AI systems:
- Calibration Score: Our models typically start at 0.85-0.90 but degrade to 0.75-0.80 over 3-6 months as market dynamics change
- Edge Decay: decay rates vary by sport; AFL edges persist longer (seasonal patterns) while tennis edges fade faster (player form volatility)
- Rolling Sharpe Ratio: Target >1.5 for sustainable performance, but expect 2-3 month periods below 1.0
- Maximum Drawdown: Conservative Kelly (25% max position) typically sees 8-12% drawdowns; aggressive approaches can hit 25%+
- Model Sensitivity: A 5% change in win probability can alter position sizing by 50%+
- Correlation Risk: During major events, seemingly independent bets become correlated, amplifying portfolio risk
Critical insight: Systems optimized for maximum theoretical return consistently underperform conservative implementations with robust risk controls.
Here’s the core metrics calculation; calculate_max_drawdown and calculate_realized_edge are helpers defined elsewhere, and one plausible version of calculate_calibration_score is sketched after the block:

```python
import numpy as np

def calculate_betting_metrics(betting_history):
    """Calculate comprehensive performance metrics."""
    returns = np.array([bet['pnl'] for bet in betting_history])
    probabilities = np.array([bet['predicted_prob'] for bet in betting_history])
    outcomes = np.array([bet['actual_outcome'] for bet in betting_history])

    metrics = {
        'total_return': returns.sum(),
        # Annualised Sharpe, assuming roughly daily betting activity
        'sharpe_ratio': returns.mean() / returns.std() * np.sqrt(252),
        'max_drawdown': calculate_max_drawdown(returns),
        'calibration_score': calculate_calibration_score(probabilities, outcomes),
        'hit_rate': (returns > 0).mean(),
        'average_edge': calculate_realized_edge(betting_history)
    }
    return metrics
```
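The calibration score referenced throughout (e.g. the 0.87 earlier) isn’t defined in the text; a simple binned version on a 0-to-1 scale, offered as one plausible reading, looks like this:

```python
def calculate_calibration_score(probabilities, outcomes, n_bins=10):
    """Hypothetical calibration score: 1 minus the average absolute gap
    between predicted probability and observed frequency, per bin."""
    bins = np.clip((probabilities * n_bins).astype(int), 0, n_bins - 1)
    gaps = []
    for b in range(n_bins):
        mask = bins == b
        if mask.sum() >= 5:  # Skip bins with too few bets to estimate a frequency
            gaps.append(abs(probabilities[mask].mean() - outcomes[mask].mean()))
    return 1.0 - float(np.mean(gaps)) if gaps else float('nan')
```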
Lessons from Production Implementation
After running Kelly-based AI betting systems in production, key insights:
1. Model Uncertainty Dominates
The uncertainty in your probability estimates typically matters more than the exact Kelly calculation. Invest heavily in uncertainty quantification.
2. Temporal Effects Are Critical
Sports betting markets are highly dynamic. Models trained on historical data may not reflect current market conditions. Build adaptive systems.
3. Risk Management Saves Careers
Perfect Kelly with wrong probabilities can bankrupt you quickly. Conservative implementation with robust risk controls outperforms aggressive optimization.
4. Operational Complexity Is Real
Production betting systems need real-time data feeds, order management, regulatory compliance, and 24/7 monitoring. The math is the easy part.
Conclusion
Combining the Kelly Criterion with AI-powered sports prediction represents the cutting edge of quantitative betting strategies. However, success requires much more than applying a mathematical formula to model outputs.
The key insight from building these systems: Kelly provides the theoretical framework, but production success depends on uncertainty quantification, risk management, and operational excellence.
Whether you’re building sports analytics systems, trading algorithms, or any decision system under uncertainty, the principles remain the same: respect uncertainty, manage risk, and remember that elegant mathematics means nothing if your fundamental assumptions are wrong.
Ali Mahmoudi is Research Lead at a leading Australian sports technology company, where he architects enterprise ML systems for sports analytics and customer intelligence. He holds a PhD in Statistics from the University of Melbourne and specializes in Bayesian inference and production ML systems.
Interested in quantitative approaches to sports analytics? Connect on LinkedIn or email me—I’m always happy to discuss the intersection of statistics, AI, and real-world applications.