AFCON 2025

Prediction & Automation Engine

For us, AFCON 2025 is a real-time laboratory where data, probability, and automation meet football.
We built this platform as a full pipeline, from raw match data to live predictions and automated visuals, designed to be transparent, adaptive, and content-ready.

How the Model Works

This page was crafted by 0x0redd.

Our Approach

We designed this system as a full pipeline that goes from raw match data to automated, on-brand content for social media. The system is built on three core principles that ensure transparency, adaptability, and content-first outputs.

🔍

Transparency

Show what the model believes and how confident it is. Every prediction includes the underlying data and reasoning.

🔄

Adaptability

The model updates itself as matches are played. Team strengths evolve over time, so predictions react to form and real results.

📊

Content-First

Every output is structured so it can quickly become a story, post, or graphic.

Three layers, one workflow

Data Layer

Robust data pipeline for match schedules, live results, and historical data with automated scraping and normalization.

•Automated data collection
•Result validation
•Versioned storage

Model Layer

Statistical models that process match data to generate predictions and update team performance metrics.

•Poisson distribution for goal prediction
•Dynamic team strength updates
•Monte Carlo simulations

Content Layer

Automated generation of predictions, visualizations, and insights for various platforms.

•Automated reporting
•Social media integration
•Real-time dashboards

In one sentence: Data → Predictions → Automated content, refreshed continuously.

Data Layer

The Data Layer is our single source of truth. Every prediction, evaluation, and visual starts here — clean, normalized, and always up to date.

Core Data Files

We store AFCON data as simple JSON so it's easy to debug, version, and reuse across Python + Node.

Core files

AFCON-CSC/data/afcon_schedule.json

→ The static tournament schedule. Includes group stage matches and placeholder 'TBD' entries for knockout rounds.

AFCON-CSC/data/afcon_results.json

→ Scraped from the official CAF website via afcon-results-scraper.js. Contains detailed match information with Opta timestamps and scores.

AFCON-CSC/outputs/state/processed_matches.json

→ Tracks which matches have already been used to update team strengths, ensuring the model never double-counts a game.

How everything joins

We standardize team names (aliases) and join data using a stable key:

// date + home + away = stable join key
const key = `${dateKey}__${home}__${away}`;
resultMap.set(key, r);

Why this matters

One clean "truth layer" means: the model never guesses which match is which, and the content layer never posts the wrong score.
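On the Python side of the pipeline, the same join could be sketched roughly as follows. This is a minimal illustration rather than the project's code: the record field names (`date`, `home`, `away`, `score`), the `ALIASES` map, and the helper names are all assumptions made for the example.

```python
# Hypothetical alias map; the project's real alias list may differ.
ALIASES = {"Congo DR": "DR Congo"}


def normalize_team(name: str) -> str:
    """Trim whitespace and map known aliases to one canonical name."""
    cleaned = name.strip()
    return ALIASES.get(cleaned, cleaned)


def make_key(date_key: str, home: str, away: str) -> str:
    """Build the date + home + away join key used to match records."""
    return f"{date_key}__{normalize_team(home)}__{normalize_team(away)}"


def index_results(results: list[dict]) -> dict[str, dict]:
    """Index result records by join key (field names are assumed)."""
    return {make_key(r["date"], r["home"], r["away"]): r for r in results}


results = [{"date": "2025-12-23", "home": "Congo DR", "away": "Benin", "score": "1-0"}]
result_map = index_results(results)

# A schedule row that spells the team differently still finds its result.
match = result_map.get(make_key("2025-12-23", " DR Congo", "Benin"))
print(match["score"])  # 1-0
```

Normalizing names inside the key builder means every caller gets the same key without having to remember to clean its inputs first.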

Fresh Results (Scraper)

We scrape results from the official CAF AFCON homepage, using Puppeteer to drive the page and its Opta widgets. The scraper loads the homepage, extracts fixtures, enriches them with match-centre data when needed, and safely merges updates.

What we extract

•match_id, date_iso (from Opta timestamp)
•team codes (e.g. MLI, ZMB) + full names (from img[alt])
•scores, period/status (PreMatch / Live / FullTime)

await page.goto(HOME_URL, { waitUntil: "networkidle2", timeout: 60_000 });
await page.waitForSelector(".Opta-fixture[data-match]", { timeout: 30_000 });

// `document` only exists in the browser, so this line runs inside page.evaluate()
const nodes = Array.from(document.querySelectorAll(".Opta-fixture[data-match]"));

Smart enrichment (match centre when needed)

If a match is live / missing score / not FullTime, we fetch the match-centre widget for higher accuracy:

// Fetch the match centre only when the homepage data is incomplete or not final
const needsCentre =
  !Number.isFinite(fx.home_score) ||
  !Number.isFinite(fx.away_score) ||
  (fx.period && fx.period.toLowerCase() !== "fulltime");

const centre = needsCentre
  ? await scrapeMatchCentre(page, fx.match_centre_url)
  : null;

Data safety

We merge new scrapes with existing afcon_results.json so past matches are preserved and only improved when new info is better.
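The page does not spell out what counts as "better", so the sketch below assumes one plausible rule: a stored record that is already FullTime with both scores present is never overwritten by an incomplete scrape. The real scraper is written in Node; this Python version only illustrates the merge idea, and `is_complete`, `merge_results`, and the record fields are hypothetical.

```python
import math


def is_complete(record: dict) -> bool:
    """A record is complete when it is FullTime and both scores are finite numbers."""
    scores = (record.get("home_score"), record.get("away_score"))
    has_scores = all(isinstance(s, (int, float)) and math.isfinite(s) for s in scores)
    return has_scores and str(record.get("period", "")).lower() == "fulltime"


def merge_results(existing: list[dict], scraped: list[dict]) -> list[dict]:
    """Merge by match_id without ever dropping or downgrading a stored match."""
    merged = {r["match_id"]: r for r in existing}
    for new in scraped:
        old = merged.get(new["match_id"])
        # Keep a complete stored record if the fresh scrape is incomplete.
        if old is not None and is_complete(old) and not is_complete(new):
            continue
        merged[new["match_id"]] = {**(old or {}), **new}
    return list(merged.values())


existing = [{"match_id": "m1", "home_score": 2, "away_score": 0, "period": "FullTime"}]
scraped = [
    {"match_id": "m1", "home_score": None, "away_score": None, "period": "PreMatch"},
    {"match_id": "m2", "home_score": 1, "away_score": 1, "period": "Live"},
]
merged = merge_results(existing, scraped)
print([m["match_id"] for m in merged])  # ['m1', 'm2']
```

In this example the finished match `m1` keeps its stored 2-0 score even though the new scrape came back empty, while the live match `m2` is added.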

Schedule Enrichment

To keep the schedule and results aligned, we write final scores back into afcon_schedule.json once matches are finished. This makes "schedule + results + predictions" easy to join everywhere (Python, Sheets, stories).

// map results -> schedule rows and inject score
if (result) {
  return {
    ...match,
    score: result.score, // ex: "2-1"
  };
}

We also prioritize the most reliable score source when available:

const homeScore = r.match_centre?.home_score !== undefined
  ? r.match_centre.home_score
  : r.home_score;

Why enrich schedule at all?

Because the schedule becomes a timeline ledger: upcoming matches + played matches live in one file, with no guessing and no extra joins.

Model Layer

This is where raw match data becomes numbers you can trust: team strengths → expected goals → probabilities → final prediction. All model logic lives in AFCON-CSC/utils/afcon_pipeline.py, powered by shared utilities in poisson_utils.py.

Model Architecture

Poisson Distribution Model

Uses a Poisson distribution to model how many goals each team scores, based on team strengths. Here k is the number of goals and λ is the team's expected goals.

P(goals = k) = (λ^k × e^(-λ)) / k!

Parameters

•λ (lambda) = the team's attack_strength × the opponent's defense_strength × league_average

Team Strength Updates

Team strengths are updated after each match using an Exponential Moving Average (EMA) to adapt to team form.

new_strength = (α * latest_performance) + ((1 - α) * old_strength)

Parameters

•α (alpha) = learning rate (typically 0.1-0.3)
•Home advantage factor: 1.2x attack strength

Match Simulation

Simulates matches using Monte Carlo methods to generate probabilities for different outcomes.

Process

•1. Calculate expected goals for both teams
•2. Simulate multiple match outcomes
•3. Aggregate results for win/draw/lose probabilities
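The three steps above can be sketched with the standard library alone. This is an illustrative version, not the project's simulator: the function names, the number of simulations, and the fixed seed are assumptions, and Poisson samples are drawn with Knuth's multiplication method instead of NumPy so the block has no dependencies.

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_match(home_xg: float, away_xg: float,
                   n_sims: int = 20_000, seed: int = 42) -> dict:
    """Estimate win/draw/loss probabilities by simulating n_sims matches."""
    rng = random.Random(seed)
    counts = {"home_win": 0, "draw": 0, "away_win": 0}
    for _ in range(n_sims):
        home_goals = sample_poisson(home_xg, rng)
        away_goals = sample_poisson(away_xg, rng)
        if home_goals > away_goals:
            counts["home_win"] += 1
        elif home_goals == away_goals:
            counts["draw"] += 1
        else:
            counts["away_win"] += 1
    # Step 3: turn counts into probabilities.
    return {outcome: n / n_sims for outcome, n in counts.items()}


probs = simulate_match(home_xg=0.91, away_xg=0.37)
print(probs)
```

With expected goals of 0.91 and 0.37, the estimates should land close to the exact Poisson values of roughly 0.47 (home win), 0.38 (draw), and 0.15 (away win). The simulation is mainly a cross-check on the closed-form calculation.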

We also normalize team names with a small alias map so all sources match cleanly (example: "Congo DR" vs "DR Congo").

# Team-name aliasing (display name -> strength key)
TEAM_NAME_TO_STRENGTH = {
    "Congo DR": "DR Congo",
}

def to_strength_team_name(name: str) -> str:
    return TEAM_NAME_TO_STRENGTH.get(name.strip(), name.strip())

Dynamic updates after each finished match (EMA)

We update strengths incrementally so the model adapts over the tournament.

strengths[home_key] = poisson_utils.update_team_strength_after_match(
    strengths[home_key],
    {"goals_scored": int(hs), "goals_conceded": int(a_s)},
    league_avg_goals,
    learning_rate=0.15,
)

And inside the EMA rule:

updated_attack = current_attack * (1 - learning_rate) + new_attack * learning_rate
updated_defense = current_defense * (1 - learning_rate) + new_defense * learning_rate

Expected Goals (xG)

Once we have strengths, xG is computed with a clean formula:

•Home xG = home_attack × away_defense × home_advantage
•Away xG = away_attack × home_defense

def expected_goals(home_attack, away_attack, home_defense, away_defense, home_advantage=1.0):
    home_exp = home_attack * away_defense * home_advantage
    away_exp = away_attack * home_defense
    return home_exp, away_exp

Optional: scoring sensitivity (more varied scores)

We can boost xG slightly so predictions aren't too conservative.

home_exp_boosted = home_exp * scoring_sensitivity
away_exp_boosted = away_exp * scoring_sensitivity

Score Probabilities (Poisson)

We treat goals as Poisson events and compute a full matrix:

# P(Home=i, Away=j) = P(Home=i) × P(Away=j)
prob_matrix[i, j] = poisson.pmf(i, home_exp) * poisson.pmf(j, away_exp)

Then we can extract the most likely score (mode) from the matrix:

max_idx = np.unravel_index(np.argmax(prob_matrix), prob_matrix.shape)
home_goals, away_goals = max_idx[0], max_idx[1]

Outcome Probabilities (1X2)

From the same matrix we compute:

•Home win = sum of probabilities where i > j
•Draw = diagonal sum (i == j)
•Away win = sum where i < j

home_win = np.sum(prob_matrix[np.tril_indices(max_goals + 1, k=-1)])
draw     = np.sum(np.diag(prob_matrix))
away_win = np.sum(prob_matrix[np.triu_indices(max_goals + 1, k=1)])

And the pipeline uses those probabilities per match:

outcomes = poisson_utils.match_outcome_probabilities(home_exp_boosted, away_exp_boosted, max_goals=5)
outcomes = poisson_utils.round_probabilities(outcomes, decimals=1)
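For readers without the project's poisson_utils module, the matrix and 1X2 sums can be reproduced with the standard library. The sketch below is a stand-in, not the project's implementation: the function names are hypothetical, and renormalizing the truncated matrix so it sums to 1 is an assumption about how the probability mass above max_goals is handled.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k goals given an expected value lam."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)


def score_matrix(home_xg: float, away_xg: float, max_goals: int = 5) -> list[list[float]]:
    """matrix[i][j] = P(home scores i) * P(away scores j), renormalized to sum to 1."""
    raw = [[poisson_pmf(i, home_xg) * poisson_pmf(j, away_xg)
            for j in range(max_goals + 1)] for i in range(max_goals + 1)]
    total = sum(map(sum, raw))  # mass above max_goals is redistributed (assumption)
    return [[p / total for p in row] for row in raw]


def outcome_probabilities(matrix: list[list[float]]) -> dict:
    """Sum the matrix below, on, and above the diagonal to get 1X2 probabilities."""
    cells = [(i, j, p) for i, row in enumerate(matrix) for j, p in enumerate(row)]
    return {
        "home_win": sum(p for i, j, p in cells if i > j),
        "draw": sum(p for i, j, p in cells if i == j),
        "away_win": sum(p for i, j, p in cells if i < j),
    }


matrix = score_matrix(0.91, 0.37)
probs = outcome_probabilities(matrix)
print({name: round(100 * p, 1) for name, p in probs.items()})
```

Rows index home goals and columns index away goals, so cells below the diagonal (i > j) are home wins. That is the same convention as the tril/triu indexing above.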

Mean vs Mode: Design Rationale

Each match outputs two predicted scores:

Mean (rounded xG)

Our primary score for content. Gives more realistic scorelines and has performed better for exact score hits.

Poisson Mode

The single most likely score, but it is often low-scoring (0-0, 1-0). Used as a reference to evaluate and improve the model.

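Why both? Comparing the two against real results over the tournament shows which scoreline rule performs better and where the model needs improving. In the output below, predicted_score is the mean (rounded xG, primary) and predicted_score_mode is the Poisson mode (reference).

{
  "predicted_score": "1 - 0",
  "predicted_score_mode": "0 - 0",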
  "score_probability": 27.7,
  "expected_goals": {
    "home": 0.91,
    "away": 0.37
  },
  "probabilities": {
    "home_win": 47.5,
    "draw": 37.9,
    "away_win": 14.6
  }
}
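To see how the two scorelines can differ for the same match, here is a small sketch using the expected goals from the sample output (0.91 and 0.37). The helper names are hypothetical, and the rounding rule is an assumption: Python's built-in round() sends exact .5 values to the nearest even number, which may not match the project's rule.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k goals given an expected value lam."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)


def mean_score(home_xg: float, away_xg: float) -> tuple[int, int]:
    """Primary scoreline: each side's xG rounded to the nearest whole goal."""
    return round(home_xg), round(away_xg)


def mode_score(home_xg: float, away_xg: float, max_goals: int = 5) -> tuple[int, int]:
    """Reference scoreline: the (i, j) cell with the highest joint probability."""
    grid = range(max_goals + 1)
    return max(
        ((i, j) for i in grid for j in grid),
        key=lambda s: poisson_pmf(s[0], home_xg) * poisson_pmf(s[1], away_xg),
    )


home_xg, away_xg = 0.91, 0.37
print("mean:", "%d - %d" % mean_score(home_xg, away_xg))  # mean: 1 - 0
print("mode:", "%d - %d" % mode_score(home_xg, away_xg))  # mode: 0 - 0
```

Whenever a team's expected goals are below 1, its most likely goal count is 0, which is why the mode drifts toward scorelines like 0-0 even when the rounded mean suggests 1-0.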

Incremental Model Updates

Function: train_or_update_model()

Process:

•Loads existing team_strengths_dynamic.json (if present) or static strengths
•Reads all finished matches from afcon_results.json
•For each unprocessed finished match:
- Uses poisson_utils.update_team_strength_after_match() with EMA
- New attack = blend of previous attack strength and goals scored / league average
- New defense = blend of previous defense strength and goals conceded / league average
- Learning rate (≈0.15) controls how quickly the model reacts
•Marks the match as processed in processed_matches.json
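The "never double-count" guarantee comes down to tracking which matches have already been applied. A minimal sketch of that bookkeeping follows. It assumes matches are tracked by match_id and that a finished match has period "FullTime"; an in-memory set stands in for processed_matches.json, whose actual format is not shown on this page, and the function names are hypothetical.

```python
def pending_matches(results: list[dict], processed_ids: set) -> list[dict]:
    """Return finished matches whose match_id has not been processed yet."""
    return [
        r for r in results
        if str(r.get("period", "")).lower() == "fulltime"
        and r["match_id"] not in processed_ids
    ]


def apply_updates(results: list[dict], processed_ids: set, update_fn) -> set:
    """Run update_fn once per new finished match and record its id."""
    for match in pending_matches(results, processed_ids):
        update_fn(match)
        processed_ids.add(match["match_id"])
    return processed_ids


results = [
    {"match_id": "m1", "period": "FullTime"},
    {"match_id": "m2", "period": "Live"},
]
seen = []
processed = apply_updates(results, set(), seen.append)
processed = apply_updates(results, processed, seen.append)  # second pass is a no-op
print(len(seen), processed)  # 1 {'m1'}
```

Because the id is recorded straight after each update, running the loop again on the same results file leaves the strengths unchanged.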

EMA Update Formula

The Exponential Moving Average update follows:

attack_new = (1 - α) × attack_old + α × (goals_scored / league_avg)

defense_new = (1 - α) × defense_old + α × (goals_conceded / league_avg)

where α ≈ 0.15 is the learning rate
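A short worked example makes the formula concrete. The sketch below follows the two equations above, but the function names, the starting strengths, and the league average of 1.25 goals per team per match are illustrative assumptions, not values from the project.

```python
def ema_update(old: float, observed: float, alpha: float = 0.15) -> float:
    """Blend the previous strength with the latest observation."""
    return (1 - alpha) * old + alpha * observed


def update_strength(strength: dict, goals_scored: int, goals_conceded: int,
                    league_avg: float, alpha: float = 0.15) -> dict:
    """Apply the EMA rule to attack and defense, using goals relative to league average."""
    return {
        "attack": ema_update(strength["attack"], goals_scored / league_avg, alpha),
        "defense": ema_update(strength["defense"], goals_conceded / league_avg, alpha),
    }


# Illustrative numbers only: a team rated exactly average wins 3-0
# in a competition averaging 1.25 goals per team per match (assumed value).
before = {"attack": 1.0, "defense": 1.0}
after = update_strength(before, goals_scored=3, goals_conceded=0, league_avg=1.25)
print(after)
```

Here attack rises to about 1.21 (0.85 × 1.0 + 0.15 × 2.4) and defense falls to 0.85 (0.85 × 1.0 + 0.15 × 0). Since defense multiplies the opponent's attack in the xG formula, a lower defense value means the team is expected to concede less.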

Automation Loop

AFCON-CSC/run_afcon_loop.py runs in a continuous loop (e.g., every minute or configured interval):

•Updates team strengths with new results
•Regenerates predictions
•Re-evaluates accuracy

This loop is launched from Node (index.js) as a background Python process, so everything stays in sync.

Evaluation & Transparency

We track the truth. After predictions are generated, we evaluate performance by joining predicted matches to actual results. This allows tracking performance over time and communicating it transparently.

Evaluation Metrics

Function: evaluate_predictions(predictions)

Process:

•Joins predictions with actual results via composite key (date + home_team + away_team)
•Computes evaluation metrics for both mean and mode predictions
•Appends a row to AFCON-CSC/outputs/evaluation/accuracy_summary.csv

Exact Score Accuracy

Measures how often the predicted score matches the actual score exactly.

•exact_score_accuracy_mean - accuracy for mean-based predictions
•exact_score_accuracy_mode - accuracy for Poisson mode predictions

Outcome Accuracy

Measures whether we correctly predicted the match outcome (Home Win / Draw / Away Win).

•Did we at least get the 1X2 (H/D/A) right?
•Tracks correct vs incorrect outcome predictions
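Both metrics reduce to simple counts over (predicted, actual) score pairs. The sketch below computes them for the ten finished group-stage matches listed in the results table on this page. It is not the project's evaluate_predictions(): the function names are hypothetical, and the predicted outcome is derived from the predicted score, an assumption, since it could also be taken from the highest 1X2 probability.

```python
def outcome(score: tuple[int, int]) -> str:
    """Map a (home, away) score to H, D or A."""
    home, away = score
    return "H" if home > away else "A" if home < away else "D"


def evaluate(pairs: list[tuple[tuple[int, int], tuple[int, int]]]) -> dict:
    """Compute exact-score and 1X2 accuracy over (predicted, actual) pairs."""
    n = len(pairs)
    exact = sum(pred == actual for pred, actual in pairs)
    correct_outcome = sum(outcome(pred) == outcome(actual) for pred, actual in pairs)
    return {"exact_score_accuracy": exact / n, "outcome_accuracy": correct_outcome / n}


# (predicted, actual) scores for the ten finished matches in the table on this page.
pairs = [
    ((2, 1), (2, 0)), ((1, 1), (1, 1)), ((2, 1), (2, 1)), ((2, 1), (2, 1)),
    ((1, 0), (1, 0)), ((2, 0), (3, 0)), ((2, 1), (2, 1)), ((1, 0), (3, 1)),
    ((1, 1), (2, 1)), ((1, 0), (3, 0)),
]
print(evaluate(pairs))  # {'exact_score_accuracy': 0.5, 'outcome_accuracy': 0.9}
```

On this small sample, five of ten scorelines are exact hits, and the only missed outcome is Burkina Faso vs Equatorial Guinea, predicted as a draw and won 2-1 by the home side.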

Why Transparency Matters

By tracking and sharing our evaluation metrics, we can continuously improve the model and build trust with our audience. Every prediction includes the underlying data and reasoning, so users understand both what we predict and how confident we are.

Sheet & Orchestration Layer

We use a Google Sheet as the control panel for content and data. The sheet is the "single screen" where content and data meet, enabling both automated workflows and human-in-the-loop controls.

Syncing Predictions to Google Sheets

Implementation: afcon-sheet-sync.js

Sync Process:

•Loads latest afcon_predictions.json and afcon_results.json
•Builds rows with rich fields (columns A-X)
•Merges with existing sheet data:
- Preserves manual columns V-X
- Updates existing rows if the same match reappears
- Keeps past matches even if they disappear from the latest prediction file

Sheet Column Structure

Column	Field
A	Date
B	Stage
C-D	Home Team, Away Team
E-F	Home Code, Away Code (for flags)
G	Match ID
H-I	Kickoff DateTime (UTC), Kickoff Time (UTC)
J	Predicted Score (mean, primary)
K	Predicted Score (Mode)
L	Actual Score
M	Exact (Mean)?
N	Exact (Mode)?
O	Outcome? (YES/NO, 1X2)
P-R	P(Home Win), P(Draw), P(Away Win) (0-1)
S-T	xG(Home), xG(Away)
U	Match Status (Played/Upcoming/period)
V	Posted Prediction? (manual)
W	Posted Review? (manual)
X	Content Notes (manual)

Sheet Functions

This sheet now powers:

Content Decisions

What to post when, based on match status and manual flags

Story Generation

Story generation scripts read directly from the sheet

Manual Controls

Human-in-the-loop comments and notes per match

Content Layer

We use Node.js + Canvas to render Instagram-style stories directly from the sheet. Every prediction is structured to become content fast: dashboard rows, story-ready numbers, and clean fields that plug into templates.

Prediction Story Generator

Script: render-afcon-story.js

Template: Untitlemmmd-1.png (1080×1920)

Generates pre-match prediction stories with flags, scores, and probabilities.

•Reads today's rows from the Predictions sheet
•Takes up to 2 matches per day
•Uses team codes to load flag images
•Displays predicted score (mean) and probabilities
•Confidence badge system (High/Medium/Low)
•Highlights correct probability in green if result is known

Video Story Generator

Script: render-afcon-story-video.js

Template: Untitlemmmd-1_2.mp4 (animated video)

Creates animated video stories by compositing match data onto video templates.

•Generates transparent PNG overlay with match data
•Uses FFmpeg to composite overlay onto animated video template
•Delays overlay appearance by 1 second for sync
•Outputs final MP4 ready for social media

Review Story Generator

Script: render-afcon-story-result.js

Template: Artboard 2-1.jpg (1080×1920)

Post-match review stories comparing predictions with actual results.

•Top half: Predicted view with flags and probabilities
•Bottom half: Final result from afcon_results.json
•Color-coded correct vs incorrect predictions
•Comment box for one-line summary (ready for AI narration)

Confidence Badge System

The confidence label is computed based on maximum probability:

Label	Threshold	Meaning
High	≥ 0.65	Maximum probability is 65% or higher
Medium	0.45 to < 0.65	Maximum probability from 45% up to, but not including, 65%
Low	< 0.45	Maximum probability below 45%
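The thresholds translate into a small function. The sketch below assumes probabilities on a 0-1 scale, as stored in sheet columns P-R; the function name is hypothetical and the renderer's actual implementation (in Node) is not shown on this page.

```python
def confidence_label(p_home: float, p_draw: float, p_away: float) -> str:
    """Label a prediction by its largest outcome probability (0-1 scale)."""
    top = max(p_home, p_draw, p_away)
    if top >= 0.65:
        return "High"
    if top >= 0.45:
        return "Medium"
    return "Low"


# Probabilities taken from three matches in the results table on this page.
print(confidence_label(0.719, 0.209, 0.063))  # High   (Senegal vs Botswana)
print(confidence_label(0.643, 0.279, 0.076))  # Medium (Morocco vs Comoros)
print(confidence_label(0.348, 0.401, 0.251))  # Low    (Mali vs Zambia)
```

The label depends only on the largest of the three probabilities, so a match whose likeliest outcome is a draw at 40% is rated Low in the same way as a narrow favourite.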

Automated Scheduling

afcon-story-scheduler.js uses node-cron to schedule:

12:00

Daily Prediction Story

Render and send prediction story at 12:00 PM

22:15

Daily Review Story

Render and send match review story at 10:15 PM

Stories are sent as documents via WhatsApp to a configured phone number.

End-to-End Automation Flow

Putting it all together: a complete workflow from data collection to automated content generation, with transparency and adaptability at its core.

Complete Workflow

Continuous Model Loop (Python)

run_afcon_loop.py

Load latest results → update team strengths (EMA) → regenerate predictions → log accuracy

Scraper (Node.js)

afcon-results-scraper.js

Periodically scrape CAF to update afcon_results.json

Sheet Sync (Node.js)

afcon-sheet-sync.js

Merge predictions + results into the Predictions sheet with all rich fields

Story Generation (Node.js)

render-afcon-story.js

render-afcon-story-result.js

render-afcon-story-video.js

Generate pre-match prediction stories, post-match review stories, and animated video stories

Automated Scheduling

afcon-story-scheduler.js

Schedule story generation and delivery at configured times

Manual Controls

Sheet columns: Posted Prediction?, Posted Review?, Content Notes. These provide human oversight for the content workflow.

System Integration

The system runs as a loop: Scrape → Update → Regenerate → Evaluate → Sync → Render. It's designed to power the AFCON dashboard experience end-to-end, with everything staying in sync automatically.

Scalability and Generalization

Although built for AFCON, the pattern is event-agnostic. Once configured for one tournament, the same architecture can be reused for multiple competitions with minimal changes.

Why This Approach Scales

Data layer

Event schedule + real-time results from any reliable source

Model layer

A transparent, updatable prediction model (Poisson, Elo, or more advanced)

Content layer

Structured sheet or JSON payload, Canva/Figma templates or custom story renderers

Applicable Use Cases

•World Cup, Champions League, local leagues
•Esports tournaments
•Any event with a clear schedule + outcomes (awards, elections, etc.)

Technology Stack

Component	Technology
Web Scraping	`Puppeteer (Node.js)`
Data Processing	`Python 3.x`
Statistical Modeling	`NumPy, SciPy`
Content Rendering	`Node Canvas, FFmpeg`
Sheet Integration	`Google Sheets API`
Scheduling	`node-cron`
Messaging	`WhatsApp Web.js`

Key Achievements

• Automated data collection from official sources

• Transparent statistical modeling with continuous learning

• Automated content generation for multiple platforms

• Human-in-the-loop controls via Google Sheets

• End-to-end automation with scheduled delivery

Date	Stage	Match	Prediction	Result	Probabilities	Status
Dec 21, 2025	Group Stage	Morocco vs Comoros	2 - 1	2 - 0	H: 0.643 D: 0.279 A: 0.076	Finished
Dec 22, 2025	Group Stage	Mali vs Zambia	1 - 1	1 - 1	H: 0.348 D: 0.401 A: 0.251	Finished
Dec 22, 2025	Group Stage	South Africa vs Angola	2 - 1	2 - 1	H: 0.499 D: 0.252 A: 0.243	Finished
Dec 22, 2025	Group Stage	Egypt vs Zimbabwe	2 - 1	2 - 1	H: 0.656 D: 0.185 A: 0.132	Finished
Dec 23, 2025	Group Stage	Congo DR vs Benin	1 - 0	1 - 0	H: 0.575 D: 0.289 A: 0.133	Finished
Dec 23, 2025	Group Stage	Senegal vs Botswana	2 - 0	3 - 0	H: 0.719 D: 0.209 A: 0.063	Finished
Dec 23, 2025	Group Stage	Nigeria vs Tanzania	2 - 1	2 - 1	H: 0.713 D: 0.18 A: 0.087	Finished
Dec 23, 2025	Group Stage	Tunisia vs Uganda	1 - 0	3 - 1	H: 0.598 D: 0.274 A: 0.124	Finished
Dec 24, 2025	Group Stage	Burkina Faso vs Equatorial Guinea	1 - 1	2 - 1	H: 0.307 D: 0.376 A: 0.316	Finished
Dec 24, 2025	Group Stage	Algeria vs Sudan	1 - 0	3 - 0	H: 0.529 D: 0.321 A: 0.149	Finished