| Title: | Fitting Bayesian and MLE Football Models |
|---|---|
| Description: | This is the first package allowing for the estimation, visualization and prediction of the most well-known football models: double Poisson, bivariate Poisson, Skellam, student_t, diagonal-inflated bivariate Poisson, and zero-inflated Skellam. It supports both maximum likelihood estimation (MLE, for 'static' models only) and Bayesian inference. For Bayesian methods, it incorporates several techniques: MCMC sampling with Hamiltonian Monte Carlo, variational inference using either the Pathfinder algorithm or Automatic Differentiation Variational Inference (ADVI), and the Laplace approximation. The package compiles all the 'CmdStan' models once during installation using the 'instantiate' package. The model construction relies on the most well-known football references, such as Dixon and Coles (1997) <doi:10.1111/1467-9876.00065>, Karlis and Ntzoufras (2003) <doi:10.1111/1467-9884.00366> and Egidi, Pauli and Torelli (2018) <doi:10.1177/1471082X18798414>. |
| Authors: | Leonardo Egidi [aut, cre], Roberto Macrì Demartino [aut], Vasilis Palaskas. [aut] |
| Maintainer: | Leonardo Egidi <[email protected]> |
| License: | GPL-2 |
| Version: | 2.1.0 |
| Built: | 2026-06-07 23:12:18 UTC |
| Source: | https://github.com/leoegidi/footbayes |
Fits a Bayesian Bradley-Terry-Davidson model using Stan. Supports both static and dynamic ranking models, allowing for the estimation of team strengths over time.
btd_foot( data, dynamic_rank = FALSE, home_effect = FALSE, prior_par = list(logStrength = normal(0, 3), logTie = normal(0, 0.3), home = normal(0, 5)), rank_measure = "median", method = "MCMC", ... )btd_foot( data, dynamic_rank = FALSE, home_effect = FALSE, prior_par = list(logStrength = normal(0, 3), logTie = normal(0, 0.3), home = normal(0, 5)), rank_measure = "median", method = "MCMC", ... )
data |
A data frame containing the observations with columns:
The data frame must not contain missing values. |
dynamic_rank |
A logical value indicating whether a dynamic ranking model is used (default is |
home_effect |
A logical value indicating the inclusion of a home effect in the model. (default is |
prior_par |
A list specifying the prior distributions for the parameters of interest, using the
Only normal priors are allowed for this model. |
rank_measure |
A character string specifying the method used to summarize the posterior distributions of the team strengths. Options are:
|
method |
A character string specifying the method used to obtain the Bayesian estimates. Options are:
|
... |
Additional arguments passed to |
An object of class "btdFoot", which is a list containing:
fit: The fitted CmdStanFit object returned by cmdstanr.
rank: A data frame with the rankings, including columns:
periods: The time period.
team: The team name.
rank_points: The estimated strength of the team based on the chosen rank_measure.
data: The input data.
stan_data: The data list passed to Stan.
stan_code: The Stan code of the underline model.
stan_args: The optional cmdstanr parameters passed to (...).
rank_measure: The summary statistic used to compute the rankings.
alg_method: The inference algorithm used to obtain the Bayesian estimates.
Roberto Macrì Demartino [email protected].
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2020_2021 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2020" | Season == "2021") %>% dplyr::mutate(match_outcome = dplyr::case_when( hgoal > vgoal ~ 1, # Home team wins hgoal == vgoal ~ 2, # Draw hgoal < vgoal ~ 3 # Away team wins )) %>% dplyr::mutate(periods = dplyr::case_when( dplyr::row_number() <= 190 ~ 1, dplyr::row_number() <= 380 ~ 2, dplyr::row_number() <= 570 ~ 3, TRUE ~ 4 )) %>% # Assign periods based on match number dplyr::select(periods, home_team = home, away_team = visitor, match_outcome ) # Dynamic Ranking Example with Median Rank Measure fit_result_dyn <- btd_foot( data = italy_2020_2021, dynamic_rank = TRUE, home_effect = TRUE, prior_par = list( logStrength = normal(0, 10), logTie = normal(0, 5), home = normal(0, 5) ), rank_measure = "median", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) print(fit_result_dyn) print(fit_result_dyn, pars = c("logStrength", "home"), teams = c("AC Milan", "AS Roma")) # Static Ranking Example with MAP Rank Measure fit_result_stat <- btd_foot( data = italy_2020_2021, dynamic_rank = FALSE, prior_par = list( logStrength = normal(0, 10), logTie = normal(0, 5), home = normal(0, 5) ), rank_measure = "map", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) print(fit_result_stat) } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2020_2021 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2020" | Season == "2021") %>% dplyr::mutate(match_outcome = dplyr::case_when( hgoal > vgoal ~ 1, # Home team wins hgoal == vgoal ~ 2, # Draw hgoal < vgoal ~ 3 # Away team wins )) %>% dplyr::mutate(periods = dplyr::case_when( dplyr::row_number() <= 190 ~ 1, dplyr::row_number() <= 380 ~ 2, dplyr::row_number() <= 570 ~ 3, TRUE ~ 4 )) %>% # Assign periods based on match number dplyr::select(periods, home_team = home, away_team = visitor, match_outcome ) # Dynamic Ranking Example with Median Rank Measure fit_result_dyn <- btd_foot( data = italy_2020_2021, dynamic_rank = TRUE, home_effect = TRUE, prior_par = list( logStrength = normal(0, 10), logTie = normal(0, 5), home = normal(0, 5) ), rank_measure = "median", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) print(fit_result_dyn) print(fit_result_dyn, pars = c("logStrength", "home"), teams = c("AC Milan", "AS Roma")) # Static Ranking Example with MAP Rank Measure fit_result_stat <- btd_foot( data = italy_2020_2021, dynamic_rank = FALSE, prior_par = list( logStrength = normal(0, 10), logTie = normal(0, 5), home = normal(0, 5) ), rank_measure = "map", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) print(fit_result_stat) } ## End(Not run)
Compares multiple football models or directly provided probability matrices based on specified metrics (accuracy, Brier score, ranked probability score, Pseudo , average coverage probability), using a test dataset. Additionally, computes the confusion matrices. The function returns an object of class compareFoot.
compare_foot( source, test_data, metric = c("accuracy", "brier", "ACP", "pseudoR2", "RPS"), conf_matrix = FALSE )compare_foot( source, test_data, metric = c("accuracy", "brier", "ACP", "pseudoR2", "RPS"), conf_matrix = FALSE )
source |
A named list containing either:
|
test_data |
A data frame containing the test dataset, with columns:
|
metric |
A character vector specifying the metrics to use for comparison. Options are:
Default is |
conf_matrix |
A logical value indicating whether to generate a confusion matrix comparing predicted outcomes against actual outcomes for each model or probability matrix. Default is |
The function extracts predictions from each model or directly uses the provided probability matrices and computes the chosen metrics on the test dataset. It also possible to compute confusion matrices.
An object of class compare_foot_output, which is a list containing:
metrics: A data frame containing the metric values for each model or probability matrix.
confusion_matrix: Confusion matrices for each model or probability matrix.
Roberto Macrì Demartino [email protected]
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000") colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") # Example with fitted models fit_1 <- stan_foot( data = italy_2000, model = "double_pois", predict = 18 ) # Double Poisson model fit_2 <- stan_foot( data = italy_2000, model = "biv_pois", predict = 18 ) # Bivariate Poisson model italy_2000_test <- italy_2000[289:306, ] compare_results_models <- compare_foot( source = list( double_poisson = fit_1, bivariate_poisson = fit_2 ), test_data = italy_2000_test, metric = c("accuracy", "brier", "ACP", "pseudoR2", "RPS"), conf_matrix = TRUE ) print(compare_results_models) # Example with probability matrices home_team <- c( "AC Milan", "Inter", "Juventus", "AS Roma", "Napoli", "Lazio", "Atalanta", "Fiorentina", "Torino", "Sassuolo", "Udinese" ) away_team <- c( "Juventus", "Napoli", "Inter", "Atalanta", "Lazio", "AC Milan", "Sassuolo", "Torino", "Fiorentina", "Udinese", "AS Roma" ) # Home and Away goals based on given data home_goals <- c(2, 0, 2, 2, 3, 1, 4, 2, 1, 1, 2) away_goals <- c(1, 0, 1, 3, 2, 1, 1, 2, 1, 1, 2) # Combine into a data frame test_data <- data.frame(home_team, away_team, home_goals, away_goals) # Define the data for each column pW <- c(0.51, 0.45, 0.48, 0.53, 0.56, 0.39, 0.52, 0.55, 0.61, 0.37, 0.35) pD <- c(0.27, 0.25, 0.31, 0.18, 0.23, 0.30, 0.24, 0.26, 0.18, 0.19, 0.22) pL <- c(0.22, 0.30, 0.21, 0.29, 0.21, 0.31, 0.24, 0.19, 0.21, 0.44, 0.43) # Create the data frame table_prob table_prob <- data.frame(pW, pD, pL) matrix_prob <- as.matrix(table_prob) # Use compare_foot function compare_results_matrices <- compare_foot( source = list(matrix_1 = matrix_prob), test_data = test_data, metric = c("accuracy", "brier", "pseudoR2", "ACP", "RPS") ) # Print the results print(compare_results_matrices) } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000") colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") # Example with fitted models fit_1 <- stan_foot( data = italy_2000, model = "double_pois", predict = 18 ) # Double Poisson model fit_2 <- stan_foot( data = italy_2000, model = "biv_pois", predict = 18 ) # Bivariate Poisson model italy_2000_test <- italy_2000[289:306, ] compare_results_models <- compare_foot( source = list( double_poisson = fit_1, bivariate_poisson = fit_2 ), test_data = italy_2000_test, metric = c("accuracy", "brier", "ACP", "pseudoR2", "RPS"), conf_matrix = TRUE ) print(compare_results_models) # Example with probability matrices home_team <- c( "AC Milan", "Inter", "Juventus", "AS Roma", "Napoli", "Lazio", "Atalanta", "Fiorentina", "Torino", "Sassuolo", "Udinese" ) away_team <- c( "Juventus", "Napoli", "Inter", "Atalanta", "Lazio", "AC Milan", "Sassuolo", "Torino", "Fiorentina", "Udinese", "AS Roma" ) # Home and Away goals based on given data home_goals <- c(2, 0, 2, 2, 3, 1, 4, 2, 1, 1, 2) away_goals <- c(1, 0, 1, 3, 2, 1, 1, 2, 1, 1, 2) # Combine into a data frame test_data <- data.frame(home_team, away_team, home_goals, away_goals) # Define the data for each column pW <- c(0.51, 0.45, 0.48, 0.53, 0.56, 0.39, 0.52, 0.55, 0.61, 0.37, 0.35) pD <- c(0.27, 0.25, 0.31, 0.18, 0.23, 0.30, 0.24, 0.26, 0.18, 0.19, 0.22) pL <- c(0.22, 0.30, 0.21, 0.29, 0.21, 0.31, 0.24, 0.19, 0.21, 0.44, 0.43) # Create the data frame table_prob table_prob <- data.frame(pW, pD, pL) matrix_prob <- as.matrix(table_prob) # Use compare_foot function compare_results_matrices <- compare_foot( source = list(matrix_1 = matrix_prob), test_data = test_data, metric = c("accuracy", "brier", "pseudoR2", "ACP", "RPS") ) # Print the results print(compare_results_matrices) } ## End(Not run)
All results for English soccer games in the top 4 tiers from 1888/89 season to 2021/22 season.
englandengland
A data frame with 203956 rows and 12 variables:
Date of match
Season of match - refers to starting year
Home team
Visiting team
Full-time result
Goals scored by home team
Goals scored by visiting team
Division: 1,2,3,4 or 3N (Old 3-North) or 3S (Old 3-South)
Tier of football pyramid: 1,2,3,4
Total goals in game
Goal difference in game home goals - visitor goals
Result: H-Home Win, A-Away Win, D-Draw
Depicts teams' abilities either from the Stan models fitted via the stan_foot function
or from MLE models fitted via the mle_foot function. Supported MLE models include
"double_pois", "biv_pois", "dixon_coles", "neg_bin",
"skellam", and "student_t". Models with separate attack and defence parameters
(all except "student_t") display attack and defence effects; the "student_t" model
displays combined global abilities.
foot_abilities(object, data, type = "both", teams = NULL)foot_abilities(object, data, type = "both", teams = NULL)
object |
An object either of class |
data |
A data frame containing match data with columns:
|
type |
Type of ability for models with separate attack/defence parameters: one among |
teams |
An optional character vector specifying team names to include. If |
A ggplot object showing each selected team’s ability estimates:
For static Bayesian or MLE fits, horizontal error bars (95% intervals) and point estimates.
For dynamic Bayesian fits, ribbon and line plots over periods.
Leonardo Egidi [email protected] and Roberto Macrì Demartino [email protected].
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy <- as_tibble(italy) ### no dynamics, no prediction italy_2000_2002 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000" | Season == "2001" | Season == "2002") colnames(italy_2000_2002) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit1 <- stan_foot( data = italy_2000_2002, model = "double_pois" ) # double poisson fit2 <- stan_foot( data = italy_2000_2002, model = "biv_pois" ) # bivariate poisson fit3 <- stan_foot( data = italy_2000_2002, model = "skellam" ) # skellam fit4 <- stan_foot( data = italy_2000_2002, model = "student_t" ) # student_t foot_abilities(fit1, italy_2000_2002) foot_abilities(fit2, italy_2000_2002) foot_abilities(fit3, italy_2000_2002) foot_abilities(fit4, italy_2000_2002) ### seasonal dynamics, predict the last season fit5 <- stan_foot( data = italy_2000_2002, model = "biv_pois", predict = 180, dynamic_type = "seasonal" ) # bivariate poisson foot_abilities(fit5, italy_2000_2002) } ### MLE models (no Stan required) data("italy") italy_2000 <- italy[italy$Season == "2000", ] italy_2000 <- data.frame( periods = italy_2000$Season, home_team = italy_2000$home, away_team = italy_2000$visitor, home_goals = italy_2000$hgoal, away_goals = italy_2000$vgoal ) ## Dixon-Coles MLE mle_dc <- mle_foot(data = italy_2000, model = "dixon_coles") foot_abilities(mle_dc, italy_2000) ## Negative Binomial MLE mle_nb <- mle_foot(data = italy_2000, model = "neg_bin") foot_abilities(mle_nb, italy_2000, type = "attack") ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy <- as_tibble(italy) ### no dynamics, no prediction italy_2000_2002 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000" | Season == "2001" | Season == "2002") colnames(italy_2000_2002) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit1 <- stan_foot( data = italy_2000_2002, model = "double_pois" ) # double poisson fit2 <- stan_foot( data = italy_2000_2002, model = "biv_pois" ) # bivariate poisson fit3 <- stan_foot( data = italy_2000_2002, model = "skellam" ) # skellam fit4 <- stan_foot( data = italy_2000_2002, model = "student_t" ) # student_t foot_abilities(fit1, italy_2000_2002) foot_abilities(fit2, italy_2000_2002) foot_abilities(fit3, italy_2000_2002) foot_abilities(fit4, italy_2000_2002) ### seasonal dynamics, predict the last season fit5 <- stan_foot( data = italy_2000_2002, model = "biv_pois", predict = 180, dynamic_type = "seasonal" ) # bivariate poisson foot_abilities(fit5, italy_2000_2002) } ### MLE models (no Stan required) data("italy") italy_2000 <- italy[italy$Season == "2000", ] italy_2000 <- data.frame( periods = italy_2000$Season, home_team = italy_2000$home, away_team = italy_2000$visitor, home_goals = italy_2000$hgoal, away_goals = italy_2000$vgoal ) ## Dixon-Coles MLE mle_dc <- mle_foot(data = italy_2000, model = "dixon_coles") foot_abilities(mle_dc, italy_2000) ## Negative Binomial MLE mle_nb <- mle_foot(data = italy_2000, model = "neg_bin") foot_abilities(mle_nb, italy_2000, type = "attack") ## End(Not run)
The function provides a table containing the home win, draw and away win probabilities for a bunch of
out-of-sample matches as specified by stan_foot or mle_foot.
foot_prob(object, data, home_team, away_team)foot_prob(object, data, home_team, away_team)
object |
An object either of class |
data |
A data frame containing match data with columns:
|
home_team |
The home team(s) for the predicted matches. |
away_team |
The away team(s) for the predicted matches. |
For Bayesian models the results probabilities are computed according to the
simulation from the posterior predictive distribution of future (out-of-sample) matches.
Specifically, matches are ordered from those in which the favorite team has the highest posterior probability
of winning to those where the underdog is more likely to win. For MLE models
fitted via the mle_foot the probabilities are computed by simulating from the MLE estimates.
Supported MLE models include: "double_pois", "biv_pois", "dixon_coles",
"neg_bin", "skellam", and "student_t".
For the Dixon-Coles model, predictions incorporate the low-score dependence adjustment via weighted resampling from independent Poisson simulations. For the negative binomial model, predictions use the NB2 parameterization with separate overdispersion parameters for home and away goals.
A list with components:
prob_table: A data frame containing the results probabilities of the out-of-sample matches.
prob_plot: A ggplot object for Bayesian models only showing the posterior predictive heatmap
of exact score probabilities, with the true result highlighted.
Leonardo Egidi [email protected] and Roberto Macrì Demartino [email protected].
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000") colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot( data = italy_2000, model = "double_pois", predict = 18 ) # double pois foot_prob( fit, italy_2000, "Inter", "Bologna FC" ) foot_prob(fit, italy_2000) # all the out-of-sample matches } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000") colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot( data = italy_2000, model = "double_pois", predict = 18 ) # double pois foot_prob( fit, italy_2000, "Inter", "Bologna FC" ) foot_prob(fit, italy_2000) # all the out-of-sample matches } ## End(Not run)
Posterior predictive plots and final rank table for football seasons.
foot_rank(object, data, teams = NULL, visualize = "individual")foot_rank(object, data, teams = NULL, visualize = "individual")
object |
An object either of class |
data |
A data frame containing match data with columns:
|
teams |
An optional character vector specifying team names to include. If |
visualize |
Type of plots, one among |
For Bayesian models fitted via stan_foot the final rank tables are computed according to the
simulation from the posterior predictive distribution of future (out-of-sample) matches.
The dataset should refer to one or more seasons from a given national football league (Premier League, Serie A, La Liga, etc.).
If visualize = "aggregated": a list with
rank_table: A data frame of observed and simulated final points (median, 25%/75% quantiles).
rank_plot: A ggplot comparing observed vs simulated final points for each team.
If visualize = "individual": A ggplot showing, for each selected team, the observed and
simulated cumulative points over match‑days.
Leonardo Egidi [email protected] and Roberto Macrì Demartino [email protected]
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_1999_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "1999" | Season == "2000") colnames(italy_1999_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot(italy_1999_2000, "double_pois", iter_sampling = 200) foot_rank(fit, italy_1999_2000) foot_rank(fit, italy_1999_2000, visualize = "individual") } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_1999_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "1999" | Season == "2000") colnames(italy_1999_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot(italy_1999_2000, "double_pois", iter_sampling = 200) foot_rank(fit, italy_1999_2000) foot_rank(fit, italy_1999_2000, visualize = "individual") } ## End(Not run)
Posterior predictive probabilities for a football season in a round-robin format
foot_round_robin(object, data, teams = NULL, output = "both")foot_round_robin(object, data, teams = NULL, output = "both")
object |
An object either of class |
data |
A data frame containing match data with columns:
|
teams |
An optional character vector specifying team names to include. If |
output |
An optional character string specifying the type of output to return. One of |
For Bayesian models fitted via stan_foot the round-robin table is computed according to the
simulation from the posterior predictive distribution of future (out-of-sample) matches.
The dataset should refer to one or more seasons from a given national football league (Premier League, Serie A, La Liga, etc.).
If output = "both" a list with:
round_table: A data frame of matchups (Home, Away), observed scores, and Home_prob (median posterior probability of a home win).
round_plot: A ggplot heatmap of home‑win probabilities with observed scores overlaid.
If output = "table" or "plot", returns only that component.
Leonardo Egidi [email protected] and Roberto Macrì Demartino [email protected]
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_1999_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "1999" | Season == "2000") colnames(italy_1999_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot(italy_1999_2000, "double_pois", predict = 45, iter_sampling = 200) foot_round_robin(fit, italy_1999_2000) foot_round_robin(fit, italy_1999_2000, c("Parma AC", "AS Roma")) } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_1999_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "1999" | Season == "2000") colnames(italy_1999_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot(italy_1999_2000, "double_pois", predict = 45, iter_sampling = 200) foot_round_robin(fit, italy_1999_2000) foot_round_robin(fit, italy_1999_2000, c("Parma AC", "AS Roma")) } ## End(Not run)
All results for Italian soccer games in the top tier from 1934/35 season to 2021/22 season.
italyitaly
A data frame with 27684 rows and 8 variables:
Date of match
Season of match - refers to starting year
Home team
Visiting team
Full-time result
Goals scored by home team
Goals scored by visiting team
Tier of football pyramid: 1
Fits football goal-based models using maximum likelihood estimation. Supported models include: double Poisson, bivariate Poisson, Dixon-Coles, negative binomial, Skellam, and Student's t.
mle_foot( data, model, predict = 0, maxit = 1000, method = "BFGS", interval = "profile", hessian = FALSE, sigma_y = 1 )mle_foot( data, model, predict = 0, maxit = 1000, method = "BFGS", interval = "profile", hessian = FALSE, sigma_y = 1 )
data |
A data frame containing match data with columns:
|
model |
A character specifying the model to fit. Options are:
|
predict |
An integer specifying the number of out-of-sample matches for prediction. If missing, the function fits the model to the entire dataset without making predictions. |
maxit |
An integer specifying the maximum number of optimizer iterations default is 1000). |
method |
A character specifying the optimization method. Options are
For further details see |
interval |
A character specifying the interval type for confidence intervals. Options are
|
hessian |
A logical value indicating to include the computation of the Hessian (default FALSE). |
sigma_y |
A positive numeric value indicating the scale parameter for Student t likelihood (default 1). |
MLE can be obtained only for static models, with no time-dependence.
Likelihood optimization is performed via the BFGS method
of the {optim} function in stats package.
A named list containing:
att: A matrix of attack ratings, with MLE and 95% confidence intervals (for "double_pois", "biv_pois", "dixon_coles", "neg_bin" and "skellam" models).
def: A matrix of defence ratings, with MLE and 95% confidence intervals (for "double_pois", "biv_pois", "dixon_coles", "neg_bin" and "skellam" models).
abilities: A matrix of combined ability, with MLE and 95% confidence intervals (for "student_t" only).
home_effect: A matrix with with MLE and 95% confidence intervals for the home effect estimate.
corr: A matrix with MLE and 95% confidence intervals for the bivariate Poisson correlation parameter (for "biv_pois" only).
rho: A matrix with MLE and 95% confidence intervals for the Dixon-Coles dependence parameter (for "dixon_coles" only).
overdispersion: A matrix with MLE and 95% confidence intervals for the home and away overdispersion parameters (for "neg_bin" only).
model: The name of the fitted model (character).
predict: The number of out-of-sample matches used for prediction (integer).
sigma_y: The scale parameter used in the Student t likelihood (for "student_t" only).
team1_prev: Integer indices of home teams in the out-of-sample matches (if predict > 0).
team2_prev: Integer indices of away teams in the out-of-sample matches (if predict > 0).
logLik: The maximized log likelihood (numeric).
aic: Akaike Information Criterion (numeric).
bic: Bayesian Information Criterion (numeric).
Leonardo Egidi [email protected] and Roberto Macrì Demartino [email protected]
Baio, G. and Blangiardo, M. (2010). Bayesian hierarchical model for the prediction of football results. Journal of Applied Statistics 37(2), 253-264.
Dixon, M. J. and Coles, S. G. (1997). Modelling association football scores and inefficiencies in the football betting market. Journal of the Royal Statistical Society: Series C (Applied Statistics) 46(2), 265-280.
Egidi, L., Pauli, F., and Torelli, N. (2018). Combining historical data and bookmakers' odds in modelling football scores. Statistical Modelling, 18(5-6), 436-459.
Gelman, A. (2014). Stan goes to the World Cup. From "Statistical Modeling, Causal Inference, and Social Science" blog.
Karlis, D. and Ntzoufras, I. (2003). Analysis of sports data by using bivariate poisson models. Journal of the Royal Statistical Society: Series D (The Statistician) 52(3), 381-393.
Karlis, D. and Ntzoufras,I. (2009). Bayesian modelling of football outcomes: Using the Skellam's distribution for the goal difference. IMA Journal of Management Mathematics 20(2), 133-145.
Owen, A. (2011). Dynamic Bayesian forecasting models of football match outcomes with estimation of the evolution variance parameter. IMA Journal of Management Mathematics, 22(2), 99-113.
## Not run: library(dplyr) data("italy") italy <- as_tibble(italy) italy_2000_2002 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000" | Season == "2001" | Season == "2002") colnames(italy_2000_2002) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") mle_fit <- mle_foot( data = italy_2000_2002, model = "double_pois" ) ## End(Not run)## Not run: library(dplyr) data("italy") italy <- as_tibble(italy) italy_2000_2002 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000" | Season == "2001" | Season == "2002") colnames(italy_2000_2002) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") mle_fit <- mle_foot( data = italy_2000_2002, model = "double_pois" ) ## End(Not run)
btdFoot ObjectsPlots for the posterior distributions of team log-strengths and other parameters with customizable plot types and facets.
plot_btdPosterior( x, pars = "logStrength", plot_type = "boxplot", teams = NULL, ncol = NULL, scales = NULL )plot_btdPosterior( x, pars = "logStrength", plot_type = "boxplot", teams = NULL, ncol = NULL, scales = NULL )
x |
An object of class |
pars |
A character string specifying the parameter to plot.
Choices are |
plot_type |
A character string specifying the type of plot.
Choices are |
teams |
An optional character vector specifying team names to include in the posterior
boxplots or density plots. If |
ncol |
An optional integer specifying the number of columns in the facet wrap
when using a dynamic Bayesian Bradley-Terry-Davidson model.
Default is |
scales |
An optional character string specifying the scales for the facets
when using a dynamic Bayesian Bradley-Terry-Davidson model.
Options include |
A ggplot object displaying:
For pars="logStrength":
Dynamic BTD: Faceted boxplots or density plots (including the 95% credible interval) of posterior log-strengths by team and period.
Static BTD: Boxplots or density plots (including the 95% credible interval) of posterior log-strengths for each team.
For pars="logTie" or pars="home": A single boxplot or density plot with 95% credible interval.
Roberto Macrì Demartino [email protected].
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) # Load example data data("italy") # Prepare the data italy_2020_2021_rank <- italy %>% select(Season, home, visitor, hgoal, vgoal) %>% filter(Season %in% c("2020", "2021")) %>% mutate(match_outcome = case_when( hgoal > vgoal ~ 1, # Home team wins hgoal == vgoal ~ 2, # Draw hgoal < vgoal ~ 3 # Away team wins )) %>% mutate(periods = case_when( row_number() <= 190 ~ 1, row_number() <= 380 ~ 2, row_number() <= 570 ~ 3, TRUE ~ 4 )) %>% # Assign periods based on match number select(periods, home_team = home, away_team = visitor, match_outcome ) # Fit the Bayesian Bradley-Terry-Davidson model with dynamic ranking fit_rank_dyn <- btd_foot( data = italy_2020_2021_rank, dynamic_rank = TRUE, rank_measure = "median", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) # Plot posterior distributions with default settings plot_btdPosterior(fit_rank_dyn) # Plot posterior distributions for specific teams with customized facets plot_btdPosterior( fit_rank_dyn, teams = c("AC Milan", "AS Roma", "Juventus", "Inter"), ncol = 2 ) plot_btdPosterior( fit_rank_dyn, plot_type = "density", teams = c("AC Milan", "AS Roma", "Juventus", "Inter"), ncol = 2 ) } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) # Load example data data("italy") # Prepare the data italy_2020_2021_rank <- italy %>% select(Season, home, visitor, hgoal, vgoal) %>% filter(Season %in% c("2020", "2021")) %>% mutate(match_outcome = case_when( hgoal > vgoal ~ 1, # Home team wins hgoal == vgoal ~ 2, # Draw hgoal < vgoal ~ 3 # Away team wins )) %>% mutate(periods = case_when( row_number() <= 190 ~ 1, row_number() <= 380 ~ 2, row_number() <= 570 ~ 3, TRUE ~ 4 )) %>% # Assign periods based on match number select(periods, home_team = home, away_team = visitor, match_outcome ) # Fit the Bayesian Bradley-Terry-Davidson model with dynamic ranking fit_rank_dyn <- btd_foot( data = italy_2020_2021_rank, dynamic_rank = TRUE, rank_measure = "median", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) # Plot posterior distributions with default settings plot_btdPosterior(fit_rank_dyn) # Plot posterior distributions for specific teams with customized facets plot_btdPosterior( fit_rank_dyn, teams = c("AC Milan", "AS Roma", "Juventus", "Inter"), ncol = 2 ) plot_btdPosterior( fit_rank_dyn, plot_type = "density", teams = c("AC Milan", "AS Roma", "Juventus", "Inter"), ncol = 2 ) } ## End(Not run)
Visualizes team rankings based on whether the ranking is dynamic or static.
plot_logStrength(x, teams = NULL)plot_logStrength(x, teams = NULL)
x |
An object of class |
teams |
An optional character vector specifying team names to include in the rankings plot. If |
Dynamic Ranking: Plots Rank Points over Periods for each team with lines and points.
Static Ranking: Plots Rank Points on the x-axis against Team Names on the y-axis with horizontal lines and points.
A ggplot object:
Dynamic BTD: A lineplot for the log_strengths over each period, colored by team.
Static BTD: An horizontal barplot for each team.
Roberto Macrì Demartino [email protected].
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2020_2021_rank <- italy %>% select(Season, home, visitor, hgoal, vgoal) %>% filter(Season == "2020" | Season == "2021") %>% mutate(match_outcome = case_when( hgoal > vgoal ~ 1, # Home team wins hgoal == vgoal ~ 2, # Draw hgoal < vgoal ~ 3 # Away team wins )) %>% mutate(periods = case_when( row_number() <= 190 ~ 1, row_number() <= 380 ~ 2, row_number() <= 570 ~ 3, TRUE ~ 4 )) %>% # Assign periods based on match number select(periods, home_team = home, away_team = visitor, match_outcome ) fit_rank_dyn <- btd_foot( data = italy_2020_2021_rank, dynamic_rank = TRUE, rank_measure = "median", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) plot_logStrength(fit_rank_dyn) plot_logStrength(fit_rank_dyn, teams = c("AC Milan", "AS Roma", "Juventus", "Inter")) } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2020_2021_rank <- italy %>% select(Season, home, visitor, hgoal, vgoal) %>% filter(Season == "2020" | Season == "2021") %>% mutate(match_outcome = case_when( hgoal > vgoal ~ 1, # Home team wins hgoal == vgoal ~ 2, # Draw hgoal < vgoal ~ 3 # Away team wins )) %>% mutate(periods = case_when( row_number() <= 190 ~ 1, row_number() <= 380 ~ 2, row_number() <= 570 ~ 3, TRUE ~ 4 )) %>% # Assign periods based on match number select(periods, home_team = home, away_team = visitor, match_outcome ) fit_rank_dyn <- btd_foot( data = italy_2020_2021_rank, dynamic_rank = TRUE, rank_measure = "median", iter_sampling = 1000, parallel_chains = 2, chains = 2 ) plot_logStrength(fit_rank_dyn) plot_logStrength(fit_rank_dyn, teams = c("AC Milan", "AS Roma", "Juventus", "Inter")) } ## End(Not run)
The function provides posterior predictive plots to check the adequacy of the Bayesian models as
returned by the stan_foot function.
pp_foot(object, data, type = "aggregated", coverage = 0.95)pp_foot(object, data, type = "aggregated", coverage = 0.95)
object |
An object either of class |
data |
A data frame containing match data with columns:
|
type |
Type of plots, one among |
coverage |
Argument to specify the width |
Posterior predictive plots: when "aggregated" (default) is selected, the function
returns a frequency plot for some pre-selected goal-difference values,
along with their correspondent Bayesian p-values, computed as
, where is a data replication from the
posterior predictive distribution (more details in Gelman et al., 2013).
Bayesian p-values very close to 0 or 1 could exhibit
possible model misfits.
When "matches" is selected an ordered-frequency plot for all the
goal-differences in the considered matches is provided, along with the
empirical Bayesian coverage at level .
A list with elements:
pp_plot: A ggplot object for the selected type of plot.
pp_table: A data frame of summary statistics:
For "aggregated": Goal differences and their Bayesian p‑values.
For "matches": Nominal 1-alpha and observed empirical Bayesian coverage.
Leonardo Egidi [email protected] and Roberto Macrì Demartino [email protected]
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis. CRC press.
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000") colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot(italy_2000, "double_pois", iter_sampling = 200) pp_foot(fit, italy_2000) } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) data("italy") italy_2000 <- italy %>% dplyr::select(Season, home, visitor, hgoal, vgoal) %>% dplyr::filter(Season == "2000") colnames(italy_2000) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") fit <- stan_foot(italy_2000, "double_pois", iter_sampling = 200) pp_foot(fit, italy_2000) } ## End(Not run)
Provides detailed posterior summaries for the Bayesian Bradley-Terry-Davidson model parameters.
## S3 method for class 'btdFoot' print( x, pars = NULL, teams = NULL, digits = 3, true_names = TRUE, display = "both", ... )## S3 method for class 'btdFoot' print( x, pars = NULL, teams = NULL, digits = 3, true_names = TRUE, display = "both", ... )
x |
An object of class |
pars |
Optional character vector specifying parameters to include in the summary (e.g., |
teams |
Optional character vector specifying team names whose |
digits |
Number of digits to use when printing numeric values. Default is |
true_names |
Logical value indicating whether to display team names in parameter summaries. Default is |
display |
Character string specifying which parts of the output to display. Options are |
... |
Additional arguments passed. |
Roberto Macrì Demartino [email protected]
Provides a formatted output when printing objects of class compareFoot, displaying the predictive performance metrics and, if available, the confusion matrices for each model or probability matrix.
## S3 method for class 'compareFoot' print(x, digits = 3, ...)## S3 method for class 'compareFoot' print(x, digits = 3, ...)
x |
An object of class |
digits |
Number of digits to use when printing numeric values for the metrics. Default is |
... |
Additional arguments passed to |
Roberto Macrì Demartino [email protected]
Provides detailed posterior summaries for the Stan football model parameters.
## S3 method for class 'stanFoot' print(x, pars = NULL, teams = NULL, digits = 3, true_names = TRUE, ...)## S3 method for class 'stanFoot' print(x, pars = NULL, teams = NULL, digits = 3, true_names = TRUE, ...)
x |
An object of class |
pars |
Optional character vector specifying parameters to include in the summary. This can be specific parameter names (e.g., |
teams |
Optional character vector specifying team names whose |
digits |
Number of digits to use when printing numeric values. Default is |
true_names |
Logical value indicating whether to display team names in parameter summaries. Default is |
... |
Additional arguments passed. |
Roberto Macrì Demartino [email protected]
This prior specification is just a duplicate of some of the priors used by the rstanarm package.
These prior distributions can be passed to the
stan_foot function, through the arguments prior and prior_sd.
See the vignette Prior
Distributions for rstanarm Models for further details (to view the priors used for an existing model see
prior_summary).
The default priors used in the stan_foot modeling function
are intended to be weakly informative in that they provide moderate
regularlization and help stabilize computation.
You can choose between: normal, cauchy, laplace, student_t.
normal(location = 0, scale = NULL, autoscale = TRUE) student_t(df = 1, location = 0, scale = NULL, autoscale = TRUE) cauchy(location = 0, scale = NULL, autoscale = TRUE) laplace(location = 0, scale = NULL, autoscale = TRUE)normal(location = 0, scale = NULL, autoscale = TRUE) student_t(df = 1, location = 0, scale = NULL, autoscale = TRUE) cauchy(location = 0, scale = NULL, autoscale = TRUE) laplace(location = 0, scale = NULL, autoscale = TRUE)
location |
Prior location. In most cases, this is the prior mean, but
for |
scale |
Prior scale. The default depends on the family (see Details). |
autoscale |
A logical scalar, defaulting to |
df |
Prior degrees of freedom. The default is |
The details depend on the family of the prior being used:
Family members:
normal(location, scale)
student_t(df, location, scale)
cauchy(location, scale)
Each of these functions also takes an argument autoscale.
For the prior distribution for the intercept, location,
scale, and df should be scalars. For the prior for the other
coefficients they can either be vectors of length equal to the number of
coefficients (not including the intercept), or they can be scalars, in
which case they will be recycled to the appropriate length. As the
degrees of freedom approaches infinity, the Student t distribution
approaches the normal distribution and if the degrees of freedom are one,
then the Student t distribution is the Cauchy distribution.
If scale is not specified it will default to for the
intercept and for the other coefficients.
If the autoscale argument is TRUE (the default), then the
scales will be further adjusted as described above in the documentation of
the autoscale argument in the Arguments section.
Family members:
laplace(location, scale)
Each of these functions also takes an argument autoscale.
The Laplace distribution is also known as the double-exponential distribution. It is a symmetric distribution with a sharp peak at its mean / median / mode and fairly long tails. This distribution can be motivated as a scale mixture of normal distributions and the remarks above about the normal distribution apply here as well.
A named list to be used internally by the
stan_foot model fitting function.
Leonardo Egidi [email protected]
Gelman, A., Jakulin, A., Pittau, M. G., and Su, Y. (2008). A weakly informative default prior distribution for logistic and other regression models. Annals of Applied Statistics. 2(4), 1360–1383.
The various vignettes for the rstanarm package also discuss and demonstrate the use of some of the supported prior distributions.
Fits football goal-based models using Stan via the CmdStan backend. Supported models include: double Poisson, bivariate Poisson, Dixon-Coles, Skellam, Student's t, diagonal-inflated bivariate Poisson, zero-inflated Skellam, and negative Binomial.
stan_foot( data, model, predict = 0, ranking, dynamic_type, dynamic_weight = FALSE, dynamic_par = list(common_sd = FALSE, kl_variance = FALSE, spike = normal(10, 0.1), slab = normal(0, 3)), prior_par = list(ability = normal(0, NULL), ability_sd = cauchy(0, 5), home = normal(0, 5)), home_effect = TRUE, norm_method = "none", ranking_map = NULL, method = "MCMC", ... )stan_foot( data, model, predict = 0, ranking, dynamic_type, dynamic_weight = FALSE, dynamic_par = list(common_sd = FALSE, kl_variance = FALSE, spike = normal(10, 0.1), slab = normal(0, 3)), prior_par = list(ability = normal(0, NULL), ability_sd = cauchy(0, 5), home = normal(0, 5)), home_effect = TRUE, norm_method = "none", ranking_map = NULL, method = "MCMC", ... )
data |
A data frame containing match data with columns:
|
model |
A character string specifying the Stan model to fit. Options are:
|
predict |
An integer specifying the number of out-of-sample matches for prediction. If missing, the function fits the model to the entire dataset without making predictions. |
ranking |
An optional
|
dynamic_type |
A character string specifying the type of dynamics in the model. Options are:
|
dynamic_weight |
A logical value indicating whether to use a weighted dynamic model
with a commensurate prior for the dynamic attack and defense parameters (default |
dynamic_par |
List of hyperparameters for dynamic models. Elements:
|
prior_par |
A list specifying the prior distributions for the parameters of interest:
See the rstanarm package for more details on specifying priors. |
home_effect |
A logical value indicating the inclusion of a home effect in the model. (default is |
norm_method |
A character string specifying the method used to normalize team-specific ranking points. Options are:
|
ranking_map |
An optional vector mapping ranking periods to data periods. If not provided and the number of ranking periods matches the number of data periods, a direct mapping is assumed. |
method |
A character string specifying the method used to obtain the Bayesian estimates. Options are:
|
... |
Additional arguments passed to |
Let denote the
observed number of goals scored by the home
and the away team in the -th game,
respectively. A general bivariate Poisson model
allowing for goals' correlation
(Karlis & Ntzoufras, 2003) is the following:
where the case reduces to
the double Poisson model (Baio & Blangiardo, 2010).
represent the
scoring rates for the home and the away team,
respectively, where: is the home effect;
the parameters and
represent the attack and the
defence abilities,
respectively, for each team , ;
the nested indexes
denote the home and the away team playing in the -th game,
respectively. Attack/defence parameters are imposed a
sum-to-zero constraint to achieve identifiability and
assigned some weakly-informative prior distributions:
with hyperparameters .
The Dixon and Coles (1997) model keeps the two scoring rates
of the double Poisson model but
multiplies the joint probability of low-scoring results by a
dependence factor to better capture the observed
correlation in the and
outcomes:
where
and is the diagonal-inflation dependence parameter,
assigned a weakly-informative prior and
constrained to keep the adjustment factor positive.
Instead of using the marginal number of goals,
another alternative is to modelling directly
the score difference .
We can use the Poisson-difference distribution
(or Skellam distribution) to model goal
difference in the -th match (Karlis & Ntzoufras, 2009):
and the scoring rates are
unchanged with respect to the bivariate/double Poisson model.
If we want to use a continue distribution, we can
use a student t distribution with 7 degrees of
freedom (Gelman, 2014):
where is the overall ability for
the -th team, whereas
is a prior measure of team's strength (for instance a
ranking).
These model rely on the assumption of static parameters.
However, we could assume dynamics in the attach/defence
abilities (Owen, 2011; Egidi et al., 2018, Macri Demartino et al., 2024) in terms of weeks or seasons through the argument
dynamic_type. In such a framework, for a given
number of times , the models
above would be unchanged, but the priors for the abilities
parameters at each time would be:
whereas for we have:
Of course, the identifiability constraint must be imposed for
each time .
The Koopman and Lit (2015) approach extends the dynamic model by allowing
the evolution variance to increase at structural break points (e.g., summer
transfer windows). Specifically, the variance at time is:
where and is an indicator function.
The current version of the package allows for the fit of a diagonal-inflated bivariate Poisson and a zero-inflated Skellam model in the spirit of (Karlis & Ntzoufras, 2003) to better capture draw occurrences. See the vignette for further details.
An object of class "stanFoot", which is a list containing:
fit: The CmdStanFit object returned by cmdstanr.
data: The input data.
stan_data: The data list passed to Stan.
stan_code: The Stan code of the underline model.
stan_args: The optional cmdstanr parameters passed to (...).
alg_method: The inference algorithm used to obtain the Bayesian estimates.
Leonardo Egidi [email protected], Roberto Macrì Demartino [email protected], and Vasilis Palaskas [email protected].
Baio, G. and Blangiardo, M. (2010). Bayesian hierarchical model for the prediction of football results. Journal of Applied Statistics 37(2), 253-264.
Dixon, M. J. and Coles, S. G. (1997). Modelling association football scores and inefficiencies in the football betting market. Journal of the Royal Statistical Society: Series C (Applied Statistics), 46(2), 265-280.
Egidi, L., Pauli, F., and Torelli, N. (2018). Combining historical data and bookmakers' odds in modelling football scores. Statistical Modelling, 18(5-6), 436-459.
Gelman, A. (2014). Stan goes to the World Cup. From "Statistical Modeling, Causal Inference, and Social Science" blog.
Koopman, S. J. and Lit, R. (2015). A dynamic bivariate Poisson model for analysing and forecasting match results in the English Premier League. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(1), 167-186.
Macrì Demartino, R., Egidi, L. and Torelli, N. Alternative ranking measures to predict international football results. Computational Statistics (2024), 1-19.
Karlis, D. and Ntzoufras, I. (2003). Analysis of sports data by using bivariate poisson models. Journal of the Royal Statistical Society: Series D (The Statistician) 52(3), 381-393.
Karlis, D. and Ntzoufras,I. (2009). Bayesian modelling of football outcomes: Using the Skellam's distribution for the goal difference. IMA Journal of Management Mathematics 20(2), 133-145.
Owen, A. (2011). Dynamic Bayesian forecasting models of football match outcomes with estimation of the evolution variance parameter. IMA Journal of Management Mathematics, 22(2), 99-113.
## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) # Example usage with Koopman & Lit (2015) approach data("italy") italy <- as_tibble(italy) italy_multi <- italy %>% select(Season, home, visitor, hgoal, vgoal) %>% filter(Season %in% c("2018", "2019", "2020", "2021")) colnames(italy_multi) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") # Fit with K&L variance inflation at summer breaks fit_kl <- stan_foot( data = italy_multi, model = "biv_pois", dynamic_type = "seasonal", dynamic_par = list(kl_variance = TRUE), home_effect = TRUE, iter_sampling = 1000, chains = 4, parallel_chains = 4 ) print(fit_kl, pars = c("sigma_att_kl", "sigma_def_kl", "sigma_break")) } ## End(Not run)## Not run: if (instantiate::stan_cmdstan_exists()) { library(dplyr) # Example usage with Koopman & Lit (2015) approach data("italy") italy <- as_tibble(italy) italy_multi <- italy %>% select(Season, home, visitor, hgoal, vgoal) %>% filter(Season %in% c("2018", "2019", "2020", "2021")) colnames(italy_multi) <- c("periods", "home_team", "away_team", "home_goals", "away_goals") # Fit with K&L variance inflation at summer breaks fit_kl <- stan_foot( data = italy_multi, model = "biv_pois", dynamic_type = "seasonal", dynamic_par = list(kl_variance = TRUE), home_effect = TRUE, iter_sampling = 1000, chains = 4, parallel_chains = 4 ) print(fit_kl, pars = c("sigma_att_kl", "sigma_def_kl", "sigma_break")) } ## End(Not run)