\documentclass[a4paper]{article} \bibliographystyle{apalike} \usepackage[utf8]{inputenc} %\VignetteIndexEntry{Consensus models} \title{Consensus models weighted by AUC for multiple class responses} \author{Andreas Dominik Cullmann} \SweaveOpts{echo=FALSE} \SweaveOpts{print=FALSE} \SweaveOpts{width=60} \begin{document} \maketitle \providecommand{\rpack}[1]{package \texttt{#1}} \providecommand{\rdata}[1]{\texttt{#1} data} \providecommand{\rcode}[1]{\texttt{#1}} \section{Introduction} This vignette shows how to build a consensus model for the \rdata{fgl} (see \cite{MASS} or \rcode{?MASS::fgl}). We will follow the modelling process shown in \cite[Figure 1]{marmion2009} restricting ourselves to only two 'Single-models': a classification tree and a multinomial log-linear model using \rpack{rpart} and \rpack{nnet}, respectively. \section{Example} <>= options(useFancyQuotes="UTF-8") @ After loading the data, we create random indices for training and evaluation sets (since the training and evaluation sets are complements with respect to the data set, two indices are redundant, but \rcode{fgl[ind\_eval, ]} might be easier to read than \rcode{fgl[!ind\_train, ]}). <>= library(MASS) data(fgl) set.seed(100) ind_train <- sample(nrow(fgl), size = floor(nrow(fgl) * 0.7), replace = FALSE) ind_eval <- setdiff(seq(1:nrow(fgl)), ind_train) @ Choosing \rcode{fgl\$type} as response and all other variables as predictors, we calibrate a classification tree using \rcode{rpart::rpart} (cf. \cite[p. 264]{MASS}): <>= library(rpart) set.seed(123) fgl_rpart <- rpart(type ~ ., data = fgl[ind_train, TRUE], parms = list(split = "information")) newcp <- max(fgl_rpart$cptable[,"CP"] * as.numeric(fgl_rpart$cptable[TRUE ,"xerror"] < sum(fgl_rpart$cptable[dim(fgl_rpart$cptable)[1], c("xerror","xstd")])) ) + 1e-13 fgl_rpart_pruned <- prune(fgl_rpart, cp = newcp) @ and a multinomial log-linear model using \rcode{nnet::multinom}: <>= library(nnet) fgl_multinom <- multinom(type ~ ., data = fgl[ind_train, TRUE], trace = FALSE) @ We can now assess the model accuracy using either a confusion matrix of the classified predictions <>= library(mda) confusion(predict(fgl_rpart_pruned, newdata = fgl[ind_eval, TRUE], type = "class"), fgl[ind_eval, "type"]) confusion(predict(fgl_multinom, newdata = fgl[ind_eval, TRUE], type = "class"), fgl[ind_eval, "type"]) @ or a multiple class version of AUC using the raw predictions: <>= library(HandTill2001) auc(multcap(response = fgl[ind_eval, "type"], predicted = predict(fgl_rpart_pruned, newdata = fgl[ind_eval, TRUE]))) auc(multcap(response = fgl[ind_eval, "type"], predicted = predict(fgl_multinom, newdata = fgl[ind_eval, TRUE], type = "probs"))) @ To enhance predictive performance, we decide to use both models to build a consensus model. Furthermore, we want to use the 'weighted average' consensus method given by \cite[Eqn 1]{marmion2009}, which uses the pre-evaluated AUC of the models as weights. So we split the training set into two complementary subsets: 'inner training' and 'inner evaluation'. <>= set.seed(100) ind_inner_train <- sample(ind_train, size = floor(length(ind_train) * 0.7), replace = FALSE) ind_inner_eval <- setdiff(ind_train, ind_inner_train) @ We then refit our two models to the 'inner training' data: <>= wa_fgl_multinom <- multinom(fgl_multinom, data = fgl[ind_inner_train, ], trace = FALSE) wa_fgl_rpart <- rpart(type ~ ., data = fgl[ind_inner_train, ], parms = list(split = "information") ) newcp <- max(wa_fgl_rpart$cptable[,"CP"] * as.numeric(wa_fgl_rpart$cptable[TRUE ,"xerror"] < sum(wa_fgl_rpart$cptable[dim(wa_fgl_rpart$cptable)[1], c("xerror","xstd")])) ) + 1e-13 wa_fgl_rpart_pruned <- prune(wa_fgl_rpart, cp = newcp) @ and calculate pre-evaluation AUC (which we save in a \rcode{list}) using the 'inner evaluation' data and the refitted models: <>= li <- list() li$rpart$auc <- auc(multcap(response = fgl[ind_inner_eval, "type"], predicted = predict(wa_fgl_rpart_pruned, newdata = fgl[ind_inner_eval, TRUE] ))) li$mllm$auc <- auc(multcap(response = fgl[ind_inner_eval, "type"], predicted = predict(wa_fgl_multinom, newdata = fgl[ind_inner_eval, TRUE], type = "probs"))) @ We add the predictions using the models (the 'original' or 'Single' ones, not the refitted) for the evaluation set <>= li$rpart$predictions <- predict(fgl_rpart_pruned, newdata = fgl[ind_eval, TRUE]) li$mllm$predictions <- predict(fgl_multinom, newdata = fgl[ind_eval, TRUE], type = "probs") @ and obtain the consensus predictions as (\cite[Eqn 1]{marmion2009}) <>= predicted <- Reduce('+', lapply(li, function(x) x$auc * x$predictions) ) / Reduce('+', sapply(li,"[", "auc")) @ which perform (slightly) better than the predictions using the 'single models': <>= auc(multcap(response = fgl[ind_eval, "type"], predicted = predicted)) @ To classify the predicted probabilities, we choose the class with highest predicted probability (which.max gives the \_first\_ maximum): <>= classes_predicted <- factor(x = apply(predicted, 1, function(x) dimnames(predicted)[[2]][which.max(x)]), levels = levels(fgl[ind_eval, "type"]) ) @ \bibliography{bibliography} \end{document}