Chapter 31 Regularized Discriminant Analysis

We now use the Sonar dataset from the mlbench package to explore a new regularization method, regularized discriminant analysis (RDA), which combines LDA and QDA. This is similar to how elastic net combines ridge and lasso.

31.1 Sonar Data

# this is a temporary workaround for an issue with glmnet, Matrix, and R version 3.3.3
# see here: http://stackoverflow.com/questions/43282720/r-error-in-validobject-object-when-running-as-script-but-not-in-console
library(methods)
library(mlbench)
library(caret)
library(glmnet)
library(klaR)
data(Sonar)
#View(Sonar)
table(Sonar$Class) / nrow(Sonar)
## 
##         M         R 
## 0.5336538 0.4663462
ncol(Sonar) - 1
## [1] 60

31.2 RDA

Regularized discriminant analysis uses the same general setup as LDA and QDA but estimates the covariance in a new way, which combines the covariance of QDA \((\hat{\Sigma}_k)\) with the covariance of LDA \((\hat{\Sigma})\) using a tuning parameter \(\lambda\).

\[ \hat{\Sigma}_k(\lambda) = (1-\lambda)\hat{\Sigma}_k + \lambda \hat{\Sigma} \]
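As a minimal sketch of this formula, the following base R code computes the per-class covariance matrices, a pooled covariance matrix, and their convex combination for the Sonar data. The pooled estimate weighted by degrees of freedom is an assumption of this sketch (klaR may weight the classes differently), and the helper name `mix_cov` is hypothetical.

```r
library(mlbench)
data(Sonar)

X = as.matrix(Sonar[, -61])   # the 60 numeric predictors
y = Sonar$Class

# split the predictors by class, then estimate one covariance per class (QDA)
groups = split(as.data.frame(X), y)
n_k    = sapply(groups, nrow)
S_k    = lapply(groups, cov)

# pooled covariance (LDA), weighting each class by n_k - 1;
# this weighting is an assumption of the sketch
S_pooled = Reduce(`+`, Map(function(S, n) (n - 1) * S, S_k, n_k)) /
  (nrow(X) - length(S_k))

# hypothetical helper: convex combination of class and pooled covariance
mix_cov = function(S_class, S_common, lambda) {
  (1 - lambda) * S_class + lambda * S_common
}

# covariance for class "M", halfway between the QDA and LDA estimates
Sigma_M_half = mix_cov(S_k$M, S_pooled, lambda = 0.5)
dim(Sigma_M_half)
```

With `lambda = 0` the helper returns the class covariance unchanged, and with `lambda = 1` it returns the pooled covariance.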

The rda() function from the klaR package, which caret utilizes, makes an additional modification to the covariance matrix, controlled by a second tuning parameter \(\gamma\).

\[ \hat{\Sigma}_k(\lambda,\gamma) = (1 -\gamma) \hat{\Sigma}_k(\lambda) + \gamma \frac{1}{p} \text{tr}(\hat{\Sigma}_k(\lambda)) I \]
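The effect of \(\gamma\) can be sketched on a small matrix. The helper below shrinks a covariance matrix toward a multiple of the identity. The name `shrink_cov` is hypothetical and not part of klaR, and the matrix values are made up for illustration.

```r
# hypothetical helper: shrink a covariance matrix toward (tr(Sigma) / p) * I
shrink_cov = function(Sigma, gamma) {
  p = ncol(Sigma)
  (1 - gamma) * Sigma + gamma * (sum(diag(Sigma)) / p) * diag(p)
}

# a small illustrative covariance matrix (made-up values), with trace 9 and p = 3
Sigma = matrix(c(4, 1, 0,
                 1, 2, 1,
                 0, 1, 3), nrow = 3, byrow = TRUE)

shrink_cov(Sigma, gamma = 0)    # returns Sigma unchanged
shrink_cov(Sigma, gamma = 1)    # diagonal matrix with tr(Sigma) / p = 3 on the diagonal
shrink_cov(Sigma, gamma = 0.5)  # off-diagonals halved, diagonals moved halfway toward 3
```

The trace is the same for every value of `gamma`, so the total variance is preserved while the eigenvalues are pulled toward their average. For a covariance matrix with positive trace, the shrunken matrix is therefore invertible for any `gamma > 0`, even when the original is singular.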

Both \(\gamma\) and \(\lambda\) can be thought of as mixing parameters, as they both take values between 0 and 1. For the four extremes of \(\gamma\) and \(\lambda\), the covariance structure reduces to special cases:

  • \((\gamma=0, \lambda=0)\): QDA - individual covariance for each group.
  • \((\gamma=0, \lambda=1)\): LDA - a common covariance matrix.
  • \((\gamma=1, \lambda=0)\): Conditionally independent variables - similar to Naive Bayes, but within each group all variables share the same variance (the main diagonal elements are equal).
  • \((\gamma=1, \lambda=1)\): Classification using Euclidean distance - as in the previous case, but the variances are also the same for all groups. Objects are assigned to the group with the nearest mean.
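To see how these four special cases behave on the Sonar data, one option is to restrict caret's tuning grid to the corners of the \((\gamma, \lambda)\) square. This is a sketch rather than the chapter's own tuning setup: the names `corner_grid`, `cv_5`, and `fit_rda_corners` are chosen here for illustration, and 5-fold cross-validation is an assumed resampling scheme.

```r
library(mlbench)
library(caret)
library(klaR)
data(Sonar)

# the four corner cases of (gamma, lambda)
corner_grid = expand.grid(gamma = c(0, 1), lambda = c(0, 1))

# 5-fold cross-validation; cv_5 is a name chosen for this sketch
cv_5 = trainControl(method = "cv", number = 5)

set.seed(1337)
fit_rda_corners = train(Class ~ ., data = Sonar, method = "rda",
                        trControl = cv_5, tuneGrid = corner_grid)

# cross-validated accuracy for each of the four special cases
fit_rda_corners$results[, c("gamma", "lambda", "Accuracy")]
```

Comparing the four rows gives a rough sense of whether the data favor class-specific or shared covariance, and whether heavy shrinkage toward the identity helps or hurts.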

31.5 Comparison to Elastic Net

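# cv_5_grid is not defined in this excerpt; it is assumed to be a caret
# trainControl object for 5-fold cross-validation with grid search

# elastic net using the 60 predictors additively
set.seed(1337)
fit_elnet_grid = train(Class ~ ., data = Sonar, method = "glmnet",
                       trControl = cv_5_grid, tuneLength = 10)

# elastic net using the predictors and all two-way interactions
set.seed(1337)
fit_elnet_int_grid = train(Class ~ . ^ 2, data = Sonar, method = "glmnet",
                           trControl = cv_5_grid, tuneLength = 10)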

31.6 Results

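# return the row of the resampling results that matches the best tuning parameters
get_best_result = function(caret_fit) {
  # the row name of bestTune is the index of the winning row in results
  best_result = caret_fit$results[as.numeric(rownames(caret_fit$bestTune)), ]
  rownames(best_result) = NULL
  best_result
}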
knitr::kable(rbind(
  get_best_result(fit_rda_grid),
  get_best_result(fit_rda_rand)))
|     gamma|    lambda|  Accuracy|     Kappa| AccuracySD|   KappaSD|
|---------:|---------:|---------:|---------:|----------:|---------:|
| 0.5000000| 0.5000000| 0.8271777| 0.6502693|  0.0970937| 0.1962078|
| 0.1460436| 0.3391397| 0.8508711| 0.6996434|  0.0887983| 0.1797382|
knitr::kable(rbind(
  get_best_result(fit_elnet_grid),
  get_best_result(fit_elnet_int_grid)))
| alpha|    lambda|  Accuracy|     Kappa| AccuracySD|   KappaSD|
|-----:|---------:|---------:|---------:|----------:|---------:|
|   0.1| 0.0350306| 0.8271777| 0.6501991|  0.0416190| 0.0881929|
|   0.1| 0.0561881| 0.8364692| 0.6687824|  0.0751608| 0.1540228|

31.8 RMarkdown

The RMarkdown file for this chapter can be found here. The file was created using R version 4.0.2 and the following packages:

  • Base Packages, Attached
## [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"  
## [7] "base"
  • Additional Packages, Attached
## [1] "klaR"    "MASS"    "glmnet"  "Matrix"  "caret"   "ggplot2" "lattice"
## [8] "mlbench"
  • Additional Packages, Not Attached
##  [1] "splines"      "foreach"      "prodlim"      "shiny"        "highr"       
##  [6] "stats4"       "yaml"         "ipred"        "pillar"       "glue"        
## [11] "pROC"         "digest"       "promises"     "colorspace"   "recipes"     
## [16] "htmltools"    "httpuv"       "plyr"         "timeDate"     "pkgconfig"   
## [21] "labelled"     "haven"        "questionr"    "bookdown"     "purrr"       
## [26] "xtable"       "scales"       "later"        "gower"        "lava"        
## [31] "tibble"       "combinat"     "generics"     "farver"       "ellipsis"    
## [36] "withr"        "nnet"         "survival"     "magrittr"     "crayon"      
## [41] "mime"         "evaluate"     "nlme"         "forcats"      "class"       
## [46] "tools"        "data.table"   "hms"          "lifecycle"    "stringr"     
## [51] "munsell"      "compiler"     "e1071"        "rlang"        "grid"        
## [56] "iterators"    "rstudioapi"   "miniUI"       "labeling"     "rmarkdown"   
## [61] "gtable"       "ModelMetrics" "codetools"    "reshape2"     "R6"          
## [66] "lubridate"    "knitr"        "dplyr"        "fastmap"      "shape"       
## [71] "stringi"      "Rcpp"         "vctrs"        "rpart"        "tidyselect"  
## [76] "xfun"