class pclStatsBox::ConfusionMatrix
sys::Obj pclStatsBox::ConfusionMatrix
A Confusion Matrix is often used in statistics or machine learning to hold the number of observed against predicted labels from an experiment.
A confusion matrix represents "the relative frequencies with which each of a number of stimuli is mistaken for each of the others by a person in a task requiring recognition or identification of stimuli" (R. Colman, A Dictionary of Psychology, 2008). Each row represents the predicted label of an instance, and each column represents the observed label of that instance. Numbers at each (row, column) reflect the total number of instances of predicted label "row" which were observed as having label "column".
A two-class example is:

    Observed   Observed  |
    Positive   Negative  | Predicted
  -----------------------+-----------
       a          b      | Positive
       c          d      | Negative
Here the value:
a: the true positives (those predicted positive and observed positive)
b: the false negatives (those predicted positive but observed negative)
c: the false positives (those predicted negative but observed positive)
d: the true negatives (those predicted negative and observed negative)
From this table we can calculate statistics like:
  true positive rate = a/(a+b)
  positive precision = a/(a+c)
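As a rough check of the two ratios above, the cells can be read back with count and the ratios formed by hand. This is only a sketch: the class name RatiosByHand is hypothetical, and the counts are the ones used in the Usage example.

```fantom
using pclStatsBox

class RatiosByHand
{
  static Void main()
  {
    cm := ConfusionMatrix(["pos", "neg"])
    cm.addCount("pos", "pos", 10)  // a
    cm.addCount("pos", "neg", 3)   // b
    cm.addCount("neg", "pos", 5)   // c
    cm.addCount("neg", "neg", 20)  // d

    // read the cells back as Floats so the divisions are not truncated
    a := cm.count("pos", "pos").toFloat
    b := cm.count("pos", "neg").toFloat
    c := cm.count("neg", "pos").toFloat

    rowRatio := a / (a + b)  // a over the "predicted positive" row total
    colRatio := a / (a + c)  // a over the "observed positive" column total

    echo("a/(a+b) = $rowRatio")
    echo("a/(a+c) = $colRatio")
  }
}
```

With these counts the ratios are 10/13 and 10/15, which are the values the Usage output reports as Recall and Precision respectively.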
Statistics can also be calculated for the negative label; for example, the true negative rate is d/(c+d). The functions below therefore take an optional "label" parameter specifying which label they are calculated for. The default is to report for the first label named when the matrix is created.
The implementation supports confusion matrices with more than two labels. When more than two labels are in use, the statistics are calculated as if the first, or named, label were positive and all the other labels are grouped as if negative.
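A minimal sketch of the multi-label case follows. The labels "cat", "dog" and "bird", the counts, and the class name ThreeLabelExample are all made up for illustration; only the constructor, addCount, recall and precision come from this page.

```fantom
using pclStatsBox

class ThreeLabelExample
{
  static Void main()
  {
    cm := ConfusionMatrix(["cat", "dog", "bird"])
    cm.addCount("cat", "cat", 8)
    cm.addCount("cat", "dog", 2)
    cm.addCount("dog", "dog", 7)
    cm.addCount("dog", "bird", 1)
    cm.addCount("bird", "bird", 9)
    cm.addCount("bird", "cat", 3)

    // no label given: "cat", the first label, is treated as positive
    catRecall := cm.recall

    // label named: "dog" is treated as positive,
    // with "cat" and "bird" grouped together as negative
    dogRecall := cm.recall("dog")
    dogPrecision := cm.precision("dog")

    echo("Recall (cat)   : $catRecall")
    echo("Recall (dog)   : $dogRecall")
    echo("Precision (dog): $dogPrecision")
  }
}
```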
Usage
The following example creates a simple two-label confusion matrix, prints a few statistics and displays the table:

  using pclStatsBox

  class ExampleConfusionMatrix
  {
    static Void main()
    {
      cm := ConfusionMatrix(["pos", "neg"])
      cm.addCount("pos", "pos", 10)
      cm.addCount("pos", "neg", 3)
      cm.addCount("neg", "neg", 20)
      cm.addCount("neg", "pos", 5)

      echo("Confusion Matrix")
      echo("")
      echo(cm)
      echo("Precision: ${cm.precision}")
      echo("Recall : ${cm.recall}")
      echo("MCC : ${cm.matthewsCorrelation}")
    }
  }
which outputs:
  Confusion Matrix

      Observed
    pos    neg  | Predicted
  --------------+----------
     10      3  | pos
      5     20  | neg

  Precision: 0.6666666666666666
  Recall : 0.7692307692307693
  MCC : 0.5524850114241865
 addCount

Void addCount(Str predicted, Str observed, Int count := 1)
Adds count to the running total for the given (predicted, observed) labels. If count is omitted, the total is increased by 1. Throws an error if either label is not valid.
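Because count defaults to 1, a matrix can be filled one instance at a time. The sketch below assumes a hypothetical list of [predicted, observed] pairs; the class name TallyExample and the data are illustrative only.

```fantom
using pclStatsBox

class TallyExample
{
  static Void main()
  {
    cm := ConfusionMatrix(["pos", "neg"])

    // hypothetical results: each entry is [predicted, observed]
    results := [["pos", "pos"], ["pos", "neg"], ["neg", "neg"], ["pos", "pos"]]

    // count is omitted, so each call adds the default of 1
    results.each |Str[] pair| { cm.addCount(pair[0], pair[1]) }

    echo(cm.count("pos", "pos"))
    echo(cm.total)
  }
}
```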
 cohenKappa

Float cohenKappa(Str label := this.labels.first())
Cohen's Kappa statistic compares observed accuracy with an expected accuracy.
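For reference, the textbook two-class form of the statistic can be sketched as below. This is an assumption about the formula, not a statement of how this class computes it; the class name KappaByHand is hypothetical, and a, b, c, d are the cells of the two-class table with the counts from the Usage example.

```fantom
class KappaByHand
{
  static Void main()
  {
    // cell counts: rows are predicted, columns are observed
    a := 10f
    b := 3f
    c := 5f
    d := 20f
    n := a + b + c + d

    // observed accuracy: proportion of instances on the diagonal
    po := (a + d) / n

    // expected accuracy by chance, from the row and column totals
    pe := ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)

    kappa := (po - pe) / (1f - pe)
    echo("kappa = $kappa")
  }
}
```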
 count

Int count(Str predicted, Str observed)
Returns count for given (predicted, observed) labels. Throws an error if labels are not valid.
 fMeasure

Float fMeasure(Str label := this.labels.first())
Harmonic mean of the precision and recall for the given label.
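The harmonic mean of two values p and r is 2pr/(p+r). The sketch below forms it by hand from precision and recall and prints it next to fMeasure for comparison; it assumes fMeasure is the unweighted form, which this page does not state explicitly. The class name FMeasureByHand is hypothetical.

```fantom
using pclStatsBox

class FMeasureByHand
{
  static Void main()
  {
    cm := ConfusionMatrix(["pos", "neg"])
    cm.addCount("pos", "pos", 10)
    cm.addCount("pos", "neg", 3)
    cm.addCount("neg", "neg", 20)
    cm.addCount("neg", "pos", 5)

    p := cm.precision
    r := cm.recall

    // harmonic mean of precision and recall
    f := 2f * p * r / (p + r)

    echo("by hand : $f")
    echo("fMeasure: ${cm.fMeasure}")
  }
}
```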
 falseNegative

Int falseNegative(Str label := this.labels.first())
Returns the number of instances of the given label which are incorrectly observed.
 falsePositive

Int falsePositive(Str label := this.labels.first())
Returns the number of instances incorrectly observed as the given label.
 falseRate

Float falseRate(Str label := this.labels.first())
Returns the proportion of instances incorrectly observed as the given label, out of all instances not originally of that label.
 geometricMean

Float geometricMean()
Nth root of the product of the trueRate for each label, where N is the number of labels.
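The calculation described above can be sketched by hand as follows. The labels, counts and the class name GeometricMeanByHand are illustrative assumptions; the loop simply multiplies trueRate over a local copy of the label list and takes the Nth root.

```fantom
using pclStatsBox

class GeometricMeanByHand
{
  static Void main()
  {
    labels := ["cat", "dog", "bird"]
    cm := ConfusionMatrix(labels)
    cm.addCount("cat", "cat", 8)
    cm.addCount("cat", "dog", 2)
    cm.addCount("dog", "dog", 7)
    cm.addCount("dog", "bird", 1)
    cm.addCount("bird", "bird", 9)
    cm.addCount("bird", "cat", 3)

    // product of the true rate for each label
    product := 1f
    labels.each |Str label| { product = product * cm.trueRate(label) }

    // Nth root, where N is the number of labels
    byHand := product.pow(1f / labels.size.toFloat)

    echo("by hand      : $byHand")
    echo("geometricMean: ${cm.geometricMean}")
  }
}
```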
 make

new make(Str[] labels := ["positive","negative"])
Constructor takes a list of labels for the confusion matrix. There should be at least two labels; the default is ["positive", "negative"].
 matthewsCorrelation

Float matthewsCorrelation(Str label := this.labels.first())
The Matthews correlation coefficient is a measure of the quality of a binary classification.
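In terms of the two-class table, the usual definition of the coefficient is (ad - bc) / sqrt((a+b)(a+c)(d+b)(d+c)). The sketch below evaluates that expression directly for the counts in the Usage example; the class name MccByHand is hypothetical, and the formula is the standard one rather than something confirmed from this class's source.

```fantom
class MccByHand
{
  static Void main()
  {
    // cell counts: rows are predicted, columns are observed
    a := 10f
    b := 3f
    c := 5f
    d := 20f

    numerator := a * d - b * c
    denominator := ((a + b) * (a + c) * (d + b) * (d + c)).sqrt

    mcc := numerator / denominator
    echo("MCC = $mcc")
  }
}
```

For these counts the expression is 185 / sqrt(112125), about 0.5525, in line with the MCC value shown in the Usage output.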
 overallAccuracy

Float overallAccuracy()
Proportion of instances which are correctly observed.
 precision

Float precision(Str label := this.labels.first())
Precision is the proportion of instances observed as the given label which are correct, i.e. truePositive / (truePositive + falsePositive).
 prevalence

Float prevalence(Str label := this.labels.first())
Prevalence is the proportion of instances of the given label out of the total.
 recall

Float recall(Str label := this.labels.first())
Recall is equal to the trueRate, for a given label.
 sensitivity

Float sensitivity(Str label := this.labels.first())
Sensitivity is another name for the true positive rate (recall).
 specificity

Float specificity(Str label := this.labels.first())
Specificity is 1 - falseRate for a given label.
 toStr

virtual override Str toStr()
Returns a string representation of the matrix across multiple lines in a table-like format.
 total

Int total()
Returns total of all counts.
 trueNegative

Int trueNegative(Str label := this.labels.first())
Returns the number of instances NOT of the given label which are correctly observed.
 truePositive

Int truePositive(Str label := this.labels.first())
Returns the number of instances of the given label correctly observed.
 trueRate

Float trueRate(Str label := this.labels.first())
Returns the proportion of instances of the given label which are correctly observed.