pusion.evaluation.evaluation_metrics module¶
- pusion.evaluation.evaluation_metrics.multi_label_brier_score_micro(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the Brier score for multi-label problems according to Brier (1950).
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The micro Brier score.
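A minimal sketch of this metric, assuming the micro Brier score is the mean squared difference between the true assignment matrix and the predicted scores over all sample–class pairs (hypothetical helper name, not necessarily the module's implementation):

    import numpy as np

    def brier_micro_sketch(y_true: np.ndarray, y_pred: np.ndarray) -> float:
        # Mean squared difference over all (sample, class) pairs.
        y_true = np.asarray(y_true, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        return float(np.mean((y_true - y_pred) ** 2))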
- pusion.evaluation.evaluation_metrics.multi_label_brier_score(y_true: numpy.ndarray, y_pred: numpy.ndarray)¶
Calculate the Brier score for multi-label problems according to Brier (1950).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The Brier score.
- pusion.evaluation.evaluation_metrics.multiclass_brier_score(y_true: numpy.ndarray, y_pred: numpy.ndarray)¶
Calculate the Brier score for multiclass problems according to Brier (1950).
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The Brier score.
- pusion.evaluation.evaluation_metrics.far(y_true: numpy.ndarray, y_pred: numpy.ndarray, pos_normal_class: int = 0) float ¶
Calculate the false alarm rate (FAR) for multiclass and multi-label problems:
FAR = (number of normal-class samples incorrectly classified) / (number of all normal-class samples) * 100
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
pos_normal_class – the position of the ‘normal’ class in y_true and y_pred. Default is 0.
- Returns
The false alarm rate.
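A minimal sketch of the stated FAR formula, assuming crisp multi-hot assignments in which column pos_normal_class flags the normal class (hypothetical helper name):

    import numpy as np

    def far_sketch(y_true, y_pred, pos_normal_class=0):
        # Samples whose true assignment is the normal class.
        normal = y_true[:, pos_normal_class] == 1
        # Of those, count the samples not predicted as normal.
        false_alarms = np.sum(y_pred[normal, pos_normal_class] != 1)
        return false_alarms / np.sum(normal) * 100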
- pusion.evaluation.evaluation_metrics.multiclass_fdr(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the fault detection rate (FDR) for multiclass problems:
FDR = (number of correctly classified faulty samples) / (number of all faulty samples) * 100
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The fault detection rate.
- pusion.evaluation.evaluation_metrics.multilabel_subset_fdr(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the fault detection rate (FDR) for multi-label problems:
FDR = (number of correctly classified faulty samples) / (number of all faulty samples) * 100
The function considers the faulty subset, i.e., a faulty sample counts as detected only if the entire set of predicted faulty labels strictly matches the true set of faulty labels.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The fault detection rate.
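Minimal sketches of the two FDR variants above, assuming crisp assignments with the normal class at position 0 (hypothetical helper names; the module's implementations may differ):

    import numpy as np

    # Multiclass: a faulty sample counts as detected if its predicted
    # class equals its true (faulty) class.
    def multiclass_fdr_sketch(y_true, y_pred, normal_class=0):
        t = np.argmax(y_true, axis=1)
        p = np.argmax(y_pred, axis=1)
        faulty = t != normal_class
        return np.sum(p[faulty] == t[faulty]) / np.sum(faulty) * 100

    # Multi-label subset variant: the predicted label set must match
    # the true label set exactly for a faulty sample to count.
    def multilabel_subset_fdr_sketch(y_true, y_pred, normal_class=0):
        faulty = y_true[:, normal_class] == 0
        exact = np.all(y_true[faulty] == y_pred[faulty], axis=1)
        return np.sum(exact) / np.sum(faulty) * 100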
- pusion.evaluation.evaluation_metrics.multilabel_minor_fdr(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the fault detection rate (FDR) for multi-label problems:
FDR = (number of correctly classified faulty samples) / (number of all faulty samples) * 100
The function considers the faulty subset, i.e., whether the entire set of predicted faulty labels for a sample strictly matches the true set of faulty labels.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The fault detection rate.
- pusion.evaluation.evaluation_metrics.multiclass_weighted_precision(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the precision for a multiclass problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples,). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,). Predicted labels or class assignments.
- Returns
The precision score.
- pusion.evaluation.evaluation_metrics.multi_label_weighted_precision(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the precision for a multi-label problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The precision score.
- pusion.evaluation.evaluation_metrics.multiclass_class_wise_precision(y_true: numpy.ndarray, y_pred: numpy.ndarray) numpy.ndarray ¶
Calculate the class-wise precision for a multiclass problem (average None).
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Sequence of precision scores (one for each class).
- pusion.evaluation.evaluation_metrics.multi_label_class_wise_precision(y_true: numpy.ndarray, y_pred: numpy.ndarray) numpy.ndarray ¶
Calculate the class-wise precision for a multi-label problem (average None).
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Sequence of precision scores (one for each class).
- pusion.evaluation.evaluation_metrics.multiclass_recall(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the recall for a multiclass problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples,). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,). Predicted labels or class assignments.
- Returns
The recall score.
- pusion.evaluation.evaluation_metrics.multi_label_recall(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the recall for a multi-label problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The recall score.
- pusion.evaluation.evaluation_metrics.multiclass_class_wise_recall(y_true: numpy.ndarray, y_pred: numpy.ndarray) numpy.ndarray ¶
Calculate the class-wise recall for a multiclass problem (average None).
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Sequence of recall scores (one for each class).
- pusion.evaluation.evaluation_metrics.multi_label_class_wise_recall(y_true: numpy.ndarray, y_pred: numpy.ndarray) numpy.ndarray ¶
Calculate the class-wise recall for a multi-label problem (average None).
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Sequence of recall scores (one for each class).
- pusion.evaluation.evaluation_metrics.multiclass_weighted_scikit_auc_roc_score(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the scikit-learn ROC AUC score for a multiclass problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The ROC AUC score.
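A sketch of the likely underlying call, assuming the function wraps sklearn.metrics.roc_auc_score with a one-vs-rest scheme (the module's exact arguments are an assumption):

    import numpy as np
    from sklearn.metrics import roc_auc_score

    y_true = np.array([0, 1, 2, 1])            # class labels
    y_score = np.array([[0.8, 0.1, 0.1],       # per-class scores
                        [0.2, 0.6, 0.2],
                        [0.1, 0.2, 0.7],
                        [0.3, 0.5, 0.2]])
    auc = roc_auc_score(y_true, y_score, multi_class='ovr', average='weighted')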
- pusion.evaluation.evaluation_metrics.multi_label_weighted_pytorch_auc_roc_score(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the PyTorch ROC AUC score for a multi-label problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The ROC AUC score.
- pusion.evaluation.evaluation_metrics.multi_label_pytorch_auc_roc_score(y_true: numpy.ndarray, y_pred: numpy.ndarray)¶
Compute the PyTorch ROC AUC score for a multi-label problem with average None.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Class-wise ROC AUC scores.
- pusion.evaluation.evaluation_metrics.multiclass_class_wise_avg_precision(y_true: numpy.ndarray, y_pred: numpy.ndarray)¶
Compute the class-wise average precision for a multiclass problem (average None).
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Class-wise average precision scores.
- pusion.evaluation.evaluation_metrics.multiclass_weighted_avg_precision(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the average precision for a multiclass problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples,). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The average precision score.
- pusion.evaluation.evaluation_metrics.multiclass_auc_precision_recall_curve(y_true: numpy.ndarray, y_pred: numpy.ndarray)¶
Compute the class-wise AUC of the precision-recall curve for a multiclass problem.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Class-wise AUC scores of the precision-recall curve.
- pusion.evaluation.evaluation_metrics.multiclass_weighted_pytorch_auc_roc(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the PyTorch ROC AUC for a multiclass problem with a weighted average.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The ROC AUC score.
- pusion.evaluation.evaluation_metrics.multiclass_pytorch_auc_roc(y_true: numpy.ndarray, y_pred: numpy.ndarray)¶
Compute the PyTorch ROC AUC for a multiclass problem with average None.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Class-wise ROC AUC scores.
- pusion.evaluation.evaluation_metrics.multi_label_ranking_avg_precision_score(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the label-ranking-based average precision score for a multi-label problem.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The label ranking average precision score.
- pusion.evaluation.evaluation_metrics.multi_label_ranking_loss(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the label ranking loss for a multi-label problem.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The ranking loss.
- pusion.evaluation.evaluation_metrics.multi_label_normalized_discounted_cumulative_gain(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the normalized discounted cumulative gain for a multi-label problem.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The normalized discounted cumulative gain.
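The three ranking metrics above have direct scikit-learn counterparts; a small usage sketch (assuming the module wraps these or equivalent functions):

    import numpy as np
    from sklearn.metrics import (label_ranking_average_precision_score,
                                 label_ranking_loss, ndcg_score)

    y_true = np.array([[1, 0, 1], [0, 1, 0]])               # multi-hot labels
    y_score = np.array([[0.9, 0.2, 0.6], [0.1, 0.8, 0.3]])  # continuous scores

    lrap = label_ranking_average_precision_score(y_true, y_score)
    loss = label_ranking_loss(y_true, y_score)
    gain = ndcg_score(y_true, y_score)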
- pusion.evaluation.evaluation_metrics.multiclass_top_1_accuracy(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the top-1 accuracy for a multiclass problem.
- Parameters
y_true – numpy.array of shape (n_samples,). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The accuracy score.
- pusion.evaluation.evaluation_metrics.multiclass_top_3_accuracy(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the top-3 accuracy for a multiclass problem.
- Parameters
y_true – numpy.array of shape (n_samples,). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The accuracy score.
- pusion.evaluation.evaluation_metrics.multiclass_top_5_accuracy(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Compute the top-5 accuracy for a multiclass problem.
- Parameters
y_true – numpy.array of shape (n_samples,). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The accuracy score.
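The top-k accuracies can be reproduced with sklearn.metrics.top_k_accuracy_score; a usage sketch (whether the module uses this function is an assumption):

    import numpy as np
    from sklearn.metrics import top_k_accuracy_score

    y_true = np.array([0, 1, 2, 3])
    y_score = np.array([[0.4, 0.3, 0.2, 0.1],
                        [0.2, 0.4, 0.3, 0.1],
                        [0.1, 0.2, 0.3, 0.4],
                        [0.3, 0.3, 0.2, 0.2]])
    top1 = top_k_accuracy_score(y_true, y_score, k=1)
    top3 = top_k_accuracy_score(y_true, y_score, k=3)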
- pusion.evaluation.evaluation_metrics.multiclass_log_loss(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the logarithmic loss for a multiclass problem.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The loss.
- pusion.evaluation.evaluation_metrics.multi_label_log_loss(y_true: numpy.ndarray, y_pred: numpy.ndarray) float ¶
Calculate the logarithmic loss for a multi-label problem.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The loss.
- pusion.evaluation.evaluation_metrics.micro_precision(y_true, y_pred)¶
Calculate the micro precision, i.e. TP / (TP + FP).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The micro precision.
- pusion.evaluation.evaluation_metrics.micro_recall(y_true, y_pred)¶
Calculate the micro recall, i.e. TP / (TP + FN).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The micro recall.
- pusion.evaluation.evaluation_metrics.micro_f1(y_true, y_pred)¶
Calculate the micro F1-score, i.e. 2 * (Precision * Recall) / (Precision + Recall).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The micro F1-score.
- pusion.evaluation.evaluation_metrics.micro_f2(y_true, y_pred)¶
Calculate the micro F2-score (beta=2).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The micro F2-score.
- pusion.evaluation.evaluation_metrics.micro_jaccard(y_true, y_pred)¶
Calculate the micro Jaccard-score, i.e. TP / (TP + FP + FN).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The micro Jaccard-score.
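The micro-averaged metrics above correspond to scikit-learn calls with average='micro'; a usage sketch (the module's exact wrappers are assumed):

    import numpy as np
    from sklearn.metrics import (precision_score, recall_score, f1_score,
                                 fbeta_score, jaccard_score)

    # Multi-hot assignments; 1-D label vectors work as well.
    y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
    y_pred = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 0]])

    p  = precision_score(y_true, y_pred, average='micro')      # TP / (TP + FP)
    r  = recall_score(y_true, y_pred, average='micro')         # TP / (TP + FN)
    f1 = f1_score(y_true, y_pred, average='micro')
    f2 = fbeta_score(y_true, y_pred, beta=2, average='micro')
    j  = jaccard_score(y_true, y_pred, average='micro')        # TP / (TP + FP + FN)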
- pusion.evaluation.evaluation_metrics.macro_precision(y_true, y_pred)¶
Calculate the macro precision, i.e. TP / (TP + FP).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The macro precision.
- pusion.evaluation.evaluation_metrics.macro_recall(y_true, y_pred)¶
Calculate the macro recall, i.e. TP / (TP + FN).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The macro recall.
- pusion.evaluation.evaluation_metrics.macro_f1(y_true, y_pred)¶
Calculate the macro F1-score, i.e. 2 * (Precision * Recall) / (Precision + Recall).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The macro F1-score.
- pusion.evaluation.evaluation_metrics.weighted_f1(y_true, y_pred)¶
Calculate the F1-score, i.e. 2 * (Precision * Recall) / (Precision + Recall), for each class and average the results weighted by class support.
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The weighted F1-score.
- pusion.evaluation.evaluation_metrics.macro_f2(y_true, y_pred)¶
Calculate the macro F2-score (beta=2).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The macro F2-score.
- pusion.evaluation.evaluation_metrics.macro_jaccard(y_true, y_pred)¶
Calculate the macro Jaccard-score, i.e. TP / (TP + FP + FN).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The macro Jaccard-score.
- pusion.evaluation.evaluation_metrics.weighted_jaccard(y_true, y_pred)¶
Calculate the Jaccard-score for each label and find their average, weighted by support, i.e., the number of true instances for each label.
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The weighted Jaccard-score.
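Likewise, the macro- and support-weighted variants above correspond to average='macro' and average='weighted' in scikit-learn; a usage sketch under the same assumption:

    import numpy as np
    from sklearn.metrics import f1_score, fbeta_score, jaccard_score

    y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
    y_pred = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 0]])

    f1_macro    = f1_score(y_true, y_pred, average='macro')
    f1_weighted = f1_score(y_true, y_pred, average='weighted')   # support-weighted
    f2_macro    = fbeta_score(y_true, y_pred, beta=2, average='macro')
    j_weighted  = jaccard_score(y_true, y_pred, average='weighted')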
- pusion.evaluation.evaluation_metrics.accuracy(y_true, y_pred)¶
Calculate the accuracy, i.e. (TP + TN) / (TP + FP + FN + TN).
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Accuracy.
- pusion.evaluation.evaluation_metrics.error_rate(y_true, y_pred)¶
Calculate the error rate, i.e. error_rate = 1 - accuracy.
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
The error rate (float).
- pusion.evaluation.evaluation_metrics.balanced_multiclass_accuracy(y_true, y_pred)¶
Calculate the balanced accuracy, i.e. (Precision + Recall) / 2.
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Balanced accuracy.
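A sketch of accuracy, error rate, and balanced accuracy; note that sklearn.metrics.balanced_accuracy_score implements the macro-averaged recall definition, which may differ from the formula stated above:

    import numpy as np
    from sklearn.metrics import accuracy_score, balanced_accuracy_score

    y_true = np.array([0, 1, 2, 1])
    y_pred = np.array([0, 2, 2, 1])

    acc      = accuracy_score(y_true, y_pred)
    err_rate = 1.0 - acc                           # error_rate = 1 - accuracy
    bal_acc  = balanced_accuracy_score(y_true, y_pred)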
- pusion.evaluation.evaluation_metrics.mean_multilabel_confusion_matrix(y_true, y_pred)¶
Calculate the normalized mean confusion matrix across all classes.
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
numpy.array of shape (n_classes, n_classes). Normalized mean confusion matrix.
- pusion.evaluation.evaluation_metrics.mean_confidence(y_true, y_pred)¶
Calculate the mean confidence for continuous multiclass and multilabel classification outputs.
- Parameters
y_true – numpy.array of shape (n_samples, n_classes). True class assignments.
y_pred – numpy.array of shape (n_samples, n_classes). Predicted class assignments.
- Returns
Mean confidence.
- pusion.evaluation.evaluation_metrics.hamming(y_true, y_pred)¶
Calculate the average Hamming Loss.
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Average Hamming Loss.
- pusion.evaluation.evaluation_metrics.log(y_true, y_pred)¶
Calculate the Logistic Loss.
- Parameters
y_true – numpy.array of shape (n_samples,) or (n_samples, n_classes). True labels or class assignments.
y_pred – numpy.array of shape (n_samples,) or (n_samples, n_classes). Predicted labels or class assignments.
- Returns
Logistic Loss.
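Both losses above have scikit-learn counterparts; a usage sketch (whether the module wraps exactly these calls is an assumption):

    import numpy as np
    from sklearn.metrics import hamming_loss, log_loss

    # Hamming loss on crisp multi-hot assignments.
    y_true = np.array([[1, 0, 1], [0, 1, 0]])
    y_pred = np.array([[1, 0, 0], [0, 1, 0]])
    h = hamming_loss(y_true, y_pred)               # fraction of mismatched labels

    # Logistic loss requires probabilistic predictions.
    y_prob = np.array([[0.7, 0.1, 0.2], [0.2, 0.6, 0.2]])
    ll = log_loss(np.array([0, 1]), y_prob, labels=[0, 1, 2])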
- pusion.evaluation.evaluation_metrics.cohens_kappa(y1, y2, labels)¶
Calculate the Cohen’s Kappa annotator agreement score according to [1].
- [1]
Jacob Cohen. A coefficient of agreement for nominal scales. Educational and psychological measurement, 20(1):37–46, 1960.
- Parameters
y1 – numpy.array of shape (n_samples,) or (n_samples, n_classes). Labels or class assignments.
y2 – numpy.array of shape (n_samples,) or (n_samples, n_classes). Labels or class assignments.
labels – list of all possible labels.
- Returns
Cohen’s Kappa score.
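A usage sketch via sklearn.metrics.cohen_kappa_score (assumed to be equivalent to this function):

    from sklearn.metrics import cohen_kappa_score

    y1 = [0, 1, 2, 1, 0]
    y2 = [0, 2, 2, 1, 0]
    kappa = cohen_kappa_score(y1, y2, labels=[0, 1, 2])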
- pusion.evaluation.evaluation_metrics.pairwise_cohens_kappa(decision_tensor)¶
Calculate the average of pairwise Cohen’s Kappa scores over all multiclass decision outputs. E.g., for 3 classifiers (0,1,2), the agreement score is calculated for the classifier pairs (0,1), (0,2) and (1,2). These scores are then averaged over all pairs.
- Parameters
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp multiclass decision outputs by different classifiers per sample.
- Returns
Average pairwise Cohen’s Kappa score.
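A minimal sketch of the pairwise averaging described above, assuming crisp one-hot decision outputs (hypothetical helper name):

    from itertools import combinations
    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    def pairwise_kappa_sketch(decision_tensor):
        # Crisp one-hot outputs -> one label vector per classifier.
        labels = np.argmax(decision_tensor, axis=2)
        # Average Cohen's Kappa over all classifier pairs.
        scores = [cohen_kappa_score(labels[i], labels[j])
                  for i, j in combinations(range(len(labels)), 2)]
        return float(np.mean(scores))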
- pusion.evaluation.evaluation_metrics.correlation(y1, y2, y_true)¶
Calculate the correlation score for decision outputs of two classifiers according to Kuncheva [2].
- [2]
Ludmila I Kuncheva. Combining pattern classifiers: methods and algorithms. John Wiley & Sons, 2014.
- Parameters
y1 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the first classifier.
y2 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the second classifier.
y_true – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Correlation score.
- pusion.evaluation.evaluation_metrics.q_statistic(y1, y2, y_true)¶
Calculate the Q statistic score for decision outputs of two classifiers according to Yule [3].
- [3]
G Udny Yule. On the association of attributes in statistics: with illustrations from the material of the childhood society, &c. Philosophical Transactions of the Royal Society of London Series A, 194:257–319, 1900.
- Parameters
y1 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the first classifier.
y2 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the second classifier.
y_true – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Q statistic score.
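Both pairwise scores above can be derived from the 2x2 correctness table of the two classifiers: N11 (both correct), N00 (both wrong), N10 and N01 (exactly one correct). A minimal sketch assuming crisp one-hot outputs (hypothetical helper names; degenerate tables with zero denominators are not handled):

    import numpy as np

    def pair_counts(y1, y2, y_true):
        # Per-sample correctness of each classifier (crisp outputs).
        c1 = np.all(y1 == y_true, axis=1)
        c2 = np.all(y2 == y_true, axis=1)
        n11 = np.sum(c1 & c2)      # both correct
        n00 = np.sum(~c1 & ~c2)    # both wrong
        n10 = np.sum(c1 & ~c2)     # only the first correct
        n01 = np.sum(~c1 & c2)     # only the second correct
        return n11, n10, n01, n00

    def correlation_sketch(y1, y2, y_true):
        n11, n10, n01, n00 = pair_counts(y1, y2, y_true)
        num = n11 * n00 - n01 * n10
        den = np.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
        return num / den

    def q_statistic_sketch(y1, y2, y_true):
        n11, n10, n01, n00 = pair_counts(y1, y2, y_true)
        return (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)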
- pusion.evaluation.evaluation_metrics.kappa_statistic(y1, y2, y_true)¶
Calculate the kappa score for decision outputs of two classifiers according to Kuncheva [2].
- Parameters
y1 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the first classifier.
y2 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the second classifier.
y_true – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Kappa score.
- pusion.evaluation.evaluation_metrics.disagreement(y1, y2, y_true)¶
Calculate the disagreement for decision outputs of two classifiers, i.e. the percentage of samples which are correctly classified by exactly one of the classifiers.
- Parameters
y1 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the first classifier.
y2 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the second classifier.
y_true – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Disagreement score.
- pusion.evaluation.evaluation_metrics.double_fault(y1, y2, y_true)¶
Calculate the double fault for decision outputs of two classifiers, i.e. the percentage of samples which are misclassified by both classifiers.
- Parameters
y1 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the first classifier.
y2 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the second classifier.
y_true – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Double fault score.
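Disagreement and double fault follow from the same per-sample correctness vectors; a minimal sketch that returns fractions (multiply by 100 for percentages; hypothetical helper names):

    import numpy as np

    def disagreement_sketch(y1, y2, y_true):
        c1 = np.all(y1 == y_true, axis=1)   # first classifier correct?
        c2 = np.all(y2 == y_true, axis=1)   # second classifier correct?
        return np.mean(c1 ^ c2)             # exactly one of them correct

    def double_fault_sketch(y1, y2, y_true):
        c1 = np.all(y1 == y_true, axis=1)
        c2 = np.all(y2 == y_true, axis=1)
        return np.mean(~c1 & ~c2)           # both wrong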
- pusion.evaluation.evaluation_metrics.abs_correlation(y1, y2, y_true)¶
Calculate the absolute correlation score for decision outputs of two classifiers.
- Parameters
y1 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the first classifier.
y2 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the second classifier.
y_true – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Absolute correlation score.
- pusion.evaluation.evaluation_metrics.abs_q_statistic(y1, y2, y_true)¶
Calculate the absolute Q statistic score for decision outputs of two classifiers.
- Parameters
y1 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the first classifier.
y2 – numpy.array of shape (n_samples, n_classes). Crisp multiclass decision outputs by the second classifier.
y_true – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Absolute Q statistic score.
- pusion.evaluation.evaluation_metrics.pairwise_correlation(decision_tensor, true_assignments, **kwargs)¶
Calculate the average of the pairwise absolute correlation scores over all decision outputs.
- Parameters
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp multiclass decision outputs by different classifiers per sample.
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Pairwise correlation score.
- pusion.evaluation.evaluation_metrics.pairwise_q_statistic(decision_tensor, true_assignments)¶
Calculate the average of the pairwise absolute Q-statistic scores over all decision outputs.
- Parameters
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp multiclass decision outputs by different classifiers per sample.
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Pairwise Q-statistic score.
- pusion.evaluation.evaluation_metrics.pairwise_kappa_statistic(decision_tensor, true_assignments, **kwargs)¶
Calculate the average of pairwise Kappa scores over all decision outputs. Multilabel class assignments are transformed to equivalent multiclass class assignments.
- Parameters
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp multiclass decision outputs by different classifiers per sample.
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Pairwise kappa score.
- pusion.evaluation.evaluation_metrics.pairwise_disagreement(decision_tensor, true_assignments)¶
Calculate the average of pairwise disagreement scores over all decision outputs. Multilabel class assignments are transformed to equivalent multiclass class assignments.
- Parameters
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp multiclass decision outputs by different classifiers per sample.
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Pairwise disagreement score.
- pusion.evaluation.evaluation_metrics.pairwise_double_fault(decision_tensor, true_assignments, **kwargs)¶
Calculate the average of pairwise double fault scores over all decision outputs. Multilabel class assignments are transformed to equivalent multiclass class assignments.
- Parameters
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp multiclass decision outputs by different classifiers per sample.
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered as true.
- Returns
Pairwise double fault score.
- pusion.evaluation.evaluation_metrics.pairwise_euclidean_distance(decision_tensor)¶
Calculate the average of the pairwise Euclidean distances between the decision matrices of the given classifiers.
- Parameters
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp multiclass decision outputs by different classifiers per sample.
- Returns
Pairwise Euclidean distance.
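A minimal sketch of the pairwise distance average, assuming the Euclidean (Frobenius) norm between whole decision matrices (hypothetical helper name):

    from itertools import combinations
    import numpy as np

    def pairwise_euclidean_sketch(decision_tensor):
        # Average Euclidean distance over all pairs of decision matrices.
        dists = [np.linalg.norm(decision_tensor[i] - decision_tensor[j])
                 for i, j in combinations(range(len(decision_tensor)), 2)]
        return float(np.mean(dists))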