pusion.evaluation.evaluation module¶
- class pusion.evaluation.evaluation.Evaluation(*argv)¶
Provides methods for evaluating decision outputs (i.e. of combiners and classifiers) for different problem and coverage types.
- Parameters
argv – Performance metric functions.
- evaluate(true_assignments, decision_tensor)¶
Evaluate the decision outputs using the classification performance metrics that have already been set.
This evaluation is only applicable to redundant multiclass or multilabel decision outputs.
- Parameters
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.
decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp decision outputs by different classifiers per sample.
- Returns
numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
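To make the shapes concrete, the following is a minimal NumPy sketch of how a performance matrix of this form could be assembled. It is not pusion's implementation: `exact_match_accuracy`, `mean_class_accuracy` and `performance_matrix` are hypothetical helpers introduced only for illustration, and the metric signature `metric(y_true, y_pred)` is an assumption.

```python
import numpy as np


def exact_match_accuracy(y_true, y_pred):
    """Fraction of samples whose complete class assignment row is correct."""
    return float(np.mean(np.all(y_true == y_pred, axis=1)))


def mean_class_accuracy(y_true, y_pred):
    """Fraction of individual (sample, class) entries that are correct."""
    return float(np.mean(y_true == y_pred))


def performance_matrix(true_assignments, decision_tensor, metrics):
    """Return an array of shape (n_instances, n_metrics)."""
    return np.array([[metric(true_assignments, decisions) for metric in metrics]
                     for decisions in decision_tensor])


# 4 samples, 3 classes, crisp one-hot assignments.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])

# Second classifier: wrong on the last sample only.
y_second = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])

# Tensor of shape (n_classifiers, n_samples, n_classes) = (2, 4, 3).
decision_tensor = np.stack([y_true, y_second])

perf = performance_matrix(y_true, decision_tensor,
                          [exact_match_accuracy, mean_class_accuracy])
print(perf.shape)  # one row per classifier, one column per metric
```

Each row of `perf` belongs to one classifier and each column to one metric, which mirrors the layout described for the returned performance matrix.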
- evaluate_cr_decision_outputs(true_assignments, decision_outputs, coverage=None)¶
Evaluate complementary-redundant decision outputs using the classification performance metrics that have already been set. The outputs of each classifier for each class are treated as binary outputs. The performance is therefore calculated class-wise and averaged across all classes covered by the individual classifiers.
This evaluation is applicable to complementary-redundant ensemble classifier outputs.
- Parameters
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.
decision_outputs – numpy.array of shape (n_classifiers, n_samples, n_classes) or a list of numpy.array elements of shape (n_samples, n_classes’), where n_classes’ is classifier-specific due to the coverage.
coverage – list of list elements. Each inner list contains the classes, as integers, covered by one classifier. The classifier is identified by the positional index of its inner list. If not set, the coverage for fully redundant classification is chosen by default.
- Returns
numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
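The data layout of `coverage` and of list-style `decision_outputs` can be illustrated with a small sketch. It only demonstrates the shapes described above and does not call pusion. `default_coverage` is a hypothetical helper. Two points are assumptions rather than documented behavior: that fully redundant coverage means every classifier covers every class, and that column `j` of a classifier's output refers to the `j`-th class in its coverage list.

```python
import numpy as np


def default_coverage(n_classifiers, n_classes):
    """Fully redundant coverage: every classifier covers every class."""
    return [list(range(n_classes)) for _ in range(n_classifiers)]


# Complementary-redundant setup with 4 classes and 2 classifiers:
# classifier 0 covers classes 0, 1, 2 and classifier 1 covers classes 2, 3.
coverage = [[0, 1, 2], [2, 3]]

n_samples = 5
# One output matrix per classifier; its column count follows its coverage.
decision_outputs = [np.zeros((n_samples, len(classes)), dtype=int)
                    for classes in coverage]

for index, (classes, outputs) in enumerate(zip(coverage, decision_outputs)):
    # Assumed convention: column j of `outputs` refers to class `classes[j]`.
    print(index, classes, outputs.shape)
```

Class 2 is covered by both classifiers (the redundant part), while classes 0, 1 and 3 are each covered by only one (the complementary part).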
- evaluate_cr_multi_combiner_decision_outputs(true_assignments, decision_tensor)¶
Evaluate the decision outputs of multiple CR combiners using the classification performance metrics that have already been set. The evaluation is performed by evaluate_cr_decision_outputs for each combiner.
- Parameters
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.
decision_tensor – numpy.array of shape (n_combiners, n_samples, n_classes). Tensor of crisp decision outputs by different combiners per sample.
- Returns
numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
- class_wise_mean_score(true_assignments, decision_outputs, coverage, metric)¶
Calculate the class-wise mean score with the given metric for the given classification outputs.
- Parameters
true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.
decision_outputs – numpy.array of shape (n_classifiers, n_samples, n_classes) or a list of numpy.array elements of shape (n_samples, n_classes’), where n_classes’ is classifier-specific due to the coverage.
coverage – list of list elements. Each inner list contains the classes, as integers, covered by one classifier. The classifier is identified by the positional index of its inner list. If not set, the coverage for fully redundant classification is chosen by default.
metric – The score metric.
- Returns
numpy.array of shape (n_classes,). The mean score per class across all classifiers.
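The class-wise averaging can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not pusion's code: `class_wise_mean_score_sketch` and `binary_accuracy` are hypothetical names, the metric is assumed to take `(y_true, y_pred)` column vectors, column `j` of a classifier's output is assumed to refer to the `j`-th class in its coverage list, and classes covered by no classifier are reported as NaN.

```python
import numpy as np


def binary_accuracy(y_true, y_pred):
    """Accuracy of a single binary (one-class) output column."""
    return float(np.mean(y_true == y_pred))


def class_wise_mean_score_sketch(true_assignments, decision_outputs,
                                 coverage, metric):
    """Mean score per class across all classifiers covering that class."""
    n_classes = true_assignments.shape[1]
    sums = np.zeros(n_classes)
    counts = np.zeros(n_classes)
    for classes, outputs in zip(coverage, decision_outputs):
        for column, cls in enumerate(classes):
            sums[cls] += metric(true_assignments[:, cls], outputs[:, column])
            counts[cls] += 1
    # Assumption: classes covered by no classifier are reported as NaN.
    return np.divide(sums, counts, out=np.full(n_classes, np.nan),
                     where=counts > 0)


y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
coverage = [[0, 1], [1, 2]]

outputs_0 = y_true[:, [0, 1]]                              # perfect on classes 0, 1
outputs_1 = np.array([[0, 0], [0, 0], [0, 1], [0, 0]])     # misses class 1 once

scores = class_wise_mean_score_sketch(y_true, [outputs_0, outputs_1],
                                      coverage, binary_accuracy)
print(scores)
```

Class 1 is covered by both classifiers, so its score is the mean of the two per-classifier scores. Classes 0 and 2 each take the score of their single covering classifier.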
- get_report()¶
- Returns
A summary Report of performed evaluations including all involved instances and performance metrics.
- get_runtime_report()¶
- Returns
A summary Report of train and combine runtimes for all involved instances.
- get_instances()¶
- Returns
A list of instances (i.e. combiners or classifiers) that have been evaluated.
- get_metrics()¶
- Returns
A list of performance metrics that have been used for evaluation.
- get_performance_matrix()¶
- Returns
numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
- get_runtime_matrix()¶
- Returns
numpy.array of shape (n_instances, 2). Runtime matrix containing runtimes for each set instance row-wise. The column at index 0 describes train times and the column at index 1 describes combine times.
- get_top_n_instances(n=None, metric=None)¶
Retrieve the n best instances according to the given metric, in sorted order.
- Parameters
n – integer. Number of instances to be retrieved. If unset, all instances are retrieved.
metric – The metric all instances are sorted by. If unset, the first metric is used.
- Returns
Evaluated top n instances.
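The ranking idea can be sketched in plain Python. The helper `top_n_instances`, the instance names and the scores are hypothetical. The sketch also assumes that a higher metric value is better, which the description above does not state.

```python
def top_n_instances(instance_performance_tuples, n=None):
    """Return the n best (instance, performance) tuples, best first."""
    # Assumption: a higher score means a better instance.
    ranked = sorted(instance_performance_tuples,
                    key=lambda pair: pair[1], reverse=True)
    # n=None mirrors the documented behavior: return all instances.
    return ranked if n is None else ranked[:n]


# Hypothetical instance names and scores, for illustration only.
scores = [("combiner_a", 0.81), ("combiner_b", 0.93), ("combiner_c", 0.87)]

print(top_n_instances(scores, n=2))
```

Because `sorted` is stable, instances with equal scores keep their original relative order in this sketch.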
- get_top_instances(metric=None)¶
Retrieve the best-performing instances according to the given metric. Multiple instances may be returned if they share the same best performance score.
- Parameters
metric – The metric all instances were evaluated with. If unset, the first metric is used.
- Returns
Evaluated top instances according to their performance.
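The tie handling described above can be sketched as follows. `top_instances` and the sample data are hypothetical, and the sketch assumes that a higher score is better and that ties are detected by exact equality of the scores.

```python
def top_instances(instance_performance_tuples):
    """Return every instance whose score equals the best score."""
    if not instance_performance_tuples:
        return []
    # Assumption: a higher score means a better instance.
    best = max(score for _, score in instance_performance_tuples)
    return [instance for instance, score in instance_performance_tuples
            if score == best]


# Hypothetical instance names and scores, for illustration only.
tied_scores = [("combiner_a", 0.9), ("combiner_b", 0.9), ("combiner_c", 0.7)]

print(top_instances(tied_scores))
```

Exact equality is used here for simplicity. With floating-point metrics, a tolerance-based comparison may be more appropriate.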
- get_instance_performance_tuples(metric=None)¶
Retrieve (instance, performance) tuples created for the given metric.
- Parameters
metric – The metric all instances are evaluated by. If unset, the first set metric is used.
- Returns
list of (instance, performance) tuples.
- set_metrics(*argv)¶
- Parameters
argv – Performance metric functions.
- set_instances(instances)¶
- Parameters
instances – An instance or a list of instances to be evaluated, e.g. classifiers or combiners.
- set_runtimes(runtimes)¶
- Parameters
runtimes – A tuple of two lists of tuples describing the train and combine runtimes respectively. Each runtime list is aligned with the list of set instances.