pusion.evaluation.evaluation module

class pusion.evaluation.evaluation.Evaluation(*argv)

Bases: object

Evaluation provides methods for evaluating the decision outputs of combiners and classifiers across different problems and coverage types.

Parameters

argv – Performance metric functions.
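
A minimal construction sketch (not taken from the pusion sources): it assumes that any callable of the sklearn form metric(y_true, y_pred) operating on crisp assignment matrices is accepted as a metric function.

    from functools import partial
    from sklearn.metrics import accuracy_score, f1_score
    from pusion.evaluation.evaluation import Evaluation

    # micro-averaged F1 as a two-argument callable (assumption: any
    # metric(y_true, y_pred) callable qualifies as a metric function)
    micro_f1 = partial(f1_score, average='micro')
    evaluation = Evaluation(accuracy_score, micro_f1)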

evaluate(true_assignments, decision_tensor)

Evaluate the decision outputs with already set classification performance metrics.

Warning

This evaluation is only applicable to redundant multiclass or multilabel decision outputs.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp decision outputs by different classifiers per sample.

Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
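
A usage sketch under the same sklearn-style metric assumption as above, with two redundant classifiers, three samples and three one-hot classes; set_instances() (see below) labels the rows of the resulting matrix. The instance names clf_1 and clf_2 are illustrative only.

    import numpy as np

    true_assignments = np.array([[1, 0, 0],
                                 [0, 1, 0],
                                 [0, 0, 1]])
    decision_tensor = np.array([
        [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # classifier 1: perfect
        [[1, 0, 0], [0, 0, 1], [0, 0, 1]],  # classifier 2: one error
    ])

    evaluation.set_instances(['clf_1', 'clf_2'])
    performance_matrix = evaluation.evaluate(true_assignments, decision_tensor)
    # expected shape: (2, n_metrics), one row per instance, one column per metric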

evaluate_cr_decision_outputs(true_assignments, decision_outputs, coverage=None)

Evaluate complementary-redundant decision outputs with already set classification performance metrics. The output of each classifier for each class is treated as a binary output; the performance is therefore calculated class-wise and averaged across all classes covered by the individual classifiers.

Note

This evaluation is applicable to complementary-redundant ensemble classifier outputs.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_outputs – numpy.array of shape (n_classifiers, n_samples, n_classes) or a list of numpy.array elements of shape (n_samples, n_classes'), where n_classes' is classifier-specific due to the coverage.

  • coverage – list of list elements. Each inner list contains the classes (as integers) covered by a classifier, which is identified by the positional index of the respective list. If not set, a fully redundant coverage is assumed by default.

Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
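
A sketch of the complementary-redundant case with two hypothetical classifiers of partial class coverage; each output matrix only contains columns for the classes its classifier covers.

    import numpy as np

    true_assignments = np.array([[1, 0, 0],
                                 [0, 1, 0],
                                 [0, 0, 1]])

    # classifier 0 covers classes 0 and 1; classifier 1 covers classes 1 and 2
    coverage = [[0, 1], [1, 2]]
    decision_outputs = [
        np.array([[1, 0], [0, 1], [0, 0]]),  # columns map to classes 0, 1
        np.array([[0, 0], [1, 0], [0, 1]]),  # columns map to classes 1, 2
    ]

    evaluation.set_instances(['cr_clf_0', 'cr_clf_1'])
    performance_matrix = evaluation.evaluate_cr_decision_outputs(
        true_assignments, decision_outputs, coverage)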

evaluate_cr_multi_combiner_decision_outputs(true_assignments, decision_tensor)

Evaluate decision outputs of multiple CR combiners with already set classification performance metrics. The evaluation is performed by evaluate_cr_decision_outputs() for each combiner.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_tensor – numpy.array of shape (n_combiners, n_samples, n_classes). Tensor of crisp decision outputs by different combiners per sample.

Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
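
A short sketch; each combiner emits a full (n_samples, n_classes) matrix, so the inputs stack into a single tensor. Here true_assignments from the previous sketch is reused as two trivially perfect combiner outputs.

    combiner_tensor = np.stack([true_assignments, true_assignments])
    evaluation.set_instances(['combiner_1', 'combiner_2'])
    performance_matrix = evaluation.evaluate_cr_multi_combiner_decision_outputs(
        true_assignments, combiner_tensor)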

class_wise_mean_score(true_assignments, decision_outputs, coverage, metric)

Calculate the class-wise mean score with the given metric for the given classification outputs.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_outputs – numpy.array of shape (n_classifiers, n_samples, n_classes) or a list of numpy.array elements of shape (n_samples, n_classes'), where n_classes' is classifier-specific due to the coverage.

  • coverage – list of list elements. Each inner list contains the classes (as integers) covered by a classifier, which is identified by the positional index of the respective list. If not set, a fully redundant coverage is assumed by default.

  • metric – The score metric.

Returns

numpy.array of shape (n_classes,). The mean score per class across all classifiers.
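
A sketch reusing the complementary-redundant setup from evaluate_cr_decision_outputs() above; under the assumed metric convention, any binary metric(y_true, y_pred) callable such as sklearn's accuracy_score should fit.

    per_class_scores = evaluation.class_wise_mean_score(
        true_assignments, decision_outputs, coverage, accuracy_score)
    # expected shape: (n_classes,), the mean score per class across classifiers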

get_report()
Returns

A summary Report of performed evaluations including all involved instances and performance metrics.

get_runtime_report()
Returns

A summary Report of train and combine runtimes for all involved instances.

get_instances()
Returns

A list of instances (i.e. combiners or classifiers) that have been evaluated.

get_metrics()
Returns

A list of performance metrics that have been used for the evaluation.

get_performance_matrix()
Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.

get_runtime_matrix()
Returns

numpy.array of shape (n_instances, 2). Runtime matrix containing runtimes for each set instance row-wise. The column at index 0 describes train times and the column at index 1 describes combine times.

get_top_n_instances(n=None, metric=None)

Retrieve the top n best-performing instances according to the given metric, in sorted order.

Parameters
  • n – integer. Number of instances to be retrieved. If unset, all instances are retrieved.

  • metric – The metric all instances are sorted by. If unset, the first metric is used.

Returns

Evaluated top n instances.
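
A retrieval sketch, assuming the instances were evaluated with the sklearn-style metrics set up earlier:

    top_two = evaluation.get_top_n_instances(n=2, metric=accuracy_score)
    all_ranked = evaluation.get_top_n_instances()  # n unset: all instances, sorted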

get_top_instances(metric=None)

Retrieve the best performing instances according to the given metric. Multiple instances may be returned if they share the identical best performance score.

Parameters

metric – The metric all instances were evaluated with. If unset, the first metric is used.

Returns

Evaluated top instances according to their performance.

get_instance_performance_tuples(metric=None)

Retrieve (instance, performance) tuples created for the given metric.

Parameters

metric – The metric all instances are evaluated by. If unset, the first set metric is used.

Returns

list of (instance, performance) tuples.

set_metrics(*argv)
Parameters

argv – Performance metric functions.

set_instances(instances)
Parameters

instances – An instance or a list of instances to be evaluated, e.g. classifiers or combiners.

set_runtimes(runtimes)
Parameters

runtimes – A tuple of two lists describing the train and combine runtimes, respectively. Each runtime list is aligned with the list of set instances.
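
A round-trip sketch of the setters under the same sklearn-style metric assumption; the runtime values and instance names are illustrative, and the (train, combine) tuple structure follows the description above.

    evaluation.set_metrics(accuracy_score, micro_f1)
    evaluation.set_instances(['combiner_a', 'combiner_b'])
    # train and combine runtimes (e.g. in seconds), aligned with the instances
    evaluation.set_runtimes(([0.42, 0.38], [0.011, 0.009]))
    runtime_matrix = evaluation.get_runtime_matrix()
    # expected shape: (2, 2), column 0 holds train times, column 1 combine times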