pusion.evaluation.evaluation module

class pusion.evaluation.evaluation.Evaluation(*argv)

Bases: object

Evaluation provides methods for evaluating the decision outputs of combiners and classifiers across different problems and coverage types.

Parameters

argv – Performance metric functions.
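
A minimal construction sketch (not taken from the pusion sources): it assumes that any callable of the sklearn form metric(y_true, y_pred) operating on crisp assignment matrices is accepted as a metric function.

    from functools import partial
    from sklearn.metrics import accuracy_score, f1_score
    from pusion.evaluation.evaluation import Evaluation

    # micro-averaged F1 as a two-argument callable (assumption: any
    # metric(y_true, y_pred) callable qualifies as a metric function)
    micro_f1 = partial(f1_score, average='micro')
    evaluation = Evaluation(accuracy_score, micro_f1)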

evaluate(true_assignments, decision_tensor)

Evaluate the decision outputs with already set classification performance metrics.

Warning

This evaluation is only applicable to redundant multiclass or multilabel decision outputs.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_tensor – numpy.array of shape (n_classifiers, n_samples, n_classes). Tensor of crisp decision outputs by different classifiers per sample.

Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
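
A usage sketch under the same sklearn-style metric assumption as above, with two redundant classifiers, three samples and three one-hot classes; set_instances() (see below) labels the rows of the resulting matrix. The instance names clf_1 and clf_2 are illustrative only.

    import numpy as np

    true_assignments = np.array([[1, 0, 0],
                                 [0, 1, 0],
                                 [0, 0, 1]])
    decision_tensor = np.array([
        [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # classifier 1: perfect
        [[1, 0, 0], [0, 0, 1], [0, 0, 1]],  # classifier 2: one error
    ])

    evaluation.set_instances(['clf_1', 'clf_2'])
    performance_matrix = evaluation.evaluate(true_assignments, decision_tensor)
    # expected shape: (2, n_metrics), one row per instance, one column per metric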

evaluate_cr_decision_outputs(true_assignments, decision_outputs, coverage=None)

Evaluate complementary-redundant decision outputs with already set classification performance metrics. The output of each classifier for each class is treated as a binary output; the performance is therefore calculated class-wise and averaged across all classes covered by the individual classifiers.

Note

This evaluation is applicable to complementary-redundant ensemble classifier outputs.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_outputs – numpy.array of shape (n_classifiers, n_samples, n_classes) or a list of numpy.array elements of shape (n_samples, n_classes'), where n_classes' is classifier-specific due to the coverage.

  • coverage – list of list elements. Each inner list contains the classes (as integers) covered by a classifier, which is identified by the positional index of the respective list. If not set, a fully redundant coverage is assumed by default.

Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
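
A sketch of the complementary-redundant case with two hypothetical classifiers of partial class coverage; each output matrix only contains columns for the classes its classifier covers.

    import numpy as np

    true_assignments = np.array([[1, 0, 0],
                                 [0, 1, 0],
                                 [0, 0, 1]])

    # classifier 0 covers classes 0 and 1; classifier 1 covers classes 1 and 2
    coverage = [[0, 1], [1, 2]]
    decision_outputs = [
        np.array([[1, 0], [0, 1], [0, 0]]),  # columns map to classes 0, 1
        np.array([[0, 0], [1, 0], [0, 1]]),  # columns map to classes 1, 2
    ]

    evaluation.set_instances(['cr_clf_0', 'cr_clf_1'])
    performance_matrix = evaluation.evaluate_cr_decision_outputs(
        true_assignments, decision_outputs, coverage)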

evaluate_cr_multi_combiner_decision_outputs(true_assignments, decision_tensor)

Evaluate decision outputs of multiple CR combiners with already set classification performance metrics. The evaluation is performed by evaluate_cr_decision_outputs() for each combiner.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_tensor – numpy.array of shape (n_combiners, n_samples, n_classes). Tensor of crisp decision outputs by different combiners per sample.

Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.
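
A short sketch; each combiner emits a full (n_samples, n_classes) matrix, so the inputs stack into a single tensor. Here true_assignments from the previous sketch is reused as two trivially perfect combiner outputs.

    combiner_tensor = np.stack([true_assignments, true_assignments])
    evaluation.set_instances(['combiner_1', 'combiner_2'])
    performance_matrix = evaluation.evaluate_cr_multi_combiner_decision_outputs(
        true_assignments, combiner_tensor)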

class_wise_mean_score(true_assignments, decision_outputs, coverage, metric)

Calculate the class-wise mean score with the given metric for the given classification outputs.

Parameters
  • true_assignments – numpy.array of shape (n_samples, n_classes). Matrix of crisp class assignments which are considered true for the evaluation.

  • decision_outputs – numpy.array of shape (n_classifiers, n_samples, n_classes) or a list of numpy.array elements of shape (n_samples, n_classes'), where n_classes' is classifier-specific due to the coverage.

  • coverage – list of list elements. Each inner list contains the classes (as integers) covered by a classifier, which is identified by the positional index of the respective list. If not set, a fully redundant coverage is assumed by default.

  • metric – The score metric.

Returns

numpy.array of shape (n_classes,). The mean score per class across all classifiers.
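
A sketch reusing the complementary-redundant setup from evaluate_cr_decision_outputs() above; under the assumed metric convention, any binary metric(y_true, y_pred) callable such as sklearn's accuracy_score should fit.

    per_class_scores = evaluation.class_wise_mean_score(
        true_assignments, decision_outputs, coverage, accuracy_score)
    # expected shape: (n_classes,), the mean score per class across classifiers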

get_report()
Returns

A summary Report of performed evaluations including all involved instances and performance metrics.

get_runtime_report()
Returns

A summary Report of train and combine runtimes for all involved instances.

get_instances()
Returns

A list of instances (i.e. combiners or classifiers) that have been evaluated.

get_metrics()
Returns

A list of performance metrics that have been used for the evaluation.

get_performance_matrix()
Returns

numpy.array of shape (n_instances, n_metrics). Performance matrix containing performance values for each set instance row-wise and each set performance metric column-wise.

get_runtime_matrix()
Returns

numpy.array of shape (n_instances, 2). Runtime matrix containing runtimes for each set instance row-wise. The column at index 0 describes train times and the column at index 1 describes combine times.

get_top_n_instances(n=None, metric=None)

Retrieve the top n best-performing instances according to the given metric, in sorted order.

Parameters
  • n – integer. Number of instances to be retrieved. If unset, all instances are retrieved.

  • metric – The metric all instances are sorted by. If unset, the first metric is used.

Returns

Evaluated top n instances.
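
A retrieval sketch, assuming the instances were evaluated with the sklearn-style metrics set up earlier:

    top_two = evaluation.get_top_n_instances(n=2, metric=accuracy_score)
    all_ranked = evaluation.get_top_n_instances()  # n unset: all instances, sorted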

get_top_instances(metric=None)

Retrieve the best performing instances according to the given metric. Multiple instances may be returned if they share the identical best performance score.

Parameters

metric – The metric all instances were evaluated with. If unset, the first metric is used.

Returns

Evaluated top instances according to their performance.

get_instance_performance_tuples(metric=None)

Retrieve (instance, performance) tuples created for the given metric.

Parameters

metric – The metric all instances are evaluated by. If unset, the first set metric is used.

Returns

list of (instance, performance) tuples.

set_metrics(*argv)
Parameters

argv – Performance metric functions.

set_instances(instances)
Parameters

instances – An instance or a list of instances to be evaluated, e.g. classifiers or combiners.

set_runtimes(runtimes)
Parameters

runtimes – A tuple of two lists describing the train and combine runtimes, respectively. Each runtime list is aligned with the list of set instances.
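
A round-trip sketch of the setters under the same sklearn-style metric assumption; the runtime values and instance names are illustrative, and the (train, combine) tuple structure follows the description above.

    evaluation.set_metrics(accuracy_score, micro_f1)
    evaluation.set_instances(['combiner_a', 'combiner_b'])
    # train and combine runtimes (e.g. in seconds), aligned with the instances
    evaluation.set_runtimes(([0.42, 0.38], [0.011, 0.009]))
    runtime_matrix = evaluation.get_runtime_matrix()
    # expected shape: (2, 2), column 0 holds train times, column 1 combine times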