jmetal.lab.statistical_test package

Submodules

jmetal.lab.statistical_test.apv_procedures module

jmetal.lab.statistical_test.apv_procedures.bonferroni_dunn(p_values, control)[source]

Bonferroni-Dunn’s procedure for the adjusted p-value computation.

Parameters:

  • p_values – 2-D array or DataFrame containing the p-values obtained from a ranking test.

  • control – int or string. Index or name of the control algorithm.

Returns:

APVs: DataFrame containing the adjusted p-values.
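For intuition about what an adjusted p-value (APV) is, the one-step Bonferroni-Dunn correction has a particularly simple closed form: each raw p-value is multiplied by the number of comparisons and clipped at 1. A minimal sketch follows; the function name and list-based interface are illustrative, not the library's DataFrame API:

```python
def bonferroni_dunn_apv(p_values):
    """One-step Bonferroni-Dunn adjustment: multiply each p-value by the
    number of comparisons m and clip the result at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

# Raw p-values from a control-vs-all comparison with m = 3 hypotheses.
apvs = bonferroni_dunn_apv([0.01, 0.04, 0.20])
```

An adjusted p-value can be compared directly against the nominal significance level, with no further correction.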

jmetal.lab.statistical_test.apv_procedures.finner(p_values, control)[source]

Finner’s procedure for the adjusted p-value computation.

Parameters:

  • p_values – 2-D array or DataFrame containing the p-values obtained from a ranking test.

  • control – int or string. Index or name of the control algorithm.

Returns:

APVs: DataFrame containing the adjusted p-values.

jmetal.lab.statistical_test.apv_procedures.hochberg(p_values, control)[source]

Hochberg’s procedure for the adjusted p-value computation.

Parameters:

  • p_values – 2-D array or DataFrame containing the p-values obtained from a ranking test.

  • control – int or string. Index or name of the control algorithm.

Returns:

APVs: DataFrame containing the adjusted p-values.

jmetal.lab.statistical_test.apv_procedures.holland(p_values, control)[source]

Holland’s procedure for the adjusted p-value computation.

Parameters:

  • p_values – 2-D array or DataFrame containing the p-values obtained from a ranking test.

  • control – int or string. Index or name of the control algorithm.

Returns:

APVs: DataFrame containing the adjusted p-values.

jmetal.lab.statistical_test.apv_procedures.holm(p_values, control=None)[source]

Holm’s procedure for the adjusted p-value computation.

Parameters:

  • p_values – 2-D array or DataFrame containing the p-values obtained from a ranking test.

  • control – optional int or string. Default None. Index or name of the control algorithm. If control is provided, control vs all comparisons are considered; otherwise all vs all.

Returns:

APVs: DataFrame containing the adjusted p-values.
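Holm's step-down variant is uniformly more powerful than the one-step Bonferroni-Dunn correction while controlling the same family-wise error rate: the i-th smallest p-value is scaled by (m - i + 1) and a running maximum enforces monotonicity. A minimal sketch (hypothetical list-based helper, not the library's DataFrame interface):

```python
def holm_apv(p_values):
    """Step-down Holm adjustment: visit p-values in ascending order, scale
    the one at step i (0-based) by (m - i), and keep a running maximum so
    the adjusted values are monotone; clip at 1."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    apv = [0.0] * m
    running_max = 0.0
    for step, i in enumerate(order):
        running_max = max(running_max, (m - step) * p_values[i])
        apv[i] = min(1.0, running_max)
    return apv

# Adjusted values are returned in the original position of each p-value.
apvs = holm_apv([0.04, 0.01, 0.03])
```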

jmetal.lab.statistical_test.apv_procedures.li(p_values, control=None)[source]

Li’s procedure for the adjusted p-value computation.

Parameters:

  • p_values – 2-D array or DataFrame containing the p-values obtained from a ranking test.

  • control – optional int or string. Default None. Index or name of the control algorithm. If control is provided, control vs all comparisons are considered; otherwise all vs all.

Returns:

APVs: DataFrame containing the adjusted p-values.

jmetal.lab.statistical_test.apv_procedures.nemenyi(p_values)[source]

Nemenyi’s procedure for the adjusted p-value computation.

Parameters:

p_values – 2-D array or DataFrame containing the p-values.

Returns:

APVs: DataFrame containing the adjusted p-values.

jmetal.lab.statistical_test.apv_procedures.shaffer(p_values)[source]

Shaffer’s procedure for the adjusted p-value computation.

Parameters:

p_values – 2-D array or DataFrame containing the p-values.

Returns:

APVs: DataFrame containing the adjusted p-values.

jmetal.lab.statistical_test.bayesian module

jmetal.lab.statistical_test.bayesian.bayesian_sign_test(data, rope_limits=[-0.01, 0.01], prior_strength=0.5, prior_place='rope', sample_size=50000, return_sample=False)[source]

Bayesian version of the sign test.

Parameters:
  • data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

  • rope_limits – array_like. Default [-0.01, 0.01]. Limits of the practical equivalence.

  • prior_strength – positive float. Default 0.5. Value of the prior strength.

  • prior_place – string {left, rope, right}. Default ‘rope’. Place of the pseudo-observation z_0.

  • sample_size – integer. Default 50000. Total number of random samples generated.

  • return_sample – boolean. Default False. If true, also return the samples drawn from the Dirichlet process.

Returns:

List of posterior probabilities: [Pr(algorithm_1 < algorithm_2), Pr(algorithm_1 equiv algorithm_2), Pr(algorithm_1 > algorithm_2)]

jmetal.lab.statistical_test.bayesian.bayesian_signed_rank_test(data, rope_limits=[-0.01, 0.01], prior_strength=1.0, prior_place='rope', sample_size=10000, return_sample=False)[source]

Bayesian version of the signed rank test.

Parameters:
  • data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

  • rope_limits – array_like. Default [-0.01, 0.01]. Limits of the practical equivalence.

  • prior_strength – positive float. Default 1.0. Value of the prior strength.

  • prior_place – string {left, rope, right}. Default ‘rope’. Place of the pseudo-observation z_0.

  • sample_size – integer. Default 10000. Total number of random samples generated.

  • return_sample – boolean. Default False. If true, also return the samples drawn from the Dirichlet process.

Returns:

List of posterior probabilities: [Pr(algorithm_1 < algorithm_2), Pr(algorithm_1 equiv algorithm_2), Pr(algorithm_1 > algorithm_2)]
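The mechanics behind both Bayesian tests can be illustrated with a Monte Carlo sketch of the sign-test variant: count the differences falling left of, inside, and right of the ROPE (region of practical equivalence), add the prior pseudo-observation, and estimate the probability that each region carries the largest posterior weight. The helper below is a hypothetical simplification (a plain Dirichlet posterior sampled via gamma draws, whose normalization can be skipped because only the argmax matters), not the library implementation:

```python
import random

def bayesian_sign_probs(x, y, rope=(-0.01, 0.01), prior_strength=0.5,
                        sample_size=5000, seed=0):
    # Count differences left of, inside, and right of the ROPE.
    diffs = [a - b for a, b in zip(x, y)]
    counts = [sum(d < rope[0] for d in diffs),              # x < y
              sum(rope[0] <= d <= rope[1] for d in diffs),  # equivalent
              sum(d > rope[1] for d in diffs)]              # x > y
    # Prior pseudo-observation placed on the rope.
    alphas = [counts[0], counts[1] + prior_strength, counts[2]]
    rng = random.Random(seed)
    wins = [0, 0, 0]
    for _ in range(sample_size):
        # Unnormalized Dirichlet draw; the argmax is unaffected by scaling.
        g = [rng.gammavariate(a, 1.0) if a > 0 else 0.0 for a in alphas]
        wins[max(range(3), key=lambda i: g[i])] += 1
    return [w / sample_size for w in wins]
```

The three returned frequencies estimate the posterior probabilities reported by the tests above.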

jmetal.lab.statistical_test.critical_distance module

jmetal.lab.statistical_test.critical_distance.CDplot(results, alpha: float = 0.05, higher_is_better: bool = False, alg_names: list | None = None, output_filename: str = 'cdplot.eps')[source]

CDplot draws the critical difference plot shown in Janez Demšar’s 2006 work Statistical Comparisons of Classifiers over Multiple Data Sets.

Parameters:
  • results – A 2-D array containing results from each algorithm. Each row of results represents an algorithm, and each column a dataset.

  • alpha – Significance level for the critical difference. Default 0.05.

  • higher_is_better – boolean. Default False. If true, higher values are ranked as better.

  • alg_names – Names of the tested algorithms.

  • output_filename – Name of the file in which the plot is saved.

jmetal.lab.statistical_test.critical_distance.NemenyiCD(alpha: float, num_alg, num_dataset)[source]

Computes Nemenyi’s critical difference:

CD = q_alpha * sqrt(num_alg * (num_alg + 1) / (6 * num_dataset))

where q_alpha is the critical value of the Studentized range statistic divided by sqrt(2).

Parameters:
  • alpha – Significance level.

  • num_alg – Number of tested algorithms.

  • num_dataset – Number of problems/datasets on which the algorithms have been tested.
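Given a critical value, the CD formula is plain arithmetic. A small sketch, where the q_alpha value is an assumption taken from standard Studentized-range tables (roughly 2.569 for alpha = 0.05 and four algorithms):

```python
import math

def nemenyi_cd(q_alpha, num_alg, num_dataset):
    # CD = q_alpha * sqrt(k * (k + 1) / (6 * N))
    return q_alpha * math.sqrt(num_alg * (num_alg + 1) / (6.0 * num_dataset))

# Four algorithms compared on 20 datasets at alpha = 0.05.
cd = nemenyi_cd(2.569, num_alg=4, num_dataset=20)
```

Two algorithms are declared significantly different when their average ranks differ by more than cd.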

jmetal.lab.statistical_test.functions module

jmetal.lab.statistical_test.functions.friedman_aligned_ph_test(data, control=None, apv_procedure=None)[source]

Friedman Aligned Ranks post-hoc test.

Parameters:
  • data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

  • control – optional int or string. Default None. Index or Name of the control algorithm. If control is None, all possible pairwise comparisons among algorithms are considered.

  • apv_procedure

    optional string. Default None. Name of the procedure for computing adjusted p-values. If apv_procedure is None, adjusted p-values are not computed; otherwise they are computed according to the specified procedure.

    For 1 vs all comparisons:

    {‘Bonferroni’, ‘Holm’, ‘Hochberg’, ‘Holland’, ‘Finner’, ‘Li’}

    For all vs all comparisons:

    {‘Shaffer’, ‘Holm’, ‘Nemenyi’}

Return z_values:

Test statistic.

Return p_values:

The p-value according to the Studentized range distribution.

jmetal.lab.statistical_test.functions.friedman_aligned_rank_test(data)[source]

Method of aligned ranks for the Friedman test.

Note

Null Hypothesis: In a set of k (>= 2) treatments (or tested algorithms), all the treatments are equivalent, so their average ranks should be equal.

Parameters:

data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

Return p_value:

The associated p-value.

Return aligned_rank_stat:

Friedman’s aligned rank chi-square statistic.

jmetal.lab.statistical_test.functions.friedman_ph_test(data, control=None, apv_procedure=None)[source]

Friedman post-hoc test.

Parameters:
  • data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

  • control – optional int or string. Default None. Index or Name of the control algorithm. If control is None, all possible pairwise comparisons among algorithms are considered.

  • apv_procedure

    optional string. Default None. Name of the procedure for computing adjusted p-values. If apv_procedure is None, adjusted p-values are not computed; otherwise they are computed according to the specified procedure.

    For 1 vs all comparisons:

    {‘Bonferroni’, ‘Holm’, ‘Hochberg’, ‘Holland’, ‘Finner’, ‘Li’}

    For all vs all comparisons:

    {‘Shaffer’, ‘Holm’, ‘Nemenyi’}

Return z_values:

Test statistic.

Return p_values:

The p-value according to the Studentized range distribution.

jmetal.lab.statistical_test.functions.friedman_test(data)[source]

Friedman ranking test.

Note

Null Hypothesis: In a set of k (>= 2) treatments (or tested algorithms), all the treatments are equivalent, so their average ranks should be equal.

Parameters:

data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

Return p_value:

The associated p-value.

Return friedman_stat:

Friedman’s chi-square.
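The Friedman statistic itself is simple to state: with n problems and k algorithms, rank the algorithms within every problem, average the ranks per algorithm, and compute 12n/(k(k+1)) * (sum_j R_j^2 - k(k+1)^2/4). A minimal sketch under the assumption that lower values are better (a hypothetical helper; ties are not handled here):

```python
def friedman_stat(data):
    # data[i][j]: result of algorithm j on problem i; lower is better here.
    n, k = len(data), len(data[0])
    rank_sums = [0.0] * k
    for row in data:
        # Rank algorithms within this problem (rank 1 = best value).
        order = sorted(range(k), key=lambda j: row[j])
        for position, j in enumerate(order):
            rank_sums[j] += position + 1
    avg_ranks = [s / n for s in rank_sums]
    return 12.0 * n / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4.0)

# Algorithm 0 wins on every problem, algorithm 2 always loses.
stat = friedman_stat([[0.1, 0.2, 0.3],
                      [0.2, 0.4, 0.6],
                      [0.1, 0.3, 0.5]])
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom.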

jmetal.lab.statistical_test.functions.quade_ph_test(data, control=None, apv_procedure=None)[source]

Quade post-hoc test.

Parameters:
  • data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

  • control – optional int or string. Default None. Index or Name of the control algorithm. If control is None, all possible pairwise comparisons among algorithms are considered.

  • apv_procedure

    optional string. Default None. Name of the procedure for computing adjusted p-values. If apv_procedure is None, adjusted p-values are not computed; otherwise they are computed according to the specified procedure.

    For 1 vs all comparisons:

    {‘Bonferroni’, ‘Holm’, ‘Hochberg’, ‘Holland’, ‘Finner’, ‘Li’}

    For all vs all comparisons:

    {‘Shaffer’, ‘Holm’, ‘Nemenyi’}

Return z_values:

Test statistic.

Return p_values:

The p-value according to the Studentized range distribution.

jmetal.lab.statistical_test.functions.quade_test(data)[source]

Quade test.

Note

Null Hypothesis: In a set of k (>= 2) treatments (or tested algorithms), all the treatments are equivalent, so their average ranks should be equal.

Parameters:

data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

Return p_value:

The associated p-value from the F-distribution.

Return fq:

Computed F-value.

jmetal.lab.statistical_test.functions.ranks(data: array, descending=False)[source]

Computes the rank of the elements in data.

Parameters:
  • data – 2-D matrix

  • descending – boolean (default False). If true, ranks are assigned in descending order.

Returns:

ranks, where ranks[i][j] == rank of the i-th row w.r.t. the j-th column.
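Rank tests conventionally give tied values the mean of the positions they span. That convention can be sketched for a single column as follows; `average_ranks` is a hypothetical helper, not the library function:

```python
def average_ranks(values):
    """Rank 1 = smallest value; tied values share the mean of the
    1-based positions they occupy in the sorted order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of equal values.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j + 2) / 2.0  # positions i..j are 0-based
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks

# The two 3s occupy positions 3 and 4, so each gets rank 3.5.
r = average_ranks([3, 1, 3, 2])
```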

jmetal.lab.statistical_test.functions.sign_test(data)[source]

Given the results drawn from two algorithms/methods X and Y, the sign test analyses whether there is a difference between X and Y.

Note

Null Hypothesis: Pr(X<Y)= 0.5

Parameters:

data – An (n x 2) array or DataFrame containing the results. In data, each column represents an algorithm and each row a problem.

Return p_value:

The associated p-value from the binomial distribution.

Return bstat:

Number of successes.
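The computation reduces to a binomial tail under Pr(X < Y) = 0.5. A self-contained sketch (hypothetical helper; ties are simply discarded here, whereas implementations may split them between the two sides):

```python
from math import comb

def sign_test_p(x, y):
    # Count wins for each side; ties are discarded in this sketch.
    wins_x = sum(a > b for a, b in zip(x, y))
    wins_y = sum(a < b for a, b in zip(x, y))
    n, b = wins_x + wins_y, max(wins_x, wins_y)
    # Two-sided p-value: twice the upper binomial tail at b successes.
    p = min(1.0, 2 * sum(comb(n, i) for i in range(b, n + 1)) / 2 ** n)
    return p, b

# X beats Y on 9 of 10 problems.
p, b = sign_test_p([1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
```

With 9 wins out of 10 the difference is significant at the 0.05 level.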

Module contents