jmetal.lab.statistical_test.apv_procedures.bonferroni_dunn(p_values, control)
Bonferroni-Dunn’s procedure for the adjusted p-value computation.
p_values: 2-D array or DataFrame containing the p-values obtained from a ranking test.
control: int or string. Index or name of the control algorithm.
APVs: DataFrame containing the adjusted p-values.
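As an illustration of the adjustment itself, here is a minimal pure-Python sketch of the textbook Bonferroni-Dunn rule for control-vs-others comparisons. The function name `bonferroni_dunn_apv` and the flat list input are assumptions made for this example; it is not jMetalPy's implementation and does not use its 2-D input format.

```python
def bonferroni_dunn_apv(p_values):
    """Textbook Bonferroni-Dunn adjustment (hypothetical helper, not the library code).

    p_values holds the unadjusted p-values of the m comparisons against the control.
    Each adjusted p-value is min(1, m * p).
    """
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


# Four comparisons against a control algorithm.
adjusted = bonferroni_dunn_apv([0.001, 0.02, 0.04, 0.30])
print(adjusted)
```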
jmetal.lab.statistical_test.apv_procedures.finner(p_values, control)
Finner’s procedure for the adjusted p-value computation.
p_values: 2-D array or DataFrame containing the p-values obtained from a ranking test.
control: int or string. Index or name of the control algorithm.
APVs: DataFrame containing the adjusted p-values.
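The following sketch shows the step-down Finner adjustment as it is usually stated in the multiple-comparison literature: with the m p-values sorted in ascending order, the i-th adjusted value is the running maximum of 1 - (1 - p_j)^(m/j), capped at 1. The name `finner_apv` and the flat list input are assumptions for this example, not the library's interface.

```python
def finner_apv(p_values):
    """Textbook Finner adjustment (hypothetical helper, not the library code)."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m / rank)
        running_max = max(running_max, candidate)  # keeps the result monotone
        adjusted[idx] = min(1.0, running_max)
    return adjusted


print(finner_apv([0.04, 0.01]))
```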
jmetal.lab.statistical_test.apv_procedures.hochberg(p_values, control)
Hochberg’s procedure for the adjusted p-value computation.
p_values: 2-D array or DataFrame containing the p-values obtained from a ranking test.
control: int or string. Index or name of the control algorithm.
APVs: DataFrame containing the adjusted p-values.
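Hochberg's procedure is a step-up method: starting from the largest p-value, each p-value is multiplied by its position counted from the top, and a running minimum is kept. The sketch below follows that textbook definition; `hochberg_apv` and its flat list input are assumptions for illustration and may differ from how the library organises the computation.

```python
def hochberg_apv(p_values):
    """Textbook Hochberg step-up adjustment (hypothetical helper, not the library code)."""
    m = len(p_values)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for step, idx in enumerate(order, start=1):
        # The step-th largest p-value is multiplied by step.
        running_min = min(running_min, step * p_values[idx])
        adjusted[idx] = running_min
    return adjusted


print(hochberg_apv([0.01, 0.04, 0.03]))
```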
jmetal.lab.statistical_test.apv_procedures.holland(p_values, control)
Holland’s procedure for the adjusted p-value computation.
p_values: 2-D array or DataFrame containing the p-values obtained from a ranking test.
control: int or string. Index or name of the control algorithm.
APVs: DataFrame containing the adjusted p-values.
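For reference, Holland's step-down adjustment is commonly written as the running maximum of 1 - (1 - p_j)^(m - j + 1) over the ascending-sorted p-values, capped at 1. The sketch below implements that form; the name `holland_apv` and the flat list input are assumptions for this example only.

```python
def holland_apv(p_values):
    """Textbook Holland adjustment (hypothetical helper, not the library code)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        exponent = m - rank + 1  # m for the smallest p-value, 1 for the largest
        candidate = 1.0 - (1.0 - p_values[idx]) ** exponent
        running_max = max(running_max, candidate)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


print(holland_apv([0.01, 0.02, 0.5]))
```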
jmetal.lab.statistical_test.apv_procedures.holm(p_values, control=None)
Holm’s procedure for the adjusted p-value computation.
p_values: 2-D array or DataFrame containing the p-values obtained from a ranking test.
control: optional int or string. Default None. Index or name of the control algorithm. If control is provided, control vs all comparisons are considered, else all vs all.
APVs: DataFrame containing the adjusted p-values.
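A minimal sketch of the standard Holm step-down adjustment, applied to a flat list of p-values from whichever set of comparisons is being considered (control vs all, or all vs all). The helper name `holm_apv` and the list-based input are assumptions for this example and do not reproduce the library's 2-D input handling.

```python
def holm_apv(p_values):
    """Textbook Holm step-down adjustment (hypothetical helper, not the library code)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for position, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        running_max = max(running_max, (m - position) * p_values[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


print(holm_apv([0.01, 0.04, 0.03]))
```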
jmetal.lab.statistical_test.apv_procedures.li(p_values, control)
Li’s procedure for the adjusted p-value computation.
p_values: 2-D array or DataFrame containing the p-values obtained from a ranking test.
control: int or string. Index or name of the control algorithm. The signature defines no default for control; control vs all comparisons are considered.
APVs: DataFrame containing the adjusted p-values.
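Li's adjustment is usually given as p_i / (p_i + 1 - p_max), where p_max is the largest unadjusted p-value among the comparisons with the control. The sketch below implements that formula; `li_apv` and the flat list input are assumptions for illustration, not the library's interface.

```python
def li_apv(p_values):
    """Textbook Li adjustment (hypothetical helper, not the library code)."""
    p_max = max(p_values)
    return [p / (p + 1.0 - p_max) for p in p_values]


# The largest p-value is left unchanged; smaller ones are inflated slightly.
print(li_apv([0.01, 0.2]))
```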
jmetal.lab.statistical_test.bayesian.bayesian_sign_test(data, rope_limits=[-0.01, 0.01], prior_strength=0.5, prior_place='rope', sample_size=50000, return_sample=False)
Bayesian version of the sign test.
data – An (n x 2) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
rope_limits – array_like. Default [-0.01, 0.01]. Limits of the region of practical equivalence (ROPE).
prior_strength – positive float. Default 0.5. Value of the prior strength.
prior_place – string {left, rope, right}. Default ‘rope’. Place of the pseudo-observation z_0.
sample_size – integer. Default 50000. Total number of random samples generated.
return_sample – boolean. Default False. If true, also return the samples drawn from the Dirichlet process.
Returns
List of posterior probabilities: [Pr(algorithm_1 < algorithm_2), Pr(algorithm_1 equiv algorithm_2), Pr(algorithm_1 > algorithm_2)]
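To make the role of the parameters concrete, the following is a rough standard-library sketch of how a Bayesian sign test is commonly computed: count the differences falling left of, inside, and right of the ROPE, add the prior pseudo-observation, sample from the resulting Dirichlet distribution, and report how often each region has the largest weight. The function name, the `diffs` input (assumed to be algorithm_1 minus algorithm_2 per problem), the `seed` argument, and the smaller sample size are all assumptions for this example; the library's internals may differ.

```python
import random


def bayesian_sign_sketch(diffs, rope=(-0.01, 0.01), prior_strength=0.5,
                         prior_place="rope", sample_size=5000, seed=0):
    """Rough sketch of a Bayesian sign test (hypothetical, not the library code)."""
    rng = random.Random(seed)
    counts = {"left": 0.0, "rope": 0.0, "right": 0.0}
    for d in diffs:
        if d < rope[0]:
            counts["left"] += 1
        elif d > rope[1]:
            counts["right"] += 1
        else:
            counts["rope"] += 1
    counts[prior_place] += prior_strength  # pseudo-observation z_0

    keys = ["left", "rope", "right"]
    wins = [0, 0, 0]
    for _ in range(sample_size):
        # Unnormalised Dirichlet draw via Gamma variates; normalising would not
        # change which component is largest. Empty regions get weight 0.
        draws = [rng.gammavariate(counts[k], 1.0) if counts[k] > 0 else 0.0
                 for k in keys]
        wins[draws.index(max(draws))] += 1
    return [w / sample_size for w in wins]


# Twenty problems where algorithm_1 scores 0.5 higher than algorithm_2.
print(bayesian_sign_sketch([0.5] * 20))
```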
jmetal.lab.statistical_test.bayesian.bayesian_signed_rank_test(data, rope_limits=[-0.01, 0.01], prior_strength=1.0, prior_place='rope', sample_size=10000, return_sample=False)
Bayesian version of the signed rank test.
data – An (n x 2) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
rope_limits – array_like. Default [-0.01, 0.01]. Limits of the region of practical equivalence (ROPE).
prior_strength – positive float. Default 1.0. Value of the prior strength.
prior_place – string {left, rope, right}. Default ‘rope’. Place of the pseudo-observation z_0.
sample_size – integer. Default 10000. Total number of random samples generated.
return_sample – boolean. Default False. If true, also return the samples drawn from the Dirichlet process.
Returns
List of posterior probabilities: [Pr(algorithm_1 < algorithm_2), Pr(algorithm_1 equiv algorithm_2), Pr(algorithm_1 > algorithm_2)]
jmetal.lab.statistical_test.critical_distance.CDplot(results, alpha: float = 0.05, higher_is_better: bool = False, alg_names: list = None, output_filename: str = 'cdplot.eps')
Plots the critical difference graph shown in Janez Demšar’s 2006 work, “Statistical Comparisons of Classifiers over Multiple Data Sets”.
results – A 2-D array containing results from each algorithm. Each row of ‘results’ represents an algorithm, and each column a dataset.
alpha – {0.1, 0.999}. Significance level for the critical difference.
alg_names – Names of the tested algorithms.
jmetal.lab.statistical_test.critical_distance.NemenyiCD(alpha: float, num_alg, num_dataset)
Computes Nemenyi’s critical difference:
CD = q_alpha * sqrt(num_alg*(num_alg + 1)/(6*num_dataset))
where q_alpha is the critical value of the Studentized range statistic divided by sqrt(2).
alpha – {0.1, 0.999}. Significance level.
num_alg – Number of tested algorithms.
num_dataset – Number of problems/datasets on which the algorithms have been tested.
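The formula above can be checked by hand with a small sketch. Here the critical value q_alpha is passed in directly rather than looked up, so the helper `nemenyi_cd` is an assumption for this example and not the library function; the value 2.343 is the tabulated q_0.05 for three algorithms in Demšar's paper (already divided by sqrt(2)).

```python
import math


def nemenyi_cd(q_alpha, num_alg, num_dataset):
    """Critical difference from a given critical value (hypothetical helper)."""
    return q_alpha * math.sqrt(num_alg * (num_alg + 1) / (6.0 * num_dataset))


# Three algorithms compared on ten datasets at alpha = 0.05.
cd = nemenyi_cd(2.343, num_alg=3, num_dataset=10)
print(round(cd, 4))
```

Two algorithms whose average ranks differ by more than this value would be considered significantly different under the Nemenyi test.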
jmetal.lab.statistical_test.functions.friedman_aligned_ph_test(data, control=None, apv_procedure=None)
Friedman Aligned Ranks post-hoc test.
data – An (n x k) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
control – optional int or string. Default None. Index or name of the control algorithm. If control is None, the test considers all possible comparisons among algorithms.
apv_procedure – optional string. Default None. Name of the procedure for computing adjusted p-values. If apv_procedure is None, adjusted p-values are not computed; otherwise they are computed according to the specified procedure:
For 1 vs all comparisons: {‘Bonferroni’, ‘Holm’, ‘Hochberg’, ‘Holland’, ‘Finner’, ‘Li’}
For all vs all comparisons: {‘Shaffer’, ‘Holm’, ‘Nemenyi’}
Returns
Test statistic.
The p-value according to the Studentized range distribution.
jmetal.lab.statistical_test.functions.friedman_aligned_rank_test(data)
Method of aligned ranks for the Friedman test.
Note
Null Hypothesis: In a set of k (>=2) treatments (or tested algorithms), all the treatments are equivalent, so their average ranks should be equal.
data – An (n x k) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
Returns
The associated p-value.
Friedman’s aligned rank chi-square statistic.
jmetal.lab.statistical_test.functions.friedman_ph_test(data, control=None, apv_procedure=None)
Friedman post-hoc test.
data – An (n x k) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
control – optional int or string. Default None. Index or name of the control algorithm. If control is None, the test considers all possible comparisons among algorithms.
apv_procedure – optional string. Default None. Name of the procedure for computing adjusted p-values. If apv_procedure is None, adjusted p-values are not computed; otherwise they are computed according to the specified procedure:
For 1 vs all comparisons: {‘Bonferroni’, ‘Holm’, ‘Hochberg’, ‘Holland’, ‘Finner’, ‘Li’}
For all vs all comparisons: {‘Shaffer’, ‘Holm’, ‘Nemenyi’}
Returns
Test statistic.
The p-value according to the Studentized range distribution.
jmetal.lab.statistical_test.functions.friedman_test(data)
Friedman ranking test.
Note
Null Hypothesis: In a set of k (>=2) treatments (or tested algorithms), all the treatments are equivalent, so their average ranks should be equal.
data – An (n x k) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
Returns
The associated p-value.
Friedman’s chi-square.
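For orientation, the classical Friedman chi-square statistic can be computed from the per-problem ranks as 12n / (k(k + 1)) * (sum of squared average ranks - k(k + 1)^2 / 4). The sketch below takes an already ranked matrix (rows are problems, columns are algorithms); the helper name and this input convention are assumptions for the example, and the p-value step is left out because it needs a chi-square distribution that the standard library does not provide.

```python
def friedman_chi_square(rank_matrix):
    """Friedman statistic from per-problem ranks (hypothetical helper, not the library code)."""
    n = len(rank_matrix)      # number of problems
    k = len(rank_matrix[0])   # number of algorithms
    avg_ranks = [sum(row[j] for row in rank_matrix) / n for j in range(k)]
    sum_sq = sum(r * r for r in avg_ranks)
    chi2 = 12.0 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)
    return avg_ranks, chi2


# Ranks of three algorithms on four problems (1 = best).
ranked = [[1, 2, 3], [2, 1, 3], [1, 3, 2], [1, 2, 3]]
avg_ranks, chi2 = friedman_chi_square(ranked)
print(avg_ranks, chi2)
```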
jmetal.lab.statistical_test.functions.quade_ph_test(data, control=None, apv_procedure=None)
Quade post-hoc test.
data – An (n x k) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
control – optional int or string. Default None. Index or name of the control algorithm. If control is None, the test considers all possible comparisons among algorithms.
apv_procedure – optional string. Default None. Name of the procedure for computing adjusted p-values. If apv_procedure is None, adjusted p-values are not computed; otherwise they are computed according to the specified procedure:
For 1 vs all comparisons: {‘Bonferroni’, ‘Holm’, ‘Hochberg’, ‘Holland’, ‘Finner’, ‘Li’}
For all vs all comparisons: {‘Shaffer’, ‘Holm’, ‘Nemenyi’}
Returns
Test statistic.
The p-value according to the Studentized range distribution.
jmetal.lab.statistical_test.functions.quade_test(data)
Quade test.
Note
Null Hypothesis: In a set of k (>=2) treatments (or tested algorithms), all the treatments are equivalent, so their average ranks should be equal.
data – An (n x k) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
Returns
The associated p-value from the F-distribution.
Computed F-value.
jmetal.lab.statistical_test.functions.ranks(data: numpy.array, descending=False)
Computes the rank of the elements in data.
data – 2-D matrix.
descending – boolean (default False). If true, rank is sorted in descending order.
Returns
ranks, where ranks[i][j] == rank of the i-th row w.r.t the j-th column.
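One plausible reading of this function is a row-wise ranking in which each row (a problem) is ranked across its columns (the algorithms). The sketch below implements that reading with tied values sharing their average rank; the name `rank_rows`, the list-of-lists input, and the tie handling are assumptions for this example and are not confirmed by the description above.

```python
def rank_rows(data, descending=False):
    """Rank the entries of each row, averaging ties (hypothetical helper)."""
    ranked = []
    for row in data:
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=descending)
        row_ranks = [0.0] * len(row)
        i = 0
        while i < len(order):
            # Extend j to cover the run of values tied with position i.
            j = i
            while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
                j += 1
            average = (i + j) / 2.0 + 1.0  # positions are 0-based, ranks 1-based
            for t in range(i, j + 1):
                row_ranks[order[t]] = average
            i = j + 1
        ranked.append(row_ranks)
    return ranked


print(rank_rows([[0.3, 0.1, 0.1]]))
print(rank_rows([[0.3, 0.1, 0.1]], descending=True))
```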
jmetal.lab.statistical_test.functions.sign_test(data)
Given the results drawn from two algorithms/methods X and Y, the sign test analyses whether there is a difference between X and Y.
Note
Null Hypothesis: Pr(X<Y) = 0.5
data – An (n x 2) array or DataFrame containing the results. Each column represents an algorithm and each row a problem.
Returns
The associated p-value from the binomial distribution.
Number of successes.
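A minimal sketch of a two-sided sign test under the null hypothesis Pr(X<Y) = 0.5, using only the standard library. Dropping tied pairs, counting a "success" as X < Y, and the helper name `sign_test_sketch` are assumptions made for this example; the library may treat ties or define successes differently.

```python
from math import comb


def sign_test_sketch(pairs):
    """Two-sided binomial sign test on (x, y) pairs (hypothetical helper)."""
    wins = sum(1 for x, y in pairs if x < y)    # X smaller than Y
    losses = sum(1 for x, y in pairs if x > y)  # X larger than Y
    n = wins + losses                           # tied pairs are dropped (assumption)
    if n == 0:
        return 1.0, wins
    smaller = min(wins, losses)
    # Probability of a count at least as extreme as `smaller` in one tail.
    tail = sum(comb(n, i) for i in range(smaller + 1)) * 0.5 ** n
    return min(1.0, 2.0 * tail), wins


# X is smaller than Y on all five problems.
p_value, successes = sign_test_sketch([(1, 2)] * 5)
print(p_value, successes)
```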