credible.bayesian.metrics
Implementation of scikit-learn compatible measures with Bayesian credible regions.
Module Attributes
NUMBER_MC_SAMPLES | Suggested number of samples to use for Monte Carlo simulations in this package.
Functions
accuracy_score | Accuracy binary classification score.
average_precision_score | Compute average precision (AP) from prediction scores.
det_curve | Compute the Detection Error-Tradeoff (DET) curve.
f1_score | Return the mean, mode, upper and lower bounds of the credible region of the F1 score.
jaccard_score | Jaccard binary classification score.
precision_recall_curve | Compute Precision-Recall (PR) curve.
precision_score | Precision binary classification score.
recall_score | Recall binary classification score.
roc_auc_score | Calculate the area under the ROC (FPR vs TPR) curve.
roc_curve | Compute Receiver operating characteristic (ROC).
specificity_score | Specificity binary classification score.
- credible.bayesian.metrics.NUMBER_MC_SAMPLES = 100000
Suggested number of samples to use for Monte Carlo simulations in this package.
- credible.bayesian.metrics.precision_score(y_true, y_pred, lambda_=1.0, coverage=0.95)
Precision binary classification score.
Also known as positive predictive value (PPV). Reports its mean, mode and credible interval. It corresponds arithmetically to tp/(tp+fp). This function only supports binary classification problems.
- Parameters:
y_pred (Iterable[int]) – Predicted labels, as returned by a classifier.
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior. Changes in this value do not significantly affect the outcome, unless tp or fp are very small (close to 1).
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Returns:
Tuple with 4 floating-point numbers:
The actual precision, as would be returned by scikit-learn.
The mode of the posterior distribution. It is typically close to the value estimated by scikit-learn.
The lower bound of the credible interval.
The upper bound of the credible interval.
- Return type: tuple[float, float, float, float]
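As a rough illustration of the quantities described above, the sketch below counts tp and fp from two label lists and summarises a Beta(tp + λ, fp + λ) posterior in closed form. The helper name `precision_posterior` and the exact posterior parameterisation are assumptions made for illustration; the package's own implementation may differ.

```python
def precision_posterior(y_true, y_pred, lambda_=1.0):
    """Hypothetical helper: point estimate, posterior mean and mode of precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Assumed posterior: Beta(a, b) with a = tp + lambda_ and b = fp + lambda_.
    a, b = tp + lambda_, fp + lambda_
    point = tp / (tp + fp)   # plain tp/(tp+fp); undefined if nothing is predicted positive
    mean = a / (a + b)       # mean of a Beta(a, b) distribution
    # The Beta mode (a-1)/(a+b-2) is only defined when a > 1 and b > 1.
    mode = (a - 1) / (a + b - 2) if a > 1 and b > 1 else float("nan")
    return point, mean, mode


y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
point, mean, mode = precision_posterior(y_true, y_pred)
print(point, mean, mode)
```

Under this assumed posterior and a flat prior (λ = 1), the mode reduces to tp/(tp+fp), which is consistent with the mode being close to the scikit-learn value.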
- credible.bayesian.metrics.recall_score(y_true, y_pred, lambda_=1.0, coverage=0.95)
Recall binary classification score.
Also known as sensitivity, hit rate, or true positive rate (TPR). Reports its mean, mode and credible interval. It corresponds arithmetically to tp/(tp+fn).
- Parameters:
y_pred (Iterable[int]) – Predicted labels, as returned by a classifier.
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior. Changes in this value do not significantly affect the outcome, unless tp or fn are very small (close to 1).
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Returns:
Tuple with 4 floating-point numbers:
The actual recall, as would be returned by scikit-learn.
The mode of the posterior distribution: this represents the best estimate of the recall a posteriori. It is typically close to the value estimated by scikit-learn.
The lower bound of the credible interval.
The upper bound of the credible interval.
- Return type: tuple[float, float, float, float]
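The coverage parameter selects an equal-tailed interval: a fraction (1 − coverage)/2 of the posterior mass is left out on each side. The sketch below approximates such an interval for recall by sampling from an assumed Beta(tp + λ, fn + λ) posterior using only the standard library. The function name `equal_tailed_interval` is hypothetical, and the package may well compute these bounds analytically rather than by sampling.

```python
import random


def equal_tailed_interval(k, l, lambda_=1.0, coverage=0.95, nb_samples=20000, seed=0):
    """Approximate equal-tailed credible interval of Beta(k + lambda_, l + lambda_)."""
    rng = random.Random(seed)
    samples = sorted(
        rng.betavariate(k + lambda_, l + lambda_) for _ in range(nb_samples)
    )
    tail = (1.0 - coverage) / 2.0  # probability mass excluded on each side
    lower = samples[int(tail * nb_samples)]
    upper = samples[min(int((1.0 - tail) * nb_samples), nb_samples - 1)]
    return lower, upper


# Example: tp = 40 and fn = 10, so the point estimate of recall is 0.8.
lower, upper = equal_tailed_interval(40, 10)
print(lower, upper)
```

Lowering `coverage` narrows the interval, since more mass is excluded from each tail.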
- credible.bayesian.metrics.specificity_score(y_true, y_pred, lambda_=1.0, coverage=0.95)
Specificity binary classification score.
Also known as selectivity or true negative rate (TNR). Reports its mean, mode and credible interval. It corresponds arithmetically to tn/(tn+fp).
- Parameters:
y_pred (Iterable[int]) – Predicted labels, as returned by a classifier.
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior. Changes in this value do not significantly affect the outcome, unless tn or fp are very small (close to 1).
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Returns:
Tuple with 4 floating-point numbers:
The actual specificity, as would be returned by scikit-learn.
The mode of the posterior distribution: this represents the best estimate of the specificity a posteriori. It is typically close to the value estimated by scikit-learn.
The lower bound of the credible interval.
The upper bound of the credible interval.
- Return type: tuple[float, float, float, float]
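The remark that the choice of prior only matters for very small counts can be checked numerically. The sketch below compares the posterior mean under a flat prior (λ = 1) and under Jeffreys prior (λ = 0.5), assuming a Beta(tn + λ, fp + λ) posterior for specificity. The helper name `posterior_mean` is illustrative and not part of the package.

```python
def posterior_mean(k, l, lambda_):
    """Mean of an assumed Beta(k + lambda_, l + lambda_) posterior."""
    return (k + lambda_) / (k + l + 2 * lambda_)


# Same tn/(tn+fp) ratio of 2/3, once with tiny counts and once with large ones.
small = (2, 1)        # tn = 2, fp = 1
large = (2000, 1000)  # tn = 2000, fp = 1000

shift_small = abs(posterior_mean(*small, 1.0) - posterior_mean(*small, 0.5))
shift_large = abs(posterior_mean(*large, 1.0) - posterior_mean(*large, 0.5))
print(f"small counts: {shift_small:.4f}, large counts: {shift_large:.6f}")
```

With only three samples the two priors move the estimate visibly, while with thousands of samples the difference is negligible.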
- credible.bayesian.metrics.accuracy_score(y_true, y_pred, lambda_=1.0, coverage=0.95)
Accuracy binary classification score.
See Accuracy. Accuracy is the proportion of correct predictions (both true positives and true negatives) among the total number of samples examined. It corresponds arithmetically to (tp+tn)/(tp+tn+fp+fn). This measure includes both true negatives and true positives in the numerator, which makes it sensitive to data or regions without annotations.
- Parameters:
y_pred (Iterable[int]) – Predicted labels, as returned by a classifier.
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior. Changes in this value do not significantly affect the outcome, unless the counts involved (tp, tn, fp, fn) are very small (close to 1).
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Returns:
Tuple with 4 floating-point numbers:
The actual accuracy, as would be returned by scikit-learn.
The mode of the posterior distribution: this represents the best estimate of the accuracy a posteriori. It is typically close to the value estimated by scikit-learn.
The lower bound of the credible interval.
The upper bound of the credible interval.
- Return type: tuple[float, float, float, float]
- credible.bayesian.metrics.jaccard_score(y_true, y_pred, lambda_=1.0, coverage=0.95)
Jaccard binary classification score.
See Jaccard Index or Similarity. It corresponds arithmetically to tp/(tp+fp+fn). The Jaccard index depends on a TP-only numerator, similarly to the F1 score. For regions where there are no annotations, the Jaccard index will always be zero, irrespective of the model output. Accuracy may be a better proxy if one needs to consider the true absence of annotations in a region as part of the measure.
- Parameters:
y_pred (Iterable[int]) – Predicted labels, as returned by a classifier.
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior. Changes in this value do not significantly affect the outcome, unless the counts involved (tp, fp, fn) are very small (close to 1).
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Returns:
Tuple with 4 floating-point numbers:
The actual Jaccard score, as would be returned by scikit-learn.
The mode of the posterior distribution: this represents the best estimate of the Jaccard score a posteriori. It is typically close to the value estimated by scikit-learn.
The lower bound of the credible interval.
The upper bound of the credible interval.
- Return type: tuple[float, float, float, float]
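The contrast between a TP-only numerator and accuracy can be seen on a toy region without any positive annotation. The sketch below uses a hypothetical helper, `confusion_counts`, and plain arithmetic on the formulas quoted above; it does not call the package.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, tn, fn


# A region without any positive annotation; the model raises one false alarm.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)

jaccard = tp / (tp + fp + fn)               # numerator is tp, which is zero here
accuracy = (tp + tn) / (tp + tn + fp + fn)  # true negatives are rewarded
print(jaccard, accuracy)
```

The Jaccard index carries no information about the nine correctly rejected samples, whereas accuracy does.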
- credible.bayesian.metrics.f1_score(y_true, y_pred, rng, lambda_=1.0, coverage=0.95, nb_samples=100000)
Return the mean, mode, upper and lower bounds of the credible region of the F1 score.
See F1-score. It corresponds arithmetically to 2*P*R/(P+R) or 2*tp/(2*tp+fp+fn). The F1 or Dice score depends on a TP-only numerator, similarly to the Jaccard index. For regions where there are no annotations, the F1-score will always be zero, irrespective of the model output. Accuracy may be a better proxy if one needs to consider the true absence of annotations in a region as part of the measure.
This implementation is based on [GOUTTE-2005].
- Parameters:
y_pred (Iterable[int]) – Predicted labels, as returned by a classifier.
rng (Generator) – An initialized numpy random number generator.
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior.
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
nb_samples (int) – Number of generated variates for the Monte Carlo simulation.
- Returns:
Tuple with 4 floating-point numbers:
The actual F1 score, as would be returned by scikit-learn.
The mode of the posterior distribution: this represents the best estimate of the F1 score a posteriori. It is typically close to the value estimated by scikit-learn.
The lower bound of the credible interval.
The upper bound of the credible interval.
- Return type: tuple[float, float, float, float]
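The F1 score is the one measure here that takes a random generator and a sample count, because its credible region is estimated by Monte Carlo simulation. The sketch below illustrates the general idea with the standard library's `random` module rather than a numpy Generator. It assumes independent Gamma variates U ~ Gamma(tp + λ) and V ~ Gamma(fp + fn + 2λ) and forms F1 = 2U/(2U + V), in the spirit of [GOUTTE-2005]. The function name, the sampling scheme and the reduced sample count are all assumptions, not the package's code.

```python
import random


def f1_monte_carlo(tp, fp, fn, lambda_=1.0, coverage=0.95, nb_samples=20000, seed=0):
    """Hypothetical sketch: sample an F1 posterior and summarise it."""
    rng = random.Random(seed)
    samples = []
    for _ in range(nb_samples):
        u = rng.gammavariate(tp + lambda_, 1.0)           # weight of true positives
        v = rng.gammavariate(fp + fn + 2 * lambda_, 1.0)  # weight of fp and fn errors
        samples.append(2 * u / (2 * u + v))
    samples.sort()
    tail = (1.0 - coverage) / 2.0  # mass excluded on each side
    mean = sum(samples) / nb_samples
    lower = samples[int(tail * nb_samples)]
    upper = samples[min(int((1.0 - tail) * nb_samples), nb_samples - 1)]
    return mean, lower, upper


tp, fp, fn = 40, 5, 10
point = 2 * tp / (2 * tp + fp + fn)  # 80/95, the plain F1 score
mean, lower, upper = f1_monte_carlo(tp, fp, fn)
print(point, mean, lower, upper)
```

Fixing the seed makes the estimate reproducible; increasing `nb_samples` reduces the Monte Carlo noise in the interval bounds.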
- credible.bayesian.metrics.roc_curve(y_true, y_score, lambda_=1.0, coverage=0.95)
Compute Receiver operating characteristic (ROC).
Approximately follows the API of sklearn.metrics.roc_curve().
Important
The returned credible regions are not immediately usable for plots or the evaluation of the area under the curve, only as point estimates for individual thresholds. To plot, feed the output of this function to curves.curve_ci_hull() and use the lower and upper estimates provided by that function instead.
- Parameters:
y_score (Iterable[float]) – Target scores: either probability estimates of the positive class, confidence values, or non-thresholded decision measures (as returned by “decision_function” on some classifiers).
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior.
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Return type:
tuple[ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]]]
- Returns:
Seven 1-D floating point arrays corresponding to:
FPR (false positive rates)
TPR (true positive rates)
The thresholds used to evaluate the selected metrics
The lower bound of the credible interval for the FPR
The lower bound of the credible interval for the TPR
The upper bound of the credible interval for the FPR
The upper bound of the credible interval for the TPR
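To make the first three returned arrays concrete, the sketch below computes one (FPR, TPR) point per distinct score, treating a sample as positive when its score reaches the threshold. It is a plain-Python illustration with a hypothetical function name, `roc_points`; neither scikit-learn nor this package is claimed to compute the curve exactly this way, and the credible bounds are left out.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr, thresholds), one point per distinct score, descending."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    thresholds = sorted(set(y_score), reverse=True)
    fpr, tpr = [], []
    for thr in thresholds:
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 0)
        tpr.append(tp / pos)
        fpr.append(fp / neg)
    return fpr, tpr, thresholds


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr, tpr, thresholds = roc_points(y_true, y_score)
print(list(zip(thresholds, fpr, tpr)))
```

Each rate is a ratio of counts at a single threshold, which is why the credible bounds are meaningful per threshold but need further processing before they describe a region around the whole curve.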
- credible.bayesian.metrics.roc_auc_score(y_true, y_score, lambda_=1.0, coverage=0.95)
Calculate the area under the ROC (FPR vs TPR) curve.
This function mimics the scikit-learn API, except that it also returns lower and upper bounds derived from the credible regions defined at each threshold.
- Parameters:
y_score (Iterable[float]) – Target scores: either probability estimates of the positive class, confidence values, or non-thresholded decision measures (as returned by “decision_function” on some classifiers).
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior.
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Return type: tuple[float, float, float]
- Returns:
A tuple with 3 floats:
the area under the ROC (FPR vs. TPR) curve
the lower bound considering the credible region defined by the lambda_ and coverage parameters
the upper bound considering the credible region defined by the lambda_ and coverage parameters
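For the point estimate, the area under a ROC curve is commonly obtained with the trapezoidal rule over the (FPR, TPR) points. The sketch below shows only that step, with a hypothetical helper `trapezoid_area`. How the package turns per-threshold credible regions into lower and upper areas is not covered here.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    points = sorted(zip(x, y))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# ROC points of a small four-sample example, with the (0, 0) origin prepended.
fpr = [0.0, 0.0, 0.5, 0.5, 1.0]
tpr = [0.0, 0.5, 0.5, 1.0, 1.0]
auc = trapezoid_area(fpr, tpr)
print(auc)
```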
- credible.bayesian.metrics.det_curve(y_true, y_score, lambda_=1.0, coverage=0.95)
Compute the Detection Error-Tradeoff (DET) curve.
Approximately follows the API of sklearn.metrics.det_curve().
Important
The returned credible regions are not immediately usable for plots or the evaluation of the area under the curve, only as point estimates for individual thresholds. To plot, feed the output of this function to curves.curve_ci_hull() and use the lower and upper estimates provided by that function instead.
- Parameters:
y_score (Iterable[float]) – Target scores: either probability estimates of the positive class, confidence values, or non-thresholded decision measures (as returned by “decision_function” on some classifiers).
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior.
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Return type:
tuple[ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]]]
- Returns:
Seven 1-D floating point arrays corresponding to:
FPR (false positive rates)
FNR (false negative rates)
The thresholds used to evaluate the selected metrics
The lower bound of the credible interval for the FPR
The lower bound of the credible interval for the FNR
The upper bound of the credible interval for the FPR
The upper bound of the credible interval for the FNR
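A DET curve plots the two error rates against each other: false positives among the negatives and false negatives among the positives. The sketch below computes both per threshold in plain Python. The function name `det_points` and the convention that a score equal to the threshold counts as a positive prediction are assumptions for illustration.

```python
def det_points(y_true, y_score):
    """Return (fpr, fnr, thresholds), one point per distinct score, ascending."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    thresholds = sorted(set(y_score))
    fpr, fnr = [], []
    for thr in thresholds:
        # Negatives scoring at or above the threshold are false positives.
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 0)
        # Positives scoring below the threshold are false negatives.
        fn = sum(1 for t, s in zip(y_true, y_score) if s < thr and t == 1)
        fpr.append(fp / neg)
        fnr.append(fn / pos)
    return fpr, fnr, thresholds


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr, fnr, thresholds = det_points(y_true, y_score)
print(list(zip(thresholds, fpr, fnr)))
```

As the threshold rises, FPR can only fall and FNR can only rise, which gives the DET curve its characteristic trade-off shape.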
- credible.bayesian.metrics.precision_recall_curve(y_true, y_score, lambda_=1.0, coverage=0.95)
Compute Precision-Recall (PR) curve.
Approximately follows the API of sklearn.metrics.precision_recall_curve().
Note
This package computes the precision-recall curve in a similar, but slightly different way than scikit-learn. It does not add an extra (1.0, 0.0) point at the end of the PR curve (cf. the documentation for sklearn.metrics.precision_recall_curve()).
Important
The returned credible regions are not immediately usable for plots or the evaluation of the area under the curve, only as point estimates for individual thresholds. To plot, feed the output of this function to curves.curve_ci_hull() and use the lower and upper estimates provided by that function instead.
- Parameters:
y_score (Iterable[float]) – Target scores: either probability estimates of the positive class, confidence values, or non-thresholded decision measures (as returned by “decision_function” on some classifiers).
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior.
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Return type:
tuple[ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]],ndarray[tuple[int,...],dtype[float64]]]
- Returns:
Seven 1-D floating point arrays corresponding to:
Precision
Recall
The thresholds used to evaluate the selected metrics
The lower bound of the credible interval for the Precision
The lower bound of the credible interval for the Recall
The upper bound of the credible interval for the Precision
The upper bound of the credible interval for the Recall
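The difference with scikit-learn mentioned in the note can be pictured with a small plain-Python sketch: one (precision, recall) pair per distinct score, and no extra (1.0, 0.0) point appended. The function name `pr_points` and the threshold convention (a score equal to the threshold counts as positive) are assumptions for illustration, not the package's code.

```python
def pr_points(y_true, y_score):
    """Return (precision, recall, thresholds), one point per distinct score."""
    pos = sum(y_true)
    thresholds = sorted(set(y_score))
    precision, recall = [], []
    for thr in thresholds:
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == 0)
        # tp + fp >= 1 because each threshold is itself one of the scores.
        precision.append(tp / (tp + fp))
        recall.append(tp / pos)
    return precision, recall, thresholds


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
precision, recall, thresholds = pr_points(y_true, y_score)
print(list(zip(thresholds, precision, recall)))
```

Here the three arrays have the same length, whereas scikit-learn's precision and recall arrays are one element longer than its thresholds array because of the appended point.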
- credible.bayesian.metrics.average_precision_score(y_true, y_score, lambda_=1.0, coverage=0.95)
Compute average precision (AP) from prediction scores.
This function mimics the scikit-learn API, except that it also returns lower and upper bounds derived from the credible regions defined at each threshold.
- Parameters:
y_score (Iterable[float]) – Target scores: either probability estimates of the positive class, confidence values, or non-thresholded decision measures (as returned by “decision_function” on some classifiers).
lambda_ – The parameterisation of the Beta prior to consider. Use \(\lambda=1\) for a flat prior. Use \(\lambda=0.5\) for Jeffreys prior.
coverage (float) – A floating-point number between 0 and 1.0 indicating the expected coverage. A value of 0.95 ensures that 95% of the area under the posterior probability density is covered by the returned equal-tailed interval.
- Return type: tuple[float, float, float]
- Returns:
A tuple with 3 floats:
the average precision (AP)
the lower bound considering the credible region defined by the lambda_ and coverage parameters
the upper bound considering the credible region defined by the lambda_ and coverage parameters
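scikit-learn defines average precision as the sum of precisions weighted by the increase in recall from one threshold to the next, AP = Σ (Rₙ − Rₙ₋₁) Pₙ. The sketch below applies that definition to a handful of points. Whether this package uses exactly the same summation, and how it derives the lower and upper bounds, is an assumption left open here; `average_precision` is a hypothetical helper.

```python
def average_precision(precision, recall):
    """Step-wise sum of precision weighted by the increase in recall.

    Points must be ordered by decreasing threshold, so recall never decreases.
    """
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap


# Points of a small four-sample example, ordered by decreasing threshold.
precision = [1.0, 0.5, 2 / 3, 0.5]
recall = [0.5, 0.5, 1.0, 1.0]
ap = average_precision(precision, recall)
print(ap)
```

Thresholds at which recall does not change contribute nothing to the sum, so only the points where a new positive is retrieved matter.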