API Reference
This page provides the complete API documentation for PySODMetrics.
Core Metrics Module
- class py_sod_metrics.sod_metrics.Fmeasure(beta: float = 0.3)[source]
Bases: object
F-measure evaluator for salient object detection.
Computes precision, recall, and F-measure at multiple thresholds, supporting both adaptive and dynamic evaluation modes.
```
@inproceedings{Fmeasure,
  title={Frequency-tuned salient region detection},
  author={Achanta, Radhakrishna and Hemami, Sheila and Estrada, Francisco and S{\"u}sstrunk, Sabine},
  booktitle=CVPR,
  number={CONF},
  pages={1597--1604},
  year={2009}
}
```
- __init__(beta: float = 0.3)[source]
Initialize the F-measure evaluator.
- Parameters:
beta (float) – The weight of precision; used as β² in the F-measure formula. Defaults to 0.3.
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- cal_adaptive_fm(pred: ndarray, gt: ndarray) float[source]
Calculate the adaptive F-measure.
- Returns:
adaptive_fm
- Return type:
float
- cal_pr(pred: ndarray, gt: ndarray) tuple[source]
Calculate the corresponding precision and recall when the threshold changes from 0 to 255.
These precisions and recalls can be used to obtain the mean F-measure, maximum F-measure, precision-recall curve and F-measure-threshold curve.
For convenience, changeable_fms is provided here, which can be used directly to obtain the mean F-measure, maximum F-measure and F-measure-threshold curve.
- Returns:
(precisions, recalls, changeable_fms)
- Return type:
tuple
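A minimal usage sketch, assuming image files loaded with OpenCV (file names are placeholders) and the library's get_results() accessor, which is described in the project README rather than in this section:

```python
import cv2
from py_sod_metrics.sod_metrics import Fmeasure

fm = Fmeasure(beta=0.3)

# Accumulate statistics for each prediction/ground-truth pair.
pred = cv2.imread("pred.png", cv2.IMREAD_GRAYSCALE)  # uint8 in [0, 255]
gt = cv2.imread("gt.png", cv2.IMREAD_GRAYSCALE)      # uint8 in [0, 255]
fm.step(pred=pred, gt=gt)

# Aggregate over all samples; the "fm" key layout ("adp" scalar plus a
# per-threshold "curve") is assumed from the project README.
results = fm.get_results()["fm"]
print("adaptive F-measure:", results["adp"])
print("mean F-measure:", results["curve"].mean())
print("max F-measure:", results["curve"].max())
```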
- class py_sod_metrics.sod_metrics.MAE[source]
Bases: object
Mean Absolute Error.
Computes the MAE between predicted saliency maps and ground truth masks.
```
@inproceedings{MAE,
  title={Saliency filters: Contrast based filtering for salient region detection},
  author={Perazzi, Federico and Kr{\"a}henb{\"u}hl, Philipp and Pritch, Yael and Hornung, Alexander},
  booktitle=CVPR,
  pages={733--740},
  year={2012}
}
```
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
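The same two-phase step()/aggregate pattern applies to MAE; a sketch with synthetic inputs (get_results() and its "mae" key are assumed from the project README):

```python
import numpy as np
from py_sod_metrics.sod_metrics import MAE

mae = MAE()

# Synthetic uint8 arrays stand in for real saliency maps and masks.
rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
gt = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

mae.step(pred=pred, gt=gt)
print("MAE:", mae.get_results()["mae"])  # averaged over all step() calls
```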
- class py_sod_metrics.sod_metrics.Smeasure(alpha: float = 0.5)[source]
Bases: object
S-measure evaluates foreground maps by considering both object-aware and region-aware structural similarity between prediction and ground truth. It combines object-level and region-level scores to provide a comprehensive assessment of structural quality.
```
@inproceedings{Smeasure,
  title={Structure-measure: A new way to evaluate foreground maps},
  author={Fan, Deng-Ping and Cheng, Ming-Ming and Liu, Yun and Li, Tao and Borji, Ali},
  booktitle=ICCV,
  pages={4548--4557},
  year={2017}
}
```
- __init__(alpha: float = 0.5)[source]
Initialize S-measure (Structure-measure) evaluator.
- Parameters:
alpha (float, optional) – Weight for balancing the object score and the region score. Higher values give more weight to object-level similarity. Valid range: [0, 1]. Defaults to 0.5 for equal weighting.
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- cal_sm(pred: ndarray, gt: ndarray) float[source]
Calculate the S-measure (Structure-measure) score.
Computes a weighted combination of object-aware and region-aware structural similarity scores. For edge cases (all foreground or all background), returns simplified metrics.
- Parameters:
pred (np.ndarray) – Normalized prediction map with values in [0, 1].
gt (np.ndarray) – Binary ground truth mask.
- Returns:
S-measure score in range [0, 1], where higher is better.
- Return type:
float
- s_object(x: ndarray) float[source]
Calculate object-aware score for a region.
Computes a similarity score that considers both mean and standard deviation of the input region.
- Parameters:
x (np.ndarray) – Input region data.
- Returns:
Object-aware similarity score.
- Return type:
float
- object(pred: ndarray, gt: ndarray) float[source]
Calculate the object-level structural similarity score.
Evaluates structural similarity separately for foreground and background regions, then combines them using the ratio of foreground pixels.
- Parameters:
pred (np.ndarray) – Normalized prediction map with values in [0, 1].
gt (np.ndarray) – Binary ground truth mask.
- Returns:
Object-level similarity score.
- Return type:
float
- region(pred: ndarray, gt: ndarray) float[source]
Calculate the region-level structural similarity score.
Divides the image into four quadrants based on the foreground centroid, then calculates SSIM for each quadrant weighted by its area.
- Parameters:
pred (np.ndarray) – Normalized prediction map with values in [0, 1].
gt (np.ndarray) – Binary ground truth mask.
- Returns:
Region-level similarity score.
- Return type:
float
- ssim(pred: ndarray, gt: ndarray) float[source]
Calculate the SSIM (Structural Similarity Index) score.
Computes structural similarity based on luminance, contrast, and structure comparisons between prediction and ground truth regions.
- Parameters:
pred (np.ndarray) – Prediction region.
gt (np.ndarray) – Ground truth region.
- Returns:
SSIM score in range [0, 1].
- Return type:
float
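A sketch for S-measure, following the same pattern (the "sm" results key is assumed from the project README):

```python
import numpy as np
from py_sod_metrics.sod_metrics import Smeasure

sm = Smeasure(alpha=0.5)

rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
gt = (rng.random((64, 64)) > 0.5).astype(np.uint8) * 255  # binary mask in {0, 255}

sm.step(pred=pred, gt=gt)
print("S-measure:", sm.get_results()["sm"])
```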
- class py_sod_metrics.sod_metrics.Emeasure[source]
Bases: object
E-measure assesses binary foreground map quality by measuring the alignment between prediction and ground truth using an enhanced alignment matrix. It addresses limitations of traditional metrics by considering spatial alignment and local/global pixel matching.
```
@inproceedings{Emeasure,
  title={Enhanced-alignment Measure for Binary Foreground Map Evaluation},
  author={Deng-Ping {Fan} and Cheng {Gong} and Yang {Cao} and Bo {Ren} and Ming-Ming {Cheng} and Ali {Borji}},
  booktitle=IJCAI,
  pages={698--704},
  year={2018}
}
```
Note
More implementation details: https://www.yuque.com/lart/blog/lwgt38
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- cal_adaptive_em(pred: ndarray, gt: ndarray) float[source]
Calculate the adaptive E-measure using an adaptive threshold.
Uses twice the mean prediction value as the adaptive threshold to binarize the prediction before computing E-measure.
- Parameters:
pred (np.ndarray) – Normalized prediction map with values in [0, 1].
gt (np.ndarray) – Binary ground truth mask.
- Returns:
Adaptive E-measure score.
- Return type:
float
- cal_changeable_em(pred: ndarray, gt: ndarray) ndarray[source]
Calculate E-measure scores across all thresholds from 0 to 255.
Computes the E-measure for 257 different thresholds, enabling analysis of maximum E-measure, mean E-measure, and E-measure-threshold curves.
- Parameters:
pred (np.ndarray) – Normalized prediction map with values in [0, 1].
gt (np.ndarray) – Binary ground truth mask.
- Returns:
Array of 257 E-measure scores corresponding to thresholds [0, 255].
- Return type:
np.ndarray
- cal_em_with_threshold(pred: ndarray, gt: ndarray, threshold: float) float[source]
Calculate the E-measure for a specific binarization threshold.
Computes enhanced alignment based on four regions: true positives, false positives, false negatives, and true negatives.
- Parameters:
pred (np.ndarray) – Normalized prediction map with values in [0, 1].
gt (np.ndarray) – Binary ground truth mask.
threshold (float) – Binarization threshold value.
- Returns:
E-measure score for the given threshold.
- Return type:
float
Note
Variable naming convention: [pred_attr(fg/bg)]_[gt_attr(fg/bg)]_[meaning]; '_' indicates a don't-care attribute.
- cal_em_with_cumsumhistogram(pred: ndarray, gt: ndarray) ndarray[source]
Calculate the E-measure as the threshold varies from 0 to 255, using a cumulative-sum histogram for efficiency.
Variable naming rules within the function: [pred attribute (foreground fg, background bg)]_[gt attribute (foreground fg, background bg)]_[meaning]
If only pred or gt is considered, the other attribute position is replaced with '_'.
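A sketch for E-measure; the "em" results layout ("adp" scalar plus a 257-entry "curve", matching cal_changeable_em above) is assumed from the project README:

```python
import numpy as np
from py_sod_metrics.sod_metrics import Emeasure

em = Emeasure()

rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
gt = (rng.random((64, 64)) > 0.5).astype(np.uint8) * 255

em.step(pred=pred, gt=gt)
results = em.get_results()["em"]
print("adaptive E-measure:", results["adp"])
print("mean E-measure:", results["curve"].mean())
print("max E-measure:", results["curve"].max())
```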
- class py_sod_metrics.sod_metrics.WeightedFmeasure(beta: float = 1)[source]
Bases: object
Weighted F-measure considers both pixel dependency and pixel importance when evaluating foreground maps. It weights different pixels according to their distance from the foreground boundary to provide a more perceptually meaningful assessment than standard F-measure.
```
@inproceedings{wFmeasure,
  title={How to evaluate foreground maps?},
  author={Margolin, Ran and Zelnik-Manor, Lihi and Tal, Ayellet},
  booktitle=CVPR,
  pages={248--255},
  year={2014}
}
```
- __init__(beta: float = 1)[source]
Initialize Weighted F-measure evaluator.
- Parameters:
beta (float, optional) – Weight for balancing precision and recall. Defaults to 1 for equal weighting (F1-score).
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- cal_wfm(pred: ndarray, gt: ndarray) float[source]
Calculate the weighted F-measure score.
Implements the weighted F-measure algorithm, which considers:
1. Pixel dependency: uses the error at the closest GT edge for background pixels.
2. Pixel importance: weights errors by their distance from the foreground.
- Parameters:
pred (np.ndarray) – Normalized prediction map with values in [0, 1].
gt (np.ndarray) – Binary ground truth mask.
- Returns:
Weighted F-measure score based on weighted precision and recall.
- Return type:
float
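Weighted F-measure follows the same pattern; a sketch (the "wfm" results key is assumed from the project README):

```python
import numpy as np
from py_sod_metrics.sod_metrics import WeightedFmeasure

wfm = WeightedFmeasure(beta=1)

rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
gt = (rng.random((64, 64)) > 0.5).astype(np.uint8) * 255

wfm.step(pred=pred, gt=gt)
print("weighted F-measure:", wfm.get_results()["wfm"])
```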
- class py_sod_metrics.sod_metrics.HumanCorrectionEffortMeasure(relax: int = 5, epsilon: float = 2.0)[source]
Bases: object
Human Correction Effort Measure for Dichotomous Image Segmentation.
```
@inproceedings{HumanCorrectionEffortMeasure,
  title = {Highly Accurate Dichotomous Image Segmentation},
  author = {Xuebin Qin and Hang Dai and Xiaobin Hu and Deng-Ping Fan and Ling Shao and Luc Van Gool},
  booktitle = ECCV,
  year = {2022}
}
```
- __init__(relax: int = 5, epsilon: float = 2.0)[source]
Initialize the Human Correction Effort Measure.
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- cal_hce(pred: ndarray, gt: ndarray) float[source]
Calculate the Human Correction Effort (HCE) for a pair of prediction and ground truth.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
- Returns:
The HCE value.
- Return type:
float
- filter_conditional_boundary(contours: list, mask: ndarray, condition: ndarray)[source]
Filter boundary segments based on a given condition mask and compute the number of independent connected regions that require human correction.
- Parameters:
contours (List[np.ndarray]) – List of boundary contours (OpenCV format).
mask (np.ndarray) – Binary mask representing the region of interest.
condition (np.ndarray) – Condition mask used to determine which boundary points need to be considered.
- Returns:
boundaries (List[np.ndarray]): Filtered boundary segments that require correction.
independent_count (int): Number of independent connected regions that need correction (i.e., human editing effort).
- Return type:
Tuple[List[np.ndarray], int]
- count_polygon_control_points(boundaries: list, epsilon: float = 1.0) int[source]
Approximate each boundary using the Ramer-Douglas-Peucker (RDP) algorithm and count the total number of control points of all approximated polygons.
- Parameters:
boundaries (List[np.ndarray]) – List of boundary contours. Each contour is an Nx1x2 numpy array (OpenCV contour format).
epsilon (float) – RDP approximation tolerance. Larger values result in fewer control points.
- Returns:
The total number of control points across all approximated polygons.
- Return type:
int
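A sketch for HCE, assuming it follows the same step()/get_results() convention as the other evaluators in this module; the "hce" results key is an assumption, not documented here:

```python
import numpy as np
from py_sod_metrics.sod_metrics import HumanCorrectionEffortMeasure

hce = HumanCorrectionEffortMeasure(relax=5, epsilon=2.0)

rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)
gt = (rng.random((128, 128)) > 0.5).astype(np.uint8) * 255

hce.step(pred=pred, gt=gt)
print("HCE:", hce.get_results()["hce"])  # results key assumed, see lead-in
```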
FmeasureV2 Module
- class py_sod_metrics.fmeasurev2.IOUHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
Intersection over Union.
iou = tp / (tp + fp + fn)
- class py_sod_metrics.fmeasurev2.SpecificityHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
Specificity.
True negative rate (TNR)/specificity (SPC)/selectivity
specificity = tn / (tn + fp)
- py_sod_metrics.fmeasurev2.TNRHandler
alias of SpecificityHandler
- class py_sod_metrics.fmeasurev2.DICEHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
DICE.
dice = 2 * tp / (2 * tp + fp + fn)
- class py_sod_metrics.fmeasurev2.OverallAccuracyHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
Overall Accuracy.
oa = overall_accuracy = (tp + tn) / (tp + fp + tn + fn)
- class py_sod_metrics.fmeasurev2.KappaHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
Kappa Accuracy.
kappa = (oa - p_) / (1 - p_)
p_ = [(tp + fp)(tp + fn) + (tn + fn)(tn + tp)] / (tp + fp + tn + fn)^2
- __init__(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Initialize the Kappa handler.
- Parameters:
with_dynamic (bool, optional) – Record dynamic results for max/avg/curve versions.
with_adaptive (bool, optional) – Record adaptive results for adp version.
with_binary (bool, optional) – Record binary results for binary version.
sample_based (bool, optional) – Whether to average the metric of each sample or calculate the metric of the dataset. Defaults to True.
- class py_sod_metrics.fmeasurev2.PrecisionHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
Precision.
precision = tp / (tp + fp)
- class py_sod_metrics.fmeasurev2.RecallHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
Recall.
True positive rate (TPR)/recall/sensitivity (SEN)/probability of detection/hit rate/power
recall = tp / (tp + fn)
- py_sod_metrics.fmeasurev2.TPRHandler
alias of RecallHandler
- py_sod_metrics.fmeasurev2.SensitivityHandler
alias of RecallHandler
- class py_sod_metrics.fmeasurev2.FPRHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
False Positive Rate.
False positive rate (FPR)/probability of false alarm/fall-out
fpr = fp / (tn + fp)
- class py_sod_metrics.fmeasurev2.BERHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True)[source]
Bases: _BaseHandler
Balanced Error Rate.
ber = 1 - 0.5 * (tp / (tp + fn) + tn / (tn + fp))
- class py_sod_metrics.fmeasurev2.FmeasureHandler(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True, beta: float = 0.3)[source]
Bases: _BaseHandler
F-measure.
fmeasure = (beta + 1) * precision * recall / (beta * precision + recall)
- __init__(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, sample_based: bool = True, beta: float = 0.3)[source]
Initialize the F-measure handler.
- Parameters:
with_dynamic (bool, optional) – Record dynamic results for max/avg/curve versions.
with_adaptive (bool, optional) – Record adaptive results for adp version.
with_binary (bool, optional) – Record binary results for binary version.
sample_based (bool, optional) – Whether to average the metric of each sample or calculate the metric of the dataset. Defaults to True.
beta (float, optional) – β² in the F-measure formula. Defaults to 0.3.
- class py_sod_metrics.fmeasurev2.FmeasureV2(metric_handlers: dict | None = None)[source]
Bases: object
Enhanced F-measure evaluator with support for multiple evaluation metrics.
This class provides a flexible framework for computing various binary classification metrics including precision, recall, specificity, dice, IoU, and F-measure. It supports dynamic thresholding, adaptive thresholding, and binary evaluation modes.
- __init__(metric_handlers: dict | None = None)[source]
Enhanced Fmeasure class supporting additional related metrics such as precision, recall, specificity, dice, iou, and fmeasure.
- Parameters:
metric_handlers (dict, optional) – Handlers of different metrics. Defaults to None.
- add_handler(handler_name, metric_handler)[source]
Add a metric handler to the evaluator.
- Parameters:
handler_name (str) – Name identifier for the metric handler.
metric_handler – Handler instance that computes the specific metric.
- static get_statistics(binary: ndarray, gt: ndarray, FG: int, BG: int) dict[source]
Calculate the TP, FP, TN and FN counts from a binarized prediction and the ground truth.
- adaptively_binarizing(pred: ndarray, gt: ndarray, FG: int, BG: int) dict[source]
Calculate the TP, FP, TN and FN counts based on an adaptive threshold.
- dynamically_binarizing(pred: ndarray, gt: ndarray, FG: int, BG: int) dict[source]
Calculate the corresponding TP, FP, TN and FN counts as the threshold varies from 0 to 255.
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate statistics for all registered metric handlers for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
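A sketch of composing handlers with FmeasureV2. The dict keys are user-chosen names used to index the results; get_results() and its "adaptive"/"dynamic" sub-keys are assumed from the project README rather than documented here:

```python
import numpy as np
from py_sod_metrics.fmeasurev2 import (
    DICEHandler,
    FmeasureHandler,
    FmeasureV2,
    IOUHandler,
)

fmv2 = FmeasureV2(
    metric_handlers={
        # Keys are arbitrary names used to look up the results later.
        "fm": FmeasureHandler(with_dynamic=True, with_adaptive=True, beta=0.3),
        "iou": IOUHandler(with_dynamic=True, with_adaptive=True),
        "dice": DICEHandler(with_dynamic=True, with_adaptive=True),
    }
)

rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
gt = (rng.random((64, 64)) > 0.5).astype(np.uint8) * 255

fmv2.step(pred=pred, gt=gt)
results = fmv2.get_results()  # assumed accessor, keyed by handler name
print("adaptive F-measure:", results["fm"]["adaptive"])
print("mean IoU over thresholds:", results["iou"]["dynamic"].mean())
```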
Context Measure Module
- class py_sod_metrics.context_measure.ContextMeasure(beta2: float = 1.0, alpha: float = 6.0)[source]
Bases: object
Context-measure for evaluating foreground segmentation quality.
This metric evaluates predictions by considering both forward inference (how well predictions align with ground truth) and reverse deduction (how completely ground truth is covered by predictions), using context-aware Gaussian kernels.
```
@article{ContextMeasure,
  title={Context-measure: Contextualizing Metric for Camouflage},
  author={Wang, Chen-Yang and Ji, Gepeng and Shao, Song and Cheng, Ming-Ming and Fan, Deng-Ping},
  journal={arXiv preprint arXiv:2512.07076},
  year={2025}
}
```
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- compute(pred: ndarray, gt: ndarray, cd: ndarray) float[source]
Compute the context measure between prediction and ground truth.
- Parameters:
pred (np.ndarray) – Prediction map (values between 0 and 1).
gt (np.ndarray) – Ground truth map (boolean or 0/1 values).
cd (np.ndarray) – Camouflage degree map (values between 0 and 1).
- Returns:
Context measure value.
- Return type:
float
- class py_sod_metrics.context_measure.CamouflageContextMeasure(beta2: float = 1.2, alpha: float = 6.0, gamma: int = 8, lambda_spatial: float = 20)[source]
Bases: ContextMeasure
Camouflage Context-measure for evaluating camouflaged object detection quality.
This metric extends the base ContextMeasure by incorporating camouflage degree, which measures how well the foreground blends with its surrounding background. It uses patch-based nearest neighbor matching in Lab color space with spatial constraints to estimate camouflage difficulty.
```
@article{ContextMeasure,
  title={Context-measure: Contextualizing Metric for Camouflage},
  author={Wang, Chen-Yang and Ji, Gepeng and Shao, Song and Cheng, Ming-Ming and Fan, Deng-Ping},
  journal={arXiv preprint arXiv:2512.07076},
  year={2025}
}
```
- __init__(beta2: float = 1.2, alpha: float = 6.0, gamma: int = 8, lambda_spatial: float = 20)[source]
Initialize the Camouflage Context Measure evaluator.
- Parameters:
beta2 (float) – Balancing factor for forward and reverse. Defaults to 1.2 for camouflage.
alpha (float) – Gaussian kernel scaling factor. Defaults to 6.0.
gamma (int) – Exponential scaling factor for camouflage degree. Defaults to 8.
lambda_spatial (float) – Weight for spatial distance in ANN search. Defaults to 20.
- step(pred: ndarray, gt: ndarray, img: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pred, gt, and img triple.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
img (np.ndarray) – Original RGB image (required for camouflage degree calculation).
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
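A sketch for the camouflage variant, which additionally needs the original image. File names are placeholders; note the BGR-to-RGB conversion, since cv2 loads BGR but the docstring expects an RGB image:

```python
import cv2
from py_sod_metrics.context_measure import CamouflageContextMeasure

ccm = CamouflageContextMeasure(beta2=1.2, alpha=6.0, gamma=8, lambda_spatial=20)

pred = cv2.imread("pred.png", cv2.IMREAD_GRAYSCALE)
gt = cv2.imread("gt.png", cv2.IMREAD_GRAYSCALE)
# Convert BGR (cv2 default) to the RGB layout the docstring asks for.
img = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)

ccm.step(pred=pred, gt=gt, img=img)
```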
Multi-Scale IoU Module
- class py_sod_metrics.multiscale_iou.MSIoU(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, num_levels=10)[source]
Bases: object
Multi-Scale Intersection over Union (MSIoU) metric.
This implements the MSIoU metric which evaluates segmentation quality at multiple scales by comparing edge maps. It addresses the limitation of traditional IoU which struggles with fine structures in segmentation results.
```
@inproceedings{MSIoU,
  title = {Multiscale IOU: A Metric for Evaluation of Salient Object Detection with Fine Structures},
  author = {Ahmadzadeh, Azim and Kempton, Dustin J. and Chen, Yang and Angryk, Rafal A.},
  booktitle = ICIP,
  year = {2021}
}
```
- __init__(with_dynamic: bool, with_adaptive: bool, *, with_binary: bool = False, num_levels=10)[source]
Initialize the MSIoU evaluator.
- get_edge(mask: ndarray)[source]
Edge detection based on the scipy.ndimage.sobel function.
- Parameters:
mask – a binary mask of an object whose edges are of interest.
- Returns:
a binary mask of 1’s as edges and 0’s as background.
- shrink_by_grid(image: ndarray, cell_size: int) ndarray[source]
Shrink the image by summing values within grid cells.
Performs box-counting after applying zero padding if the image dimensions are not perfectly divisible by the cell size.
- Parameters:
image – The input binary image (edges).
cell_size – The size of the grid cells.
- Returns:
A shrunk binary image where each pixel represents a grid cell.
- multi_scale_iou(pred_edge: ndarray, gt_edge: ndarray) list[source]
Calculate Multi-Scale IoU.
- Parameters:
pred_edge (np.ndarray) – edge map of pred
gt_edge (np.ndarray) – edge map of gt
- Returns:
ratios
- Return type:
list
- binarizing(pred_bin: ndarray, gt_edge: ndarray) list[source]
Calculate Multi-Scale IoU based on dynamic thresholding.
- Parameters:
pred_bin (np.ndarray) – binarized pred
gt_edge (np.ndarray) – gt binarized by 128
- Returns:
areas under the curve
- Return type:
np.ndarray
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Calculate the Multi-Scale IoU for a single prediction-ground truth pair.
This method first extracts edges from both prediction and ground truth, then computes IoU ratios at multiple scales defined by self.cell_sizes. Finally, it calculates the area under the curve of these ratios.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- Returns:
The MSIoU score for the given pair (float between 0 and 1).
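Per the step() documentation above, MSIoU returns the score for each pair directly; a sketch (the constructor flag combination shown is illustrative, not prescribed by this section):

```python
import numpy as np
from py_sod_metrics.multiscale_iou import MSIoU

msiou = MSIoU(with_dynamic=False, with_adaptive=False, with_binary=True, num_levels=10)

rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)
gt = (rng.random((128, 128)) > 0.5).astype(np.uint8) * 255

score = msiou.step(pred=pred, gt=gt)  # float in [0, 1] per the docs above
print("MSIoU:", score)
```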
Size Invariance Module
- py_sod_metrics.size_invariance.parse_connected_components(mask: ndarray, area_threshold: float = 50) tuple[source]
Find the connected components in a binary mask.
If there are no connected components, an empty list is returned.
If all connected components are smaller than area_threshold, the largest one is returned.
- py_sod_metrics.size_invariance.encode_bboxwise_tgts_bitwise(max_valid_tgt_idx: int, valid_labeled_mask: ndarray) ndarray[source]
Encode each target bbox region with a bitwise mask.
- Parameters:
max_valid_tgt_idx (int) – The maximum index of the valid targets.
valid_labeled_mask (np.ndarray) – The mask of the valid targets. 0 is background.
- Returns:
The size weight for the bbox of each target.
- Return type:
np.ndarray
- py_sod_metrics.size_invariance.get_kth_bit(n: ndarray, k: int) ndarray[source]
Get the value (0 or 1) in the k-th bit of each element in the array.
- Parameters:
n (np.ndarray) – The original data array.
k (int) – The index of the bit to extract.
- Returns:
The extracted array. An element is 1 where the k-th bit of the corresponding input element is set, and 0 otherwise.
- Return type:
np.ndarray
- class py_sod_metrics.size_invariance.SizeInvarianceFmeasureV2(metric_handlers: dict | None = None)[source]
Bases: FmeasureV2
Size-invariant version of FmeasureV2.
This provides size-invariant versions of standard SOD metrics that address the imbalance problem in multi-object salient object detection. Traditional metrics can be biased toward larger objects, while size-invariant metrics ensure fair evaluation across objects of different sizes.
```
@inproceedings{SizeInvarianceVariants,
  title = {Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection},
  author = {Feiran Li and Qianqian Xu and Shilong Bao and Zhiyong Yang and Runmin Cong and Xiaochun Cao and Qingming Huang},
  booktitle = ICML,
  year = {2024}
}
```
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate statistics for all registered metric handlers for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
- class py_sod_metrics.size_invariance.SizeInvarianceMAE[source]
Bases: MAE
Size-invariant version of MAE.
```
@inproceedings{SizeInvarianceVariants,
  title = {Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection},
  author = {Feiran Li and Qianqian Xu and Shilong Bao and Zhiyong Yang and Runmin Cong and Xiaochun Cao and Qingming Huang},
  booktitle = ICML,
  year = {2024}
}
```
- step(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Accumulate the metric statistics for a pair of pred and gt.
- Parameters:
pred (np.ndarray) – Prediction, gray scale image.
gt (np.ndarray) – Ground truth, gray scale image.
normalize (bool, optional) – Whether to normalize the input data. Defaults to True.
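The size-invariant variants are drop-in replacements for their base classes (SizeInvarianceFmeasureV2 is configured with handlers exactly like FmeasureV2 above); a sketch for the MAE variant, with the "mae" results key assumed from the base MAE class:

```python
import numpy as np
from py_sod_metrics.size_invariance import SizeInvarianceMAE

si_mae = SizeInvarianceMAE()

rng = np.random.default_rng(0)
pred = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
gt = (rng.random((64, 64)) > 0.5).astype(np.uint8) * 255

si_mae.step(pred=pred, gt=gt)
print("size-invariant MAE:", si_mae.get_results()["mae"])  # key assumed
```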
Utility Functions
- py_sod_metrics.utils.validate_and_normalize_input(pred: ndarray, gt: ndarray, normalize: bool = True)[source]
Validate and optionally normalize prediction and ground truth inputs.
This function ensures that prediction and ground truth arrays have compatible shapes and appropriate data types. When normalization is enabled, it converts inputs to the standard format required by the predefined metrics (pred in [0, 1] as float, gt as boolean).
- Parameters:
pred (np.ndarray) – Prediction array. If normalize=True, should be uint8 grayscale image (0-255). If normalize=False, should be float32/float64 in range [0, 1].
gt (np.ndarray) – Ground truth array. If normalize=True, should be uint8 grayscale image (0-255). If normalize=False, should be boolean array.
normalize (bool, optional) – Whether to normalize the input data using prepare_data(). Defaults to True.
- Returns:
- A tuple containing:
pred (np.ndarray): Normalized prediction as float64 in range [0, 1].
gt (np.ndarray): Normalized ground truth as boolean array.
- Return type:
tuple
- Raises:
ValueError – If prediction and ground truth shapes don’t match, or if prediction values are outside [0, 1] range when normalize=False.
TypeError – If data types are invalid when normalize=False (pred must be float32/float64, gt must be boolean).
- py_sod_metrics.utils.prepare_data(pred: ndarray, gt: ndarray) tuple[source]
Convert and normalize prediction and ground truth data.
For predictions, mimics MATLAB’s mapminmax(im2double(…)).
For ground truth, applies binary thresholding at 128.
- Parameters:
pred (np.ndarray) – Prediction grayscale image, uint8 type with values in [0, 255].
gt (np.ndarray) – Ground truth grayscale image, uint8 type with values in [0, 255].
- Returns:
- A tuple containing:
pred (np.ndarray): Normalized prediction as float64 in range [0, 1].
gt (np.ndarray): Binary ground truth as boolean array.
- Return type:
tuple
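A sketch of the two input-handling paths, using synthetic data; the assertions reflect the documented return types:

```python
import numpy as np
from py_sod_metrics.utils import prepare_data, validate_and_normalize_input

rng = np.random.default_rng(0)
pred_u8 = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
gt_u8 = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

# prepare_data: min-max normalize pred to [0, 1], threshold gt at 128.
pred_f, gt_b = prepare_data(pred_u8, gt_u8)
assert pred_f.dtype == np.float64 and gt_b.dtype == np.bool_

# With normalize=False, inputs must already be normalized; the function
# then only validates shapes, dtypes, and value ranges.
pred2, gt2 = validate_and_normalize_input(
    pred_f.astype(np.float32), gt_b, normalize=False
)
```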