Abstract

In the last decade, research on artificial intelligence has grown rapidly, driven by deep learning models, especially in the field of medical image segmentation. Various studies have demonstrated that these models have powerful prediction capabilities and achieve results comparable to those of clinicians. However, recent studies have revealed that evaluation in image segmentation studies lacks reliable model performance assessment and exhibits statistical bias caused by incorrect metric implementation or usage. This work therefore provides an overview and interpretation guide for the following metrics for medical image segmentation evaluation in binary as well as multi-class problems: Dice similarity coefficient, Jaccard, Sensitivity, Specificity, Rand index, ROC curves, Cohen’s Kappa, and Hausdorff distance. Furthermore, common issues such as class imbalance and statistical as well as interpretation biases in evaluation are discussed. In summary, we propose a guideline for standardized medical image segmentation evaluation to improve evaluation quality, reproducibility, and comparability in the research field.
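As an illustration of why metric choice matters under class imbalance, the following is a minimal Python sketch, not taken from the article, that computes four of the listed metrics from the standard confusion-matrix definitions. The function names `confusion_counts` and `segmentation_metrics` are hypothetical, and masks are assumed to be flat sequences of 0 (background) and 1 (region of interest).

```python
def confusion_counts(pred, truth):
    """Count TP, FP, TN, FN over two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    return tp, fp, tn, fn


def segmentation_metrics(pred, truth):
    """Return Dice, Jaccard, sensitivity and specificity for binary masks."""
    tp, fp, tn, fn = confusion_counts(pred, truth)

    def ratio(numerator, denominator):
        # An undefined metric (zero denominator) is reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Assumed toy example: a 100-pixel image with a 2-pixel region of interest,
# and a prediction that labels every pixel as background.
truth = [1, 1] + [0] * 98
pred = [0] * 100
metrics = segmentation_metrics(pred, truth)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the prediction misses the region of interest entirely, so Dice, Jaccard, and sensitivity are 0, while specificity is 1 because every background pixel is correct. A metric dominated by true negatives can therefore look excellent for a useless segmentation.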

Details

Title

Towards a guideline for evaluation metrics in medical image segmentation

Author

Müller, Dominik; Soto-Rey, Iñaki; Kramer, Frank

Pages

1-8

Section

Commentary

Publication year

2022

Publication date

2022

Publisher

BioMed Central

e-ISSN

17560500

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1186/s13104-022-06096-y

ProQuest document ID

2691514334

© 2022. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
