Lung nodule malignancy classification using only radiologist-quantified image features as inputs to statistical learning algorithms

Title: Lung nodule malignancy classification using only radiologist-quantified image features as inputs to statistical learning algorithms: probing the Lung Image Database Consortium dataset with two statistical learning methods.
Name(s): Hancock, Matthew C, author
Magnan, Jerry F, author
Type of Resource: text
Genre: Journal Article
Text
Date Issued: 2016-10-01
Physical Form: computer
online resource
Extent: 1 online resource
Language(s): English
Abstract/Description: In the assessment of nodules in CT scans of the lungs, a number of image-derived features are diagnostically relevant. Currently, many of these features are defined only qualitatively, so they are difficult to quantify from first principles. Nevertheless, these features (through their qualitative definitions and interpretations thereof) are often quantified via a variety of mathematical methods for the purpose of computer-aided diagnosis (CAD). To determine the potential usefulness of quantified diagnostic image features as inputs to a CAD system, we investigate the predictive capability of statistical learning methods for classifying nodule malignancy. We utilize the Lung Image Database Consortium dataset and only employ the radiologist-assigned diagnostic feature values for the lung nodules therein, as well as our derived estimates of the diameter and volume of the nodules from the radiologists' annotations. We calculate theoretical upper bounds on the classification accuracy achievable by an ideal classifier that only uses the radiologist-assigned feature values, and we obtain an accuracy of 85.74 [Formula: see text], which is, on average, 4.43% below the theoretical maximum of 90.17%. The corresponding area-under-the-curve (AUC) score is 0.932 ([Formula: see text]). When diameter and volume features are included, the AUC increases to 0.949 ([Formula: see text]) and the accuracy to 88.08 [Formula: see text]. Our results are comparable to those in the literature that use algorithmically derived image-based features, which supports our hypothesis that lung nodules can be classified as malignant or benign using only quantified, diagnostic image features, and indicates the competitiveness of this approach.
We also analyze how the classification accuracy depends on specific features and feature subsets, and we rank the features according to their predictive power, statistically demonstrating the top four to be spiculation, lobulation, subtlety, and calcification.
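Note on the theoretical accuracy bound: this record does not describe how the authors computed it. One common way to obtain such a bound for discrete-valued features, offered here only as a minimal sketch under that assumption, is to group samples by identical feature vectors and let an ideal classifier predict the majority label within each group. Samples that share a feature vector but carry different labels can never all be classified correctly. The function name and the toy data below are hypothetical and are not taken from the Lung Image Database Consortium dataset.

```python
from collections import Counter, defaultdict


def max_achievable_accuracy(features, labels):
    """Upper bound on in-sample accuracy for any classifier that sees only `features`.

    Assumption: features are discrete, so identical feature vectors can be grouped.
    """
    label_counts = defaultdict(Counter)
    for x, y in zip(features, labels):
        label_counts[tuple(x)][y] += 1
    # An ideal classifier predicts the majority label for each distinct feature vector.
    correct = sum(max(c.values()) for c in label_counts.values())
    return correct / len(labels)


# Hypothetical toy data: two ordinal ratings per nodule, label 1 = malignant.
features = [(5, 4), (5, 4), (5, 4), (1, 1), (1, 1), (3, 2)]
labels = [1, 1, 0, 0, 0, 1]

bound = max_achievable_accuracy(features, labels)
print(f"{bound:.4f}")  # 0.8333
```

In the toy data, the three samples rated (5, 4) disagree on the label, so at most two of them can be correct, which gives a bound of 5/6.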
Identifier: FSU_pmch_27990453 (IID), 10.1117/1.JMI.3.4.044504 (DOI), PMC5146644 (PMCID), 27990453 (RID), 27990453 (EID), 16150R (PII)
Keywords: Lung Image Database Consortium dataset, Computer-aided diagnosis, Logistic regression, Lung nodule classification, Machine learning, Random forests
Publication Note: This NIH-funded author manuscript originally appeared in PubMed Central at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5146644.
Persistent Link to This Record: http://purl.flvc.org/fsu/fd/FSU_pmch_27990453
Owner Institution: FSU
Is Part Of: Journal of medical imaging (Bellingham, Wash.).
2329-4302
Issue: iss. 4, vol. 3

Citation:
Hancock, M. C., & Magnan, J. F. (2016). Lung nodule malignancy classification using only radiologist-quantified image features as inputs to statistical learning algorithms: probing the Lung Image Database Consortium dataset with two statistical learning methods. Journal of Medical Imaging (Bellingham, Wash.), 3(4). Retrieved from http://purl.flvc.org/fsu/fd/FSU_pmch_27990453