
Artificial intelligence to identify fractures on pediatric and young adult upper extremity radiographs

Bibliographic Details
Published in:Pediatric radiology 2023-11, Vol.53 (12), p.2386-2397
Main Authors: Zech, John R., Jaramillo, Diego, Altosaar, Jaan, Popkin, Charles A., Wong, Tony T.
Format: Article
Language:English
Description
Summary:
Background: Pediatric fractures are challenging to identify given the different response of the pediatric skeleton to injury compared to adults, and most artificial intelligence (AI) fracture detection work has focused on adults.

Objective: Develop and transparently share an AI model capable of detecting a range of pediatric upper extremity fractures.

Materials and methods: In total, 58,846 upper extremity radiographs (finger/hand, wrist/forearm, elbow, humerus, shoulder/clavicle) from 14,873 pediatric and young adult patients were divided into train (n = 12,232 patients), tune (n = 1,307), internal test (n = 819), and external test (n = 515) splits. Fracture was determined by manual inspection of all test radiographs and the subset of train/tune radiographs whose reports were classified fracture-positive by a rule-based natural language processing (NLP) algorithm. We trained an object detection model (Faster Region-based Convolutional Neural Network [R-CNN]; “strongly-supervised”) and an image classification model (EfficientNetV2-Small; “weakly-supervised”) to detect fractures using train/tune data and evaluate on test data. AI fracture detection accuracy was compared with accuracy of on-call residents on cases they preliminarily interpreted overnight.

Results: A strongly-supervised fracture detection AI model achieved overall test area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.95–0.97), accuracy 89.7% (95% CI 88.0–91.3%), sensitivity 90.8% (95% CI 88.5–93.1%), and specificity 88.7% (95% CI 86.4–91.0%), and outperformed a weakly-supervised model (AUC 0.93, 95% CI 0.92–0.94, P
ISSN:1432-1998
0301-0449
DOI:10.1007/s00247-023-05754-y
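
Note on the reported metrics: the summary quotes accuracy, sensitivity, and specificity for a binary fracture/no-fracture decision. The sketch below shows how those three quantities follow from confusion-matrix counts, and checks that the four patient splits quoted in the summary add up to the stated total. The confusion counts and all names (`ConfusionCounts`, `example`) are hypothetical and chosen only for illustration; they are not the study's data, and this is not the authors' evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary fracture / no-fracture decision (hypothetical)."""
    tp: int  # fracture present, model says fracture
    fp: int  # no fracture, model says fracture
    tn: int  # no fracture, model says no fracture
    fn: int  # fracture present, model says no fracture


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of true fractures that the model flags.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of fracture-free studies that the model clears.
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all studies classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Patient-level split sizes as quoted in the summary.
splits = {
    "train": 12_232,
    "tune": 1_307,
    "internal_test": 819,
    "external_test": 515,
}
total_patients = sum(splits.values())

# Illustrative counts only; NOT taken from the article.
example = ConfusionCounts(tp=90, fp=10, tn=80, fn=20)

if __name__ == "__main__":
    print(f"total patients across splits: {total_patients}")
    print(f"sensitivity: {sensitivity(example):.3f}")
    print(f"specificity: {specificity(example):.3f}")
    print(f"accuracy:    {accuracy(example):.3f}")
```

The confidence intervals quoted in the summary would need an interval method (for example bootstrapping) that the record does not specify, so the sketch stops at point estimates.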