180 - Revolutionizing Neonatal Monitoring: AI-Generated Arterial Waveforms from Non-Invasive Cardiorespiratory Monitoring in NICU patients
Saturday, April 26, 2025
2:30pm – 4:45pm HST
Publication Number: 180.6339
Ben Shank, EMx Biotech, Austin, TX, United States; Alvaro Moreira, The University of Texas Health Science Center at San Antonio Joe R. and Teresa Lozano Long School of Medicine, San Antonio, TX, United States; Brynne Sullivan, University of Virginia School of Medicine, Charlottesville, VA, United States
Background: Vital sign analytics in neonates can facilitate early detection of clinical deterioration. Among routinely collected parameters, continuous blood pressure (BP) monitoring is typically limited to infants with arterial lines. Advances in generative AI offer the potential to generate continuous neonatal BP waveforms non-invasively.
Objective: To evaluate the accuracy of continuous BP waveforms generated through deep learning across neonates, with respect to gestational age, sex, and race/ethnicity.
Design/Methods: We analyzed infants admitted to a level IV NICU in 2023 who had continuous monitoring data available, including intra-arterial blood pressure (IBP), photoplethysmogram (PPG), and electrocardiogram (ECG) waveforms. Continuous BP monitoring was performed using an umbilical catheter or peripheral arterial line at the medical team's discretion. PPG and ECG monitoring were performed for all patients using Masimo and GE technology, respectively, from NICU admission to discharge or death. We analyzed 120 minutes of 3-channel IBP, PPG, and ECG data from each patient. The dataset was randomly partitioned by patient into training, validation, and test sets in a 70:10:20 ratio. We excluded segments with missing data or zeros and did not otherwise filter the waveform data. All data were resampled to 128 Hz and normalized to a 0-1 range using global dataset statistics. We then trained a commercial waveform intelligence foundation model for 253 epochs with a batch size of 64, where each batch consisted of 256 samples.
Results: The model was trained on 6,360 minutes of unfiltered waveform data from 53 infants with a median gestational age of 37.6 weeks (IQR 29.2 - 39.3) and a median birthweight of 2700 grams (IQR 1355 - 3329) (Table 1a). Training was validated after each epoch on a set of seven patients; the validation root mean squared error (RMSE) was 0.0302 at convergence, with a minimum validation RMSE of 0.0299. Model evaluation was performed on a test set of 15 patients. A substantial proportion of the overall cohort was premature (n=32 of 75, 42.7%). Model-generated arterial waveforms were accurate relative to measured IBP waveforms (Figure 2), both overall and within subgroups of patients stratified by gestational age, sex, and race/ethnicity (Table 1c).
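The patient-level partition described in Design/Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_by_patient`, the seed, and the rounding rule are assumptions. The rounding rule was chosen so that a cohort of 75 patients yields the 53/7/15 split reported above. Splitting by patient rather than by segment keeps every waveform from a given infant in a single set.

```python
import random


def split_by_patient(patient_ids, ratios=(0.7, 0.1, 0.2), seed=0):
    """Randomly partition unique patient IDs into train/validation/test lists.

    Assumption: the validation count is rounded down, the test count is
    rounded to nearest, and the training set receives the remainder.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    # De-duplicate and sort so the shuffle is reproducible for a given seed.
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)

    n = len(ids)
    n_val = int(n * ratios[1])
    n_test = round(n * ratios[2])
    n_train = n - n_val - n_test

    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical patient identifiers; real IDs would come from the dataset.
    cohort = [f"patient_{i:03d}" for i in range(75)]
    train, val, test = split_by_patient(cohort)
    print(len(train), len(val), len(test))
```

All segments belonging to a patient would then be assigned to whichever list contains that patient's ID.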
Conclusion(s): Continuous BP monitoring generated through deep learning performed reliably across neonates, irrespective of gestational age, sex, or race/ethnicity. While these findings highlight the model's generalizability and promise as a non-invasive alternative, further validation in larger datasets is necessary to confirm its clinical utility.
Table 1: Patient Characteristics and Model Performance
Figure 1: Ensemble Interarterial Complex Bland-Altman. Bland-Altman plot comparing true to generated arterial complexes. This analysis excludes interarterial complexes that could not be matched due to ground truth waveform signal noise, nonperfusing arterial beats, or model predictions that could not be aligned to a true peak.
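For readers unfamiliar with the Bland-Altman method, the summary statistics behind such a plot can be sketched as below. This is a generic illustration under stated assumptions, not the authors' analysis: the abstract does not specify which per-complex quantity was compared, so the inputs here are hypothetical matched values (for example, a peak pressure per complex in mmHg), and the 1.96 multiplier is the conventional choice for 95% limits of agreement.

```python
from statistics import mean, stdev


def bland_altman(true_vals, pred_vals):
    """Return bias and 95% limits of agreement for matched measurements.

    Each pair (true_vals[i], pred_vals[i]) is assumed to describe the same
    complex after unmatched complexes have already been excluded.
    """
    if len(true_vals) != len(pred_vals):
        raise ValueError("inputs must be matched pairs of equal length")
    if len(true_vals) < 2:
        raise ValueError("at least two pairs are required")

    # Difference (generated minus true) and pairwise mean for each complex.
    diffs = [p - t for t, p in zip(true_vals, pred_vals)]
    means = [(p + t) / 2 for t, p in zip(true_vals, pred_vals)]

    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "means": means,          # x-axis of the plot
        "diffs": diffs,          # y-axis of the plot
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    true_peaks = [60.0, 62.0, 64.0, 66.0]
    generated_peaks = [61.0, 61.0, 65.0, 67.0]
    result = bland_altman(true_peaks, generated_peaks)
    print(result["bias"], result["loa_lower"], result["loa_upper"])
```

A bias near zero with narrow limits of agreement indicates close agreement between generated and measured values; complexes excluded from matching do not contribute to these statistics.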
Figure 2: Random output traces at various levels of detail for different patients. The upper-right panel shows a relatively clean set of input signals and good concordance between true and predicted waveforms. The upper-left panel depicts error in the presence of ECG and PPG signal artifacts. Note the extremely erroneous prediction in the lower-left panel, an output pattern that was observed intermittently and was caused by overfitting to noisy training data and insufficient regularization. The lower-right panel depicts an example that should typically be excluded from a machine learning training workflow, as the ground truth signal is corrupted.
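The segment exclusion and 0-1 normalization described in Design/Methods might look like the sketch below. Several details are assumptions, since the abstract does not give them: each segment is represented as a dictionary of per-channel sample lists, a segment is dropped if any sample in any channel is missing, NaN, or exactly zero, and "global dataset statistics" is interpreted as a per-channel minimum and maximum. All function names are hypothetical. A filter this simple would not catch a corrupted but nonzero ground truth signal such as the one in the lower-right panel.

```python
import math

CHANNELS = ("ibp", "ppg", "ecg")


def is_usable(segment):
    """Return False if any channel has a missing, NaN, or zero sample."""
    for channel in CHANNELS:
        for value in segment[channel]:
            if value is None or math.isnan(value) or value == 0:
                return False
    return True


def filter_segments(segments):
    """Keep only segments that pass the missing-data and zero checks."""
    return [seg for seg in segments if is_usable(seg)]


def global_min_max(segments, channel):
    """Compute one minimum and maximum for a channel across all segments."""
    values = [v for seg in segments for v in seg[channel]]
    return min(values), max(values)


def normalize(values, lo, hi):
    """Min-max scale values to the 0-1 range using the supplied statistics."""
    if hi == lo:
        raise ValueError("cannot normalize a constant signal")
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Hypothetical three-sample segments for illustration only.
    segments = [
        {"ibp": [40.0, 60.0, 80.0], "ppg": [0.2, 0.5, 0.9], "ecg": [-0.1, 0.3, 1.2]},
        {"ibp": [45.0, 0.0, 70.0], "ppg": [0.3, 0.4, 0.8], "ecg": [0.1, 0.2, 0.9]},
    ]
    kept = filter_segments(segments)
    lo, hi = global_min_max(kept, "ibp")
    print(normalize(kept[0]["ibp"], lo, hi))
```

Because predictions are produced on this normalized scale, converting an error back to mmHg would require multiplying by the IBP channel's global range, which the abstract does not report.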