Optimizing automated sleep stage scoring of 5-second mini-epochs: a transfer learning study
Follin, L. F.; Christensen, J. A. E.; Vevelstad, J.; Juvodden, H. T.; Viste, R.; Hansen, B. H.; Perslev, M.; Kaufmann, T.; Zahid, A. N.; Knudsen-Heier, S.
Abstract
Study objective: Conventional sleep staging relies on 30-second epochs, potentially concealing transient sleep stage intrusions and reducing precision. Building on our previous study of mini-epochs, we investigated whether U-Sleep, an existing deep learning-based automatic sleep staging model with high performance on 30-second epochs, could be optimized to reach a similar performance level when scoring 5-second mini-epochs, thereby enabling more detailed sleep characterization.
Methods: We created a dataset of 48,000 human-scored 5-second mini-epochs from 100 PSGs. We first compared human-scored mini-epochs with human-scored epochs. U-Sleep was then optimized using transfer learning and evaluated on a test set. Model performance was assessed using F1-scores, confusion matrices, stage distributions, and transition rates, comparing the scorings of the original U-Sleep (before transfer learning) and the optimized U-Sleep (after transfer learning) against human-scored mini-epochs.
Results: Compared to human-scored epochs, human-scored mini-epochs captured significantly more transitions (1.70/minute versus 0.21/minute, p<0.001), significantly more wake (8.4% versus 5.4%), N1 (7.2% versus 5.4%), and N2 (51.8% versus 40.9%), and significantly less N3 (15.4% versus 25.2%) and REM sleep (16.7% versus 23.0%) (all p<0.001). Optimizing U-Sleep significantly improved its performance from F1=0.74 to F1=0.81 (p<0.05) and increased transition rates in the test set (original U-Sleep: 1.06/minute, optimized U-Sleep: 1.34/minute, human-scored mini-epochs: 1.70/minute). Stage distributions did not differ between the optimized U-Sleep's scorings and human-scored mini-epochs.
Conclusions: After optimization, U-Sleep performance on mini-epochs matched the high performance levels previously reported for both human and automated 30-second epoch scoring. This demonstrates the feasibility of precise, automated, high-resolution sleep staging. Future work should include external validation and application to full-night recordings.
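The evaluation metrics named in the abstract (transition rate, stage distribution, F1-score) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the stage labels as strings, and the unweighted averaging of per-stage F1 over stages that occur are all assumptions made for illustration; the abstract does not state how the reported F1 was averaged.

```python
from collections import Counter

STAGES = ("W", "N1", "N2", "N3", "REM")
MINI_EPOCH_SECONDS = 5  # mini-epoch length stated in the abstract


def transitions_per_minute(labels, epoch_seconds=MINI_EPOCH_SECONDS):
    """Count stage changes between consecutive labels, per minute scored."""
    if len(labels) < 2:
        return 0.0
    changes = sum(a != b for a, b in zip(labels, labels[1:]))
    minutes = len(labels) * epoch_seconds / 60.0
    return changes / minutes


def stage_distribution(labels):
    """Percentage of scored (mini-)epochs assigned to each stage."""
    counts = Counter(labels)
    return {s: 100.0 * counts[s] / len(labels) for s in STAGES}


def f1_per_stage(reference, predicted):
    """One-vs-rest F1 for each stage that occurs in either scoring."""
    scores = {}
    for s in STAGES:
        pairs = list(zip(reference, predicted))
        tp = sum(r == s and p == s for r, p in pairs)
        fp = sum(r != s and p == s for r, p in pairs)
        fn = sum(r == s and p != s for r, p in pairs)
        denom = 2 * tp + fp + fn
        if denom:  # skip stages absent from both scorings
            scores[s] = 2 * tp / denom
    return scores


def mean_f1(reference, predicted):
    """Unweighted mean of per-stage F1 (an assumed averaging scheme)."""
    scores = f1_per_stage(reference, predicted)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy one-minute example: twelve 5-second mini-epochs.
    human = ["W"] * 6 + ["N1"] * 6
    model = ["W"] * 5 + ["N1"] * 7
    print(transitions_per_minute(human))
    print(stage_distribution(human))
    print(mean_f1(human, model))
```

Passing `epoch_seconds=30` to `transitions_per_minute` gives the corresponding rate for conventional 30-second epochs, so both resolutions are expressed in the same unit (transitions per minute).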