Ballard and Dubowitz Neonatal Assessments for Gestational Age Determination
Summary / Explanation
Introduction
Worldwide, prematurity is the leading cause of childhood mortality.[1] The prevalence of preterm birth varies across nations, with rates ranging from 4% to 16%.[2] Precise determination of gestational age is crucial for addressing and mitigating morbidity and mortality associated with preterm birth, as emphasized by the Every Newborn Action Plan.[3] Three methods are currently used to estimate gestational age: antenatal ultrasound, dating from the last menstrual period (LMP), and clinical assessment of the newborn.
Antenatal ultrasound is a reliable method for determining gestational age in the first trimester. However, it has significant limitations, including reduced accuracy in later pregnancy, high cost, limited accessibility in resource-limited settings, and the need for trained personnel.[4] In contrast, dating from the LMP is cost-effective and simple, yet estimation issues often arise from uncertainties such as irregular menstruation, which can result from nutritional deficiencies or maternal illnesses prevalent in low- and middle-income regions.[5] In developed countries, healthcare providers commonly use first-trimester ultrasonographic dating and dating from the LMP for gestational age estimation.
As many premature infants are born to mothers who have unreliable menstrual histories or lack prenatal care, clinical assessment of the newborn's maturity often serves as the sole available measure of gestational age. Healthcare professionals have historically relied on clinical evaluation of newborn maturity to estimate gestational age after birth. Numerous neonatal scoring systems exist, each combining several physical and neuromuscular criteria: the Amiel-Tison, Feresu, Dubowitz, simplified Dubowitz, Finnström, Ballard, New Ballard, Farr, Tunçer, Eregie, Capurro, Kollée, Klimek, Narayanan, Robinson, Parkin, and Bhagwat scores. These systems differ in how accurately they estimate gestational age.[6] From the late 1960s to recent years, they have been evaluated in diverse settings globally, including neonatal intensive care units (NICUs), maternity units, and university hospitals, providing insight into gestational age assessment practices across different geographical and institutional contexts. The Dubowitz and Ballard systems are the most widely used in neonatal practice and are the focus of this review.
Dubowitz Scoring System
In 1970, pediatrician Lilly Dubowitz and neurologist Victor Dubowitz collaborated to develop the Dubowitz scoring system to estimate gestational age in newborns. This scoring system derives from a study conducted in the NICU at Jessop Hospital for Women, Sheffield, England, with a sample size of 167 neonates.[7] The assessment involves 21 criteria encompassing 11 physical and 10 neuromuscular characteristics. Physical criteria include skin color, texture, and opacity; ear form and firmness; plantar creases; edema; lanugo; breast size; and nipple and genital development. Physical criteria are scored on a scale from 0 to 4. Neuromuscular features include posture, ventral suspension, scarf sign, head lag, square window, popliteal angle, ankle dorsiflexion, arm recoil, leg recoil, and heel-to-ear. Neuromuscular criteria are scored on a scale from 0 to 5. The physical and neuromuscular scores are then summed, and the composite score is plotted on a graph to estimate gestational age between 26 and 44 weeks. Dubowitz et al also proposed a regression formula that converts the total composite score into gestational age: y = 0.2642x + 24.595, where x is the total score and y is the gestational age in weeks. The system uses the LMP as the reference standard, with a reported accuracy of ±2.0 weeks within a 95% confidence interval.[7]
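The regression formula can be sketched in a few lines of Python. This is a minimal illustration rather than a clinical tool: the function name `dubowitz_gestational_age` and the `margin_weeks` parameter are hypothetical, and the ±2.0-week band simply restates the reported accuracy around the point estimate.

```python
def dubowitz_gestational_age(total_score, margin_weeks=2.0):
    """Estimate gestational age in weeks from a Dubowitz total score.

    Uses the regression quoted in the text: y = 0.2642x + 24.595.
    margin_weeks is an illustrative parameter that restates the reported
    +/-2.0-week accuracy; it is not part of the original formula.
    Returns a (low, estimate, high) tuple.
    """
    if total_score < 0:
        raise ValueError("total score cannot be negative")
    estimate = 0.2642 * total_score + 24.595
    return estimate - margin_weeks, estimate, estimate + margin_weeks


# Hypothetical total score of 40
low, estimate, high = dubowitz_gestational_age(40)
print(f"{estimate:.1f} weeks (range {low:.1f} to {high:.1f})")
```

For a total score of 40, the formula gives about 35.2 weeks, so the illustrative band runs from roughly 33.2 to 37.2 weeks.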
The Dubowitz scoring system has several limitations. Spinnato et al reported that the Dubowitz assessment may be inaccurate in estimating gestational age in neonates with low birth weights, particularly those born before 33 weeks of gestation.[8] Sanders et al observed that gestational age is often overestimated in premature infants, especially those with lower birth weights, when assessed using the Dubowitz method.[9] The assessment is also lengthy and complex, and the expertise it requires may not be available in every setting. Completing a Dubowitz assessment for a term newborn typically takes 10 to 15 minutes,[10] a time requirement that may impede its acceptance in busy healthcare facilities.
A meta-analysis by Lee et al encompassing 26 global studies revealed variability in the mean difference between gestational age estimates obtained using the Dubowitz scoring system and ultrasound-based dating.[6] Both the Dubowitz and Ballard methods were found to overestimate gestational age, especially among early preterm infants. Although the accuracy of Dubowitz scoring remained consistent across diverse settings, this tendency to overestimate calls for cautious interpretation in early preterm infants.
Simplified Dubowitz Scoring System
The simplified Dubowitz scoring system, introduced by Allan et al in 2009, comprises 6 criteria—breast size, skin texture, ear-bending, square window, popliteal angle, and scarf sign. The system was validated using prenatal ultrasound dating as the reference standard. The ear-bending criterion replaced ear firmness due to variations in ear cartilage among Aboriginal babies. The study, conducted in private hospitals in the Northern Territory, Australia, included a sample size of 98 neonates. Allan et al revealed that the mean difference between gestational age estimations from the simplified Dubowitz scoring system and ultrasound-based dating ranged from 2.8 weeks below to 1.9 weeks above the ultrasound estimates, indicating its effectiveness in estimating gestational age.[11] However, the small sample size of this study and the relatively narrow range of gestational ages studied may limit the generalizability of these findings.
Ballard Scoring System
The Ballard scoring system, introduced by Ballard et al in 1979, is based on a study conducted in the NICU of Cincinnati General Hospital with a sample size of 252 neonates.[12] The study included newborns with birth weights ranging from 760 to 5460 g and gestational ages ranging from 26 to 44 weeks. The Ballard method aimed to streamline gestational age assessment, addressing the lengthy process of the Dubowitz method. The Ballard score comprises fewer items, allowing for a shorter examination and applicability to all neonates. Ballard et al recommend that healthcare providers perform this scoring at 30 to 42 hours of age. This timing is critical as it allows neonates to stabilize physiologically after birth while minimizing the influence of immediate postnatal changes, such as edema or skin texture alterations, that could affect the scoring.
The Ballard scoring system encompasses 6 physical and 6 neuromuscular criteria adapted from the Dubowitz method. The physical criteria include lanugo, skin color, plantar creases, ear cartilage, breast development, and genitals. The neuromuscular criteria include posture, arm recoil, square window, scarf sign, popliteal angle, and heel to ear. Each criterion is evaluated on a scale from 0 to 4, except for skin and the popliteal angle, which are scored from 0 to 5. Total scores range from 0 to 50, with a score of 5 corresponding to 26 weeks and a score of 50 corresponding to an unvalidated age of 44 weeks. The reference standard used is the LMP and clinical data, with a reported correlation of 0.852. The correlation between the total scores obtained from the Ballard and Dubowitz scoring systems was 0.969 (P<.00001), indicating comparable reliability between the two methods.[12]
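The scoring arithmetic described above can be sketched as follows. The criterion keys and function names are hypothetical, and the conversion to weeks is an assumption: it interpolates linearly between the two anchor points given in the text (a score of 5 at 26 weeks and a score of 50 at 44 weeks) rather than reproducing the published conversion chart.

```python
# Maximum score per criterion, as described in the text: every criterion
# is scored 0 to 4 except skin and popliteal angle, which reach 5.
BALLARD_MAX = {
    "lanugo": 4, "skin": 5, "plantar_creases": 4,
    "ear_cartilage": 4, "breast": 4, "genitals": 4,
    "posture": 4, "arm_recoil": 4, "square_window": 4,
    "scarf_sign": 4, "popliteal_angle": 5, "heel_to_ear": 4,
}


def ballard_total(scores):
    """Validate the 12 criterion scores and return their sum."""
    if set(scores) != set(BALLARD_MAX):
        raise ValueError("expected exactly the 12 Ballard criteria")
    for name, value in scores.items():
        if not 0 <= value <= BALLARD_MAX[name]:
            raise ValueError(f"{name} score out of range: {value}")
    return sum(scores.values())


def ballard_weeks(total):
    """Assumption: interpolate linearly between the anchor points stated
    in the text (score 5 -> 26 weeks, score 50 -> 44 weeks)."""
    return 26 + (total - 5) * (44 - 26) / (50 - 5)


example = {name: 3 for name in BALLARD_MAX}  # hypothetical examination
total = ballard_total(example)               # 12 criteria x 3 = 36
print(total, round(ballard_weeks(total), 1))
```

The per-criterion maxima sum to 50, matching the stated upper limit of the total score, and the two anchor points imply a slope of 0.4 weeks per point.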
Expanded New Ballard Scoring System
Ballard and Khoury refined and modified the Ballard scoring system to enhance accuracy, particularly for extremely premature neonates and those younger than 26 weeks of gestational age. This modification was based on a comprehensive study conducted in NICUs and nurseries in Cincinnati, Ohio, with a sample size of 530 neonates.[13] The expanded New Ballard scoring system retains the original 12 criteria outlined in the Ballard scoring system with a few modifications.
The New Ballard scoring system considers the specific physical and neuromuscular characteristics unique to extremely premature neonates. To accommodate the nuances of extreme prematurity, 4 neuromuscular criteria (square window, popliteal angle, scarf sign, and heel-to-ear maneuver) were expanded to include a score of −1. In addition, under the physical criteria, values of −1 and −2 are assigned to foot lengths of 40 to 50 mm and <40 mm, respectively. These values are integrated into a criterion labeled plantar surface, which replaces the plantar creases criterion of the original scoring. Similarly, values of −1 and −2 are assigned for loosely and tightly fused eyelids, respectively, and these are included under the criterion labeled eye/ear in the modified scoring system. In the New Ballard scoring system, total composite scores range from −10 to 50. A score of −10 corresponds to an unvalidated age of 20 weeks of gestation, whereas a score of 50 corresponds to an unvalidated age of 44 weeks. The reference standard for this modified scoring system is prenatal ultrasonography, with the New Ballard scoring system showing a correlation of 0.96 with gestational age estimated using ultrasound.[13]
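The negative scores and the extended score range can be illustrated with a short sketch. All names below are hypothetical, only the negative values described in the text are encoded, and the conversion to weeks is an assumption that interpolates linearly between the stated end points (a score of −10 at 20 weeks and a score of 50 at 44 weeks).

```python
def foot_length_score(foot_length_mm):
    """Negative plantar-surface scores described in the text.

    Returns -2 for a foot shorter than 40 mm, -1 for 40 to 50 mm, and
    None for longer feet, where the score depends on plantar creases
    (not covered by this sketch).
    """
    if foot_length_mm < 40:
        return -2
    if foot_length_mm <= 50:
        return -1
    return None


# Eyelid fusion scores from the text (eye/ear criterion).
EYELID_SCORE = {"tightly_fused": -2, "loosely_fused": -1}


def new_ballard_weeks(total):
    """Assumption: linear interpolation between the stated end points
    (score -10 -> 20 weeks, score 50 -> 44 weeks)."""
    if not -10 <= total <= 50:
        raise ValueError("New Ballard totals range from -10 to 50")
    return 20 + (total + 10) * (44 - 20) / (50 + 10)


print(foot_length_score(38), foot_length_score(45))
print(new_ballard_weeks(-10), new_ballard_weeks(20))
```

Under this assumption, each point of the total score corresponds to 0.4 weeks, so a hypothetical total of 20 maps to 32 weeks.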
The New Ballard examination is ideally performed between 12 and 24 hours after birth for the highest accuracy, but reasonable accuracy is still achievable up to 1 week after birth. The New Ballard scoring system has become essential in routine neonatology practice for assessing postnatal gestational maturity. This system stands out among other scoring methods due to its notable accuracy, quick assessment process, appropriate interrater reliability, and consistent reproducibility. In addition, it offers significant improvements over maturity assessments described by earlier researchers and provides an easy method for determining maturity through a rating scale. The New Ballard scoring system allows a direct comparison of the total maturity score with the gestational age range in weeks, eliminating the need for the complex calculations required by previous approaches such as the Dubowitz scoring methods.[14] Although the Ballard score estimates gestational age for 95% of neonates within ±2.5 weeks, its accuracy varies depending on gestational age and birth weight.[15] For instance, studies have shown that the Ballard score tends to overestimate gestational age by an average of 0.4 weeks in preterm neonates born before 37 weeks, whereas it underestimates gestational age by up to 2.5 weeks in growth-restricted babies.[16] This variability highlights the need to complement physical assessments with other tools, such as ultrasound-based dating, for improved accuracy in challenging cases.
In a meta-analysis, Lee et al discovered that across 7 studies, the correlation between gestational age determined by the New Ballard score and ultrasound varied from 0.12 to 0.97, with a median of 0.85.[6] The strong correlation with ultrasound in certain studies (up to 0.97) suggests that the New Ballard score can be highly accurate under optimal conditions, particularly for term and near-term infants. Sensitivity and specificity for correctly estimating gestational age with Ballard scoring were reported as 64% and 95%, respectively. Notably, the New Ballard score tended to overestimate gestational age to a lesser extent compared to the original Ballard method in neonates younger than 30 weeks.[17]
Recent research has focused on improving gestational age assessment for preterm newborns using machine learning techniques and smartphone applications. Patel et al developed the Tablet App for the Simplified Gestational Age Score, a simple and inexpensive tool for estimating gestational age, particularly valuable in resource-limited settings to facilitate early identification of preterm neonates, optimize health resource allocation, and improve preterm birth management.[18] Machine learning models are being applied to metabolomic gestational age dating, which estimates gestational age by analyzing specific metabolic markers in the neonate's blood that reflect physiological processes varying with fetal maturity, offering a potentially accurate assessment method during the first week of life.[19][20]
Conclusion
Given the significant implications for clinical management and outcomes, evaluating gestational age in neonates is critical to neonatal care. Although various methods exist for estimating gestational age, including antenatal ultrasound and dating from the LMP, postnatal clinical assessments remain crucial, particularly in settings where other methods are not feasible. The Dubowitz and New Ballard scoring systems have long been mainstays in neonatology practice, providing vital information for neonatal maturity assessment. Future research should focus on refining existing scoring systems to address their limitations and improve their accuracy and reliability. Comparative studies with a large sample size are needed to evaluate the effectiveness of different scoring systems in diverse clinical settings and patient populations. By continuing to refine and validate neonatal scoring systems, clinicians can better assess neonatal maturity, ultimately improving care and outcomes for preterm infants.
References
[1] Perin J, Mulick A, Yeung D, Villavicencio F, Lopez G, Strong KL, Prieto-Merino D, Cousens S, Black RE, Liu L. Global, regional, and national causes of under-5 mortality in 2000-19: an updated systematic analysis with implications for the Sustainable Development Goals. The Lancet. Child & adolescent health. 2022 Feb:6(2):106-115. doi: 10.1016/S2352-4642(21)00311-4. Epub 2021 Nov 17 [PubMed PMID: 34800370] Level 1 (high-level) evidence
[2] Ohuma EO, Moller AB, Bradley E, Chakwera S, Hussain-Alkhateeb L, Lewin A, Okwaraji YB, Mahanani WR, Johansson EW, Lavin T, Fernandez DE, Domínguez GG, de Costa A, Cresswell JA, Krasevec J, Lawn JE, Blencowe H, Requejo J, Moran AC. National, regional, and global estimates of preterm birth in 2020, with trends from 2010: a systematic analysis. Lancet (London, England). 2023 Oct 7:402(10409):1261-1271. doi: 10.1016/S0140-6736(23)00878-4 [PubMed PMID: 37805217] Level 1 (high-level) evidence
[3] Moxon SG, Ruysen H, Kerber KJ, Amouzou A, Fournier S, Grove J, Moran AC, Vaz LM, Blencowe H, Conroy N, Gülmezoglu A, Vogel JP, Rawlins B, Sayed R, Hill K, Vivio D, Qazi SA, Sitrin D, Seale AC, Wall S, Jacobs T, Ruiz Peláez J, Guenther T, Coffey PS, Dawson P, Marchant T, Waiswa P, Deorari A, Enweronu-Laryea C, Arifeen S, Lee AC, Mathai M, Lawn JE. Count every newborn; a measurement improvement roadmap for coverage data. BMC pregnancy and childbirth. 2015:15 Suppl 2(Suppl 2):S8. doi: 10.1186/1471-2393-15-S2-S8. Epub 2015 Sep 11 [PubMed PMID: 26391444]
[4] Committee on Practice Bulletins—Obstetrics and the American Institute of Ultrasound in Medicine. Practice Bulletin No. 175: Ultrasound in Pregnancy. Obstetrics and gynecology. 2016 Dec:128(6):e241-e256 [PubMed PMID: 27875472]
[5] Lynch CD, Zhang J. The research implications of the selection of a gestational age estimation method. Paediatric and perinatal epidemiology. 2007 Sep:21 Suppl 2:86-96 [PubMed PMID: 17803622]
[6] Lee AC, Panchal P, Folger L, Whelan H, Whelan R, Rosner B, Blencowe H, Lawn JE. Diagnostic Accuracy of Neonatal Assessment for Gestational Age Determination: A Systematic Review. Pediatrics. 2017 Dec:140(6):e20171423. doi: 10.1542/peds.2017-1423. Epub 2017 Nov 17 [PubMed PMID: 29150458] Level 1 (high-level) evidence
[7] Dubowitz LM, Dubowitz V, Goldberg C. Clinical assessment of gestational age in the newborn infant. The Journal of pediatrics. 1970 Jul:77(1):1-10 [PubMed PMID: 5430794]
[8] Spinnato JA, Sibai BM, Shaver DC, Anderson GD. Inaccuracy of Dubowitz gestational age in low birth weight infants. Obstetrics and gynecology. 1985 Apr:65(4):601-2 [PubMed PMID: 3982736]
[9] Sanders M, Allen M, Alexander GR, Yankowitz J, Graeber J, Johnson TR, Repka MX. Gestational age assessment in preterm neonates weighing less than 1500 grams. Pediatrics. 1991 Sep:88(3):542-6 [PubMed PMID: 1881734]
[10] Dubowitz L, Ricciw D, Mercuri E. The Dubowitz neurological examination of the full-term newborn. Mental retardation and developmental disabilities research reviews. 2005:11(1):52-60 [PubMed PMID: 15856443]
[11] Allan RC, Sayers S, Powers J, Singh G. The development and evaluation of a simple method of gestational age estimation. Journal of paediatrics and child health. 2009 Jan-Feb:45(1-2):15-9. doi: 10.1111/j.1440-1754.2008.01429.x [PubMed PMID: 19208060]
[12] Ballard JL, Novak KK, Driver M. A simplified score for assessment of fetal maturation of newly born infants. The Journal of pediatrics. 1979 Nov:95(5 Pt 1):769-74 [PubMed PMID: 490248]
[13] Ballard JL, Khoury JC, Wedig K, Wang L, Eilers-Walsman BL, Lipp R. New Ballard Score, expanded to include extremely premature infants. The Journal of pediatrics. 1991 Sep:119(3):417-23 [PubMed PMID: 1880657]
[14] Nandy A, Guha A, Datta D, Mondal R. Evolution of clinical method for new-born infant maturity assessment. The journal of maternal-fetal & neonatal medicine : the official journal of the European Association of Perinatal Medicine, the Federation of Asia and Oceania Perinatal Societies, the International Society of Perinatal Obstetricians. 2020 Aug:33(16):2852-2859. doi: 10.1080/14767058.2018.1560417. Epub 2019 Jan 7 [PubMed PMID: 30563394]
[15] Alliance for Maternal and Newborn Health Improvement (AMANHI) Gestational Age Study Group. Simplified models to assess newborn gestational age in low-middle income countries: findings from a multicountry, prospective cohort study. BMJ global health. 2021 Sep:6(9). doi: 10.1136/bmjgh-2021-005688 [PubMed PMID: 34518201]
[16] Lee AC, Mullany LC, Ladhani K, Uddin J, Mitra D, Ahmed P, Christian P, Labrique A, DasGupta SK, Lokken RP, Quaiyum M, Baqui AH, Projahnmo Study Group. Validity of Newborn Clinical Assessment to Determine Gestational Age in Bangladesh. Pediatrics. 2016 Jul:138(1). doi: 10.1542/peds.2015-3303. Epub 2016 Jun 16 [PubMed PMID: 27313070]
[17] Wariyar U, Tin W, Hey E. Gestational assessment assessed. Archives of disease in childhood. Fetal and neonatal edition. 1997 Nov:77(3):F216-20 [PubMed PMID: 9462193]
[18] Patel AB, Kulkarni H, Kurhe K, Prakash A, Bhargav S, Parepalli S, Fogleman EV, Moore JL, Wallace DD, Hibberd PL. Early identification of preterm neonates at birth with a Tablet App for the Simplified Gestational Age Score (T-SGAS) when ultrasound gestational age dating is unavailable: A validation study. PloS one. 2020:15(8):e0238315. doi: 10.1371/journal.pone.0238315. Epub 2020 Aug 31 [PubMed PMID: 32866202] Level 1 (high-level) evidence
[19] Rittenhouse KJ, Vwalika B, Keil A, Winston J, Stoner M, Price JT, Kapasa M, Mubambe M, Banda V, Muunga W, Stringer JSA. Improving preterm newborn identification in low-resource settings with machine learning. PloS one. 2019:14(2):e0198919. doi: 10.1371/journal.pone.0198919. Epub 2019 Feb 27 [PubMed PMID: 30811399]
[20] Sazawal S, Ryckman KK, Das S, Khanam R, Nisar I, Jasper E, Dutta A, Rahman S, Mehmood U, Bedell B, Deb S, Chowdhury NH, Barkat A, Mittal H, Ahmed S, Khalid F, Raqib R, Manu A, Yoshida S, Ilyas M, Nizar A, Ali SM, Baqui AH, Jehan F, Dhingra U, Bahl R. Machine learning guided postnatal gestational age assessment using new-born screening metabolomic data in South Asia and sub-Saharan Africa. BMC pregnancy and childbirth. 2021 Sep 7:21(1):609. doi: 10.1186/s12884-021-04067-y. Epub 2021 Sep 7 [PubMed PMID: 34493237] Level 2 (mid-level) evidence