22 June 2026: Clinical Research
Development and Validation of Machine-Learning-Based Prediction Models for Thyroid Diseases During Pregnancy
Guang Yang BCDEF 1, Yi Gao BCEF 1, Pengfei Liu BC 1, Jingwen Jiang BC 1, Hui Qiao ADEG 1*, Weixuan Sheng
DOI: 10.12659/MSM.953235
Med Sci Monit 2026; 32:e953235
Table 2. Patient clinical characteristics.
| Characteristics | Training set (n=4466): Normal (n=3927) | Training set: Thyroid disease (n=539) | Training set: P value | Test set (n=995): Normal (n=894) | Test set: Thyroid disease (n=101) | Test set: P value |
|---|---|---|---|---|---|---|
| Age, mean (SD), y | 31.83 (3.83) | 32.02 (3.80) | 0.264 | 32.34 (3.97) | 32.53 (4.05) | 0.641 |
| Height, mean (SD), cm | 162.71 (5.05) | 162.33 (5.12) | 0.103 | 162.69 (5.18) | 162.97 (5.26) | 0.600 |
| Pre-pregnancy weight, mean (SD), kg | 58.17 (9.35) | 59.22 (10.08) | 0.015 | 58.59 (9.24) | 59.96 (9.34) | 0.159 |
| BMI, mean (SD), kg/m² | 21.95 (3.24) | 22.44 (3.44) | 0.001 | 22.12 (3.24) | 22.54 (3.06) | 0.224 |
| Gravidity, mean (SD) | 1.77 (0.99) | 1.84 (0.97) | 0.113 | 1.64 (0.94) | 1.78 (0.99) | 0.161 |
| Parity, mean (SD) | 1.35 (0.51) | 1.30 (0.47) | 0.056 | 1.30 (0.51) | 1.35 (0.56) | 0.345 |
| PM, No. (%) | | | 0.147 | | | 0.512 |
| Primipara | 2627 (66.9) | 378 (70.1) | | 652 (72.9) | 70 (69.3) | |
| Multipara | 1300 (33.1) | 161 (29.9) | | 242 (27.1) | 31 (30.7) | |
| IVF-ET, No. (%) | 109 (2.8) | 14 (2.6) | 0.923 | 10 (1.1) | 0 (0.0) | 0.588 |
| Twins, No. (%) | | | 0.938 | | | 0.172 |
| Single | 3885 (98.9) | 534 (99.1) | | 869 (97.2) | 101 (100.0) | |
| Twins | 42 (1.1) | 5 (0.9) | | 25 (2.8) | 0 (0.0) | |
| Proteinuria, No. (%) | 55 (1.4) | 8 (1.5) | 1 | 13 (1.5) | 2 (2.0) | 1 |
| Anemia, No. (%) | | | 0.64 | | | 0.529 |
| No | 2715 (69.1) | 375 (69.6) | | 844 (94.4) | 98 (97.0) | |
| Mild | 997 (25.4) | 141 (26.2) | | 49 (5.5) | 3 (3.0) | |
| Moderate | 212 (5.4) | 23 (4.3) | | 1 (0.1) | 0 (0.0) | |
| Severe | 3 (0.1) | 0 (0.0) | | 0 (0.0) | 0 (0.0) | |
| GDM, No. (%) | 626 (15.9) | 104 (19.3) | 0.056 | 174 (19.5) | 21 (20.8) | 0.852 |
| HDP, No. (%) | | | 0.3 | | | 0.307 |
| No | 3668 (93.4) | 496 (92.0) | | 781 (87.4) | 88 (87.1) | |
| Chronic | 34 (0.9) | 2 (0.4) | | 7 (0.8) | 3 (3.0) | |
| Gestational | 78 (2.0) | 16 (3.0) | | 44 (4.9) | 4 (4.0) | |
| Chronic preeclampsia | 30 (0.8) | 3 (0.6) | | 0 (0.0) | 0 (0.0) | |
| Preeclampsia | 58 (1.5) | 12 (2.2) | | 25 (2.8) | 3 (3.0) | |
| Eclampsia | 59 (1.5) | 10 (1.9) | | 37 (4.1) | 3 (3.0) | |
| Scarred uterus, No. (%) | 281 (7.2) | 37 (6.9) | 0.061 | 58 (6.5) | 7 (6.9) | 1 |
| AID, No. (%) | 16 (0.4) | 4 (0.7) | 0.455 | 3 (0.3) | 0 (0.0) | 1 |
| APCD, No. (%) | 84 (2.1) | 19 (3.5) | 0.063 | 0 (0.0) | 0 (0.0) | NA |
| Thrombocytopenia, No. (%) | 62 (1.6) | 9 (1.7) | 1 | 15 (1.7) | 2 (2.0) | 1 |
| CVD, No. (%) | 87 (2.2) | 15 (2.8) | 0.501 | 3 (0.3) | 1 (1.0) | 0.876 |
| Respiratory disease, No. (%) | 13 (0.3) | 5 (0.9) | 0.092 | 6 (0.7) | 1 (1.0) | 1 |

Note: BMI was calculated as pre-pregnancy weight in kilograms divided by height in meters squared. For age, height, pre-pregnancy weight, BMI, gravidity, and parity, the independent-samples Student’s t-test was used to compare the two groups. For PM, IVF-ET, twins, proteinuria, anemia, GDM, HDP, scarred uterus, AID, APCD, thrombocytopenia, CVD, and respiratory disease, the χ² test or Fisher’s exact test was used. All P values are presented for descriptive and exploratory purposes only; they were derived from univariate analyses. Variable selection for the prediction models was based on the Boruta algorithm and machine learning procedures rather than on univariate significance testing.

Abbreviations: AID, autoimmune disease; APCD, adverse pregnancy and childbirth history; BMI, body mass index; CVD, cardiovascular disease; GDM, gestational diabetes mellitus; HDP, hypertensive disorders of pregnancy; IVF-ET, in vitro fertilization and embryo transfer; NA, not applicable; PM, primipara or multipara; SD, standard deviation.
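
The group comparisons described in the note can be roughly re-derived from the summary figures printed in Table 2. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, the Yates continuity correction for 2×2 tables is an assumption (the note does not say whether a correction was applied), and the t-test P value uses a normal approximation that is only reasonable because the groups are large. It relies on the Python standard library alone.

```python
from math import erfc, sqrt
from statistics import NormalDist


def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by height in metres squared."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def chi2_2x2_yates(a, b, c, d):
    """Chi-square test with Yates correction for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, P value). With 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(x / 2)), so no external library is needed.
    Assumption: a continuity correction is applied; the source does not say so.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction, clipped at zero so it cannot overshoot.
    num = max(abs(a * d - b * c) - n / 2.0, 0.0)
    stat = n * num ** 2 / (row1 * row2 * col1 * col2)
    return stat, erfc(sqrt(stat / 2.0))


def pooled_t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Student's t statistic (equal variances) from means, SDs and group sizes.

    Assumption: the P value uses a normal approximation to the t distribution,
    which is acceptable only for large degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (m1 - m2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


if __name__ == "__main__":
    # GDM, training set: 626 of 3927 (normal) vs 104 of 539 (thyroid disease).
    stat, p = chi2_2x2_yates(626, 3927 - 626, 104, 539 - 104)
    print(f"GDM: chi-square = {stat:.3f}, P = {p:.3f}")

    # Pre-pregnancy weight, training set: mean (SD) per group from Table 2.
    t, p = pooled_t_from_summary(58.17, 9.35, 3927, 59.22, 10.08, 539)
    print(f"Weight: t = {t:.3f}, P = {p:.3f}")
```

Under these assumptions the two P values should land close to the 0.056 (GDM) and 0.015 (pre-pregnancy weight) reported for the training set. Small differences are expected because the table gives rounded means and SDs. Note that `bmi` applied to the group mean weight and height will not reproduce the reported mean BMI exactly, since a mean of per-patient ratios differs from a ratio of means.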






