02 August 2021: Editorial
Editorial: The National COVID Cohort Collaborative Consortium Combines Population Data with Machine Learning to Evaluate and Predict Risk Factors for the Severity of COVID-19
Dinah V. Parums
Med Sci Monit 2021; 27:e934171
ABSTRACT: Infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that causes coronavirus disease 2019 (COVID-19) commonly presents with pneumonia. However, COVID-19 is now recognized to involve multiple organ systems with varying severity and duration. In July 2021, the findings from a retrospective population study from the National COVID Cohort Collaborative (N3C) Consortium were published that included analysis by machine learning methods of 174,568 adults with SARS-CoV-2 infection from 34 medical centers in the US. The study stratified patients for COVID-19 according to the World Health Organization (WHO) Clinical Progression Scale (CPS). Severe clinical outcomes were identified as the requirement for invasive ventilatory support, or extracorporeal membrane oxygenation (ECMO), and patient mortality. Machine learning analysis showed that the factor most strongly associated with severity of clinical course in patients with COVID-19 was pH. A separate multivariable logistic regression model showed that independent factors associated with more severe clinical outcomes included age, dementia, male gender, liver disease, and obesity. This Editorial aims to present the rationale and findings of the largest population cohort of adult patients with COVID-19 to date and highlights the importance of using large population studies with sophisticated analytical methods, including machine learning.
Keywords: Editorial, Population Characteristics, machine learning, Epidemiology, severe acute respiratory syndrome, COVID-19, severe acute respiratory syndrome coronavirus 2, Diagnosis, Computer-Assisted, Middle Aged, Models, Statistical, population health, Risk Factors, SARS-CoV-2, Severity of Illness Index
Infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that causes coronavirus disease 2019 (COVID-19) commonly presents with pneumonia. However, COVID-19 is now recognized to involve multiple organ systems with varying severity and duration. Early in the COVID-19 pandemic, attempts were made to classify the severity of COVID-19 based on symptom severity, degree of impaired lung function, and imaging changes on lung computed tomography (CT). However, impaired pulmonary function and imaging changes, such as consolidation and ground-glass opacities, are non-specific and associated with other types of viral pneumonia and infiltrative lung diseases [2,3]. The radiological term, ground-glass opacity, has been used for more than two decades to describe a mild increase in lung opacity. In ground-glass opacity, the lung vasculature and bronchial structures remain visible, which distinguishes it from consolidation.
The US National Institutes of Health (NIH) currently provides guidance to identify patients with severe COVID-19 pneumonia. Severe COVID-19 includes the following clinical criteria: an oxygen saturation (SpO2) of <94% on room air; a respiratory rate of >30 breaths per minute; a ratio of arterial oxygen partial pressure to fractional inspired oxygen (PaO2/FiO2) of <300 mmHg; or lung infiltrates that involve >50% of the lung on CT imaging.
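The four criteria listed above can be read as a simple rule in which meeting any one criterion is sufficient. The following Python sketch is illustrative only and is not taken from the NIH guidance or from any clinical software. The class name, field names, and function name are hypothetical, and the sketch assumes that a measurement that was not recorded does not count toward severity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Hypothetical container for one patient's measurements (None = not recorded)."""
    spo2_room_air_pct: Optional[float] = None    # oxygen saturation on room air, %
    resp_rate_per_min: Optional[float] = None    # breaths per minute
    pao2_fio2_mmhg: Optional[float] = None       # PaO2/FiO2 ratio, mmHg
    lung_infiltrate_pct: Optional[float] = None  # % of lung involved on CT


def meets_severe_criteria(obs: Observation) -> bool:
    """Return True if any one of the four thresholds quoted in the text is met.

    Assumption: a missing measurement is treated as 'criterion not met'.
    """
    checks = [
        obs.spo2_room_air_pct is not None and obs.spo2_room_air_pct < 94,
        obs.resp_rate_per_min is not None and obs.resp_rate_per_min > 30,
        obs.pao2_fio2_mmhg is not None and obs.pao2_fio2_mmhg < 300,
        obs.lung_infiltrate_pct is not None and obs.lung_infiltrate_pct > 50,
    ]
    return any(checks)


if __name__ == "__main__":
    # A patient with SpO2 of 92% on room air meets the first criterion.
    print(meets_severe_criteria(Observation(spo2_room_air_pct=92)))
    # A patient with SpO2 of 97% and a respiratory rate of 18 meets none.
    print(meets_severe_criteria(Observation(spo2_room_air_pct=97, resp_rate_per_min=18)))
```

The thresholds are strict inequalities, as in the text, so an SpO2 of exactly 94% or a respiratory rate of exactly 30 breaths per minute does not meet the corresponding criterion.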
In August 2020, the World Health Organization (WHO) developed the Clinical Progression Scale (CPS) to stratify patients for COVID-19 research studies according to a minimum set of three outcome measures. The three WHO CPS elements include: a measure of SARS-CoV-2 viral burden determined by quantitative polymerase chain reaction (PCR); patient survival or mortality at hospital discharge or 60 days following initial diagnosis; and a measure of progression through the healthcare system using the WHO CPS criteria.
Standardized classification systems for disease severity are the basis for clinical studies, including clinical trials, that provide evidence for therapies, vaccines, and clinical guidelines. Studies from single centers or regions have evaluated clinical and laboratory findings in patients with confirmed COVID-19 to identify risk factors for disease severity and mortality. Patient age, immune suppression, and comorbidities, particularly chronic lung disease, were initially identified as risk factors for disease severity and patient mortality. However, data from large population studies have been needed to evaluate the clinical effects of SARS-CoV-2 infection and to identify the risk factors for disease severity and patient mortality. Because of the complex nature of COVID-19, large population studies will require complex analysis using machine learning to allow for the analysis of multiple confounders and risk factors over time.
In July 2021, the findings from a retrospective cohort study from the National COVID Cohort Collaborative (N3C) Consortium were published that included analysis by machine learning of 174,568 adults with SARS-CoV-2 infection from 34 medical centers in the US. The N3C includes members of the National Institutes of Health Clinical and Translational Science Awards Program, the IDeA Centers for Translational Research, TriNetX, the National Patient-Centered Clinical Research Network, the Observational Health Data Sciences and Informatics network, and the Accrual to Clinical Trials network. The N3C is a centralized, electronic health record repository representing the largest population cohort of adult patients with COVID-19 to date and includes 1,926,526 individuals [10,11].
In the N3C Consortium study, patients were included between January 1, 2020, and December 7, 2020. More than 99% of patients in this database had a confirmed diagnosis of SARS-CoV-2 infection using polymerase chain reaction, with <1% diagnosed by antigen testing alone. A control population included SARS-CoV-2-negative individuals. Patients were stratified using the WHO COVID-19 severity scale and patient demographic characteristics [6,10]. The N3C used 64 data inputs taken on the first hospital day. Machine learning models accurately predicted ultimate clinical severity using commonly collected clinical data from the first 24 hours of hospital admission. Differences between the groups over time were analyzed using multivariable logistic regression. Random forest models and eXtreme Gradient Boosting (XGBoost) machine learning models were used to predict mortality and severe clinical outcomes that included the requirement for invasive ventilatory support or extracorporeal membrane oxygenation (ECMO).
The 174,568 SARS-CoV-2-positive adults in the N3C Consortium study had a mean age of 44.4±18.6 years, and 53.2% were women. Of these adults, 32,472 (18.6%) were hospitalized, and of the patients hospitalized, 6,565 (20.2%) had a severe clinical course. The overall mortality rate was 11.6%. Patient overall mortality decreased from 16.4% in March and April 2020 to 8.6% in September and October 2020.
Machine learning analysis showed that the factor most strongly associated with severity of clinical course in patients with COVID-19 was pH, which was consistent between machine learning methods. The results from a separate multivariable logistic regression model showed that the following were independent factors associated with more severe clinical outcomes: age (OR, 1.03; 95% CI, 1.03–1.04); dementia (OR, 1.26; 95% CI, 1.13–1.41); male gender (OR, 1.60; 95% CI, 1.51–1.69); liver disease (OR, 1.20; 95% CI, 1.08–1.34); and obesity (OR, 1.36; 95% CI, 1.27–1.46). The study also showed racial disparity for disease severity, with African American and Asian American patients having more severe clinical outcomes.
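As an aid to reading the odds ratios above, the following Python sketch shows how a per-unit odds ratio from a logistic regression model compounds over a wider interval, and how odds ratios for separate factors multiply. It is not the analysis code used in the study. The function names are hypothetical, and the sketch assumes that the age odds ratio of 1.03 is per year of age, that the effect of age is linear on the log-odds scale, and that the model contains no interaction terms between the factors combined.

```python
import math


def odds_ratio_over_span(or_per_unit: float, units: float) -> float:
    """Compound a per-unit odds ratio over `units` units.

    In a logistic model the coefficient is log(OR), so the log-odds
    change over `units` is units * log(OR), and the OR is its exponential.
    """
    return math.exp(units * math.log(or_per_unit))


def combined_odds_ratio(*odds_ratios: float) -> float:
    """Multiply odds ratios for several factors (assumes no interaction terms)."""
    return math.prod(odds_ratios)


if __name__ == "__main__":
    # Age OR of 1.03, assumed per year, compounded over a 10-year difference.
    print(round(odds_ratio_over_span(1.03, 10), 2))   # 1.34
    # Male gender (OR 1.60) combined with obesity (OR 1.36).
    print(round(combined_odds_ratio(1.60, 1.36), 2))  # 2.18
```

Under these assumptions, a 10-year difference in age corresponds to odds of a more severe outcome that are about 1.34 times higher, which is similar in size to the odds ratio reported for obesity. These are odds ratios, not relative risks, and they should not be read as direct multipliers of probability.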
Machine learning analysis of a national multicenter population database from the US, the National COVID Cohort Collaborative (N3C) Consortium, evaluated clinical and demographic data on 174,568 SARS-CoV-2-positive adults, of whom 32,472 were admitted to hospital. This study confirmed that patient mortality from COVID-19 decreased during 2020. Patient demographic characteristics of age, male gender, and comorbidities, including obesity, were associated with increased clinical severity.
1. Parums DV, Editorial: Long COVID, or post-COVID syndrome, and the global impact on health care: Med Sci Monit, 2021; 27; e933446
2. Li L, Sun W, Han M, A study on the predictors of disease severity of COVID-19: Med Sci Monit, 2020; 26; e927167
3. Hochhegger B, Zanon M, Altmayer S, COVID-19 mimics on chest CT: A pictorial review and radiologic guide: Br J Radiol, 2021; 94(1118); 20200703
4. Collins J, Stern EJ, Ground-glass opacity at CT: The ABCs: Am J Roentgenol, 1997; 169; 355-67
5. National Institutes of Health (NIH), COVID-19 Treatment Guidelines Panel: Clinical spectrum of SARS-CoV-2 infection April 21, 2021 Available at: https://www.covid19treatmentguidelines.nih.gov/overview/clinical-spectrum/
6. WHO Working Group on the Clinical Characterization and Management of COVID-19 infection, A minimal common outcome measure set for COVID-19 clinical research: Lancet Infect Dis, 2020; 20(8); e192-97
7. Wang Y, Yao S, Liu X, Risk factors of coronavirus disease 2019-related mortality and optimal treatment regimens: A retrospective study: Med Sci Monit, 2021; 27; e926751
8. Parums DV, Editorial: Registries and population databases in clinical research and practice: Med Sci Monit, 2021; 27; e933554
9. Parums DV, Editorial: Artificial intelligence (AI) in clinical medicine and the 2020 CONSORT-AI study guidelines: Med Sci Monit, 2021; 27; e933675
10. Bennett TD, Moffitt RA, Hajagos JG; National COVID Cohort Collaborative (N3C) Consortium, Clinical characterization and prediction of clinical severity of SARS-CoV-2 infection among US adults using data from the US National COVID Cohort Collaborative: JAMA Netw Open, 2021; 4(7); e2116901
11. Haendel MA, Chute CG, Bennett TD; N3C Consortium, The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment: J Am Med Inform Assoc, 2021; 28(3); 427-43