The European Respiratory Journal
European Respiratory Society
A predictive model for disease progression in non-severely ill patients with coronavirus disease 2019
DOI 10.1183/13993003.01234-2020, Volume: 56, Issue: 1

A predictive model for COVID-19

Ji, Yuan, Shen, Lv, Li, Chen, Zhu, Liu, Liang, Lin, Xie, Li, Chen, Lu, Ding, An, Zhu, Gao, Ni, Hu, Shi, Shi, and Dong: A predictive model for disease progression in non-severely ill patients with coronavirus disease 2019

To the Editor:

Over the past 3 months, coronavirus disease 2019 (COVID-19) has emerged across China and developed into a worldwide outbreak [1]. The disease causes illness of varying severity: on admission, 84.3% of patients with COVID-19 had non-severe illness and 15.7% had severe illness [2]. Most patients with non-severe pneumonia gradually improve and recover with treatment, while others progress rapidly to severe illness, which has a poor prognosis [3, 4]. As recently reported, the cumulative risk of the composite end-point was 3.6% in all COVID-19 patients, and the cumulative risk was 20.6% for severe illness [2].

However, it is still unknown whether early identification and intervention for non-severe patients with COVID-19 could prevent progression to severe disease. Experience in treating other diseases suggests that early treatment might have a large beneficial effect. In this paper, we aim to build a predictive model for identifying high-risk non-severe pneumonia patients at an early stage.

86 patients with COVID-19 and non-severe pneumonia on admission were recruited as the training cohort at Renmin Hospital of Wuhan University from 2 to 20 January, 2020, and another 62 patients were prospectively enrolled as the validation cohort from 28 January to 9 February, 2020. COVID-19 was confirmed by real-time PCR. Disease severities of COVID-19 were defined as severe and non-severe pneumonia based on the criteria of American Thoracic Society guidelines for community-acquired pneumonia [2, 5]. The exclusion criteria included: 1) degrees of severity were not available on admission or during follow-up; 2) diagnosed with severe illness at the time of admission; 3) confirmed with COVID-19 and treated at other hospitals; 4) medication was administered within 15 days before admission; 5) received oxygen support during follow-up. Patients were divided into “progressed” or “non-progressed” groups, based on whether they progressed to severe illness or not during the 14-day follow-up period. Comorbidity included diabetes, hypertension, cardiovascular and cerebrovascular diseases, COPD, malignant tumour, chronic liver disease, chronic kidney disease, tuberculosis and immunodeficiency diseases, etc.

Clinical characteristics and laboratory findings were extracted from electronic medical records. Radiological features were extracted from chest computed tomography (CT) imaging using a double-blind method [6]. To evaluate the lesion size accurately, a diagnosis system for COVID-19 based on artificial intelligence (AI) was employed to measure volume ratio of pneumonia automatically by analysing CT values [7, 8].

Logistic regression was used as the classifier to build the predictive model. The discriminative performance of the predictive model was quantified by the area under the receiver operating characteristic curve (AUC) in the cross-validation of the training dataset and in the validation dataset. A risk index, calculated from the weight of each variable in the model, was used to identify high-risk groups. All analyses were performed using R-3.6.0.
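As an illustration of the AUC used here, the following minimal Python sketch computes it by pairwise comparison: the AUC equals the probability that a randomly chosen progressed patient receives a higher predicted score than a randomly chosen non-progressed patient, with ties counted as one half. The function name `auc` and the toy scores are assumptions for illustration only; they are not study data and this is not the R code used in the analysis.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    labels: 1 = progressed, 0 = non-progressed (illustrative coding).
    Returns the fraction of (progressed, non-progressed) pairs in which
    the progressed patient has the higher score; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up predicted probabilities and outcomes, for illustration only.
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
print(auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.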

The median age of the 148 patients was 46.5 years (interquartile range (IQR) 35.8–58.0 years), and 81 (54.7%) were female. A total of 60 (40.5%) non-severe patients progressed to severe illness, and the median time of progression was 5.0 days (IQR 2.8–9.0 days). In the training cohort, 34 (39.5%) non-severe patients progressed to severe illness, as did 26 (41.9%) in the validation cohort. The median time of progression in these two cohorts was 5.5 days (IQR 1.0–9.0 days) and 5.0 days (IQR 3.0–9.8 days), respectively. The variables are described in table 1.

TABLE 1 Description of clinical characteristics and multivariate analysis in training cohort

| Variables | Clinical characteristics: training cohort (n=86) | Clinical characteristics: validation cohort (n=62) | Multivariate analysis in training cohort: OR (95% CI) | Multivariate analysis in training cohort: score |
| --- | --- | --- | --- | --- |
| Non-progressed group | 52 (60.5%) | 36 (58.1%) | | |
| Progressed group | 34 (39.5%) | 26 (41.9%) | | |
| Time of progression days | 5.5 (1.0–9.0) | 5.0 (3.0–9.8) | | |
| Age years | 50.5 (37.0–60.5) | 44.5 (35.0–53.0) | | |
| Age range years | | | | |
|   <40 | 27 (31.4%) | 21 (33.9%) | | |
|   40–49 | 15 (17.4%) | 14 (22.6%) | | |
|   50–59 | 22 (25.6%) | 15 (24.2%) | | |
|   60–69 | 13 (15.1%) | 9 (14.5%) | | |
|   70–79 | 7 (8.1%) | 2 (3.2%) | | |
|   ≥80 | 2 (2.3%) | 1 (1.6%) | | |
| Female | 45 (52.3%) | 36 (58.1%) | | |
| Comorbidity | 42 (48.8%) | 15 (24.2%) | 3.436 (1.084–10.896) | 12× (0/1; no=0, yes=1) |
| Dyspnoea on admission | 11 (12.8%) | 6 (9.7%) | 4.869 (0.760–31.212) | 16× (0/1; no=0, yes=1) |
| Temperature on admission °C | 36.8 (36.5–37.2) | 36.8 (36.5–37.1) | | |
| Respiratory rate on admission | 19.0 (18.0–20.0) | 20.0 (19.0–20.0) | | |
| Lactate dehydrogenase U·L⁻¹ | 214.0 (187.8–275.8) | 201.5 (160.3–247.0) | 1.008 (1.001–1.014) | 0.07× per unit (U·L⁻¹) |
| Procalcitonin ng·mL⁻¹ | 0.04 (0.03–0.07) | 0.03 (0.02–0.05) | | |
| Lymphocyte count ×10⁹ L⁻¹ | 1.2 (0.9–1.6) | 1.3 (1.0–1.7) | 0.134 (0.038–0.471) | −20× per unit (10⁹ L⁻¹) |
| White blood cells ×10⁹ L⁻¹ | 4.8 (3.7–6.1) | 4.7 (4.0–6.1) | | |
| Neutrophil count ×10⁹ L⁻¹ | 3.1 (2.2–4.1) | 3.0 (2.0–3.9) | | |
| Platelet count ×10⁹ L⁻¹ | 159.3 (132.5–204.0) | 164.5 (120.3–210.4) | | |
| Haemoglobin concentration g·L⁻¹ | 138.5 (127.0–156.6) | 143.3 (130.0–152.8) | | |
| Arterial oxygen saturation % | 97.0 (95.3–98.8) | 96.0 (95.0–98.0) | | |
| Radiological abnormality | | | | |
|   GGOSS | 36 (41.9%) | | | |
|   Pure ground-glass opacity | 32 (37.2%) | | | |
|   Consolidation | 12 (14.0%) | | | |
|   Other | 6 (7.0%) | | | |
| Number of affected segments | 7.0 (2.3–12.0) | | | |
| Lesion size | | | | |
|   <1 cm | 4 (4.7%) | | | |
|   1–3 cm | 32 (37.2%) | | | |
|   3 cm to 50% lobe | 45 (52.3%) | | | |
|   >50% lobe | 5 (5.8%) | | | |
| AI-based volume ratio of pneumonia | | | | |
|   −700 to 500 HU | 0.18 (0.11–0.27) | | | |
|   −600 to 500 HU | 0.11 (0.07–0.17) | | | |
| Treatment | | | | |
|   Corticosteroid agents | 55 (64.0%) | 19 (30.6%) | | |
|   Anti-infection agents | 85 (98.8%) | 52 (83.9%) | | |
|   Interferon agents | 34 (39.5%) | 7 (11.3%) | | |
|   Antiviral agents | 74 (86%) | 61 (98.4%) | | |
|   Gamma globulin agents | 54 (62.8%) | 21 (33.9%) | | |

Data are presented as medians (interquartile ranges) and n (%). Percentages may not total 100 because of rounding. Variables in the validation cohort were not completely collected, as some of them did not appear in the model of the training cohort. GGOSS: ground-glass opacities overlapped with striped shadows; AI: artificial intelligence.

To build the predictive model, we tested all the clinical, laboratory and radiological variables, except for characteristics about treatment. Four variables were finally included in the model: comorbidity (β=1.234, p=0.036), dyspnoea on admission (β=1.583, p=0.095), lactate dehydrogenase (β=0.007, p=0.027) and lymphocyte count (β=−2.012, p=0.002). The Hosmer–Lemeshow test on the training dataset gave Chi-squared=10.451 (p=0.235). The AUC in the cross-validation of the training dataset was 0.819 (95% CI 0.731–0.907), and it was 0.759 (95% CI 0.635–0.884) in the validation dataset. The four variables were weighted according to their regression coefficients: comorbidity scored 12 points, dyspnoea on admission 16 points, lactate dehydrogenase 0.07 points per U·L⁻¹, and lymphocyte count −20 points per 10⁹ L⁻¹. A total score was then calculated for each patient, with different scores indicating different risks. The AUC based on the risk scores in the training dataset was 0.856 (95% CI 0.776–0.935). Patients were divided into high-risk and low-risk groups (total score >−6.0 and ≤−6.0, respectively) based on the best cut-off value determined by the Youden index; the sensitivity was 0.941 and the specificity was 0.635. More details can be found in table 1.
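The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the total score is the plain weighted sum of the four variables using the weights in table 1; no intercept is reported, so the score is a risk index rather than a probability. The function and variable names are hypothetical, and the example patient is made up from the training-cohort medians.

```python
import math

# Weights per unit as reported in table 1.
WEIGHTS = {
    "comorbidity": 12.0,   # 0/1; no=0, yes=1
    "dyspnoea": 16.0,      # 0/1; no=0, yes=1
    "ldh": 0.07,           # per U/L of lactate dehydrogenase
    "lymphocytes": -20.0,  # per 10^9/L of lymphocyte count
}
CUTOFF = -6.0  # total score > -6.0 is high risk, <= -6.0 is low risk


def risk_score(comorbidity, dyspnoea, ldh, lymphocytes):
    """Weighted sum of the four admission variables (assumed form)."""
    return (WEIGHTS["comorbidity"] * int(bool(comorbidity))
            + WEIGHTS["dyspnoea"] * int(bool(dyspnoea))
            + WEIGHTS["ldh"] * ldh
            + WEIGHTS["lymphocytes"] * lymphocytes)


def risk_group(score):
    """Assign the high-risk or low-risk group from the total score."""
    return "high" if score > CUTOFF else "low"


def odds_ratio(beta):
    """Odds ratio per unit increase for a logistic regression coefficient."""
    return math.exp(beta)


def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Hypothetical patient: LDH 214.0 U/L, lymphocytes 1.2 x 10^9/L,
# no dyspnoea, evaluated without and with a comorbidity.
without = risk_score(False, False, 214.0, 1.2)
with_comorbidity = risk_score(True, False, 214.0, 1.2)
print(round(without, 2), risk_group(without))
print(round(with_comorbidity, 2), risk_group(with_comorbidity))
print(round(youden_index(0.941, 0.635), 3))
```

For this hypothetical patient the score is about −9.0 without a comorbidity (low risk) and about 3.0 with one (high risk), which shows how a single binary variable can move a patient across the cut-off. Exponentiating the reported coefficients, for example `odds_ratio(1.234)`, reproduces the odds ratios in table 1 to within rounding.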

In our prediction model, comorbidity was associated with disease progression: patients with comorbidities were more likely to progress to severe disease than those without. Previous studies have shown a higher proportion of patients with comorbidities among those with more severe disease [9]. We further confirmed that non-severe patients with comorbidities were more likely to progress. The p-value for dyspnoea on admission was not less than 0.05 in the multivariate regression, which might be because the relationship between dyspnoea and the outcome in this study was not strictly linear after adjusting for other variables. Although we tried other models that performed better, we finally chose the logistic model because it is interpretable and simple to apply. Patients who progress have been found to be more likely to show a decrease in lymphocyte count and an increase in lactate dehydrogenase [2, 10]. Our research further confirmed that these two indicators were also related to disease progression. A decrease in lymphocyte count usually indicates a decline in immune function, and multiple organ dysfunction might lead to an increase in lactate dehydrogenase [11]; both are consistent with what we have observed clinically.

Previous reports have identified advanced age as one of the risk factors for poor prognosis in patients with COVID-19 [2, 3]. However, age was not included in the model, which suggests that treatment of young patients with non-severe illness should not be neglected at an early stage. We speculate that the contribution of age to disease progression was reflected in comorbidities and dyspnoea. In addition, some studies have reported correlations between radiological indicators and COVID-19 [12]. Although radiological features in CT images on admission were described in detail, they were not included in the model. We speculate that multiple images acquired during treatment, rather than a single image, could indicate further progression of the disease. Although variables extracted with AI from CT imaging were not included in the model, this approach showed promise and will be the focus of our subsequent research.

There were some limitations to this study. First, patients with COVID-19 included in this study were from a single hospital, which is a potential constraint for the generalisation of our model. Second, critically ill patients were transferred to other designated hospitals according to the regulations of the local government. We were unable to track these patients’ deaths in the short term, and the association between the model and overall survival could not be evaluated, which unfortunately was a major limitation of this study.

In conclusion, the progression of non-severe patients with COVID-19 could be predicted by our model based on clinical characteristics on admission. The model was further verified in a prospective validation cohort, with good performance. With the help of our model, clinicians could easily identify high-risk non-severe patients on admission using a few routine clinical indicators, thereby contributing to the treatment and prevention of COVID-19.



Support statement: This work was supported by the National Natural Science Foundation of China (grant: 81901817), Natural Science Foundation of Hubei Province (grant: 2018CFB136), and Innovation Seed Funding of Wuhan University (grant: TFZZ2018020). Funding information for this article has been deposited with the Crossref Funder Registry.
Conflict of interest: M. Ji has nothing to disclose.
Conflict of interest: L. Yuan has nothing to disclose.
Conflict of interest: W. Shen has nothing to disclose.
Conflict of interest: J. Lv has nothing to disclose.
Conflict of interest: Y. Li has nothing to disclose.
Conflict of interest: J. Chen has nothing to disclose.
Conflict of interest: C. Zhu has nothing to disclose.
Conflict of interest: B. Liu has nothing to disclose.
Conflict of interest: Z. Liang has nothing to disclose.
Conflict of interest: Q. Lin has nothing to disclose.
Conflict of interest: W. Xie has nothing to disclose.
Conflict of interest: M. Li has nothing to disclose.
Conflict of interest: Z. Chen has nothing to disclose.
Conflict of interest: X. Lu has nothing to disclose.
Conflict of interest: Y. Ding has nothing to disclose.
Conflict of interest: P. An has nothing to disclose.
Conflict of interest: S. Zhu has nothing to disclose.
Conflict of interest: M. Gao has nothing to disclose.
Conflict of interest: H. Ni has nothing to disclose.
Conflict of interest: L. Hu has nothing to disclose.
Conflict of interest: G. Shi has nothing to disclose.
Conflict of interest: L. Shi has nothing to disclose.
Conflict of interest: W. Dong has nothing to disclose.




Guan WJ, Ni ZY, Hu Y, et al. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med 2020; 382: 1708–1720. doi: 10.1056/NEJMoa2002032

Wang D, Hu B, Hu C, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA 2020; 323: 1061–1069.

Yang X, Yu Y, Xu J, et al. Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study. Lancet Respir Med 2020; 8: 475–481. doi: 10.1016/S2213-2600(20)30079-5

Metlay JP, Waterer GW, Long AC, et al. Diagnosis and treatment of adults with community-acquired pneumonia. An official clinical practice guideline of the American Thoracic Society and Infectious Diseases Society of America. Am J Respir Crit Care Med 2019; 200: e45–e67. doi: 10.1164/rccm.201908-1581ST

Song F, Shi N, Shan F, et al. Emerging 2019 novel coronavirus (2019-nCoV) pneumonia. Radiology 2020; 295: 210–217. doi: 10.1148/radiol.2020200274

Cohen JG, Goo JM, Yoo RE, et al. Software performance in segmenting ground-glass and solid components of subsolid nodules in pulmonary adenocarcinomas. Eur Radiol 2016; 26: 4465–4474. doi: 10.1007/s00330-016-4317-3

Kitami A, Kamio Y, Hayashi S, et al. One-dimensional mean computed tomography value evaluation of ground-glass opacity on high-resolution images. Gen Thorac Cardiovasc Surg 2012; 60: 425–430. doi: 10.1007/s11748-012-0066-7

Liang W, Guan W, Chen R, et al. Cancer patients in SARS-CoV-2 infection: a nationwide analysis in China. Lancet Oncol 2020; 21: 335–337. doi: 10.1016/S1470-2045(20)30096-6

Liu Y, Yang Y, Zhang C, et al. Clinical and biochemical indexes from 2019-nCoV infected patients linked to viral loads and lung injury. Sci China Life Sci 2020; 63: 364–374. doi: 10.1007/s11427-020-1643-8

Khan AA, Allemailem KS, Alhumaydhi FA, et al. The biochemical and clinical perspectives of lactate dehydrogenase: an enzyme of active metabolism. Endocr Metab Immune Disord Drug Targets 2019; in press.

Arabi YM, Arifi AA, Balkhy HH, et al. Clinical course and outcomes of critically ill patients with Middle East respiratory syndrome coronavirus infection. Ann Intern Med 2014; 160: 389–397. doi: 10.7326/M13-2486