Klebsiella pneumoniae is a major cause of hospital-acquired infections worldwide, contributing substantially to morbidity, mortality, and healthcare burden. The emergence of strains that combine resistance to last-resort antimicrobials with hypervirulence has become a pressing public-health challenge. Although the genetic determinants of multidrug resistance and hypervirulence have been extensively characterized, the relationship between the genetic repertoire of K. pneumoniae and the clinical severity of infection remains poorly understood.
We analyzed a large nationwide collection of 1,306 K. pneumoniae complex strains collected over seven years from five centres across the Kingdom of Saudi Arabia. Using detailed patient-level clinical data, we applied regression analyses, genome-wide association study (GWAS) methods, and machine-learning approaches to elucidate the clinical significance of ESBL/carbapenemase-producing (ESBL/CP), hypervirulent, and convergent ESBL(+)/CP(+) hypervirulent K. pneumoniae strains. We compared clinical severity outcomes, namely in-hospital all-cause mortality, ICU admission, and length of hospital stay (LOS), across these K. pneumoniae types; identified genome-wide determinants associated with clinical severity; and used machine learning to predict these outcomes from genomic biomarkers combined with clinical metadata.
Infections caused by convergent strains exhibited the greatest clinical severity, with nearly double the in-hospital mortality (reaching 42% at 90 days), a 2.4-fold higher likelihood of ICU admission, and an average 150% increase in LOS compared with infections caused by susceptible, non-hypervirulent strains. Our findings indicate an additive effect of hypervirulence and multidrug resistance on disease severity. Carbapenem resistance determinants showed the strongest association with adverse outcomes, even after adjusting for the presence of other resistance and virulence genes and for clinical confounders. The GWAS revealed associations between clinical outcomes and accessory genes involved in carbohydrate metabolism and the Type VI secretion system (T6SS) machinery, as well as metabolic-adaptation and stress-tolerance/persistence loci. Additional significant associations were identified with SNPs in ABC transporters, cell-envelope systems, sugar transporter families, and RND-family efflux systems. When trained on combined genomic and clinical predictors, machine-learning models yielded average area under the curve (AUC) values of 0.78 and 0.79 for mortality and ICU admission, respectively, and showed a strong monotonic association between observed and predicted LOS, with an average correlation of 0.59 on unseen test data.
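For readers less familiar with the two performance metrics reported above, the following minimal sketch shows how an AUC for a binary outcome and a rank-based (monotonic) correlation for a continuous outcome could be computed on held-out data. It is illustrative only and is not the study's pipeline: the function names are hypothetical, the toy inputs are invented, and the use of Spearman's rank correlation is an assumption, since the text specifies only a "monotonic association".

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the rank-sum (Mann-Whitney U) formulation.

    labels: 1 for the event (e.g. death or ICU admission), 0 otherwise.
    scores: predicted risk, higher meaning more likely to be an event.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    ranks = average_ranks(scores)
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Toy held-out data (invented for illustration, not from the study).
toy_labels = [0, 0, 1, 1]
toy_risk = [0.1, 0.4, 0.35, 0.8]
toy_observed_los = [1, 2, 3, 4]
toy_predicted_los = [10, 20, 15, 40]

auc_value = roc_auc(toy_labels, toy_risk)
rho_value = spearman(toy_observed_los, toy_predicted_los)
```

An AUC of 0.5 corresponds to chance-level ranking of patients and 1.0 to perfect ranking; a rank correlation depends only on ordering, which makes it a natural choice for a skewed outcome such as LOS.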
This study identifies key genomic determinants that drive severe K. pneumoniae infections, with carbapenem-resistance markers emerging as the leading contributors to poor clinical outcomes. The strong predictive performance of genomic biomarkers, particularly for mortality, ICU admission, and LOS, highlights their value in enhancing diagnostic precision, improving clinical risk stratification, and informing targeted infection-prevention strategies.