New machine learning tool predicts devastating intestinal disease in premature infants

August 11, 2020

by Columbia University School of Engineering and Applied Science

Necrotizing enterocolitis (NEC) is a life-threatening intestinal disease of prematurity. Characterized by sudden and progressive intestinal inflammation and tissue death, it affects up to 11,000 premature infants in the United States annually, and 15-30% of affected babies die from NEC. Survivors often face long-term intestinal and neurodevelopmental complications.

Researchers from Columbia Engineering and the University of Pittsburgh have developed a sensitive and specific early warning system for predicting NEC in premature infants before the disease occurs. The prototype predicts NEC accurately and early, using stool microbiome features combined with clinical and demographic information. The pilot study was presented virtually on July 23 at ACM CHIL 2020.

"It's amazing how we may be able to use machine learning to stop this from happening to babies," said the study's co-author, Ansaf Salleb-Aouissi, a senior lecturer in discipline from the computer science department at Columbia Engineering and a specialist in artificial intelligence and its applications to medical informatics. "We looked at the data and developed a tool that can truly be useful, even life-saving."

"If doctors could accurately predict NEC before the baby actually becomes sick, there are some very simple steps they could take—treatment could include stopping feeds, giving IV fluids, and starting antibiotics to prevent the worst outcomes such as long-term disability or death," said the study's lead author, Thomas A. Hooven, who began his collaboration with Salleb-Aouissi when he was an assistant professor of pediatrics in the Division of Neonatology-Perinatology at Columbia University Medical Center. He is now assistant professor of pediatrics in the Division of Newborn Medicine at the University of Pittsburgh School of Medicine.

Currently, there is no tool to predict which preterm babies will get the disease, and often NEC is not recognized until it is too late to effectively intervene. NEC is the most common intestinal emergency among preterm infants. It is characterized by rapidly progressive intestinal necrosis, bacteremia, acidosis, and high rates of morbidity and mortality.

The causes of NEC are not well understood, but several studies have focused on shifts in the intestinal microbiome, the bacteria in the intestine whose composition can be determined by DNA sequencing of small stool samples. The researchers hypothesized that a machine learning approach to modeling clinical, demographic, and microbiome data from preterm patients might allow discrimination of patients at high risk for NEC long before clinical disease onset, which would permit early intervention and mitigation of serious complications.

Hooven, Salleb-Aouissi, and Lin used data from a 2016 NIH clinical study of premature infants whose stool was collected in several American neonatal ICUs between 2009 and 2013. The team examined 2,895 stool samples from 161 preterm infants, 45 of whom developed NEC. Given the complexity of the microbiome data, the researchers performed several preprocessing steps to reduce its dimensionality and to address its compositional and hierarchical nature, so that it could be used for machine learning.
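The article does not say which preprocessing steps the team used. As an illustration only, one common way to handle the compositional nature of microbiome counts is the centered log-ratio (CLR) transform, sketched below in Python. The function name, the pseudocount value, and the sample counts are assumptions made for this example and do not come from the study.

```python
import math

def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Illustrative only: the pseudocount of 0.5 is an assumed default,
    not a value reported by the researchers.
    """
    # Add a pseudocount so that zero counts do not produce log(0).
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    # Convert to relative abundances (a composition that sums to 1).
    proportions = [s / total for s in shifted]
    logs = [math.log(p) for p in proportions]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log is the same as dividing by the geometric mean.
    return [value - mean_log for value in logs]

# Made-up read counts for four taxa in a single stool sample.
sample = [120, 30, 0, 850]
clr_values = clr_transform(sample)
print(clr_values)
```

CLR values always sum to zero within a sample, which removes the constraint that relative abundances must add up to one and makes the features easier to feed to standard learning algorithms.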

"NEC represents an excellent application from a machine learning perspective," said Salleb-Aouissi. "The lessons we've learned from our new technique could well translate to other genetic or proteomic datasets and inspire new machine learning algorithms for healthcare datasets."

The team evaluated several machine learning methods to determine the best strategy for predicting NEC from microbiome data. They found optimal performance from a gated attention-based multiple instance learning (MIL) approach.

Since human microbiomes change over time, the MIL methods address the sequential aspect of the problem. For example, in the first 20 days after an infant is born, the infant's microbiome goes through a drastic change. Many studies have shown that infants with higher microbiome diversity are typically healthier.
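Microbiome diversity can be quantified in several ways, and the article does not specify which measure the team relied on. A widely used one is the Shannon diversity index, sketched below in Python as an illustration. The function name and the two example samples are hypothetical.

```python
import math

def shannon_diversity(counts):
    """Shannon diversity index (natural log) for one sample's taxon counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    index = 0.0
    for count in counts:
        if count > 0:
            proportion = count / total
            # Each taxon contributes -p * ln(p); absent taxa contribute nothing.
            index -= proportion * math.log(proportion)
    return index

# Hypothetical samples: one evenly spread across taxa, one dominated by a single taxon.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]

print(shannon_diversity(even_sample))
print(shannon_diversity(dominated_sample))
```

The evenly spread sample scores higher than the dominated one, which matches the intuition that a community ruled by a single organism is less diverse.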

"This led us to think that changes in microbiome diversity can help to explain why some infants are more likely to be sick from NEC," said Adam (Yun Chao) Lin, a computer science MS student and co-author of the study whose work on this project prompted him to now pursue a Ph.D.

Instead of viewing microbiome samples from an infant as independent, the team represented each patient as a collection of samples (a "bag" in MIL terminology) and applied attention mechanisms to learn the complex relationships among the samples. The machine learning algorithm "looks" at each bag and tries to guess from its contents whether or not the baby is affected.
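To make the idea of attending over a bag concrete, here is a minimal Python sketch of gated attention pooling in the general form used in the MIL literature. It is not the study's model: the matrices V and U, the vector w, and the feature values are made-up placeholders, whereas in a real system they would be learned from data and the architecture may differ.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_attention_pool(bag, V, U, w):
    """Return (attention weights, bag embedding) for a bag of feature vectors.

    Each instance h gets the score w . (tanh(V h) * sigmoid(U h)); the scores
    are normalized with a softmax across the instances in the bag.
    """
    scores = []
    for h in bag:
        tanh_part = [math.tanh(v) for v in matvec(V, h)]
        gate_part = [sigmoid(u) for u in matvec(U, h)]
        gated = [t * g for t, g in zip(tanh_part, gate_part)]
        scores.append(sum(wi * gi for wi, gi in zip(w, gated)))
    # Softmax over instances, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The bag embedding is the attention-weighted sum of the instance vectors.
    dim = len(bag[0])
    embedding = [sum(a * h[j] for a, h in zip(weights, bag)) for j in range(dim)]
    return weights, embedding

# Placeholder parameters (hypothetical values, normally learned during training).
V = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
U = [[0.1, 0.4, -0.3], [-0.6, 0.2, 0.7]]
w = [1.0, -1.0]

# One hypothetical patient: a bag of three samples with three features each.
bag = [[0.2, 1.0, -0.5], [1.5, 0.3, 0.8], [-0.7, 0.0, 0.4]]
weights, embedding = gated_attention_pool(bag, V, U, w)
print(weights, embedding)
```

The attention weights sum to one and indicate how much each sample contributes to the patient-level representation. A classifier applied to the embedding would then produce the prediction for the whole bag.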

In repeated trials, the ability of the model to distinguish affected from non-affected infants had a good balance of sensitivity and specificity. "The Area Under the ROC Curve (AUC) is about 0.9, which demonstrates how good our models are at distinguishing between affected and unaffected patients," Salleb-Aouissi noted. "Ours is the first effective system for a clinically applicable machine learning model that combines microbiome, demographic, and clinical data that can be collected and monitored in real-time in a neonatal ICU. We are excited about extending its applicability to a new area of predictive monitoring in medicine."
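The AUC quoted above has a simple interpretation: it is the probability that a randomly chosen affected infant receives a higher risk score than a randomly chosen unaffected one. The Python sketch below computes it by direct pairwise comparison. The labels and scores are invented for illustration and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Invented example: 1 = developed NEC, 0 = did not; scores are model risk estimates.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so a value near 0.9 means the model ranks an affected infant above an unaffected one in roughly nine out of ten such pairs.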

The researchers are now developing a noninvasive standalone testing platform for accurate identification of infants at high risk for NEC before clinical onset, to prevent the worst outcomes. Once the platform is ready, they will conduct a randomized clinical trial to validate their technique's predictions in a real-time neonatal ICU cohort.

"For the first time I can envision a future where parents of preterm infants, and their medical teams, no longer live in constant fear of NEC," said Hooven.