Just as “alarm fatigue” can be a problem for clinicians in the hospital, experts fear that emerging systems of computer-aided polyp detection may be generating so many false-positive findings that gastroenterologists who use the technology risk burnout. 

In computer-assisted colonoscopy, suspected polyps are flagged on screen with alarm boxes; a false positive is an alert on anything other than a polyp, such as a bubble or a fold. Some false positives are inevitable, but too many can be a problem. What’s more, variation in how computer-aided polyp detection systems define a false-positive result can dramatically affect the number of such findings. Researchers say they may have identified an ideal threshold, which could help in future comparisons of these systems.

“Artificial intelligence in the evaluation of colorectal polyps is in its early stages, so we want to find ways to minimize certain effects, one of which is endoscopist burnout,” Erik Holzwanger, MD, a gastroenterology fellow at Tufts Medical Center, in Boston, told Gastroenterology & Endoscopy News. “If you keep seeing boxes light up and it’s not clear whether it’s a polyp or stool, which can happen if you don’t have a true definition of a false positive, your procedure can be much longer and more repetitive and add to burnout,” he said.

To study how the diagnostic performance of computer-aided polyp detection varies with the definition of a false positive, Dr. Holzwanger and his colleagues applied a previously validated system to videos of 62 colonoscopies performed between September 2016 and March 2017.

They stratified the videos into three groups using different parameters for false-positive alerts, each defined by how long the computer continuously traced an alarm box around what it suspected to be a lesion. The threshold was more than 0.5 seconds in group 1, one second or more in group 2, and two seconds or more in group 3.
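To illustrate how a duration threshold changes the false-positive count, here is a minimal Python sketch. It assumes every alert can be reduced to a duration and a reviewer's verdict, and it applies the three thresholds to one shared set of alerts. The `Alert` class, the `count_false_positives` function, and the sample values are hypothetical; they are not the study's software or data.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    duration_s: float  # seconds the alarm box was continuously traced
    is_polyp: bool     # reviewer's verdict (hypothetical field)


def count_false_positives(alerts, threshold_s, inclusive=True):
    """Count non-polyp alerts whose duration meets the threshold."""
    def meets(duration):
        if inclusive:
            return duration >= threshold_s
        return duration > threshold_s

    return sum(1 for a in alerts if not a.is_polyp and meets(a.duration_s))


# Made-up alerts for illustration only; not data from the study.
alerts = [
    Alert(0.6, False), Alert(0.8, False), Alert(1.2, False),
    Alert(2.5, False), Alert(3.0, True), Alert(0.7, True),
]

# Thresholds as described in the article: >0.5 s, >=1 s, >=2 s.
definitions = [
    ("group 1 (> 0.5 s)", 0.5, False),
    ("group 2 (>= 1 s)", 1.0, True),
    ("group 3 (>= 2 s)", 2.0, True),
]

for label, threshold, inclusive in definitions:
    print(label, count_false_positives(alerts, threshold, inclusive))
```

With these invented alerts the count drops from four to two to one as the threshold lengthens, because brief alerts no longer qualify as false positives.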

The number of false positives fell precipitously as the threshold lengthened: There were 111 false positives in group 1, 23 in group 2, and three in group 3. Specificity and accuracy improved as well, from 93% and 98%, respectively, in group 1 to 98.6% and 99.5% in group 2, and nearly 100% in group 3.

Greater rates of false positives were associated with bowel preparation scores of fair and poor.

Presenting the study at the 2020 virtual annual meeting of the American College of Gastroenterology (abstract P1182), Dr. Holzwanger said the analysis reveals the impact that different threshold definitions for false positives have on the reported diagnostic performance of computer-aided detection systems in colonoscopy.

“We suggest that a greater than two-second false-positive threshold is a clinically practical benchmark for standardizing the interpretation of data for computer-aided colon polyp detection in colonoscopy,” he said.

Ultimately, Dr. Holzwanger said he hopes the use of artificial intelligence in colonoscopy will help increase the adenoma detection rate (ADR) by picking up polyps that endoscopists miss. “That’s what piqued my interest in this; we’re always trying to increase polyp detection and ADR.”

Prateek Sharma, MD, a professor of medicine at the University of Kansas School of Medicine, in Kansas City, and a leading authority in the field of AI in endoscopy, said clarifying the definition of a false positive for detection studies is an important topic.

“Like this study evaluating patients undergoing AI colonoscopy, we’ll have to clarify false-positive definitions for future studies in esophageal and gastric diseases,” Dr. Sharma said. “An evidence-based and consensus-based approach to this will be ideal.”


—Monica J. Smith 



Dr. Sharma is a member of the editorial board of Gastroenterology & Endoscopy News.