While ever-increasing computational power and the availability of big datasets have improved machine learning - the process by which computers analyze data, identify patterns and essentially teach themselves to perform a task without the direct involvement of a human programmer - important obstacles can still prevent such systems from being integrated into clinical decision making. These include the need for large, well-annotated datasets - previously developed imaging analysis systems capable of matching the performance of a physician were trained with more than 100,000 images - and the "black box" problem, the inability of such systems to explain how they arrived at a decision. The U.S. Food and Drug Administration requires any decision support system to provide data allowing users to review the reasons behind its findings.
"It is somewhat paradoxical to use the words 'small data' or 'explainable' to describe a study that used deep learning," says Hyunkwang Lee, a graduate student at the Harvard School of Engineering and Applied Sciences, one of the two lead authors of the study. "However, in medicine, it is especially hard to collect high-quality big data. It is critical to have multiple experts label a dataset to ensure consistency of data. This process is very expensive and time-consuming."
Co-lead author Sehyo Yune, MD, of MGH Radiology adds, "Some critics suggest that machine learning algorithms cannot be used in clinical practice, because the algorithms do not provide justification for their decisions. We realized that it is imperative to overcome these two challenges to facilitate the use in health care of machine learning, which has an immense potential to improve the quality of and access to care."
To train their system, the MGH team began with 904 head CT scans, each consisting of around 40 individual images, which a team of five MGH neuroradiologists labeled as showing one of five hemorrhage subtypes, based on location within the brain, or no hemorrhage. To improve the accuracy of this deep-learning system, the team - led by senior author Synho Do, PhD, director of the MGH Radiology Laboratory of Medical Imaging and Computation and an assistant professor of Radiology at Harvard Medical School - built in steps mimicking the way radiologists analyze images. These include adjusting factors such as contrast and brightness to reveal subtle differences that are not immediately apparent, and scrolling through adjacent CT slices to determine whether something that appears on a single image is a real abnormality or a meaningless artifact.
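The article summarizes these preprocessing steps only in words. The short Python sketch below illustrates the two ideas - intensity windowing (the contrast/brightness adjustment) and stacking adjacent slices - under assumed parameters; the 40/80 brain window and the three-slice stack are illustrative choices, not the study's published settings.

```python
import numpy as np

def window_ct(slice_hu, center, width):
    """Map a CT slice from Hounsfield units to [0, 1] for a given window
    center/width, mimicking how a radiologist adjusts brightness and
    contrast to bring out subtle density differences."""
    low = center - width / 2.0
    high = center + width / 2.0
    return np.clip((slice_hu - low) / (high - low), 0.0, 1.0)

def stack_neighbors(volume_hu, i, center=40.0, width=80.0):
    """Combine a slice with its two neighbors (like scrolling through
    adjacent images) so a model can tell a finding that persists across
    slices from a single-slice artifact."""
    prev_i = max(i - 1, 0)
    next_i = min(i + 1, volume_hu.shape[0] - 1)
    windowed = [window_ct(volume_hu[j], center, width) for j in (prev_i, i, next_i)]
    return np.stack(windowed, axis=-1)  # shape (H, W, 3), analogous to an RGB image

# Toy volume standing in for one head CT scan of roughly 40 slices
volume = np.random.uniform(-1000, 1000, size=(40, 512, 512)).astype(np.float32)
x = stack_neighbors(volume, i=20)  # model input built around slice 20
print(x.shape)  # (512, 512, 3)
```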
Once the model was created, the investigators tested it on two separate sets of CT scans - a retrospective set, taken before the system was developed, consisting of 100 scans with and 100 without intracranial hemorrhage, and a prospective set of 79 scans with and 117 without hemorrhage, taken after the model was created. On the retrospective set, the model was as accurate in detecting and classifying intracranial hemorrhages as the radiologists who had reviewed the scans. On the prospective set, it performed even better than non-expert human readers.
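For readers unfamiliar with how such head-to-head comparisons are typically scored, here is a minimal Python sketch of sensitivity and specificity computed on a labeled test set. The counts mirror the prospective set's composition (79 scans with hemorrhage, 117 without), but the predictions are invented for illustration and are not the study's results.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity: fraction of hemorrhage scans correctly flagged.
    Specificity: fraction of hemorrhage-free scans correctly cleared."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fn = np.sum(y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    return tp / (tp + fn), tn / (tn + fp)

# Placeholder labels and predictions (not the study's data)
y_true = np.array([1] * 79 + [0] * 117)
y_pred = np.array([1] * 75 + [0] * 4 + [0] * 112 + [1] * 5)
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```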
To solve the "black box" problem, the team had the system review and save the images from the training dataset that most clearly represented the classic features of each of the five hemorrhage subtypes. Using this atlas of distinguishing features, the system is able to display a group of images similar to those of the CT scan being analyzed in order to explain the basis of its decisions.
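The article describes this atlas lookup only at a high level. One plausible way to implement such an explanation step is nearest-neighbor retrieval over the network's feature embeddings, sketched below; the embedding dimensionality, atlas size, and labels here are placeholder assumptions rather than details taken from the paper.

```python
import numpy as np

def cosine_similarity(query, atlas):
    """Cosine similarity between one query embedding and a matrix of
    atlas embeddings (one row per saved training image)."""
    query = query / np.linalg.norm(query)
    atlas = atlas / np.linalg.norm(atlas, axis=1, keepdims=True)
    return atlas @ query

def explain(query_embedding, atlas_embeddings, atlas_labels, k=3):
    """Return the k atlas images most similar to the scan being analyzed,
    so the system can show which classic examples its decision resembles."""
    sims = cosine_similarity(query_embedding, atlas_embeddings)
    top = np.argsort(sims)[::-1][:k]
    return [(int(i), atlas_labels[i], float(sims[i])) for i in top]

# Placeholder atlas: 500 saved training-image embeddings with subtype labels
rng = np.random.default_rng(0)
atlas_embeddings = rng.normal(size=(500, 128))
atlas_labels = rng.choice(
    ["epidural", "subdural", "subarachnoid", "intraparenchymal", "intraventricular"],
    size=500,
)
query = rng.normal(size=128)  # embedding of the scan under review
for idx, label, score in explain(query, atlas_embeddings, atlas_labels):
    print(f"atlas image {idx}: {label} (similarity {score:.2f})")
```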
"Rapid recognition of intracranial hemorrhage, leading to prompt appropriate treatment of patients with acute stroke symptoms, can prevent or mitigate major disability or death," says co-author Michael Lev, MD, MGH Radiology. "Many facilities do not have access to specially trained neuroradiologists - especially at night or over weekends - which can require non-expert providers to determine whether or not a hemorrhage is the cause of a patient's symptoms. The availability of a reliable, 'virtual second opinion' - trained by neuroradiologists - could make those providers more efficient and confident and help ensure that patients get the right treatment."
Co-author Shahein Tajmir, MD, MGH Radiology adds, "In addition to providing that much needed virtual second opinion, this system also could be deployed directly onto scanners, alerting the care team to the presence of a hemorrhage and triggering appropriate further testing before the patient is even off the scanner. The next step will be to deploy the system into clinical areas and further validate its performance with many more cases. We are currently building a platform to allow for the widespread application of such tools throughout the department. Once we have this running in the clinical setting, we can evaluate its impact on turnaround time, clinical accuracy and the time to diagnosis."
Hyunkwang Lee, Sehyo Yune, Mohammad Mansouri, Myeongchan Kim, Shahein H. Tajmir, Claude E. Guerrier, Sarah A. Ebert, Stuart R. Pomerantz, Javier M. Romero, Shahmir Kamalian, Ramon G. Gonzalez, Michael H. Lev, Synho Do. An explainable deep-learning algorithm for the detection of acute intracranial haemorrhage from small datasets. Nature Biomedical Engineering (2018). doi: 10.1038/s41551-018-0324-9.