Ethical Challenges Loom Large in Implementing Machine Learning in Health Care
A recent perspective article raises concerns and proposes solutions regarding the use of machine learning in clinical medicine.
Machine learning holds promise for improving health care, especially in fields such as radiology and anatomical pathology, which require detailed image analysis. But the benefits will not be realized without first addressing thorny ethical challenges, according to a perspective article published March 15 in the New England Journal of Medicine.
“I think there are things humans will prove to do better than AI/ML [artificial intelligence/machine learning] systems,” said first author Danton Char, an assistant professor of anesthesiology at the Stanford University Medical Center in California. “Right now, certainly the ability to step aside from an idea and ask abstract questions like ‘Is this a good idea?’ is something humans can do that machine systems can't.”
The authors are particularly concerned about potential biases inherent in training data, the profit-driven design of clinical decision support systems, and the premature incorporation of controversial health care practices into algorithms. They also noted that machine learning tools could take on unintended power and authority in clinical decision-making, potentially changing the nature of the physician-patient relationship by eroding trust, respect, goodwill, confidentiality, and personal responsibility.
“What we hope to accomplish is that we don't fear AI, and we also don't rush in foolishly,” said co-author Nigam Shah, an associate professor of medicine at Stanford University. “Well-intended but premature use of machine learning has led to problems in other domains. We should learn from those missteps and do things differently in medicine.”
Private-sector designers who create machine learning systems for clinical use could be subject to familiar temptations, the researchers point out: the same kind that led Uber to develop a software tool for predicting which ride hailers might be undercover law enforcement officers, and Volkswagen to build an algorithm that allowed vehicles to pass emissions tests. Given the growing role of quality indicators in public evaluations and in determining reimbursement rates, there may be a temptation to train machine learning systems to guide users toward clinical actions that improve quality metrics but do not necessarily reflect better care.
“The way that the black box of medical AI can obscure improper motives makes it easier to hide those motives, and can shield debates we should otherwise have, in particular, profit versus care quality, and social versus individual benefits,” said Nicholson Price, an assistant professor of law at the University of Michigan Law School in Ann Arbor who was not an author on the perspective. “That makes enforcement harder, of course.”
To address these challenges, the authors recommend that ethical standards be built into machine learning systems, with appropriate enforcement strategies put in place, and that physicians become more educated about how these tools are constructed. “Physicians as a group right now don't know the limitations of ML, and there is a danger in treating ML and AI like a black box and not working closely with medical algorithm designers,” Char said. “It could lead to both overreliance on ML/AI recommendations without discerning use of clinical judgement or underreliance on ML/AI without drawing on the potential benefits.”
The concerns raised in the article could be addressed with careful oversight mechanisms, according to Alex London, professor of ethics and philosophy at Carnegie Mellon University in Pittsburgh, Pennsylvania, who is not one of the article’s authors. “This highlights the importance of ensuring that there is a mechanism for independent verification of the accuracy of such systems and for independent audit and public assessment of the assumptions that are built into their models,” he said. “The question is whether there will be the political will to put such mechanisms in place.”
Additional hurdles may make it difficult to implement the authors’ recommendations, according to Philipp Kellmeyer, a neurologist and postdoctoral researcher at the University of Freiburg Medical Center in Germany. For example, it remains unclear who should develop guidelines on the use of machine learning in clinical decision support systems. Moreover, it may be nearly impossible to fully de-bias machine learning systems, and there is no single medical ethics code, nor an obvious way to build algorithms for moral learning, he noted.
“My intuition is that this will remain a crucial challenge for software engineering, and ethical theory for that matter, for the foreseeable future, which means that the morality and ethics for the time being will remain the responsibility of the doctors and health care professionals,” Kellmeyer said.
In the end, the article might expand the currently small group of people thinking deeply about the legal and ethical issues of AI in health care, Price said. “It also has the potential to raise the profile of AI in medicine for health care providers and policymakers, where it’s been flying more under the radar,” he added.
Machine learning applications in health care are already accelerating, and there is no reason why this trend won’t continue, London said. “When ML applications are used thoughtfully, they hold out the potential to improve diagnostic and prognostic accuracy,” he said. “Although we have to be careful to make sure that the promise of ML systems is not exaggerated and we have to protect against snake oil, we also have to be prepared to use such systems where we can establish their clinical value, even if this means displacing some human judgment from the clinic.”