In the operating room, a split-second decision can mean the difference between life and death. Increasingly, those decisions are being shaped by reasoning that human practitioners cannot interpret. More and more, they are the product of “black-box” algorithms, one of the newest ways AI is entering the medical field and a development that raises concerns about the trade-off between transparency and accuracy.
“Black-box” algorithms are, in this context, deep learning systems whose internal decision-making processes are not accessible or interpretable to human beings. In some cases, this opacity is deliberate: the system is obscured to protect intellectual property. More often, however, it is an inevitable consequence of their structure, as deep learning models can contain dozens or even hundreds of layers and millions or billions of learned parameters. As a result, these models become so complex that even their creators cannot fully explain how particular outputs are produced.
Despite this, there is a growing push to bring “black-box” AI systems into the operating room, largely because, in many cases, they appear to outperform more interpretable models. This apparent advantage is usually attributed to machine learning’s capacity to integrate real-time physiological data with historical clinical data and to model complex, non-linear relationships that simpler, rule-based systems cannot capture. However, this enthusiasm has also drawn criticism, particularly of claims that “black-box” algorithms can be made genuinely “interpretable” by layering additional explanatory tools on top of them.
It’s clear that before “black-box” algorithms can be safely entrusted with high-stakes clinical decisions, several serious concerns need to be addressed, chief among them bias. Although deep learning is often promoted as a way to eliminate human bias from medicine, that is not always the case. The first step in developing and training these models is collecting vast datasets, and those datasets not only reflect human biases but are also frequently imbalanced.
One of the most troubling consequences of this imbalance is the underrepresentation of certain patient groups. For example, many training datasets overrepresent non-Hispanic Caucasian patients, and more than half of published clinical AI models rely on data from either the United States or China. As a result, these algorithms tend to favor average trends and perform poorly for underrepresented groups, a limitation that becomes especially dangerous in outlier cases.
This has deeply concerning implications, as biases in “black-box” decisions can compound and perpetuate longstanding health disparities. For example, a study of AI-based melanoma detection found that models trained primarily on light-skinned patients from the US, Europe, and Australia performed significantly worse on darker skin tones, reinforcing existing diagnostic disparities and undermining the promise of early detection for underrepresented groups. Similarly, a separate study that trained a mortality-prediction model on the MIMIC-III ICU dataset found that racial class imbalance sharply degraded performance for underrepresented groups, with recall dropping as low as 25%. Findings like these raise serious concerns about real-world clinical deployment.
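The gap between overall and subgroup performance described above can be illustrated with a small worked example. The sketch below computes recall (the share of actual positive cases the model catches) both overall and per group. The counts and the group labels "majority" and "minority" are invented purely for illustration and are not taken from either study; the function name `recall_by_group` is likewise hypothetical.

```python
from collections import defaultdict


def recall_by_group(records):
    """Return (overall_recall, {group: recall}).

    Each record is a (group, y_true, y_pred) tuple with 0/1 labels.
    Recall = true positives / all actual positives.
    """
    true_pos = defaultdict(int)
    actual_pos = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            actual_pos[group] += 1
            if y_pred == 1:
                true_pos[group] += 1
    per_group = {g: true_pos[g] / actual_pos[g] for g in actual_pos}
    overall = sum(true_pos.values()) / sum(actual_pos.values())
    return overall, per_group


# Invented counts (assumption, for illustration only):
# 90 positive cases in the majority group, 81 of them detected;
# 10 positive cases in the minority group, 3 of them detected.
records = (
    [("majority", 1, 1)] * 81
    + [("majority", 1, 0)] * 9
    + [("minority", 1, 1)] * 3
    + [("minority", 1, 0)] * 7
)

overall, per_group = recall_by_group(records)
print(f"overall recall: {overall:.2f}")
for group, value in per_group.items():
    print(f"{group} recall: {value:.2f}")
```

With these made-up numbers, the overall recall is 0.84 even though the model misses 7 of the 10 positive cases in the smaller group. A single headline metric can therefore look reassuring while concealing exactly the kind of subgroup failure the studies describe.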
Moreover, patients with rare or emerging diseases will face even greater barriers to advocating for themselves, as they will now have to contend not only with clinicians but also with algorithmic judgments. In many of these cases, there simply will not be enough relevant data, so the system will default to averages drawn from the general population, precisely where atypical patients are most likely to be misrepresented.
With these biases in mind, we need to ask how much authority “black-box” systems should have in high-stakes medical decisions and, more importantly, who is responsible when something goes wrong. Is it the engineer who designed the system, the hospital that approved its use, or the clinician who acted on the algorithm’s recommendation? If it becomes standard practice to defer to the AI’s recommendation, doctors who disagree with an algorithm may expose themselves to legal risk, even when their clinical judgment is sound.
This pressure will encourage algorithmic conformity, as physicians may become more inclined to trust the machine than the patient, especially under the influence of automation bias, the tendency to over-rely on automated recommendations. Over time, this dynamic raises a deeper concern: expertise can atrophy. If AI consistently suggests diagnoses, flags risks, and recommends treatments, clinicians may practice less independent reasoning and gradually lose their diagnostic sharpness.
For these reasons, the introduction of “black-box” systems into the operating room should proceed slowly and deliberately, with regulation keeping pace. This transition period should also be collaborative, with clinicians and patients providing feedback to developers, and developers remaining involved and accountable for as long as their systems shape medical decisions. At the same time, patients must be clearly informed, asked for their consent, and given a real ability to opt out, whether from AI-assisted decision-making or from having their data used to train these systems.
Because “black-box” algorithms are opaque and often change as they learn from new data, they pose a regulatory challenge unlike traditional medical devices: what is approved today may look very different a year from now. Instead of relying on one-time approval, oversight should focus on continuous validation in real clinical use, ensuring high-quality data and development practices, testing performance where possible, and closely tracking outcomes over time. A regulatory approach like this, combining disclosure, real-world feedback, and flexible standards, would better protect patients while still leaving room for innovation. But this only works if we address potential issues early and ensure both clinicians and patients remain actively engaged in the process.
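As a rough illustration of what continuous validation might look like in practice, the sketch below tracks a model’s agreement with confirmed outcomes over a sliding window of recent cases and flags the model for human review when that rate falls below a threshold. It is a minimal sketch under stated assumptions, not a regulatory standard: the class name `RollingAccuracyMonitor`, the window size, and the threshold are all hypothetical choices.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track prediction accuracy over the most recent `window` cases.

    Hypothetical helper for illustration; the window and threshold
    values are assumptions, not recommended clinical settings.
    """

    def __init__(self, window=100, threshold=0.85):
        # deque with maxlen silently drops the oldest entry when full
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether the prediction matched the confirmed outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Return accuracy over the current window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self):
        """Flag only once the window is full, to avoid noisy early alerts."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.threshold


monitor = RollingAccuracyMonitor(window=10, threshold=0.8)

# Ten correct predictions: the window is full and accuracy is 1.0.
for _ in range(10):
    monitor.record(predicted=1, actual=1)
stable = monitor.needs_review()

# Four wrong predictions push out four old ones: 6 of 10 are correct.
for _ in range(4):
    monitor.record(predicted=1, actual=0)
drifted = monitor.needs_review()

print(stable, drifted, monitor.accuracy())
```

A real monitoring program would need far more than this, including per-group breakdowns, clinically meaningful outcome measures, and a defined process for what happens after a flag is raised. The point is only that performance after approval can be measured continuously rather than assumed.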