Can A.I. Treat Mental Illness?: New computer systems aim to peer inside our heads—and to help us fix what they find there.

“Roughly one in five American adults has a mental illness. An estimated one in twenty has what’s considered a serious mental illness—major depression, bipolar disorder, schizophrenia—that profoundly impairs the ability to live, work, or relate to others. Decades-old drugs such as Prozac and Xanax, once billed as revolutionary antidotes to depression and anxiety, have proved less effective than many had hoped; care remains fragmented, belated, and inadequate; and the over-all burden of mental illness in the U.S., as measured by years lost to disability, seems to have increased. Suicide rates have fallen around the world since the nineteen-nineties, but in America they’ve risen by about a third. Mental-health care is “a shitstorm,” Thomas Insel, a former director of the National Institute of Mental Health, told me. “Nobody likes what they get. Nobody is happy with what they give. It’s a complete mess.” [..]

The treatment of mental illness requires imagination, insight, and empathy—traits that A.I. can only pretend to have. And yet, Eliza, which [MIT computer scientist Joseph] Weizenbaum named after Eliza Doolittle, the fake-it-till-you-make-it heroine of George Bernard Shaw’s “Pygmalion,” created a therapeutic illusion despite having “no memory” and “no processing power,” [author of “The Most Human Human” Brian] Christian writes. What might a system like OpenAI’s ChatGPT, which has been trained on vast swaths of the writing on the Internet, conjure? An algorithm that analyzes patient records has no interior understanding of human beings—but it might still identify real psychiatric problems. Can artificial minds heal real ones? And what do we stand to gain, or lose, in letting them try?

In 2013, the [V.A.] team started working on a program that would analyze V.A. patient data automatically, hoping to identify those at risk. In tests, the algorithm they developed flagged many people who had gone unnoticed in other screenings—a signal that it was “providing something novel,” [V.A.’s director of data and surveillance for suicide prevention John] McCarthy said. The algorithm eventually came to focus on sixty-one variables. Some are intuitive: for instance, the algorithm is likely to flag a widowed veteran with a serious disability who is on several mood stabilizers and has recently been hospitalized for a psychiatric condition. But others are less obvious: having arthritis, lupus, or head-and-neck cancer; taking statins or Ambien; or living in the Western U.S. can also add to a veteran’s risk.
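The article doesn’t describe REACH VET’s internals, but the general shape of such a model—score each patient on dozens of record-derived variables, then flag the highest scorers for outreach—can be sketched. Below is a minimal, hypothetical Python sketch using scikit-learn; the synthetic data, feature names, logistic-regression model, and top-one-per-cent cutoff are all illustrative assumptions, not the V.A.’s actual sixty-one variables or threshold.

```python
# Hypothetical sketch of a record-based risk-flagging model -- NOT the V.A.'s code.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000

# Toy patient records: each column stands in for the kinds of variables the
# article mentions (diagnoses, medications, recent hospitalization, region).
records = pd.DataFrame({
    "recent_psych_hospitalization": rng.integers(0, 2, n),
    "mood_stabilizer_count": rng.integers(0, 4, n),
    "widowed": rng.integers(0, 2, n),
    "arthritis": rng.integers(0, 2, n),
    "takes_ambien": rng.integers(0, 2, n),
    "lives_in_western_us": rng.integers(0, 2, n),
})

# Synthetic outcome standing in for a documented adverse event, used only so
# the example runs end to end.
logits = 0.9 * records["recent_psych_hospitalization"] + 0.4 * records["mood_stabilizer_count"] - 3.0
outcome = rng.binomial(1, 1 / (1 + np.exp(-logits)))

X_train, X_test, y_train, y_test = train_test_split(records, outcome, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Flag the highest-scoring slice of patients for monthly outreach, analogous to
# REACH VET contacting the top tier of risk scores (the cutoff here is arbitrary).
scores = model.predict_proba(X_test)[:, 1]
threshold = np.quantile(scores, 0.99)
flagged = X_test[scores >= threshold]
print(f"Flagged {len(flagged)} of {len(X_test)} patients for outreach")
```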

In 2017, the V.A. announced an initiative called REACH VET, which introduced the algorithm into clinical practice throughout its system. Each month, it flags about six thousand patients, some for the first time; clinicians contact them and offer mental-health services, ask about stressors, and help with access to food and housing. Inevitably, there is a strangeness to the procedure: veterans are being contacted about ideas they may not have had. The V.A. had “considered being vague—just saying, ‘You’ve been identified as at risk for a bunch of bad outcomes,’ ” McCarthy told me. “But, ultimately, we communicated rather plainly, ‘You’ve been identified as at risk for suicide. We wanted to check in and see how you’re doing.’ ”

Many veterans are isolated and financially insecure, and the safety nets meant to help them are too small. Jodie Trafton, who leads the V.A.’s evaluation center for mental-health programs, told me about one veteran identified by REACH VET who confirmed that he had had suicidal thoughts; he turned out to be sick, lonely, and broke. A social worker discovered that he’d received only a fraction of the financial support for which he was eligible—a single form stood between him and thousands of dollars in untapped benefits. The social worker helped him access the money, allowing him to move closer to his family, and potentially preventing a tragedy.

After the system’s implementation, psychiatric admissions fell by eight per cent among those whom the A.I. had identified as high risk, and documented suicide attempts in that group fell by five per cent. But REACH VET has not yet been shown to reduce suicide mortality. Among veterans, about two per cent of attempts are fatal; a very large or very targeted reduction in the number of attempts might be needed to avert deaths. It’s also possible that preventing deaths takes time—that frequent touchpoints, over years, are what drive suicide rates down across a population.
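A rough back-of-envelope calculation, combining the article’s figures with an assumed (hypothetical) number of attempts in the flagged group, suggests why a five-per-cent drop in attempts could be statistically invisible in mortality:

```python
# Back-of-envelope illustration of why fewer attempts may not show up in
# suicide mortality. The group size is assumed; the five-per-cent reduction
# and two-per-cent fatality rate come from the article.
attempts_per_year = 1_000      # assumed annual attempts in the flagged group
reduction = 0.05               # five per cent fewer documented attempts
fatality_rate = 0.02           # roughly two per cent of attempts are fatal

averted_attempts = attempts_per_year * reduction
deaths_averted = averted_attempts * fatality_rate
print(deaths_averted)          # about 1 death a year -- hard to detect statistically
```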

The design and implementation of an algorithm can be full of pitfalls and surprises. Ziad Obermeyer, a physician and a machine-learning researcher at the University of California, Berkeley, told me about one algorithm he had studied, not affiliated with the V.A., that aimed to figure out who in a patient population had substantial health needs and could use additional support. “We want algorithms to stratify patients based on their likelihood of getting sick,” Obermeyer said. “But, when you’re writing code, there’s no variable called ‘Got Sick.’ ” The algorithm’s designers needed a proxy for illness and settled on medical costs. (All things being equal, people who are sicker tend to use more health care.) Obermeyer found, however, that the algorithm dramatically underestimated how sick Black patients were, because the Black patients it examined had much lower health spending than the white patients, even when they were equally sick. Such algorithmic bias can occur not just by race, but by gender, age, rurality, income, and other factors of which we’re only dimly aware, making algorithms less accurate. Trafton told me that the V.A. is doing “a ton of work to make sure our models are optimized for various subpopulations”—in the future, she went on, REACH VET may have “a model for older adults, a model for women, a model for young men, et cetera.”
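The mechanism Obermeyer describes—a cost proxy that quietly encodes unequal access to care—can be reproduced on synthetic data. The sketch below is illustrative only; the groups, the spending gap, and the model are invented to show the effect, not to reconstruct the algorithm he studied.

```python
# Synthetic demonstration of proxy-label bias: two groups are equally sick,
# but one group's health spending is systematically lower, so a model trained
# to predict cost under-ranks that group's need.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 20_000
group_b = rng.integers(0, 2, n)           # 1 = group with suppressed spending
illness = rng.gamma(2.0, 1.0, n)          # true sickness, same distribution in both groups

# Observed spending tracks illness, but group B spends ~40% less when equally sick
# (standing in for unequal access to care).
spend_factor = np.where(group_b == 1, 0.6, 1.0)
prior_cost = illness * spend_factor + rng.normal(0, 0.2, n)
this_year_cost = illness * spend_factor + rng.normal(0, 0.2, n)

# The model never sees "illness" -- as Obermeyer puts it, there's no variable
# called "Got Sick" -- so it predicts future cost from prior utilization.
X = prior_cost.reshape(-1, 1)
model = LinearRegression().fit(X, this_year_cost)
risk_score = model.predict(X)

# Flag the top ten per cent by predicted cost for extra support.
flagged = risk_score >= np.quantile(risk_score, 0.90)
truly_sickest = illness >= np.quantile(illness, 0.90)

print("Share of group B among the truly sickest:", group_b[truly_sickest].mean().round(2))
print("Share of group B among the flagged:      ", group_b[flagged].mean().round(2))
# The flagged group under-represents group B, even though both groups are equally sick.
```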

Even fine-tuned algorithms have limitations. REACH VET can only assess veterans who use V.A. services. According to the agency, about twenty veterans die by suicide every day, and fewer than forty per cent of them have received V.A. care. Joshua Omvig, the Iowa soldier for whom Congress named its legislation, resisted when his family urged him to seek professional help; if REACH VET had existed at the time, it probably would not have reached him.

If the V.A. hired more therapists, it could see more patients. But it already employs more than twenty thousand mental-health professionals—and the wait to see one of them for routine care can last more than a month. The problem of scale is endemic in mental-health care, not least because, as Eliza’s boosters noted, therapy so often involves face-to-face, one-on-one sessions. In 2016, the United Kingdom, a wealthy country with universal health care, set a five-year goal of providing therapy to just one in four people who needed it. It failed; one British doctor called the initiative “overwhelmed, under-resourced, and impersonal.”

In 2013, in an effort to increase the scale of its mental-health treatment, the U.K.’s National Health Service contracted with Ieso, a digital-health company, to help therapists deliver cognitive behavioral therapy through text chat. More than a hundred thousand people in the U.K. have now used Ieso’s software to receive what the company calls “typed therapy.” Studies have shown that text-based therapy can work well. It also generates data. Ieso has used A.I. to analyze more than half a million therapy sessions, performing what Valentin Tablan, the company’s chief A.I. officer, described as “quantitative analyses of the conversations inside the therapy room.” [..]
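The article doesn’t detail Ieso’s methods, so what follows is only a toy illustration of what “quantitative analyses of the conversations” might involve: tag each therapist utterance with a coarse category and count the categories per session, producing features that could later be related to outcomes. The categories, keyword lists, and sample transcript are invented; a production system would presumably rely on trained language models rather than keyword matching.

```python
# Toy session-profiling sketch -- not Ieso's pipeline.
from collections import Counter

# Hypothetical category keywords; each utterance gets a single tag, first match wins.
CATEGORIES = {
    "agenda_setting": ["today we'll", "agenda", "plan for this session"],
    "cognitive_restructuring": ["evidence for", "alternative thought", "reframe"],
    "homework": ["between sessions", "practice", "homework"],
    "empathy": ["that sounds hard", "i'm sorry", "understandable"],
}

def tag_utterance(text: str) -> str:
    lowered = text.lower()
    for category, phrases in CATEGORIES.items():
        if any(phrase in lowered for phrase in phrases):
            return category
    return "other"

def session_profile(therapist_utterances: list[str]) -> Counter:
    """Count how often each kind of therapeutic move appears in a session."""
    return Counter(tag_utterance(u) for u in therapist_utterances)

transcript = [
    "Today we'll look at the thought you had before the panic started.",
    "What's the evidence for that thought, and is there an alternative thought?",
    "That sounds hard. Let's set some practice for between sessions.",
]
print(session_profile(transcript))
# Counter({'agenda_setting': 1, 'cognitive_restructuring': 1, 'homework': 1})
```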

I signed up for Woebot, and discovered that using the app could feel centering. I didn’t like that I often had to choose from pre-written replies; in one unfortunate instance, my only option was “Hehe.” But I sometimes found myself messaging Woebot in spare moments—waiting for an elevator, riding in an Uber, walking to the bathroom—the way I used to scroll through Twitter. Once, I told Woebot that I was feeling anxious about work. “Anxiety can be a real monster to handle,” it wrote back. “I’m sorry that you’re dealing with it.” Woebot gently inquired whether I wanted to work through my problem together, then asked, “Do you think this anxiety might be serving you in some way?” It pointed out that stress has its benefits: it could motivate someone to work harder.

Woebot: Now, it might seem like I’m trying to convince you to stay very anxious. I’m not—that’s entirely up to you

Me: What are you saying?

Woebot: I’m saying that in order to dial down your anxiety, you must first acknowledge the reasons why it might be helpful . . . and decide to give it up in spite of those reasons

I knew that I was talking to a computer, but in a way I didn’t mind. The app became a vehicle for me to articulate and examine my own thoughts. I was talking to myself. [..]

Almost certainly, the future will include bespoke L.L.M.s designed just for therapy: PsychGPT and the like. Such systems will reach people who aren’t getting help now—but any flaws they contain will be multiplied by the millions who use them. Companies will amass even more sensitive information about us than they already have, and that information may get hacked or sold. “When we have systems operating at enormous scale, a single point of failure can have catastrophic consequences,” the writer Brian Christian told me. It seems likely that we’ll be surprised by our A.I.s. Microsoft’s Bing chatbot, which is based on OpenAI’s technology, is designed to help users find information—and yet the beta version has also offered up ethnic slurs, described creepy fantasies, and told users that they are “bad,” “rude,” and “confused.” [..]

On my first day of medical school, I sat in a sunlit courtyard alongside dozens of uneasy students as professors offered advice from a lectern. I remember almost nothing of what they said, but I jotted down a warning from one senior doctor: the more clinical skills you gain, the easier it gets to dismiss the skills you had before you started—your compassion, your empathy, your curiosity. A.I. language models will only grow more effective at interpreting and summarizing our words, but they won’t listen, in any meaningful sense, and they won’t care. A doctor I know once sneaked a beer to a terminally ill patient, to give him something he could savor in a process otherwise devoid of pleasure. It was an idea that didn’t appear in any clinical playbook, and that went beyond words—a simple, human gesture.”

Full article: D. Khullar, February 27, 2023.