“Already, we are being told that A.I. is making coders and customer service representatives and writers more productive. At least one chief executive plans to add ChatGPT use in employee performance evaluations. But I’m skeptical of this early hype. It is measuring A.I.’s potential benefits without considering its likely costs — the same mistake we made with the internet.
I worry we’re headed in the wrong direction in at least three ways.
One is that these systems will do more to distract and entertain than to focus. Right now, the large language models tend to hallucinate information: Ask them to answer a complex question, and you will receive a convincing, erudite response in which key facts and citations are often made up. I suspect this will slow their widespread use in important industries much more than is being admitted, akin to the way driverless cars have been tough to roll out because they need to be perfectly reliable rather than just pretty good.
A question to ask about large language models, then, is where does trustworthiness not matter? Those are the areas where adoption will be fastest. An example from media is telling, I think. CNET, the technology website, quietly started using these models to write articles, with humans editing the pieces. But the process failed. Forty-one of the 77 A.I.-generated articles proved to have errors the editors missed, and CNET, embarrassed, paused the program. BuzzFeed, which recently shuttered its news division, is racing ahead with using A.I. to generate quizzes and travel guides. Many of the results have been shoddy, but it doesn’t really matter. A BuzzFeed quiz doesn’t have to be reliable.
A.I. will be great for creating content where reliability isn’t a concern. The personalized video games and children’s shows and music mash-ups and bespoke images will be dazzling. And new domains of delight and distraction are coming: I believe we’re much closer to A.I. friends, lovers and companions becoming a widespread part of our social lives than society is prepared for. But where reliability matters — say, a large language model devoted to answering medical questions or summarizing doctor-patient interactions — deployment will be more troubled, as oversight costs will be immense. The problem is that those are the areas that matter most for economic growth. [..]
One lesson of the digital age is that more is not always better. More emails and more reports and more Slacks and more tweets and more videos and more news articles and more slide decks and more Zoom calls have not led, it seems, to more great ideas. “We can produce more information,” [Gloria] Mark [, a professor of information science at the University of California, Irvine, and the author of “Attention Span,”] said. “But that means there’s more information for us to process. Our processing capability is the bottleneck.”
Email and chat systems like Slack offer useful analogies here. Both are widely used across the economy. Both were initially sold as productivity boosters, allowing more communication to take place faster. And as anyone who uses them knows, the productivity gains — though real — are more than matched by the cost of being buried under vastly more communication, much of it junk and nonsense.
The magic of a large language model is that it can produce a document of almost any length in almost any style, with a minimum of user effort. Few have thought through the costs that will impose on those who are supposed to respond to all this new text. One of my favorite examples of this comes from The Economist, which imagined NIMBYs — but really, pick your interest group — using GPT-4 to rapidly produce a 1,000-page complaint opposing a new development. Someone, of course, will then have to respond to that complaint. Will that really speed up our ability to build housing? [..]
Jonathan Frankle, the chief scientist at MosaicML and a computer scientist at Harvard, described this to me as the “boring apocalypse” scenario for A.I., in which “we use ChatGPT to generate long emails and documents, and then the person who received it uses ChatGPT to summarize it back down to a few bullet points, and there is tons of information changing hands, but all of it is just fluff. We’re just inflating and compressing content generated by A.I.” [..]
My third concern is related to that use of A.I.: Even if those summaries and drafts are pretty good, something is lost in the outsourcing. Part of my job is reading 100-page Supreme Court documents and composing crummy first drafts of columns. It would certainly be faster for me to have A.I. do that work. But the increased efficiency would come at the cost of new ideas and deeper insights.
Our societywide obsession with speed and efficiency has given us a flawed model of human cognition that I’ve come to think of as the Matrix theory of knowledge. Many of us wish we could use the little jack from “The Matrix” to download the knowledge of a book (or, to use the movie’s example, a kung fu master) into our heads, and then we’d have it, instantly. But that misses much of what’s really happening when we spend nine hours reading a biography. It’s the time inside that book spent drawing connections to what we know and having thoughts we would not otherwise have had that matters.
“Nobody likes to write reports or do emails, but we want to stay in touch with information,” Mark said. “We learn when we deeply process information. If we’re removed from that and we’re delegating everything to GPT — having it summarize and write reports for us — we’re not connecting to that information.”
We understand this intuitively when it’s applied to students. No one thinks that reading the SparkNotes summary of a great piece of literature is akin to actually reading the book. And no one thinks that if students have ChatGPT write their essays, they have cleverly boosted their productivity rather than lost the opportunity to learn. The analogy to office work is not perfect — there are many dull tasks worth automating so people can spend their time on more creative pursuits — but the dangers of overautomating cognitive and creative processes are real.
These are old concerns, of course. Socrates questioned the use of writing (recorded, ironically, by Plato), worrying that “if men learn this, it will implant forgetfulness in their souls; they will cease to exercise memory because they rely on that which is written, calling things to remembrance no longer from within themselves but by means of external marks.” I think the trade-off here was worth it — I am, after all, a writer — but it was a trade-off. Human beings really did lose faculties of memory we once had.
To make good on its promise, artificial intelligence needs to deepen human intelligence. And that means human beings need to build A.I., and build the workflows and office environments around it, in ways that don’t overwhelm and distract and diminish us. We failed that test with the internet. Let’s not fail it with A.I.”
Full editorial: Ezra Klein, The New York Times, May 28, 2023.