The first time I heard about AI agents, I thought they could monitor your computer use, anticipate your needs, and manipulate your behavior accordingly. This wasn't entirely off base. Experts issue regular warnings about the dystopic future that AI technology could enable. There's also the present reality of agentic AI, which is here and clumsier than you would have guessed.
Last month, OpenAI released something called Operator. It's what experts would call an AI agent, meaning a version of AI technology that can not only recall information and generate content, like ChatGPT, but can also actually do things. In the case of Operator, the AI can use a web browser to do anything from buying your groceries to updating your LinkedIn profile. At least in theory. Operator is also currently a "research preview" that's only available to ChatGPT Pro users, who pay $200 a month for the privilege. (Disclosure: Vox Media is one of several publishers that have signed partnership agreements with OpenAI. Our reporting remains editorially independent.)
The reality is that, in its current form, Operator is not great at doing things.
"I'm very optimistic about using AI as sort of a dumb assistant, in that I don't want it to make decisions for me," said Aditi Raghunathan, an assistant professor of computer science at Carnegie Mellon University. "I don't trust it to do things better than me."
The basic concept of an AI agent is simultaneously alluring and horrific. Who wouldn't want an AI to handle mundane computer chores? But if the AI can use a computer to do boring things, you have to imagine it can do scary things, too. For now, for people like you and me, scary things include buying expensive eggs or briefly screwing up your presence on the world's largest network for professionals. For the economy as a whole, well, it depends on how much we trust AI and how much freedom we give it to operate unchecked.
Global leaders gathered for the Paris AI Action Summit this week to discuss the future of the technology. Past summits in Bletchley Park, famous for its code-breaking computer used in World War II, and Seoul focused on AI safety, including the kinds of regulations governments should adopt in order to keep AI in check. But this meeting seemed to highlight a growing sense of competition between global powers, namely the US and China, to win the AI arms race. Vice President JD Vance was in attendance and said, "The AI future is not going to be won by hand-wringing about safety."
So now I'm feeling a little nervous. While OpenAI's entry into the AI agent space currently feels like a parlor trick, I have to wonder what the industry's endgame is here. AI could usher in a friendly future of digital assistants who make our lives easier without any negative consequences. Or it could finally realize the paper-clip scenario, in which we give AI free rein to solve one problem, like making paper clips, and it diverts all global resources toward that problem, destroying humanity in the process.
The future will almost certainly be something in between the best- and worst-case scenarios. In any case, plenty of experts say fully autonomous agents should never be invented. I have to say, if the AI agents of the future are as clumsy as Operator is right now, I'm not too worried.
Whether you like it or not, the next wave of AI technology will involve computers using computers. It's already happening. In the big agriculture industry, for example, farmers are already handing over the keys to their John Deere tractors to AI-powered software that can work through the night. Others, like the global development nonprofit Digital Green, are giving farmers in developing countries access to Operator so that it can help lower costs and improve crop yields.
Another arresting example of AI agents in action is also a pretty boring one, which tells you something about how this technology can be most useful. Rekki, a startup in London, recently told Bloomberg that it sells access to AI agents that are trained to help restaurants and their suppliers streamline inventory management. A restaurant, for instance, could give the chatbot a long list of ingredients it uses and have it make sure everything is ordered on time. It works well enough that some companies are cutting staff and paying for the software instead.
Enter AI-curious consumers, like me, with problems to solve. If you pay $200 a month, you gain access to a user-friendly version of Operator that looks and acts a lot like ChatGPT. While it currently works as a separate app on ChatGPT's website, OpenAI ultimately plans to integrate Operator into ChatGPT for a seamless experience. Interacting with Operator is already a lot like using ChatGPT: You get Operator to do tasks by typing prompts into a familiar-looking empty box. Then things get interesting. Operator opens up a tiny browser window and starts doing the task. You can watch it try and fail in real time.
In its current form, Operator amounts to a painfully slow way to use Google (or rather Bing, thanks to OpenAI's partnership with Microsoft). It can do tasks for you while you're doing something else, but like ChatGPT before it, you always have to check Operator's work. I asked it to find me the cheapest flights for a weekend visit to my mom's house in Tennessee, and it returned a two-week-long itinerary that cost double what I'd expect to pay. When I explained the error, Operator did it again but worse.
Operator is, in many ways, a mirage. It looks like a proof of concept that AI can not just generate text and images but actually perform tasks autonomously, making your life effortless in the process. But the more you ask the agent to do, the more agency it requires.
This is a big conundrum for the future of AI development. When you put guardrails on tools (not letting Operator go wild with your credit card, for instance), you constrain their utility. If you give a tool more power to make decisions and operate independently, it may be more useful but also more dangerous.
Which brings us back to the paper-clip problem. First popularized by philosopher Nick Bostrom in 2003, the paper-clip scenario imagines giving a superintelligent AI the task of manufacturing paper clips, and the freedom to do so unchecked. It doesn't end well for humans, which is a stark reminder that responsible AI development is not just about preventing an AI from using your credit card without permission. The stakes are much higher.
This sort of thing is what global leaders were discussing in Paris this week. The consensus from the AI Summit, however, was not encouraging, if you care about the future of the human race. Vance called for "unparalleled R&D investments" into AI and for "international regulatory regimes that fosters the creation of AI technology rather than strangles it." This reflects the same anti-guardrail principles that were in the executive order President Donald Trump signed in January revoking President Joe Biden's plan for safe and responsible AI development.
For the Trump administration, at least, the goal for AI development seems to be growth and dominance at all costs. But it's not clear that the companies developing this technology, including OpenAI, feel the same way. Many of the limitations I found in Operator, for instance, were imposed by its creators. The AI agent's slow-moving, second-guessing nature made it less useful, but also more approachable and safe.
A version of this story was also published in the Vox Technology newsletter.