OpenAI's Agent Has a Problem: Before It Does Anything Important, You Have to Double-Check It Hasn't Screwed Up

Behold Operator, OpenAI’s long-awaited agentic AI model that can use your computer and browse the web for you.

It’s supposed to work on your behalf, following the instructions it’s given like your very own little employee. Or “your own secretary” might be more apt: OpenAI’s marketing materials have focused on Operator performing tasks like booking tickets, making restaurant reservations, and creating shopping lists (though the company admits it still struggles with managing calendars, a major productivity task).

But if you think you can just walk away from the computer and let the AI do everything, think again: Operator will need to ask for confirmation before pulling the trigger on important tasks. That throws a wrench into the premise of an AI agent acting on your behalf, since the clear implication is that you need to make sure it’s not screwing up before allowing it any real power.

“Before finalizing any significant action, such as submitting an order or sending an email, Operator should ask for approval,” reads the safety section in OpenAI’s announcement.

This measure highlights the tension between keeping stringent guardrails on AI models and allowing them to freely exercise their purportedly powerful capabilities. How do you put out an AI that can do anything, without it doing anything stupid?

Right now, a limited preview of Operator is available only to subscribers of the ChatGPT Pro plan, which costs an eye-watering $200 per month.

The agentic tool uses its own AI model called Computer-Using Agent to interact with its virtual…
