ChatGPT Agent: When AI Starts to Work, Not Just Talk

July 18, 2025 | Erik Seidel | AI assistant | ChatGPT Agent | OpenAI

OpenAI launches ChatGPT Agent – an AI that books, researches, and acts across the web and files. Full user control, real-world tasks. Learn what it can do.

The latest evolution of artificial intelligence no longer stops at conversation. With the launch of its new ChatGPT Agent, OpenAI has introduced a system that not only understands language but also performs real-world digital tasks — from booking dinner reservations to managing spreadsheets and conducting market research. As G.Business reports, citing OpenAI’s press release, the Agent marks a critical step toward fully interactive and action-capable AI tools.

Unlike previous iterations of ChatGPT, which focused on text-based interactions, the Agent has been designed to act as well as reason. It can operate a web browser, extract data from websites, run code, fill out forms, generate editable slide decks, and dynamically analyze business information — all within a single conversational flow.

From Passive Assistant to Active Operator

The ChatGPT Agent blends capabilities previously split across two separate tools: "Operator," which could browse and interact with web pages, and "Deep Research," which synthesized complex online information. The new system combines these with a virtual computer, API access, and even terminal-level control, allowing it to complete entire workflows — autonomously and efficiently.

In a demonstration shared by OpenAI, the Agent was tasked with finding available weeknight dinner reservations for restaurants rated above 4.3 stars, cross-referencing the user’s Google Calendar, and delivering a curated list of options. It completed the task in under 15 minutes.
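
The selection logic in that demonstration can be sketched in a few lines of Python. This is only an illustration of the filtering described above, not OpenAI's implementation: the `Slot` type, the `pick_slots` function, the two-hour dinner length, and the definition of "weeknight" as Monday through Thursday from 17:00 are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Slot:
    """A hypothetical open reservation slot."""
    restaurant: str
    rating: float
    start: datetime


def is_weeknight(dt: datetime) -> bool:
    # Assumption: "weeknight" means Monday (0) to Thursday (3), 17:00 or later.
    return dt.weekday() <= 3 and dt.hour >= 17


def overlaps(start: datetime, duration: timedelta, busy) -> bool:
    # busy is a list of (start, end) pairs taken from the user's calendar.
    end = start + duration
    return any(start < b_end and b_start < end for b_start, b_end in busy)


def pick_slots(slots, busy, min_rating=4.3, duration=timedelta(hours=2)):
    # Keep slots rated strictly above min_rating, on a weeknight,
    # that do not collide with an existing calendar entry.
    return [
        s for s in slots
        if s.rating > min_rating
        and is_weeknight(s.start)
        and not overlaps(s.start, duration, busy)
    ]
```

Calling `pick_slots(slots, busy)` with a list of candidate slots and the calendar's busy intervals returns only the slots that pass all three checks.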

Other use cases include:

Creating competitive analyses with editable presentations;
Reading and updating financial spreadsheets;
Scheduling meetings and searching for expert contacts;
Managing inbox summaries and available time slots for calls.

Importantly, users can interrupt, revise, or redirect the agent mid-task — and pick up exactly where they left off.

Performance: Benchmarks Against Humans and Machines

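OpenAI evaluated the ChatGPT Agent on a set of established benchmarks. It outperformed the GPT-4o/Copilot baseline on every benchmark listed below and beat trained human professionals on three of the four that report a human score. SpreadsheetBench is the exception: humans scored 71.3% there, against 45.5% for the Agent.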

Benchmark	ChatGPT Agent	GPT-4o/Copilot	Human Benchmark
DSBench – Data Analysis	89.9%	64.1%	87.9%
DSBench – Data Modeling	85.5%	65.0%	77.1%
SpreadsheetBench	45.5%	20.0%	71.3%
Investment Banking Modeling	71.3%	48.6%	–
WebArena – Web Navigation	78.2%	62.9%	65.4%
BrowseComp – Web Search	68.9%	51.5%	–
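
To make the comparison with the human baseline explicit, the short Python sketch below computes the Agent's margin in percentage points for each benchmark that reports a human score. The figures are copied from the table above; the dictionary layout and the function name `margin_over_human` are illustrative choices, not part of OpenAI's reporting.

```python
# (agent, baseline model, human) scores in percent; None means no human score reported.
RESULTS = {
    "DSBench - Data Analysis": (89.9, 64.1, 87.9),
    "DSBench - Data Modeling": (85.5, 65.0, 77.1),
    "SpreadsheetBench": (45.5, 20.0, 71.3),
    "Investment Banking Modeling": (71.3, 48.6, None),
    "WebArena - Web Navigation": (78.2, 62.9, 65.4),
    "BrowseComp - Web Search": (68.9, 51.5, None),
}


def margin_over_human(results):
    # Positive values mean the Agent scored above the human baseline.
    return {
        name: round(agent - human, 1)
        for name, (agent, _model, human) in results.items()
        if human is not None
    }


if __name__ == "__main__":
    for name, margin in margin_over_human(RESULTS).items():
        print(f"{name}: {margin:+.1f} points")
```

The margins are positive for both DSBench tasks and for WebArena, and negative for SpreadsheetBench.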

These results suggest that the Agent is not just a digital helper — it's a competent, multitasking operator across a wide range of cognitive and practical domains.

Built-In Risk Management

With increased autonomy comes greater risk. OpenAI has acknowledged that the ChatGPT Agent presents "more risks than previous models," and has implemented several safeguards:

Explicit user approval is required before executing irreversible or sensitive actions, such as form submissions or purchases.
Supervised mode enables step-by-step control during tasks like email drafting or interacting with confidential systems.
The system actively rejects high-risk requests, including financial transactions and legal filings.
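
One way to picture how these safeguards fit together is as a gate in front of every action. The Python sketch below is a simplified illustration under stated assumptions, not a description of OpenAI's actual mechanism: the action names, the `Risk` levels, and the `run_action` function are all hypothetical.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    SENSITIVE = "sensitive"   # needs explicit user approval
    BLOCKED = "blocked"       # always rejected


# Hypothetical action names, grouped according to the safeguards listed above.
SENSITIVE_ACTIONS = {"submit_form", "purchase"}
BLOCKED_ACTIONS = {"financial_transaction", "legal_filing"}


def classify(action: str) -> Risk:
    if action in BLOCKED_ACTIONS:
        return Risk.BLOCKED
    if action in SENSITIVE_ACTIONS:
        return Risk.SENSITIVE
    return Risk.LOW


def run_action(action: str, execute, ask_user) -> str:
    """Run `execute(action)` only if the gate allows it.

    `ask_user(action)` should return True when the user approves.
    """
    risk = classify(action)
    if risk is Risk.BLOCKED:
        return "rejected"
    if risk is Risk.SENSITIVE and not ask_user(action):
        return "declined"
    execute(action)
    return "done"
```

In this sketch a low-risk action runs immediately, a sensitive one runs only after `ask_user` returns True, and a blocked one never reaches `execute`.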

To prevent manipulation, the Agent is trained to recognize and resist so-called prompt injection attacks, in which malicious instructions are hidden inside seemingly harmless web content or documents that the Agent reads while working.

No Ads, But Monetization Looms

OpenAI has emphasized that the ChatGPT Agent does not include sponsored content or paid product placements. However, industry analysts like Niamh Burns of Enders Analysis caution that monetization pressure could eventually lead to some form of commercial influence.

"It’s easy to say the system won’t recommend products for profit today," said Burns. "But what about tomorrow — especially as the AI becomes more embedded in shopping and decision-making workflows?"

OpenAI CEO Sam Altman has previously stated the company may charge a commission — for example, a 2% fee on transactions completed via its research tools — though no formal plans have been confirmed for the Agent.

Availability and Outlook

The ChatGPT Agent is now rolling out to subscribers of ChatGPT Pro, Plus, and Team tiers, with support for Enterprise and Education accounts scheduled for later this month. The older Operator tool will be phased out within 30 days. Deep Research functionality is now fully integrated.

Looking ahead, OpenAI plans to expand the Agent's capabilities in areas such as advanced presentation generation, collaborative document creation, and automated reporting — all while maintaining a high degree of user control and data privacy.

“You’re always in control,” the company wrote in its announcement. “The Agent works for you — and only when you say so.”
