ChatGPT Agent: OpenAI’s Autonomous AI Revolution

On July 17, 2025 (July 18 Japan time), OpenAI announced “ChatGPT Agent,” marking a decisive turning point in AI technology history. This represents the first step in AI’s evolution from a mere conversational AI to an “executor” that autonomously plans and executes tasks based on user instructions. This article thoroughly examines the full scope of this revolutionary technology and its impact on our society.

1. Overview and Innovation of ChatGPT Agent

What is ChatGPT Agent?

“ChatGPT Agent” is a system in which the AI autonomously reasons, gathers and processes information, generates output, and carries out operations step by step once the user provides detailed instructions. It is as if ChatGPT had “its own virtual computer”: it browses websites, executes code, and integrates with other apps, handling tasks from start to finish.

Differences from Traditional ChatGPT and AI Agents

ChatGPT Agent integrates OpenAI’s previously separate “Operator” and “Deep Research” technologies, further combining ChatGPT’s conversational skills and advanced reasoning capabilities.

ChatGPT (Traditional Chatbot)
Its primary role was to generate text responses to user questions and instructions. While AI agents are “AI subordinates that work autonomously behind the scenes toward objectives,” traditional ChatGPT is likened to a “smart tool that only answers.”

Operator (Released January 2025)
Specialized in automatically operating web browsers and performing online tasks. It could execute restaurant reservations and online shopping but couldn’t perform deep analysis or report creation. Realized through the “Computer-Using Agent (CUA)” model based on GPT-4o, it was trained to analyze GUI (Graphical User Interface) and replicate mouse and keyboard operations.

Deep Research (Released February 2025)
Deeply investigates vast information on the web and generates expert-level research reports in just minutes. It can analyze and integrate hundreds of web sources and return detailed reports with citations, but it was web-browsing only and couldn’t take actions based on that information (like logging in to obtain additional information).

ChatGPT Agent combines the strengths of these tools so that each offsets the other’s weaknesses, evolving to “perform everything from deep analysis to document creation and actual web operations.” While traditional AI was “passive,” agents are “active AI” that independently plan, use tools, and complete tasks.

Main Functions and Capabilities

ChatGPT Agent can autonomously handle diverse tasks such as:

1. Complex Research and Content Generation

Understands users’ ambiguous requests, intelligently navigates the web to collect and filter information, and generates editable slides and spreadsheets summarizing research results. For example, it can create briefing materials for client meetings or analyze three competitors and create slides.

2. Web Browser Operation

Can browse and operate websites just like humans through graphical user interfaces (GUI). Beyond scrolling, clicking, and form filling, it can access information requiring login. This enables automatic execution of complex tasks like airline and hotel reservations and online shopping.

3. Code Execution

Has a Linux-based virtual terminal environment and executes Python and other code for data analysis, file downloads, and processing. This allows AI agents to complete data processing and calculations previously done manually.

4. External App Integration

Integrates with external applications like Gmail and GitHub through OpenAI’s “ChatGPT Connectors.” For example, it can summarize Gmail emails or read calendar appointments for scheduling, executing tasks based on user-specific data.

5. Flexible Planning and Multi-step Execution

Dynamically creates internal plans to achieve given goals and sequentially selects necessary tools to progress tasks. Has advanced ability to reach solutions through trial and error according to situations.

6. Human Collaboration and Safety

User control is always maintained, requiring permission before important operations. Users can intervene to directly operate browsers or interrupt tasks midway. The execution process is displayed on screen as a “thinking process,” visualizing what the AI is doing. Additionally, it’s trained to resist malicious instructions and fraud, with security and privacy-conscious design.
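The permission step described above can be illustrated with a small Python sketch. This is not OpenAI's implementation: the set of "sensitive" action names, the run_step function, and the approve callback are all hypothetical, chosen only to show the idea of pausing for user approval and logging each step so the process stays visible.

```python
from typing import Callable

# Hypothetical set of action names treated as "important"; not an OpenAI list.
SENSITIVE_ACTIONS = {"purchase", "send_email", "delete_file"}


def run_step(action: str, detail: str,
             approve: Callable[[str], bool], log: list[str]) -> bool:
    """Run one step, asking the user first if the action is sensitive."""
    log.append(f"planning: {action} ({detail})")  # visible "thinking process"
    if action in SENSITIVE_ACTIONS and not approve(f"Allow {action}: {detail}?"):
        log.append(f"blocked: {action}")
        return False
    log.append(f"done: {action}")
    return True


log: list[str] = []
deny_all = lambda prompt: False  # simulates a user who refuses every request
run_step("browse", "compare hotel prices", deny_all, log)
run_step("purchase", "book hotel for 2 nights", deny_all, log)
print("\n".join(log))
```

In this sketch the browsing step runs without asking, while the purchase is blocked because the simulated user declined. A real system would replace deny_all with an actual prompt to the user.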

Pricing and Availability

ChatGPT Agent has been made available to Pro, Plus, and Team plan users.

Plan	Monthly Fee	Usage Limits	Features
Pro	$200	400 messages/month (Deep Research: 125 full + 125 light)	Nearly unlimited task execution
Plus/Team	Existing rates	40 messages/month (Deep Research: 10 full + 15 light)	Can expand with additional credits
Enterprise/Education	Contact sales	Available within weeks	Customized for organizations

The Operator research preview site will close after 30 days, and Deep Research will continue as part of ChatGPT Agent.

2. The Concept and Evolution of AI Agents

What are AI Agents?

AI agents refer to software that understands user objectives, autonomously plans and executes complex tasks, and improves through learning, combined with hardware that hosts it. Unlike “generative AI” specialized in information generation, AI agents are systems responsible for planning and execution toward goal achievement, distinct from simple question-answering systems like AI assistants.

AI Agent Mechanism: Three-Step Cycle

Task Creation and Management: Breaks down goals into detailed tasks.
Reasoning: Devises optimal methods for executing each task.
Action: Executes decided methods according to plan.

By flexibly repeating this cycle, AI agents autonomously accumulate optimal actions toward goals. Additionally, they have the ability to learn independently, utilizing feedback information for subsequent tasks.
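As a rough illustration of this three-step cycle, the following Python sketch wires the steps together. It is a toy under stated assumptions: the Agent and Task classes, the tool names, and the rule of splitting a goal on the word "then" are all hypothetical, and simple string matching stands in for the reasoning an LLM would perform.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    description: str
    method: str = ""   # filled in by the reasoning step
    result: str = ""   # filled in by the action step


@dataclass
class Agent:
    # Hypothetical tool registry: method name -> function that performs it.
    tools: dict[str, Callable[[str], str]]
    feedback: list[str] = field(default_factory=list)

    def create_tasks(self, goal: str) -> list[Task]:
        # Step 1: break the goal into tasks (here: naive split on " then ").
        return [Task(part.strip()) for part in goal.split(" then ")]

    def reason(self, task: Task) -> Task:
        # Step 2: choose a method; pick the first tool named in the task text.
        task.method = next(
            (name for name in self.tools if name in task.description), "noop"
        )
        return task

    def act(self, task: Task) -> Task:
        # Step 3: execute the chosen method and keep the outcome as feedback.
        tool = self.tools.get(task.method, lambda text: f"skipped: {text}")
        task.result = tool(task.description)
        self.feedback.append(task.result)
        return task

    def run(self, goal: str) -> list[Task]:
        return [self.act(self.reason(t)) for t in self.create_tasks(goal)]


agent = Agent(tools={
    "search": lambda text: f"searched for '{text}'",
    "summarize": lambda text: f"summarized '{text}'",
})
done = agent.run("search competitor pricing then summarize findings")
for task in done:
    print(task.method, "->", task.result)
```

The feedback list is only stored here. An agent that actually learns from feedback would consult it when planning later tasks.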

Comparison with Traditional AI and Levels of Autonomy

Traditional AI Assistants only respond to simple questions like “What’s today’s weather?”

Generative AI specializes in content generation like text, images, and videos.

AI Agents draw on these generative AI functions as needed while carrying out every task required to reach a goal, using existing information. For example, an agent might call an image generation AI while creating presentation materials.

Levels of AI Agent Autonomy

Simple Processor
Simple question-answering systems that respond based on pre-set FAQs and scripts, like chatbots. Limited in handling complex tasks and situational adaptability.

Router
Uses LLM output for if/else branching decisions, determining pre-defined workflow branches. For example, systems that route to appropriate workflows based on customer inquiry content.

Advanced AI Agents
• Purpose Understanding and Decomposition: Understands user objectives and breaks them into multiple processes
• Autonomous Action: Can act autonomously toward complex goals without detailed step-by-step instructions
• Learning and Improvement: Utilizes feedback and experience for subsequent tasks, independently learning and improving capabilities
• Reasoning: Enhanced ability to progress reasoning through “multi-step reasoning” that breaks down complex problems step by step, drawing conclusions while referencing external knowledge
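To make the Router level above concrete, here is a minimal Python sketch. The classify function is a keyword-matching stand-in for an LLM call, and the workflow names are hypothetical. The point is only that the model's output selects among pre-defined branches rather than planning its own steps.

```python
from typing import Callable


def classify(inquiry: str) -> str:
    """Stand-in for an LLM call that returns a label (assumption: keyword match)."""
    text = inquiry.lower()
    if "refund" in text or "charged" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "technical"
    return "general"


# Hypothetical pre-defined workflows; the router only picks one of them.
WORKFLOWS: dict[str, Callable[[str], str]] = {
    "billing": lambda q: f"billing workflow handles: {q}",
    "technical": lambda q: f"technical workflow handles: {q}",
    "general": lambda q: f"general FAQ workflow handles: {q}",
}


def route(inquiry: str) -> str:
    # if/else-style branching driven by the classifier's label.
    label = classify(inquiry)
    return WORKFLOWS[label](inquiry)


print(route("I was charged twice and want a refund"))
print(route("The app shows an error on startup"))
```

An advanced agent differs in that it would decide which steps to take at all, not just which fixed branch to enter.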

Background of AI Agent Attention

While the concept of AI agents has existed since around 2023, full-scale practical implementation has progressed in 2025. Behind this are several technological innovations and changes in the social environment.

1. Dramatic Evolution of LLMs

As of 2025, the evolution of LLMs (Large Language Models) is accelerating AI agent implementation. LLMs are rapidly advancing through the “scaling law” where performance improves as data scale, computation, and model parameters increase.

OpenAI GPT-2 (2019): 1.5 billion parameters
OpenAI GPT-3 (2020): 175 billion parameters (approximately 120x scale increase)
Google PaLM (2022): 540 billion parameters
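The scale jump can be checked with simple arithmetic on the parameter counts listed above. The short Python sketch below uses only those three figures. The exact GPT-2 to GPT-3 ratio comes out near 117x, which the list rounds to about 120x.

```python
# Parameter counts as quoted in the list above.
params = {
    "GPT-2 (2019)": 1.5e9,
    "GPT-3 (2020)": 175e9,
    "PaLM (2022)": 540e9,
}

baseline = params["GPT-2 (2019)"]
ratios = {name: count / baseline for name, count in params.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the size of GPT-2")
```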

Previous LLMs had issues with low planning accuracy, complex multi-step reasoning, and long processing times, but these are now being addressed.

Improved Planning Accuracy: Evolution of reasoning models (models that perform deep analysis and consideration) and hardware accelerators (TPUs, specialized AI chips) enables more accurate real-time response generation.
Improved Complex Multi-step Reasoning: Ability to break down and reason through complex problems step by step has improved through technological developments like Chain of Thought (CoT).
Reduced Processing Time: Hardware accelerator evolution has shortened AI inference processing time and improved speed.

2. Expansion of General AI Use and Increased Agent-type Demand

The spread of generative AI has led to recognition of AI’s potential for various tasks, increasing demand for automating more complex operations.

In Japanese companies, generative AI adoption is progressing centered on large companies and IT industries:

32% of IT companies have adopted (as of December 2024)
23% of finance/insurance companies have adopted (as of December 2024)
13.1% of Japanese companies overall have adopted generative AI (as of December 2024)
Predicted to exceed 50% by around 2030 at this pace

In the future, AI agent use is expected to become standard when discussing AI utilization, with “AI use ≈ AI agent use” anticipated in the 2030s.

3. Advancement of Technological Competition

Major IT companies like Microsoft, Salesforce, and Amazon Web Services are accelerating AI agent development and deployment. OpenAI CEO Sam Altman has stated, “2025 is the year AI agents truly enter society,” citing factors accelerating their spread:

Improved technological maturity
Significant reduction in implementation costs
Accumulation of success cases
Enhancement of development tools

4. Deepening Labor Shortage and Rising Business Automation Needs

Japan faces limits in expanding its labor force due to declining birthrate and aging population, forcing companies to advance business automation.

According to Ministry of Internal Affairs estimates:

Japan’s working-age population (15-64 years) will decrease from 75.09 million in 2020
To 52.75 million in 2050, a decrease of approximately 30%

AI agents are expected to contribute to business automation and productivity improvement as a means to solve such labor shortage challenges.

3. Six Types of Intelligence Expanded by AI

AI is conceived as an existence that expands human intelligence, with the expanded intelligence classified into the following six types.

1. Predictive Power (Predict)

The ability to predict the future.

Examples: AI is utilized for diagnosing and predicting water pipe breakage risks and natural disaster risks using satellite image data. Research on medical systems that predict future illnesses humans couldn’t foresee is also advancing, demonstrating AI’s expanding predictive power. In commerce, predictive sales-type commerce may emerge.

2. Discriminative Power (Distinctify)

The ability to find specific events or patterns humans couldn’t notice from massive data.

Examples: Cases where AI identified a hidden phantom self-portrait in Van Gogh’s painting underdrawing, or where AI decoded valuable text information from burned ancient documents that researchers couldn’t read. This demonstrates the ability to extract and classify features from data.

3. Individualizing Power (Individualize)

The ability to optimize according to individual specificity and peculiarities.

Examples: In healthcare, research is advancing where AI proposes personalized treatments from individual genetic and activity data. In education, AI grasps each student’s learning progress and understanding level, utilized for lesson planning, learning guidance, and grade evaluation. In commerce, this connects to the ability to individualize products and services according to user preferences.

4. Communicative Power (Communicate)

The ability to interpret and translate, including understanding user needs through dialogue and providing appropriate information.

Examples: One known case is medical interview AI in which doctor avatars conduct symptom interviews and explain the treatment flow before a visit, aiming to shorten consultation time. Conversational commerce that broadly presents related products users weren’t explicitly searching for also shows expanded communicative power.

5. Structuring Power (Model)

The ability to structure knowledge.

Examples: AI programmers that generate and explain code, or companies where all members except the CEO are AI, are prime examples of expanded structuring power. Recently, AI CEOs have appeared, taking responsibility for overall management, decision-making, and risk management strategy execution with fair judgment prioritizing organizational profit.

6. Creative Power (Create)

The ability to generate new knowledge and create new ideas by combining existing knowledge.

Examples: In Denmark’s Slagelse municipality, AI with creative power attempts a new approach collecting citizen discussions, insights, and proposals to provide information to policymakers. At NASA, AI was used for some satellite equipment, resulting in adopted equipment with shapes humans couldn’t conceive yet structurally superior to conventional ones.

4. Performance of ChatGPT Agent

ChatGPT Agent demonstrates cutting-edge performance in complex real-world tasks such as data modeling, spreadsheet editing, and investment banking operations.

Benchmark Results

Humanity’s Last Exam
Against traditional models’ average accuracy of 9.1%, Deep Research recorded 26.6%, and the ChatGPT Agent-equipped model score reached 43.1%. This demonstrates significant performance improvements across broad fields including medicine, physics, and history.

GAIA
For complex real-world challenges, accuracy improved from 63.64% to 72.57%.

BrowseComp
Achieved 68.9% accuracy in difficult search tasks, outperforming Deep Research by 17.4 points.

Comparison with Humans

According to FutureSearch’s research, while the estimated score for a “perfect AI agent comparable to human researchers” is 0.8, the highest score for “o3,” ChatGPT Agent’s core model, was 0.51. This indicates that even the highest-level AI agents still don’t match “excellent human researchers.”

However, considering that ChatGPT’s year-old model “GPT-4-Turbo” scored 0.27, the performance gap between “excellent human researchers” and “cutting-edge AI agents” has narrowed by approximately 45% in a very short period. OpenAI’s internal evaluation shows that ChatGPT Agent’s results equal or exceed humans in about half of economically valuable intellectual labor tasks.
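The "approximately 45%" figure can be reproduced from the three scores quoted above. The Python sketch below assumes the narrowing is measured as the reduction in the gap to the 0.8 reference score, relative to the earlier gap. That reading is an assumption, but it matches the stated number.

```python
human_level = 0.80   # estimated score for an agent comparable to human researchers
older_model = 0.27   # GPT-4-Turbo
newer_model = 0.51   # o3

gap_before = human_level - older_model   # about 0.53
gap_after = human_level - newer_model    # about 0.29

# Share of the earlier gap that has been closed.
narrowed = (gap_before - gap_after) / gap_before
print(f"gap narrowed by {narrowed:.1%}")
```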

Current Limitations and Challenges

ChatGPT Agent is still in early development stages with several limitations.

Slideshow generation is in beta and may have formatting and refinement issues.
May still make mistakes in complex tasks.
Hallucination (misinformation) possibility isn’t zero. User reports include incorrect links and information discrepancies. Final trust judgments and decision-making must be carefully performed by humans.
Processing Speed: Cursor movement and page loading may take time compared to manual human operations.
User Effort: Manual user intervention is still required for login, payment information input, and CAPTCHA authentication.
High Cost: The $200 monthly Pro plan pricing may be a significant burden for individual users and small businesses.

5. Business Applications of AI Agents

AI agents are beginning to be utilized across various business domains, with their scope expected to expand further.

Customer Support Automation

Current State (Late 2024-2026)

AI agents enable rapid customer response and human resource reduction through automated chatbot responses.

A major retail company reduced recruitment officer inquiry response time by 60% and improved candidate evaluations through recruitment chatbot implementation.
GMO Media developed an autonomous AI agent specialized in inquiry responses, achieving over 70,000 inquiry reductions, 68%+ work hour reduction, and 12.5 point customer satisfaction improvement in about 1.5 years.

Future Evolution (Post-2030 Predictions)

AI agents will collaborate to enable more advanced customer service. They’ll comprehensively judge customer emotional states, past history, and situational complexity, independently devising and executing solutions even for exceptional cases. Through collaboration with early AGI versions, non-routine inquiry handling will evolve, with routine inquiries fully automated and humans only making final judgments for non-routine responses.

Office Work Efficiency

Current State (Late 2024-2026)

Autonomous processing of routine tasks like schedule management and data entry, task priority adjustment, and routine work optimization through integration with multiple business tools are advancing. About 50% of back-office operations are expected to be covered by AI agents.

Future Evolution (Post-2030 Predictions)

Cross-functional collaboration will advance, enabling comprehensive business management. AI agents will handle desk work, significantly reducing corporate workload, and autonomously design and improve optimal workflows by understanding entire business processes. In planning and creative work, expert-level AGI may create optimal plans with humans only making minor adjustments.

Sales AI Assistants

Current State (Late 2024-2026)

AI agents analyze customer data to suggest optimal approaches to sales representatives and automate contract management for procedural efficiency and error reduction.

Salesforce’s “Agentforce” provides functions including:

Natural language customer interactions
Prospect appointment acquisition
Sales representative coaching
Campaign building support

“AI supervisor” functions where AI agents participate in actual customer meetings and provide real-time advice to sales representatives have also emerged.

Future Evolution (Post-2030 Predictions)

Full sales process automation will advance, with AI agents autonomously planning and executing optimal customer responses and contract proposals through comprehensive analysis of market trends, competitive information, and customer psychology. Sales expert AI agents will create negotiation scenarios and provide optimal proposals according to customer challenges, creating immersive proposal experiences personalized for each purchasing decision-maker.

New Field Possibilities and Industry Applications

Supply Chain

AI agents advance real-time demand forecasting and logistics planning optimization. Post-2030, they’ll comprehensively understand global supply chains, analyze geopolitical risks, weather fluctuations, and market changes to fully automate inventory management and delivery planning. Complete on-demand production systems may realize hyper-personalized experiences tailored to individual consumer needs.

Medical/Healthcare

Remote diagnosis accuracy improvement and individual treatment proposals based on patient data are advancing. Post-2030, expert-level AGI will fully automate diagnostic support and handle medical consultations. They’ll comprehensively analyze patient genetic information, lifestyle data, medical history, and latest medical research to autonomously propose optimal treatments and preventive measures for individual patients.

Financial Institutions

In loan underwriting processes, multiple specialized AI agents share tasks and efficiently process credit risk scenarios, improving efficiency by shortening review cycles by 20% to 60%. Dynamic pricing and personalized promotions also become possible.

Software Development

AI agents streamline software migration processes and improve productivity by analyzing, documenting, and translating old code efficiently, with quality assurance agents generating test cases to improve accuracy. Code generation and explanation by AI programmers is also possible.

Manufacturing/Warehouses

Fujitsu’s video analysis AI agent analyzes video in real-time in warehouses and factories, providing alerts and suggestions to managers and workers. This autonomously supports work efficiency and safe workplace creation.

Marketing

Marketing Research and Analysis: Attempts have begun to generate “AI personas (virtual consumers)” from large consumer datasets and proprietary consumer research databases in order to test consumer needs and responses to marketing measures. Kirin Beer began a trial for new product development in which AI personas are trained on consumer interview feedback and then asked about product concepts and flavors, approximating customer insights while keeping development timelines from lengthening.
Advertising Content Creation: Efficiently creates advertising content using LLMs and image/video/audio generation AI.
Marketing Planning: Utilizes “multi-AI agent systems” where multiple AI agents work collaboratively, cooperating to achieve common goals and advance marketing planning.

Consumer AI Agent Usage

“Shopping AI” provides functions like “purchasing agency” that autonomously purchases products or books services based on consumer instructions or set criteria, “selection agency” that chooses optimal options from multiple choices, and “support” that assists information gathering and decision-making.

Specifically, tasks like “automatically selecting and buying the cheapest replacement when a light bulb burns out” that simultaneously perform selection and purchasing are possible.
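The light bulb example combines "selection agency" and "purchasing agency" in one task, which the following Python sketch tries to show. Everything in it is hypothetical: the Offer fields, the shop names and prices, and the purchase function, which only returns a string where a real agent would call a checkout service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    seller: str
    socket: str   # bulb fitting, e.g. "E26"
    price: float


def select_cheapest(offers: list[Offer], socket: str,
                    budget: float) -> Optional[Offer]:
    """Selection agency: cheapest compatible offer within the user's budget."""
    candidates = [o for o in offers if o.socket == socket and o.price <= budget]
    return min(candidates, key=lambda o: o.price, default=None)


def purchase(offer: Offer) -> str:
    """Purchasing agency: placeholder for a real checkout call."""
    return f"ordered from {offer.seller} for {offer.price:.2f}"


def replace_bulb(offers: list[Offer], socket: str, budget: float) -> str:
    choice = select_cheapest(offers, socket, budget)
    if choice is None:
        # Fall back to the "support" role: hand the decision to the user.
        return "no compatible offer within budget; asking the user"
    return purchase(choice)


offers = [
    Offer("Shop A", "E26", 4.50),
    Offer("Shop B", "E26", 3.80),
    Offer("Shop C", "E17", 2.90),
]
print(replace_bulb(offers, socket="E26", budget=5.00))
```

The cheapest listing overall (Shop C) is skipped because its fitting does not match, so the criteria set by the consumer constrain the choice before price is compared.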

By 2040, users are predicted to engage in “AI-first” purchasing activities through AI, with companies and services assuming only AI access potentially emerging. A future is envisioned where AI agent consultations, virtual try-ons, and facial recognition payments are incorporated into purchasing processes.

6. Future Expectations

The emergence of ChatGPT Agent represents significant AI technology advancement, with expectations for future evolution and expanded application scope.

Function Integration and General Agent Development

OpenAI positions ChatGPT Agent, which integrates Deep Research and Operator functions, as “a step toward general agents.” In the future, end-to-end automation where AI performs execution (reservations, purchases, etc.) based on research results is expected to be realized.

For example, when consulting companies conduct market analysis for new businesses, they could consistently rely on AI for hypothesis formulation with o1-pro-mode, data collection with Deep Research, and official site research with Operator.

API Provision and Development Ecosystem

OpenAI plans to eventually release APIs for CUA (Computer-Using Agent), Operator’s foundational technology, enabling external developers to incorporate browser automation functions into their apps. With a rich set of development tools like the Responses API and Agents SDK, possibilities are expanding for companies to build proprietary AI agents and realize complex business automation.

Future Technology Trends

Multimodal LLM

Evolution of multimodal AI that processes multiple data formats including text, images, audio, and video integratively will enable AI to handle more complex tasks. This enables capturing complex emotions beyond surface emotional understanding and contributes to long-term information memory.

Evolution of Reasoning Models

Reasoning models that decompose complex problems and discover hidden patterns and intentions by generating long Chains of Thought (CoT) are evolving. This improves generalization through analogy, with expectations for development into AI capable of PhD-level human-like advanced problem-solving.

Large Action Model (LAM)

LAMs are AI models that execute actions based on user input: where LLMs understand and generate language, LAMs understand human intentions and execute physical or digital operations. They are expected to be applied to AI agents, autonomous robots, smart home control, and more.

AGI Realization Outlook

The emergence of generative AI has increased expectations for AGI realization, with views growing that it will appear within the next 10 years.

2024-2026: Stage where AI gradually acquires capabilities necessary for AGI.
2027-2029: Realization of “AGI with autonomy in digital spaces” like OS and metaverse.
2030 and beyond: Possibility of realizing “AGI with autonomy in the real world” where humans live.

While expert predictions vary, SoftBank Group’s Masayoshi Son predicts “AGI surpassing human intelligence will achieve 10 times all human wisdom within 10 years.”

7. Impact on Society and Work

The spread of AI agents is said to bring major transformation to the 21st century comparable to the impact electricity as a general-purpose technology had on economy and management in the 20th century.

Labor Force Changes and New Job Creation

With AI adoption, routine tasks will be replaced by AI, allowing humans to focus on more creative and strategic work.
A “decrease in white-collar workers” and an “increase in no-collar workers” (creative jobs, including new positions that support professionals who wear collarless clothes) are predicted.
Emergence of AI assistants and increase in “AI labor” taking on human work are also predicted.
As Japan’s “middle” (young labor population central to work) decreases due to declining birthrate and aging, new organizational forms where AI handles this “middle” while experienced business people handle “entrance” (design) and “exit” (evaluation/execution) portions are thought to enhance overall organizational competitiveness.

For example, situations like 30 humans and 70 AI agents among 100 employees may become realistic.

Management Impact and Organizational Structure Changes

AI’s expanded intelligence including predictive, discriminative, communicative, individualizing, creative, and structuring powers realizes improved decision-making quality, early problem and opportunity detection, and business precision and real-time capabilities.
The “Singularity Enterprise” concept predicts that post-2035, AI will dynamically reorganize business flows in holacracy organizations, with AI agents autonomously taking diverse roles as main actors, causing not only efficiency and advancement of corporate activities but fundamental transformation of organizational structures and business processes.
70% of business leaders respond that humans and AI will become partners complementing and enhancing each other’s capabilities, recognizing that business processes and organizational structures will change significantly toward 2030.

Implications for Human Leap

Technological advancement including AI is suggested to lead to human leaps.

Realization of Longevity Escape Velocity

Dr. Aubrey de Grey, a central figure in longevity research, proposed the concept of “longevity escape velocity,” and US inventor Ray Kurzweil stated “modern humans can live to 500 years old if diligent.” Insilico Medicine, BioAge Labs, and others use AI to search for aging-related substances and treatments, while Retro Biosciences, funded by OpenAI’s Sam Altman, and Google-founded Calico also research anti-aging and longevity.

Emergence of Transhumans

Historian Yuval Noah Harari points to the emergence of “transhumans” in his book “Homo Deus,” predicting that some wealthy individuals will become transhumans (Homo Deus) through AI and biotechnology and dominate the majority of current humans.

Impact on Business and Labor

Full-scale spread of AI agents is predicted to semi-automate many intellectual tasks and significantly transform corporate productivity.

Business Efficiency: Many routine tasks including report creation, market analysis, data processing, accounting, and code generation will be streamlined by AI agents. This allows employees to focus on more creative and high-value work.

Democratization of Expertise: Even without advanced research analysis or programming skills, high-quality results can be enjoyed through AI agents, enabling output beyond human and budget constraints.

New Jobs and Skills: Demand for “personnel who can master AI agents” will increase, with AI literacy and agent supervision/evaluation skills becoming important.

Social Impact

AI agents may bring significant changes to human life. They’ll be able to handle tedious browser operations like travel planning/reservations, online shopping, ticket arrangements, information searches, and form filling. In the future, coexistence as “digital partners” where AI understands human intentions, thinks, and acts autonomously is expected.

8. AI Agent Adoption and Future Outlook

AI agents are predicted to fully penetrate society within the next few years.

Japan Adoption Forecast

AI agent adoption in Japanese companies is accelerating, with corporate AI agents estimated to reach approximately 2-9 million units by 2030. This is based on predictions that about 52% of AI-using companies will have adopted AI agents by 2030.

“AI-First” Purchasing Behavior

Around 2030, “AI agent customerization” will advance, and by 2035, purchasing through personal horizontal AI agents will spread, with users engaging in “AI-first” purchasing activities through AI rather than “mobile-first.”

9. User Opinions on X (formerly Twitter)

After the ChatGPT Agent announcement, diverse opinions have been exchanged on X. Here we summarize some of them.

Positive Opinions and Expectations

“Equipped with agent functions that autonomously use browsers, terminals, and APIs, capable of autonomously planning and executing complex tasks lasting over 30 minutes” – surprise and expectations are expressed about its capabilities.

Admiration is expressed for “an agent system integrating Operator+DeepResearch+ChatGPT functions, appropriately combining visual reasoning abilities and textual deep research abilities to autonomously execute various tasks.”

Attention is focused on performance evaluations stating “in economically valuable real challenges that experts spend hours to over 10 hours tackling, such as competitive analysis, financial model creation, and facility selection, it shows results equal to or better than top-class humans in about half the cases.”

Excitement is shared about the possibility of specific instructions like “Look at the calendar and explain about client meetings based on recent news,” “Plan and purchase ingredients for a Japanese breakfast for four,” and “Analyze three competitors and create slides.”

Opinions point out business convenience, stating “As a corporate employee, it’s billions of times easier to get OpenAI’s ChatGPT Agent approved than Manus’s usage review.”

Emotional comments about AI’s autonomous actions include “Watching computers think, plan, and execute feels special somehow.”

Concerns and Issues

While safety is mentioned with “trained to explicitly confirm with users before important or sensitive actions and actively refuse high-risk tasks,” concerns about job impact also exist with comments like “it’s eliminating humans.”

Issues about functional completeness are noted, such as “slideshow format and sophistication may have remaining challenges.”

Dissatisfaction with regional restrictions is expressed: “Available today to Pro, Team, and Plus users outside the EU. Once again, the EU.”

Questions and observations about task execution behavior are shared: “Does it pause after 3 minutes? When it stops, it seems to resume with ‘continue.’”

These opinions demonstrate that while ChatGPT Agent raises great expectations among users, it’s still evolving and sparking various discussions about its social impact.

10. Challenges and Risks in AI Agent Implementation

While AI agents hold great potential, their implementation and adoption involve various challenges and risks.

Technical Challenges

Risk of Hallucination (Misinformation Generation)

With standalone LLM use, the reasoning process is a black box and hallucination risk is high. Because generated information reads as natural text, errors are easy to overlook. AI agent output has also been reported to contain misinformation, inaccurate links, and information discrepancies, so final judgments about trust and decision-making must be made carefully by humans.

Low Planning Accuracy, Complex Multi-step Reasoning, Long Processing Time

These were long-standing challenges for LLM-based AI agents, but they are improving as reasoning models and hardware accelerators evolve. However, complex tasks may still require minutes to tens of minutes of processing time.

Data Quality

AI reliability depends directly on the quality of the data used for learning, evaluation, and utilization. If referenced data lacks accuracy or currency, it can lead to incorrect output, so managing the quality and provenance of referenced data is important.

Business and Operational Challenges

Unclear Effective Utilization Methods: “Not knowing effective utilization methods” is cited as Japanese companies’ primary concern regarding generative AI adoption.

Increasing Costs and Unclear Business Value: According to Gartner’s report, over 40% of AI agent projects may be canceled by the end of 2027 due to “increasing costs” or “unclear business value.”

“Agent Washing”: Some vendors rebrand existing AI assistants and chatbots as “agent-type AI” without substantial agent capabilities, leaving many companies at risk of investing in products without substance.

High Costs and Usage Restrictions: Advanced AI agent functions like OpenAI’s Deep Research are currently limited to the Pro plan ($200/month), which can be burdensome for individual users and small businesses. Monthly query limits also exist (Deep Research: 120 times/month).

Handling Diverse Corporate Data: Whether AI can handle companies’ diverse and massive data while ensuring security is a challenge.

Ethical and Social Risks

Accountability: When an AI agent’s decisions or actions cause damage, it is unclear whether responsibility lies with vendors, developers, trainers, users, or their managers. Analyzing the situation and identifying the cause when problems occur is also a challenge.

Bias Possibility: If AI agent training data contains bias, it may lead to unfair, discriminatory, or even illegal results in decisions like employment or lending. If AI agents are designed to receive rewards through biased judgments, problems may rapidly expand.

Data Privacy and Security: AI agent system operation methods, data used for training, and interactions with other systems increase risks of data breaches and information leaks. Stricter governance is needed as internal and external collaboration increases.

Limits of Human Interpretation: As AI agents evolve as “deeply thinking AI,” their output may become too advanced for human interpretation, potentially causing processing overload.

Malfunction Impact Range: In “Agent to Agent” environments where multiple AI agents collaborate, a partial malfunction may spread widely. This calls for careful design and testing, and for error response procedures prepared in advance.

AI and Human Role Division: Rather than entrusting everything to AI agents, appropriate role division is necessary, clarifying the scope humans should follow up on.

X User Concerns

High Pricing: Users frequently criticize the “$200/month Pro version exclusive” pricing as difficult for individual users and small businesses to afford. At over 300,000 yen annually, recovering the investment is often difficult.

Unfair Function Usage Due to Regional Restrictions: Users in the UK, Switzerland, and the EEA cannot access Deep Research despite subscribing to Pro plans, and express dissatisfaction about “restricted functions despite subscription.” This may be influenced by legal and data protection issues, with OpenAI planning to address these progressively.

Anxiety About Future Price Increases: Some users worry “If this becomes standard, will enhanced Deep Research 2.0 cost $500 or $1,000/month?” and fear that OpenAI’s high-function = high-price approach may limit its user base.

Misinformation (Hallucination) Possibility: While AI-generated reports are very convenient, the possibility of misinformation like non-existent links or information discrepancies isn’t zero, with voices saying final trust judgments and decision-making must be carefully performed by humans.

Processing Time and Query Limits: Large-scale research may take minutes to tens of minutes of processing time, and monthly query limits (Deep Research: 120 times/month) may be reached surprisingly quickly, making operation focused on high-necessity themes desirable.

These user concerns suggest that AI agents aren’t omnipotent research engines and their use involves realistic constraints regarding cost, availability, and information accuracy.
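
The cost and quota figures cited above can be checked with simple arithmetic. The sketch below uses the $200/month price and the 120 queries/month limit from the text; the exchange rate of 150 yen per dollar and the 30-day month are illustrative assumptions, not figures from the article, and the function names are hypothetical.

```python
# Figures taken from the article.
PRO_PRICE_USD_PER_MONTH = 200
DEEP_RESEARCH_QUERIES_PER_MONTH = 120

# Assumptions for illustration only (not from the article).
ASSUMED_JPY_PER_USD = 150
ASSUMED_DAYS_PER_MONTH = 30


def annual_cost_usd(monthly_usd: float) -> float:
    """Yearly subscription cost in US dollars."""
    return monthly_usd * 12


def annual_cost_jpy(monthly_usd: float, jpy_per_usd: float) -> float:
    """Yearly subscription cost in yen at the given exchange rate."""
    return annual_cost_usd(monthly_usd) * jpy_per_usd


def cost_per_query_usd(monthly_usd: float, queries_per_month: int) -> float:
    """Cost per query if the whole fee is attributed to a fully used quota."""
    return monthly_usd / queries_per_month


def queries_per_day(queries_per_month: int, days_per_month: int) -> float:
    """Average daily query budget when the quota is spread evenly."""
    return queries_per_month / days_per_month


if __name__ == "__main__":
    print("Annual cost (USD):", annual_cost_usd(PRO_PRICE_USD_PER_MONTH))
    print("Annual cost (JPY):",
          annual_cost_jpy(PRO_PRICE_USD_PER_MONTH, ASSUMED_JPY_PER_USD))
    print("Cost per query (USD):",
          round(cost_per_query_usd(PRO_PRICE_USD_PER_MONTH,
                                   DEEP_RESEARCH_QUERIES_PER_MONTH), 2))
    print("Queries per day:",
          queries_per_day(DEEP_RESEARCH_QUERIES_PER_MONTH,
                          ASSUMED_DAYS_PER_MONTH))
```

Under these assumptions the annual cost is consistent with the “over 300,000 yen” figure, and the quota works out to only a handful of queries per day, which is why users report reaching the limit quickly. A different exchange rate changes the yen figure proportionally.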

11. Strategies for Companies to Adapt to the AI Agent Era

The shift to AI agents is described as an irreversible trend at the core of the Fourth Industrial Revolution. Whether companies view this change as a threat or an opportunity will determine their future and Japan’s future.

Phased Implementation and Know-how Accumulation

First, consider incorporating AI agents in narrow scopes or partially, such as chatbots related to company sites and services, and accumulate utilization know-how.
During know-how accumulation, gradually understand risks arising from AI implementation (legal issues, degree of human intervention and approval, data and model obsolescence).
As in Klarna’s success story, it is important not to aim for perfection from the start but to begin with small tasks and expand gradually through trial and error.

Building AI-Driven Organizations and Human Resource Development

Actively delegate tasks that can be entrusted to AI agents, with humans shifting to more strategic and creative domains.
It is important to adopt a new management perspective of “managing and nurturing” AI agents rather than treating them as mere tools.
Future business people will need to define the skill sets required in an AI-utilizing society and acquire them through daily work and their careers. They will need the ability to master AI agents and the coordination skills for effective human-AI agent collaboration.
Because new AI tools and services emerge continuously, ongoing AI utilization training is essential, both to improve the ability to select the ones best suited to your company and to build the literacy needed to critically examine AI output and judge its reliability.

Strengthening Data and Governance

Concerns are rising about information leaks, security risks, copyright infringement possibilities, and inclusion of ethically inappropriate content or bias accompanying AI utilization.
Companies should strengthen AI governance and create organizational foundations to respond to regulations now.
Creating internal guidelines clarifying what information can be input to AI enables safe utilization.
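
One way to make such an input guideline concrete is to encode part of it as an automated pre-check that runs before text is sent to an AI service. The following is a minimal sketch under stated assumptions: the rule names and patterns (`email_address`, `long_digit_sequence`, `confidential_marker`) are hypothetical examples, not rules from the article, and a real guideline would be defined by each company and would still need human review.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One guideline rule: a name plus a pattern that flags disallowed input."""
    name: str
    pattern: re.Pattern


# Hypothetical example rules for illustration only.
RULES = [
    Rule("email_address",
         re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    Rule("long_digit_sequence",  # e.g. card-like or account-like numbers
         re.compile(r"\b\d{12,19}\b")),
    Rule("confidential_marker",
         re.compile(r"\bconfidential\b", re.IGNORECASE)),
]


def find_violations(text: str) -> list[str]:
    """Return the names of all rules that the text triggers, in rule order."""
    return [rule.name for rule in RULES if rule.pattern.search(text)]


def is_allowed(text: str) -> bool:
    """True when the text triggers none of the rules."""
    return not find_violations(text)


if __name__ == "__main__":
    samples = [
        "Summarize this public press release.",
        "Contact taro@example.com about the CONFIDENTIAL plan.",
    ]
    for sample in samples:
        print(is_allowed(sample), find_violations(sample))
```

Pattern matching of this kind catches only obvious cases, so it complements rather than replaces the written guideline and employee training.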

Building “AI Agent Ecosystems”

Future competitive advantage lies not just in individual AI agent performance but in building “AI agent ecosystems” utilizing company operations, data, and know-how.
Successful companies build unique ecosystems combining company data utilization, business process optimization, human coordination systems, and continuous learning/improvement systems.

Conclusion

“ChatGPT Agent,” announced by OpenAI, is a revolutionary capability that evolves ChatGPT from conversational AI into AI that “autonomously thinks and acts.” It combines Operator’s web operation capabilities, Deep Research’s advanced research abilities, and ChatGPT’s conversational skills and reasoning power, enabling autonomous planning and execution of diverse complex tasks including financial analysis, slide creation, web operations, and data analysis.

Benchmark tests show high performance, with potential to produce results equal to or exceeding humans in some intellectual labor tasks. However, it’s still in early development stages with challenges including misinformation risks, need for user intervention during login/payment, and high usage fees.

OpenAI expects to evolve it into more refined, versatile tools through continuous improvement. With planned API release of Operator’s CUA technology and integration with existing ChatGPT, AI agents are predicted to penetrate broader business and daily life, bringing the arrival of “Year One of AI Agents” that will significantly transform how we work and live.

ChatGPT Agent is an important step in AI’s evolution from a mere “conversation partner” or “information search tool” to an “executing partner” that understands our instructions, thinks, and autonomously carries out complex tasks, raising expectations for a future where humans can focus on more creative and valuable work.

AI agents are not merely an evolution of tools but a development that will significantly affect social structure, work methods, and human possibilities themselves, so their progress warrants continued close attention.