Manus and the AI Agent Bubble: From Ideal to Disillusionment

"Everyone is doing Agent, but how many of you can actually think for yourself, do it for yourself, and review it for yourself?"
- from "Agent and Its Mainstream Frameworks in One Article"


From "universal intelligence" to the Manus myth.

In 2025, AI Agents are on fire. Startups, VCs, and tech giants are all touting their own "agent revolution". In this wave, Manus has become a typical representative: hailed as the symbol of the "General Agent", yet criticized by industry insiders as a textbook case of the bubble, all show and little substance.

Manus' explosion in popularity is no accident. The article points out that its rise relies on three main foundational supports:

| Supporting pillar | Technical foundation | Role |
|---|---|---|
| Enhanced model capabilities | Large models breaking through planning and scheduling problems | The premise that Manus can plan complex tasks |
| Rich toolchain | MCP, browser-use, computer-use | Gives the AI execution ability and access to external interfaces |
| Data and memory engineering | Context extension and RAG techniques | Fewer hallucinations, more persistence and feedback |

This transformed the Agent from a "toy" into a system capable of performing real-world tasks. But the gap between ideal and reality soon appeared: once Manus's product features were questioned, its financing route criticized, and peers began calling it an "engineering shell", the AI Agent bubble started to burst.


The Illusion of the "Universal Agent": Many Functions Do Not Equal Intelligence

Wang Hsien put it sharply in his article: **Manus's failure lies not in technology, but in product direction.**
A generic Agent sells itself as a "jack of all trades" but is not the best in any specific scenario.

The key to this dilemma is that it never breaks the **"scenario barrier"**:

  • Lack of specialized domain data and toolchain;
  • Lack of industry certifications and deep business tie-ins;
  • Lack of a closed delivery loop in high-value scenarios.

In other words, Manus can demonstrate the ability to "write reports", "look up information", and "generate images", but inside a real workflow these capabilities appear **shallow and generic**.

This corroborates the definition of an Agent from another article:

"Agents are not rare; the good ones are those that can think for themselves, act for themselves, and review their own work."

A truly intelligent agent is not a stack of features, but one capable of **dynamic planning, cross-system collaboration, continuous learning, and self-correction**.


A framework-level view: the Agent's "internal strength"

To understand why Manus-like products tend to spin their wheels, we must go back to the Agent's underlying implementation frameworks.

| Framework | Characteristics | Typical scenarios | Strengths and weaknesses |
|---|---|---|---|
| AutoGPT | Autonomous planning + tool invocation | Market research, task breakdown | Highly autonomous but hard to control |
| LangGraph | Graph-based workflows + state management | Multi-Agent collaboration | Stable but complex to develop |
| Dify | Low code + workflow visualization | Content generation, knowledge Q&A | Quick to start, but limited intelligence |
| CrewAI | Team-based multi-agent roles | Collaborative decision-making, task delegation | Flexible but context-dependent performance |
| AutoGen (Microsoft) | Event-driven multi-agent communication | Autonomous systems, customer service | Highly engineered and costly |

These frameworks reveal a fact:

The current Agent ecosystem is still in the "structural engineering" stage, not yet a stage of true "intelligent autonomy".

Manus, as a representative "universal Agent", is largely a repackaging layer on top of these frameworks, lacking deep data accumulation and workflow polish.
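Despite their differences, the frameworks above all orchestrate the same basic cycle: plan, act with tools, then review. The sketch below illustrates that shared skeleton; every name in it (`Tool`, `plan`, `review`, the hard-coded steps) is an illustrative assumption, not any framework's actual API.

```python
# A minimal sketch of the plan -> act -> review loop that frameworks like
# AutoGPT, CrewAI, and LangGraph structure for you. All names here are
# illustrative; in a real system the LLM does the planning and reviewing.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    run: Callable[[str], str]


def plan(goal: str) -> list[str]:
    # A real agent would ask the LLM to decompose the goal; steps are
    # hard-coded here to keep the sketch self-contained.
    return [f"research: {goal}", f"summarize: {goal}"]


def act(step: str, tools: dict[str, Tool]) -> str:
    tool_name = step.split(":")[0]
    return tools[tool_name].run(step)


def review(results: list[str]) -> bool:
    # Self-check: did every step produce non-empty output?
    return all(results)


def agent_loop(goal: str, tools: dict[str, Tool]) -> list[str]:
    results = [act(step, tools) for step in plan(goal)]
    if not review(results):
        raise RuntimeError("review failed; a real agent would replan here")
    return results


tools = {
    "research": Tool("research", lambda s: f"notes for {s}"),
    "summarize": Tool("summarize", lambda s: f"summary of {s}"),
}
print(agent_loop("EV market", tools))
```

The point of the sketch is that the loop itself is plumbing: without domain data behind `plan` and real verification behind `review`, wrapping it in a product adds structure, not intelligence.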


Pitfalls of evaluation: how exactly should the intelligence of an Agent be quantified?

In "Rigorous Agent Evaluation Is Harder Than It Looks," the HAL (Holistic Agent Leaderboard) team compared **9 models across 9 benchmarks in some 20,000 runs**, and the conclusions were striking:

"Higher reasoning effort does not mean higher accuracy."

They found:

  • In 21 out of 36 cases, higher reasoning effort actually reduced accuracy;
  • Top models (e.g., GPT-5, Opus 4.1) still make frequent errors;
  • Agents often choose "shortcuts" rather than actually solving tasks, for example:
    • Search for answers directly in web tasks;
    • Hard-coding assumptions in scientific tasks;
    • Booking wrong flights and refunding incorrect amounts in customer-service tasks.

This shows that existing Agent evaluation criteria are too crude:
generic accuracy metrics mask key issues such as interpretability, stability, and behavioral cost.

| Dimension | Current issue | Ideal evaluation approach |
|---|---|---|
| Accuracy | High but unstable scores | Add contextual observability |
| Cost | Severe token waste | Introduce Pareto efficiency curves |
| Behavioral reliability | Severe "shortcut" problem | Combine logging with process analysis (e.g., Docent) |
| Generalizability | Large performance variation across tasks | Multi-scenario distributed comparison |

As a result, generic Agents may seem powerful at the "presentation level", but their behavior is highly uncontrollable and their evaluation transparency is very poor.
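The Pareto-curve idea in the table above can be made concrete: instead of ranking agents by accuracy alone, keep only the runs that no other run beats on both cost and accuracy. The sketch below shows the idea with invented numbers (none are HAL results).

```python
# Hedged sketch: ranking agent runs on a cost-accuracy Pareto frontier
# rather than accuracy alone. All figures below are invented for
# illustration, not measurements from the HAL leaderboard.

def pareto_frontier(runs: list[dict]) -> list[dict]:
    """Keep runs not dominated by another run that is both cheaper and
    at least as accurate."""
    frontier = []
    for r in runs:
        dominated = any(
            o is not r and o["cost"] <= r["cost"] and o["accuracy"] >= r["accuracy"]
            for o in runs
        )
        if not dominated:
            frontier.append(r)
    return sorted(frontier, key=lambda r: r["cost"])


runs = [
    {"model": "A-high-effort", "cost": 9.0, "accuracy": 0.71},
    {"model": "A-low-effort",  "cost": 2.0, "accuracy": 0.69},  # near-par, far cheaper
    {"model": "B",             "cost": 5.0, "accuracy": 0.74},
]

for r in pareto_frontier(runs):
    print(r["model"], r["cost"], r["accuracy"])
```

Here the high-effort run drops off the frontier: model B is both cheaper and more accurate, mirroring the finding that more reasoning effort does not automatically buy accuracy.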


The roots of bubbles: capital, engineering and patience

Yeh hit the nail on the head in his comments:

"The Agent's fundamental shortfalls lie in engineering, in capital, and in staying power."

The impatience of the domestic startup environment has led many companies to "build hype before building the thing".
The generic Agent has become the most easily packaged "AI concept stock":

  • The technical barrier is relatively easy to replicate;
  • Easy for investors to understand;
  • The Demo effect is stunning;
  • But the landing value is limited.

This has led to an influx of Manus-style projects in a short period: some successfully funded, others collapsing and dissolving.
Amid the hype and the capital, **the AI Agent's real performance narrative has been obscured by marketing**.


The real way out: from generic to vertical, from illusion to certainty

Under the bubble, the industry has also taken a new direction.
The medical Agent product OpenEvidence, for example, is considered a successful sample of a vertical agent:

| Design dimension | OpenEvidence practice | Manus-style generic Agent |
|---|---|---|
| User orientation | Serves the physician community only | For everyone |
| Data sources | Authoritative medical literature such as NEJM and JAMA | Web search or user input |
| Output form | Structured "evidence chain + key points" | Conversational text generation |
| Intelligence logic | Deterministic workflow + model assistance | Autonomous model decision-making |
| Hallucination control | Citation traceability + manual verification | No citation mechanism |

This turn reveals the direction of future Agent evolution:

A "Workflow + Agent" hybrid model: wrapping uncertain intelligence inside deterministic processes.
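One way to picture the hybrid model is a pipeline where deterministic steps frame the task and a verification gate rejects uncited output, with the model confined to a single bounded step. The sketch below is an assumption about how such a gate could look; `call_model`, `retrieve`, and the source list are stand-ins, not OpenEvidence's actual implementation.

```python
# Illustrative "Workflow + Agent" hybrid: deterministic retrieval and a
# deterministic citation gate sandwich one uncertain model call.
# `call_model` is a placeholder, not a real LLM API.

import re

ALLOWED_SOURCES = {"NEJM", "JAMA"}  # hypothetical curated whitelist


def retrieve(question: str) -> list[str]:
    # Deterministic step 1: fetch from a curated corpus (stubbed here).
    return ["NEJM 2023: beta-blocker outcomes trial"]


def call_model(prompt: str) -> str:
    # Placeholder for the one uncertain step: an LLM drafting an answer
    # that is expected to carry [SOURCE year] citation markers.
    return "Beta-blockers reduce mortality post-MI [NEJM 2023]."


def verify_citations(answer: str) -> bool:
    # Deterministic step 3: every answer must cite, and only from
    # whitelisted sources.
    cited = set(re.findall(r"\[(\w+)", answer))
    return bool(cited) and cited <= ALLOWED_SOURCES


def answer(question: str) -> str:
    evidence = retrieve(question)                  # deterministic
    draft = call_model(f"{question}\n{evidence}")  # uncertain, bounded
    if not verify_citations(draft):                # deterministic gate
        return "Unable to answer with verifiable citations."
    return draft
```

The design choice is the point: the model can hallucinate inside its one step, but an uncited or badly cited draft never reaches the user, which is what "pocketing uncertainty in determinism" means in practice.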


After Manus, where does AI Agent go from here?

The Manus story does not end here; it stands for an entire industry passing through a phase of disillusionment.
Several articles converge on the same core consensus:

  1. Agent is not a panacea, but a task-oriented system;
  2. Assessments need to return to the behavioral level and observability;
  3. The future belongs to vertically deep, data-driven agents.

The future of the AI Agent lies not in "flashier demos" but in "more stable engineering".
Perhaps true intelligence is not a Manus-style "illusion of omnipotence",
but the "dumb intelligence" that solves one problem to the extreme within a narrow domain.
