"Everyone is doing Agent, but how many of you can actually think for yourself, do it for yourself, and review it for yourself?"
- Agents and Their Mainstream Frameworks in One Article
From "universal intelligence" to the Manus myth.
In 2025, AI Agents are on fire. Startups, VCs, and tech giants are all touting their own "intelligent agent revolution". In this wave, Manus has become a typical representative: hailed as the symbol of the "General Agent", yet criticized by industry insiders as a textbook sample of the bubble, selling dog meat under a sheep's head.
Manus's explosion in popularity is no accident. The article points out that its rise rests on three foundational supports:
| Core capability | Technical foundation | Significance |
|---|---|---|
| Stronger model capabilities | Large models' breakthroughs in planning and scheduling | The precondition for Manus to plan complex tasks |
| Rich toolchain | MCP, browser-use, computer-use | Gives the AI execution ability and access to external interfaces |
| Data and memory engineering | Context extension and RAG | Fewer hallucinations, better persistence and feedback |
This transformed the Agent from a "toy" into a system capable of performing real-world tasks. However, the gap between ideal and reality soon appeared: when Manus's product features were questioned, its financing path criticized, and peers even dismissed it as an "engineering wrapper", the AI Agent bubble began to burst.
The Illusion of the "Universal Agent": Many Functions Do Not Equal Intelligence
Wang Hsien pointed out sharply in his article: Manus's failure lies not in technology, but in product direction.
Generic Agent sells itself as a "jack of all trades" but is not the best in any specific scenario.
The key to this dilemma is that it fails to break through the **"scenario barrier"**:
- Lack of specialized domain data and toolchain;
- Lack of industry certifications and deep business tie-ins;
- Lack of delivery closure in high-value scenarios.
In other words, Manus can demonstrate the ability to "write reports," "look up information," and "generate images," but in a real workflow these capabilities come across as shallow and generic.
This corroborates the definition of an Agent from another article:
"Agents are not rare; the good ones are those that can think for themselves, act for themselves, and review their own work."
A truly intelligent agent is not a stack of features, but one capable of dynamic planning, cross-system collaboration, continuous learning, and self-correction.

From the framework level: the Agent's "inner workings"
To understand why Manus-like products tend to spin their wheels, we must go back to the Agent's underlying implementation frameworks.
| Framework | Characteristics | Typical scenarios | Strengths and weaknesses |
|---|---|---|---|
| AutoGPT | Autonomous planning + tool invocation | Market research, task breakdown | Highly autonomous but difficult to control |
| LangGraph | Diagrammatic Processes + State Management | Multi-Agent Collaboration | Stable but complex to develop |
| Dify | Low Code + Workflow Visualization | Content generation, knowledge quizzes | Quick to get started, but not smart enough |
| CrewAI | Team-style multi-agent collaboration | Collaborative decision-making, task delegation | Flexible but context-dependent performance |
| AutoGen (Microsoft) | Event-driven, multi-agent communication | Autonomous systems, customer service | Highly engineered and costly |
These frameworks reveal a fact:
The current Agent ecosystem is still in the "structural engineering" stage, not yet a stage of true "intelligent autonomy".
Manus, as a representative "universal Agent", is largely a repackaging layer on top of these frameworks, lacking accumulated underlying data and polished workflows.
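The "structural engineering" these frameworks provide largely boils down to a plan-act-observe loop. Below is a minimal, framework-free sketch of that loop; every name here (`run_agent`, `stub_model`, the `search` tool) is hypothetical, not any framework's real API, and the "model" is a stub rather than an actual LLM call.

```python
# A minimal sketch of the plan -> act -> observe loop that frameworks like
# AutoGPT or LangGraph structure for you. All names are illustrative.

def run_agent(task, model, tools, max_steps=5):
    """Loop: ask the model for the next action, execute the tool, feed back."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = model(history)                # model decides the next step
        if action["tool"] == "finish":
            return action["args"]["answer"]
        observation = tools[action["tool"]](**action["args"])
        history.append((action["tool"], observation))
    return None  # step budget exhausted: "highly autonomous but difficult to control"

# Stub model: search once, then finish with whatever the search returned.
def stub_model(history):
    last_tool, payload = history[-1]
    if last_tool == "task":
        return {"tool": "search", "args": {"query": payload}}
    return {"tool": "finish", "args": {"answer": payload}}

tools = {"search": lambda query: f"results for: {query}"}
print(run_agent("market research", stub_model, tools))
# prints "results for: market research"
```

The `max_steps` cap is the crude control mechanism most of these frameworks rely on; the table's "highly autonomous but difficult to control" verdict is what happens when the model never emits `finish`.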


Pitfalls of evaluation: how exactly should an Agent's intelligence be quantified?
In "Rigorous Agent Evaluation Is Harder Than It Looks," the HAL (Holistic Agent Leaderboard) team compared 9 models across 9 benchmarks over 20,000 runs, and the conclusions were striking:
"Higher reasoning effort does not mean higher accuracy."
They found out:
- In 21 of 36 cases, higher reasoning effort actually reduced accuracy;
- Top models (e.g. GPT-5, Opus 4.1) still make frequent errors;
- Agents often take "shortcuts" instead of actually solving tasks, for example:
  - searching directly for answers in web tasks;
  - hard-coding assumptions in scientific tasks;
  - booking the wrong flights and refunding incorrect amounts in customer-service tasks.
This shows:
Existing Agent evaluation criteria are too crude.
Generic accuracy metrics mask key issues such as interpretability, stability, and behavioral cost.
| Dimension | Current problem | Ideal evaluation method |
|---|---|---|
| Accuracy | High but unstable values | Add contextual observability |
| Cost | Serious token waste | Introduce Pareto efficiency curves |
| Behavioral reliability | The "shortcut" problem is severe | Combine logging with process analysis (e.g. Docent) |
| Generalizability | Large performance variation across tasks | Multi-scenario distributed comparison |
As a result, generic Agents may seem powerful at the "presentation level", but their behavior is highly uncontrollable and their evaluation transparency is poor.
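The Pareto efficiency curve mentioned in the table can be made concrete with a small sketch: given per-run (cost, accuracy) pairs, keep only the runs that no cheaper run matches or beats on accuracy. The run data below is invented for illustration; the dominated entries echo the HAL finding that more reasoning effort does not guarantee more accuracy.

```python
# Sketch of a cost-accuracy Pareto frontier. Run data is hypothetical.

def pareto_frontier(runs):
    """Keep runs not dominated by any cheaper, at-least-as-accurate run."""
    frontier = []
    # Sort by cost ascending (tie-break: higher accuracy first), then keep
    # each run only if it improves on the best accuracy seen so far.
    for name, cost, acc in sorted(runs, key=lambda r: (r[1], -r[2])):
        if not frontier or acc > frontier[-1][2]:
            frontier.append((name, cost, acc))
    return frontier

runs = [
    ("model-A-low-effort",  1.0, 0.62),
    ("model-A-high-effort", 4.0, 0.60),  # more reasoning, *lower* accuracy
    ("model-B",             2.5, 0.71),
    ("model-C",             6.0, 0.70),  # dominated by cheaper model-B
]
for name, cost, acc in pareto_frontier(runs):
    print(f"{name}: ${cost:.2f} -> {acc:.0%}")
```

Only `model-A-low-effort` and `model-B` survive; the high-effort run and the expensive near-miss both fall off the frontier, which is exactly the waste a raw accuracy leaderboard hides.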


The roots of bubbles: capital, engineering and patience
Yeh hit the nail on the head in his comments:
"Agent's fundamental flaws are in engineering, in capital, in determination."
The impatience of the domestic entrepreneurial environment has led many companies to choose to "build momentum before building things".
General Agent has become the most easily packaged "AI concept stock":
- The technology is relatively easy to replicate;
- Easy for investors to understand;
- The Demo effect is stunning;
- But the landing value is limited.
This has led to an influx of Manus-style projects in a short period of time - some successfully funded, some running off and dissolving.
Amid the hype and the capital, the AI Agent's real performance narrative has been obscured by marketing.

The real way out: from generic to vertical, from illusion to certainty
Under the bubble, the industry has also taken a new direction.
For example, the medical Agent product OpenEvidence is considered a successful sample of a vertical intelligent agent:
| Design dimension | OpenEvidence approach | Manus-style generic Agent |
|---|---|---|
| Target users | Physicians only | Everyone |
| Data sources | NEJM, JAMA, and other authoritative medical literature | Web search or user input |
| Output form | Structured "evidence chain + key points" | Conversational text generation |
| Intelligence logic | Deterministic workflow + model assistance | Autonomous model decisions |
| Hallucination control | Citation traceability + manual verification | No citation mechanism |
This turn reveals the direction of future Agent evolution:
"Workflow + Agent" hybrid model -- Pocketing uncertain intelligence with deterministic processes.

After Manus, where does AI Agent go from here?
The Manus story doesn't end there; it represents an entire industry in a phase of disillusionment.
Several articles collectively convey a core consensus:
- Agent is not a panacea, but a task-oriented system;
- Assessments need to return to the behavioral level and observability;
- The future belongs to vertically deep and data-driven intelligences.
The future of the AI Agent is not in "flashier demos" but in "more stable engineering".
Perhaps true intelligence is not a Manus-style "illusion of omnipotence",
but the "dumb intelligence" that solves one problem to the extreme within a narrow domain.
