Blog

AI Blog: insight into the frontiers of artificial intelligence, sharing technology and trends!

Gemini 3 Strikes Late at Night! Beating GPT-5.1, Google's AI Throne Is Finally Secure

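Google quietly launched the Gemini 3 Pro model at 3 a.m., with no launch event. The model topped LMArena with an Elo score of 1501 and scored 45.8% on Humanity's Last Exam (HLE), 81% on MMMU-Pro, and 87.6% on Video-MMMU, outperforming GPT-5.1. Its 1-million-token context window supports long-content processing, and its Deep Think capability set a new high of 45.1% on ARC-AGI-2. Google also introduced the Google Antigravity agent platform. Users can try the model through the Gemini app or Google AI Studio.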

Grok 4.1 Quietly Released! General Capabilities Dominate Across the Board, Emotional Intelligence Takes First Place

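Elon Musk's xAI quietly released Grok 4.1, which topped the LMArena leaderboard with 1483 points and took the top two spots on the EQ-Bench3 emotional intelligence test. The new model makes a qualitative leap in creativity, emotional interaction, and collaborative interaction, with a user preference rate of 64.78% and a markedly lower hallucination rate. It is now fully available through the X platform and the mobile apps.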

Gemini 3 Debuts Early! The AI Revolution Behind Buffett's 30.5 Billion Yuan Bet

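Although Google's Gemini 3 has not been officially released, it has already surfaced through an early preview in the app and on third-party platforms, showing strong capabilities such as SVG drawing and game development. After trying it, Buffett took a large position in Alphabet worth 4.3 billion USD (about 30.5 billion RMB), making it Berkshire Hathaway's tenth-largest holding. Alphabet's share price has surged 46% this year, Google is accelerating from AI chaser to front-runner, and the AI technology revolution has won strong backing from capital.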

GPT-5.1 Quietly Goes Live: OpenAI Finally Hears What Users Want

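OpenAI quietly released GPT-5.1 on November 12. The update drops the usual benchmark-driven marketing and focuses on users' emotional needs. Core upgrades include GPT-5.1 Instant (warmer and more conversational, with adaptive reasoning) and GPT-5.1 Thinking (better allocation of thinking time). It offers eight chat style presets (Professional, Candid, and Quirky are new) and lets users fine-tune traits such as warmth and conciseness. The safety evaluation adds mental health and emotional reliance dimensions, and some metrics regressed slightly. Paid users are getting access gradually and can fall back to the older models for three months, underscoring AI's shift from a tool to a partner that understands its users.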

Kimi K2 Thinking Suddenly Released! 1 Trillion Parameters Open Source Beast Beyond GPT-5

Moonshot AI has released Kimi K2 Thinking, an open-source thinking agent model with 1 trillion parameters. Its core breakthrough is that it can chain 200-300 consecutive tool calls without human intervention to complete complex multi-step tasks. The model uses INT4 quantization to improve generation speed and a streamlined architecture to reduce computational redundancy, with a training cost of $4.6 million. It outperforms GPT-5 on several benchmarks, including agent capability (τ²-Bench Telecom, 93%), general reasoning (HLE, 44.9%), and practical programming (SWE-Bench Verified, 71.3%). The model is fully open source and free for commercial use under a modified MIT license.

LTX-2 blew up! The world's first 4K video generation model with synchronized audio and video, supported by ComfyUI!

LTX-2, released by Lightricks, is the world's first 4K video generation model with synchronized audio and video. It generates 20-second, 50 fps video from text or image input, supports lip sync between characters and their voices, can be run and deployed locally in ComfyUI, and is due to be open-sourced in late November 2025. As a professional-grade creative tool, LTX-2 makes "turning text into a cinematic short film" a reality.

KAT-Coder: Kuaishou's New Breakthrough in AI Programming

Kuaishou has launched KAT-Coder, an AI programming product matrix covering self-developed models, tools, and platforms, with support for more than 20 programming languages and many types of development tasks. Its open-source version, KAT-Dev-72B-Exp, scored 74.6% on the SWE-bench leaderboard, surpassing GPT and Claude. The model handles code generation, debugging, and optimization, is compatible with mainstream development tools, and has shown strong potential in webpage generation, e-commerce sites, and 3D effects, marking KAT's official entry into the AI programming race.

Manus and the AI Agent Bubble: From Ideal to Disillusionment

Manus, a representative of the 2025 AI Agent boom, relies on large models, tool chains, and memory technology to execute tasks. Because it lacks deep specialization in professional scenarios and closed-loop delivery, it has exposed the bubble around the "universal Agent". The root of the problem is thin engineering experience and capital-driven short-sightedness, which produce a pile of features but limited intelligence. The industry is turning to vertical fields, such as the medical Agent OpenEvidence, which emphasizes deterministic, data-driven processes. This suggests that the future belongs to "dumb intelligence": agents that are focused, evaluable, and solidly grounded.

ChatGPT Atlas: a revolution in AI browsers

OpenAI has released ChatGPT Atlas, the first AI-native browser, which deeply integrates ChatGPT's capabilities. Its core features include real-time AI-assisted summarization of and interaction with web content, intelligent writing improvement, natural-language control of browser operations, personalized recommendations based on memory, an agent mode that automates shopping and booking tasks, and cursor chat for editing text in place. The browser improves browsing efficiency, automates tasks, and reshapes human-computer interaction.

Veo 3.1 vs Sora 2: Who is the real king of video generation?

Google's Veo 3.1 competes with OpenAI's Sora 2 in AI video generation. Veo 3.1 offers precise control and high-quality audio-video synchronization, which suits professional long-form video, while Sora 2 offers smooth, natural motion and entertainment value, which suits creative short videos. Each has its strengths, and the choice depends on the application scenario.

In-depth Review of Six Mainstream AI Agents: Exploring Product Value and Development Direction

The article reviews six mainstream AI Agent products (Manus, Coze Space, Lovart, Flowith Neo, Skywork, and Super Magee) and analyzes their market competitiveness in terms of execution capability, trustworthiness, and frequency of use. Lovart, Skywork, and Super Magee excel in their respective verticals, each with a total score of 18, while the general-purpose agents face entry-point and integration challenges. The article concludes that the coexistence of specialized and general-purpose agents, deliverability, trust mechanisms, and entry-point integration will be important directions for Agent development.

Cursor MCP Servers Configuration Guide and Cursor Practical MCP Recommendations

MCP (Model Context Protocol) is a protocol that lets large models interact with external tools and services. Through its MCP Servers feature, Cursor IDE lets AI assistants invoke tools to run searches, browse the web, and operate on code. MCP servers can be added through the Settings interface and configured at both the global and project levels. MCP servers can be written in multiple languages, and the AI can run their tools automatically or with manual approval and return results, including images. Recommended resources include Awesome-MCP-ZH, AIbase, and several MCP client tools. Commonly used MCP services such as Sequential Thinking, Brave Search, and Magic MCP enhance the AI's reasoning, search, and front-end development efficiency, respectively.
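To make the two configuration levels concrete, here is a minimal sketch in Python that builds the kind of JSON a Cursor MCP configuration typically contains. Several details are assumptions not stated in the summary above: the file locations (`.cursor/mcp.json` inside a project for the project level, `~/.cursor/mcp.json` for the global level), the top-level `mcpServers` key, the npm package names, and the `BRAVE_API_KEY` variable. The helper names `build_mcp_config` and `render` are hypothetical. Check the current Cursor and server documentation before relying on any of them.

```python
import json


def build_mcp_config(brave_api_key: str) -> dict:
    """Return a dict shaped like an MCP server configuration.

    Assumption: Cursor reads a top-level "mcpServers" object whose keys are
    server names and whose values describe how to launch each server.
    """
    return {
        "mcpServers": {
            # Assumed npm package name for the Sequential Thinking server.
            "sequential-thinking": {
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-sequential-thinking"],
            },
            # Assumed npm package name and environment variable for Brave Search.
            "brave-search": {
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-brave-search"],
                "env": {"BRAVE_API_KEY": brave_api_key},
            },
        }
    }


def render(config: dict) -> str:
    """Serialize the configuration as indented JSON text."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Print the JSON so it can be pasted into a project-level or global file.
    print(render(build_mcp_config("YOUR_API_KEY")))
```

Saving the printed JSON in the project-level file would scope the servers to one repository, while the global file would make them available everywhere. Keeping API keys out of version control matters more for the project-level variant.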

Veo 3 in-depth analysis: a landmark breakthrough in Google's AI video generation

In May 2025, Google launched Veo 3, the first model to generate synchronized audio and video, so that AI video characters can "speak". Its breakthroughs include 4K picture quality, physical consistency, and sound synchronization. It uses V2A technology to encode the visual content of a video into semantic signals and generate matching audio tracks, with applications in talk shows, live gaming, concerts, and other scenes. Although complex action generation remains a weakness, the commercial prospects are significant, with tiered pricing and a likely impact on the traditional advertising and film production industries.

In-depth analysis of Gemma model variants: technological breakthroughs and real-world applications of AI in vertical domains

Google's three newly released specialized Gemma models, MedGemma, SignGemma, and DolphinGemma, represent an important shift in AI models from generality to deep vertical-domain adaptation. MedGemma focuses on medical scenarios, providing multimodal image understanding and high-precision text reasoning; SignGemma supports multilingual sign language translation to help the hearing-impaired community communicate; and DolphinGemma explores synthesizing dolphin vocalizations to advance research on cross-species communication. These models open a new path for the industrialization of AI, improving specialized performance while keeping computational efficiency and ease of deployment in view.

Claude 4: AI Programming Assistants Come of Age

Anthropic has launched the Claude 4 series, comprising Opus 4 and Sonnet 4, focused on programming and advanced reasoning tasks. At the developer conference, CEO Dario Amodei announced that the series outperforms the competition across the board and leads on multiple benchmarks. Anthropic also launched Claude Code and new API features that are expected to drive a paradigm shift in how AI is used in software development.

Manus' new features fully revealed: AI image generation officially goes live

Manus has gone live with image generation. New users get 1,000 bonus credits plus a daily refresh of 300. The platform uses a deep-thinking process and supports multi-tool collaboration and mid-task adjustments. Test cases show that it can handle complex image generation, brand design, web deployment, and other tasks. Credit consumption is high, the free allowance for basic functions is limited, and paid subscriptions come in three tiers. Manus' strengths lie in understanding intent and executing the whole process end to end, but it suffers from slow speed, fluctuating quality, and high cost, leaving room for improvement.

OpenAI's New-Generation Programming Revolution: A Comprehensive Analysis of the Codex Agent

OpenAI launched the Codex programming agent in May 2025. Integrated with ChatGPT and based on the codex-1 model, it performs tasks such as writing code, fixing bugs, and running tests in the cloud. Codex supports GitHub integration, provides verifiable evidence of execution, and scored 72.1% on SWE-Bench. It is currently available to Pro, Enterprise, and Team users, and future updates will further improve interactivity and integration with development tools to raise software development efficiency.

Google DeepMind AlphaEvolve: The Rise of a Revolutionary AI Coding Agent

Google DeepMind has launched AlphaEvolve, an AI coding agent capable of writing and optimizing code and making scientific discoveries on its own. The system, which combines large language models, evolutionary algorithms, and automatic evaluators, has already made several breakthroughs in mathematics, such as improving matrix multiplication algorithms and solving geometric puzzles. It has also achieved significant efficiency gains in Google's data center optimization, chip design, and AI training, marking a new milestone in AI's transformation from a tool into a partner in algorithmic innovation.

10-second Figma trick: create an Apple-style Bento card web page and quickly improve design quality

Bento Grids (Apple Style) is a visual design style that is minimalistic, clear and highly organized, commonly used in modern web and mobile app interfaces. The style creates a clean reading experience by presenting content through grid modules that emphasize white space, alignment and consistency. The article also provides specific steps to realize this layout using Figma, and recommends related plug-ins and tools.

NVIDIA Llama-Nemotron: The New King of Open Source Beyond DeepSeek-R1

NVIDIA has released the open-source Llama-Nemotron AI models in 8B, 49B, and 253B versions. The flagship LN-Ultra outperforms the 671-billion-parameter DeepSeek-R1 on several benchmarks with only 253 billion parameters, while running efficiently on a single H100 node. The series uses a five-stage training process with techniques that include a reasoning on/off switch, hardware-aware optimization, and synthetic-data training. By showing that performance no longer simply scales with parameter count, the release signals an efficiency-first era for AI, and its open-source license should accelerate adoption.

Google Gemini 2.5 Pro: a multimodal evolution from video to interactive apps

Google has released Gemini 2.5 Pro, a major advance in multimodal understanding and code generation. The model outperforms competitor Claude 3.7 Sonnet in programming and is particularly adept at turning video content and hand-drawn sketches into fully functional web applications, significantly improving development efficiency. It shows transformative potential in web development, review optimization, and educational technology, creating a new paradigm for AI-assisted development.

Bolt.new: A Tutorial Guide to Creating Professional Websites with Simple Descriptions

Bolt.new is an AI-driven development platform where users generate complete websites directly from natural-language descriptions instead of writing code by hand. It can generate applications in multiple frameworks, install software packages, optimize code dynamically, and convert hand-drawn sketches. Users log in, enter their website requirements, and let the platform generate the code; they can then refine it over multiple rounds of dialog with real-time preview, and deploy or download the result. The key is to write detailed prompts that specify the type of site, style, and target audience, combined with edits in the editor to improve accuracy. Bolt.new is particularly well suited to prototyping and can be used alongside specialized tools such as Cursor for more complex projects. The platform is initially free but will charge in the future, making it suitable for entrepreneurs, content creators, and developers.

DeepSeek Releases Prover-V2 Model: 671B Parameters to Boost Math Theorem Proving

DeepSeek open-sourced the DeepSeek-Prover-V2 model, designed for mathematical proofs, on May 1, in a 671-billion-parameter version and a 7-billion-parameter version. The model combines recursion with reinforcement learning and performs well on several math tests, including an 88.9% pass rate on MiniF2F. The ProverBench dataset released at the same time contains 325 problems for evaluating the model's capabilities. Experiments found that chain-of-thought mode significantly improves proof accuracy, and that the smaller model even outperforms the larger one on specific problems. The model has been released on Hugging Face, supporting a new paradigm in mathematical research.

Qwen 3 released: 235B model outperforms R1, Grok and o1 with Apache 2.0 license

Alibaba's Tongyi Qianwen (Qwen) team has released Qwen3, a new generation of open-source large models that topped the global open-source model rankings. The series contains multiple models; the flagship outperforms a number of top models while significantly reducing deployment cost. Qwen3 set new records on a number of benchmarks and introduces a "hybrid reasoning" mode. The models support 119 languages, and the pre-training data reached 36 trillion tokens. The community response was enthusiastic, with thousands of GitHub stars within three hours.

Lovable 2.0: How a Collaborative Multi-User "Vibe Coding" Platform Is Changing Software Development

European AI company Lovable has launched its 2.0 platform for no-code software development through natural-language interaction. New features include multi-user collaboration, an intelligent chat agent, and security scanning, which significantly lower the barrier to development. It offers free and paid plans that help startup teams build product prototypes quickly, and it has 500,000 monthly users. The platform commercializes the concept of AI-driven "vibe coding" and supports digital transformation.
