An open source browser automation project that lets AI actually "go online and get work done"

"Stop copying and pasting, and let the AI find the answers on the web page itself."

If you're still struggling with any of these:

Manually combing through dozens of web pages to grab data;
Repeatedly switching between Taobao, Xiaohongshu, and paper sites to compare prices;
Trying to get AI to help you work on the web, when all it can do is "talk"...

Well, you should definitely try Nanobrowser, the open source project that recently exploded on GitHub.

Less than a week after it went live, it had collected 17,000+ stars, and developers call it an "AI-driven browser automation powerhouse".
Its goal is simple: get large models out of the chat box, onto the web page, and doing hands-on work!

🤖 What is Nanobrowser?

Nanobrowser is not a regular browser, but an AI-native web automation framework.

You can think of it this way:

"Fit your large model with arms and legs so that it can walk, click, read, and summarize freely in the real web world."

It is built by the open source community and combines a multi-agent collaboration system with a browser automation engine. It runs locally, is completely open source, and is compatible with mainstream large models (e.g. DeepSeek, MiniMax, GPT, Claude, etc.).

🛠️ How does it work? Two agents working together.

At the heart of Nanobrowser is the close cooperation of two AI roles:

1️⃣ Planner

Responsible for "figuring out what to do".
Let's say you type:

"Go to the Hugging Face papers page and look at the first three papers, summarizing the abstracts and sorting them by number of likes."

Planner automatically breaks it down into steps:
✅ Open https://huggingface.co/papers
✅ Read the first paper's title, number of likes, and abstract
✅ Record its URL
✅ Repeat for three papers in total
✅ Summarize and sort
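
The breakdown above can be pictured as an ordered list of small steps. The following is only a minimal sketch of that idea, not Nanobrowser's actual internal format: the `Step` class, the action names, and the `plan_top_papers` helper are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str  # hypothetical action name, e.g. "open", "read", "record"
    target: str  # what the action applies to


def plan_top_papers(url: str, count: int) -> list[Step]:
    """Expand one high-level request into an ordered list of steps."""
    steps = [Step("open", url)]
    for i in range(1, count + 1):
        steps.append(Step("read", f"paper {i}: title, likes, abstract"))
        steps.append(Step("record", f"paper {i}: URL"))
    steps.append(Step("summarize", "sort papers by number of likes"))
    return steps


plan = plan_top_papers("https://huggingface.co/papers", 3)
for number, step in enumerate(plan, start=1):
    print(number, step.action, step.target)
```

In this sketch the "repeat three times" step is unrolled into explicit read and record steps, so every step can be executed and checked on its own.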

2️⃣ Navigator

Responsible for "hands-on execution".
It will:

Open the page for real in your browser;
Label each clickable element, such as buttons, text boxes, and images;
Simulate human actions: clicking, scrolling, typing, reading the DOM;
Report execution results back to Planner in real time.

The whole process requires no manual intervention. It's like hiring an intern who runs the errands, takes notes, and reports back entirely on their own.
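
The division of labor described above (one role plans, the other executes and reports back) can be sketched as a simple loop. This is an illustration under stated assumptions rather than the project's real code: `run_plan`, `fake_execute`, and the retry budget are hypothetical names and values, and a real navigator would drive the browser instead of checking a string.

```python
from typing import Callable


def run_plan(
    steps: list[str],
    execute: Callable[[str], bool],
    max_attempts: int = 2,  # assumed retry budget, not taken from the project
) -> list[tuple[str, str]]:
    """Execute steps in order and return a (step, status) report for the planner."""
    report: list[tuple[str, str]] = []
    for step in steps:
        succeeded = False
        for _ in range(max_attempts):
            if execute(step):
                succeeded = True
                break
        report.append((step, "ok" if succeeded else "failed"))
        if not succeeded:
            break  # stop here so the planner can decide what to do next
    return report


# Stand-in executor: pretends that any step involving a popup fails.
def fake_execute(step: str) -> bool:
    return "popup" not in step


report = run_plan(["open page", "read title", "close popup", "summarize"], fake_execute)
for step, status in report:
    print(f"{status:>6}  {step}")
```

Stopping at the first failed step keeps the report short and hands control back to the planning side, which is one plausible way to realize the "feedback to Planner" idea.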

🧪 Real-life example: one sentence to automate complex tasks with AI

✅ Case: automatically crawl the first three Hugging Face papers

Your instruction:

Please go to https://huggingface.co/papers and browse the first three papers in order. Record the title, URL, and number of likes of each, summarize its abstract, and finally list them in order of number of likes.

What Nanobrowser does:

Automatically opens the web page;
Accurately recognizes the DOM structure of each paper;
Reads the titles, likes, and abstracts;
Returns the structured result:

1. OmniVinci (24 Likes)
   Abstract: Open source omni-modal large model with enhanced cross-modal alignment via OmniAlignNet ......
2. Skyfall-GS (15 Likes)
   Abstract: Generates high-fidelity 3D city scenes from satellite images ......
3. LightsOut (13 Likes)
   Abstract: Eliminates lens flare with diffusion models ......

Time taken: 2.5 minutes.
Cost: only 0.1 yuan (using the DeepSeek API).

If you did this manually, it would take at least 10 minutes, and you'd have to open multiple tabs.
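
The final "sort by likes and report" step of this example is easy to picture in code. The sketch below assumes the three papers have already been read from the page; the `Paper` class and `format_report` helper are hypothetical, the input order is arbitrary, and the titles, like counts, and abstracts are simply the ones quoted in the result above.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    likes: int
    abstract: str


# Values copied from the example result above; the input order is arbitrary.
papers = [
    Paper("Skyfall-GS", 15, "Generates high-fidelity 3D city scenes from satellite images"),
    Paper("LightsOut", 13, "Eliminates lens flare with diffusion models"),
    Paper("OmniVinci", 24, "Open source omni-modal large model"),
]

# Most-liked paper first.
ranked = sorted(papers, key=lambda paper: paper.likes, reverse=True)


def format_report(ranked_papers: list[Paper]) -> str:
    """Render the ranked papers in the numbered layout shown above."""
    lines = []
    for position, paper in enumerate(ranked_papers, start=1):
        lines.append(f"{position}. {paper.title} ({paper.likes} Likes)")
        lines.append(f"   Abstract: {paper.abstract}")
    return "\n".join(lines)


print(format_report(ranked))
```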

💡 What can you do with it?

Nanobrowser's potential goes far beyond paper crawling. It can easily handle the following scenarios:

Price comparison assistant:
"Find waterproof Bluetooth speakers on Taobao, JD.com, and Pinduoduo, and list the 3 cheapest ones within $50."
Public opinion monitoring:
"Crawl the last 24 hours of Xiaohongshu notes about 'LTX-2' and summarize the user reviews."
Data analyst:
"Extract 2025 Q3 provincial GDP data from the national statistics office (NSO) web pages and generate a CSV."
Content creator:
"Go to the top tech channels on YouTube and grab the latest 5 video titles and descriptions to help me find topic ideas."
Academic research:
"Search arXiv for 'AI video generation' and download the abstracts, sorted by citation count."

Bottom line: Nanobrowser aims to take care of the web tasks that normally require human eyes and human hands.
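
Taking the price comparison scenario above as an example, once listings have been collected from the pages, the last step is ordinary filtering and sorting. The sketch below shows only that post-processing step; the `cheapest_within` helper and every listing in `listings` are made up for illustration and are not real prices.

```python
def cheapest_within(listings: list[dict], budget: float, top_n: int = 3) -> list[dict]:
    """Keep listings at or under the budget and return the top_n cheapest."""
    affordable = [item for item in listings if item["price"] <= budget]
    return sorted(affordable, key=lambda item: item["price"])[:top_n]


# Hypothetical sample data standing in for what would be scraped from each site.
listings = [
    {"site": "Taobao", "name": "Speaker A", "price": 39.0},
    {"site": "JD.com", "name": "Speaker B", "price": 55.0},
    {"site": "Pinduoduo", "name": "Speaker C", "price": 28.5},
    {"site": "Taobao", "name": "Speaker D", "price": 49.9},
    {"site": "JD.com", "name": "Speaker E", "price": 45.0},
]

for item in cheapest_within(listings, budget=50):
    print(item["site"], item["name"], item["price"])
```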

🧩 Technical highlights: why is it so smart?

Local operation: the automation runs on your own machine, which helps keep your data private and secure;
Multi-model support: simply configure an API key to connect the large model of your choice;
DOM perception: web elements are labeled automatically, so the AI can "see" buttons, input boxes, and forms;
Task traceability: every step of the operation is logged, and failures can be retried and debugged;
Completely open source: code, documentation, and examples are all publicly available, and the project is community-driven for rapid iteration.
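
To make the DOM perception idea above more concrete, here is a minimal sketch that walks a piece of HTML and assigns a numeric label to each interactive element, so that a model could refer to "element 1" instead of guessing at selectors. It uses Python's standard `html.parser` on a static string purely for illustration; the `ElementLabeler` class, the `INTERACTIVE_TAGS` set, and the sample page are assumptions, and the real project works on live pages inside the browser.

```python
from html.parser import HTMLParser

# Assumed set of tags treated as interactive for this sketch.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class ElementLabeler(HTMLParser):
    """Collect (index, tag, attributes) for every interactive element."""

    def __init__(self) -> None:
        super().__init__()
        self.labels: list[tuple[int, str, dict]] = []

    def handle_starttag(self, tag, attrs):
        if tag in INTERACTIVE_TAGS:
            index = len(self.labels)
            self.labels.append((index, tag, dict(attrs)))


def label_elements(html: str) -> list[tuple[int, str, dict]]:
    parser = ElementLabeler()
    parser.feed(html)
    return parser.labels


page = (
    '<form><input name="q"><button type="submit">Search</button></form>'
    '<p>Plain text is skipped.</p><a href="/papers">Papers</a>'
)
for index, tag, attrs in label_elements(page):
    print(f"[{index}] <{tag}> {attrs}")
```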

GitHub address:
👉 https://github.com/nanobrowser/nanobrowser

🚀 How to get started? 3 steps.

Install Nanobrowser (Windows / macOS / Linux supported);
Configure your large model API key (e.g., DeepSeek, MiniMax, OpenAI, etc.);
Enter a natural language command in the sidebar and click Run!

No need to write scripts, no need to know XPath. Just say what you want, and the AI goes online to do it.

🌟 Final thoughts

In the past, AI was a "question and answer machine";
Now, Nanobrowser makes it a "digital employee".

It may not be perfect: complex pop-ups are occasionally misrecognized, and dynamically loaded content requires waiting.
But its direction is unmistakable: bringing automation back to intelligence and making agents truly "actionable".

If you're tired of repetitive web operations,
if you want an AI that doesn't just "talk" but "does",
then Nanobrowser may be the tool you've been waiting for!
