Skip to content
All tags

#bytedance

2 posts
ai deep-dive

Midscene.js: Betting on Pure Vision for Cross-Platform UI Automation

An MIT-licensed open-source UI automation framework from ByteDance (~13k GitHub stars). UI actions rely solely on feeding screenshots to vision-language models (Qwen3-VL / Doubao / Gemini-3 / UI-TARS), with no DOM parsing. A single JS API works across Web / Android / iOS / desktop, and starting from v1.0, the DOM action mode was removed entirely. The trade-off: each step is slower and more token-expensive.

tech project

DeerFlow: ByteDance's Open-Source Super Agent Harness for Long-Running Research Tasks

DeerFlow is ByteDance's open-source Super Agent Harness built on Python 3.12 + LangGraph. It orchestrates long-running tasks through sandboxes, long-term memory, sub-agents, skills, and a messaging gateway. It hit #1 on GitHub Trending in February 2026, now surpassing 63,000 stars, with support for Telegram/Slack/Feishu, Claude Code integration, and multiple search backends.