The results from my early tests with this are very impressive.

The interesting technical wedge isn’t a bigger model; it’s embedding a browser-driven test harness the agent trusts and iterates against. By controlling the environment (editor, preview, auth, deploys), Replit can let the agent click through UIs, assert behaviors, and refactor until the tests are green, then keep going for hours under “Max Autonomy.” That approach is cheaper and faster than generic desktop automation precisely because it narrows the domain.
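To make the "iterate until green" loop concrete, here is a rough sketch of its shape. It is not Replit's implementation or API: `Check`, `RunResult`, and `iterate_until_green` are hypothetical names, the app is a plain dict standing in for a real browser session, and each `repair` callable stands in for whatever code edit the agent would make.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for the app under test. A real harness would
# drive a browser preview; here the "UI" is just a dict of state.
AppState = Dict[str, str]


@dataclass
class Check:
    name: str
    passes: Callable[[AppState], bool]  # assertion on observed behavior
    repair: Callable[[AppState], None]  # stand-in for the agent's code edit


@dataclass
class RunResult:
    green: bool
    iterations: int  # number of full harness runs performed
    history: List[List[str]] = field(default_factory=list)  # failing checks per run


def iterate_until_green(
    app: AppState, checks: List[Check], max_iterations: int = 10
) -> RunResult:
    """Run every check, repair the first failure, and repeat until all pass
    or the iteration budget is spent."""
    result = RunResult(green=False, iterations=0)
    for i in range(1, max_iterations + 1):
        failing = [c for c in checks if not c.passes(app)]
        result.iterations = i
        result.history.append([c.name for c in failing])
        if not failing:
            result.green = True
            return result
        # Fix one failure per pass, then re-run the whole suite so that
        # regressions introduced by the fix are caught on the next run.
        failing[0].repair(app)
    return result


app = {"title": "", "login_button": "missing"}
checks = [
    Check(
        "title is set",
        lambda s: s["title"] == "Dashboard",
        lambda s: s.update(title="Dashboard"),
    ),
    Check(
        "login button renders",
        lambda s: s["login_button"] == "visible",
        lambda s: s.update(login_button="visible"),
    ),
]
result = iterate_until_green(app, checks)
print(result.green, result.iterations)
```

The point of the sketch is the narrowed domain: because the harness owns both the environment and the assertions, every pass ends in a cheap, unambiguous pass/fail signal, and the budget (`max_iterations` here) is the only thing bounding how long the agent keeps going.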