- tool-server.mjs: extractElements() scrapes all interactive elements with coordinates
- tool-server.mjs: formatElements() returns numbered list for LLM to read
- tool-server.mjs: click/type now support {index: N} for element-based interaction
- tool-server.mjs: new /api/browser/elements and /api/browser/keypress endpoints
- browser-tool.ts: updated schema with index, key params and elements/keypress actions
- browser-tool.ts: elementsText included in every LLM response so model can see the page
- browser-tool.ts: detailed workflow instructions in tool description
- Enables text-only models (e.g. Llama 3.3) to navigate and interact with web pages
- Merge 3 servers into single tool-server.mjs on port 7700
- HTTP API: POST /api/bash, /api/browser/*
- WebSocket: /ws/terminal (xterm.js panel)
- WebSocket: /ws/browser (live browser panel)
- Shared Playwright instance between LLM browser tool and user panel:
  - When AI navigates a page, user sees it live in browser panel
  - When user clicks in panel, AI tools see the same page state
- Remove standalone terminal-server.mjs (was :7701)
- Remove standalone browser-server.mjs (was :7702)
- Update browser-panel.ts: ws://localhost:7700/ws/browser
- Update terminal-panel.ts: ws://localhost:7700/ws/terminal
- Agent Zero-inspired system prompt with:
  - Structured problem-solving methodology (analyse/plan/execute/verify/report)
  - Clear tool usage rules (no tools for casual chat)
  - Detailed tool descriptions with usage guidance
  - Resourceful retry behaviour on failures
- npm run dev starts both Vite and the unified server via concurrently
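
For reference, the element-indexing flow described in the tool-server.mjs items (a numbered list for the model, then {index: N} resolved back to coordinates) could look roughly like the sketch below. This is an illustration only, not the code in tool-server.mjs: the PageElement shape, the zero-based numbering, the output format, and the resolveIndex helper are all assumptions.

```typescript
// Hypothetical element shape; the real extractElements() output is not shown in this log.
interface PageElement {
  tag: string;
  text: string;
  x: number;
  y: number;
}

// Render elements as a numbered list that a text-only model can read.
// Zero-based numbering and this line format are assumptions.
function formatElements(elements: PageElement[]): string {
  return elements
    .map((el, i) => `[${i}] <${el.tag}> "${el.text}" at (${el.x}, ${el.y})`)
    .join("\n");
}

// Hypothetical helper: resolve an {index: N} argument back to click coordinates.
function resolveIndex(
  elements: PageElement[],
  index: number
): { x: number; y: number } {
  const el = elements[index];
  if (el === undefined) {
    // Covers out-of-range, negative, and non-integer indices.
    throw new RangeError(`No element at index ${index}`);
  }
  return { x: el.x, y: el.y };
}
```

Keeping the list order stable between the formatted text and the index lookup is what lets the model refer to an element by number alone; re-extracting elements after each action would avoid clicking stale coordinates.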