AI browser assistant for improved productivity
An AI browser assistant that reduces context switching, saves time on summarization and research tasks, and makes task completion more stable. It does this with unique element references, context compression between steps, and software guardrails that verify each action.
Key Features
- Unique references for elements to reduce LLM calls
- Context compression between steps to manage context growth
- Software guardrails to verify actions and ensure task stability
- Cross-browser compatibility (Chrome and Firefox)
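
The element-reference feature in the list above can be sketched roughly as follows. This is a minimal illustration, not Browse Bot's actual code: the `PageElement` shape, the `assignRefs` and `resolveRef` helpers, and the prompt line format are all assumptions made for the example. The idea is that the model sees labels such as `@1`, and the extension resolves a chosen label from a lookup table instead of asking the LLM to find the element again.

```typescript
// Hypothetical shape of an interactive element extracted from a page.
interface PageElement {
  tag: string;
  text: string;
}

interface RefSnapshot {
  // Text shown to the model, one line per element, e.g. `@1 <button> "Sign in"`.
  prompt: string;
  // Lookup table used to resolve a reference without another LLM call.
  refs: Map<string, PageElement>;
}

// Give every element a stable label (@1, @2, ...) and remember the mapping.
function assignRefs(elements: PageElement[]): RefSnapshot {
  const refs = new Map<string, PageElement>();
  const lines: string[] = [];
  elements.forEach((el, index) => {
    const ref = `@${index + 1}`;
    refs.set(ref, el);
    lines.push(`${ref} <${el.tag}> "${el.text}"`);
  });
  return { prompt: lines.join("\n"), refs };
}

// Resolve a label the model returned; fail loudly on an unknown label.
function resolveRef(snapshot: RefSnapshot, ref: string): PageElement {
  const el = snapshot.refs.get(ref);
  if (el === undefined) {
    throw new Error(`Unknown element reference: ${ref}`);
  }
  return el;
}
```

In a real extension the map would presumably hold DOM nodes or selectors rather than plain objects, so that a resolved reference can be clicked directly.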
I've been developing a page-aware AI browser assistant, Browse Bot, for some time now. It’s an agent-like assistant that helps reduce context switching, saves time on summarization and research tasks, and so on. I’ve just rolled out a new version and wanted to share with the community the 3 key changes that improved its essential productivity metrics. Maybe it’ll be useful to some of you.

**1 - References instead of element search via LLM**

When a page is parsed, each element receives a unique reference (@1, etc.). The agent clicks on the reference directly, without an additional LLM call to search for the element. This allowed me to remove 3-4 hidden LLM calls per task.

**2 - Context compression between steps**

Using the prepareStep hook (AI SDK v6), the results of old steps are replaced with single-line summaries. Context no longer grows linearly with each step.

**3 - A software guardrail after actions**

After each click, the system injects a verification instruction: the agent must read the page and ensure that the result matches expectations. If not, it stops and notifies the user.

**How the metrics for the same task improved:**

* Tokens: 65k → 28k (-57%)
* LLM calls: ~9 → 6 (-33%)
* Task completes correctly: unstable → stable
* Hidden costs: present → none

I’d be happy to answer any questions or share the link to the extension if any of you would like to try it (available for Chrome and Firefox).
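
For anyone curious how changes 2 and 3 might look in code, here is a minimal sketch. It is not the extension's implementation and deliberately does not call the AI SDK: the `StepRecord` type, `compressHistory`, `verificationInstruction`, and the 80-character limit are all hypothetical, and the "summary" here is a plain truncation rather than whatever summarization Browse Bot really uses. The logic is the kind of thing one could run from a per-step hook such as prepareStep.

```typescript
// Hypothetical record of one finished agent step (not the AI SDK's own type).
interface StepRecord {
  action: string; // e.g. "click @3"
  result: string; // full tool output, possibly thousands of characters
}

// Assumed limit for a one-line summary; tune to taste.
const MAX_SUMMARY_CHARS = 80;

// Collapse a step result into a single line, truncated to MAX_SUMMARY_CHARS.
function summarize(step: StepRecord): string {
  const oneLine = step.result.replace(/\s+/g, " ").trim();
  const short =
    oneLine.length > MAX_SUMMARY_CHARS
      ? oneLine.slice(0, MAX_SUMMARY_CHARS - 3) + "..."
      : oneLine;
  return `${step.action}: ${short}`;
}

// Keep the last `keepFull` steps verbatim and replace older ones with
// one-line summaries, so the context stops growing with every full result.
function compressHistory(steps: StepRecord[], keepFull = 1): string[] {
  const cutoff = Math.max(0, steps.length - keepFull);
  return steps.map((step, i) =>
    i < cutoff ? summarize(step) : `${step.action}: ${step.result}`
  );
}

// Guardrail text appended after a click: the agent must check the page
// against the expected outcome and stop if it does not match.
function verificationInstruction(expected: string): string {
  return (
    `Read the page now and confirm that: ${expected}. ` +
    `If the page does not match, stop and notify the user.`
  );
}
```

With `keepFull = 1`, only the most recent step keeps its full output; every older step costs at most one short line, which is where the token savings would come from in a setup like this.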