Research and summarization in the browser are often inefficient: context switching and redundant processes waste time and increase cognitive load.
Pain Points
- High cognitive load due to context switching
- Time-consuming summarization and research tasks
- Redundant LLM calls increasing hidden costs
- Unstable task completion
I've been developing a page-aware AI browser assistant, Browse Bot, for some time now. It's an agent-like assistant that helps reduce context switching and save time on summarization and research tasks. I've just rolled out a new version and wanted to share with the community the 3 key changes that improved its core productivity metrics. Maybe it'll be useful to some of you.

**1 - References instead of element search via LLM**

When parsing a page, each interactive element receives a unique reference (@1, @2, etc.). The agent clicks on the reference directly, without an additional LLM call to locate the element. This removed 3-4 hidden LLM calls per task.

**2 - Context compression between steps**

Using the prepareStep hook (AI SDK v6), the results of old steps are replaced with single-line summaries, so the context no longer grows linearly with each step.

**3 - A software guardrail after actions**

After each click, the system injects a verification instruction: the agent must read the page and confirm that the result matches expectations. If it doesn't, the agent stops and notifies the user.

**How the metrics improved for the same task:**

* Tokens: 65k → 28k (-57%)
* LLM calls: ~9 → 6 (-33%)
* Task completion: unstable → stable
* Hidden costs: present → eliminated

I'd be happy to answer any questions or share the link to the extension if some of you want to try it (available for Chrome and Firefox).
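For anyone curious how the reference scheme from point 1 might work, here's a minimal, framework-free sketch. The element shape and function names are my illustration, not Browse Bot's actual API:

```typescript
// Hypothetical sketch: assign stable references to interactive elements
// so the model can answer "click @3" instead of describing the element,
// and the executor can resolve that reference without another LLM call.

interface PageElement {
  tag: string;
  text: string;
}

interface RefMap {
  snapshot: string;                // text representation sent to the LLM
  byRef: Map<string, PageElement>; // lookup used when executing a click
}

function assignRefs(elements: PageElement[]): RefMap {
  const byRef = new Map<string, PageElement>();
  const lines = elements.map((el, i) => {
    const ref = `@${i + 1}`;
    byRef.set(ref, el);
    return `${ref} <${el.tag}> ${el.text}`;
  });
  return { snapshot: lines.join("\n"), byRef };
}

// The executor resolves the model's answer (e.g. "@2") directly:
function resolveClick(map: RefMap, ref: string): PageElement {
  const el = map.byRef.get(ref);
  if (!el) throw new Error(`Unknown reference: ${ref}`);
  return el; // no extra LLM call needed to locate the element
}
```

With two elements, `assignRefs` produces a snapshot like `@1 <a> Home` / `@2 <button> Buy`, and `resolveClick(map, "@2")` returns the button in one map lookup.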
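And the guardrail from point 3, as a rough sketch. The instruction text and the VERIFIED/FAILED protocol here are hypothetical, chosen just to show the shape of the check:

```typescript
// Hypothetical guardrail sketch: after each click the executor appends a
// verification instruction, then halts the run if the agent reports a
// mismatch instead of silently continuing.

interface Message {
  role: "user" | "assistant";
  content: string;
}

const VERIFY_INSTRUCTION =
  "Read the current page and confirm the last action had the expected " +
  "result. Reply VERIFIED if it did, or FAILED: <reason> if it did not.";

function injectGuardrail(history: Message[]): Message[] {
  return [...history, { role: "user", content: VERIFY_INSTRUCTION }];
}

// Decide whether to continue based on the agent's verification reply.
function shouldContinue(reply: string): { ok: boolean; reason?: string } {
  if (reply.startsWith("FAILED:")) {
    return { ok: false, reason: reply.slice("FAILED:".length).trim() };
  }
  return { ok: true };
}
```

The point of doing this in software rather than in the system prompt is that the check fires after every action, so a single hallucinated "success" can't cascade through the rest of the task.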
Solution
Browse Bot is an AI browser assistant designed to improve productivity by reducing context switching and stabilizing task completion. It achieves this with unique element references, context compression between steps, and a software guardrail after each action.