I get a daily AI research briefing without opening a browser

ArXiv Watcher · 4 min read

Keeping up with ML research used to mean opening ArXiv, scrolling through 50 paper titles, opening 10 in tabs, skimming abstracts, and bookmarking 3 I'd never read. It took 45 minutes and most of it was noise.

I wanted a 2-minute version: what came out today that's actually relevant to my work, in plain English, on my phone while I drink coffee.

The agent

ArXiv Watcher searches ArXiv by keyword, author, or category. It summarizes abstracts, can break down specific papers in detail, and keeps a running log of everything you've discussed. Think of it as a research assistant that actually reads the papers.
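The agent's internals aren't shown here, but the three search modes map naturally onto the public arXiv API, which accepts field prefixes such as `all:`, `au:`, and `cat:` in its `search_query` parameter. Below is a minimal sketch of how such a query could be built; the function name `build_arxiv_query` and its defaults are hypothetical, not something ArXiv Watcher documents.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(keyword=None, author=None, category=None, max_results=10):
    """Build an arXiv API URL from optional keyword, author, and category filters.

    Hypothetical helper for illustration; not ArXiv Watcher's actual code.
    """
    clauses = []
    if keyword:
        # Quote the phrase so a multi-word keyword is matched as a unit.
        clauses.append(f'all:"{keyword}"')
    if author:
        clauses.append(f'au:"{author}"')
    if category:
        clauses.append(f"cat:{category}")
    if not clauses:
        raise ValueError("provide at least one of keyword, author, or category")

    params = {
        "search_query": " AND ".join(clauses),
        "sortBy": "submittedDate",   # newest submissions first
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


url = build_arxiv_query(keyword="LLM agents", category="cs.AI", max_results=5)
print(url)
```

Fetching that URL returns an Atom feed whose entries carry the titles and abstracts a summarizer would work from.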

Setting it up

Took me about 7 minutes from start to first message:

1. Find the agent

Head to the Hiregents marketplace and find ArXiv Watcher. If you want to know exactly what it can do, read the skill file, which is open for anyone to inspect.

2. Sign in

Click Deploy. You'll get a magic link by email. No passwords, no account setup. Click the link and you're in.

3. Pick a plan

Hobby ($39/mo) or Pro ($49/mo). Both get you a private server. The difference is the AI model: Hobby runs Gemini Flash, Pro runs Claude Sonnet. Annual billing saves 25%.
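To see what the annual discount comes to, here is a quick calculation. It assumes the 25% is taken off twelve times the monthly price; the actual annual price shown at checkout may be rounded differently.

```python
# Monthly prices in USD, taken from the plan descriptions above.
MONTHLY_PRICES = {"Hobby": 39, "Pro": 49}
ANNUAL_DISCOUNT = 0.25  # assumption: applied to 12x the monthly price


def annual_cost(monthly_price, discount=ANNUAL_DISCOUNT):
    """Return the yearly cost under annual billing."""
    return monthly_price * 12 * (1 - discount)


for plan, price in MONTHLY_PRICES.items():
    print(f"{plan}: ${price * 12} billed monthly, ${annual_cost(price):.2f} billed annually")
```

Under that assumption Hobby works out to $351 a year instead of $468, and Pro to $441 instead of $588.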

4. Connect your keys

You need two things: an OpenRouter API key (takes 30 seconds to create at openrouter.ai) and a Telegram bot token (message @BotFather on Telegram, type /newbot, follow the prompts). Both are validated in real time before you can continue.
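How the setup form validates keys isn't documented, but you can sanity-check a bot token yourself before pasting it in. The sketch below covers only the Telegram half and calls the Bot API's `getMe` method, which succeeds only for a valid token. The helper names and the token-shape regex are assumptions for illustration; Telegram doesn't publish an exact token format.

```python
import json
import re
import urllib.request

# Assumed token shape: numeric bot ID, a colon, then a URL-safe secret.
# A loose pre-check only; the getMe call is the real test.
TOKEN_SHAPE = re.compile(r"^\d+:[A-Za-z0-9_-]{20,}$")


def looks_like_bot_token(token):
    """Cheap local check that catches obvious copy-paste mistakes."""
    return bool(TOKEN_SHAPE.match(token.strip()))


def get_me_url(token):
    """URL of the Bot API getMe method for this token."""
    return f"https://api.telegram.org/bot{token.strip()}/getMe"


def validate_bot_token(token, timeout=10):
    """Return True if Telegram accepts the token, False otherwise."""
    if not looks_like_bot_token(token):
        return False  # skip the network call for malformed input
    try:
        with urllib.request.urlopen(get_me_url(token), timeout=timeout) as resp:
            return bool(json.load(resp).get("ok"))
    except (OSError, ValueError):
        # Rejected token (HTTP error), network failure, or unparseable reply.
        return False
```

Calling `validate_bot_token("<your token>")` returns `True` only when Telegram recognizes the bot, so a typo shows up before you reach checkout.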

5. Review and deploy

Check the summary: agent name, plan, email. Hit Deploy. You'll go through Stripe checkout and land on a status page.

6. Wait about 5 minutes

The status page shows a progress circle while your server boots. Docker pulls, security gets configured, the agent installs its tools. You don't need to stay on the page. It'll be ready when you come back.

7. Open Telegram

Find your bot on Telegram (the one you created with BotFather). Send /start. ArXiv Watcher is live.

What the conversation looks like

Me: What new AI agent papers dropped this week?

ArXiv Watcher: Found 4 notable papers:

1. "AgentBench v2": Updated benchmark for LLM agents. Tests planning, tool use, and self-correction. GPT-4o scores 67%, Claude 3.5 scores 71%.
2. "ReAct-Retry": Adds automatic retry with reflection to ReAct agents. 23% fewer failures on multi-step tasks.
3. "Memory-Augmented Agents": Proposes episodic + semantic memory for long conversations. Tested on 500-turn dialogues.
4. "ToolFormer 2": Self-taught tool discovery. Agents learn new API integrations from documentation alone.

Want a deeper dive on any of these?

Me: Tell me more about the memory paper

ArXiv Watcher: "Memory-Augmented Agents for Extended Interaction" (Chen et al., 2026)

Key idea: Two-tier memory: episodic (specific past interactions) and semantic (extracted facts/preferences). The agent decides what to store and when to retrieve.

Results: On 500-turn dialogues, memory-augmented agents maintained coherence 3x longer than fixed-context approaches. Most gains came from semantic memory: knowing user preferences mattered more than remembering specific exchanges.

Limitation they flag: memory corruption. If the agent stores something wrong early, it compounds. They suggest periodic memory audits but don't solve it.

Paper ID: 2603.14892

What I got

A research briefing that actually fits into my morning. I ask one question, get the papers that matter, then drill into the ones I care about. The agent remembers what I've looked at, so it doesn't repeat itself across days.
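How the agent stores its running log isn't documented. One simple way to get the "doesn't repeat itself" behavior is to keep a set of arXiv IDs already discussed and filter each day's results against it. The sketch below assumes a JSON file on disk; the `SeenLog` class and its file layout are hypothetical.

```python
import json
import re
from pathlib import Path

# New-style arXiv IDs: YYMM.NNNN or YYMM.NNNNN, with an optional version suffix.
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


class SeenLog:
    """Hypothetical log of paper IDs that have already been discussed."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.seen = set(json.loads(self.path.read_text()))
        else:
            self.seen = set()

    @staticmethod
    def _base_id(paper_id):
        if not ARXIV_ID.match(paper_id):
            raise ValueError(f"not an arXiv ID: {paper_id!r}")
        # Drop the version suffix so v1 and v2 count as the same paper.
        return paper_id.split("v")[0]

    def filter_new(self, paper_ids):
        """Return IDs not yet discussed, in order, without duplicates."""
        fresh = []
        for pid in paper_ids:
            base = self._base_id(pid)
            if base not in self.seen and base not in fresh:
                fresh.append(base)
        return fresh

    def mark_seen(self, paper_ids):
        """Record IDs as discussed and persist the log to disk."""
        self.seen.update(self._base_id(pid) for pid in paper_ids)
        self.path.write_text(json.dumps(sorted(self.seen)))
```

With this in place, a daily briefing would call `filter_new` on the day's search results, summarize only what comes back, and then pass those IDs (for example `2603.14892` from the conversation above) to `mark_seen`.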