The Hidden Cost of Bad Cron Jobs: A 15M-Tokens-a-Day Story

One of our clients had an AI agent monitoring their HubSpot instance for new contacts and engagement changes. It sounded simple enough: check for updates, send a summary, move on. Except the cron job was running every 5 minutes, pulling the entire contact database each time, and feeding it through an LLM for "analysis." The result: 15 million tokens burned per day on a task that should have cost a few thousand tokens.

How It Happened

The agent was built quickly — a prototype that became production without anyone reviewing the implementation. The HubSpot API call had no pagination limits, no date filters, and no caching. Every run fetched all 40,000+ contacts and sent them to Claude for a diff analysis. The prompt was 300 lines of instructions that could have been replaced with a simple timestamp comparison. This is the most common pattern we see: AI agents that work correctly but wastefully.
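To make the "simple timestamp comparison" concrete, here is a minimal sketch. It assumes each contact is a plain dict with a hypothetical `updated_at` ISO 8601 field and that the job stores the time of its last successful run. Neither name comes from HubSpot's API or from the client's code.

```python
from datetime import datetime, timezone


def changed_since(contacts, last_run):
    """Return only the contacts modified after the previous run.

    `updated_at` is a hypothetical field name used for illustration.
    """
    return [
        c for c in contacts
        if datetime.fromisoformat(c["updated_at"]) > last_run
    ]


last_run = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
contacts = [
    {"id": 1, "updated_at": "2024-01-01T11:58:00+00:00"},  # untouched since last run
    {"id": 2, "updated_at": "2024-01-01T12:03:00+00:00"},  # modified after last run
]
recent = changed_since(contacts, last_run)  # only contact 2 survives the filter
```

A filter like this costs zero tokens, which is the point: the LLM never needs to see records that have not changed.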

The Real Cost

At 15M tokens/day across input and output, the API bill was approaching $2,000/month for a single monitoring job. The client had 22 cron jobs running across their agent fleet. Several others had similar inefficiencies, just at smaller scale. The total waste across the system was roughly $4,500/month in unnecessary API costs. That's $54K/year thrown away on work that could be done for under $500/month.
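The arithmetic behind those figures is worth spelling out. The sketch below uses only the numbers stated above, plus two assumptions of ours: a 30-day month, and the "approaching $2,000" bill treated as an upper bound. The per-run and per-million-token values are averages implied by those inputs, not figures from the client's invoice.

```python
TOKENS_PER_DAY = 15_000_000   # stated above
MONTHLY_BILL_USD = 2_000      # "approaching $2,000/month", treated as an upper bound
RUN_INTERVAL_MIN = 5          # cron schedule stated above
WASTE_PER_MONTH_USD = 4_500   # fleet-wide waste stated above
DAYS_PER_MONTH = 30           # assumption for the estimate

runs_per_day = 24 * 60 // RUN_INTERVAL_MIN            # 288 runs
tokens_per_run = TOKENS_PER_DAY / runs_per_day        # implied average, about 52,000
tokens_per_month = TOKENS_PER_DAY * DAYS_PER_MONTH    # 450 million
usd_per_million = MONTHLY_BILL_USD / (tokens_per_month / 1_000_000)  # about 4.44 blended
waste_per_year = WASTE_PER_MONTH_USD * 12             # 54,000
```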

The Fix: Incremental, Not Exhaustive

We replaced the full-database pull with an incremental sync using HubSpot's recently modified contacts endpoint, and added a local cache layer so the agent only processes genuinely new data. We also replaced the LLM-based diff with a deterministic comparison; no AI is needed to answer "did this field change?" The LLM now only sees the 5-10 contacts that actually changed, with a focused prompt asking for actionable insights. Same functionality, 99.7% fewer tokens.
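A deterministic diff against a local cache can be sketched in a few lines. This is an illustration, not the client's implementation: the cache is an in-memory dict keyed by contact ID, and the tracked field names (`email`, `lifecycle_stage`) are hypothetical. A real job would persist the cache between runs.

```python
def diff_contact(cached, fresh, fields):
    """Return {field: (old, new)} for tracked fields whose value changed."""
    return {
        f: (cached.get(f), fresh.get(f))
        for f in fields
        if cached.get(f) != fresh.get(f)
    }


def sync(cache, fetched, fields):
    """Compare fetched contacts against the local cache, then update it.

    Returns {contact_id: changes} for new or changed contacts only.
    This small payload is all the LLM would need to see.
    """
    changes = {}
    for contact in fetched:
        delta = diff_contact(cache.get(contact["id"], {}), contact, fields)
        if delta:
            changes[contact["id"]] = delta
        cache[contact["id"]] = contact
    return changes


TRACKED = ["email", "lifecycle_stage"]  # hypothetical field names
cache = {1: {"id": 1, "email": "a@example.com", "lifecycle_stage": "lead"}}
fetched = [
    {"id": 1, "email": "a@example.com", "lifecycle_stage": "customer"},  # changed
    {"id": 2, "email": "b@example.com", "lifecycle_stage": "lead"},      # new
]
changes = sync(cache, fetched, TRACKED)
```

Running `sync` a second time with the same data returns an empty dict, so a quiet five-minute window sends nothing to the model at all.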

Lessons for Agent Builders

Audit your cron jobs. Not just "does it work" but "how much does it cost per run." Use deterministic logic wherever possible — not everything needs an LLM. Add pagination and date filters to every API integration. Cache aggressively. And never let a prototype become production without a cost review. The most expensive line of code in your agent system isn't the one that fails — it's the one that runs successfully, thousands of times, doing far more work than it needs to.
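A cost audit does not need tooling to get started. The sketch below ranks jobs by estimated monthly spend from their schedule and average tokens per run. The job names, token counts, and the $4 per million tokens price are made up for illustration; substitute your own measurements and your provider's actual pricing.

```python
from dataclasses import dataclass


@dataclass
class CronJob:
    name: str
    interval_minutes: int
    tokens_per_run: int


def monthly_cost_usd(job, usd_per_million_tokens, days=30):
    """Estimate a job's monthly API spend from its schedule and token use."""
    runs = (24 * 60 // job.interval_minutes) * days
    return runs * job.tokens_per_run * usd_per_million_tokens / 1_000_000


def audit(jobs, usd_per_million_tokens):
    """Return (name, monthly cost) pairs, most expensive first."""
    costs = [(j.name, monthly_cost_usd(j, usd_per_million_tokens)) for j in jobs]
    return sorted(costs, key=lambda pair: pair[1], reverse=True)


# Illustrative jobs and price; none of these numbers come from the client.
jobs = [
    CronJob("daily-digest", interval_minutes=1440, tokens_per_run=20_000),
    CronJob("contact-monitor", interval_minutes=5, tokens_per_run=50_000),
]
report = audit(jobs, usd_per_million_tokens=4.0)
```

Even with rough inputs, a ranking like this makes the outliers obvious: in the example, the five-minute job costs hundreds of times more per month than the daily one.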