Anthropic's Compute Conundrum: A Default Decision That's Causing More Harm Than Good
Anthropic has been fumbling lately. They gained a lot of good will when they argued with trump about autonomous spying and weapon control but they’ve been burning it ever since. I cancelled my monthly 20x sub when they decided to not allow me to use my subscription with openclaw. They made the argument that it was for compute optimization, and I can see the strength of that argument considering how horribly optimized the openclaw initial prompt is, but that argument really falls down with them setting 1M as the default model.
Speaking of which, they’re clearly very strained on compute and it feels like that started with making 1M the default. In case you aren’t aware, every single turn you send to the LLM requires sending the entire message history for the session to generate the next turn. Yes caching comes into play, but unambiguously processing a session at 800k session length is much more expensive than at 180k. Not just for them, but also for you, the user; token use grows exponentially as context length grows. If your company is struggling with having enough compute, why make the default 1M? Could that strain have made negotiations with google and other compute providers more difficult?