interesting. yeah, the only reason it's on cron is because the Sentry-GitHub integration didn't work for this (can't remember why), and I didn't want to maintain another webhook.
the timing is not a huge issue though, because the type of bugs being caught at this stage are rarely so critical that they need to be fixed in less time than that - and the bandwidth is limited by someone reviewing the PR anyway.
the other issue is crazy token wastage, which gets expensive. my gut instinct re: triaging is that I want to do it myself in the prompt - but if it prevents noise before reaching Claude, it may be useful for some folks just for the token savings.
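To make the token-savings point concrete, here's a minimal sketch of what a pre-Claude noise filter could look like. The alert field names and the transient-error list are assumptions for illustration, not a real schema:

```python
# Hypothetical pre-filter: drop obviously transient or trivial alerts before
# any LLM call, so no tokens are spent on noise. Field names are made up.
TRANSIENT_ERRORS = {"ConnectionResetError", "TimeoutError", "ServiceUnavailable"}

def should_send_to_llm(alert: dict) -> bool:
    """Return False for alerts not worth spending tokens on."""
    if alert.get("error_type") in TRANSIENT_ERRORS:
        # Intermittent infra blips don't need a code-change PR.
        return False
    if alert.get("count", 1) < 2 and alert.get("users_affected", 0) == 0:
        # A one-off with no affected users can wait for a human glance.
        return False
    return True
```

The thresholds are arbitrary; the idea is just that a few cheap rules can run before the expensive analysis step.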
no, I don’t receive alerts because i’m looking at the PR/issues list all day anyway, it would just be noise.
totally get the 'token wastage' point: sending noise to an LLM is literally burning money.
but another, maybe bigger, cost might be your time reviewing those 'bandaid fixes.' if you're merging only 30%, that means you're spending 70% of your review bandwidth on PRs that shouldn't exist, right?
we deduplicate before the Claude analysis (using the alert context) and after (based on the RCA), so we ensure there's no noise in the PRs you have to review
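The two-pass dedup described above could be sketched like this. The alert fields and the idea of an exact-match RCA key are assumptions; a real system would likely use fuzzier matching:

```python
import hashlib

def alert_fingerprint(alert: dict) -> str:
    """Hash the stable parts of an alert so retriggered duplicates collapse
    before any LLM call. Field names here are hypothetical."""
    key = f"{alert.get('error_type')}|{alert.get('culprit')}|{alert.get('top_frame')}"
    return hashlib.sha256(key.encode()).hexdigest()

def dedupe_alerts(alerts: list) -> list:
    """First pass: drop alerts whose fingerprint was already seen."""
    seen, unique = set(), []
    for a in alerts:
        fp = alert_fingerprint(a)
        if fp not in seen:
            seen.add(fp)
            unique.append(a)
    return unique

def dedupe_by_rca(candidates: list) -> list:
    """Second pass: after analysis, collapse fixes that share a root cause,
    so two symptoms of one bug don't become two PRs."""
    seen, unique = set(), []
    for c in candidates:
        if c["rca"] not in seen:
            seen.add(c["rca"])
            unique.append(c)
    return unique
```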
why don't you trust an agent to triage alerts+issues?
Yeah. What I find in practice is that since the majority of these PRs require manual intervention (even if minor, like a single follow-up prompt), it's not significantly better than just hammering them all out myself in one session a few times per week, giving it my full attention for that period of time.
The exception is when a fix is a) trivial or b) affecting a real user and therefore needs to be fixed quickly, in which case the current workflow is useful. But yeah, the real step-change was having Claude hit the Sentry APIs directly and get the info it needs, whether async or not.
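For context on what "hitting the Sentry APIs directly" involves, Sentry exposes a documented retrieve-an-issue endpoint (`GET /api/0/issues/{issue_id}/`, bearer-token auth). A minimal stdlib sketch, with the token handling assumed:

```python
import json
import urllib.request

SENTRY_API = "https://sentry.io/api/0"

def issue_url(issue_id: str) -> str:
    # Sentry's retrieve-an-issue endpoint: GET /api/0/issues/{issue_id}/
    return f"{SENTRY_API}/issues/{issue_id}/"

def fetch_issue(issue_id: str, token: str) -> dict:
    """Fetch one Sentry issue's metadata (title, culprit, event counts, etc.)
    so an agent can read it instead of being hand-fed the details."""
    req = urllib.request.Request(
        issue_url(issue_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```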
I'd also imagine that people's experiences with this vary a lot depending on the size and stage of the company - our focus is developing new features quickly rather than maintaining a 100% available critical production service, for example.
Interesting. It makes sense that it depends on the number of alerts you receive. But I'd think that if 70% of the PRs you receive are noise, an AI triager could be useful, if you give it the context it needs based on your best practices.
I’m very curious about the kinds of manual intervention you do on PRs when one is required. What does the follow-up prompt look like? Is it because the fix was bad, because the RCA itself was wrong, or because of something else?
The most common case would be defensive Sentry logging that tracks unexpected LLM API call responses/parsing bugs, which the app handles gracefully (so not critical), but that I still want to know about to improve the prompts, response structures, cleaning code, etc.
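That "handle gracefully but still report" pattern might look like the sketch below. The expected response shape (a JSON object with a `summary` key) is made up, and the reporter is injectable; in a real app it would be something like `sentry_sdk.capture_message`:

```python
import json
from typing import Callable, Optional

def parse_llm_response(
    raw: str,
    report: Callable[[str], None] = print,  # real app: sentry_sdk.capture_message
) -> Optional[dict]:
    """Parse an LLM response defensively: degrade gracefully, but report.

    The schema check here is hypothetical; the point is the pattern of
    surviving a bad response while still surfacing it for prompt tuning."""
    try:
        data = json.loads(raw)
        if not isinstance(data, dict) or "summary" not in data:
            raise ValueError("missing 'summary' key")
        return data
    except (json.JSONDecodeError, ValueError) as exc:
        # The app carries on with a fallback, but the event is still logged
        # so prompts / response structures / cleaning code can be improved.
        report(f"Unexpected LLM response: {exc!r}")
        return None
```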
Claude will typically resolve these in a surface-level way, when in practice they often require deeper changes (to prompts, routing to a different model, more general cleaning code, etc.), and it's hard or impossible to have Claude make those changes without some input.
Other noise arises from Claude treating related issues as unrelated and solving them separately, and from intermittent, clearly transient infrastructure issues being "solved" with some weird code change.