AI agents invade observability: snake oil or the future of SRE?

RyeCombinator | 45 points

The issue I see is that this is pretty much the final boss for AI systems. Not because the tasks are inherently too difficult or whatever, but because the integration and quality of the data are so variable that you just can't get anything done reliably.

Compare this to codebase AI, where much of the data you need lives in your codebase or repo. Even then, most of these coding tools aren't close to automating meaningful coding tasks in practice, and while that doesn't mean they can't in the future, it's a long way off!

Now in the ops world, there's little to no guarantee that a system will emit the diagnostic data you need in order to diagnose it. That weird way you're using Kafka right now? The reason for it is passed down via oral tradition on the team. Runbooks? Oh, those things we don't bother looking at since they're out of date? ...and so on.

The challenge here is in effective collection of quality data and context, not the AI models, and that's precisely what's so hard about operations engineering in the first place.

phillipcarter | 21 hours ago

I think the past few years have amply demonstrated that they can be both snake oil and the future of SRE. A dim future indeed.

RodgerTheGreat | 18 hours ago

I feel AI will get in the way the same as other products have in the past. Sure, it’ll fit some areas and we’ll hear a happy story here and there, but businesses need to focus on their core competencies and do that job well before hoping for a magic solution. We, the workers, will need to clean up the mess…

grugagag | 21 hours ago

The words reliability and LLM do not currently belong in the same sentence.

nineteen999 | 18 hours ago

> If every major APM vendor and dozens of startups release agents in the next year, it will be difficult for customers to tell what’s snake oil or what’s actually useful. One approach, also seen in the financial space, is having open benchmarks for assessing how well agents can answer questions and show domain-specific knowledge.

IME benchmarks, though valuable, don't fully reflect the real world; they often capture only what's easily quantifiable. The best approach is being able to quickly try out an agent and see how it performs in your own work environment. Sort of like a private test set you can run different agents against to quickly see how they perform in the real world.

Disclaimer: I'm building MinusX, a data science agent (github.com/minusxai/minusx)
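A minimal sketch of what such a private test set could look like. Everything here is hypothetical: the `Agent` interface, the `EvalCase` fields, the keyword-match scoring, and the example questions are stand-ins for whatever incidents and pass criteria your own environment provides.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    question: str          # a diagnostic question drawn from a past incident
    expected_keyword: str  # a crude pass signal: the answer should mention this

# Stand-in for any agent under test: a callable from question -> answer text.
Agent = Callable[[str], str]

def score(agent: Agent, cases: list[EvalCase]) -> float:
    """Fraction of private cases where the agent's answer mentions the expected signal."""
    hits = sum(
        case.expected_keyword.lower() in agent(case.question).lower()
        for case in cases
    )
    return hits / len(cases)

# Example run with a trivial fake agent, just to show the shape of the harness.
cases = [
    EvalCase("Why did checkout latency spike at 14:02?", "connection pool"),
    EvalCase("What caused the Kafka consumer lag on June 3?", "rebalance"),
]
fake_agent = lambda q: "Likely connection pool exhaustion followed by a rebalance."
print(score(fake_agent, cases))  # 1.0 for this fake agent
```

Keyword matching is obviously too crude for production use; the point is only that the harness stays private, so vendors can't overfit to it the way they can to public benchmarks.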

ppsreejith | 18 hours ago

I cannot for the life of me understand why SRE, of all roles, would be the one to attempt to use agents for. IMO it's one of the last roles they would apply to, long after core development.

I mean, is the AI going to read your source code, read all your Slack messages for context, log in to all your observability tools, run repeated queries, come up with a hypothesis, and test it against prod? Then run a blameless retrospective, institute new logging, modify the relevant processes with PRs, and create new alerts to proactively catch the problem?

As an aside, this is a garbage attempt at an article; it says basically nothing.

zug_zug | 21 hours ago

What is SRE?

1over137 | 19 hours ago

SysAdmin. Even AI needs a hero.

desktopninja | 20 hours ago

I believe that expertise combined with automation can build a strong foundation that can then be delegated to AI agents. Presumably not all of them will deliver the greater good, but let’s not lose hope; the field is still in its infancy.

ddmma | 17 hours ago

I need to buy a vowel, Pat.

I can't solve the article.

RecycledEle | 21 hours ago