Using AI to secure AI

MattSayar | 89 points

I think it's funny that I don't see any findings from either Claude or DataDog that couldn't be detected using static analysis. They're pretty simple code bases and maybe that's why.

I'll pay more attention when they start finding vulnerabilities in commonly used, more complex applications.

bink | 4 hours ago

We're currently living through a great litmus test of competency versus luck among company leaders.

mmsc | 5 hours ago

We’ve kinda solved the detection of issues. What we still lack is an understanding of what’s important.

I think an underappreciated use case for LLMs is to contextualize security issues.

Rather than asking Claude to detect problems, I think it’s more useful to let it figure out the context around vulnerabilities and help triage them.

(for better or worse, I am knee-deep in this stuff)

gbrindisi | 3 hours ago

Next: using AI to sue AI.

amelius | 2 hours ago

The quotation is more impactful in the original Latin: Quis custodiet ipsos custodes?

ryao | 3 hours ago

This has already been leading to some incredible profits for security companies like mine.

So please, don’t be too loud about how terrible it is :)

ofjcihen | 4 hours ago

Is this the Blackwall from Cyberpunk? Kinda reminds me of that.

scarlettadham | 2 hours ago

At this point, fuck it, do it, I'm here for the laughs now.

Let Claude run on your production servers and delete ld when something doesn't run (https://www.reddit.com/r/linux4noobs/comments/1mlveoo/help/). Let it nuke your containers and your volumes because why the fuck not (https://github.com/anthropics/claude-code/issues/5632). Let the vibecoders put out thousands of lines of shit code for their stealth B2B startup that's basically a wrapper around OpenAI and MySQL (5.7, because ChatGPT read online that MERN is a super popular stack but relational databases are gooder), then laugh at them when it inevitably gets "hacked" (the user/pw combo was admin/admin and PHPMyAdmin was open to the internet). Burn through thousands of CPU hours generating dogshit code, organising "agents" that cost you 15 cents to do a curl https://github.com/api/what-did-i-break-in/cba3df677. Have Gemini record all your meetings, then don't read the notes it made, and make another meeting with 5 different people the next week.

It will reveal a bunch of things: which companies are run by incompetent leaders, which ones are running on incompetent engineers, and which ones keep existing because some dumbass VC wants to throw money in the money-burning pit.

Stand back, have a laugh. When you're thrust in a circus, don't participate in the clown show.

ohdeargodno | 4 hours ago

Who watches the watchmen?

johntiger1 | 3 hours ago

According to my company's senior leadership, there's nothing the magic dust of AI can't solve. Even problems with AI can be solved by more AI.

malfist | 6 hours ago