Vulnerability Finding and PoCs: A Defender’s Point of View — in the era of LLM coding agents

This is an additional perspective to my 2019 post on PoC disclosure, taking into account recent advances in automated vulnerability finding and PoC creation through “AI” — more specifically, generative ML using Large Language Models (LLMs).

Context

The last two years have seen rapid development of LLMs for various use cases. I used to be very negative about most of them, both because of their inherently probabilistic nature (and therefore hard-to-control error rates) and because of the ecological cost of training and inference due to massive energy consumption. And yes, I still can’t shake the feeling that LLMs are inelegant, given the underlying brute-force approach with its stupid-big parameter counts. There must be more efficient ways of solving these problems.

Nonetheless, I have been surprised by the rapidly improving code generation quality in the last 6-12 months, and very recently by successes in finding (and maybe fixing) vulnerabilities using coding AI agents.

Further context: I very rarely have a binary opinion on security topics, because — erm — most of the time it’s just hard and complicated and a nuanced discussion is better than sticking to one position and claiming it is the only answer. There are no silver bullets, especially not in information/computer security. Don’t trust anybody who claims to solve all your problems with this new thingie (Blockchain, I am looking at you!).

I am also writing from the defender’s side, i.e. building sandboxing, mitigation, and attack-surface reduction measures at both the architectural (system) and in-depth (code) level.

Using AI coding agents for identifying potential vulnerabilities is (probably) a good thing

GenAI coding is still prone to produce many security issues, so I’m not yet holding my breath on auto-generated code going into production without manual oversight.

But the opposite direction — automating vulnerability finding — is much more promising, because we should be able to verify the output. So if the AI agent(s) produce:

  • false positives (claimed vulnerabilities that aren’t real), then the PoC simply won’t trigger an exploit, so there’s no reduction in overall security (the only harm being the still massive energy consumption of this brute-force approach…); or
  • false negatives (fail to find real vulnerabilities), then we’re no worse off than without using these new tools.

However, it seems important that these agents not only produce text output claiming a particular bug/vulnerability, but also working PoC code so that the claim can be tested automatically. If they fail at that, we just overload human reviewers with false positives in the bug queue. The real step change is therefore an outer loop that uses ensembles of AI agents to find potential vulnerability candidates, produce PoC exploit code, test whether the PoC actually triggers a security-relevant bug, and repair the PoC if it doesn’t.
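To make the shape of that outer loop concrete, here is a minimal sketch in Python. Everything named here is a placeholder under my assumptions — `find_candidates`, `write_poc`, `refine_poc`, and the `verify` callable stand in for agent ensembles and a sandboxed PoC runner (e.g. executing under a sanitizer); this is not any real tool’s API.

```python
# Hypothetical sketch of the find -> PoC -> test -> repair outer loop.
# All callables are placeholders for agent/verifier components.
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Candidate:
    description: str            # the agent's claim about a potential bug
    poc: Optional[str] = None   # verified PoC code, filled in on success


def outer_loop(target: str,
               find_candidates: Callable[[str], Iterable[Candidate]],
               write_poc: Callable[[str, Candidate], Optional[str]],
               refine_poc: Callable[[str, Candidate, Optional[str]], Optional[str]],
               verify: Callable[[str], bool],   # e.g. run the PoC sandboxed, check for a crash
               max_attempts: int = 3) -> list[Candidate]:
    confirmed: list[Candidate] = []
    for cand in find_candidates(target):          # ensemble proposes candidates
        poc = write_poc(target, cand)             # agent drafts a PoC
        for _ in range(max_attempts):
            if poc is not None and verify(poc):   # deterministic check filters false positives
                cand.poc = poc
                confirmed.append(cand)
                break
            poc = refine_poc(target, cand, poc)   # agent repairs the failing PoC
    return confirmed                              # only verified findings reach humans
```

The key property is that nothing leaves the loop without passing `verify`, so unverifiable claims never land in a human reviewer’s queue.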

Funnily enough, this leads to the same message as my 2019 post on PoC disclosure: PoC code is positive! My first point back then was exactly that it makes it easy to test whether a bug is (still) present in a codebase. This has become much more important in the era of AI agents because of the sheer scale of real vulnerability reports to be expected.

I want to finish by pointing out that, given the current state of the art, the best use cases for GenAI/LLM tools seem to be those that come with deterministic verifiers. If it’s possible to automatically check the LLM output and filter out any incorrect result candidates, we get rid of the (badly named) “hallucination” issues. The problem of energy consumption remains a real one, though; it will need to be mitigated with much more optimized models for this approach to scale and become ecologically and economically sustainable.
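The generate-then-verify pattern itself is tiny; the hard part is always building the verifier. A minimal sketch, assuming hypothetical `generate` and `verify` callables (neither names a real library):

```python
# Minimal sketch of the generate-then-verify pattern: keep only those
# LLM outputs that pass a deterministic check. "Hallucinated" candidates
# simply fail the verifier and are dropped.
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def filtered(generate: Callable[[], Iterable[T]],
             verify: Callable[[T], bool]) -> list[T]:
    # Whatever survives is correct by construction of the verifier,
    # regardless of the error rate of the generator.
    return [c for c in generate() if verify(c)]
```

For vulnerability finding the verifier is “run the PoC, observe the crash”; for code generation it might be a compiler plus a test suite. Either way, the probabilistic generator only ever proposes, and the deterministic check decides.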