The Delve Collapse and the Problem of Lazy Penetration Testing

A compliance automation startup just had a very bad winter, and the wreckage is worth studying.

In December 2025, a company called Delve accidentally left a Google spreadsheet publicly accessible, one that contained links to hundreds of their clients’ confidential audit reports. It also, rather inconveniently, revealed that Delve had been generating those audit reports themselves, complete with auditor conclusions, test results, and sign-off language, before any actual auditor had looked at a single piece of evidence. They would then hand the finished product to Indian certification mills with US mailbox addresses, who would stamp their names on it and call it a day.

When this was discovered, Delve’s CEO sent clients an email describing the exposing message as “AI-generated” and containing “falsified claims.” The claims turned out to be real, the spreadsheet was real, and the analysis that followed, published by a consortium of affected clients who apparently had both receipts and free time, runs about 30,000 words and reads like a very specific kind of nightmare.

For a compliance product that ran on the promise of speed, the collapse has been impressively thorough. But this piece is not really about Delve. They are a cautionary tale, not the point. The point is what their model reveals about a specific category of corner-cutting that shows up in the pentest world with uncomfortable regularity, and what to actually look for if you want evidence that will hold up when someone looks at it hard.

The Part That Should Concern You

The Delve investigation covers a lot of ground: fake board meeting minutes, fabricated security incident logs, nonexistent MDM systems confidently described in audit reports, the same identically worded SOC 2 conclusions stamped across 259 client reports including a grammatical error about how “there no security incidents.” But one item in particular should catch the attention of anyone who has ever needed a pentest for compliance purposes.

Delve’s platform recommended pentest-tools.com for the security testing control, and their trust pages listed both vulnerability scanning and penetration testing as completed controls regardless of what clients had actually done. One client describes it plainly: “It says we did vulnerability scanning and a pentest, when we only ever did the scan.” Another found out what that meant in practice when an enterprise prospect came asking for the pentest report they had paid for. Delve had told them the pentest-tools.com scan was sufficient to satisfy the requirement. That it was not became clear only when a real security person on the other side of a procurement review started asking questions.

Nothing against pentest-tools.com specifically. It may be a perfectly capable scanner. But a scanner is not a penetration test, and listing one as the other is an audit problem that tends to surface at the worst possible moment.

The certification mills Delve was using were not there to catch any of this. That was the point of using them. They appear to have been chosen because they would stamp whatever was put in front of them.

This Is Not Unique to Delve

What Delve did is an extreme version of something that happens at a lot of pentest firms with far less drama. Scanner-only engagements sold as manual tests, junior staff running automated tools while a senior name appears on the cover page, template reports with your company name dropped into the blanks, findings that read identically to what appeared in someone else’s report last month. None of it requires the audacity Delve had, which is why it keeps happening at firms that would never dream of writing their own audit conclusions.

The tell is usually in the report itself. A real web application penetration test looks like someone spent time in your application making decisions, with findings specific to your code, your logic, your authentication flows, your API design, describing attack chains rather than just the presence of a vulnerability. A finding that says “SQL Injection present in parameter X, exploitable via Y technique, demonstrated to access Z data” is a different document than a finding that says “SQL Injection: High. SQL injection is a vulnerability in which…” followed by a paragraph from a textbook.
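One rough way to check for the template problem, if you can get a vendor's sample report to compare against your own, is to measure how much of the finding text is shared between the two. The sketch below is only an illustration: the function names, the sample findings, and the 0.9 threshold are all assumptions made for this example, not an established method, and a high score is a reason to ask questions rather than proof of anything.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 ratio of how alike two finding descriptions are, word by word."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def flag_boilerplate(findings, reference_findings, threshold=0.9):
    """Return the findings that nearly match any finding in another report.

    The threshold is an arbitrary assumption; tune it against real reports.
    """
    flagged = []
    for finding in findings:
        if any(similarity(finding, ref) >= threshold for ref in reference_findings):
            flagged.append(finding)
    return flagged


# Hypothetical finding text, invented for this example.
our_report = [
    "SQL Injection: High. SQL injection is a vulnerability in which "
    "untrusted input alters a database query.",
    "SQL injection in the invoice search parameter, exploitable via a "
    "UNION-based payload, demonstrated to read other tenants' invoices.",
]
sample_report = [
    "SQL Injection: High. SQL injection is a vulnerability in which "
    "untrusted input alters a database query.",
]

suspicious = flag_boilerplate(our_report, sample_report)
```

Here only the textbook-style finding is flagged, because it is word-for-word what the other report says. The specific finding shares a vulnerability class with it but almost none of its wording.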

The other tell is in the conversation. A vendor who has actually tested your application can answer questions about it, tell you what they tried that did not work, what they escalated to when the obvious paths were blocked, what they would do next if they had another week. A vendor running a scanner tends to give a different kind of answer, which is to say a vague one.

What an Audit-Ready Pentest Actually Requires

SOC 2 auditors vary in how closely they scrutinize pentest evidence, and there are good ones and bad ones. The bad ones let things slide that the good ones would flag immediately, and cheap rubber-stamp auditors exist precisely because there is demand for them. The problem is that your pentest report does not only have to survive your SOC 2 audit. It also has to survive the security questionnaire from the enterprise prospect whose security team actually reads these things, and that is a different and less forgiving audience.

What holds up under scrutiny is a report that was scoped correctly, tested manually by people making real-time decisions about your application, and documented with findings specific to what they found rather than what the template expected them to find, written in a way that a developer can read and act on without needing a translation.

The OWASP Application Security Verification Standard gives this process a structure auditors can verify. More importantly, it forces the testing team to declare what they actually tested, at what depth, and against which controls, which makes it much harder to hand-wave vague coverage into existence.
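To make the idea of declared coverage concrete, here is a minimal sketch of what a coverage record might look like and how a reader could pick out the gaps. The field names, depth labels, and control labels are hypothetical placeholders invented for this example. They are not ASVS requirement numbers and not a format the standard prescribes.

```python
from dataclasses import dataclass


@dataclass
class CoverageEntry:
    control: str   # placeholder label, NOT a real ASVS requirement number
    depth: str     # assumed labels: "manual", "automated", or "not tested"
    evidence: str  # where in the report the test is documented, if anywhere


def coverage_gaps(entries):
    """Return controls that were not manually tested or have no evidence reference."""
    return [
        entry.control
        for entry in entries
        if entry.depth != "manual" or not entry.evidence.strip()
    ]


# Hypothetical declaration, invented for this example.
declared = [
    CoverageEntry("authentication-flows", "manual", "Report section 4.1"),
    CoverageEntry("session-handling", "automated", "Scanner output, appendix B"),
    CoverageEntry("access-control", "manual", ""),
]

gaps = coverage_gaps(declared)
```

The point of a structure like this is that a claim of full coverage has to be written down line by line, so a scanner-only control or a missing evidence reference is visible to anyone who reads it.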

It also matters who did the testing: not the company named on the cover, but the actual person. A senior tester who has spent years in your vertical is going to find things that a junior running a checklist will not, and will document them in a way that reflects how an attacker would actually approach your system rather than how a compliance template categorizes risks.

A real penetration test is slow in the right places. It involves people making decisions, adjusting when initial paths fail, chaining small weaknesses into larger ones, and documenting how those chains actually work in your environment. That process does not scale as neatly as automated report generation, but it produces evidence that survives scrutiny.

If you want a practical checklist for evaluating whether you are working with a real testing partner or a checkbox shop, our free guide Audit-Proof Your Pentest covers 17 mistakes that will blow your audit and how to avoid them. It is exactly what the Delve clients in that investigation wish someone had handed them before they signed.

The Practical Summary

Delve was selling compliance theater at scale, and their pentest offering was perhaps the clearest example of it, because the substitution of a scanner for a manual test is so direct and so documentable. The broader lesson is not that compliance automation is bad but that speed and price, by themselves, are not reasons to trust a vendor with something that is supposed to represent your actual security posture to the people who most need accurate information about it.

If you are in the market for a web application pentest, ask to see a sample report, ask whether the findings were generated by a person or a tool, ask who specifically will be doing your testing, and ask what happens if your auditor or a prospective customer asks a detailed question about the methodology. If the answers are vague, the report probably will be too, and vague reports have a habit of becoming very specific problems during procurement reviews.

If you want to get the most value out of the process once you have chosen a testing partner, we have written in detail about how to approach that in How to Milk a Pentest for Everything It’s Worth.
