AI-Generated Code Ships Fast. So Do Its Bugs.
Why LLM-generated code is a security risk — and how to mitigate it.
I use Copilot. I use Claude. I'm not going to pretend otherwise — they've genuinely made me faster. But here's what I've noticed after months of heavy usage: the code they generate is almost always functional and almost never secure.
That gap is a problem, and it's one that's going to get worse before it gets better.
The "It Works" Trap
When an LLM generates code that runs on the first try, there's a dopamine hit. You paste it in, the tests pass, it does the thing. But "working" and "secure" are two completely different bars.
I asked Claude to write a basic JWT authentication flow last month. The code compiled. It handled tokens. It also:
- Stored the JWT secret in a constant at the top of the file
- Used HS256 without even mentioning that RS256 exists
- Didn't validate token expiry properly
- Had no rate limiting on the auth endpoint
Was it broken? No. Would I ship it to production? Absolutely not. But someone less paranoid about security might not catch those issues. And that's the entire problem.
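Of the gaps in that list, expiry validation is the easiest to pin down in code. Here is a minimal sketch of a stricter time check, assuming the signature has already been verified by a JWT library and the payload decoded. The names `JwtClaims` and `isTokenTimeValid` and the 30-second clock-skew default are hypothetical choices for illustration, not any library's API.

```typescript
// Registered JWT claims relevant to time validation (RFC 7519).
// Both are NumericDate values: seconds since the Unix epoch.
interface JwtClaims {
  exp?: number; // expiration time
  nbf?: number; // not-before time
}

// Hypothetical helper: returns true only if the token is usable right now.
// Assumes the signature was already verified elsewhere.
function isTokenTimeValid(
  claims: JwtClaims,
  nowMs: number,
  clockSkewSeconds: number = 30, // assumed tolerance, tune per deployment
): boolean {
  const nowSeconds = Math.floor(nowMs / 1000);

  // Reject tokens with no usable expiry instead of treating them as eternal.
  if (typeof claims.exp !== "number" || !Number.isFinite(claims.exp)) {
    return false;
  }
  // Expired: the current time is at or past exp, plus the allowed skew.
  if (nowSeconds >= claims.exp + clockSkewSeconds) {
    return false;
  }
  // Not yet valid: the current time is before nbf, minus the allowed skew.
  if (typeof claims.nbf === "number" && nowSeconds < claims.nbf - clockSkewSeconds) {
    return false;
  }
  return true;
}
```

A caller would pass `Date.now()` as `nowMs`. Taking the clock as a parameter keeps the check deterministic in tests.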
Why LLMs Write Insecure Code
The models aren't stupid — they just learn from what they're given.
Much of the open-source code on GitHub doesn't follow security best practices. Stack Overflow answers optimize for "here's how to make it work," not "here's how to make it work without opening three attack vectors." The training data reflects the average, and the average is not great.
On top of that:
- LLMs optimize for working code, not safe code. The output is tuned to compile and do what you asked. Security constraints are rarely part of the prompt.
- They parrot common patterns. If 90% of the auth code on GitHub uses HS256, that's what you'll get. Even if RS256 is the better choice for your architecture.
- They don't understand your threat model. An LLM doesn't know if your app handles financial data, medical records, or just cat memes. It generates one-size-fits-all code.
Real Examples I've Caught
This isn't theoretical. Here's stuff I've personally caught in AI-generated code over the past few months:
SQL Injection in ORM Wrappers
I asked for a Prisma query helper. The generated function took user input and interpolated it directly into a $queryRaw call. Prisma's type-safe queries are great — but $queryRaw bypasses all of that. The LLM chose the raw query because my prompt mentioned "complex query," and it defaulted to the most flexible (and dangerous) option.
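The difference between splicing input into SQL text and binding it as a parameter can be shown without a database. The sketch below is a stand-in, not Prisma's implementation: `sql`, `ParameterizedQuery`, `unsafeUserQuery`, and `safeUserQuery` are hypothetical names, the `users` table is invented, and the `$1`-style placeholders assume PostgreSQL syntax.

```typescript
// A query split into fixed SQL text and separately bound values.
interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// Hypothetical tagged-template helper: every ${...} becomes a numbered
// placeholder, and the value travels alongside the text instead of inside it.
function sql(strings: TemplateStringsArray, ...values: unknown[]): ParameterizedQuery {
  let text = strings[0] ?? "";
  for (let i = 0; i < values.length; i++) {
    text += "$" + (i + 1) + (strings[i + 1] ?? "");
  }
  return { text, values };
}

// The unsafe pattern: user input becomes part of the SQL text itself.
function unsafeUserQuery(email: string): string {
  return `SELECT id FROM users WHERE email = '${email}'`;
}

// The safer pattern: the SQL text is identical no matter what the user sends.
function safeUserQuery(email: string): ParameterizedQuery {
  return sql`SELECT id FROM users WHERE email = ${email}`;
}
```

With the input `' OR '1'='1`, the unsafe version produces a different SQL statement, while the safe version keeps the same text and carries the payload as an inert value. Prisma's documentation describes the tagged-template form of `$queryRaw` as parameterized and `$queryRawUnsafe` as the variant that accepts a pre-built string, so it is worth checking which form the generated code actually calls.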
Overly Permissive CORS
Asked for an Express middleware setup. Got cors({ origin: '*' }). For a development server, sure. But the comment said "production-ready API server." With wildcard CORS in production, any website can send cross-origin requests from a visitor's browser and read the responses.
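The usual alternative to a wildcard is an explicit allowlist. A minimal sketch of the matching logic, written as a plain function so it does not depend on any framework: `ALLOWED_ORIGINS`, `resolveCorsOrigin`, and the example.com hostnames are all hypothetical.

```typescript
// Hypothetical allowlist; real values would come from configuration.
const ALLOWED_ORIGINS: ReadonlySet<string> = new Set([
  "https://app.example.com",
  "https://admin.example.com",
]);

// Returns the value to send in Access-Control-Allow-Origin,
// or null when the header should be omitted entirely.
function resolveCorsOrigin(
  requestOrigin: string | undefined,
  allowed: ReadonlySet<string> = ALLOWED_ORIGINS,
): string | null {
  if (requestOrigin === undefined) {
    return null; // no Origin header: same-origin or non-browser request
  }
  // Exact string match only. Substring, prefix, and regex checks are easy
  // to get wrong (for example "https://app.example.com.evil.test").
  return allowed.has(requestOrigin) ? requestOrigin : null;
}
```

The `cors` package for Express accepts an array or a function for its `origin` option, so the same allowlist idea can be expressed there without custom middleware.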
Hardcoded Secrets
Multiple times. API keys in source files, database URLs with credentials inline, AWS access keys as string constants. The models seem to treat placeholder values (your-api-key-here) and real-looking values interchangeably, and both end up in code that someone might commit without thinking.
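One way to keep secrets out of source files is to read them from the environment and fail at startup when they are missing or still look like placeholders. This is a sketch under assumptions: `requireSecret`, `Env`, the placeholder list, and the 16-character minimum are illustrative, not a standard.

```typescript
// A plain key/value view of the environment. In Node this would be
// process.env; it is a parameter here so the function stays testable.
type Env = Record<string, string | undefined>;

// Placeholder fragments that should never reach a running service.
// This list is illustrative, not exhaustive.
const PLACEHOLDER_PATTERNS: readonly string[] = [
  "your-api-key-here",
  "changeme",
  "placeholder",
];

// Hypothetical helper: return a required secret or fail loudly at startup.
// Error messages name the variable but never include its value.
function requireSecret(env: Env, name: string, minLength: number = 16): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required secret: ${name}`);
  }
  const lowered = value.toLowerCase();
  if (PLACEHOLDER_PATTERNS.some((pattern) => lowered.includes(pattern))) {
    throw new Error(`Secret ${name} looks like a placeholder`);
  }
  if (value.length < minLength) {
    throw new Error(`Secret ${name} is shorter than ${minLength} characters`);
  }
  return value;
}
```

Calling something like `requireSecret(process.env, "JWT_SECRET")` once at boot means a misconfigured deployment crashes immediately instead of quietly running with a guessable key.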
Eval and innerHTML
In frontend code, I've seen dangerouslySetInnerHTML used casually with user input, and eval() in Node scripts that handle external data. The model generates what "works," and sometimes what works is an XSS or code-injection vulnerability waiting to happen.
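When user input really does have to be placed into an HTML string, it needs escaping first. A minimal sketch for HTML text and quoted attribute values only: `escapeHtml` and `HTML_ESCAPES` are hypothetical names, and this is not a substitute for a proper sanitizer when the input is allowed to contain markup.

```typescript
// Characters that change meaning inside HTML text or quoted attribute values.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escape untrusted text so the browser renders it as text, not markup.
function escapeHtml(untrusted: string): string {
  return untrusted.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

In React, the simpler fix is usually to render the value as a normal child (`{userInput}`), which React escapes by default. `dangerouslySetInnerHTML` is the escape hatch that skips that protection.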
The Bigger Systemic Issue
Here's what concerns me long-term: AI coding tools are lowering the barrier to building software. That's mostly good. But they're also lowering the barrier to shipping insecure software, and that's a problem we haven't figured out yet.
A junior developer using Copilot can build a full authentication system in an afternoon. Five years ago, that same developer would have spent a week reading docs, following tutorials, and — importantly — absorbing security context along the way. The learning friction was annoying, but it was also educational.
Now that friction is gone. The code appears. It works. You ship it. And nobody reviews it because "the AI wrote it, it's probably fine."
What I Actually Do About It
I'm not saying stop using AI tools — that would be hypocritical and also dumb. But here's how I've adapted my workflow:
- Treat every AI output like an untrusted PR. Review it line by line. Don't just check if it works — check what it doesn't do. Missing input validation? Missing rate limiting? Missing error handling that could leak stack traces?
- Always prompt with security context. Instead of "write a login endpoint," I say "write a login endpoint that handles rate limiting, uses bcrypt with a work factor of 12, and returns generic error messages to prevent user enumeration."
- Run SAST tools on AI-generated code. Semgrep catches a surprising amount. ESLint security plugins too. If the AI writes it, the scanner should review it.
- Keep a personal checklist. I have a mental list of things LLMs consistently get wrong: auth flows, CORS, input sanitization, error messages that leak internals, missing CSP headers. I check for these specifically every time.
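Rate limiting shows up on that checklist often enough that it helps to know what the simplest version looks like. Below is a sketch of a fixed-window limiter held in process memory. `FixedWindowRateLimiter` and `tryAcquire` are hypothetical names, and the design assumes a single server process; a multi-instance deployment would need a shared store instead of a local `Map`.

```typescript
interface WindowState {
  windowStart: number; // ms timestamp when the current window opened
  count: number; // attempts seen in the current window
}

// Fixed-window limiter: at most `limit` attempts per key per `windowMs`.
// Entries are never evicted here, so a real service would also need cleanup.
class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, WindowState>();

  constructor(limit: number, windowMs: number) {
    if (!Number.isInteger(limit) || limit < 1 || !(windowMs > 0)) {
      throw new RangeError("limit must be a positive integer and windowMs must be > 0");
    }
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records one attempt for `key` and reports whether it is allowed.
  tryAcquire(key: string, nowMs: number): boolean {
    const state = this.windows.get(key);
    // First attempt for this key, or the previous window has fully elapsed.
    if (state === undefined || nowMs - state.windowStart >= this.windowMs) {
      this.windows.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    // Window still open and already at the cap: reject without counting.
    if (state.count >= this.limit) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

For a login endpoint the key might be the client IP, the username, or both, for example `limiter.tryAcquire(clientIp, Date.now())`, with the handler returning a 429 response when the call returns `false`.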
The Actual Problem
We're in this weird transition period where AI makes us 3x faster at writing code but 0x better at writing secure code. The tools will improve — they're already getting better at understanding context. But right now, in 2026, the security responsibility is entirely on you.
Skip the security review and you're not "moving fast." You're accumulating security debt at machine speed.