Auditing AI-Built Code: What We Found in a Tax Engine
We audited an AI-generated tax planning engine that included a single 5,385-line file. Here's what we found, and what it cost to fix.
A client came to us with a tax planning platform built almost entirely by AI. It worked — in demos. In production, it was a different story. They asked us to audit the codebase and tell them what was actually wrong. What we found wasn't surprising, but the scale of it was.
The Monolith Problem
The entire application lived in a handful of massive files. One file was 5,385 lines of untyped JavaScript. No modules, no separation of concerns, no clear data flow. AI code generators are excellent at producing code that runs. They're terrible at producing code that's maintainable. Every function had implicit dependencies on global state, making changes anywhere a risk to everything.
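To make the global-state problem concrete, here is a minimal sketch of the pattern and one way to untangle it. This is not the client's code: the function names, the filing statuses, and the deduction figures are all invented for illustration.

```typescript
// Hypothetical example. Names and deduction amounts are made up.

// Before: the calculation reads and mutates module-level state.
let currentFilingStatus = "single";
let runningTotal = 0;

function addIncomeGlobal(amount: number): void {
  // The result depends on whatever currentFilingStatus holds at call time,
  // and the caller cannot see that from the signature.
  const deduction = currentFilingStatus === "married" ? 200 : 100;
  runningTotal += Math.max(0, amount - deduction);
}

// After: every input is an explicit, typed parameter and the result is returned.
type FilingStatus = "single" | "married";

interface TaxableIncomeInput {
  readonly amount: number;
  readonly filingStatus: FilingStatus;
}

function taxableIncome(input: TaxableIncomeInput): number {
  const deduction = input.filingStatus === "married" ? 200 : 100;
  return Math.max(0, input.amount - deduction);
}
```

The second version can be called, tested, and moved to another file without knowing anything about the rest of the program, which is what makes decomposition possible in the first place.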
Security Gaps You Can't See in a Demo
The audit uncovered unvalidated user inputs feeding directly into tax calculations, API keys stored in client-side code, and authentication checks that could be bypassed by modifying request headers. None of this shows up in a demo. It only shows up when someone actually looks — or when it's too late. AI-generated code tends to take the happy path. It rarely considers adversarial inputs.
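As an illustration of the first gap, validating input at the boundary might look like the sketch below. The request shape (`income`, `dependents`) and the function name are assumptions for the example, not the client's actual API.

```typescript
// Hypothetical request shape. Field names are assumptions for illustration.
interface IncomeRequest {
  readonly income: number;
  readonly dependents: number;
}

type ValidationResult =
  | { ok: true; value: IncomeRequest }
  | { ok: false; errors: string[] };

// Treat the request body as unknown and prove its shape before any math runs.
function parseIncomeRequest(raw: unknown): ValidationResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, errors: ["body must be an object"] };
  }
  const body = raw as Record<string, unknown>;
  const income = body.income;
  const dependents = body.dependents;
  const errors: string[] = [];

  // Rejects strings, NaN, Infinity, and negative values.
  if (typeof income !== "number" || !Number.isFinite(income) || income < 0) {
    errors.push("income must be a finite, non-negative number");
  }
  // Rejects fractions and negative counts.
  if (typeof dependents !== "number" || !Number.isInteger(dependents) || dependents < 0) {
    errors.push("dependents must be a non-negative integer");
  }
  if (errors.length > 0) {
    return { ok: false, errors };
  }
  return {
    ok: true,
    value: { income: income as number, dependents: dependents as number },
  };
}
```

Only the `value` from a successful result should ever reach the calculation code. A string like `"50000"` or a `NaN` gets rejected here instead of silently flowing into a tax figure.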
Zero Tests, Zero Confidence
There wasn't a single test in the codebase. For a tax calculation engine. We wrote 336 tests as part of the restructure — unit tests for calculation logic, integration tests for API flows, and regression tests for edge cases the client had already encountered in production. Several of those tests failed on the first run, revealing bugs that had been silently producing incorrect results.
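To give a flavor of what those regression tests check, here is a small, framework-free sketch. The brackets, rates, and case values are invented for the example and are not real tax rules or the client's figures. In a real project the cases would live in whatever test runner the team uses.

```typescript
// Hypothetical brackets. Amounts are in integer cents to avoid float drift.
interface Bracket {
  readonly upToCents: number; // upper bound of the bracket (inclusive)
  readonly ratePct: number;   // whole-number percentage
}

const BRACKETS: readonly Bracket[] = [
  { upToCents: 1_000_000, ratePct: 10 },
  { upToCents: 4_000_000, ratePct: 20 },
  { upToCents: Infinity, ratePct: 30 },
];

function taxOwedCents(incomeCents: number): number {
  let owedPctCents = 0; // accumulates cents * percent, divided by 100 at the end
  let lower = 0;
  for (const b of BRACKETS) {
    if (incomeCents <= lower) break;
    const taxedInBracket = Math.min(incomeCents, b.upToCents) - lower;
    owedPctCents += taxedInBracket * b.ratePct;
    lower = b.upToCents;
  }
  return Math.round(owedPctCents / 100);
}

interface RegressionCase {
  readonly name: string;
  readonly incomeCents: number;
  readonly expectedCents: number;
}

// Boundaries and odd inputs are where silent miscalculations tend to hide.
const CASES: readonly RegressionCase[] = [
  { name: "zero income", incomeCents: 0, expectedCents: 0 },
  { name: "negative income", incomeCents: -500, expectedCents: 0 },
  { name: "exactly on first boundary", incomeCents: 1_000_000, expectedCents: 100_000 },
  { name: "one cent over first boundary", incomeCents: 1_000_001, expectedCents: 100_000 },
  { name: "spans all brackets", incomeCents: 5_000_000, expectedCents: 1_000_000 },
];

// Returns a description of every failing case. An empty array means all passed.
function runRegressionCases(): string[] {
  const failures: string[] = [];
  for (const c of CASES) {
    const actual = taxOwedCents(c.incomeCents);
    if (actual !== c.expectedCents) {
      failures.push(`${c.name}: expected ${c.expectedCents}, got ${actual}`);
    }
  }
  return failures;
}
```

Each time production surfaces a new edge case, it becomes one more row in the table, so the same bug cannot quietly return.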
The Fix: Structure, Not Rewrites
We didn't rewrite the application from scratch. We decomposed it. The monolith became typed TypeScript modules with clear boundaries. We added CI/CD, strict linting, and a test suite that runs on every push. The client's team can now make changes with confidence. Total cost: $11K for the audit and restructure. The alternative, continuing to discover these issues in production with real tax filings, would have cost significantly more.
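One common way to decompose without rewriting is to put a typed facade in front of a legacy function and move callers over one at a time. The sketch below shows the general idea under invented names. `legacyCalc`, its field names, and its 20% math are stand-ins, not the client's code or necessarily the exact technique used on this project.

```typescript
// Stand-in for an untyped function still living in the legacy file (hypothetical).
function legacyCalc(data: any): any {
  return { total: Number(data.inc) / 5, st: data.st };
}

// The typed boundary that new code is allowed to depend on.
interface EstimateInput {
  readonly incomeCents: number;
  readonly state: string;
}

interface Estimate {
  readonly totalCents: number;
  readonly state: string;
}

// Facade: translates typed input to the legacy shape, then verifies
// what comes back instead of trusting it.
function estimate(input: EstimateInput): Estimate {
  const raw: unknown = legacyCalc({ inc: input.incomeCents, st: input.state });
  if (typeof raw !== "object" || raw === null) {
    throw new Error("legacy result was not an object");
  }
  const { total, st } = raw as { total?: unknown; st?: unknown };
  if (typeof total !== "number" || !Number.isFinite(total) || typeof st !== "string") {
    throw new Error("legacy result had an unexpected shape");
  }
  return { totalCents: Math.round(total), state: st };
}
```

Once every caller goes through `estimate`, the legacy implementation behind it can be replaced piece by piece while the tests keep watch on the boundary.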