Speech Summary
AI Productivity Claims and the Evidence
- AI tools claim dramatic coding speedups; GitHub Copilot, for example, is advertised as making developers 55% faster.
- The foundational studies behind such claims cover very narrow scenarios (e.g., building a Node.js HTTP server in a classroom setting).
- These early studies did not assess the impact on code quality, which raises concerns.
- A Microsoft study showed a 26% task-completion speedup with no negative impact on code quality, but quality was measured only by build success rate, a minimal indicator.
- An independent study from Uplevel found no productivity gains and a 41% increase in bugs with AI-assisted coding, highlighting the risks, especially around technical debt.
Technical Debt and Code Quality Challenges
- Technical debt (code that is difficult and costly to maintain) is likely to grow with AI-assisted coding.
- The industry has historically neglected technical debt; AI-driven acceleration risks escalating it into a critical problem.
- Developers spend only about 5% of their time writing new code; the majority goes to understanding and maintaining existing code.
- AI productivity gains focused solely on new-code generation therefore offer limited benefit: 5% of a typical 40-hour week is roughly 2 hours of coding, so even doubling coding speed saves only about 1 hour per week.
- The greater potential lies in using AI to understand existing code and improve its quality and maintainability.
Using AI for Refactoring: Research and Findings
- Refactoring means improving the design of existing code without changing its observable behavior; it is crucial for maintainability (illustrated in the sketch after this list).
- Study tested various large language models (LLMs) on refactoring problematic code.
- The AI produced valid code 60-70% of the time, but improved code health in only about 55-60% of cases.
- Worse, it preserved correct behavior in only about 30-40% of attempts; most refactorings broke existing tests.
- Failure rates this high would be unacceptable from a human developer, yet they are often tolerated from AI.
- The optimistic reading: if the correct refactorings can be identified automatically, nearly half of an application's technical debt might be fixable automatically.
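To make "behavior-preserving design improvement" concrete, here is a minimal, hypothetical Java example of an extract-method refactoring of the kind evaluated in the study; the Order and OrderService types are illustrative stand-ins, not code from the talk.

```java
import java.math.BigDecimal;
import java.util.List;

// Minimal illustrative domain type (hypothetical).
record Order(List<String> items, BigDecimal total, String customer) {}

class OrderService {
    // Before: validation logic buried inline (imagine this inside a much longer method).
    void placeOrderBefore(Order order) {
        if (order.items().isEmpty()
                || order.total().signum() <= 0
                || order.customer() == null) {
            throw new IllegalArgumentException("unfulfillable order");
        }
        // ... persist the order ...
    }

    // After: the same checks extracted into a descriptively named helper.
    // Observable behavior is unchanged; the responsibility now has a name.
    void placeOrder(Order order) {
        rejectIfUnfulfillable(order);
        // ... persist the order ...
    }

    private void rejectIfUnfulfillable(Order order) {
        if (order.items().isEmpty()
                || order.total().signum() <= 0
                || order.customer() == null) {
            throw new IllegalArgumentException("unfulfillable order");
        }
    }
}
```

Passing the same test suite before and after is exactly the behavior-preservation property the study measured.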
AI-powered Refactoring Tool Demo and Architecture
- Developed a VS Code extension using a “model selector” that picks the best LLM for a refactoring task.
- Each suggested refactoring is validated through automated tests and code health metrics.
- If validation fails, the tool retries with another model until a valid refactoring is found or the attempts are exhausted (see the sketch after this list).
- A demonstration on a complex Java codebase (Glowstone, an open-source Minecraft server) showed the AI successfully extracting responsibilities into separate methods and improving readability.
- AI excels at naming extracted methods, an area difficult for human developers.
- The tool provides “quick inspection” labels indicating safe-to-apply refactorings.
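The retry-and-validate loop can be sketched as below. The RefactoringModel and Validator interfaces are hypothetical stand-ins; the extension's actual internals were not shown, so all names and signatures here are assumptions.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for the extension's internals.
interface RefactoringModel {
    String proposeRefactoring(String originalCode);  // ask one LLM for a rewrite
}

interface Validator {
    boolean behaviorPreserved(String original, String candidate);   // e.g., run the tests
    boolean codeHealthImproved(String original, String candidate);  // e.g., compare metrics
}

class RefactoringPipeline {
    private final List<RefactoringModel> models;  // ordered by the "model selector"
    private final Validator validator;

    RefactoringPipeline(List<RefactoringModel> models, Validator validator) {
        this.models = models;
        this.validator = validator;
    }

    // Try each model in turn; accept the first candidate that passes both checks.
    Optional<String> refactor(String originalCode) {
        for (RefactoringModel model : models) {
            String candidate = model.proposeRefactoring(originalCode);
            if (validator.behaviorPreserved(originalCode, candidate)
                    && validator.codeHealthImproved(originalCode, candidate)) {
                return Optional.of(candidate);  // safe to apply
            }
        }
        return Optional.empty();  // attempts exhausted; keep the original code
    }
}
```

The key design choice: validation gates acceptance, so a candidate is applied only when it both preserves behavior and measurably improves code health.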
Enhanced AI Refactoring Performance
- Adding an automated fact-checking and validation layer improved AI refactoring correctness from roughly 40% to 97-98%.
- That performance exceeds the correctness typically achieved by expert human refactoring.
- This enables confident application of AI-assisted refactoring even to complicated legacy code (one possible implementation of the behavioral check is sketched below).
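One plausible way to implement the behavioral half of such a validation layer for a Java codebase is to run the existing test suite programmatically and require a clean pass. The sketch below uses the standard JUnit 5 Platform Launcher API; this is an assumed approach, not necessarily how the presented tool works, and the package name is a placeholder.

```java
import org.junit.platform.launcher.Launcher;
import org.junit.platform.launcher.LauncherDiscoveryRequest;
import org.junit.platform.launcher.core.LauncherFactory;
import org.junit.platform.launcher.listeners.SummaryGeneratingListener;
import org.junit.platform.launcher.listeners.TestExecutionSummary;

import static org.junit.platform.engine.discovery.DiscoverySelectors.selectPackage;
import static org.junit.platform.launcher.core.LauncherDiscoveryRequestBuilder.request;

class BehaviorCheck {
    // Returns true when the whole test suite passes after the refactoring is applied.
    static boolean testsStillPass(String packageUnderTest) {
        LauncherDiscoveryRequest discovery = request()
                .selectors(selectPackage(packageUnderTest))  // discover all tests in the package
                .build();

        Launcher launcher = LauncherFactory.create();
        SummaryGeneratingListener listener = new SummaryGeneratingListener();
        launcher.execute(discovery, listener);

        TestExecutionSummary summary = listener.getSummary();
        return summary.getTotalFailureCount() == 0;  // any failure => refactoring rejected
    }
}
```

A complementary code health comparison (for example, using CodeScene's Code Health metric) would cover the design-quality half of the validation.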
Key Takeaways and Recommendations
- Focus AI assistance on enhancing code understanding and improving existing code quality to tackle the largest bottleneck in software development.
- Safeguard AI-generated code using validation layers to ensure generation of healthy, maintainable code and avoid technical debt accumulation.
- Healthy codebases enable up to 10x faster feature development and significantly reduce onboarding times.
- Advances in AI-assisted refactoring can amplify human development efforts by automating debt reduction and code clarity improvements.
Actionable Items / Tasks
- When adopting AI coding assistants, implement checks to validate code health and behavioral correctness.
- Use or try the presented VS Code extension for AI-assisted refactoring with validation.
- Educate development teams on the importance of focusing AI on code quality and maintenance, not just new code generation.
- Review and integrate code quality metrics (e.g., Code Health) into development workflows.
- Explore opportunities to automate refactoring of legacy code to reduce technical debt using validated AI tools.
- Refer to provided resources such as research papers, blog posts, and tools linked in the presentation to deepen understanding:
* Blog post with rules to safeguard AI-generated code.
* "Code Red" white paper linking code quality and development time.
* "Refactoring vs Refactoring" study on AI refactoring evaluation.
* VS Code extension available on the Visual Studio Marketplace.
* Code Health metric details and the CodeScene tool for code quality assessment.
The productivity trap: Meet the perils and promises of AI-assisted coding
10:40 - 11:10, 27th of May (Tuesday) 2025 / DEV AI & DATA STAGE
As AI accelerates the pace of coding, organizations will have a hard time keeping up; acceleration isn't useful if it's driving our projects straight into a brick wall of technical debt. This presentation explores the consequences of AI-assisted coding, weighing its potential to improve productivity against the risks of deteriorating code quality.
Adam delivers a fact-based examination of the short- and long-term implications of using AI assistants in software development. Drawing on extensive research analyzing over 100,000 AI-driven refactorings in real-world codebases, we scrutinize the claims made by contemporary AI tools, demonstrating that increased coding speed does not necessarily equate to true productivity. We also look at the correctness of AI-generated code, a concern for many organizations today given the error-prone nature of current AI tools.
Finally, the talk offers strategies for succeeding with AI-assisted coding. This includes introducing a set of automated guardrails that act as feedback loops, ensuring your codebase remains maintainable even after adopting AI-assisted coding.