| Abstract: |
This study compared prompt-engineered and hand-written code across 15 programming tasks in Python, Java, and JavaScript. An experienced developer, and a prompt engineer using GPT-4o and Claude 3.5 Sonnet, completed identical tasks, which were evaluated on development speed, functional accuracy, cyclomatic complexity, security, and maintainability. Although AI assistance reduced development time by around 82%, the AI-generated code showed 14.7% lower maintainability and 61.9% higher cyclomatic complexity than the manually written code, and showed no security risks when compared with it. The results reveal a Pareto-type trade-off: AI completes roughly 80% of a task rapidly, while the remaining 20%, comprising edge cases and architecture and security logic, still requires a human developer. A hybrid model is proposed in which developers retain authorship of core architecture and business logic while delegating well-defined implementation patterns to AI tools.
|