AI Assistants Do Not Write Good Code

Introduction

AI-powered coding assistants churn out code fast, but speed isn’t everything. They lack structural depth, often leading to duplication and unclear class responsibilities. Instead of refining domain models, they generate isolated, surface-level solutions. Engineers, however, should absorb domain knowledge, create meaningful abstractions, and produce richer models with less code.

AI’s Failure in Domain Modelling

Shallow Understanding of Domain Models

AI assistants predict patterns but don’t understand the context. The result? Code that works but is bloated and repetitive. They miss opportunities to introduce domain-driven concepts, instead generating scattered methods that fail to encapsulate business logic.

For example, when asked to generate customer order methods, an AI might create multiple similar functions instead of abstracting common behaviors into a well-structured class. This increases redundancy and makes maintenance harder.
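
For illustration, here is a hedged sketch (in Python, with hypothetical function names) of the kind of near-duplicate output an assistant tends to produce:

    # Hypothetical AI-style output: two functions repeat the same totalling and
    # discount logic for different sales channels instead of sharing an abstraction.
    def calculate_online_order_total(items, discount_code):
        total = sum(item["price"] * item["quantity"] for item in items)
        if discount_code == "SAVE10":
            total *= 0.9
        return total

    def calculate_instore_order_total(items, discount_code):
        total = sum(item["price"] * item["quantity"] for item in items)
        if discount_code == "SAVE10":
            total *= 0.9
        return total  # identical logic duplicated for a second channel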

Duplication & Overlapping Responsibilities

AI-generated code frequently duplicates logic across multiple places, adding unnecessary complexity. A human engineer, recognising patterns, would extract shared behaviours into reusable components, making the system more coherent and maintainable.

Imagine an AI assistant generating separate classes for customer validation, order processing, and payment handling. While this might seem fine at first, behaviours often overlap. A skilled engineer would unify related responsibilities to avoid unnecessary fragmentation.

Code Without Concepts = More Maintenance

Good software design isn’t just about functionality—it’s about meaningful abstractions that reduce complexity over time. AI assistants prioritize immediate results over long-term maintainability. Without a solid domain model, logic becomes fragmented, and minor changes require modifying multiple sections of code, increasing the risk of bugs.

How Engineers Should Approach It

Absorb the Domain Knowledge

Great engineering starts with understanding the problem space. Developers should engage with domain experts, study business logic, and identify core concepts. This leads to more meaningful abstractions and intuitive code.

Introduce Richer Concepts & Abstractions

Instead of accepting AI-generated boilerplate, engineers should focus on designing models that encapsulate core behaviors.

For example, rather than separate, redundant order-handling methods, a well-designed system might introduce an “Order” domain object to encapsulate common logic. This reduces duplication and makes the system easier to extend.
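
As a hedged sketch of that idea (the names are illustrative, not taken from any real codebase), the duplicated channel-specific functions above could collapse into a single domain object:

    # A minimal Order domain object that encapsulates the shared totalling and
    # discount behaviour, regardless of the sales channel.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class OrderLine:
        price: float
        quantity: int

    @dataclass
    class Order:
        lines: List[OrderLine] = field(default_factory=list)
        discount_code: str = ""

        def total(self) -> float:
            total = sum(line.price * line.quantity for line in self.lines)
            if self.discount_code == "SAVE10":
                total *= 0.9
            return total

Both the online and in-store code paths can now depend on the same Order behaviour, so a change to the discount rule happens in exactly one place.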

Conclusion

AI coding assistants are useful for quick snippets, but they don’t produce maintainable software. They fail to distill domain models, leading to duplication, overlapping responsibilities, and bloated codebases. Engineers must take ownership of design, prioritising deep understanding and meaningful abstractions. The goal isn’t just working code—it’s better code.

The main activities for a developer working in a codebase are the following: make some changes, run the tests, package and upload artifacts, deploy the artifacts to a dev environment, and then run automated or manual tests against the deployed changes. All these tasks are usually automated.

A developer might run ./gradlew test to execute tests, ./gradlew shadowJar to create an artifact for distribution, docker build and docker push to create and publish a Docker image, ansible or terraform to apply infrastructure changes, and curl for API testing.

This knowledge is usually captured in automation scripts (typically bash scripts), as manual steps in README.md or other markdown files in the codebase, or, in the worst case, lives only in your team members’ heads, waiting to be written down somewhere.

It is common to see code snippets such as ./scripts/do_something.sh followed by some more text, then ./scripts/do_something_else.sh in the README.md.

It can be tedious for developers to read through hundreds of lines of text to understand the correct usage of the scripts. Sometimes developers have to scan through a bash script just to figure out why the parameter they provided does not work. Let’s be honest: becoming proficient with shell scripting takes real effort, and shell scripts lack many modern scripting-language features, which makes it hard to write reusable, maintainable code.


In the last two years, I have experienced those pains in a few codebases that I worked on. After repeated frustration with these scripts, I recalled some good examples I saw in Ruby on Rails projects many years ago. Almost all of the tasks above were automated as Rake tasks. Running rake -T shows a list of all automation tasks in a codebase, and each task has a detailed description of its responsibility, expected parameters, and so on.

Here are a few benefits of using Rake for task automation, in my opinion.

  • It integrates with the operating system very easily: put a command inside backticks (for example `date`), and Rake will invoke it as a separate OS process.
  • Tasks are written in Ruby, so developers can create proper classes and functions for better maintainability.

There are some downsides to using Rake: some Ruby packages (referred to as gems) require OS-specific native dependencies. These dependencies can break during OS updates and cause headaches for developers who have not worked with the Ruby ecosystem.

After shopping around, I found pyinvoke, which provides functionality similar to Rake but is a Python-based tool. invoke -l lists all tasks in a codebase. Each pyinvoke task is a Python function annotated with @task, and developers can define classes and functions and use them in pyinvoke tasks. We have now migrated all of our shell-script-based build scripts to pyinvoke tasks. The introduction of pyinvoke has helped us standardize our deployment process, and more developers feel comfortable creating new automation tasks.
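
As a minimal sketch (the task names and shell commands are illustrative, mirroring the build steps mentioned earlier), a tasks.py might look like this:

    # tasks.py -- an illustrative sketch of pyinvoke tasks, not our actual build file.
    from invoke import task

    @task
    def test(c):
        """Run all unit tests locally."""
        c.run("./gradlew test")

    @task(help={"version": "Tag used for the jar and the Docker image"})
    def package(c, version):
        """Create the jar and build a Docker image with a specific version."""
        c.run("./gradlew shadowJar")
        c.run(f"docker build -t my-service:{version} .")

Running invoke -l picks up these functions automatically, and the first line of each docstring becomes the task description.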

With the concepts outlined in Unified interface for developer and Build Pipeline, we have implemented several pyinvoke tasks featuring a --local flag. This enhancement enables developers to test changes locally without the need to push to a branch, thereby creating a quicker feedback loop.

The same task is used in the Bitbucket Pipeline without the --local flag. This has made our lives much easier when dealing with pipeline failures.
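
A hedged sketch of how such a task might branch on the flag (the helper scripts are placeholders, not our actual setup):

    # The same task runs locally and in the Bitbucket Pipeline; only the way
    # credentials are acquired differs.
    from invoke import task

    @task
    def deploy(c, version, env, local=False):
        """Deploy a version to an environment; pass --local on a dev machine."""
        if local:
            # On a developer machine, acquire credentials interactively
            # (placeholder script -- substitute your own auth mechanism).
            c.run("./scripts/acquire_dev_credentials.sh")
        # On a CI agent the build agent's role is used, so no prompt is needed.
        c.run(f"./scripts/deploy.sh {env} {version}")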

Developer experience with pyinvoke

With those tasks automated with pyinvoke, here is what a developer performs in their daily work:

  • invoke test to run all unit tests locally;
  • invoke package --version=<version> to create the jar and build a Docker image with a specific version;
  • invoke publish --version=<version> to publish the specific image you just built;
  • invoke deploy --version=<version> --env=<env> to deploy the version to a specific environment;
  • invoke smoke-test --env=<env> to run some basic post-deployment validation against your service in an environment.

If developers forget which task to use, invoke -l shows the full list of existing automation tasks (an illustrative listing follows below). They can also easily create a new task if none of the existing ones fulfils their needs.
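
For illustration, an invoke -l listing for tasks like these might look roughly as follows (the exact output depends on the pyinvoke version and the task docstrings):

    Available tasks:

      deploy       Deploy a given version to a specific environment
      package      Create the jar and build a Docker image with a specific version
      publish      Publish the Docker image for a specific version
      smoke-test   Run basic post-deployment validation against an environment
      test         Run all unit tests locally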

Today I was implementing a logging correlation ID feature. Ideally, following the idea of AOP, I wanted to add the following behaviour to every processor in the topology without touching the business-logic code:

  • Extract the correlation ID from the record header
  • Put the correlation ID into the Mapped Diagnostic Context (which is backed by a ThreadLocal)
  • Add the correlation ID to the logger pattern
  • Clean up the Mapped Diagnostic Context

This way, the business code would gain the ability to log the correlation ID without any changes. However, because the Kafka Streams DSL does not expose the ConsumerRecord to us, manipulating headers is not particularly convenient. See the background and use cases of Kafka record headers.

Zipkin’s tracing support for Kafka Streams is implemented in a way very similar to the SafeKafkaStream I built on a previous project: a wrapper implements the Kafka Streams interface, delegates each operation to the delegatee inside the wrapper, and adds extra behaviour.

The approach I eventually took is as follows:

  • Use the id in the original message’s payload as the correlation ID, and extract it with a CorrelationIDExtractor that knows how to pull the correlation ID out of each type of record.
  • Wrap each operator’s argument, which is usually a function, with withContext; the decorated function performs the setup and cleanup around the original logic (a conceptual sketch follows this list).
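
The following is a minimal, conceptual sketch of that wrapping idea, written in Python for brevity; the real implementation targets the Kafka Streams Java DSL and the MDC, and the names (with_context, extract_correlation_id) are illustrative:

    # Wrap an operator's function so the correlation ID is set up before the call
    # and cleaned up afterwards, without touching the business logic itself.
    import contextvars
    import functools

    correlation_id = contextvars.ContextVar("correlation_id", default=None)

    def with_context(extract_correlation_id, fn):
        @functools.wraps(fn)
        def wrapped(record):
            token = correlation_id.set(extract_correlation_id(record))  # setup
            try:
                return fn(record)  # the original operator logic, unchanged
            finally:
                correlation_id.reset(token)  # cleanup
        return wrapped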

This compromise still has the following advantages:

  • Obtaining the correlation ID is centralised in a single place, CorrelationIDExtractor, so if Kafka Streams later improves its support for headers, it will be easy to switch to a new approach.
  • withContext keeps the intrusion into the business code to a minimum.

Issue

I have seen many times that developers struggle to diagnose a failing build. “Why did the test/deployment pass on my machine but fail in the build pipeline?” is probably the question developers ask most often.
Back in the old days, a developer could ssh to the build agent, go straight into the build’s working directory, and start diagnosing.
Now, with pipeline-as-a-service offerings such as CircleCI and Buildkite, developers have far less access to build servers and agents than ever before. They can no longer perform this kind of diagnosis, not to mention that the ssh approach had plenty of drawbacks of its own.

What is the alternative? One common approach I have seen is making small tweaks and pushing changes furiously, hoping that one of the many attempted fixes will work or reveal the root cause. This is both slow and inefficient.

Solution

I tend to follow one principle when setting up the pipeline in a project.

Unified CLI interface for the dev machine and the CI agent.
Developers should be able to run a build task on their dev machine as well as on the CI agent, provided they are granted the correct permissions.

Examples

For example, a deployment script should have the following command-line interface:

./go deploy <env> [--local]

When this script is executed on a build agent, it will use the build agent’s role to perform the deployment.

When it is executed from a developer’s machine, the developer needs to provide their user ID and is prompted for a password (potentially a one-time password) to acquire permission to deploy.
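
A hedged sketch (in Python rather than the actual go script) of how such an entry point might branch; the helper names and prompts are placeholders:

    # ./go deploy <env> [--local] -- illustrative dispatch between the agent role
    # and interactive developer credentials.
    import argparse
    import getpass

    def deploy(env, local):
        if local:
            # Developer machine: ask for a user id and a (one-time) password.
            user = input("User id: ")
            getpass.getpass("Password / OTP: ")  # exchanged for a deployment token in a real script
            credential = f"developer credentials for {user}"
        else:
            # Build agent: rely on the agent's role, no prompt required.
            credential = "the build agent role"
        print(f"Deploying to {env} using {credential}")

    if __name__ == "__main__":
        parser = argparse.ArgumentParser(prog="./go deploy")
        parser.add_argument("env")
        parser.add_argument("--local", action="store_true")
        args = parser.parse_args()
        deploy(args.env, args.local)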

Benefits

There are many benefits to following this principle:

  • Improved developer experience.
    • The feedback loop for changes is much faster compared with testing every change on the pipeline.
    • Enabling developers to execute tasks on their local machine helps them trial new ideas and troubleshoot.
  • Knowledge is persisted.
    • Developers are smart; they will always find some trick to improve their troubleshooting efficiency. It could be temporarily commenting out or adding a few lines in a script, and this knowledge tends to get lost if it is not persisted as a pattern.
      The principle encourages developers to persist this knowledge in the build scripts, which benefits every developer working on the project.

Local Optimization and Its Impact:

Local optimization refers to optimizing specific parts of the process or codebase without considering the impact on the entire system or project. It’s essential to remember that software development is a collaborative effort, and every team member’s work contributes to the overall project’s success. While local optimizations might improve a specific area, they can hinder the project’s progress if they don’t align with the project’s goals or create dependencies that slow down overall development. For instance, optimizing a single service without considering its alignment with the entire system can cause unintended bottlenecks.

The impacts of local optimization include increased complexity, delayed delivery due to unforeseen consequences, and limited flexibility. Local optimizations can lead to complex and hard-to-maintain code, ultimately slowing down future development. Moreover, an obsession with optimizing a single part can cause delays due to unexpected consequences. Additionally, code that’s over-optimized for specific scenarios might be less adaptable to changing requirements, limiting its usefulness in the long run.

To address these issues, it’s crucial to measure the impact of optimizations on the entire system’s performance, rather than just focusing on isolated metrics. Prioritizing high-impact areas for optimization is another key strategy. By doing so, we ensure that our efforts align with the project’s overall success and deliver the most value to stakeholders.

Scoping, Prioritization, and Re-Prioritization:

Clearly defined scopes are essential for effective prioritization. Establishing frequent and fast feedback loops ensures that we can adjust our priorities as we receive new information. When dealing with technical debt, it’s wise to focus on high-impact areas and set clear boundaries for our goals. Breaking down larger goals into smaller milestones allows us to track progress and maintain a sense of accomplishment. Frequent re-prioritization based on newly learned context is a proactive approach. By doing so, we adapt quickly to changes and align our efforts with the evolving needs. It’s not just acceptable; it’s vital for our success. This practice ensures that our work remains aligned with our goals, continuously delivers value, and effectively responds to our dynamic environment.

Considering a Rewrite:

When technical debt reaches a high level, considering a rewrite might be a more efficient solution than extensive refactoring. A rewrite can result in a cleaner, more maintainable codebase, improved performance, and enhanced functionality. However, undertaking a rewrite requires a thorough exploration of effort, risks, and mitigation plans. A well-executed rewrite leverages the lessons learned from past mistakes, incorporates new technologies, and follows modern design patterns.

Prioritizing Simplicity over Flexibility:

Simplicity is a cornerstone of maintainability and readability within our codebase. Clear, straightforward code that follows consistent patterns is easier to understand and maintain. While flexibility might seem appealing for accommodating potential changes, it can introduce unnecessary complexity. Complex code paths and intricate component interactions hinder our ability to make changes efficiently. Prioritizing simplicity sets the foundation for a codebase that remains valuable over time. It ensures that we strike the right balance between adaptability and maintainability while avoiding unnecessary complications.