Key Challenges of AI-Assisted Software Engineering

Recently, I published a post on accelerating software engineering with the help of Artificial Intelligence (AI). In that post, I shared my team’s hands-on experience with AI agents, exploring four scenarios where we could easily achieve a significant boost in productivity. In these four scenarios – which included, for example, developing short-lived scripts – software quality was not a primary concern, so we could move faster by relaxing our usual quality standards.
Here, in this follow-up post, I shift focus from these best-case scenarios to explore a more realistic situation where quality should not be compromised: introducing new long-lived functionality into the codebase of a live production system.
Introducing new functionality into our production system brought to light several challenges that may be easily overlooked by teams that are beginning to adopt AI tools. Below, I describe these challenges in detail and outline strategies for addressing them. These strategies will allow you to speed up your development process while keeping quality high in the age of AI.
1. Misunderstanding the system
In an AI-powered IDE (such as Cursor), asking an agent to implement a new feature may require the agent to fully understand and reason about your entire project. Any gaps or misunderstandings will lead to problems. For example, the agent may:
Overlook files that should be modified.
Suggest changes to files that should remain unchanged.
Generate tests for unintended behavior.
Miss tests for expected behavior.
Many factors can lead an agent to misunderstand your system. Some of these factors are intrinsic to the nature of the project and are difficult to address. For example, the system that I describe in my previous post is event-driven. Event-driven systems are inherently decoupled: an event is published and multiple subscribers react, each potentially located in distant and seemingly unrelated parts of the system. Establishing these logical connections by inspecting the code is a difficult task, even for an AI agent.
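As a minimal sketch of why this is hard (the event bus, handlers, and event names below are illustrative, not taken from our actual system), notice how little the publisher reveals about who reacts to its events:

```python
from collections import defaultdict
from typing import Callable

# A minimal in-process event bus: publishers and subscribers share only an event name.
class EventBus:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        for handler in self._subscribers[event_name]:
            handler(payload)

bus = EventBus()
audit_log: list[str] = []

# In a real system these two handlers could live in distant modules; nothing in
# the publisher's code points to them, which is what makes the flow hard to trace.
bus.subscribe("order_placed", lambda e: audit_log.append(f"billing charged {e['total']}"))
bus.subscribe("order_placed", lambda e: audit_log.append(f"warehouse reserved {e['sku']}"))

bus.publish("order_placed", {"sku": "A-42", "total": 99.0})
```

An agent reading only the `publish` call site has no static link to the billing or warehouse handlers; it must discover every subscription elsewhere in the codebase to reason about the full effect of one event.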
Other factors, however, are within our control, and we can take proactive action to help the AI agent interpret our system correctly.
1.1. How to mitigate this problem?
Harold Abelson and Gerald Jay Sussman stated in their seminal book “Structure and Interpretation of Computer Programs”:
Programs must be written for people to read, and only incidentally for machines to execute.
Martin Fowler expressed a similar idea:
Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
This way of thinking has positively influenced many software engineers for decades, but it is now misaligned with the technological reality. In the age of AI, machines no longer merely execute code; they also interpret its meaning and intent, which makes readability relevant for machines as well.
Clear, intent-revealing variable names, strong separation of concerns, a well-designed domain model and a comprehensive test suite (which offers unambiguous examples of the system’s actual behavior) are now more relevant than ever. High-quality code reduces the likelihood that AI agents misinterpret your intentions, and with it the likelihood that they make mistakes.
In the AI era, we must write clean, readable code for both humans and machines, as both need to understand it to apply changes safely and effectively.
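As a small, made-up illustration of why this matters, compare an opaque function with an intent-revealing version of the same rule; an agent (or a teammate) given the second version has far less room to misinterpret it:

```python
# Opaque: a reader, human or AI, has to guess what d, t and 100 mean.
def calc(d, t):
    return d - d * t if d > 100 else d

# Intent-revealing: the same rule, with the domain concept stated in the names.
BULK_ORDER_THRESHOLD = 100.0

def apply_bulk_discount(order_total: float, discount_rate: float) -> float:
    """Orders above the bulk threshold get discount_rate taken off the total."""
    if order_total > BULK_ORDER_THRESHOLD:
        return order_total * (1 - discount_rate)
    return order_total
```

Both functions compute the same result, but only the second one tells an AI agent (or a new team member) which business rule it is allowed to change and which tests it should expect.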
2. Loss of control
Have you ever engaged in pair programming with a driver who moves at a frenetic pace, jumping between files and changing large chunks of code so quickly that it seems impossible to keep up?
Meet the cowboy (or cowgirl) driver.
The cowboy driver gives little consideration to the navigator. The only priority is speed, and the navigator is an obstacle, not a partner. When I am working with such a driver, I feel that I lose control. I am unsure whether new bugs are being introduced or whether all relevant tests are in place. I have no time to gather my thoughts and offer meaningful suggestions.
AI agents make this situation more likely. The driver will be producing more code in less time, since most of the code will be AI-generated. This leaves the navigator even less time to inspect it, unless the driver is disciplined about code reviews.
Note that this problem can also arise when programming alone. An AI agent is like a pair-programming partner that produces code at great speed. It is tempting not to review the code carefully. When this happens, you lose control and the snowball effect will take over: quality will deteriorate increasingly fast with each passing day.
2.1. How to mitigate this problem?
To mitigate this problem, you need to slow down to regain control. A technical practice that is well suited to this purpose is Test-Driven Development (TDD). TDD encourages progress in tiny steps, writing one test at a time and ensuring each test passes before moving on to the next test.
From my brief experience with AI-powered IDEs, I have found that, when working with AI agents, you can take slightly larger steps. An AI agent can write a test and make it pass in a single step; or it can handle multiple tests at once, passing them in one go. Your level of confidence will guide you when deciding the size of your steps.
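For instance, a “slightly larger” step might cover one whole small behavior at once. The sketch below uses a hypothetical shipping rule and plain assertions to stay self-contained; in a real project these would be test-framework tests:

```python
# Step 1: express the expected behavior as tests, before any implementation exists.
def test_shipping_is_free_above_threshold():
    assert shipping_cost(order_total=60.0) == 0.0

def test_shipping_has_flat_fee_below_threshold():
    assert shipping_cost(order_total=30.0) == 4.99

# Step 2: write, or let the agent write, the minimal code that makes them pass.
FREE_SHIPPING_THRESHOLD = 50.0
FLAT_SHIPPING_FEE = 4.99

def shipping_cost(order_total: float) -> float:
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return 0.0
    return FLAT_SHIPPING_FEE

# Step 3: run the tests, then refactor with this safety net in place.
test_shipping_is_free_above_threshold()
test_shipping_has_flat_fee_below_threshold()
```

The tests double as the safety net for any later refactoring, whether a human or an agent performs it.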
Regardless of your choice, review your code carefully and do not skip refactoring. Regular refactoring keeps you close to the code, improves its quality over time, and ultimately lets you move faster in the long run.
3. Weakened learning loop
Software development is inherently iterative. You add a feature to the system, show it to customers, gather feedback, and iterate. At a lower level, you define expected behavior as an automated test, make the test pass, refactor, and iterate by writing the next test.
There are many levels in between, but they all have one thing in common: the learning loop. Each iteration – whether with a customer, a teammate, or yourself – produces valuable new knowledge.
When AI agents enter the picture, the learning loop is weakened. Developers interact less directly with the code, and this causes two main problems.
Reduced familiarity with the codebase. When most of the code is generated automatically, developers spend less time reasoning about it. This is especially problematic for new team members, who need hands-on interaction with the code to understand how the system actually works.
Eroded technical skills. Much like forgetting how to divide manually once we start relying on a calculator, writing less code can negatively affect your abilities. Over time, depending heavily on AI can make it harder for you to do technical work without automated assistance. Some people may argue that you will not need these technical skills in the future, but no one can reliably predict how the field will evolve, and sacrificing foundational skills for short-term convenience is a risky trade-off.
3.1. How to mitigate this problem?
It may sound obvious, but you can reap the benefits of interacting with the code by interacting with the code. When you ask an AI agent to generate code, do a thorough code review and do not hesitate to edit the code manually, for example, to add a test that the AI agent missed.
Another great way to stay engaged with the code is through refactoring. It has always been tempting to skip refactoring. Most developers make the code work and move on to the next feature, leaving behind suboptimal solutions. You can set yourself apart by making small improvements whenever you identify an opportunity. Not only does this improve code quality, but it also has a positive impact on your learning.
Further advice
You can also follow these general guidelines to fully leverage AI agents and mitigate the problems described above:
Be specific. The AI agent knows less than you do. Explaining a technical solution in plain English can be challenging, but it is essential to provide as much context as possible. Imagine that you are explaining the problem to a brilliant junior developer. Despite being highly productive, they still need every detail to understand the problem fully.
Write commands and rules. In AI-powered IDEs like Cursor, you can store coding conventions and good practices as rules or reusable commands. By doing so, you reduce errors, maintain consistency across the codebase, and free yourself to focus on problem solving instead of repeating the same instructions over and over.
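For illustration, a project rule might look like the sketch below. The `.cursor/rules` location and frontmatter fields follow Cursor’s rules feature, but treat the exact format as an assumption and check the current documentation; the conventions themselves are a made-up example:

```markdown
<!-- .cursor/rules/testing-conventions.mdc -->
---
description: Testing and code conventions for this project
globs: ["src/**/*.py", "tests/**/*.py"]
alwaysApply: false
---

- Every new behavior ships with a test; name tests after the behavior
  (test_rejects_expired_tokens), not the implementation.
- Prefer small, intent-revealing functions over comments explaining opaque code.
- Never modify files under migrations/ without asking first.
```

Once captured this way, the conventions apply to every relevant agent request instead of having to be restated in each prompt.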
Conclusions
AI agents can accelerate software engineering, but they introduce challenges that, if not carefully managed, can negatively affect quality and the health of the team.
This post outlines three of these challenges and proposes practical ways to address them. Practices such as TDD, refactoring and disciplined code reviews, which promote clean and readable code, are essential to ensuring that AI adds value, rather than becoming a source of technical debt.
If you are careless, AI agents will make you produce bad code faster; if you are disciplined, they will help you produce good code just as quickly. AI agents act as amplifiers of your current practices. Software teams that harness AI wisely will deliver value more effectively, while those that use it unwisely will gradually lose their ability to satisfy customers, slowing down as problems accumulate.



