A Story about Working with Legacy Code
This is a story about refactoring a legacy system.
First you have to understand something about the system we were working on. It was a legacy, enterprise web-app written mostly in C++. Numerous consultants wrote much of it back in the 20th century. At the time, money had been tight, and the consultants’ most important goal had been getting new features delivered. So a consultant would be hired to add features as fast as he could and pad his resume at the same time. For example, one of these consultants apparently wanted to be able to say that he developed a web server. So he did. And when I worked on the system, we were still saddled with it.
Yes, you heard right. Our enterprise web-app used a web server that wasn’t a real web server, but rather something hacked together by a former consultant. But that’s only the beginning. It also used an app server that wasn’t recognizable as an app server. And interwoven with these were the application model and domain model layers, all mushed together into one indecipherable mass. There was duplicated code everywhere, a sure sign of inefficient, unmanageable design, and each piece of code had at least five different responsibilities, a lack of focus that’s as bad in software design as it is in business. Before working on this system, I thought I had seen unmaintainable legacy code. Ha! If you’ve ever read Working Effectively With Legacy Code, you might have thought Michael Feathers was joking, or making stuff up. But I routinely faced real-life problems as bad as the most terrifying stories Michael Feathers recounts. Whatever code you’re wrestling with, take heart that it probably isn’t as bad as this was. Remember the basic rules every first-year comp-sci student learns, high cohesion and loose coupling? Well, our system had no trace of them anywhere.
A good design is like an egg. If you crack an egg into a bowl, you’ll see the yolk in the middle and the white around it. Each of these parts has its own form and function within the egg. Each is highly cohesive. The two parts have a well-defined relationship with each other. Each never intrudes on the other’s space, but they work together to form the whole egg. They are loosely coupled.
Take a fork and whip it through the egg over and over again. Now you have a scrambled egg. That was this system. And my job was to unscramble the egg.
I wanted this responsibility. I’d been pushing for it. You might think this was an instance of “be careful what you wish for.” But I was neither afraid nor overwhelmed, because I knew the secret to unscrambling an egg. The secret is to do it with tweezers, one pinch at a time. Pick up a tiny bit of egg. Is it yolk or white? Yolk? Okay, put it over here. Next tiny bit. Yolk or white? White? Okay, put it over there. Similarly, the secret to unscrambling legacy code is to do it bit by bit.
So the way it worked out, I was in charge. No, it wasn’t on a grand scale. It was just a mini sub-project. I didn’t make a Gantt chart. I did end up posting a burn chart, which I’ll get to in a moment. I was not officially a manager. But I was in charge, for a few weeks. So I gave it my best shot.
The first thing I did was to prepare a small presentation for the rest of the team. I went over how the bits of our system fit into a proper enterprise architecture. And I identified a first step: Take our code that responds to HTTP requests, and refactor it to use a new, well-designed IHttpResponse interface (well, new to us anyhow) instead of typing in HTTP response text and pushing it at the open TCP socket. Yes, that’s really what the programmers did. They hard-coded HTTP responses and pushed them at the open TCP socket. And that code was sprinkled throughout the entire system.
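To make that first step concrete, here is a minimal sketch of the kind of change involved. Only the names IHttpResponse and Send() come from the real system; the member functions, the BufferedHttpResponse class, and the RenderGreeting handler are assumptions invented for illustration, not the actual code.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical shape of the interface. The story names IHttpResponse,
// but these member functions are assumptions made for this sketch.
class IHttpResponse {
public:
    virtual ~IHttpResponse() = default;
    virtual void SetStatus(int code, const std::string& reason) = 0;
    virtual void SetHeader(const std::string& name, const std::string& value) = 0;
    virtual void Write(const std::string& body) = 0;
};

// Illustrative in-memory implementation. It owns the HTTP formatting,
// so callers no longer type response text by hand.
class BufferedHttpResponse : public IHttpResponse {
public:
    void SetStatus(int code, const std::string& reason) override {
        code_ = code;
        reason_ = reason;
    }
    void SetHeader(const std::string& name, const std::string& value) override {
        headers_[name] = value;
    }
    void Write(const std::string& body) override { body_ += body; }

    // Builds the status line, headers, and body in one place.
    std::string Serialize() const {
        std::ostringstream out;
        out << "HTTP/1.1 " << code_ << ' ' << reason_ << "\r\n";
        for (const auto& header : headers_) {
            out << header.first << ": " << header.second << "\r\n";
        }
        out << "Content-Length: " << body_.size() << "\r\n\r\n" << body_;
        return out.str();
    }

private:
    int code_ = 200;
    std::string reason_ = "OK";
    std::map<std::string, std::string> headers_;
    std::string body_;
};

// Before (legacy style, paraphrased; the real Send() signature is unknown):
//   Send(socket, "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...");
//
// After: the handler produces HTML and leaves the protocol to the interface.
void RenderGreeting(IHttpResponse& response, const std::string& user) {
    response.SetStatus(200, "OK");
    response.SetHeader("Content-Type", "text/html");
    response.Write("<html><body>Hello, " + user + "</body></html>");
}
```

The point of the design is that a handler like RenderGreeting knows nothing about sockets or status lines, so it can be exercised against an in-memory response without a network connection.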
I boiled this refactoring task down to a set of techniques we could use. I pulled the general process from Working Effectively With Legacy Code, but I applied it to the specific problem we were facing at that moment. I provided refactoring templates that applied to most of the code that we needed to refactor, so that the other developers could search for unrefactored code and apply the templates in order to refactor it to use IHttpResponse. Yes, there were special cases. But we could handle them as they arose.
When I gave the presentation, a junior engineer had already been working with me. He was refactoring the parts of the system he was intimate with. Afterward, another senior engineer was asked to help out. She picked a module with some pretty hairy refactorings that were all interrelated and had been bugging her. I worked on the rest. I picked a module to refactor and went at it. Then I picked another module.
But how did we know what code we needed to refactor? We searched through the code for a particular function call: the Send() function. Send() was the function that pushed response data to the open TCP socket, so every instance of Send() marked a spot that still needed work. Wherever we found one, instead of generating raw HTTP text and calling Send(), we wanted to generate HTML and hand it to IHttpResponse.
This also made it very easy to chart our progress. Just search for instances of Send(), and find out how many instances we’d eliminated. I threw together a semi-automated process and did this every day. Then I updated a burn chart, which I posted in our shared hallway. Everybody appreciated seeing the progress, especially my manager.
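The daily count could be produced by something as simple as the following sketch. It is not the process I actually used; the function names CountSendCalls and CountSendCallsInFiles are hypothetical, and it is a plain text search, so it would also count the definition of Send() and any mentions in comments or string literals.

```cpp
#include <cctype>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Counts textual occurrences of "Send(" in one chunk of source code.
// Matches preceded by a letter, digit, or underscore (for example
// "Resend(") are skipped, since they belong to a longer identifier.
std::size_t CountSendCalls(const std::string& source) {
    const std::string needle = "Send(";
    std::size_t count = 0;
    std::size_t pos = source.find(needle);
    while (pos != std::string::npos) {
        const bool part_of_longer_name =
            pos > 0 &&
            (std::isalnum(static_cast<unsigned char>(source[pos - 1])) ||
             source[pos - 1] == '_');
        if (!part_of_longer_name) {
            ++count;
        }
        pos = source.find(needle, pos + needle.size());
    }
    return count;
}

// Sums the count over a list of source files; unreadable files are skipped.
std::size_t CountSendCallsInFiles(const std::vector<std::string>& paths) {
    std::size_t total = 0;
    for (const std::string& path : paths) {
        std::ifstream in(path);
        if (!in) {
            continue;
        }
        std::ostringstream buffer;
        buffer << in.rdbuf();
        total += CountSendCalls(buffer.str());
    }
    return total;
}
```

Run once a day over the source tree, the total gives the next point on the burn chart, and the chart is finished when the only remaining match is the definition itself.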
You can imagine my elation the day I actually deleted the definition of Send() and rebuilt the project, proving that Send() was finally gone from the system, once and for all. I told everyone what I’d just done. All that excitement over one little function that had its tentacles woven throughout the whole system.
