What is Legacy Code?

Is legacy code any “old” code?
Is it code I can’t fix because it might break?
Is it code without documentation?
Or code without unit tests?
Is it code I can’t throw away because it works?
Is legacy code just any code I didn’t write?

While it’s hard to define legacy code, we know what it feels like to work with. It’s messy, opaque. It’s frustrating, confusing. It doesn’t let us work the way we want to. I know something else, too: I’ve got thousands of lines of it, and they’re not going away any time soon.

So I’ve set out on a mission: to get better with legacy code. To learn how to deal with it and make it manageable, not frustrating, and not confusing. To learn how to clean it up and to clear it up.

But to do that, it’s good to have a definition. And in researching my forthcoming review of “Working Effectively with Legacy Code” (coming soon; a spoiler: it’s not the solution to my mission its title implies), I ran across an insightful comment at Slashdot:

> Legacy code is anything developed under a different process than you’re using now. The only thing that remains constant in the recognition of difficult maintenance is this: “We didn’t plan to maintain it the way we’re maintaining it now.” – [devnullkac](http://news.slashdot.org/comments.pl?sid=979493&cid=25196045)

Most of our techniques for dealing with software are at their best when we can rely on their simplifying assumptions. Unit tests simplify our development by quickly verifying existing program behavior. Coding styles simplify the task of reading code. Architectures simplify development by making the dependencies between major modules few and clear. When we can rely on these things, they reduce our mental overhead and make the code easier to maintain.

Legacy code is those parts of the system that don’t conform. *Legacy code is things developed without the simplifying assumptions that you’re using now.*

That’s part of what makes legacy code so frustrating – it’s the wild, untamed spaghetti. It’s got bones in – traps and pitfalls that you aren’t used to dealing with or which throw your techniques for a loop. It may even be unforkable (have I stretched the metaphor too much?) – so different that the tools you’d like to bring to bear on it, from automated builds to static analysis, become major projects just to set up.

I like this definition a lot because it encompasses most of the other definitions I’ve heard. Defining legacy code as code without documentation or tests is just stating a preference on simplifying assumptions. Defining it as “old code” or code “I didn’t write” is trying to get at that feeling that things are done differently now, and the code from before is *missing* something. It even covers the definitions which describe legacy code as unknown quantities – code you’re afraid to break or code that has to stay because it’s worked for too long to throw away. These define legacy code by its complexity, by its lack of even basic simplifying assumptions.

And for me, thinking about legacy code like this casts it in a new light. It is no longer a pit of doom, but a new frontier where I’ll need new strategies and I’ll need to put new spins on old ones. I know I’m starting to pay attention to what assumptions I can reasonably make, and which ones I would most like to make to simplify my tasks. Instead of thinking, “augh! This all needs to be rewritten,” I’m thinking, “What can I do to make this workable, to let me make assumptions and fit it in my head?”
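One small way to start earning back an assumption is to pin down what a piece of code does today before touching it. The sketch below is only an illustration, not a recipe from the book or from the Slashdot comment: `legacy_shipping_cost`, `CASES`, and `GOLDEN` are hypothetical names invented for this example, standing in for whatever opaque routine you actually have.

```python
def legacy_shipping_cost(weight_kg, express):
    """Hypothetical stand-in for an opaque legacy routine."""
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 2 - 1
    if weight_kg > 10:
        cost += 7
    return cost


def characterize(func, cases):
    """Record what func currently returns for each argument tuple."""
    return {args: func(*args) for args in cases}


# Inputs chosen to poke at each branch (an assumption about what matters).
CASES = [(1, False), (1, True), (12, False), (12, True)]

# Snapshot taken once from the current behavior, right or wrong.
GOLDEN = {(1, False): 7, (1, True): 13, (12, False): 36, (12, True): 64}


def changed_cases(func, golden):
    """Return the argument tuples whose result no longer matches the snapshot."""
    return [args for args, expected in golden.items() if func(*args) != expected]


if __name__ == "__main__":
    drift = changed_cases(legacy_shipping_cost, GOLDEN)
    print("behavior unchanged" if not drift else f"behavior changed for: {drift}")
```

The snapshot does not say the code is correct, only that it has not changed. That is still one more assumption I can lean on while cleaning up.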
