Seven deadly sins of programming – Sin #1

August 3, 2006 at 5:24 pm

So, the time has come for the worst sin.


Just to recap – and so there is one post that lists them all – here are the ones that I’ve covered so far:



Some people have remarked that all of these are judgement calls, and really more a matter of aesthetics than actual sins.


That is true. I didn’t include things like “naming your variables i, j, & k” as sins, because I don’t think that’s a real problem in most of the code I’m likely to have to deal with, and there really isn’t much argument over whether it’s a good idea or not.


It perhaps would have been better to title this series, “Seven things Eric would really prefer that you don’t do in code that he has to work with”, but that is both ungainly and lacking the vitality of a post with the term “sin” in it.


It’s all marketing, you see – or you would if you were actually reading this post, but given my track record on the last six, it’s probably a good idea to cut your losses now and spend your time more productively, like in switching your entire codebase from tabs to spaces (or spaces to tabs…)


When I was a kid, I was fairly interested in WWII. I read a lot of books about it, from general histories about the war, to books on the warfare in the Pacific, to books about the ground war in Europe.


One of the interesting features of the military during that time – one that I didn’t appreciate until much later – was how they balanced the desire for advancement in their officer corps vs the need to only advance the most talented and capable. There were really two schools of thought at the time.


The first school advocated an approach where a lower-ranked officer – say, a colonel – would be promoted to fill a vacancy directly, on the theory that it made the chain of command cleaner, and you’d quickly find out if he had “the right stuff”.


The second group advocated using “field promotions”, in which a colonel would be temporarily promoted to see if he could perform in the job. The theory here was that the service would end up with only the best colonels promoted, and that it was much easier (and better for both the officer and the service) to let a field promotion expire rather than demote an officer already given a full promotion.


Over time, the approach advocated by the second group was borne out as having far better results, and the danger of the first approach was recognized.


Which brings us on our roundabout journey to our final sin:


Sin #1 – Premature Generalization


Last week I was debugging some code in a layout manager that we use. It originally came from another group, and is the kind of module that nobody wants to a) own or b) modify.


As I was looking through it, I was musing on why that was the case. Not to minimize the difficulty in creating a good layout manager (something I did a bit of in a previous life), but what this module does really isn’t that complex, and it has some behavior that we would really like to change.


The problem is that there are at least three distinct layers in the layout manager. I write a line of code that says:


toolbarTable.SetColMargin(0, 10);


and when I step into it, I don’t step into the appropriate TableFrame. I step into a wrapper class, which forwards the call onto another class, which forwards onto another class, which finally does something.


Unfortunately, the relation between the something that gets done and the TableFrame class isn’t readily apparent, because of the multiple layers of indirection.


Layers of indirection that, as far as I can tell (and remember that nobody wants to become the owner of this code by showing any interest in it or, god forbid, actually making a modification to it…), aren’t used by the way we use the layout manager. They’re just mucking things up…
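To make the indirection concrete, here’s a minimal sketch of what stepping through a call like that feels like. The class names and the two forwarding layers are invented for illustration – this is not the real component:

```csharp
// A sketch of the kind of layering described above; names are invented.
public class TableFrame
{
    private int[] colMargins = new int[16];

    // The code that finally does something.
    public void SetColMarginCore(int col, int margin)
    {
        colMargins[col] = margin;
    }

    public int GetColMargin(int col)
    {
        return colMargins[col];
    }
}

// Forwarding layer #1.
public class TableLayoutAdapter
{
    private TableFrame frame;

    public TableLayoutAdapter(TableFrame frame)
    {
        this.frame = frame;
    }

    public void SetColMargin(int col, int margin)
    {
        frame.SetColMarginCore(col, margin);
    }
}

// Forwarding layer #2 - the class the caller actually talks to.
public class ToolbarTable
{
    private TableLayoutAdapter adapter;

    public ToolbarTable(TableFrame frame)
    {
        adapter = new TableLayoutAdapter(frame);
    }

    public void SetColMargin(int col, int margin)
    {
        adapter.SetColMargin(col, margin);
    }
}
```

Each layer adds a frame to step through without adding any behavior – which is exactly the problem.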


Why is this the #1 sin?


Well, as I’ve been going through the sins, I’ve been musing on how I ranked them. One of the primary factors that I used is the permanence of the sin.


And this one is pretty permanent. Once something is generalized, it’s pretty rare that it ever gets de-generalized, and in this case, I think it would be very difficult to do so.


<Agile aside>


This might be slightly different if there were full method-level tests for the component – one could consider pulling out that layer. But even with that, it would be hard to approach in a stepwise fashion – it could easily turn into one of those 3-hour refactorings that makes you grateful that your source code control system has a “revert” feature.


</Agile aside>


Or, to put it another, fairly obvious way:


Abstraction isn’t free


In one sense this seems obvious – when you develop a component that is used by multiple clients, you have to spend a little extra effort on design and implementation, but then you sit back and reap the benefits.


Or do you?


It turns out that you only reap the benefits if your clients are okay with the generalized solution.


And there’s a real tendency to say, “well, we already have the ListManager component, we can just extend it to deal with this situation”.


I’ve known teams where this snowballed – they ended up with a “swiss army knife” component that was used in a lot of different scenarios. And like many components that do a lot, it was big, complex, and had a lot of hard-to-understand behavior. But developing it was an interesting technical challenge for the developers involved (read that as “fun and good for their careers”…)


The problem came when the team found that one operation took about 4 times as long as it should. But because of the generalized nature of the component doing the operation, there was no easy way to optimize it.


If the operation had been developed from scratch without using the “uber-component”, there would have been several easy optimization approaches to take. But none of those would work on the generalized component, because you couldn’t just implement an optimization in one scenario – it would have to work for all scenarios. You couldn’t afford the dev cost to make it work everywhere, and in this case, even if you could, it would cause performance to regress in other scenarios.


(At this point, I’m going to have to have anybody thinking “make it an option” escorted out of this post by one of our friendly ushers. How do you think it got so complex in the first place?)


At that point, you often have to think about abandoning the code and redeveloping in the next version. And in the next cycle, this group *did* learn from their mistakes – instead of the uber-component, they built a small and simple library that different scenarios could use effectively. And it turned out that, overall, they wrote less code than before.


HaHA. I make joke.


What they really did was build *another* uber-component that looked really promising early (they always do), but ultimately was more complex than the last version and traded a new set of problems for the old ones. But, building it was a technical challenge for the developers involved, and that’s what’s really important…


How do you avoid this sin?


Well, YAGNI is one obvious treatment, but I think a real treatment involves taking a larger view of the lifecycle costs of abstraction and componentization.


Is that one general component really going to be better than two custom solutions?


(if you didn’t understand the story, look here and see what rank is above a colonel…)

Seven deadly sins of programming – Sin #2

July 24, 2006 at 5:33 pm

One of the guys on my wing during my freshman year of college was a trust-fund kid. His great-grandfather had done well in the market, and had left some money for each of the great-grandkids.


Michael wasn’t paying for his schooling, nor was he paying for his cars, his clothes, his nice stereo, or his ski vacations. But that was okay with us, because he *was* paying for pizza, and he *was* paying for beer. Or his trust fund was.


Everything was fine until spring came around, and Michael got some bad news. His trust fund had been set up to carry him through 4 years of school, but because he spent money so fast, he had burned through all of it in less than two years. He was forced to get a job at the local grocery store to finish the year out, but couldn’t afford tuition and had to leave school and find a job to support himself.


========


It typically shows up in chapter two or three of the programming book. There’s a section titled something like, “What is an object?”, which speaks in flowing terms about the wonderful world of object-oriented development, and uses one of the following examples:



  • Geometric shapes

  • Animals

  • Musical instruments

  • Cephalopods (rare)

In this section, we find out that a square is an example of a polygon, a cheetah is a cat and also a mammal, and so on.


All of this is to introduce us to the concept of “is-a”.


We then see an example where we can ask any polygon how much area it covers, tell any mammal to walk, and tell any cat to ignore us while we’re talking to it, all through the wonders of inheritance and virtual methods.
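The code that accompanies that section usually looks something like this – my sketch of the textbook example, not from any particular book:

```csharp
public abstract class Polygon
{
    // Any polygon can be asked how much area it covers...
    public abstract double Area();
}

public class Square : Polygon
{
    private double side;

    public Square(double side)
    {
        this.side = side;
    }

    public override double Area()
    {
        return side * side;
    }
}

public class Rectangle : Polygon
{
    private double width;
    private double height;

    public Rectangle(double width, double height)
    {
        this.width = width;
        this.height = height;
    }

    public override double Area()
    {
        return width * height;
    }
}
```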


Good taste and a firm grasp of the relative usefulness of this concept would dictate spending 10 or 20 pages explaining it, but most texts significantly exceed that, and most programming assignments spend a fair bit of time on it as well. Kind of like how linked lists rule your life for a month or so.


It’s therefore not surprising that many younger developers think that inheritance is a feature that you should, like, “use” when you write software.


There are a few problems with this.


The first is that “is-a” is, in my experience, a pretty rare relationship between objects. More common is the “looks the same on one dimension but has different behavior across another dimension”.


Good luck implementing this:


public class Monotreme : Mammal
{
    public override void GiveBirth()
    {
        // add appropriate implementation here
    }
}


That’s the sort of thing that tends to be non-obvious when you first create an object, but annoyingly obvious once you’ve bought into the whole inheritance mindset.


That’s not to say that inheritance isn’t useful. It’s just to say that you should understand that a MemoryStream isn’t really a Stream, at least in the true “is-a” sense, and be prepared to deal with it.


The second problem is more philosophical and aesthetic. I recently wrote code like this:


class ClimbList: List<Climb>
{

}


Which seems like a perfectly reasonable thing to do, at least on the surface.


But the surface – or at least the surface area – is a problem. List<T> is a pretty extensive class, and when I defined ClimbList, I was saying that ClimbList does the proper thing when any of the List<T> methods or properties are called.


I’m pretty sure that’s not true. Or at least, I’m not at all sure that it *is* true (I have no tests to support such a belief), but users of ClimbList don’t have any way of knowing that the only methods I’m currently using are Add() and the indexer. Intellisense brings up all of them when I go to call one of the methods I wrote.


Which brings us in a long and roundabout way to our penultimate sin:

Sin #2 – Overuse of Inheritance

So, what should you use if you don’t use inheritance (and I am advocating that you approach it carefully and thoughtfully – it’s a similar decision to adding a virtual method)?


Composition. Make the object a field inside of your object, and then write the forwarders that you need (it’s usually not more than one or two).
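Applied to the ClimbList example above, the composition version might look like this. This is a sketch – the Climb class is assumed from the earlier snippet, and I’ve guessed at its shape:

```csharp
using System.Collections.Generic;

public class Climb
{
    private string name;

    public Climb(string name)
    {
        this.name = name;
    }

    public string Name
    {
        get { return name; }
    }
}

// Composition instead of inheritance: the List<Climb> is a private field,
// and ClimbList exposes only the operations its users actually need.
public class ClimbList
{
    private List<Climb> climbs = new List<Climb>();

    public void Add(Climb climb)
    {
        climbs.Add(climb);
    }

    public Climb this[int index]
    {
        get { return climbs[index]; }
    }

    public int Count
    {
        get { return climbs.Count; }
    }
}
```

Now Intellisense shows exactly the members that are known to work, and nothing else.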


And, if you really need inheritance in your design, add it carefully and thoughtfully.

Seven Deadly Sins of Programming – #3

July 18, 2006 at 5:19 pm

I’m sure this is going to be a contentious one. I don’t have a good story about this one, so I’m just going to go right to the sin.


Sin #3 – Overuse of Virtual


There are two schools of thought when it comes to virtual. They are:



  1. Always use virtual

  2. Only use virtual when absolutely necessary

The first group’s argument is that the original designer of a class is a poor judge of how the class might ultimately be used. Making all methods virtual allows the users of the class to easily extend/modify the behavior. Non-virtual methods make more work for the user.


(or that’s the argument I’ve heard – please comment if there’s something I missed…)


The argument is true. Virtual does make extensibility easier. The problem with virtual is robustness.


The original designer of a class has a good idea for what the class is about, and the implementation is designed (either explicitly through the tests written with TDD, or more implicitly through the mind of the designer) to support that idea (or contract, if you prefer).


In other words, the designer has an explicit idea about what extension points he wants to support in the class. If virtual is only on those points, then the designer has (or should have) done sufficient testing on those points, and it’s pretty likely that users who extend through those points get the behavior they want. And since only a few methods are virtual, the fact that a specific method is virtual is an important clue to the user that that method *is* an expected extension point.


If virtual is on every method, the user can extend in a lot of ways, but it’s very unlikely that the designer thought of all those ways – or all combinations of those ways – which means that such a user extension is going into uncharted territory. Maybe it doesn’t work when you try it. Maybe it works now, but breaks with the first update when the object behavior changes.


I prefer the “you can only do these 2 things, but I promise that they work” classes to the “you can do any of these 12 things, but they may or may not work” classes.
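As a sketch of what the second school looks like in practice (an invented example, not from any real codebase): the overall flow stays non-virtual, and the single intended extension point is the only virtual method.

```csharp
public class ReportWriter
{
    // Non-virtual: the overall flow is part of the class's contract,
    // and derived classes can't break it.
    public string Write(string body)
    {
        return FormatHeader() + body;
    }

    // Virtual: the one extension point the designer intends to support
    // (and, ideally, has tested).
    protected virtual string FormatHeader()
    {
        return "REPORT\n";
    }
}

public class DraftReportWriter : ReportWriter
{
    protected override string FormatHeader()
    {
        return "REPORT - DRAFT\n";
    }
}
```

The fact that FormatHeader() is the only virtual method tells the user exactly where extension is expected to work.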

Seven Deadly Sins of Programming – #4

June 26, 2006 at 5:38 pm

Our next sin is the one that I’ve certainly been prone to.


Long ago my wife and I owned a house east of Renton, Washington (those who know about Renton at that time can probably understand why one would say “east of Renton”).


Like many homes, this one had had various indignities committed on it by the previous owners. One of the most obvious was what can only be described as a “lean-to”, constructed out of surplus Boeing shipping containers and shielding the washer and dryer from the rest of the garage.


After dealing with cold and dirty feet for a few months, we decided to convert the space to a real room, with features such as a floor, a laundry sink, a ceiling that was actually attached to the joists, etc. The framing went quickly, but when it came to the point of finishing the drywall, we ran into a problem.


My wife and I had very different ideas of how smooth a wall should be before it was painted. She wanted it to be mostly smooth, while I spent a fair amount of time with a light at an oblique angle, trying to get it really smooth.


I did the bulk of the drywall finishing myself in that room.


And after it was all painted, it turned out that it didn’t really matter. Her standard was fine.


Which leads us, in the usual roundabout way, to our destination:


Sin #4 – Premature Optimization


Or, in other words, spending time on things that don’t really matter in the end.


It’s happened to all of us. You’re in the middle of writing a class, and you think “a linear search of an array just isn’t going to be fast enough here”. So, you spend some extra time, and you use a hash table, or a tree, or a binary search, or something like that, and when you’re done, you say to yourself, “now *that* isn’t going to cause any problems”.


And then later, when you’re getting close to done, you run your code through a profiler, and you find that a) the array never has more than 3 elements and b) the code is spending a ton of time elsewhere in the code.


You wasted the time that it took you to do the optimization (bad), and you may have ended up with code that is considerably harder to read (worse), depending on how much optimizing you did.


How do you avoid this? I have two bits of advice:



  1. YAGNI came out of XP, and it’s a good dictum to follow. If you are doing TDD, this is easier to do as you are more focused on making one small change and not on “finishing the class”.

  2. Go read what a wise man has said about performance, and work on creating a performance culture in your group.

Seven Deadly Sins of Programming – #5

June 15, 2006 at 6:29 pm

[Several readers pointed out that I made a mistake in phraseology, and didn’t say the thing I meant. Thanks to them for knowing what I wanted to say more than I did.]


Thanks to all who have commented on the previous sins, and to those who have written their own lists. I have avoided reading them, but I will take a look when (if…) I get done with my list, and I’ll write a summary post linking to them, and likely listing the ones that I would have put on my list if I’d only been a bit smarter and a bit more thoughtful.


Back when I was a pre-teen, I thought I really knew what was going on. I didn’t need my parents’ advice on what to eat, or who to hang around with, and I *certainly* didn’t need their advice around oral hygiene.


So I didn’t brush my teeth. Oh, I’d rinse my mouth out, but no actual brushing.


This didn’t cause any problems – my teeth were fine when I went to the dentist. No worries.


For a year or so. Then I had to have 4 cavities done in one fall. Big cavities. Cavities that weakened my teeth enough that I’m slowly buying my dentist some nice accessories for his boat, as they start to crack and I get crowns put on.


Code hygiene is also important.


Back when I started coding, my teacher (who I am in debt to for being at least 10 years ahead of his time and setting up a real programming curriculum (no, we didn’t use abacuses…)) made us write down our programs on paper before we were allowed to use computer time. Not only did you have to write it down, you had to show it to him, and if it wasn’t good enough, he’d make you copy it from scratch with the necessary changes.


When I got better, I got introduced to my first text editor. Line oriented, a real improvement. Rather than typing a line over, you could make changes to it.


Then, the fateful day came when I got to use a screen-oriented text editor. You just used the arrow keys, and you could make changes *right in your code*.


Very powerful. And very dangerous, because it supported


COPY AND PASTE


(Cue emphatic and scary music).


Copy and paste is not a sin – it’s actually quite a useful thing, but features like it make it far too easy to do bad things. Need a function that’s almost like the one you have? Copy and Paste. Need to add another conditional into that double-nested loop? Just insert it in there. Need a class that’s almost like what you have? File copy, then search and replace to change the names.


Which leads to too much function complexity and duplication, and to our fifth sin:


Sin #5 – Deferred Refactoring


Once you start deferring refactoring, things get bad. Even if you have good intentions to come back and “clean that up a bit”, you often don’t.


That’s why I find agile methods and shared code ownership so interesting. If there are good tests in a codebase, I can find the ugly code that a coworker has written and refactor it into something nicer rather than just trying to get picked to work on “the new codebase”…


Now go brush your teeth, and it wouldn’t hurt you to floss sometimes…

A close escape, and more on inappropriate cleverness

June 2, 2006 at 11:53 am

Monday I went on the 7 hills ride.

Wednesday night, I wrote a long and detailed post about it, and hit the “Post” button.

In a fit of editorial brilliance, my blog server chose that exact moment to experience a problem, and the post vanished into the ether.

Few would dispute that that event was a net positive for western civilization as a whole.

So, here’s the short version.

HomeDownPauseUpDownUpDownUpDownUpDownUpDownPauseUpDownUpDownStrawberryShortcakeUpHome

Now, that that’s out of the way, there were a few comments on the 6th deadly sin post that I’d like to comment on:

(BobHy) “Inappropriate” is relative to time (as your first story illustrates) and point-of-view (“cui bono”).  Would you agree that cleverness which benefits the user who runs the code (efficiency) or the team who maintains it (maintainability) is appropriate?  And if the clever bits trade off one of these against the other, who gets to decide the direction?  

Inappropriate is obviously a value judgement. Who decides? Well, I do, since it was my post. In general, I’d expect that the team would decide, though in my experience there aren’t that many teams with the sort of culture that allows this.

But, yeah, I agree with this.

(BobHy) OTOH, even when we consider cleverness whose only benefit is to amuse its author, it’s fair to ask whether the author’s happiness is itself a benefit to the user community (“a happy programmer is a cooperative, bug-fixing programmer”) or to the team (“a happy programmer remains on the team and trains his/her successor”).  

An ancient Chinese programmer once said,

I pity the fool who has to maintain or extend the code of a programmer who wrote clever code because it made him happy.

I’ve experienced many cases where a developer writes code that is clever, moves on to another job, and then the people who have to deal with the code have to deal with that cleverness. My current team is dealing with this now, and it’s a big problem for us.

If you want to amuse yourself writing clever code, do it on your own time. But don’t make your team deal with it.

(BobHy) I’m beginning to think “inappropriate cleverness” may be a vacuous term. But there certainly is “incorrect” cleverness, where the claimed benefit of the design is refutable.

You are obviously not a long-term reader (assuming, for the sake of argument, that such a person exists) if you find the presence of a vacuous term in a post of *mine* surprising.

The difference between inappropriate and incorrect seems mostly one of semantics. I chose inappropriate because I think it’s more of a judgement call, while incorrect implies a black and white choice.

(Tomer) It’s funny, I keep running into arguments as to the pros and cons of the sort of “codespace-optimized” loops you mentioned. I’m actually very much for them – they’re usually used where the algorithm is very simple (for example, when writing to a frame buffer and adding a stride value to your target pointer) and tend to make the code “look” a lot more elegant.

I will confess to being somewhat of a code aesthete in my earlier days, looking for more compact expressions. I got started in that back when I was writing games in interpreted basic, and the size of your code could have a big impact on how fast your program ran.

Over the years, writing a lot of code and trying to understand other people’s code, I realized that the #1 most important feature of well-written code is clarity. And for me, the two most important facets of that are:

  1. Predictability (the code in method a looks like the code in method b)
  2. Simplicity

The world where for loops just do looping is a conceptual simplification over the world where they sometimes do additional things. In the first case, I read the for statement once, and then concentrate on the loop body. In the second case, I read the for statement, read the loop body, but have to remember that there’s something else going on that I have to go back and refer to later. When you write that, you’re making me work harder, when you simply could have put “j += stride;” as the last statement in the loop.

That’s why foreach is such a useful construct. It’s not that you type less. It’s that it’s conceptually simple – you know that there’s no funny business going on, and all you have to worry about is the loop body.

(Shahar) We used to have a dev in a previous company I worked at (circa 2000) who had his own special way of doing HRESULT-COM error checking..

Shahar goes on to detail what the practice was, which I agree is bad, but it’s a matter of degree, since all HRESULT handling techniques suck.

*****

Most code is read far more times than it is written. Go read this post by Peter Hallam, one of the C# compiler devs.

Seven Deadly Sins of Programming – #6

May 31, 2006 at 11:14 am



Way back in my early career, I worked on a graphics program written in FORTRAN that was used to plot engineering data.


It ran on Tektronix 4014 terminals.


Our application – known as E.G.G. (Engineering Graphics Generator, one of a whole series of not-very-creative punny names used by the group) – used a menu-based approach. The user would place the cursor (moved by the thumbwheels on the right) over a menu option, hit the spacebar, and then the code would figure out what to do.


The internal architecture of the app was really pretty good. There were, if I recall correctly, 39 different screens, and each item on the screen had an id associated with it. The code would find the right item on the page, come up with a numerical code (menu number * 100 + item number), find that in an array of function pointers, and then call the appropriate function.
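In C# terms, that dispatch scheme might be sketched like this. The menu * 100 + item numbering is from the story; everything else is invented for illustration:

```csharp
using System.Collections.Generic;

public delegate void CommandHandler();

// A sketch of the menu dispatch described above: each screen item maps to
// a numeric code (menu number * 100 + item number), which is looked up in
// a table of handlers.
public class CommandDispatcher
{
    private Dictionary<int, CommandHandler> handlers =
        new Dictionary<int, CommandHandler>();

    public void Register(int menu, int item, CommandHandler handler)
    {
        handlers[menu * 100 + item] = handler;
    }

    public bool Dispatch(int menu, int item)
    {
        CommandHandler handler;
        if (handlers.TryGetValue(menu * 100 + item, out handler))
        {
            handler();
            return true;
        }
        return false;
    }
}
```

(The original stored the codes in a sorted array and searched it; a hash table makes the whole which-search-is-fastest question moot.)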


It worked fine, and allowed us to have a batch mode as well.


But after I started playing around with it a bit, I noticed that the command dispatch was a bit slow. After you selected an item on a screen, it took at least a couple of seconds for the screen to erase and redraw (this was running on shared Vaxen, so you didn’t get much CPU time, and what you got was fairly slow).


I did some digging, and found that the original developer (no longer with the group) had written the item lookup code using a Fibonaccian search.  This is a search that is similar to binary search, but doesn’t require complex operations such as divide by two or shift right (that alone should tell you the vintage of this search).


Such a search *may* have made sense on earlier systems, but on Vaxen, both divide by two and shift right are in the hardware. So, we were using a search method optimized for constraints we didn’t have, that nobody in the group understood, in an attempt to be faster than binary search.


But it was a *clever* algorithm.


At this time, we had about 1200 items in our dispatch table. The first thing I did was build a sub-index that stored the starting index of the first item in each menu, which meant the search space for a typical menu item went from 1200 items to about 30. The second thing I did was switch to interpolation search, which, given the nearly perfect distribution of item numbers within a menu (most menus had item numbers from 1 to n, with very few holes), gave an almost perfect search. And it was code that most people in the group could understand.


So, the dispatch time went from a few seconds to about half a second, and I wrote up a “cost savings” in which I multiplied the number of users we had by the amount of time they used the application by the number of menu picks per hour, and figured out that I had saved the company around $30K per year. For which I got a $100 gift certificate.


So, what’s the point of that story?


Well, it makes me look good, which is the primary goal. It also illustrates that cleverness is an over-rated attribute of developers. How many times have you seen:


for (int i = 0, j = bounds; i < upper; i++, j += JUMP_SIZE)
{
    …
}


That’s somebody who is trying to be clever. They should have written:


int j = bounds;
for (int i = 0; i < upper; i++)
{
    …
    j += JUMP_SIZE;
}


Which takes us on a long and tortuous trip to


Sin #6 – Inappropriately clever code


What examples of this have you seen?

Seven Deadly Sins of Programming – #7 – Excessive coupling

May 23, 2006 at 11:43 am

(these will be in reverse order…)


#7 – Excessive Coupling


We start out with a simple one.


Coupling is obviously a necessary evil. The Console class, for example, wouldn’t be able to send text through Win32 without being coupled to Win32.


Given that, less coupling is nearly always better than more coupling. A few cases that I see:



  1. Using switch statements rather than polymorphism or interfaces.

  2. Classes that do too much, rather than being focused on a single goal.

What do you think? Do you have more examples?

Random sometimes, not random other times

May 19, 2006 at 11:43 am

From an email:


private void button1_Click(object sender, EventArgs e)
{
   Customer c = new Customer();
   c.Randomize();


   Customer b = new Customer();
   b.Randomize();


   MessageBox.Show(string.Format(
        "object c random number = {0}, object b random number = {1}",
        c.RandomNumber, b.RandomNumber));
}


public class Customer
{
   private int random = 0;


   public void Randomize()
   {
      Random r = new Random();
      random = r.Next();
   }


   public int RandomNumber
   {
      get { return random; }
   }
}


If I run the above code without debugging it always returns the same random number, but if I step through the code I get different random numbers.


******


This is a variant of a question that comes up fairly often.


When you create an instance of the Random class, the pseudo-random number generator needs to be seeded with an initial value. Some generators always seed with zero, which means you always get the same sequence of numbers. The .NET Random class seeds using the time, which means you get different results from run to run without having to set the seed yourself.


It also means, however, that if you create two Random instances one right after another, the seed value (I think it uses GetTickCount()) hasn’t had time to change, and you get the same sequence. But if you run it in the debugger, enough time passes between the creation of the two instances that you get different seeds and different numbers.


I think the best solution here is to move the Random instance to a static field:


static Random r = new Random();


And then just use that instance of Random from all the Customer instances.
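Put back into the example, the fixed Customer might look like this (the same class as above, with the generator made static):

```csharp
using System;

public class Customer
{
    // One shared generator, seeded once from the clock; all Customer
    // instances draw from the same sequence.
    private static Random r = new Random();

    private int random = 0;

    public void Randomize()
    {
        random = r.Next();
    }

    public int RandomNumber
    {
        get { return random; }
    }
}
```

(If Customers are randomized from multiple threads, access to the shared Random instance needs to be synchronized, since Random isn’t thread-safe – but that’s beyond the original question.)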

The Seven Deadly Sins of Programmers

May 16, 2006 at 12:53 pm

A while back, I made an offhand comment about something being one of the seven deadly sins of programmers (/programming/developers/software engineering/coding/…).


At the time, I really didn’t have 7 things in mind, but after a little thought, I came up with what I think is a good list.


But before I write the first entry, I’d like you to spend 5 minutes and write down your list of deadly programmer sins. Though 7 is the canonical number of sins, your list can have any number of items. After I’ve gone through my list, I’ll do another post where you can share your list.


[Update: Sorry I wasn’t clearer. Write your list down, and then post it when I’m finished. Or post it now on your own blog if you’d like]