When it comes to reading code (mine or someone else’s, it doesn’t matter; reading is reading), I have a really short attention span. I don’t want to spend a lot of time analyzing the code to figure out what it’s doing. I want to treat the code just like the sun: glance to get a sense of it, and then look away.
This is especially true when debugging. Most of the time you spend debugging is spent trying to find the cause of the problem. Once you have found the cause and collected your data/evidence, the actual task of fixing the problem is pretty trivial. When I debug, I’d like to instantly know what the code is doing. Any extra time I spend figuring out what the code is doing is really annoying, even if it’s just a few seconds.
A Second Saved Is A Second Earned
Granted, the time we’re talking about is really short, perhaps just a minute or so, but for me, a big part of debugging is maintaining momentum. I treat anything that acts like a speed bump on my Debugging Autobahn as a big problem: it’s hard to get to a cruising speed of 120MPH when you spend a lot of your time changing the tires.
I guess the main reason it annoys me so much is that unclear code is so easy to avoid. If the original programmer thought ahead, even just a little, it would save a lot of effort down the road. Even if I’m the original programmer and I’m the only one who will ever see the code, I want to make debugging as easy for myself as possible. Debugging isn’t the most fun part of a project, so when you have to do it, treat it like a guerrilla strike: get in, fix the problem, and get out.
Here are some things I’ve learned over the years that could help you save time. Some might see this as another “coding style” article, but it really isn’t. It’s merely a list of things you can do when writing code to help you in your debugging efforts.
Self-Documenting Code
I’ll start by briefly talking about self-documenting code, since it’s probably the most obvious. You should choose function names that indicate the task the function performs. Variables should be named based on what they store. I won’t suggest a naming format here; all I ask is that you remain consistent with whatever format you use.
I’d also suggest that you err on the side of verbosity. For example, what is easier to understand: memcpy() or memorycopy()? How about this: memcpy() or copymemory()? Brevity doesn’t really get you anything except a few fewer keystrokes, and I’d rather have a better idea of what the function does than try to decipher a short name.
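As a rough sketch of the difference, here are two wrappers around the standard memcpy(). Both function names and all parameter names are made up for illustration; only std::memcpy itself is a real library call.

```cpp
#include <cstddef>
#include <cstring>

// Terse: the reader has to open the function to learn what d, s, and n are.
void cp(char* d, const char* s, std::size_t n)
{
    std::memcpy(d, s, n);
}

// Verbose: the signature alone says what goes where.
void copy_memory(char* destination, const char* source, std::size_t byte_count)
{
    std::memcpy(destination, source, byte_count);
}
```

Both do exactly the same work; the second one just costs less attention at the call site.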
Good Use of Whitespace
Write code like you would write a book (or blog post). I don’t mean flowery prose or using words that you wouldn’t normally use. I just mean write your code to make it easy to read. The best way to do that is to use whitespace.
Which block of code is easier to read?
m_audio_system = NULL;
m_audio_type = AUDIOTYPE_NOTSET;
m_current_song_id = -1;
for (int i = 0; i < NUM_SONGS; i++) {
    m_song[i].m_audio_item = NULL;
    m_song[i].m_channel = -1;
    m_song[i].m_data = NULL;
}
for (int i = 0; i < NUM_SFX; i++) {
    m_sfx[i].m_audio_item = NULL;
    m_sfx[i].m_channel = -1;
    m_sfx[i].m_data = NULL;
}
or
m_audio_system    = NULL;
m_audio_type      = AUDIOTYPE_NOTSET;
m_current_song_id = -1;

for (int i = 0; i < NUM_SONGS; i++)
{
    m_song[i].m_audio_item = NULL;
    m_song[i].m_channel    = -1;
    m_song[i].m_data       = NULL;
}

for (int i = 0; i < NUM_SFX; i++)
{
    m_sfx[i].m_audio_item = NULL;
    m_sfx[i].m_channel    = -1;
    m_sfx[i].m_data       = NULL;
}
Notice here that “whitespace” has a pretty broad definition. I don’t just mean blank lines. I also include braces and whitespace in the actual lines of code as well. (I don’t want you to dwell on where the braces are. I’d probably agree with you that if your code style has the { on the same line it would still be just as readable. I’d rather you focus on the general readability of the blocks of code instead.)
Here is an exercise: compare how long it takes you to find the assignments in each block of code. Would you rather scan each line for the = in the first listing or have your eyes instantly focus on the ‘block’ of assignments in the second listing? The code in the second listing almost forces your eyes to look at the assignments first.
Doing this doesn’t take any more typing effort either. It’s just a tab versus a space.
One Statement Per Line
One thing I really dislike when stepping through code is finding more than one statement per line. Once I come across it, I have to take extra steps to see what the code will actually do. This usually ends with me stepping into code I didn’t necessarily want to step into.
The same can be said for a statement on the same line as a conditional. I have to step into it to see the result of the test, because the if could be calling a routine (or a macro or whatever).
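To make the stepping problem concrete, here is a minimal sketch of the same loop written both ways. The helper is_below() and both counting functions are hypothetical names invented for this example.

```cpp
// Hypothetical helper, invented for this sketch.
bool is_below(int value, int limit)
{
    return value < limit;
}

// Compact: the loop, the test, and the increment share one line, so a
// line-based debugger treats them as a single step.
int count_below_compact(const int* values, int count, int limit)
{
    int hits = 0;
    for (int i = 0; i < count; i++) { if (is_below(values[i], limit)) hits++; }
    return hits;
}

// One statement per line: the result of the test lands in a variable
// that can be inspected before the branch is taken.
int count_below_stepwise(const int* values, int count, int limit)
{
    int hits = 0;
    for (int i = 0; i < count; i++)
    {
        const bool below = is_below(values[i], limit);
        if (below)
        {
            hits++;
        }
    }
    return hits;
}
```

The two functions return the same count. The only difference is how many places the debugger can stop.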
Comments and Commenting
Some would argue that comments are vital and that a programmer should comment as much as possible. Others would say that if you have self-documenting code, you don’t need comments. I disagree with both and say that comments and self-documenting code can (and should) happily coexist.
Enhance the Flavor
Here’s something you might not have heard before: comments can lie, so don’t trust them. It’s only the code that tells the program how to behave, not the comments. The comments are just a vague notion of what the code does.
So far, it might sound like I’m totally against commenting; I’m not. What I’m suggesting is that you complement the code with effective comments. I like to think of comments like salt: a little sprinkle enhances the flavor of the meal. Too much salt and you ruin it. To that end, commenting every line is silly. If the code is clearly understandable, don’t make it less understandable by adding a comment to it.
For example, don’t do this:
// increment the number of trees
m_num_trees++;
Why vs. How
Comments should discuss the why, not the how. The source code is already the how. Here is a list of guidelines on what I consider to be effective comments:
Comments can act as reminders
Use comments to jog your memory when you have to implement something later. Just make sure you remove the comments once you do implement it.
Describe an algorithm
If an algorithm is complex, it’s a good idea to give an overview of what the algorithm is supposed to achieve and how it does it (in general terms).
Give signposts
If a routine is multi-purpose (let’s say, for example, that it loads the resources for a section of a game, and the resources are graphics, audio, and data), comments can be used to break up the routine into digestible sections. One could argue that the routine should be split into sub-routines: one for loading just graphics, one for just audio, and another for the data. That is certainly an option if you’re so inclined, but my point is that you can separate sections of code using comments.
Comment the choices made
When you optimize code (and I mean really optimize it), sometimes what you’re left with is convoluted and unreadable. Comments can explain the choices that led to the code ending up in that state.
This is very different than commenting bad code. If you’re using comments as a crutch, your time would be better spent rewriting the code.
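As a small illustration of these guidelines, the sketch below uses comments as signposts and to record why a choice was made, and leaves the how to the code. The routine, its name, and the reasons given in the comments are all invented for this example.

```cpp
#include <vector>

// Hypothetical routine, invented for this sketch.
int average_frame_time_ms(const std::vector<int>& frame_times_ms)
{
    // --- Guard ---
    // Why: an empty history means nothing has been measured yet, so
    // report 0 instead of dividing by zero.
    if (frame_times_ms.empty())
    {
        return 0;
    }

    // --- Sum ---
    long long total_ms = 0;
    for (int frame_time_ms : frame_times_ms)
    {
        total_ms += frame_time_ms;
    }

    // --- Average ---
    // Why: integer division is deliberate here; the caller is assumed
    // to need whole milliseconds only.
    const long long frame_count = static_cast<long long>(frame_times_ms.size());
    return static_cast<int>(total_ms / frame_count);
}
```

Notice that none of the comments restate what a line does. Each one either marks a section or answers a question the code can’t answer on its own.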
Using Parentheses
I think I’d have to say that the lack of parentheses is probably the biggest time-waster when trying to analyze code. Sure, parentheses aren’t needed all of the time, but let me ask you this: why would you waste time determining whether a set of parentheses is needed or not? Why not skip that hesitation and just add them all the time? Who cares if they’re not needed? They’ll make the code easier to understand. Here is a simple example. Which would you rather read?
new_xpos = getX() + cos(direction) * speed + offset * factor;
or
new_xpos = getX() + (cos(direction) * speed) + (offset * factor);
How about when the operations get really complex? What about for complex conditional statements?
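For a conditional, the same idea might look like the sketch below. Both functions are hypothetical and compute the same thing; the first leans on C++ operator precedence (multiplication, then addition and subtraction, then comparison, then &&), and the second spells the grouping out.

```cpp
// Relies on the reader knowing the precedence rules by heart.
bool overlaps_range_terse(int x, int half_width, int scale, int left, int right)
{
    return x + half_width * scale >= left && x - half_width * scale <= right;
}

// Same expression, with the grouping made explicit.
bool overlaps_range_clear(int x, int half_width, int scale, int left, int right)
{
    return ((x + (half_width * scale)) >= left) && ((x - (half_width * scale)) <= right);
}
```

The compiler doesn’t care which one you write, but the second can be read without stopping to think about what binds to what.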
Conditionals
Conditionals are also a good place for speed bumps to hide. It’s perfectly legal for assignments to be made within a conditional, as in the following example:
int some_int;
if ((some_int = atoi(...)) < 100)
However, this breaks the “only one statement per line” guideline. It would be much easier to trace if it were rewritten like this:
int some_int = atoi(...);
if (some_int < 100)
It’s easier to trace because now you can see the value in a variable before the condition is checked.
The ?: Operator
The ?: operator is just a shorthand version of an if/else construct, and it also breaks the “only one statement per line” guideline. You can use this operator in a number of places where you can’t use a traditional if/else construct (function arguments are probably the most common), but I would advise against using it at all. It is just an if/else in disguise, so you have to trace both paths. Also, if you’re not careful and don’t spot it when tracing, you could step over it by accident.
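Here is a minimal sketch of the trade-off, using a made-up damage calculation. Both function names and the halving rule are assumptions for illustration only.

```cpp
// Ternary: the branch is buried inside the expression, so a line-based
// debugger steps over the decision in one go.
int apply_damage_ternary(int health, int damage, bool shielded)
{
    return health - (shielded ? damage / 2 : damage);
}

// if/else: each path gets its own line, and the chosen value can be
// inspected before it is used.
int apply_damage_if_else(int health, int damage, bool shielded)
{
    int effective_damage;
    if (shielded)
    {
        effective_damage = damage / 2;
    }
    else
    {
        effective_damage = damage;
    }
    return health - effective_damage;
}
```

The second version is longer, but you can put a breakpoint on either branch and see which one was taken.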
switches vs. if/else
Are you ready for a coding horror show? Here is an actual block of code from a commercially released program (no, it wasn’t written by me). It’s about 11 years old and it’s so wrong on so many levels that it would take an entire post to describe everything that’s wrong with it.
switch(((recievecheck[plno] == 2) << 1) + (recievetime[plno] <= LOSTCHECK * SEC))
{
case 0:
    {
        int counterj = 0;
        for(counterj = 0;counterj < PLAYER_MAX;counterj++)
        {
            if(toInfo[counterj].addr == addr[plno])
            {
                toInfo[counterj].addr = 0;
                --toMax;
            }
        }
        CheckStart();
    }
    break;
case 1:
    ++recievetime[plno];
    break;
case 2:
case 3:
    recievetime[plno] = 0;
    break;
}
Don’t ever do this. Ever. Or you will be shot. Repeatedly.
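For comparison, here is one way the same three-way decision could be sketched with a plain if/else chain. This is a reconstruction under assumptions, not the original program’s code: the Action enum, the function names, and the names of the two conditions are invented, and what the value 2 actually means in the original is a guess. The first function keeps the packed switch only so the two can be checked against each other.

```cpp
// Hypothetical labels for what each case body appears to do.
enum class Action { DropPlayer, CountTick, ResetTimer };

// The original trick: pack two tests into a 2-bit number and switch on it.
Action decide_with_switch(int check, int time, int limit)
{
    switch (((check == 2) << 1) + (time <= limit))
    {
    case 0:
        return Action::DropPlayer;
    case 1:
        return Action::CountTick;
    case 2:
    case 3:
        return Action::ResetTimer;
    }
    return Action::ResetTimer; // not reachable; the switch value is always 0..3
}

// The same decision with named conditions and one test per line.
Action decide_with_if_else(int check, int time, int limit)
{
    const bool packet_received   = (check == 2);    // assumed meaning of 2
    const bool within_time_limit = (time <= limit);

    if (packet_received)
    {
        return Action::ResetTimer;
    }
    else if (within_time_limit)
    {
        return Action::CountTick;
    }
    else
    {
        return Action::DropPlayer;
    }
}
```

With the conditions named, each branch can be read (and stepped through) without decoding what 0, 1, 2, and 3 stand for.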
Consistent Bits
The final topic is how to use bits effectively. There are two ways to use bits: with math operations and with bitwise operations. I’m not saying to use only one or the other (both have their purposes), but what I am suggesting is that you choose the one that’s appropriate for your needs and stick with it. Don’t mix them.
If you are working in a math environment, always use math operations. Here is an example:
half = some_number / 2;
Compare that to this version:
half = some_number >> 1;
Both give the same result (for non-negative values, at least), but it’s sort of “old-skool thinking” to believe the >> 1 version is more efficient. Given the maturity of today’s compilers, this isn’t true anymore. Which one is instantly understandable?
How about a bitwise example?
some_short = (high_byte << 8) | low_byte;
versus
some_short = (high_byte * 256) + low_byte;
Since the result will most likely be used (or viewed when debugging) as bits or as a hex value, keep the operations bitwise to give the result more context.
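A minimal sketch of staying in “bit mode” from start to finish might look like this. The three function names are made up for illustration; the fixed-width types come from the standard <cstdint> header.

```cpp
#include <cstdint>

// Pack two bytes into a 16-bit value using only bitwise operations.
std::uint16_t pack_bytes(std::uint8_t high_byte, std::uint8_t low_byte)
{
    return static_cast<std::uint16_t>((high_byte << 8) | low_byte);
}

// Unpack with the matching bitwise operations, not with / 256 and % 256.
std::uint8_t high_byte_of(std::uint16_t value)
{
    return static_cast<std::uint8_t>(value >> 8);
}

std::uint8_t low_byte_of(std::uint16_t value)
{
    return static_cast<std::uint8_t>(value & 0xFF);
}
```

Packing 0x12 and 0x34 gives 0x1234, which is exactly how the value reads in a debugger’s hex view, so the code and the watch window tell the same story.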
The Take Away
I hope I have given you some food for thought. It’s really simple to make it easy for yourself (and your co-workers if you’re sharing code) to debug and maintain your code.