Sunday, 6 November 2011

Comments are not code

[Image: Code review. Credit: Richard Masoner / Cyclelicious via Flickr]
I'm a firm believer that the best software documentation is the running code. If the code is well structured and written, it speaks for itself and it does not need any additional documentation. Comments are not code and therefore should not be used where better code organization would suffice.

A misuse of comments that I often see while doing code reviews is using them to divide a method into logical subunits. For example:

def check_specific_candidate(candidate):
    
    # first check if we already have X by any chance
    < 10 lines of code, return if true>

    # Try out if candidate is Y
    < 30 lines of code, return if true>

    # candidate is not Y, try out if it is Z
    < another 30 lines of code, return if true> 

    # construct a list of elements in the candidate
    < another 30 lines of code>

    if len(list_of_elements) > 0:
        # process list of elements for the candidate
        < another 10 lines of code>

This example is based on an actual routine in the Zemanta code base that is altogether 140 lines long. Maintaining such code is not a nice experience. While the comments in this routine do help, they are actually a symptom of a larger problem: poor code organization. The comments would immediately become redundant if the routine were split into logical steps, with each step being a separate routine. Let's refactor the above routine accordingly:

def check_specific_candidate(candidate):

    if _candidate_has_X(candidate):
        return

    if _candidate_is_Y(candidate):
        return

    if _candidate_is_Z(candidate):
        return

    list_of_elements = _get_list_of_elements(candidate)
    if len(list_of_elements) > 0:
        _process_list_of_elements(list_of_elements)

So instead of using comments, this routine is now documented using method names. When you approach such code for the first time, seeing a nice 15-line routine is much less stressful than seeing a 140-line monster.

1 comment:

  1. Couldn't agree more. When I read code examples, I often have to delete the comments to see better what is going on.

    When I am writing code, comments are to me a distraction: duplicate, nonfunctioning (and so untested) "code" that sooner or later gets out of sync with the real code. Real code is by default forced to represent what the program does (or doesn't do), while with comments it's just up to the good will and consistency of the person who last updated something.

    I once read "The need for comments is just a sign you haven't refactored code well enough" and your 140-line function is a perfect example.

    In this case compiled languages give you a chance to write better code, because you can refactor as much as you want and count on the compiler to inline/optimize/whatever to keep execution fast, while in interpreted languages you sometimes have to "manually inline" because each function call can carry some performance penalty. I have no idea how this stands with JITs.
