The Dangers of Hidden Complexity

One of the things that I think is very dangerous in today’s software development world is hidden complexity. Hidden complexity is best demonstrated by something that all MFC developers are familiar with – CString. Here is a simple scenario to help with the demonstration.

Imagine that you are interviewing someone for a senior developer position at your company. You have been handed a standard Q&A list from HR which includes some code-related questions that require typed answers. Question #3 seems simple enough: How would you format a random int value for display in a Win32 message box? Each question has a 15-minute time limit for the answer, so after presenting this question to the developer, you decide to go and get some coffee.

When you return 10 minutes later you are surprised to see that the developer is still coding their answer! They finish a couple of minutes later and present you with the following code:

void    Exam1( void )
{
    long    lTheValue = rand();
    int     iCount = 0;                     // characters needed, excluding the terminator

    if( !( rand() % 3 ) )                   // make roughly one value in three negative
    {
        lTheValue = -lTheValue;
    }

    unsigned long   lValue = 0;
    int             iDigitVal = 0;

    if( lTheValue < 0 )
    {
        ++iCount;                           // room for the minus sign
        lValue = (unsigned long)(-(long)lTheValue);
    }
    else
    {
        lValue = lTheValue;
    }

    do                                      // first pass: count the digits
    {
        iDigitVal = ( lValue % 10 );
        lValue /= 10;
        ++iCount;
    } while( lValue > 0 );

    TCHAR   *cpBuf = new TCHAR[ iCount + 1 ];   // TCHAR, so it also builds for Unicode
    TCHAR   *cpCursor = cpBuf;
    TCHAR   *cpFirstDigit = NULL;

    if( lTheValue < 0 )
    {
        *cpCursor++ = _T( '-' );
        lValue = (unsigned long)(-(long)lTheValue);
    }
    else
    {
        lValue = lTheValue;
    }
    cpFirstDigit = cpCursor;

    do                                      // second pass: emit digits, least significant first
    {
        iDigitVal = ( lValue % 10 );
        lValue /= 10;
        *cpCursor++ = TCHAR( iDigitVal + _T( '0' ) );
    } while( lValue > 0 );

    *cpCursor-- = _T( '\0' );               // terminate, then step back onto the last digit

    do                                      // reverse the digits in place
    {
        TCHAR cSwap = 0;

        cSwap = *cpCursor;
        *cpCursor-- = *cpFirstDigit;
        *cpFirstDigit++ = cSwap;
    } while( cpFirstDigit < cpCursor );

    ::MessageBox( NULL, cpBuf, _T( "The Value Is" ),
            MB_OK );

    delete [] cpBuf;

    return;
}

So you take a good look at the code and see that it is performing a few distinct steps:

  • It is processing the integer value one time to determine how much space will be required to store the int-converted-to-string value
     
  • It is dynamically allocating the memory required for the string value based on the calculation above
     
  • It processes the integer value a second time to actually generate the int-converted-to-string value and builds it in the allocated buffer
     
  • It shows the value in a Win32 MessageBox
     
  • It deallocates the memory it allocated earlier
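For readers without a Win32 build handy, the count-allocate-fill steps in the list above can be sketched in portable C++. This is only an illustration, not the candidate's code: the names CountChars and ToDecimal are hypothetical, the message box is left out, and the buffer is filled from the back so that no reversal step is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Magnitude of a signed value, computed in unsigned arithmetic so that
// the most negative long does not overflow when negated.
static unsigned long Magnitude(long value)
{
    return value < 0 ? 0UL - static_cast<unsigned long>(value)
                     : static_cast<unsigned long>(value);
}

// Pass 1: count the characters needed (digits plus an optional minus
// sign, excluding the terminator).
std::size_t CountChars(long value)
{
    unsigned long magnitude = Magnitude(value);
    std::size_t count = value < 0 ? 1 : 0;
    do {
        magnitude /= 10;
        ++count;
    } while (magnitude > 0);
    return count;
}

// Pass 2: allocate exactly once, then write the digits back to front.
std::unique_ptr<char[]> ToDecimal(long value)
{
    const std::size_t count = CountChars(value);
    std::unique_ptr<char[]> buffer(new char[count + 1]);

    unsigned long magnitude = Magnitude(value);
    std::size_t pos = count;
    buffer[pos] = '\0';
    do {
        assert(pos > 0);
        buffer[--pos] = static_cast<char>('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude > 0);

    if (value < 0) {
        buffer[--pos] = '-';
    }
    assert(pos == 0);   // every counted character was written
    return buffer;
}
```

Calling ToDecimal(-1234).get() yields a pointer to a null-terminated string that could be handed to a message box; the unique_ptr takes care of the delete [] from the last step.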

Now, step back from the interview scenario and think about the code as it relates to the problem.  True, it does exactly what it is supposed to do, but does it not seem a tad bit heavyweight for such a simple problem?

Most developers would look at that solution and figure that there must be a simpler way to do things, and they are 100% correct. However, the problem is this – suppose the developer, instead of writing the code above, wrote this instead:

void    Exam1( void )
{
    CString sValue;
    int     iValue = rand();

    sValue.Format( _T( "%d" ), iValue );
    ::MessageBox( NULL, sValue, _T( "The Value Is" ), MB_OK );
}

Here is the problem with this code… Some inexperienced developers (even ones that do not realize that they are inexperienced) would think that this code is much simpler than the original code above, and might even consider it to be an acceptable answer. And, truth be told, it sure looks like a much simpler solution.

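But would you believe that the code executed for this usage of CString::Format(…) is actually more complex than the original code above? Behind that one call, the format string has to be parsed, the size of the result worked out, a buffer allocated for it, and the conversion performed, all by general-purpose code that handles far more cases than the single %d used here. If the CString object was not a new object, and had already been used, it might be even more complex if reallocation was necessary. Now what do you think about this code? Not really as simple as it looks, is it?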
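To give a feel for what a Format-style call has to do, here is a rough portable sketch. It is not the MFC implementation: FormatSketch is a hypothetical stand-in that uses std::string and std::vsnprintf, and I am assuming the real thing follows broadly the same measure-allocate-format shape.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a Format-style member function.
std::string FormatSketch(const char* format, ...)
{
    va_list args;
    va_start(args, format);

    // Step 1: a full formatting pass with no output buffer, purely to
    // learn how many characters the result will need.
    va_list measureArgs;
    va_copy(measureArgs, args);
    const int length = std::vsnprintf(nullptr, 0, format, measureArgs);
    va_end(measureArgs);

    if (length < 0) {               // the format string could not be processed
        va_end(args);
        return std::string();
    }

    // Step 2: size the destination. For anything beyond a short string
    // this is a heap allocation, and it can throw std::bad_alloc.
    std::string result(static_cast<std::size_t>(length), '\0');

    // Step 3: a second formatting pass that actually writes the characters
    // (the extra byte is the terminator the string already reserves).
    std::vsnprintf(&result[0], static_cast<std::size_t>(length) + 1, format, args);
    va_end(args);

    assert(result.size() == static_cast<std::size_t>(length));
    return result;
}
```

Even in this stripped-down form, the one-line call FormatSketch("%d", iValue) walks the format string twice and may touch the heap, none of which is visible at the call site.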

Why would people accept the above CString example as a better solution? Such is the danger with hidden complexity – there is lots of innocent-looking code out there that looks “simple” to the inexperienced/untrained eye. And there are lots of inexperienced/untrained eyes out there.

As modern, professional developers, we have to start paying attention to the stuff going on behind the scenes. We have to realize that just because something looks simple does not mean that it is. Knowing what your code is actually doing is important to understanding how to implement a solution.

When the hidden details involve dynamically allocated resources, the stakes get higher. With today’s modern desktop CPUs offering things like true multi-core ability (Athlon X2, Pentium D, etc.), proper multi-threaded development becomes very important. However, most standard runtime heap implementations use a shared heap. When doing multi-threaded development, the word shared normally implies contention.

If you do not know exactly what your code is doing behind the scenes, you cannot identify potential trouble spots where contention may be a concern. Oh, and BTW – since these CString functions can allocate memory, it means that each time they are called is another possible exception point. Do you ever see developers wrapping each call to CString::Format(…) with a try/catch block?
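The exception point is easy to demonstrate with a string type whose allocator is rigged to fail. Everything in this sketch is hypothetical scaffolding (TinyHeapAllocator, TinyString, TryBuildLabel), and it uses std::basic_string and std::bad_alloc rather than CString and whatever exception type MFC raises, but the shape of the problem is the same.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical allocator that refuses any request over 64 bytes, to
// simulate a heap that has run out of room.
template <typename T>
struct TinyHeapAllocator {
    using value_type = T;

    TinyHeapAllocator() = default;
    template <typename U>
    TinyHeapAllocator(const TinyHeapAllocator<U>&) {}

    T* allocate(std::size_t n)
    {
        if (n * sizeof(T) > 64) {
            throw std::bad_alloc();
        }
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const TinyHeapAllocator<T>&, const TinyHeapAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const TinyHeapAllocator<T>&, const TinyHeapAllocator<U>&) { return false; }

using TinyString =
    std::basic_string<char, std::char_traits<char>, TinyHeapAllocator<char>>;

// An innocent-looking string operation, wrapped the way almost nobody
// wraps it. Returns false if the hidden allocation failed.
bool TryBuildLabel(std::size_t digits, TinyString& out)
{
    try {
        out.assign(digits, '9');    // allocates once the text outgrows the current buffer
        assert(out.size() == digits);
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```

A short label succeeds and a long one fails, with nothing at the call site to hint at the difference.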

This kind of ignorance is not something that should be encouraged or tolerated in the field of software development. With the two top complaints about software generally being about speed/performance and stability, developers cannot continue to be ignorant of what goes on behind the scenes. This is true for hand-written compiled applications just as it is for managed/interpreted ones built on a framework.

N.B.: We are making some strides here – the new ATL-based CString class seems to have a way to customize the allocators used. This is a very nice feature, but I bet it will be underutilized. For example, Win32 developers have always been able to create per-thread heaps to help avoid thread contention, but very few actually make use of them. The STL allocators can also be customized in a similar way, but this is rarely done. As such, no one should expect these developers to suddenly wake up and start using per-thread heaps just because CString supports it. Developers have to rise to the task by gaining a more detailed/deep understanding, and using that knowledge to build better solutions.
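To make the per-thread idea a little more concrete, here is a minimal portable sketch of a thread-local arena. On Win32 the closer equivalent would be a private heap per thread created with HeapCreate and HEAP_NO_SERIALIZE; the ThreadArena and LocalArena names below are hypothetical, and a real allocator would also need to free and reuse blocks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size bump arena. Because each thread owns its own
// instance, Allocate needs no lock and never contends with other threads.
class ThreadArena {
public:
    // Returns a suitably aligned block, or nullptr when the arena is full.
    void* Allocate(std::size_t bytes)
    {
        if (bytes > kCapacity) {
            return nullptr;
        }
        const std::size_t rounded = (bytes + kAlign - 1) / kAlign * kAlign;
        if (rounded > kCapacity - used_) {
            return nullptr;
        }
        void* block = storage_ + used_;
        used_ += rounded;
        assert(used_ <= kCapacity);
        return block;
    }

    std::size_t Used() const { return used_; }

private:
    static constexpr std::size_t kAlign = alignof(std::max_align_t);
    static constexpr std::size_t kCapacity = 4096;

    alignas(std::max_align_t) unsigned char storage_[kCapacity];
    std::size_t used_ = 0;
};

// One arena per thread: thread_local gives every thread a separate
// instance the first time that thread calls this function.
ThreadArena& LocalArena()
{
    thread_local ThreadArena arena;
    return arena;
}
```

Code running on any thread simply calls LocalArena().Allocate(n). Two threads doing so at the same time never touch the same bookkeeping, which is exactly the property a shared heap cannot offer.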
