Bad Habit #3

Solutions for Developers, Services for Companies, Software for Users™

A blog is now available Here.

Interested in becoming an Affiliate and earning 25% of all generated sales? Click Here.

Handmade Art and
Treasures!

25% off for first-time customers at VistaPrint!

Bad Habit #3: misuse of string objects

[Previous Article In Segment] [Next Article In Segment]

String objects are wonderful. They take care of NUL termination, they automatically handle growing and shrinking as required, and they handle the maintenance baggage normally associated with buffer-based strings (length, buffer capacity, allocation, deallocation, etc).

Anyone that is familiar with MFC's CString object or any object derived from the STL basic_string class knows what I am talking about. String objects allows us to easily grab a string from an edit control in MFC:

CString sStrObj;

m_editName.GetWindowText( sStrObj );

Or to return a constructed string:

    CString CSomeClass::GetNameWithHello( void )
    {
        CString sStrObj;
        CString sReturn;

        m_editName.GetWindowText( sStrObj );
        sReturn = "Hello ";
        sReturn += sStrObj;
        sReturn += "!";

        return( sReturn );
    }

Simple, easy, and probably the slowest way to handle such a small amount of required memory! String objects normally use dynamically allocated memory, which as shown in the Previous Article, is often misused for small buffer requirements such as this.

Real abuse of string objects comes when you start requiring string objects as parameters to a method but for no good reason. For example:

    void CSomeClass::SetName( CString sName )
    {
        m_editName.SetWindowText( sName );
        return;
    }

Or...

    void CSomeClass::SetName( std::string ssName )
    {
        m_editName.SetWindowText( ssName.c_str() );
        return;
    }

The above method requires a string object, but does not really need one; a simple const char * (or TCHAR) would work just as well, if not better. What the writer of this method has done is not made life simpler by using string objects, but rather has complicated it. If the caller is not using string objects, or a different kind of string object, they now have to create one (either explicitly or implicitly) in order to satisfy the requirements of the function.

Disappearing Memory

Another danger comes when you get too comfortable with certain string objects, like CString, which have overloaded operators that allow them to be used in place of a const char * (or a const wchar_t *). For example, using the GetNameWithHello(...) function created above, you can write the following code:

LPCSTR cpHelloString = GetNameWithHello( void );

MessageBox( cpHelloString );

The above code will usually compile and link successfully. But will likely crash or display garbage when executed. Can you see the problem? Worse yet, the error resulting from the problem may not happen at all under certain situations!

The problem is that a temporary CString object is returned from the function to the caller. This CString object gets constructed and initialized when the function exists and then gets immediately destructed, before the return value gets back to the caller. What does that mean? It means that the cpHelloString variable is now pointing to invalid data, because the temporary CString object has been destroyed! That being said, you often find code written exactly like the above that works correctly! What gives?

If the function was written as:

    CString CSomeClass::GetNameWithHello( void )
    {
        CString sStrObj;

        m_editName.GetWindowText( sStrObj );
        m_sReturn = "Hello ";
        m_sReturn += sStrObj;
        m_sReturn += "!";

        return( m_sReturn );
    }

Note that the returned CString object is now a copy of a member of the containing class. This means that it (m_sReturn) lives beyond the scope of the function. Although the return of the function is a CString and not a reference to a CString, and a temporary CString is still created, calling this version of the function results in valid, usable data being placed into the cpHelloString pointer! Why? Because CString objects use reference counting and copy-on-write semantics.

So when the new version of the function is called, the temporary CString object that is constructed references the same memory as the m_sReturn CString object. The result? The underlying string data is not destroyed when the temporary CString object is, and the pointer ends up pointing to that existing (valid) data. These kinds of gotchas are what confuse developers when they start having trouble involving temporary CString objects.

The Big Killer

Take a look at the following code. It is representative of code that I encountered in an otherwise well-designed application:

    //
    // ...
    //
    catch( CMemoryException *pEx )
    {
        CString sMessage( _T( "The operation could not be performed:\n" ) );

        BuildStackTrace( sMessage );
        DecodeErrorMessage( sMessage, pEx );

        MessageBox( sMessage, _T( "..." ), ( MB_ICONERROR | MB_OK ) );
        pEx -> Delete();
    }

Can you see the major problem with the above code? It stems from being too comfortable with string objects. If the application is handling a memory-related issue (often an out-of-memory condition or heap corruption), why in the hell would you then try to allocate more memory in trying to handle the situation?

In Closing

The best way to avoid all of the problems is to follow basic KISS philosophy: Keep It Simple, Stupid! If you do not really need a string object, then do not take/create one! Always use the simplest (or "more basic") type that you need to solve a particular problem.

[Previous Article In Segment] [Next Article In Segment]