Relish a Small Victory

 
 

I've often lamented that, as technical folks, we seem to be constantly in the middle of a "fire fight". We no sooner resolve an issue to turn and be confronted with another.

It started innocently enough, chatting on Skype with a family member when a pause in the conversation led to an unassuming check of the work e-mail account - and there it was. They all seem to start with the same subject, "HELP!!! I need..." and then all hell breaks loose.

Being a glutton for punishment, I decided to open it. No, it wasn't an ex-Nigerian Prince offering me his fortune - I'm still waiting for the last one to pony up. Quite the contrary, it was an exception message but not just *any* exception message, one involving C++ - I was hooked! There was no turning back. I opened it immediately and saw "Access Violation".

In my current role as an Application Architect I am working with a highly competent Tech Lead, with the intensity of a heroin addict needing a fix and the brilliance of Einstein, who is an aspiring Architect. It's easy enough to conclude how sadly naive this young man is, only because he wants to be an Architect.

The cautionary theme I consistently opine about are the "assumptions". Yes, we can make assumptions as we are working through the definition of the system scope and context, but too many assumptions will lead you astray. It's a nauseating repetitive cadence, but one well established through experience and the lesson of hard-knocks.

It reminds me of the scene in the movie "Colors" where Sean Penn, a rookie L.A. cop, is driving a squad car with the seasoned vetran played by Robert Duvall. Penn, in his need for instant gratification finds law breakers on every street corner - and he's dead set to "get his man". Duvall turns to him and says (I'm paraphrasing here):

"Did you hear the story of the old Bull and the young Bull standing atop a hillside staring into a pasture, of Cows, below? The young Bull turns to the old one and says, "Hey, let's run down the hill and have sex with a Cow!", with all of the unbridled enthusiasm you would expect of youth. The old Bull pauses for a minute, then turns to the young Bull and replies, 'How about we walk down and have sex with them all?'"

I find that story to be an apt metaphor when approaching Architecture and Software Design, at least from the perspective of trying to "eat the elephant in one bite". So what does this have to do with reading an e-mail? It's about the assumptions: never assume anything.

So, returning to the e-mail and the "Access Violation" error. I performed a quick scan of the body of the e-mail, formulated my conclusion and clicked, "Reply-to-all", proudly announcing, "It's a memory management issue." Problem solved, right? Nope. A couple of e-mail exchanges and I learned that my "quick scan" neglected to uncover a massive abuse of Java threading - allocating hundreds of threads using, what appeared to be, shared data. A definite design smell. Ah well, being late, I could go to bed and sleep tight knowing "I solved this problem!"

Though I may not have solved this problem with anything other than a hunch, I was able to gleen a great deal from the "Access Violation" error message. Notably, if memory served me correctly, this was a Debug-compiled error generated from code running under Microsoft Windows and compiled with Visual Studio. It's actually a very helpful message, however, the primary issue is that debug-compiled VS code changes the physical layout of the binary which, given a constrained memory environment, would lead to completely different behavior when non-debug code was run.

Waking, around 2:30 for my usual, old-man, mid-sleeping "trip", I returned to bed and thought, "If it's an offshore group, maybe they've responded. I should check." Thankfully the rational side of my brain overruled the impetuous side and I fell back asleep.

Returning to work in the morning I was greeted with a mid-day, 30 minute discussion, picking up from our previous night's discussion.

The discussion led to a code walk through, starting from the point where the memory access was being thrown. I must admit, it felt good to be looking C++ code, even if wasn't idiomatic C++. Idiomatic-C++, in my opinion, is where you end up mixing a couple of very distinct paradigms: Object-Orientation and Template Metaprogramming. While you can argue that C++ is capable of multiple paradigms, combining the reusability of both object-orientation and functional programming constructs that template metaprogramming provides leads to a very elegant and powerful approach.

The code was loosely organized into classes, the object-oriented construct in C++, however it resembled a highly procedural codebase and would fall under the category of "Legacy Code", as defined by Michael Feathers in his "Working Effectively with Legacy Code" book - meaning it lacked suitable unit test coverage.

The algorithm we were reviewing seemed fairly straightforward:

  • a string was allocated inside of Java, using the Java Heap, the contents of which were passed into a native function running within natively compiled C++
  • memory was allocated in the native heap using "malloc" (real C++ programmers use "new")
  • once in the function, the string contents were copied and a callback was issued to the calling process returning the memory address of the "malloc" call, presumably to cache the address - I was never able to figure out what the reason was for this: allocating memory on the native-OS heap and then informing Java, presumably to cache the address, but why?
  • which was used and later freed (yes, using "free" not "delete") from within the same function executing the callback
  • the processor architecture was either 32, or 64, bit depending on the chosen output target

The observations I made:

  • it seemed like a design smell having both the allocation of, and de-allocation of, memory crossing physical code boundaries even though it may not have crossed process boundaries.
  • the code contained low cohesion, meaning that while a class member function would perform a single capability of the containing class, the containing class seemed to have a number of responsibilities.
  • I didn't say it, but I honestly felt, from a purists point of view, that dropping the C-idioms (malloc/free) in favor of the C++-idioms (new/delete) seemed cleaner, though semantically would not get us any closer to resolving the issue

As we continued to explore the code and discuss the assumptions and the rationale that informed both design and implementation choices - allocating and freeing on the Heap rather than using Stack-based automatic clean up when the variable goes out of scope, sharing memory addresses between native and non-native code - an insiduous "WTF?" moment happened when, almost simultaneously, the presenter mentioned that the code is compiled for both 32 and 64-bit Wintel processor architectures and the following line showed up on the screen:

	MyObject* ptr = ... // derived (either thorugh allocation or caching the address) from an external callback
	...
	someFunction((int*) ptr);
	...
					

While not the exact syntactic representation of the code, the issue seemed immediately obvious: "What is the size of an Object pointer in 64-bits compiled code? 8 bytes. How about the size of an integer pointer in 64-bit? 4 bytes. It seemed, upon first inspection, that a pointer, on a 64-platform, was being statically "down cast" to a 32-bit address, discarding significant bits of the original 64-bit address.

While not an exact replica of the problem, this reminds me of the situation when you pass a derived object, in C++, by value to a method that accepts an instance of it's base class. It's an issue known as "slicing" - you lose all of the value of the v-table and polymorphic dispatching. Essentially, they were truncating a native object pointer from 8 bytes to 4 bytes and discarding, likely, a fairly signficant portion of the actual pointer's address.

Testing My Hypothesis

In order to test my "blurted assertion", I need a simple program to prove that the different processor architectures referred to different sizes of both a "long" (JNI pointer length) and an "int". The following code snippet, run on 32 and 64-bit Linux and 32-bit Windows, seemed to do the trick:

	#include 
	#include 
	#include 
	
	class MyClass {
	public:
	    void sayHello() {
	        std::cout << "Hello" << std::endl;
	    }
	};
	
	int main(int argc, char* argv[]) {
	
	    MyClass* myp = new MyClass();
	
	    std::cout << "sizeof(int)  = " << sizeof(int) << std::endl;
	    std::cout << "sizeof(long) = " << sizeof(long) << std::endl;
	    std::cout << "sizeof(char) = " << sizeof(char) << std::endl;
	    std::cout << "sizeof(char*) = " << sizeof(char*) << std::endl;
	    std::cout << "sizeof(MyClass*) = " << sizeof(myp) << std::endl;
	
	    delete myp;
	    return EXIT_SUCCESS;
	}
					

Running the programming produced the following output:

	Results on 64-bit Linux:
	
	----------------------------------
	sizeof(int)  = 4
	sizeof(long) = 8
	sizeof(char) = 1
	sizeof(char*) = 8
	sizeof(MyClass*) = 8
	----------------------------------
	
	Results on 32-bit Linux:
	----------------------------------
	
	sizeof(int)  = 4
	sizeof(long) = 4
	sizeof(char) = 1
	
	sizeof(char*) = 4
	sizeof(MyClass*) = 4
	----------------------------------
	 
	Results on Windows in
	32-bit:
	----------------------------------
	sizeof(int)  = 4
	sizeof(long) = 4
	sizeof(char) = 1
	sizeof(char*) = 4
	sizeof(MyClass*) = 4
	----------------------------------
					

So, based on these observations, what recommendations did I have?

  • Given the limited size of the string allocation, something around 32 characters, why not avoid Heap memory allocation and simply pass a copy of the value into the function, avoid the callback, and create a locally-scoped copy to use within the lifetime of a stack-based variable?
  • You could also use a smart pointer, either auto_ptr (less preferred) or even Boost's smart_pointer (more preferred). I recommend the Boost approach not only because it is soon to be part of the newly ratified C++X0 standard but because of the ownership issues that plague the C++ Standards original auto_ptr.
  • The other recommendation was to use non-platform specific standard size definitions, something akin to the size_t type defintion that is defined as an unsigned long in many 64-bit compilers
  • Another implementation suggestion I had was to not only consider the design approach of allowing all memory-management to occur within the context of the native code, but more specifically, implemented using the RAII pattern (Resource Acquisition Is Initialization).
  • Read Michael Feathers' book - seriously, this is a gem for honing your test-driven development skills and one of the two books I recommend regularly.
  • Employ a test-driven development approach using cppunit as the approach to creating a test harness and writing unit tests.

While any of these recommendations, in isolation, would not solve the issue entirely, applying all together certainly would and it would also move the design up the quality continuum alleviating the problem and freeing the application of inconsistent crashes on a 64-bit architecture.

In the end, the resource management and pointer truncating turned out to be the central issue.

The other lesson here is to use your judgment when checking work e-mail at night - you never know what you might find.

All in all, a very fun day to be a geek and face the next "fire fight". Now, I just need to make sure not to check my e-mail when my body and mind say, "Go to bed!"...