Sunday, August 25, 2013

Anatomy of a Singleton

So why would I want to cover such a fundamental concept as how to create a singleton?  Well, there are two reasons.  First, it is actually surprising how few developers I come across who know how to make an effective singleton object - so perhaps the concept, though fundamental, is not nearly so basic.  Second, as with many patterns in programming, there are many ways to accomplish the same task, and that can be fun to examine.  Let's step through the ways we can implement a singleton in Objective-C, some of which are applicable in concept to other languages too.

Lots of ways to skin a cat...er...singleton


First, let's just explicitly establish what we mean when we say a singleton.  A singleton is an object that persists globally across the entire application and is generally (though not always) accessed via a static method call.  [NSFileManager defaultManager] is an example of a singleton instance of NSFileManager. The benefit is that with a singleton you don't need to reinitialize it whenever you need it, and it persists its state across the entire application.
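
As a quick illustration of the concept (just a snippet, nothing specific to this article's classes), every call to the accessor hands back the very same object:

// Every call returns the same shared NSFileManager instance.
NSFileManager* fm1 = [NSFileManager defaultManager];
NSFileManager* fm2 = [NSFileManager defaultManager];
NSLog(@"same object? %@", (fm1 == fm2) ? @"YES" : @"NO"); // prints YES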

Let's take a look at the naive approach to implementing a singleton and then we'll expand from there.

// Code 0.0 - Not Thread Safe
+ (MyClass*) sharedInstance
{
    static MyClass* s_singleton = nil;
    if (!s_singleton)
        s_singleton = [[MyClass alloc] init];
    return s_singleton;
}

Now the naive approach effectively implements the general concept we are trying to achieve. At the time of the first access to sharedInstance, the instance is allocated and returned, and all subsequent calls simply return that same instance. What the naive approach neglects, however, is thread safety. Since the idea behind a singleton is to have an instance that is global, it can be accessed from any thread. One common approach to solving this is to utilize a class' initialize static method. Apple's documentation states:

The runtime sends initialize to each class in a program exactly one time just before the class, or any class that inherits from it, is sent its first message from within the program. (Thus the method may never be invoked if the class is not used.) The runtime sends the initialize message to classes in a thread-safe manner. Superclasses receive this message before their subclasses.

Given the thread safety of the initialize method, and that it is the first method executed for any class, we can use it for our singleton.

// Code 1.0 - Valid, but restrictive
static MyClass* s_singleton = nil;

+ (void) initialize
{
    if (self == [MyClass class]) // guard: a subclass that doesn't implement +initialize would trigger this again
    {
        s_singleton = [[MyClass alloc] init];
    }
}

+ (MyClass*) sharedInstance
{
    return s_singleton;
}

Now this is a really effective and popular mechanism for creating a singleton in Objective-C.  However, it has two deficiencies: 1) it is tied directly to the class' implementation and 2) it is an Objective-C specific pattern that doesn't translate to other languages.   By being so tied to the implementation of the class, we lose the ability to have multiple different singletons of the same class type.  Take, for example, NSDateFormatter. Say we want multiple NSDateFormatter singletons that support different formats, like ISO 8601 and a human readable version. We'd want to create those singletons via a category, which makes the initialize method unavailable to us - see the sketch below.
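
To make that concrete, here is a hypothetical sketch of such a category (the names sharedISO8601Formatter and sharedHumanReadableFormatter are made up for illustration; the accessors use the dispatch_once approach covered at the end of this article, since +initialize is not an option inside a category):

@interface NSDateFormatter (SharedFormatters)
+ (NSDateFormatter*) sharedISO8601Formatter;
+ (NSDateFormatter*) sharedHumanReadableFormatter;
@end

@implementation NSDateFormatter (SharedFormatters)

+ (NSDateFormatter*) sharedISO8601Formatter
{
    static NSDateFormatter* s_formatter = nil;
    static dispatch_once_t s_onceToken;
    dispatch_once(&s_onceToken, ^{
        s_formatter = [[NSDateFormatter alloc] init];
        s_formatter.dateFormat = @"yyyy-MM-dd'T'HH:mm:ssZ"; // ISO 8601 style
    });
    return s_formatter;
}

+ (NSDateFormatter*) sharedHumanReadableFormatter
{
    static NSDateFormatter* s_formatter = nil;
    static dispatch_once_t s_onceToken;
    dispatch_once(&s_onceToken, ^{
        s_formatter = [[NSDateFormatter alloc] init];
        s_formatter.dateStyle = NSDateFormatterMediumStyle;
        s_formatter.timeStyle = NSDateFormatterShortStyle;
    });
    return s_formatter;
}

@end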

So let's get back to our naive implementation and try to make it thread safe.  The easiest thing to do is just synchronize the entire method.  I'll use the @synchronized keyword for easy thread synchronization; other languages/platforms have their own locking mechanisms, so when applying what we learn here elsewhere you can simply substitute the appropriate synchronization primitive.

// Code 2.0 - Valid, but inefficient
+ (MyClass*) sharedInstance
{
    static MyClass* s_singleton = nil;
    @synchronized([MyClass class])
    {
        if (!s_singleton)
            s_singleton = [[MyClass alloc] init];
    }
    return s_singleton;
}

Now this effectively works, however it is inefficient. Acquiring the lock has real overhead (and under contention it can even force a context switch), and paying that cost on every single access to our instance is a waste of precious CPU cycles. So how can we improve access to the shared instance so that we only synchronize when necessary? Once the singleton instance has been created, we always just want to return that instance, so we really only need to synchronize while the instance COULD be nil. I'll change our method to use the double-nil-check method of lazy loading an instance. Now a word of warning: the double-nil-check method has very specific considerations that need to be made for it to work. I will get to these later, so be sure to get through the entire article and don't just start using the double-nil-check code you see without understanding it.

// Code 3.0 - Valid but care should be taken
+ (MyClass*) sharedInstance
{
    static MyClass* s_singleton = nil;
    if (!s_singleton)
    {
        @synchronized([MyClass class])
        {
            if (!s_singleton)
            {
                s_singleton = [[MyClass alloc] init];
            }
        }
    }
    return s_singleton;
}

You can see the double-nil-check method keeps true to its name. First, if the instance definitely exists we just return it, saving us from having to synchronize. This takes care of nearly all calls to sharedInstance. If the first nil check passes, we know that we MIGHT have to initialize the instance. The MIGHT is very important, since multiple threads could pass the first check at the same (relative) time.  That's where the synchronized second nil check comes into play.  If multiple threads pass the first check, they are serialized and only the first thread will initialize our instance while the others take a back seat and wait for it to finish.  This is an effective and highly optimal singleton implementation, however this code falls down if the singleton instance needs any additional prep work done before it is ready for use.  Let's look at the naive implementation of that and then pick apart its deficiencies.

// Code 3.1 - Race condition bug
+ (MyClass*) sharedInstance                                //  0
{                                                          //  1
    static MyClass* s_singleton = nil;                     //  2
    if (!s_singleton)                                      //  3
    {                                                      //  4
        @synchronized([MyClass class])                     //  5
        {                                                  //  6
            if (!s_singleton)                              //  7
            {                                              //  8
                s_singleton = [[MyClass alloc] init];      //  9
                [s_singleton performAdditionalPreparation]; // 10
            }                                              // 11
        }                                                  // 12
    }                                                      // 13
    return s_singleton;                                    // 14
}                                                          // 15

The real focus of how multithreading affects this singleton implementation can be found on lines 9 and 10.  First, the object is allocated, initialized and then assigned to our instance reference variable.  Second, the instance is updated so that it is finished being prepared to act as the singleton we need.  With concurrent programming, however, it is possible that while one thread is inside the synchronized block and has assigned s_singleton to the object, a second thread passes the unsynchronized nil check on line 3, sees a non-nil s_singleton, and uses that instance BEFORE the first thread has finished executing performAdditionalPreparation.  This race condition is very low in surface area, but imagine trying to debug a situation where this exact issue occurs.  "Impossible" comes to mind.  So what we need to ensure is that the s_singleton variable is not assigned until AFTER the instance object has finished being prepared.  Let's take a crack at this revision.

// Code 3.2 - Race condition bug
+ (MyClass*) sharedInstance                                //  0
{                                                          //  1
    static MyClass* s_singleton = nil;                     //  2
    if (!s_singleton)                                      //  3
    {                                                      //  4
        @synchronized([MyClass class])                     //  5
        {                                                  //  6
            if (!s_singleton)                              //  7
            {                                              //  8
                MyClass* tmp = [[MyClass alloc] init];     //  9
                [tmp performAdditionalPreparation];        // 10
                s_singleton = tmp;                         // 11
            }                                              // 12
        }                                                  // 13
    }                                                      // 14
    return s_singleton;                                    // 15
}                                                          // 16

So now looking at lines 9 through 11, we can see we are preparing a temporary reference to the object before assigning it to our s_singleton variable.  Problem solved!  Right? Well, this is where we cross the boundary from what the code says it will do to what the compiler tells the CPU to execute.  What's really happening under the hood when these lines end up executed on the CPU?  Ultimately, when the compiler goes through the implementation of Code 3.2 it can notice that the temporary variable is unnecessary, optimize it away, and directly assign the freshly allocated object to the s_singleton variable.  After that optimization we end up with effectively the same compiled code as Code 3.1.  That means we need to look at a lower level for how to separate the preparation of our object from its assignment to our static variable.  The separation we are looking for is a memory barrier.  To oversimplify, a memory barrier instructs the compiler and CPU that any loads and stores that occur before the barrier MUST complete BEFORE any loads or stores that come after it.  Put simply, it guarantees the temporary object is fully prepared before the assignment to s_singleton becomes visible - thus our code changes to this:

// Code 3.3 - Valid and optimal
+ (MyClass*) sharedInstance                                //  0
{                                                          //  1
    static MyClass* s_singleton = nil;                     //  2
    if (!s_singleton)                                      //  3
    {                                                      //  4
        @synchronized([MyClass class])                     //  5
        {                                                  //  6
            if (!s_singleton)                              //  7
            {                                              //  8
                MyClass* tmp = [[MyClass alloc] init];     //  9
                [tmp performAdditionalPreparation];        // 10
                OSMemoryBarrier();                         // 11
                s_singleton = tmp;                         // 12
            }                                              // 13
        }                                                  // 14
    }                                                      // 15
    return s_singleton;                                    // 16
}                                                          // 17

Now with the insertion of OSMemoryBarrier() (declared in <libkern/OSAtomic.h>), we are set!  Yay!  But we're not done yet: I'm going to take a short moment to look at how this solution translates to C++ before getting to our final implementation option for a singleton, which uses Apple's GCD.

In Objective-C, we see that the order of operations matters for the assignment of our static variable reference.  If we don't have to do anything beyond the alloc and init of our object, we don't actually have to worry about the memory barrier since the order of operations will compile as allocate, initialize, assign.  However, C++ has different behavior for its order of operations in the same scenario, making a memory barrier ALWAYS mandatory.

// Code 4.0 - Race condition bug
static MyCppClass* SharedInstance(void)         //  0
{                                               //  1
    static MyCppClass* s_singleton = NULL;      //  2
    if (!s_singleton)                           //  3
    {                                           //  4
        CPP_SYNCHRONIZATION_START;              //  5
            if (!s_singleton)                   //  6
            {                                   //  7
                s_singleton = new MyCppClass(); //  8
            }                                   //  9
        CPP_SYNCHRONIZATION_END;                // 10
    }                                           // 11
    return s_singleton;                         // 12
}                                               // 13

Note how Code 4.0 looks the same as Code 3.0, but there is an important difference.  In C++, line 8 has a different order of operations than Objective-C would have.  As previously mentioned, the order in Objective-C is 1) allocate, 2) initialize and 3) assign.  In C++, however, the order can be 1) allocate, 2) assign and 3) execute the constructor (aka initialize).   This means C++ can have the same race condition as in Code 3.1, where we had additional prep work, so we always need the memory barrier.  For C++ we would have to do something like this (note the changes are on lines 8 through 10):

// Code 4.1 - Valid and optimal
static MyCppClass* SharedInstance(void)             //  0
{                                                   //  1
    static MyCppClass* s_singleton = NULL;          //  2
    if (!s_singleton)                               //  3
    {                                               //  4
        CPP_SYNCHRONIZATION_START;                  //  5
            if (!s_singleton)                       //  6
            {                                       //  7
                MyCppClass* tmp = new MyCppClass(); //  8
                PLATFORM_MEMORY_BARRIER;            //  9
                s_singleton = tmp;                  // 10
            }                                       // 11
        CPP_SYNCHRONIZATION_END;                    // 12
    }                                               // 13
    return s_singleton;                             // 14
}                                                   // 15

Having completed our tour of the double-nil-check (aka double-NULL-check) method for singletons, we can visit our last, and easiest, implementation of a singleton using GCD's dispatch_once function.  Since dispatch_once gates any execution that comes after it by ensuring the work in its block finishes first, we have a much cleaner and easier to implement singleton solution, which has become my default implementation of singletons on Apple's Mac OS X and iOS platforms.

// Code 5.0 - Valid and optimal
+ (MyClass*) sharedInstance
{
    static MyClass* s_singleton;
    static dispatch_once_t onceToken; // must be static (or global) for dispatch_once to behave correctly
    dispatch_once(&onceToken, ^{
        s_singleton = [[MyClass alloc] init];
        [s_singleton performAdditionalPreparation];
    });
    return s_singleton;
}

Alright!  There you have it!  Numerous ways to write a singleton, some of which are flawed (though often widely used nonetheless).  Now that I've explained the most important piece, the accessor to the singleton, we can examine the last piece: overriding methods to ensure a singleton stays a singleton.

Now in a lot of cases, the implementation of the accessor is all that is needed.  However, if you are writing the class itself that is a singleton and want to make it so that it MUST only ever be accessed as a singleton, you can take some steps to prevent any poor code from using your class improperly.  These overrides are taken directly from Apple's documentation (which uses sharedManager where we've been using sharedInstance) and apply to manually reference counted code.  One caveat: with the allocWithZone: override below in place, the shared accessor itself must allocate with [super allocWithZone:NULL] rather than [MyClass alloc], otherwise the two calls would recurse into each other.

+ (id)allocWithZone:(NSZone *)zone
{
    return [[self sharedManager] retain];
}
 
- (id)copyWithZone:(NSZone *)zone
{
    return self;
}
 
- (id)retain
{
    return self;
}
 
- (NSUInteger)retainCount
{
    return NSUIntegerMax;  //denotes an object that cannot be released
}
 
- (void)release
{
    //do nothing
}

- (id)autorelease
{
    return self;
}
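
With those overrides in place, even careless code cannot conjure up a second instance.  Here is a quick hypothetical sanity check (manual reference counting assumed, since retain/release/retainCount cannot be overridden under ARC):

// Hypothetical sanity check (manual reference counting assumed)
MyClass* shared = [MyClass sharedInstance];
MyClass* copied = [[shared copy] autorelease];  // copyWithZone: returns self
MyClass* alloced = [[MyClass alloc] init];      // allocWithZone: hands back the shared instance
                                                // (assumes -init is safe to run a second time)
NSAssert(shared == copied && shared == alloced, @"there can be only one");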

That's it for this entry!  I hope this empowers you to make efficient singleton objects in your next project.  Be sure to stay tuned for future blog entries at NSProgrammer!
