Friday, June 14, 2013

Logging: The Most Important Part of Any Application

Bold statement, right? But after over a decade of production development work, whether it's a brand new project, coming into a pre-existing project to clean up or just a prototype; the first and most important task I take on is having a robust logging mechanism. 2nd is ensure there is a robust crash handling mechanism in place - which I should touch on some other time - but first and foremost is logging.

Logs! Logs! My kingdom for some logs!


First, lets take a moment to really ask ourselves what we need from "logging".
  1. Logs must be fast and not interfere with application execution
  2. Logs should be filterable by their importance
  3. Logs must be easy to write
  4. Logs must be accurately timestamped
  5. Logs must be persisted on disk
  6. Logs must be retrievable
  7. Logs must not adversely affect users (like disk bloat)
  8. Logs should be visible as they occur (in the console)

Now this list may seem like overkill, after all it's just logging right? But I adamantly believe that not having any one of these items taken care of from the start will only lead to slowed productivity in the development lifecycle of any application. Now, nicely enough, iOS and Mac OS X have built in logging to the console with NSLog that should easily take care of all of our needs, right? First, I want to say I love NSLog, nothing is easier for just getting output right away in an application without any fuss. But sadly, NSLog just doesn't meet the requirements leaving it to us as developers to create our own solutions (which I will share my lightweight, go-to, re-useable logging solution at the end). Well if NSLog is out, let's tackle the requirements one at a time and establish what we need and why we need it.


1. Fast logging


Now why, of all the requirements, do I list logging as needing to be fast first? Well, to be blunt and obvious, because it's the most important! Often we don't think of logging as being the source of any form of slowdown, and that is exactly the mindset we should be able to afford ourselves as developers. What terrible world to live in where you have to concern yourself with the performance of you application every time you write a log statement!

So let's ensure, above all else, that when we have a logging solution, it has no side affects to our applications performance. Now to figure out how to make logging performant, we really need to establish what logging does and the expectations we have when using it.

Logging is incredibly simple, it's merely the writing of strings to an output so that it can be read later to help diagnose and understand the behavior of an application as it runs. Simple right! Well, let's decompose that a little. First, note that we've establish that logs are useful for fixing bugs and knowing how our apps behave. Great! They have a use! Now the way they are used is by being read later. Okay let's focus on that for a second - we want to read our logs later. You may say you need to read the logs as your debugging, not later! Yes, agreed - but later just means you are reading the message after you write the statement to the logs, not necessarily after a day. But that brings 2 important requirements: #5 and #8 in our list - both having the logs persist for extended periods of time (which is offered by persisting them to disk) and being visible with minimal delay in the debug console or in the terminal output. Now these are two important requirements we'll get into deeper detail soon, but what we've established is there are two things that our logs need to be able to provide at once: immediacy and persistence. This boils down to having our logs output in multiple I/O mechanisms - first is display (relatively fast I/O) and second is persistence (relatively slow disk I/O). Focusing on the fact that logging performs I/O to disk we now have something to focus on making performant. Logging on the main thread with blocking I/O of a log file could lead to atrocious performance, particularly if you are trying to capture the performance of a procedure and the time it takes has to take into account all the logging that is wasting precious main-thread cycles. So what's the punch line? We want to be sure that all of our logging is done OFF of the main thread. And that is the first real downside to NSLog. NSLog does a great job of both persisting the logs to the console log file and outputting to the stdout for reading in the debug console and/or the terminal, but by doing I/O on the thread it's called on (as well as synchronizing across all threads synchronously) there can be serious performance costs to using NSLog. In order to avoid these performance cost we're going to need to ensure that log messages are sent to a background thread for writing asynchronously so that the calling thread can immediately continue on its merry little way.


2. Log levels


Next on the list of requirements is the ability to differentiate the importance of individual log messages, either at write time by preventing the writing of logs that are outside the scope of the current build, or at the point at which we read the logs and need to find the important issues quickly. This is actually another lacking feature of NSLog, no native support for filtering logs based on the importance of a message. First, we'll just establish that we're going to have a discrete set of log levels that we can filter on. This means that we'll need discrete ways of calling our logger to identify the level at which a message ought to be logged. Once the logger is called with the desired level, it will have the authority to ignore attempts to write a log at a level that is too verbose for the build being run at the time. Okay, sounds like a plan, but what should these levels be? There are many different Log Level patterns out there - but I'll briefly describe the most popular one and then explain why I dislike it as a pattern and think that it needs to go the way of the dodo.

A very common pattern, that has been around for many decades is the use of Severity Levels for log levels. It basically breaks down into the following discrete set of levels or a subset there-in. I'll list a very detailed enum of levels (from highest level of importance for being logged, to least) but I'll take an extra step to bold those that seem to be the most commonly used:
  • Critical
  • Error
  • Warning
  • Information
  • Verbose
  • Debug

A log might look something like this:
TIMESTAMP-[INFO] : User moved from page A to B
TIMESTAMP-[DEBUG] : Transition took 0.005 seconds
TIMESTAMP-[INFO] : User moved from page B to C
TIMESTAMP-[DEBUG] : Transition took 0.002 seconds
TIMESTAMP-[VERBOSE] : User was prompted to log in
TIMESTAMP-[ERROR] : User logged in successfully
TIMESTAMP-[INFO] : User moved from page C to D
TIMESTAMP-[DEBUG] : Transition took 0.015 seconds
TIMESTAMP-[WARNING] : User is on a page that could crash
TIMESTAMP-[ERROR] : Error Occurred: User tried to access page E, but page E was not found!
TIMESTAMP-[CRITICAL]: Crash! Could not recover - uncaught exception! ... crash dump ...

Now as you look at these logs, they seem to be useful. They have information showing the user flow that led up to a crash. However, I want to point out a few issues I have with it. Let's look at the ERROR log entries first. Notice the first ERROR entry and how it logs that the user logged in successfully. It's not an error at all! This is the biggest issue I have with severity level logs, by distinguishing all important logs as ERROR or CRITICAL, even logs that aren't errors at all will end up looking like they are errors - but if this is the log level model we have, and we have an important log entry to enter, we're forced to log it as an ERROR even if it's not an error. Now if you look at the second ERROR message, you can see an actual error occurred. What else do you notice, though? The same thing I see from nearly every error that gets logged - text that says error! So now the message says it's an error twice! That may be nitpicky and only bother me, but I think it's a terrible redundancy. Let's move onto the CRITICAL message. Why in the world is there both a CRITICAL and ERROR log level? They are obviously both very important and need to be logged. In fact the entire concept of splitting ERROR and CRITICAL as well as DEBUG and VERBOSE is a complete waste of energy. They offer no really boundary for discernment, which is why most log levels now don't use CRITICAL or DEBUG. Now to the WARNING message. True this is a warning, but why is this elevated as more important than an informational log entry or less so than an error log entry? It's arbitrary in the first place where the message logs to by the developer, so we should leave it to their judgment whether it's worth being logged at a higher level or not, and thus the WARNING log level is completely unnecessary.

So you probably already see where I'm going with this, but a better pattern for logging goes like this:
  • Important
  • Informational
  • Verbose

These logs, which are priority based instead of severity based, look like this:
TIMESTAMP-[INFO] : User moved from page A to B
TIMESTAMP-[VERB] : Transition took 0.005 seconds
TIMESTAMP-[INFO] : User moved from page B to C
TIMESTAMP-[VERB] : Transition took 0.002 seconds
TIMESTAMP-[VERB] : User was prompted to log in
TIMESTAMP-[IMPT] : User logged in successfully
TIMESTAMP-[INFO] : User moved from page C to D
TIMESTAMP-[VERB] : Transition took 0.015 seconds
TIMESTAMP-[INFO] : User is on a page that could crash
TIMESTAMP-[IMPT] : Error Occurred: User tried to access page E, but page E was not found!
TIMESTAMP-[IMPT] : Crash! Could not recover - uncaught exception! ... crash dump ...

Much better, now the logs don't have so many different levels, the priority is there and the message itself can describe what it is telling us. But I'm never one to be satisfied and I want to take things a step further. Now, though I did log like this for quite some time, I often got annoyed looking at log lines and the way my eyes naturally parse the text, the "I" of IMPT and INFO messages line up making it that much harder to read and distinguish important and information logs. I did try IMPORTANT and INFO, so that the longer text of important logs would stick out more, but ultimately, I wasn't liking my logs not aligning and was still having to look harder between INFO and IMPORTANT logs to distinguish which was which. Finally, I've come to the ultimate log level enumeration - it's so simple it's like it wasn't even thought of by an engineer.
  • High
  • Medium
  • Low

These logs, which are also priority based instead of severity based, look like this:
TIMESTAMP-[MID]  : User moved from page A to B
TIMESTAMP-[LOW]  : Transition took 0.005 seconds
TIMESTAMP-[MID]  : User moved from page B to C
TIMESTAMP-[LOW]  : Transition took 0.002 seconds
TIMESTAMP-[LOW]  : User was prompted to log in
TIMESTAMP-[HIGH] : User logged in successfully
TIMESTAMP-[MID]  : User moved from page C to D
TIMESTAMP-[LOW]  : Transition took 0.015 seconds
TIMESTAMP-[MID]  : User is on a page that could crash
TIMESTAMP-[HIGH] : Error Occurred: User tried to access page E, but page E was not found!
TIMESTAMP-[HIGH] : Crash! Could not recover - uncaught exception! ... crash dump ...

I really like logging like this - visually parsing an H, M and L is MUCH easier and I even like the 1 extra character that HIGH level log messages get (vs IMPORTANT having 5 extra characters over INFO). Ultimately, it's the developer's (or perhaps team of devs') choice in what to use so if you want to log as SUGER, SPICE, and EVERYTHING_NICE - go for it; just remember to have distinct log levels. The way I approach it is that everything logged MID and HIGH will end up in our deployed to customers RELEASE builds, but all of our internal DEBUG, QA, DOGFOOD and BETA builds will have LOW level logging turned on. Now since we are using Objective-C and can take advantage of macros; I often take my LOW log level logs and distinguish them in 2 ways. The first is just the regular logging at a low level so that the logger can ignore the logging messages at runtime; however the second low level logging method I use is a macro that on DEBUG builds, logs as a LOW level message, but on non-DEBUG builds don't even get compiled. You can see the implementation details of this in the next section (3. Easy Logging). But what this provides is any debug log statements you put in just for development testing won't even get compiled into the app so there's no unnecessary calls to the logger which would filter them out anyway.

An extra step that can be taken, and I've used somewhat in the past, is the ability to configure the log level in the app.  This can be useful when debugging with a customer - have them increase their logging (by permitting the less important logs at the LOW log level to be logged).  I wouldn't call this mandatory for logging, but it can be useful.


3. Easy Logging


Nothing is better than being able to provide ourselves a tool for app development than one that is easy to use. NSLog definitely comes through in this department. Just provide a formatting NSString followed by any format arguments and you've got yourself a log message. Sweet! So since we've ruled out NSLog as our logging solution, let's try to at least replicate it's ease of use, while incorporating the benefits of log levels.

Jumping straight to the chase, I implement my logger as an object and while thinking of my logger as object is fantastic for development and code reuse, it lacks in ease of expression. Forcing a developer to log a statement like the following is a little too much to bear:
[[NSPLogger sharedLogger] writeASync:@"My Message" level:NSPLogLevel_Low];

No worries, though! That's what our best friends in pre-processing, macros, are for! In fact, using a macro is so helpful, it not only alleviates boiler plate code, it can also add the benefit of variable arguments like NSLog has, for free, without even needing to implement a var args method! Now that's what I call easy ;)
// base macro
#define LOG(lvl, fmt, ...) [[NSPLogger sharedLogger] writeASync:[NSString stringWithFormat:fmt, ##__VA_ARGS__] level:lvl]
// easy macros
#define LOG_HI(fmt, ...)   LOG(NSPLogLevel_High, fmt, ##__VA_ARGS__)
#define LOG_MD(fmt, ...)   LOG(NSPLogLevel_Mid,  fmt, ##__VA_ARGS__)
#define LOG_LO(fmt, ...)   LOG(NSPLogLevel_Low,  fmt, ##__VA_ARGS__)
#ifdef DEBUG
#define LOG_DB(fmt, ...)   LOG_LO(fmt, ##__VA_ARGS__)
#else
#define LOG_DB(fmt, ...)   ((void)0) /* no-op */
#endif


I think it's safe to say that logging each individual log message is just as easy as NSLog now.
LOG_HI(@"Important Message: %@", @1);
LOG_MD(@"Informative Message: %@", @2);
LOG_LO(@"Verbose Message: %@", @3);
LOG_DB(@"Debug Only Verbose Message: %@", @4);



4. Timing is everything



At this point we've established that we want all of our logs to occur on a background thread so as not to disrupt the performance of our main thread (or even our other worker threads perhaps). But just saying "do it on background thread" isn't enough of a solution. Logs are very useful in not only conveying information as readable strings as well the priority of the message, it also needs to convey the timing information of the message too! The use case of the app in development may dictate the granularity needed for the timestamp of log message, whether that's in seconds, milliseconds or even more granular. So we need the timestamp of the log message to be as accurate as possible. Another important aspect of log timing is also sequencing. We really must have the logs be in the correct order, or legibility of the logs if the order isn't accurate would be infuriating.

Now these are two separate issues, but they very much complement each other in determining the most effective way to log our logs while meeting both criteria. Just focusing on accurate timestamps may prompt us to immediately fire off a new thread for each log message string we want to log to disk, but doing so could affect sequencing since the order at which each thread performs its I/O is not guaranteed. Just focusing on sequencing may lead us to queue up all our logs on a specific background thread for logging, but if we timestamp on that background thread, the I/O of the logs could cause a backup and then the timestamps would lose their accuracy. Taking advantage of the fact that we only need to see the logs AFTER the log message is sent to be logged, we can simply get the timestamp of the log at the moment the logger is called to log the message, and then enqueue that log message AND timestamp for writing out onto a dedicated thread for logging. The simplicity of this should not cause us to underestimate the importance of what this procedure for logging provides. Sequencing and timestamp-sensitivity are two very easy bugs to introduce that can easily go unnoticed for long time...right up until you really need your logs to be accurate.


5 and 6. File logs in a sandboxed world



Let's tackle requirements 5 and 6 together since they are so tightly related. With our logging we need both the ability for the logs to persist to disk (#5) but we also need to be able to retrieve those logs at a future point in time (#6). NSLog does offer log persistence by logging to the Console logs, however, those files are only readable when the device is connected to XCode as developer device (this holds for iOS and Mac OS X app store apps, Mac OS X installable apps can access the logs). I don't know the last time a user was willing to ship you their device so you could retrieve logs, but it's never happened for me. So we need to persist the logs AND be able to retrieve them later. My personal preference (on iOS) is to create a logs directory in the application's sandboxed Documents directory. This way the logs persist across installs and are even backed up if necessary. As for retrieving the logs from users (whether they're QA, coworkers or customers), there are a lot of different choices, but no matter the choice, it needs to be easy for anyone to do. You can have a way for users to email you directly with the logs attached, you can have UI that offers to send the logs over the internet to your own servers or you can even send the logs automatically as an issue is encountered. Always be sure of 2 things when you send customer logs: 1) you never logged any customer sensitive information and 2) that you encrypt the logs in transit. On the topic of number 1, I can't stress this enough - don't log customer information including, but not limited to, their full name, address, phone number, credit card, or password.


7. Log Rolling, and I don't mean the caber toss



Now one thing that seems to be ignored or forgotten far too often in custom loggers, at least that I've seen, is the bloat of disk spaced used by having the logs just grow and grow and grow. Don't just have your logger open the same log file every time the app runs and append to the end of it forever, that will lead your most devoted users to run out of disk space just because of your logs. We need to have a way to continue to write new log messages while purging old log messages that are no longer of use to us. My preferred mechanism for this is rolling over logs. Doing 1 log file and then replacing it with a new log file after it reaches its limit can be a problem since you may come to a point of needing to see the logs right after you've rolled over making it so that you don't have the pertinent log messages when you try to retrieve them. Too many log files rolling over (like 100) is just silly and fragmented with having so many small files. Picking the right number of log files to roll over is pretty much a gut choice, as is selecting the restrictions on max size of a log file. Myself, I go with 10, it's nice and easy to deal with. The logic is fairly simple, you maintain your logs with easily sortable unique names (like a timestamp) and once you reach the cap of you log file, you roll over to a new log file. Once you reach you file count limit, you simply delete the oldest log file before continuing on. You can create your own logger to behave however you like and tune the limits you set for all your log files. Personally, I'm pretty happy with logging 10 log files with a maximum of 500 writes per log file (that is: calls to log a message, not bytes written). Choose whatever makes you happy :)


8. Logs in the eye of the keyboard holder



So we've gone through all the requirements we feel are necessary for creating a logging mechanism that is useful for writing logs that can be later retrieved and reviewed for finding bugs in our apps. We've also established that NSLog just isn't up to the task of meeting all these requirements so we'll have to put our own solution into our app. But now we're at the point where we need the real-time feedback in the app that our log statements are providing a file on disk. How do we get the log statements to also show up in the console as we debug with XCode? If only we had a function that wrote log statements to stdout and the device's console log as well. Wait, we do! NSLog!!! So though NSLog can be a performance hog and has no business being in a RELEASE build, and NSLog doesn't meet or requirements for logging to disk - it absolutely can work for us as we are debugging in XCode. So simply take the log message we composed to write out to disk for our logs and put it into an NSLog call. We'll be sure to protect it with an #ifdef DEBUG so that only the developers use it, but ultimately it fills that last piece we needed for a responsive logger (since the NSLog call will also only ever be made from the background logging thread).

Conclusion


So there you have it, a point by point breakdown of what we need in a logging mechanism, why we need it, and how we can get it. I hope, no matter what you use as your logger, that you agree that application logging is the most valuable debug tool you have at your disposal and is the first thing you should have implemented in any of your applications. Here's an overview of our conclusions:

  1. Logs must be fast and not interfere with application execution
    • Log via a background thread
  2. Logs should be filterable by their importance
    • Use High, Medium and Low log levels (or some variation)
  3. Logs must be easy to write
    • LOG_HI, LOG_MD, LOG_LO and LOG_DB macros
  4. Logs must be accurately timestamped
    • Timestamp at the time of the logging call
    • Enqueue log messages with their timestamps onto a single serial background queue for logging
  5. Logs must be persisted on disk
    • Save to the application's sandboxed Documents directory under a logs directory
  6. Logs must be retrievable
    • Email or transfer logs securely from the app
  7. Logs must not adversely affect users (like disk bloat)
    • Roll logs over to ensure a limit in disk usage
  8. Logs should be visible as they occur (in the console)
    • NSLog on the background thread in DEBUG builds


Visit my github repository to see the logger I use. It's free to use, copy, modify, distribute to your hearts content. Please give credit where credit is due though. While you're there, check out any of the other useful tools and code you find.

// using NSPLogger is easy
- (BOOL) application:(UIApplication*)application didFinishLaunchingWithOptions:(NSDictionary*)launchOptions
{
    // Prep logger
    NSPLogLevel level = NSPLogLevel_MidLevel;
#ifdef DEBUG
    level = NSPLogLevel_LowLevel;
#endif
    @try
    {
        NSString* dir = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) objectAtIndex:0];
        dir = [dir stringByAppendingPathComponent:@"logs"];
        NSPLogger* logger = [[NSPLogger alloc] initWithDirectory:dir logLevel:level];
        [NSPLogger setSharedLog:logger];
        LOG_HI(@"application:%@ didFinishLaunchingWithOptions:%@", application, launchOptions);
    }
    @catch(NSException* exception)
    {
        NSLog(@"Couldn't load NSPLogger: %@", exception);
    }
    
    // other prep work...
}

- (void) applicationWillTerminate:(UIApplication*)application
{
    // Terminate work...
    [NSPLogger setSharedLog:nil];
}

// if you get a callback before crashing - you should flush the logger
- (void) callbackForHandlingCrashBeforeCrash:(id)someObject
{
    [[NSPLogger sharedLog] sendSyncMessage:[NSString stringWithFormat:@"Crash occurred! %@", someObject] logLevel:NSPLogLevel_HighLevel]; // log the crash synchronously to ensure the message is queued up in the I/O buffer to be flushed to disk
    [[NSPLogger sharedLog] flush];
}

2 comments:

  1. For those who need robust logging in a language other than Objective-C, I recommend looking at the Apache Logging Project. The loggers it has are very good, though they filter based on severity levels vs priority levels (see requirement #2 for why I don't like using severity levels in logging).

    http://logging.apache.org/

    ReplyDelete
  2. I am liking this Idea for logging, and may start using something similar in a future project. Just to make it a little easier for other readers to find, the specific files for the logger appear to be

    https://github.com/NSProgrammer/NSProgrammer/blob/master/code/NOBLib/NOBLogger.h

    https://github.com/NSProgrammer/NSProgrammer/blob/master/code/NOBLib/NOBLogger.m

    Since you changed the prefix from NSP to NOB in November, it was harder to find this since your code snippet uses NSPLogger as its name.

    Excellent work.

    ReplyDelete