Fizz-Buzz To Assess Candidates? You're Better Off Having A Drink

Assessing software development candidates is both an art and a science. When it's done right, it's multi-disciplinary, with techniques taken from the fields of psychotherapy, counselling, and psychology, not to mention from the technical subject that is actually being assessed.

When it's done wrongly, quick internet lookups of "how to interview candidates" can lead to methods that are questionable. Nuance can be lost, and dogma can prevail.

Assessing properly is an inherently tricky activity, so it's only natural to look for shortcuts. The reasoning goes something like this: if the candidate can do X (which is easy to assess), it follows that they can do Y (the actual job). One such shortcut is a common software coding challenge called Fizz-Buzz (more on what this is later).

During the assessment process, if you're not careful, quick shortcuts can give rise to a temptation that plagues all interviewers and assessors: kidding ourselves that we're testing a particular thing when we're in fact testing something different, or nothing at all. In other words, over-concluding!

Over-concluding: Do you really think you're a genius if you can find the N? Or that a programmer is good if they can ace a single well-known coding challenge?

The focus of any assessment strategy should only be to:

Sort candidates based on their real-world ability to do the job!

Getting to the bare truth means not being tempted by short-cuts that make false promises. However tempting, it's important to avoid making up convenient but inaccurate stories regarding candidates' abilities.

I'll contend in this article that using a coding task called a "code kata" (with the Fizz-Buzz code kata as the example) is one of the worst offenders here. But first, a bit of background to what this is all about...

To understand where Fizz-Buzz came from and why it's used today, we'll need to take a step backwards in time and a temporary sidestep into a different subject.

Let's step into a karate dojo in the last few years of the 20th century and let me explain how what happened there is relevant to modern day software development.

Software, Karate and Drinking Games

Real Katas: It's a karate thing

To understand what Fizz-Buzz is all about, let's first step into a 1990s karate dojo and join a father, who, having dropped his son off at the local karate club, decided to stay and watch.

In every lesson, without fail, certain pre-arranged movements were practised and refined.

Punching, kicking, and blocking all done in the same sequences against an imaginary opponent (as a dance might be).

These movements are called katas.

Katas aren't just made up on the spot; each has its own name, and they've been passed down through the mists of time.

Katas are practised repeatedly, becoming ingrained into the practitioner's mind.

The idea is, by practising certain moves until they're automatic, when the real fight begins and the adrenalin starts pumping you'll be good to kick ass - automatically!

Code Katas: It's a programming thing

What has this got to do with software? Well, the father in this story was Dave Thomas, the co-author of the seminal 1999 programming book 'The Pragmatic Programmer' (good book).

Thomas was one of the founders of Agile, and definitely a bigwig in the field of software development.

If katas work for karate, he figured, why not for programming too? He reasoned that for common programming problems or concepts, practising the solutions repeatedly, using set, named exercises, would really get them ingrained into the noggin.

This would make programming easier and more efficient. In short, it would make you a better programmer. He was probably right.

Dave Thomas: Agile boffin with unquestionable code kata skills. Actual ass-kicking skills questionable, though.

Since the turn of the century, named programming katas have been emerging. A bowling alley scoring system is a well-known kata; so is the Roman numerals kata, which challenges the programmer to convert commonly used numbers into Roman numerals.
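For a flavour of what such a kata asks for, here is a minimal Python sketch of the Roman numerals one. The function name `to_roman`, the 1 to 3999 range and the table-driven approach are assumptions made for illustration; a kata leaves the design up to whoever is practising it.

```python
# Hypothetical sketch of the Roman numerals kata (names are illustrative).
# Value/symbol pairs, largest first, including the subtractive forms (e.g. 4 = IV).
NUMERALS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int) -> str:
    """Convert an integer from 1 to 3999 into a Roman numeral string."""
    if not 1 <= number <= 3999:
        raise ValueError("this sketch only handles 1 to 3999")
    parts = []
    for value, symbol in NUMERALS:
        # How many times does this value fit into what is left?
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


if __name__ == "__main__":
    for year in (4, 1999, 2024):
        print(year, to_roman(year))
```

The point of the kata is less the finished function than the repetition: working the same small problem until the moves become second nature.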

A curated list of code katas

Check out the list above. There are dozens of well-known code katas, way more than exist in your average sprog's karate style.

Arguably, the most well-known kata is "Fizz-Buzz", which challenges the programmer to create the Fizz-Buzz drinking game in code (using a special method).

Fizz-Buzz For Real: It's a Drinking Game

Fizz-Buzz: It's a studenty, slightly geeky drinking game. Personally, I'd rather have a conversation than play Fizz-Buzz, and I'd rather play actual Fizz-Buzz than run a computer simulation of it, but each to their own!

The real Fizz-Buzz drinking game is all to do with how well you can do simple division when getting slowly more drunk. I can think of better drinking games. If you'd like the details of how to play it in the bar, check out this link.

The Fizz-Buzz code kata is a computer simulation of this game. It's one of a group of code katas used to practise a particular way of programming that's gained popularity in the last decade, called TDD (test-driven development).
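For anyone who hasn't met it, the simulation is tiny. The sketch below assumes the rules as they are most commonly stated (the article itself leaves them to the link): count upwards from 1, say "Fizz" for multiples of 3, "Buzz" for multiples of 5 and "FizzBuzz" for multiples of both. The names `fizz_buzz` and `play` are illustrative only.

```python
# Minimal sketch of the Fizz-Buzz kata, assuming the commonly stated rules.
def fizz_buzz(n: int) -> str:
    """Return what a player should say when the count reaches n."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def play(rounds: int) -> list[str]:
    """Simulate a game that counts from 1 up to and including `rounds`."""
    return [fizz_buzz(i) for i in range(1, rounds + 1)]


if __name__ == "__main__":
    print(" ".join(play(15)))
```

That really is all there is to it, which is part of why it's so easy to rehearse.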

Fizz-Buzz assesses Test Driven Development (TDD) right?

For those who don't know, TDD is an approach to building software; it stands for "Test Driven Development". It was created (or rediscovered) by a guy called Kent Beck at around the turn of the century. Like Dave Thomas, he was one of the founders of Agile and, like Thomas, is a very well respected dude.

Kent Beck of 'doing your tests first' fame! Here he's sporting a fetching hat.

TDD is an approach to building software that (among other things) requires the tests for your code to be written before the code itself.

For a developer who is new to this way of doing things, TDD requires a shift in thinking. It can seem 'bass ackwards' at first and needs practice before the penny drops. Fizz-Buzz is a code kata that uses this test-first (TDD) approach.
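To make "tests first" concrete, here is a hedged sketch of how the kata might look in Python with the standard `unittest` module. The function name `say` and the particular test cases are assumptions for illustration, and the Fizz-Buzz rules are the commonly stated ones (3 gives "Fizz", 5 gives "Buzz", both give "FizzBuzz"). In a real TDD session each test would be written, and seen to fail, before the line of code that makes it pass.

```python
import unittest


# Step 1: the tests come first. They describe the behaviour we want
# from a function that doesn't exist yet.
class FizzBuzzTest(unittest.TestCase):
    def test_ordinary_numbers_are_said_as_is(self):
        self.assertEqual(say(1), "1")
        self.assertEqual(say(2), "2")

    def test_multiples_of_three_are_fizz(self):
        self.assertEqual(say(3), "Fizz")
        self.assertEqual(say(6), "Fizz")

    def test_multiples_of_five_are_buzz(self):
        self.assertEqual(say(5), "Buzz")
        self.assertEqual(say(10), "Buzz")

    def test_multiples_of_both_are_fizzbuzz(self):
        self.assertEqual(say(15), "FizzBuzz")
        self.assertEqual(say(30), "FizzBuzz")


# Step 2: only now is the code written, and only as much of it
# as the tests above demand.
def say(n: int) -> str:
    word = ""
    if n % 3 == 0:
        word += "Fizz"
    if n % 5 == 0:
        word += "Buzz"
    return word or str(n)
```

Saved as, say, `fizz_buzz_kata.py` (a made-up filename), this can be run with `python -m unittest fizz_buzz_kata.py`.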

Developing Fizz-Buzz, using a TDD style over and over, is probably a good idea if you're learning, but as a tool to test how good a candidate is, it falls short. Let's see why.

What's the problem with using a Kata (say, Fizz-Buzz) for an assessment?

Misses the mark (their strength is their weakness)

The strength of code katas as a practice tool is that the more you practice them, the better you get. That's the whole point, and that's their strength. As an assessment tool though, it can also be their weakness.

Fizz-Buzz tests TDD in the very narrow context of the Fizz-Buzz kata itself. It's very easily practised, so it's easy for a candidate to rehearse the kata with only scant understanding of the underlying principles.

Generalising from it is dangerous. It doesn't test the programmer's ability to apply TDD principles to real world scenarios that can't be rehearsed. Being practised in a code kata is worlds apart from building any specific real world application.

What if a candidate who uses TDD in real world applications regularly isn't familiar with Fizz-Buzz (or whichever kata is set as the assessment)? Alternatively, what if a candidate has rehearsed Fizz-Buzz, but isn't familiar with using TDD in the wild... You see the problem?

Inadvertently tests the wrong things

There are many factors at play during a technical assessment that aren't supposed to be the thing under scrutiny. Just to pick a few:

Interview preparation.
Memory.
Ability to talk while coding (multitasking).
Familiarity with remote interviews.
How recently a project was started from scratch.
Familiarity with programming language constructs specific to the kata.
Nerves from being assessed.
How well someone responds to being given an unexpected task (if it is unexpected).
Familiarity with the problem to be solved.

The only solution for the candidate, regardless of their experience, is to practise the katas that are common in interviews. For the interviewer, the problem is that interview practice should not be the criterion under test.

It Introduces Randomness Into Recruitment

We're trying to answer a simple question: is the candidate right for the job? When we introduce any factors unrelated to this, like those listed above, we detract from the accuracy and purity of our assessment. We wouldn't want to judge whether someone is good for a job on the basis of whether we like their shoes or their hair-do, so why introduce factors like these?

Is using Fizz-Buzz as bad as throwing dice?

Fizz-Buzz will skew our ability to get the right person for the job almost as much.

Too Easy to Practice & Game

Short-term preparation of targeted factors that are unrelated, or only slightly related, to the real world job has a name: gaming the system. It exploits errors and over-conclusions made in the assessment strategy. More about this in future articles.

Prepping for an assessment does work. But is it interview prep that you want to test, or long-term effectiveness?

Because Fizz-Buzz is commonly set at interview and the solutions to it are easily found on the internet, what you're actually testing is the candidate's recent familiarity with Fizz-Buzz. The solution is very easily practised by under-skilled candidates using internet guides; hence this test is ripe for gaming.

A Learning tool is not a Litmus Test

Knowing Fizz-Buzz is like knowing the Green Cross Code (the set of rules taught to children in the UK to help them cross the road). Knowing it doesn't mean you can cross the road safely, and not knowing it doesn't mean you'll get run over.

The Green Cross Code is a useful teaching tool for kids but don't conflate this with an effective assessment tool. It's too specific for that.

The Green Cross Code Man. He helped kids cross the road in the 70s and 80s. Left-field fact, he also played Darth Vader.

I'd advise every candidate to know their way around Fizz-Buzz because if it's used to assess them, practice will help them immensely. But for that very same reason, I'd advise every interviewer never to set it as an assessment task.

Full Disclosure: I don't know the Green Cross Code (I expect you don't either)... But I'm pretty sure we can all cross the road. Point made!

What to do Instead

If good interviewing can be seamlessly melded with technical assessments, and they can be closely related to the job on the ground - then you've hit the gold standard.

It takes skill to contrive a good technical test that's genuinely related to the programming job at hand. It takes time and it takes effort. Here are a few high-level, general tips for setting the technical test portion of an assessment.

Contrive a work-like test

The task should use the same frameworks and technology as the client's working environment, and should be as close as possible to an actual work task.

Contriving a technical test that's similar to real life work can be time-consuming, but it has great advantages. The bonus of this method is that the candidate can even be informed of the task without causing a practice bias - unlike with a kata.

Be kind. You're not supposed to be testing how someone responds to being surprised. You shouldn't warn about a code kata test because then it can be unfairly practised, but with a unique real world test, you can.

Unlike code katas, the intricate details of the task are unique to your system. Realistically, these details are too complex to be communicated by an agent, and they're impossible to look up online. In this way you mitigate any practice bias and pre-warning problems.

Yup, it takes a bit of effort to put together, but fear not, if security and commercial restrictions allow, there's an awesome shortcut that works a treat. Read on.

The Best Shortcut

If mimicking work tasks using work technology is the aim, how about, instead of mimicking, we use the real deal? Consider performing a series of real world bug fixes as a pair programming exercise alongside the candidate.

Of course, you'll need to mitigate your risks as much as possible, ensuring that it's done in a sandboxed, cloned environment and making sure that the tasks don't touch commercially sensitive areas of code.

Doing this requires a balance between absolute security precautions and giving candidates access to the system. In the end it may not be possible, but it's worth having a sensible rather than a blanket policy here.

If the assessor is trained to focus on what is being assessed, can tell the difference between that and superfluous or less relevant issues (like machine set-up), and understands that they should help the candidate with the latter, then you'll be streets ahead of most.

If the assessor is experienced and is careful not to over-conclude from any one single bug fix, then this assessment will be very difficult to game.

In Conclusion / TL;DR

When used for assessments, katas like Fizz-Buzz are predictable, easy to practise and gameable. They're too easily warned about in advance and, in reality, test a different thing to their supposed purpose, meaning you risk rejecting a good candidate or taking on a bad one.

It's important to remember that a technical test is just one part of an effective assessment process, but if you're doing one, it's important that it's not a red herring. Interviews too, when done correctly, can glean very high quality information from the candidate, and they're just as important to get right (but more on that in future articles).

In the field of software development, a coding task called a kata is often performed to practise development concepts.
These "katas" were inspired by martial arts and are meant to be performed repeatedly for practice.
Wrongly, they are often used to assess software development candidates.
They actually assess things other than a developer's ability to do the job, including whether or not they have practised the kata.
There is a better way, much closer to the job being assessed for: contriving a test that resembles the real work environment.

A lot of this is about knowing what you're doing, not just with the technology, but with the assessment process itself.

With a little effort, many companies that need devs can massively reduce the risk of a failed hire. And to take a karate analogy, which company doesn't want their best chance of kicking ass into the long term?

Let me know if you disagree because after all, dogma is our enemy, and as always... I'm not over-concluding.