Monday, October 3, 2016

Beyond the UI: Pitfalls and Best Practices in JavaScript

When I was asked to present at DevCon about JavaScript, there were a lot of things that came to mind - new things I was exploring that I found exciting and long-held beliefs I am generally eager to act as evangelist for - many of those were dismissed as I talked to the conference organizers and they asked that I "go back to basics" but somehow cover both user-agent JavaScript and JavaScript in Node. With that suggestion in mind, I came to this - the pitfalls and best practices in JavaScript.

A little research has revealed that 0.2 percent of web traffic is from user agents that either do not have JavaScript available as a feature or have JavaScript disabled. That means a whopping 99.8 percent of user agent traffic has JavaScript in the front-end. If we add back-end and middle layers that use JavaScript through things like node.js, it's pretty easy to see that JavaScript is everywhere.

While 99.8 percent of front-end traffic is from user agents with JavaScript enabled, it is still very easy to functionally disable JavaScript, and when that happens, everything from simple interactions to complex Single Page Applications fail, and can leave your users - your customers - staring at a blank page. That's not acceptable. That means we must write good code. How do we do that?

Writing code is a skill - which means it can be taught and practiced - and practice is really the only way we get better at any skill, but that's where we often go wrong.

My daughter studies Irish dance, and I asked her one day how she could get better so she could advance to the next grade, and she responded that her teacher told her "practice makes perfect". Seeing what we parents like to call a "teachable moment", I asked her what would happen if she practiced her dances, but practiced the wrong steps - what if she put the steps of her reel in the place of her light jig, for instance? Our discussion of the possible outcomes ranged from disqualification - or as she likes to call it, "getting a big fat zero" - to possibly winning a medal if the judge(s) didn't notice the difference.

That's when I told her that it's not the practice, that makes perfect - it takes more. It's perfect practice that makes perfect. It's not enough that she just practice - she has to practice the right steps, and that's how it is with writing good code. It's not a matter of how practiced we are, it's a matter of how much we've practiced the right things.

So, what are the right things when it comes to good code? Ultimately, good code is good writing. Yes, we're solving a problem, but if we look at it another way, we are telling a story about how to solve the problem. What are the elements of good code - the same elements as good writing.

Good writing has clarity and focus. As engineers who have built object-oriented code for two decades, we pretty much have the 'focus' element down - we just call it 'encapsulation'. Clarity, on the other hand, could use some work.

Good writing is organized. Back in day, we struggled against spaghetti code. We've gotten beyond that now, especially with event-oriented programming, where we have the very real threat of callback hell.

Good writing uses precise language. It says exactly what it means. Here, we can think about data types and how we speak or write outside of code. If you use the word biker in one community it may mean something entirely different in another, and likely does. That word, in a sense, refers to two different data types and is, unfortunately, not very precise.

Good writing uses correct grammar and style. Again, one of these - the grammar portion of the rule - we have down. The syntax is the syntax, and there isn't really a way around it. The style portion of the rule doesn't refer to your "personal style", rather it refers to the stylistic elements like how indentation and braces are used. Using incorrect style is something like dropping a comma from a sentence. We, of course, realize there is a big difference between "it's time to eat, grandma" and "it's time to eat grandma".

Good writing expresses the author's voice. This is your personal style - what makes your code different than code written by someone else. This encompasses things like which substring function you use or how you build a string - whether you use an array and join or the plus operator. These are the things we sometimes don't even notice unless they're vastly different than our own way of doing things.

Ultimately, it all comes down to one thing - we can describe good code in terms of how understandable the story is to other people, because "any fool can write code that a computer can understand; good programmers write code that humans understand".

Now, let's put these elements in terms of best practices that help us understand how we can achieve clarity and focus, how we might approach organization, what it means to use precise language, and what correct style is.

In the language of those who evaluate and manage risk, we hear about "known knowns", "known unknowns", and "unknown unknowns". After nearly two decades of experience with a language and millions of developers all over the world writing in it, we've pretty much identified all the gaps in the language. We often still struggle with where the gaps are in our applications. In a sense, we know that we're missing a piece of the map, but we don't really know what that piece of the map represents.

Is Sky Broadband familiar to anyone? Sky Broadband is a large Internet Service Provider in the UK, and in the early days of 2014, Sky Broadband mistakenly marked the subdomain code.jquery.com as malware - the entire CDN repository. Suddenly pages all over the UK stopped working - the user experience for literally millions of users was ruined, regardless of whether it was a simple interaction or a more complex application.

Of course this not only happens in front-end code, it happens elsewhere as well. It can even happen in Nodejs applications - anywhere JavaScript is used.

Earlier this year, NPM involved themselves in a brand infringment disagreement that was originally between a developer who had several published modules and an organization who wanted to use the name one of his modules used. After NPM acted, the developer pulled all his published modules from NPM, including the eleven lines of code published as 'left-pad'. When that module was pulled, thousands of projects broke before NPM restored the module.

We can debate the ethics around this second example, but in this case it's worth noting that it was eleven lines of code that padding a string on the left side. It wasn't a complex, multifunction library - its removal should not have broken thousands of projects.

You must always know not only what your dependencies are and where they are, but what they do. Put them squarely in the known unknowns column so you can guard yourself against the time when they disappear, or are blocked by an ISP, or are broken in the new release, or any one of the many things that can happen. Not fully identifying dependencies can result in a poor debugging and support experience as well as a poor - or worse, non-existent - user experience while you resolve dependency issues that arise, so the first item on the list of what will be ten principles is to identify dependencies.

We come to it at last. The great battle of our time. And it's the kind of battle few but engineers can have - a battle about design patterns. Make no mistake, nearly everyone takes a side in this battle, but it's no battle between the five armies and the forces of Sauron - there is no good side or evil side - just two competing ideas, each with advantages and disadvantages.
The debate between proponents of the Module Pattern and the Prototype Pattern can be severe; however, if you keep an open mind you will see that these patterns are not terribly dissimilar. The Module Pattern offers advantages in clarity that the Prototype Pattern does not - an advantage more pronounced in teams with greater history in Java development. The Prototype Pattern offers advantages in memory management and performance that the Module Pattern does not - although that advantage can be lessened by applying strict structures and development practices.

By keeping an open mind with an eye focused toward design concerns, you can more easily determine which pattern is appropriate for your needs. It may also be the case that one design pattern is used in a specific context and the other used in a different context; however, I would generally advise against such an arrangement due to the loss of clarity with the reduced consistency. Most importantly, however, the second item on our list of principles is to identify design concerns.

The next contentious debate - albeit one that has died down in recent years - is about indentation style. Years ago, when I was first writing code - about the same time the movie Highlander was released - the hot programming language was C and the indentation and brace style was called K & R - for Kernhigan and Ritchie and The C Programming Language. JavaScript, unlike other languages, has some quirks - different ways of handling control statements and white space and an implied semicolon, or automatic semicolon insertion, for instance. Unfortunately, this sort of behavior could lead you to lose your head if you're not careful, which is why every engineer with more than a modicum of experience with JavaScript will tell you to use the One True Brace Style, or OTBS - because there can be only one.

Here we'll take a look at some statements that use different brace and indentation styles. On the left you'll notice the if statement has no brace but the return statement is indented. In a language such as Python, this would indicate that the return should occur only when the conditional is true; however, in JavaScript the if statement without a brace will only consider the next newline part of the conditional, meaning the return statement will happen every time, not just when the conditional is true. The second example on the left side - the return statement - will get an implied semicolon, causing it to return null every time.

The examples on the right side show the same statements with the left brace on the same line as the control statement. This pattern is the One True Brace Style, and not only does it offer more clarity for the reader, it actually works the way you would expect without inserting semicolons where they're not actually present in the code.

Because of the potential for confusion, not just of humans but the computer as well, always use the One True Brace Style. If not, you may lose your head.

I have a friend - a JavaScript engineer - who is fond of the saying that there are always at least two developers working on a project, you and the developer you will be in six months. I know I look back at code I've written six months ago and say to myself "what were you thinking" or not, as the case may be. Was I confused or tired? More often than not I say to that other developer - me from six months ago - "I suppose you think that was terribly clever" and I think if I only had added comments in I would know just what was going on in the head of that other developer - the me from six months ago.

Comments, though, are only helpful when they're present, and it would seem that a lot of us don't really want to put them in. As a senior engineer at PayPal, one of my tasks was to perform technical interviews when there was a new candidate. I was surprised at the number of people who responded, when asked about comments, that they write self-documenting code. I was often a bit like Inigo Montoya - I don't think that means what a lot of developers think it means. Self-documenting code is code that is written with precision and a measure of clarity. Both of those are qualities of good code, but they are not enough.

You should never assume that self-documenting is enough, because as the six month older me can verify, it's not. It's not enough. Put comments in, and when you're adding comments, you'll want to keep two things in mind. First, always use the whack-star pattern, even for single-line comments. I cannot count the number of times I've seen otherwise beautiful code mangled by a minifier because someone used a whack-whack and a large portion of code that should not have been commented was. As an added bonus, if you use the whack-star pattern and combine it with a whack-whack while you are developing, you can comment and uncomment large blocks of code with the insertion or deletion of a single whack - which in our example would go on the line just after the function declaration before the whack-star, commenting or uncommenting the console.log statement. The second thing you'll want to keep in mind is to use the JavaScript version of JavaDoc - JSDoc. Not only does this offer clarity to your comments and code overall, the syntax is shared among the Java community and it eliminates the need to maintain API documents as a separate project.

The clarity offered by the inclusion of comments is simply too significant to ignore. Clear, precise comments will also help you organize your code and make that organization plain to other developers working on the project - even what that other developer is you. For these reasons, I've added "use comments" to the list of principles as item number four.


Not long after I started at PayPal, we found a bug in the checkout process. It was especially difficult to root out because all the tests in development passed and all the tests in the QA sandbox passed, but when we went to production things just didn't go quite as we expected them too - there was evil there that did not sleep.

Part of the difficulty in diagnosing the issue was the nature of web sites between development, quality assurance, and production. There are resource intensive features - like analytics - that we don't want on development or QA boxes and sometimes that means our production environment is best described the way engineers used to describe aircraft - a collection of parts in close proximity moving quickly in the same direction. If we add poor coding practices into that particular mix, the risk of failure is significant.

In our case, the error was eventually traced to a variable in our analytics package that was not in the proper scope. This previously unknown global variable in a dependency wrecked havoc until it was fixed. Take a lesson here - avoid global variables. In fact, as you can, avoid the global namespace altogether. Also, always use the var keyword when declaring variables - leaving it out will force the previously undeclared variable into the global scope, even if it's only used in one little function.

Third, declare variables at the beginning of the scope. Not only does this offer a little written clarity and orgnization, it offers conceptual clarity as well as it accounts for the hoisting that JavaScript does. Fourth - and this is as important as the previous three statements - call things by their name. If you'll recall, one of the elements of good writing is that it uses precise language. Calling things by their name is part of that precise language.

Fifth, and finally, define variables in the narrowest scope possible. It's not enough to just avoid global variables, define variables as narrowly as possible. Not only will this offer a degree of precision and clarity, it will help with memory management as well.

Now I'm going to combine these five steps into a single principle we can add to the list. We're going to call that principle "name and scope variables properly" and we're going to add it as the fifth item on the list.

At this point we're nearly halfway done with our list and we've still not talked directly about code - at least not more than using the var keyword - it's all been about the softer elements like style.

We've come quite far, though - all the way to Dagobah. Master Yoda - ever encouraging and training Jedi. Of course we know he's in the Star Wars universe and we're in the JavaScript universe, because for us there is a try - and catch - and finally - and we need to use them.

The try statement will be one of your best friends. Gone are the days when it wasn't supported - I believe WebTV was the last user agent that didn't handle it - so feel free to use it any time it's appropriate. It can be especially helpful when you have dependencies and you need to minimize the risk associated with them, because you can catch exceptions thrown when those dependencies are not present.

The try statement also gives you the ability to further customize your user experience. Within the catch statement you can determine the kind of exception thrown by using the instanceof operator as shown on the slide. One word of warning - you may see examples where the exception type conditional is wrapped in the parentheses of the catch statement - don't use that practice - it's not supported. When testing the exception, use the pattern shown here. If you wish to dismiss the exception, use ignore rather than err as the parameter name. It offers a degree of clarity that is also recognized by most linters.

As our sixth principle, then, we are to do the opposite of what Master Yoda says and use try - and catch - and sometimes finally. Speaking from experience, it really will save the user experience or, at the very least, allow you to fail somewhat more gracefully than you would otherwise.

Next we'll have a word about data types.

JavaScript is a loosely-typed language, which means the data type of a variable is not declared. As expected, sometimes engineers have difficulty with the lack of precision that typically accompanies loosely-typed languages, and the lack of strict types been one of the arguments engineers have used against JavaScript in the past.

Concerns about the lack of precision and clarity in loosely-typed languages seem reasonable. After all, you wouldn't want a doctor type if you needed an engineer, or vice versa, and we all know the difficulty Bones had - he had to remind Captain Kirk countless times that he was a doctor type and not some other type.

So let's take a look at a function - one that will be used again as we progress through our list - and see how it might instruct us or what we can glean from it to add to our list.

First, you'll likely notice that there are no comments - if only there had been comments we might have a clue what it is we're look at. Fortunately for us, it's self-documenting code, so we can see that the foo function is looping forward through a list of some sort - from the syntax it appears to be an array - and changing a property on each element of the array that's an object. Seems simple enough - there shouldn't be any problem at all, right?

Except there might be. Notice that on line seven we check to make sure the element is an object. We do that because we're accessing a property - in this case className - on line eight and if the data type of the element were say a number or boolean trying to access the className would likely raise a TypeError exception. Are there other types that would raise a TypeError exception? Sure - a null value would - but that's not an object, right? Unfortunately it is - the typeof operator returns object when the type is null. Line seven will have to be fixed.

Any other problems? As I mentioned earlier, it looks like the list we're using is an array, but we have no comments that tell us that and there isn't a type check to enforce that, we're just going by the square bracket used to reference items on the list. Look at lines twelve and thirteen. Using those lines we can see that the list we're getting is actually a NodeList, which is a live collection. Looping forward through a live collection incurs a significant risk of infinite looping…but there is no way for the developer who wrote the foo function to know that the developer who wrote the function call is going to pass in a live collection, and the developer who wrote the foo function didn't give any clues to the developer who wrote the function call that an array was expected.

Ultimately, we can see that this code is not nearly as precise as it should be. The engineer writing the foo function could have used item instead of square brackets, or alternatively checked the type of the list variable to ensure it was an array. The engineer should also have checked for other types rather than just relying on a type object before modifying properties. You must always keep data type in mind, otherwise your code is prone to serious errors - errors that will often be very difficult to find and correct.

The I, Mudd episode of the original Star Trek series is one of my favorites - mostly because of its use of logic, but also because of the phrases that seem to belie the seriousness of the situation - phrases like "we seem to be taking an unscheduled ride" - it's rather like "oh, it appears we've been taken hostage and may die, but at least there's tea". Infinite loops - like those triggered by looping forward through a live collection - are the unscheduled rides of JavaScript. They're a little humorous in that they're usually brought on by a careless mistake but they can really ruin your day.

I mentioned that we would return to the foo function, so let's do that now - but let's make some corrections to it in an attempt to fix the issues. First, we'll correct line seven so that we'll catch a null value that would raise a TypeError exception - we can do that by putting a truthy check in there. Next, we'll set the loop up to start and the end and loop backward to the beginning - if we're looping through a live collection or an array that's the safe way of doing it. We'll set the index variable to the length of the list and change our for loop into a while - it's faster anyway so bonus there. Now we're all set to go - we've modified the code so it's flexible enough to handle live collections and arrays without blowing up.

Sort of. There's a potential problem in our while loop, associated with the fact that we're using the index value - a number - in a truthy check. A zero value is false, but all other numbers are true - which means negative numbers are true. We're using a decrement operator on the loop index to move backward, but the operator is in front of the variable name rather than behind the variable name, meaning the decrement is performed before the variable is evaluated. If the list length is zero, the index value is negative before it's evaluated the first time, so the truthy value is true and continues to be true as the index is decremented, and your unscheduled ride is on in earnest.

There are ways we can compensate for this - ways we can avoid the unscheduled ride, but it's the simplest way that also adds clarity without losing precision in the language - and that's to avoid confusing operators, which you'll notice is now the eighth principle on our list.

You'll recall that our while loop in the foo function - while decrement index - used a number that was decremented before being evaluated and as a result, the negative number was treated as true and sent us into an unscheduled ride. This is ultimately because not only is JavaScript loosely typed, but it's also what I call type fluid. Variables change data types in one of three ways: they can be converted, which requires an explicit function call; they can be explicitly coerced, which typically doesn't use a function call but instead uses a clue or hint that we want the variable coerced; and they can be implicitly coerced, which is when one data type is directly treated as another type.

Other languages give the ability to convert between data types; however, JavaScript goes beyond that. In fact, it will often coerce values even when we are neither aware the value will be coerced nor want the variable coerced. In a sense, it's as if JavaScript says "we will convert your data type to our own, resistance is futile".

So let's take a look at a simple variable statement and see where the potential problems lie and use that to determine what principle we should follow to write good code.

Here you'll see we have the value for PI as a string in scientific notation. We then convert that string into a number using two different methods - Number and parseFloat - both returning a floating-point value, before we attempt to convert the value to an integer using parseInt or coerce the string using the plus and minus operators and finally coerce it from an ordinary floating-point to a fixed precision number.

Notice that four of those statements are identified as potential problems.

Let's start with the first potential problem - parseInt. One might assume that the number would be converted to a floating-point then converted to an integer; however, that is not even remotely the case. The conversion reads the string starting at the left and moving right to the first non-digit. Everything in that substring - here '0314' - is then converted to an integer. There's a hidden gem in this particular conversion method, however. If the left-most character in the string is a zero, the parseInt function - which has an optional radix parameter - may default to octal and convert 0314 to the integer 204. Suddenly, though you were expecting three, you have the number 314, or possibly 204.

Next, we look at the plus2 variable. In this instance, we have two different data types - string and number - but the code is not converting either of them; the code is, however, explicitly coercing one of the values - the question is which value will it coerce.

As you can see, the two was coerced to a string data type and added to the existing string, giving us a much different numeric value - one that is a mere fraction of the value of PI.

While I would like to be able to give a hard and fast rule that says the plus operator will always concatenate strings first - that it will coerce a number to a string and then concatenate it - it's not that simple. If the plus operator is at the beginning of the variable name without white space between the plus and the variable name, the value will be coerced to a number before other operations are performed. This is true even if the value is a boolean - which is a one when true and zero when false after it's coerced.

Next, let's look at the minus2 variable. In this instance, we again have two different data types - string and number - and the code is not converting either of them. The difference between this variable and the plus2 variable is the use of the minus operator in place of the plus operator.

We might assume, given the behavior of the plus operator in the plus2 variable, that the result would be the value of the pi variable with the two removed. Of course that is not the case - the minus operator does not split a string or perform any sort of substitution on a string variable. In this case, pi is coerced to a number and two is subtracted from it. You'll notice, however, that when the pi variable is coerced, the value is different - it's not 3.14. If you're checking equivalence of coerced values, you will likely be surprised.

Finally, let's look at the asDollar variable. Of course you would not be very likely to request a precision of eighteen when referring to a dollar value; however, this example is intended to showcase a particular problem. You may think, especially given the name of the function being called - toFixed - that you are simply requesting a specific precision, but you are not. If you'll notice, the string data types on this slide have been identified by, one, being enclosed in single quotes and, two, using an italic typeface. The value for our asDollar variable is enclosed in single quotes and uses an italic typeface. You thought you were asking for a fixed precision number, but what was returned was a string.

Coercion, both implicit and explicit, is problematic. It, at the very least, lacks clarity and precision. Because of the severity of the errors that may be introduced, I've added the principle "avoid coercion" as the ninth item on the list of ten principles.

With the ninth item having been added to our list of ten principles, it's clear that we're near the end, and at this point we're going to leave code behind.

After we've followed all nine principles, and we have well-commented, self-documenting code that is precise in language and structure it'll work without a problem, right? We can have total faith that what we've written will do just exactly what we want it to do the first time and every time. That kind of faith is fine with Lord Vader - who finds a lack of faith disturbing - everyone else wants you to test your code, then test it again before you test it again. Write and run every unit test you can think of and test cases you think are highly improbable, because especially if you do the amount of business that a company like PayPal does - which was nearly five billion transactions in 2015 - you can be surprised at just how often the highly improbable happens.

In addition to running every unit test you can think of and several you hadn't thought of, be sure to use a linter to verify your code - jshint, jslint, and eslint are all very good and all do basically the same thing but in different ways. Each of the three has its own interface and can be integrated into the development process at a different point. If you wish to use all three, that can be done - but it's a little overkill. I would suggest that you integrate a linter with your version management tool, and while I have no direct experience with it, I understand having a linter integrated with your IDE makes life significantly easier.

Third, minify your code and test again. Be sure here you're using the same aggregation and minification process you use when rolling your code out into production. Each process has its own way of accomplishing the task and each can introduce something special into the mix that will break your code. I've seen beautiful code mangled and crushed into an inoperable state by aggregation and minification.

Fourth, perform code reviews. Here's something else we tend to get a little bit wrong much like we do with the expression "practice makes perfect". Typically, senior team members review code before a pull request is approved to verify that the code is sound and adheres to whatever standards exist within the organization. Those are good criteria; however, code review should not stop there. Code review should include the most junior, least experienced engineers in the group. The practice of heavily involving junior engineers in the code review process gives them the opportunity to learn to code better - just like reading good writers makes a writer better - and it gives the authoring engineer confirmation that their code contains the elements of good writing. If your code does not have the clarity to be understandable to the least experienced engineer on your team, it is not clear enough. Here I should point out that the junior members of your team need not know how to code a solution to every problem - if that were the case they would likely be senior engineers - but they should be able to understand the solution they are reading.

Fifth, as part of this process, have a human test your peer-reviewed, unit-tested, linted, minified code. This step is extremely important for any JavaScript intended for a user-agent even though it may be decidedly less so for code that runs in Node. Basically, any time a human meets the code - whether that's in a user-agent or on the command line - test the interaction. Humans have quirks and habits that help them generalize experience and reduce cognitive load that will affect how they interact with the code. Simply put, in general, no two people will interface with code the same way. There will a lot of overlap, to be certain, but it's the edge cases that you're testing here - all those behaviors that are especially active when someone is interacting with a new problem - all those things that are highly improbable.

All five steps in this process have one thing in common - that one thing that Lord Vader finds disturbing - they show a lack of faith. We need to remember as engineers, however, that our realm is not what Han Solo called a "hokey religion" - our realm is one of facts and data and requires proof. Show a lack of faith in your code and prove that its good code and good writing.

So there they are - ten principles that not only help you avoid some of the most common pitfalls, but principles that will help you become a better writer as you make their implementation habit. Follow them, and you'll discover that not only does the computer understand your code, humans do too.


A Special Note

This is the extended version of the Beyond the UI session I delivered at the American Express Mobile-Web DevCon on 28 September 2016.

No comments:

Post a Comment