Tuesday, February 26, 2013

Address validation

Periodically I get requests to add "address validation" to a web page. For most of them I talk with the designers and product owners to determine what it is they feel they need and what can be reasonably done in a given scenario. Inevitably those discussions are enormously helpful - sometimes the desired "validation" can be easily accomplished and sometimes not - in all cases, though, the product owners have been considerably more receptive to input than fellow User Interface Engineers. (I tend to believe that's because we engineers don't like leaving things we see as "problems" unsolved.)

If you're considering adding some form of address validation, I'll offer a few points you'll want to consider as the interested parties attempt to come to consensus. Note that I'm not offering conclusions (or even a conclusion) here, just suggesting that you need to evaluate your users and determine what's in their best interest. If your site (or page) is country-specific then your needs are going to be very different than if your site is used in several countries or is intended to be used globally.

First, let's address the elephant in the room: international addresses pose several challenges. Luckily for us, there is a long-established standard for data interchange in this regard, from the IETF no less. The vCard format (RFC 6350) has a specification for address properties (see Section 6.3). It will be to your benefit if you follow this specification and its implementations (one of which I'll discuss a little later).
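
To make the vCard structure concrete, here is a minimal sketch of the ADR property value from RFC 6350: seven components in a fixed order, separated by semicolons, with empty components left in place. The names `Adr`, `escape_component`, and `to_adr_value` are my own for illustration, and the sample address is made up.

```python
from dataclasses import dataclass, astuple


@dataclass
class Adr:
    """The seven ADR components, in the order RFC 6350 lists them."""
    post_office_box: str = ""
    extended_address: str = ""
    street_address: str = ""
    locality: str = ""
    region: str = ""
    postal_code: str = ""
    country_name: str = ""


def escape_component(value: str) -> str:
    # Escape backslashes first so the escapes added below are not doubled.
    for ch in ("\\", ";", ","):
        value = value.replace(ch, "\\" + ch)
    return value.replace("\n", "\\n")


def to_adr_value(adr: Adr) -> str:
    """Join the components with semicolons, keeping empty ones in place."""
    return ";".join(escape_component(part) for part in astuple(adr))


# Made-up sample data, for illustration only.
example = Adr(street_address="123 Any Blvd", locality="Springfield",
              region="IL", postal_code="62701", country_name="USA")
print("ADR:" + to_adr_value(example))
# prints: ADR:;;123 Any Blvd;Springfield;IL;62701;USA
```

The point of the fixed order is that a consumer can always find the postal code in the sixth slot, even when the first two are empty.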

With the standards discussion out of the way, let's dispense with the simple case - verifying that the user has entered something other than whitespace for required data elements. This use case should be a no-brainer. For user-agents that recognize HTML5, most of this can be accomplished by adding the "required" attribute to the inputs [1] (note that "required" on its own only rejects an empty value, so a value made up entirely of spaces still needs a "pattern" or a script check), and for any others, this is an easy JavaScript check. So, in the words of Nike, just do it.
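
As a rough sketch of that check, written here in Python as it might run on the server (the same few lines port directly to JavaScript), assuming the form arrives as a simple mapping of field names to strings. The function name `missing_required` and the field names are illustrative, not from any particular framework.

```python
def missing_required(form: dict[str, str], required: list[str]) -> list[str]:
    """Return the required field names that are absent or whitespace-only."""
    return [name for name in required if not form.get(name, "").strip()]


form = {
    "street-address": "13 Parkgate Street",
    "locality": "   ",  # whitespace only, so it counts as missing
    "country-name": "Ireland",
}
print(missing_required(form, ["street-address", "locality", "country-name"]))
# prints: ['locality']
```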

Now, for the more complicated case - actually verifying some of the data. For this, we'll need to look at the various pieces of an address. To make it easier, we'll start with the postcode (or, if you're American, the Zone Improvement Plan Code or ZIP Code), one of the most normalized pieces of an address.

For the complicated case we should first consider that of the approximately 240 countries and territories identified by ISO country codes, roughly 75% use postal codes of some kind. Among those 180 or so postal code formats, few are similar enough to attempt to re-use. If you decide to validate the postal code, you should determine beforehand whether you will deliver all of the formats to all users and sort out on the front-end which format applies, or whether you will attempt to deliver only those formats likely to be used based on geolocation.
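
A format-only check might look something like the sketch below. The patterns are deliberately simplified and cover only a handful of countries (the real rules, for the UK and Canada in particular, are stricter), and the names `POSTCODE_PATTERNS`, `check_postcode_format`, and `patterns_for` are my own. Note the third possible answer: for a country with no postal codes, or one we have no pattern for, there is nothing to check.

```python
import re
from typing import Optional

# Simplified, illustrative patterns keyed by ISO 3166-1 alpha-2 country code.
POSTCODE_PATTERNS = {
    "US": r"\d{5}(-\d{4})?",
    "CA": r"[A-Z]\d[A-Z] ?\d[A-Z]\d",
    "DE": r"\d{5}",
    "GB": r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}",
    "HK": None,  # no postal codes in use
}


def check_postcode_format(country: str, postcode: str) -> Optional[bool]:
    """True/False when a pattern is known, None when there is nothing to check."""
    pattern = POSTCODE_PATTERNS.get(country.upper())
    if pattern is None:
        return None
    # Be liberal in what we accept: ignore surrounding spaces and letter case.
    candidate = postcode.strip().upper()
    return re.fullmatch(pattern, candidate, flags=re.ASCII) is not None


def patterns_for(countries: list[str]) -> dict:
    """Select only the formats worth delivering, e.g. based on geolocation."""
    return {c: POSTCODE_PATTERNS[c] for c in countries if c in POSTCODE_PATTERNS}


print(check_postcode_format("US", "90210"))    # prints: True
print(check_postcode_format("ca", "k1a 0b1"))  # prints: True
print(check_postcode_format("HK", ""))         # prints: None
```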

Second, consider how you're going to validate the postcode. Will you just validate it for format (which is relatively easy), or will you use a service that actually validates a specific postcode against a list of valid postcodes? Either is possible (for the 75% of countries that use postcodes), but they aren't equal. Further, if you use the latter, will you attempt to tie a postcode to a region or locality? Will that locality/region be used in the user interface in some manner? If so, be aware that postcodes are not intended to necessarily have a 1:1 relationship with locality/region - they are intended as physical delivery zones and not geopolitical zones. Often the two categories overlap, but that's not necessarily the case.
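
To illustrate the difference between checking a format and checking a list, and why a postcode cannot be assumed to map to exactly one locality, here is a small sketch. The lookup table is entirely made up and stands in for whatever postcode service you might use; the function names are hypothetical as well.

```python
# Made-up lookup data standing in for a real postcode service.
LOCALITIES_BY_POSTCODE: dict[str, set[str]] = {
    "00001": {"Northtown"},
    "00002": {"Southville", "Lower Southville"},  # one delivery zone, two localities
}


def postcode_exists(postcode: str) -> bool:
    """A well-formed postcode can still be absent from the list."""
    return postcode.strip() in LOCALITIES_BY_POSTCODE


def locality_matches(postcode: str, locality: str) -> bool:
    """True when the locality is one of those served by the postcode."""
    known = LOCALITIES_BY_POSTCODE.get(postcode.strip(), set())
    return locality.strip().casefold() in {name.casefold() for name in known}


print(postcode_exists("00003"))                        # prints: False
print(locality_matches("00002", "lower southville"))   # prints: True
print(locality_matches("00001", "Southville"))         # prints: False
```

Because the lookup returns a set rather than a single value, the interface can offer the candidates as suggestions instead of overwriting what the user typed.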

If your site (or page) is country-specific, you can easily validate regions (e.g. states in the US or provinces in Canada); however, locality (e.g. city, town, or village) is a little trickier. There are exceptions, most notably Singapore, but they are few. Again, the more restricted your site (or page) is, the easier the validation.
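
For a country-specific page, region validation can be as small as a set membership test. A sketch for Canada, using the two-letter postal abbreviations for its provinces and territories (the names `CA_REGIONS` and `is_valid_region` are my own):

```python
# Two-letter abbreviations for Canada's ten provinces and three territories.
CA_REGIONS = {
    "AB", "BC", "MB", "NB", "NL", "NS", "NT",
    "NU", "ON", "PE", "QC", "SK", "YT",
}


def is_valid_region(code: str) -> bool:
    """Accept the abbreviation regardless of case or surrounding spaces."""
    return code.strip().upper() in CA_REGIONS


print(is_valid_region(" on "))  # prints: True
print(is_valid_region("TX"))    # prints: False
```

No comparable fixed list exists for localities in most countries, which is why that field is the harder one.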

That simply leaves the street address and, optionally, the extended address. Here lies the greatest variance in addresses. In the US, for example, the street address can be further divided into a street number, street name, and street type (e.g. 123 Any Blvd); however, other countries might not follow this pattern - in Ireland, for example, you may see a one- or two-line address such as "13 Parkgate Street" or something as simple as "Bunowen Pier".

Of course, this points to an even greater problem - how many data elements are there in an address? If we go by the hCard microformat [2], there are 4 to 6, depending on which you choose to include. While that works fairly well for most addresses, there are some for which it doesn't - a good example is the address for the Lord Mayor of Dublin [3].

If you have come far enough to have resolved all of these issues, you then must consider how the user and the interface will interact, especially when there is a conflict. For example, if you alter the address input form to move the postcode earlier in the form than the locality/region, how does that alter the experience? What if the user enters a locality/region that does not match your postcode validation? And if you're using a service (such as an interface to the Royal Mail's Postcode Address File) to validate the address, how do you design your interface to minimize disruption to the user?
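
One way to think about the conflict case is to separate problems that must block submission from mismatches the user should merely be asked to confirm. The sketch below is only one possible shape for that, under the assumption that some lookup has already produced a set of localities suggested for the entered postcode; `Review`, `review_address`, and the sample place names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    errors: list[str] = field(default_factory=list)    # block submission
    warnings: list[str] = field(default_factory=list)  # ask the user to confirm


def review_address(address: dict[str, str], suggested_localities: set[str]) -> Review:
    """Treat a missing street address as an error, a locality mismatch as a warning."""
    result = Review()
    if not address.get("street-address", "").strip():
        result.errors.append("street-address is required")

    locality = address.get("locality", "").strip()
    known = {name.casefold() for name in suggested_localities}
    # An empty suggestion set means the lookup had nothing to say, so stay quiet.
    if known and locality.casefold() not in known:
        options = ", ".join(sorted(suggested_localities))
        result.warnings.append(
            f"locality {locality!r} does not match the postcode; did you mean {options}?"
        )
    return result


# Made-up sample data, for illustration only.
outcome = review_address(
    {"street-address": "123 Any Blvd", "locality": "Northtown"},
    {"Southville"},
)
print(outcome.errors)         # prints: []
print(len(outcome.warnings))  # prints: 1
```

The user's own entry is never overwritten here; the interface decides how, or whether, to surface the warning.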

This entry has been oriented specifically towards address validation; however, the principles can be applied to other areas as well, and will often lead us to the same conclusions. Designing and developing fast, light-weight, and easy-to-use user interfaces requires us not only to know our users and keep their experience to the fore, but also to follow Postel's Law (be conservative in what you send and liberal in what you accept), even if that means not solving every problem that we see.
  1. Accessible HTML5 Forms – Required Inputs
  2. hCard Microformat
  3. Dublin City Council
    Lord Mayor's Office
    Mansion House
    Dawson Street
    Dublin 2 
