Invisible characters hell

January 18, 2010

After reading Whitespace: the silent killer and re-reading The Great Newline Schism, both from Coding Horror, I remembered something I’d like to share here: when you write string literals in your code, NEVER type invisible characters (such as spaces, or any character that can be misinterpreted by a bad text editor, or by one that simply doesn’t support Unicode).

Motivation

In a project I worked on some time ago, we had some JavaScript code that looked something like this:


var clearInvisibleCharacter = function(string) {
    // The " " below looked like a space in our editor, but it was not one.
    return string.replace(" ", '');
};

If you pass in a string that contains any number of spaces, the function will replace the first one with nothing, right? Wrong. The invisible character was NOT a space. It was only displayed as one by the editor we were using at the time.

The programmer who wrote the function probably ctrl+c’ed the character from somewhere else and ctrl+v’ed it there. When we had to debug it to find out what was wrong, the behavior just didn’t make sense: the code looked correct, yet it did not do what it appeared to say.

The solution

So, how can we avoid this? It’s fairly straightforward in this case: replace the " " literal with what you really want replaced, expressed as a Unicode escape, or as an escape sequence if you’re dealing with regular expressions.
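As a rough sketch of that advice, the function below spells its target out as a Unicode escape instead of pasting the character itself. I never pinned down which character we actually had, so the no-break space (U+00A0) used here is only an assumption for illustration, and the names NO_BREAK_SPACE and clearAllInvisibleCharacters are made up for this example.

```javascript
// Assumption: the culprit is U+00A0 (no-break space). The real character
// in our project may well have been a different one.
var NO_BREAK_SPACE = "\u00A0";

// Same behavior as the original function: only the first occurrence is
// removed, because replace() with a string pattern replaces a single match.
var clearInvisibleCharacter = function (string) {
  return string.replace(NO_BREAK_SPACE, "");
};

// Regular-expression variant: the g flag removes every occurrence.
var clearAllInvisibleCharacters = function (string) {
  return string.replace(/\u00A0/g, "");
};

console.log(clearInvisibleCharacter("a\u00A0b\u00A0c") === "ab\u00A0c"); // true
console.log(clearAllInvisibleCharacters("a\u00A0b\u00A0c") === "abc"); // true
```

Written this way, anyone reading the code can see exactly which character is being replaced, whatever their editor decides to draw on screen.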

Trust me, you don’t want these blank-like characters running all over your code. It would drive you crazy. It drove our team crazy.

There is one good question beyond this, though:

  • Why would a blank-like, non-whitespace character show up in the code? The answer is copy and paste. We have to be extra careful every time we copy and paste anything, because it will be copied again, and it will spread. Do not trust that whitespace is whitespace, unless explicitly stated.
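One way to stop trusting what the editor shows is to ask the string itself. The helper below is a minimal sketch, not something we had at the time: findSuspiciousCharacters is a hypothetical name, and the rule it applies (anything outside printable ASCII, tab, line feed and carriage return is suspicious) is an assumption that will be too strict for code that legitimately contains non-ASCII text.

```javascript
// Hypothetical helper: lists every UTF-16 code unit in `text` that is not
// printable ASCII or ordinary layout whitespace (tab, LF, CR).
var findSuspiciousCharacters = function (text) {
  var found = [];
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    // 0x20 to 0x7E is printable ASCII, including the ordinary space.
    var isPlain = (code >= 0x20 && code <= 0x7e) ||
      code === 0x09 || code === 0x0a || code === 0x0d;
    if (!isPlain) {
      var hex = code.toString(16).toUpperCase();
      // Pad to four hex digits, e.g. A0 becomes U+00A0.
      found.push({ index: i, code: "U+" + ("0000" + hex).slice(-4) });
    }
  }
  return found;
};

// A pasted literal that merely looks like a space (U+00A0 is assumed here).
console.log(findSuspiciousCharacters('replace("\u00A0", "")'));
```

Because it works on code units, a character outside the Basic Multilingual Plane is reported as two surrogate entries, which is still enough to tell you that something other than a plain space is sitting there.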

I plan on talking about why our editor couldn’t tell us that this was not a whitespace, as soon as I figure out what really happened. Any thoughts on this?

Thanks to Campinho and Gabriel, who helped discover the problem in the first place, ages ago.
