Explicit Coercion

Explicit coercion refers to type conversions that are obvious from the code itself. There’s a wide range of type conversion usage that clearly falls under the explicit coercion category for most developers.

The goal here is to identify patterns in our code where we can make it clear and obvious that we’re converting a value from one type to another, so as to not leave potholes for future developers to trip into. The more explicit we are, the more likely someone later will be able to read our code and understand without undue effort what our intent was.

It would be hard to find any salient disagreements with explicit coercion, as it most closely aligns with how the commonly accepted practice of type conversion works in statically typed languages. As such, we’ll take for granted (for now) that explicit coercion can be agreed upon to not be evil or controversial. We’ll revisit this later, though.

Explicitly: Strings <—> Numbers

We’ll start with the simplest and perhaps most common coercion operation: coercing values between string and number representation.

To coerce between strings and numbers, we use the built-in String(..) and Number(..) functions (which we referred to as “native constructors” in Chapter 3), but very importantly, we do not use the new keyword in front of them. As such, we’re not creating object wrappers.

Instead, we’re actually explicitly coercing between the two types:

    var a = 42;
    var b = String( a );

    var c = "3.14";
    var d = Number( c );

    b; // "42"
    d; // 3.14

String(..) coerces from any other value to a primitive string value, using the rules of the ToString operation discussed earlier. Number(..) coerces from any other value to a primitive number value, using the rules of the ToNumber operation discussed earlier.
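To make the point about leaving off new concrete, here is a small sketch comparing the primitives returned by String(..) and Number(..) with the object wrappers that new would create. The variable names are illustrative only.

```javascript
// Called as plain functions: the results are primitives.
var asString = String( 42 );
var asNumber = Number( "3.14" );

typeof asString;    // "string"
typeof asNumber;    // "number"

// Called with `new`: the results are object wrappers, not primitives.
var wrappedString = new String( 42 );
var wrappedNumber = new Number( "3.14" );

typeof wrappedString;   // "object"
typeof wrappedNumber;   // "object"

// A string that is not entirely numeric coerces to NaN.
var notNumeric = Number( "42px" );  // NaN
```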

I call this explicit coercion because in general, it’s pretty obvious to most developers that the end result of these operations is the applicable type conversion.

In fact, this usage actually looks a lot like it does in some statically typed languages.

For example, in C++, you can say either (int)x or int(x), and both will convert the value in x to an integer (C itself only has the (int)x cast form). Both forms are valid, but many prefer the latter, which kinda looks like a function call. In JavaScript, when you say Number(x), it looks awfully similar. Does it matter that it’s actually a function call in JS? Not really.

Besides String(..) and Number(..), there are other ways to “explicitly” convert these values between string and number:

    var a = 42;
    var b = a.toString();

    var c = "3.14";
    var d = +c;

    b; // "42"
    d; // 3.14

Calling a.toString() is ostensibly explicit (pretty clear that “toString” means “to a string”), but there’s some hidden implicitness here. toString() cannot be called on a primitive value like 42. So JS automatically “boxes” (see Chapter 3) 42 in an object wrapper, so that toString() can be called against the object. In other words, you might call it “explicitly implicit.”
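As a rough illustration of that hidden boxing step, the sketch below performs the wrapping by hand with Object(..) and compares the result with the automatic version. The variable names are illustrative, and the manual version only approximates what the engine does internally.

```javascript
var a = 42;

typeof a;                   // "number" (a primitive, not an object)

// Implicit boxing: JS wraps `a` just long enough to call the method.
var implicitlyBoxed = a.toString();         // "42"

// Roughly the same steps, spelled out by hand with Object(..).
var wrapper = Object( a );
typeof wrapper;             // "object"
var manuallyBoxed = wrapper.toString();     // "42"
```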

+c here is showing the unary operator form (operator with only one operand) of the + operator. Instead of performing mathematical addition (or string concatenation; see below), the unary + explicitly coerces its operand (c) to a number value.

Is +c explicit coercion? Depends on your experience and perspective. If you know (which you do, now!) that unary + is explicitly intended for number coercion, then it’s pretty explicit and obvious. However, if you’ve never seen it before, it can seem awfully confusing, implicit, with hidden side effects, etc.

Note: The generally accepted perspective in the open-source JS community is that unary + is an accepted form of explicit coercion.

Even if you really like the +c form, there are definitely places where it can look awfully confusing. Consider:

    var c = "3.14";
    var d = 5+ +c;

    d; // 8.14

The unary - operator also coerces like + does, but it also flips the sign of the number. However, you cannot put two -- next to each other to unflip the sign, as that’s parsed as the decrement operator. Instead, you would need to do: - -"3.14" with a space in between, and that would result in coercion to 3.14.
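A short sketch of the difference, using a variable so that the decrement form has something to assign back to. The names are illustrative only.

```javascript
var c = "3.14";

var negated = -c;       // -3.14 (coerced to number, then sign flipped)
var unflipped = - -c;   // 3.14 (the space keeps these as two unary - operators)

// Without the space, `--e` is the decrement operator instead: it coerces
// e to a number, subtracts 1, and assigns the result back to e.
var e = "3.14";
var decremented = --e;  // 2.14, and e is now the number 2.14
```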

You can probably dream up all sorts of hideous combinations of binary operators (like + for addition) next to the unary form of an operator. Here’s another crazy example:

    1 + - + + + - + 1; // 2

You should strongly consider avoiding unary + (or -) coercion when it’s immediately adjacent to other operators. While the above works, it would almost universally be considered a bad idea. Even d = +c (or d =+ c for that matter!) can far too easily be confused for d += c, which is entirely different!

Note: Another extremely confusing place for unary + to be used adjacent to another operator would be the ++ increment operator and -- decrement operator. For example: a +++b, a + ++b, and a + + +b. See “Expression Side-Effects” in Chapter 5 for more about ++.
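To show how differently those three look-alike expressions behave, here is a small sketch that runs them in sequence. The variable names are illustrative, and the comments track the values of a and b as they change.

```javascript
var a = 1;
var b = 2;

// `a +++b` is tokenized as `a++ + b`: postfix increment on a, then add b.
var first = a +++b;     // 1 + 2 = 3, and a is now 2

// `a + ++b` is a prefix increment on b, then the addition.
var second = a + ++b;   // 2 + 3 = 5, and b is now 3

// `a + + +b` has no increment at all: two unary + operators applied to b.
var third = a + + +b;   // 2 + 3 = 5, a and b unchanged
```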

Remember, we’re trying to be explicit and reduce confusion, not make it much worse!

Date To number

Another common usage of the unary + operator is to coerce a Date object into a number, because the result is the Unix timestamp (milliseconds elapsed since 1 January 1970 00:00:00 UTC) representation of the date/time value:

    var d = new Date( "Mon, 18 Aug 2014 08:53:06 CDT" );

    +d; // 1408369986000

The most common usage of this idiom is to get the current moment as a timestamp, such as:

    var timestamp = +new Date();

Note: Some developers are aware of a peculiar syntactic “trick” in JavaScript, which is that the () set on a constructor call (a function called with new) is optional if there are no arguments to pass. So you may run across the var timestamp = +new Date; form. However, not all developers agree that omitting the () improves readability, as it’s an uncommon syntax exception that only applies to the new fn() call form and not the regular fn() call form.

But coercion is not the only way to get the timestamp out of a Date object. A noncoercion approach is perhaps even preferable, as it’s even more explicit:

    var timestamp = new Date().getTime();
    // var timestamp = (new Date()).getTime();
    // var timestamp = (new Date).getTime();

But an even more preferable noncoercion option is to use the Date.now() static function added in ES5:

    var timestamp = Date.now();

And if you want to polyfill Date.now() into older browsers, it’s pretty simple:

    if (!Date.now) {
        Date.now = function() {
            return +new Date();
        };
    }

I’d recommend skipping the coercion forms related to dates. Use Date.now() for the current timestamp, and new Date( .. ).getTime() for the timestamp of a specific date/time other than now.
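A minimal sketch of both recommended forms follows. The ISO 8601 string is an assumption chosen for illustration: it names the same instant as the CDT example earlier, expressed in UTC.

```javascript
// Timestamp for a specific moment, via the noncoercion getTime() call.
var specific = new Date( "2014-08-18T13:53:06Z" ).getTime();
specific;   // 1408369986000

// The coercion form yields the same number, just less obviously.
var coerced = +new Date( "2014-08-18T13:53:06Z" );

// Timestamp for the current moment.
var now = Date.now();
```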

The Curious Case of the ~

One coercive JS operator that is often overlooked and usually very confusing is the tilde ~ operator (aka “bitwise NOT”). Many of those who do understand what it does will oftentimes still want to avoid it. But sticking to the spirit of our approach in this book and series, let’s dig into it to find out if ~ has anything useful to give us.

In the “32-bit (Signed) Integers” section of Chapter 2, we covered how bitwise operators in JS are defined only for 32-bit operations, which means they force their operands to conform to 32-bit value representations. The rules for how this happens are controlled by the ToInt32 abstract operation (ES5 spec, section 9.5).

ToInt32 first does a ToNumber coercion, which means if the value is "123", it’s going to first become 123 before the ToInt32 rules are applied.

While not technically coercion itself (since the type doesn’t change!), using bitwise operators (like | or ~) with certain special number values produces a coercive effect that results in a different number value.

For example, let’s first consider the | “bitwise OR” operator used in the otherwise no-op idiom 0 | x, which (as Chapter 2 showed) essentially only does the ToInt32 conversion:

    0 | -0;         // 0
    0 | NaN;        // 0
    0 | Infinity;   // 0
    0 | -Infinity;  // 0

These special numbers aren’t 32-bit representable (since they come from the 64-bit IEEE 754 standard — see Chapter 2), so ToInt32 just specifies 0 as the result from these values.
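For readers who want to see those rules spelled out, here is a rough sketch of ToInt32 as a hypothetical toInt32(..) helper. It is not a built-in and is simplified from the spec text; it is only meant to show the order of the steps (ToNumber, special values to 0, truncate, wrap into 32 bits).

```javascript
// Hypothetical helper (not a built-in): approximates the ToInt32 steps.
function toInt32(value) {
    var num = Number( value );              // ToNumber first

    // NaN, Infinity and -Infinity all become 0
    if (!isFinite( num )) return 0;

    // drop the fractional part (truncate toward zero)
    var int = num < 0 ? Math.ceil( num ) : Math.floor( num );

    // wrap into 32 bits; 0, -0 and exact multiples of 2^32 all become 0
    var wrapped = int % 4294967296;
    if (wrapped === 0) return 0;
    if (wrapped < 0) wrapped += 4294967296;

    // reinterpret the upper half of the range as negative
    return wrapped >= 2147483648 ? wrapped - 4294967296 : wrapped;
}

toInt32( "123" );       // 123
toInt32( NaN );         // 0
toInt32( -Infinity );   // 0
toInt32( 2147483648 );  // -2147483648, same as (2147483648 | 0)
```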

It’s debatable if 0 | __ is an explicit form of this coercive ToInt32 operation or if it’s more implicit. From the spec perspective, it’s unquestionably explicit, but if you don’t understand bitwise operations at this level, it can seem a bit more implicitly magical. Nevertheless, consistent with other assertions in this chapter, we will call it explicit.

So, let’s turn our attention back to ~. The ~ operator first “coerces” to a 32-bit number value, and then performs a bitwise negation (flipping each bit’s parity).

Note: This is very similar to how ! not only coerces its value to boolean but also flips its parity (see discussion of the “unary !” later).

But… what!? Why do we care about bits being flipped? That’s some pretty specialized, nuanced stuff. It’s pretty rare for JS developers to need to reason about individual bits.

Another way of thinking about the definition of ~ comes from old-school computer science/discrete mathematics: ~ performs a one’s complement, flipping every bit of the value’s 32-bit two’s-complement representation. Great, thanks, that’s totally clearer!

Let’s try again: ~x is roughly the same as -(x+1). That’s weird, but slightly easier to reason about. So:

    ~42; // -(42+1) ==> -43
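As a quick sanity check of that rule of thumb, the sketch below compares ~x with -(x+1) for a handful of sample integers. The sample values are arbitrary, and the comparison is only expected to hold for values that fit in 32 bits.

```javascript
var samples = [ 42, 0, -1, 7, -100 ];

// Does ~x agree with -(x + 1) for every sample?
var allMatch = samples.every( function(x){
    return ~x === -(x + 1);
} );

allMatch;   // true

~0;         // -1
~7;         // -8
~-100;      // 99
```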

You’re probably still wondering what the heck all this ~ stuff is about, or why it really matters for a coercion discussion. Let’s quickly get to the point.

Consider -(x+1). What’s the only value that you can perform that operation on that will produce a 0 (or -0 technically!) result? -1. In other words, ~ used with a range of number values will produce a falsy (easily coercible to false) 0 value for the -1 input value, and any other truthy number otherwise.

Why is that relevant?

-1 is commonly called a “sentinel value,” which basically means a value that’s given an arbitrary semantic meaning within the greater set of values of its same type (numbers). The C language uses -1 sentinel values for many functions that return >= 0 values for “success” and -1 for “failure.”

JavaScript adopted this precedent when defining the string operation indexOf(..), which searches for a substring and if found returns its zero-based index position, or -1 if not found.

It’s pretty common to try to use indexOf(..) not just as an operation to get the position, but as a boolean check of presence/absence of a substring in another string. Here’s how developers usually perform such checks:

    var a = "Hello World";

    if (a.indexOf( "lo" ) >= 0) {   // true
        // found it!
    }
    if (a.indexOf( "lo" ) != -1) {  // true
        // found it
    }

    if (a.indexOf( "ol" ) < 0) {    // true
        // not found!
    }
    if (a.indexOf( "ol" ) == -1) {  // true
        // not found!
    }

I find it kind of gross to look at >= 0 or == -1. It’s basically a “leaky abstraction,” in that it’s leaking underlying implementation behavior — the usage of sentinel -1 for “failure” — into my code. I would prefer to hide such a detail.

And now, finally, we see why ~ could help us! Using ~ with indexOf() “coerces” (actually just transforms) the value to be appropriately boolean-coercible:

    var a = "Hello World";

    ~a.indexOf( "lo" );         // -4   <-- truthy!

    if (~a.indexOf( "lo" )) {   // true
        // found it!
    }

    ~a.indexOf( "ol" );         // 0    <-- falsy!
    !~a.indexOf( "ol" );        // true

    if (!~a.indexOf( "ol" )) {  // true
        // not found!
    }

~ takes the return value of indexOf(..) and transforms it: for the “failure” -1 we get the falsy 0, and every other value is truthy.

Note: The -(x+1) pseudo-algorithm for ~ would imply that ~-1 is -0, but actually it produces 0 because the underlying operation is actually bitwise, not mathematic.
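The difference between the two zeros is invisible to ===, so this sketch uses division to tell them apart. The variable names are illustrative only.

```javascript
var viaFormula = -(-1 + 1);     // -0
var viaTilde = ~-1;             // 0

// === cannot tell 0 and -0 apart, but dividing by them can.
viaFormula === viaTilde;        // true
1 / viaFormula;                 // -Infinity
1 / viaTilde;                   // Infinity
```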

Technically, if (~a.indexOf(..)) is still relying on implicit coercion of its resultant 0 to false or nonzero to true. But overall, ~ still feels to me more like an explicit coercion mechanism, as long as you know what it’s intended to do in this idiom.

I find this to be cleaner code than the previous >= 0 / == -1 clutter.
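If the goal is to hide the sentinel entirely, one option is to wrap the idiom in a small function. The sketch below assumes a hypothetical hasSubstring(..) helper (the name is not a built-in) and uses !! to turn the truthy/falsy number into a real boolean.

```javascript
// Hypothetical helper (the name is an assumption, not a built-in).
function hasSubstring(str, search) {
    // ~ turns the -1 "not found" sentinel into 0; !! makes it a real boolean
    return !!~str.indexOf( search );
}

hasSubstring( "Hello World", "lo" );    // true
hasSubstring( "Hello World", "ol" );    // false
hasSubstring( "Hello World", "He" );    // true (index 0 still counts as found)
```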

Truncating Bits

There’s one more place ~ may show up in code you run across: some developers use the double tilde ~~ to truncate the decimal part of a number (i.e., “coerce” it to a whole number “integer”). It’s commonly (though mistakenly) said this is the same result as calling Math.floor(..).

How ~~ works is that the first ~ applies the ToInt32 “coercion” and does the bitwise flip, and then the second ~ does another bitwise flip, flipping all the bits back to the original state. The end result is just the ToInt32 “coercion” (aka truncation).

Note: The bitwise double-flip of ~~ is very similar to the parity double-negate !! behavior, explained in the “Explicitly: * —> Boolean” section later.

However, ~~ needs some caution/clarification. First, it only works reliably on 32-bit values. But more importantly, it doesn’t work the same on negative numbers as Math.floor(..) does!

    Math.floor( -49.6 );    // -50
    ~~-49.6;                // -49
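The 32-bit caveat is easy to demonstrate as well. In this sketch the two sample values are arbitrary choices just outside the signed 32-bit range.

```javascript
var big = 4294967296.7;             // just above 2^32

var flooredBig = Math.floor( big ); // 4294967296
var truncatedBig = ~~big;           // 0 (wrapped around by ToInt32)

var overInt32 = 2147483648.5;       // just above 2^31 - 1

var flooredOver = Math.floor( overInt32 );  // 2147483648
var truncatedOver = ~~overInt32;            // -2147483648 (wrapped negative)
```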

Setting the Math.floor(..) difference aside, ~~x can truncate to a (32-bit) integer. But so does x | 0, and seemingly with (slightly) less effort.

So, why might you choose ~~x over x | 0, then? Operator precedence (see Chapter 5):

    ~~1E20 / 10;        // 166199296

    1E20 | 0 / 10;      // 1661992960
    (1E20 | 0) / 10;    // 166199296

Just as with all other advice here, use ~ and ~~ as explicit mechanisms for “coercion” and value transformation only if everyone who reads/writes such code is properly aware of how these operators work!

Explicitly: Parsing Numeric Strings

A similar outcome to coercing a string to a number can be achieved by parsing a number out of a string’s character contents. There are, however, distinct differences between this parsing and the type conversion we examined above.

Consider:

    var a = "42";
    var b = "42px";

    Number( a );    // 42
    parseInt( a );  // 42

    Number( b );    // NaN
    parseInt( b );  // 42

Parsing a numeric value out of a string is tolerant of non-numeric characters (it just stops parsing, left to right, when it encounters one), whereas coercion is not tolerant and fails, resulting in the NaN value.

Parsing should not be seen as a substitute for coercion. These two tasks, while similar, have different purposes. Parse a string as a number when you don’t know/care what other non-numeric characters there may be on the right-hand side. Coerce a string (to a number) when the only acceptable values are numeric and something like "42px" should be rejected as a number.

Tip: parseInt(..) has a twin, parseFloat(..), which (as it sounds) pulls out a floating-point number from a string.
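A brief sketch of how the twins differ on the same kind of input. The sample strings and variable names are arbitrary.

```javascript
var intPart = parseInt( "3.14px" );         // 3
var floatPart = parseFloat( "3.14px" );     // 3.14

var secondDot = parseFloat( "42.5.7" );     // 42.5 (stops at the second ".")
var noLeadingDigits = parseFloat( "px42" ); // NaN (nothing numeric to start from)

var coerced = Number( "3.14px" );           // NaN (coercion rejects the whole string)
```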

Don’t forget that parseInt(..) operates on string values. It makes absolutely no sense to pass a number value to parseInt(..). Nor would it make sense to pass any other type of value, like true, function(){..} or [1,2,3].

If you pass a non-string, the value you pass will automatically be coerced to a string first (see “ToString” earlier), which would clearly be a kind of hidden implicit coercion. It’s a really bad idea to rely upon such a behavior in your program, so never use parseInt(..) with a non-string value.

Prior to ES5, another gotcha existed with parseInt(..), which was the source of many JS programs’ bugs. If you didn’t pass a second argument to indicate which numeric base (aka radix) to use for interpreting the numeric string contents, parseInt(..) would look at the beginning character(s) to make a guess.

If the first two characters were "0x" or "0X", the guess (by convention) was that you wanted to interpret the string as a hexadecimal (base-16) number. Otherwise, if the first character was "0", the guess (again, by convention) was that you wanted to interpret the string as an octal (base-8) number.

Hexadecimal strings (with the leading 0x or 0X) aren’t terribly easy to get mixed up. But the octal number guessing proved devilishly common. For example:

    var hour = parseInt( selectedHour.value );
    var minute = parseInt( selectedMinute.value );

    console.log( "The time you selected was: " + hour + ":" + minute );

Seems harmless, right? Try selecting 08 for the hour and 09 for the minute. You’ll get 0:0. Why? Because neither 8 nor 9 is a valid character in octal (base-8).

The pre-ES5 fix was simple, but so easy to forget: always pass 10 as the second argument. This was totally safe:

    var hour = parseInt( selectedHour.value, 10 );
    var minute = parseInt( selectedMinute.value, 10 );

As of ES5, parseInt(..) no longer guesses octal. Unless you say otherwise, it assumes base-10 (or base-16 for "0x" prefixes). That’s much nicer. Just be careful if your code has to run in pre-ES5 environments, in which case you still need to pass 10 for the radix.
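The following sketch shows how the radix argument changes the outcome in an ES5-or-later engine. The input strings and variable names are arbitrary examples.

```javascript
var hour = parseInt( "08", 10 );            // 8
var minute = parseInt( "09", 10 );          // 9

var hexGuessed = parseInt( "0x1f" );        // 31 (the "0x" prefix selects base-16)
var hexExplicit = parseInt( "0x1f", 16 );   // 31
var hexAsDecimal = parseInt( "0x1f", 10 );  // 0 (parsing stops at "x")

var octalExplicit = parseInt( "777", 8 );   // 511 (octal only when asked for)
```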

Parsing Non-Strings

One somewhat infamous example of parseInt(..)’s behavior is highlighted in a sarcastic joke post a few years ago, poking fun at this JS behavior:

    parseInt( 1/0, 19 ); // 18

The assumptive (but totally invalid) assertion was, “If I pass in Infinity, and parse an integer out of that, I should get Infinity back, not 18.” Surely, JS must be crazy for this outcome, right?

Though this example is obviously contrived and unreal, let’s indulge the madness for a moment and examine whether JS really is that crazy.

First off, the most obvious sin committed here is to pass a non-string to parseInt(..). That’s a no-no. Do it and you’re asking for trouble. But even if you do, JS politely coerces what you pass in into a string that it can try to parse.

Some would argue that this is unreasonable behavior, and that parseInt(..) should refuse to operate on a non-string value. Should it perhaps throw an error? That would be very Java-like, frankly. I shudder at thinking JS should start throwing errors all over the place so that try..catch is needed around almost every line.

Should it return NaN? Maybe. But… what about:

    parseInt( new String( "42" ) );

Should that fail, too? It’s a non-string value. If you want that String object wrapper to be unboxed to "42", then is it really so unusual for 42 to first become "42" so that 42 can be parsed back out?

I would argue that this half-explicit, half-implicit coercion that can occur can often be a very helpful thing. For example:

    var a = {
        num: 21,
        toString: function() { return String( this.num * 2 ); }
    };

    parseInt( a ); // 42

The fact that parseInt(..) forcibly coerces its value to a string to perform the parse on is quite sensible. If you pass in garbage, and you get garbage back out, don’t blame the trash can — it just did its job faithfully.

So, if you pass in a value like Infinity (the result of 1 / 0 obviously), what sort of string representation would make the most sense for its coercion? Only two reasonable choices come to mind: "Infinity" and "∞". JS chose "Infinity". I’m glad it did.

I think it’s a good thing that all values in JS have some sort of default string representation, so that they aren’t mysterious black boxes that we can’t debug and reason about.

Now, what about base-19? Obviously, completely bogus and contrived. No real JS programs use base-19. It’s absurd. But again, let’s indulge the ridiculousness. In base-19, the valid numeric characters are 0 - 9 and a - i (case insensitive).

So, back to our parseInt( 1/0, 19 ) example. It’s essentially parseInt( "Infinity", 19 ). How does it parse? The first character is "I", which is value 18 in the silly base-19. The second character "n" is not in the valid set of numeric characters, and as such the parsing simply politely stops, just like when it ran across "p" in "42px".

The result? 18. Exactly like it sensibly should be. The behaviors involved to get us there, and not to an error or to Infinity itself, are very important to JS, and should not be so easily discarded.
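The same walk-through can be reproduced one step at a time. This is only a sketch of the reasoning above, with illustrative variable names.

```javascript
// Step 1: the non-string argument is coerced to a string.
var infinityString = String( 1 / 0 );           // "Infinity"

// Step 2: characters are read left to right as base-19 digits.
var firstChar = parseInt( "I", 19 );            // 18 ("i" is the highest base-19 digit)
var secondChar = parseInt( "n", 19 );           // NaN ("n" is not a base-19 digit)

// Putting it together: "I" is parsed, and parsing stops at "n".
var result = parseInt( infinityString, 19 );    // 18

// Going the other way, 18 written in base-19 is "i".
var backToString = (18).toString( 19 );         // "i"
```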

Other examples of this behavior with parseInt(..) that may be surprising but are quite sensible include:

    parseInt( 0.000008 );       // 0   ("0" from "0.000008")
    parseInt( 0.0000008 );      // 8   ("8" from "8e-7")
    parseInt( false, 16 );      // 250 ("fa" from "false")
    parseInt( parseInt, 16 );   // 15  ("f" from "function..")

    parseInt( "0x10" );         // 16
    parseInt( "103", 2 );       // 2

parseInt(..) is actually pretty predictable and consistent in its behavior. If you use it correctly, you’ll get sensible results. If you use it incorrectly, the crazy results you get are not the fault of JavaScript.

Explicitly: * —> Boolean

Now, let’s examine coercing from any non-boolean value to a boolean.

Just like with String(..) and Number(..) above, Boolean(..) (without the new, of course!) is an explicit way of forcing the ToBoolean coercion:

    var a = "0";
    var b = [];
    var c = {};

    var d = "";
    var e = 0;
    var f = null;
    var g;

    Boolean( a ); // true
    Boolean( b ); // true
    Boolean( c ); // true

    Boolean( d ); // false
    Boolean( e ); // false
    Boolean( f ); // false
    Boolean( g ); // false

While Boolean(..) is clearly explicit, it’s not at all common or idiomatic.

Just like the unary + operator coerces a value to a number (see above), the unary ! negate operator explicitly coerces a value to a boolean. The problem is that it also flips the value from truthy to falsy or vice versa. So, the most common way JS developers explicitly coerce to boolean is to use the !! double-negate operator, because the second ! will flip the parity back to the original:

    var a = "0";
    var b = [];
    var c = {};

    var d = "";
    var e = 0;
    var f = null;
    var g;

    !!a; // true
    !!b; // true
    !!c; // true

    !!d; // false
    !!e; // false
    !!f; // false
    !!g; // false

Any of these ToBoolean coercions would happen implicitly without the Boolean(..) or !!, if used in a boolean context such as an if (..) .. statement. But the goal here is to explicitly force the value to a boolean to make it clearer that the ToBoolean coercion is intended.
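To contrast the two styles side by side, here is a small sketch. The variables items and label are made up for illustration.

```javascript
var items = [];
var label = "";

// Implicit: the `if` test runs ToBoolean on the value behind the scenes.
var implicitResult = "falsy";
if (items) {
    implicitResult = "truthy";      // an empty array is still truthy
}

// Explicit: the conversion is visible, and the stored value is a real boolean.
var hasLabel = !!label;             // false
var hasItems = Boolean( items );    // true

typeof hasLabel;    // "boolean"
```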

Another example use-case for explicit ToBoolean coercion is if you want to force a true/false value coercion in the JSON serialization of a data structure:

    var a = [
        1,
        function(){ /*..*/ },
        2,
        function(){ /*..*/ }
    ];

    JSON.stringify( a ); // "[1,null,2,null]"

    JSON.stringify( a, function(key,val){
        if (typeof val == "function") {
            // force `ToBoolean` coercion of the function
            return !!val;
        }
        else {
            return val;
        }
    } );
    // "[1,true,2,true]"

If you come to JavaScript from Java, you may recognize this idiom:

    var a = 42;

    var b = a ? true : false;

The ? : ternary operator will test a for truthiness, and based on that test will either assign true or false to b, accordingly.

On its surface, this idiom looks like a form of explicit ToBoolean-type coercion, since it’s obvious that only either true or false come out of the operation.

However, there’s a hidden implicit coercion, in that the a expression has to first be coerced to boolean to perform the truthiness test. I’d call this idiom “explicitly implicit.” Furthermore, I’d suggest you should avoid this idiom completely in JavaScript. It offers no real benefit, and worse, masquerades as something it’s not.

Boolean(a) and !!a are far better as explicit coercion options.