Unicode codepoint escapes in strings
ES6 introduces escapes that allow users to represent a Unicode codepoint using just a single escape.
As an example, consider the need to escape a string that contains the character ‘𠮷’. In UTF-16/UCS2, ‘𠮷’ is represented as a surrogate pair, meaning that it’s encoded using a pair of 16-bit code units of values, specifically 0xD842
and 0xDFB7
. Previously this meant that you’d have to escape the codepoint as "\uD842\uDFB7"
. This has the major downside that it’s difficult to discern two independent characters from a surrogate pair.
With ES6’s codepoint escapes, you can cleanly represent that exact character in strings and template strings with a single escape: "\u{20bb7}"
. TypeScript will emit the string in ES3/ES5 as "\uD842\uDFB7"
.