Chapter 19. Regular Expressions
This chapter gives an overview of the JavaScript API for regular expressions. It assumes that you are roughly familiar with how they work. If you are not, there are many good tutorials on the Web. Two examples are:
- Regular-Expressions.info by Jan Goyvaerts
- JavaScript Regular Expression Enlightenment by Cody Lindley
Regular Expression Syntax
The terms used here closely reflect the grammar in the ECMAScript specification. I sometimes deviate to make things easier to understand.
Atoms: General
The syntax for general atoms is as follows:
- Special characters
- All of the following characters have special meaning:
- \ ^ $ . * + ? ( ) [ ] { } |
You can escape them by prefixing a backslash. For example:
- > /^(ab)$/.test('(ab)')
- false
- > /^\(ab\)$/.test('(ab)')
- true
Additional special characters are:
- Inside a character class […]:
  - -
- Inside a group that starts with a question mark (?…):
  - : = ! < >
The angle brackets are used only by the XRegExp library (see Chapter 30), to name groups.
- Pattern characters
- All characters except the aforementioned special ones match themselves.
- . (dot)
- Matches any JavaScript character (UTF-16 code unit) except line terminators (newline, carriage return, etc.). To really match any character, use [\s\S]. For example:
- > /./.test('\n')
- false
- > /[\s\S]/.test('\n')
- true
- Character escapes (match single characters)
  - Specific control characters include \f (form feed), \n (line feed, newline), \r (carriage return), \t (horizontal tab), and \v (vertical tab).
  - \0 matches the NUL character (\u0000).
  - Any control character: \cA–\cZ.
  - Unicode character escapes: \u0000–\uFFFF (Unicode code units; see Chapter 24).
  - Hexadecimal character escapes: \x00–\xFF.
- Character class escapes (match one of a set of characters)
  - Digits: \d matches any digit (same as [0-9]); \D matches any nondigit (same as [^0-9]).
  - Alphanumeric characters: \w matches any Latin alphanumeric character plus underscore (same as [A-Za-z0-9_]); \W matches all characters not matched by \w.
  - Whitespace: \s matches whitespace characters (space, tab, line feed, carriage return, form feed, all Unicode spaces, etc.); \S matches all nonwhitespace characters.
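The character class escapes listed above can be tried out with a small sketch like the following. The sample characters are arbitrary choices for illustration; the accented letter is included to show that \w is limited to Latin alphanumerics and the underscore.

```javascript
// Each sample is a single-character string that is checked
// against the three character class escapes.
var samples = ['7', 'x', '_', ' ', '\t', '\u00E9'];

samples.forEach(function (ch) {
    console.log(
        JSON.stringify(ch),
        'digit:', /^\d$/.test(ch),
        'word:', /^\w$/.test(ch),
        'space:', /^\s$/.test(ch)
    );
});

// \w only covers [A-Za-z0-9_], so the accented letter U+00E9 is not matched
var accentedIsWord = /^\w$/.test('\u00E9');
// \W is the complement of \w and therefore matches it
var accentedIsNonWord = /^\W$/.test('\u00E9');
// \s also covers Unicode spaces such as the no-break space U+00A0
var nbspIsSpace = /^\s$/.test('\u00A0');
```

If non-Latin letters must count as word characters, a hand-written character class is needed, because \w cannot be widened.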
Atoms: Character Classes
The syntax for character classes is as follows:
- [«charSpecs»] matches any single character that matches at least one of the charSpecs.
- [^«charSpecs»] matches any single character that does not match any of the charSpecs.
The following constructs are all character specifications:
- Source characters match themselves. Most characters are source characters (even many characters that are special elsewhere). Only three characters are not:
- \ ] -
As usual, you escape via a backslash. If you want to match a dash without escaping it, it must be the first character after the opening bracket or the right side of a range, as described shortly.
- Class escapes: Any of the character escapes and character class escapes listed previously are allowed. There is one additional escape:
- Backspace (\b): Outside a character class, \b matches word boundaries. Inside a character class, it matches the control character backspace.
- Ranges comprise a source character or a class escape, followed by a dash (-), followed by a source character or a class escape.
To demonstrate using character classes, this example parses a date formatted in the ISO 8601 standard:
    function parseIsoDate(str) {
        var match = /^([0-9]{4})-([0-9]{2})-([0-9]{2})$/.exec(str);

        // Other ways of writing the regular expression:
        // /^([0-9][0-9][0-9][0-9])-([0-9][0-9])-([0-9][0-9])$/
        // /^(\d\d\d\d)-(\d\d)-(\d\d)$/

        if (!match) {
            throw new Error('Not an ISO date: ' + str);
        }
        console.log('Year: ' + match[1]);
        console.log('Month: ' + match[2]);
        console.log('Day: ' + match[3]);
    }
And here is the interaction:
- > parseIsoDate('2001-12-24')
- Year: 2001
- Month: 12
- Day: 24
Atoms: Groups
The syntax for groups is as follows:
- («pattern») is a capturing group. Whatever is matched by pattern can be accessed via backreferences or as the result of a match operation.
- (?:«pattern») is a noncapturing group. pattern is still matched against the input, but not saved as a capture. Therefore, the group does not have a number you can refer to (e.g., via a backreference).
- \1, \2, and so on are known as backreferences; they refer back to a previously matched group. The number after the backslash can be any integer greater than or equal to 1, but the first digit must not be 0.
In this example, a backreference guarantees the same number of a’s before and after the dash:
- > /^(a+)-\1$/.test('a-a')
- true
- > /^(a+)-\1$/.test('aaa-aaa')
- true
- > /^(a+)-\1$/.test('aa-a')
- false
This example uses a backreference to match an HTML tag (obviously, you should normally use a proper parser to process HTML):
- > var tagName = /<([^>]+)>[^<]*<\/\1>/;
- > tagName.exec('<b>bold</b>')[1]
- 'b'
- > tagName.exec('<strong>text</strong>')[1]
- 'strong'
- > tagName.exec('<strong>text</stron>')
- null
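The difference between capturing and noncapturing groups shows up in the match result. The following sketch uses a made-up URL-like pattern (an assumption for illustration, not a real URL parser) and compares the two kinds of groups:

```javascript
// Capturing group: the protocol becomes element 1 of the match result
var capturing = /^(http|ftp):\/\/(.+)$/;
var m1 = capturing.exec('ftp://example.com');
// m1[1] is 'ftp', m1[2] is 'example.com', m1.length is 3

// Noncapturing group: (?:http|ftp) still groups the alternatives,
// but it does not get a number, so the host moves to element 1
var noncapturing = /^(?:http|ftp):\/\/(.+)$/;
var m2 = noncapturing.exec('ftp://example.com');
// m2[1] is 'example.com', m2.length is 2
```

In both patterns, the group is needed to limit the reach of the disjunction; only the first one records what was matched.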
Quantifiers
Any atom (including character classes and groups) can be followed by a quantifier:
- ? means match never or once.
- * means match zero or more times.
- + means match one or more times.
- {n} means match exactly n times.
- {n,} means match n or more times.
- {n,m} means match at least n, at most m, times.
By default, quantifiers are greedy; that is, they match as much as possible. You can get reluctant matching (as little as possible) by suffixing any of the preceding quantifiers (including the ranges in curly braces) with a question mark (?). For example:
- > '<a> <strong>'.match(/^<(.*)>/)[1] // greedy
- 'a> <strong'
- > '<a> <strong>'.match(/^<(.*?)>/)[1] // reluctant
- 'a'
Thus, .*? is a useful pattern for matching everything until the next occurrence of the following atom. For example, the following is a more compact version of the regular expression for HTML tags just shown (which used [^<]* instead of .*?):
    /<(.+?)>.*?<\/\1>/
Assertions
Assertions, shown in the following list, are checks about the current position in the input:
^ | Matches only at the beginning of the input. |
$ | Matches only at the end of the input. |
\b | Matches only at a word boundary. Don’t confuse with [\b], which matches a backspace. |
\B | Matches only if not at a word boundary. |
(?=«pattern») | Positive lookahead: Matches only if pattern matches what comes next. pattern is used only to look ahead, but otherwise ignored. |
(?!«pattern») | Negative lookahead: Matches only if pattern does not match what comes next. pattern is used only to look ahead, but otherwise ignored. |
This example matches a word boundary via \b:
- > /\bell\b/.test('hello')
- false
- > /\bell\b/.test('ello')
- false
- > /\bell\b/.test('ell')
- true
This example matches the inside of a word via \B:
- > /\Bell\B/.test('ell')
- false
- > /\Bell\B/.test('hell')
- false
- > /\Bell\B/.test('hello')
- true
Note
Lookbehind is not supported. Manually Implementing Lookbehind explains how to implement it manually.
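The two lookahead assertions from the table above have no example yet, so here is a minimal sketch. The input strings are invented for illustration.

```javascript
// Positive lookahead: match digits only if 'px' comes next.
// The 'px' itself is not part of the match.
var beforePx = /\d+(?=px)/;
var pxMatch = beforePx.exec('width: 100px'); // pxMatch[0] is '100'
var emMatch = beforePx.exec('width: 100em'); // null

// Negative lookahead: match 'foo' only if 'bar' does not come next
var fooNotBar = /foo(?!bar)/;
var matchesFoobaz = fooNotBar.test('foobaz'); // true
var matchesFoobar = fooNotBar.test('foobar'); // false
```

Because the lookahead pattern is only checked and not consumed, pxMatch[0] contains the digits alone.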
Disjunction
A disjunction operator (|) separates two alternatives; either of the alternatives must match for the disjunction to match. The alternatives are atoms (optionally including quantifiers).
The operator binds very weakly, so you have to be careful that the alternatives don’t extend too far. For example, the following regular expression matches all strings that either start with aa or end with bb:
- > /^aa|bb$/.test('aaxx')
- true
- > /^aa|bb$/.test('xxbb')
- true
In other words, the disjunction binds more weakly than even ^ and $, and the two alternatives are ^aa and bb$. If you want to match the two strings 'aa' and 'bb', you need parentheses:
- /^(aa|bb)$/
Similarly, if you want to match the strings 'aab' and 'abb':
- /^a(a|b)b$/
Unicode and Regular Expressions
JavaScript’s regular expressions have only very limited support for Unicode. Especially when it comes to code points in the astral planes, you have to be careful. Chapter 24 explains the details.
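To make the astral-plane caveat concrete, the following sketch uses the code point U+1F600, which JavaScript stores as two UTF-16 code units. The surrogate-pair pattern at the end is a common workaround and an assumption of this example, not something this chapter prescribes.

```javascript
// U+1F600 lies in an astral plane: it is stored as a surrogate pair
var astral = '\uD83D\uDE00';
var unitCount = astral.length; // 2

// The dot matches a single code unit, so one dot is not enough
var oneDot = /^.$/.test(astral);   // false
var twoDots = /^..$/.test(astral); // true

// Workaround (assumption): match a surrogate pair explicitly,
// falling back to a single code unit for everything else
var codePoint = /^(?:[\uD800-\uDBFF][\uDC00-\uDFFF]|[\s\S])$/;
var astralIsOne = codePoint.test(astral); // true
var asciiIsOne = codePoint.test('a');     // true
```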
Creating a Regular Expression
You can create a regular expression via either a literal or a constructor and configure how it works via flags.
Literal Versus Constructor
There are two ways to create a regular expression: you can use a literal or the constructor RegExp:
Literal | /xyz/i | Compiled at load time |
Constructor (second argument is optional) | new RegExp('xyz', 'i') | Compiled at runtime |
A literal and a constructor differ in when they are compiled:
- The literal is compiled at load time. The following code will cause an exception when it is evaluated:
    function foo() {
        /[/;
    }
- The constructor compiles the regular expression when it is called. The following code will not cause an exception, but calling foo() will:
    function foo() {
        new RegExp('[');
    }
Thus, you should normally use literals, but you need the constructor if you want to dynamically assemble a regular expression.
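Dynamic assembly could look like the following sketch. The helper wordMatcher is hypothetical, and it assumes that the given words contain no special characters; otherwise, they would have to be escaped first.

```javascript
// Hypothetical helper: builds a case-insensitive regular expression
// that matches any of the given words as a whole word.
// Assumption: the words contain no regex special characters.
function wordMatcher(words) {
    // Inside a string literal, backslashes must be doubled:
    // '\\b' produces the two characters \b in the pattern
    return new RegExp('\\b(?:' + words.join('|') + ')\\b', 'i');
}

var petRegex = wordMatcher(['cat', 'dog']);
var hitsDog = petRegex.test('A Dog barks'); // true (flag /i)
var hitsDogma = petRegex.test('dogma');     // false (no word boundary after 'dog')
```

The doubled backslashes are the main practical difference to a literal, where you would write /\b(?:cat|dog)\b/i.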
Flags
Flags are a suffix of regular expression literals and a parameter of regular expression constructors; they modify the matching behavior of regular expressions. The following flags exist:
Short name | Long name | Description |
g | global | The given regular expression is matched multiple times. Influences several methods, especially replace(). |
i | ignoreCase | Case is ignored when trying to match the given regular expression. |
m | multiline | In multiline mode, the begin operator ^ and the end operator $ match each line, instead of the complete input string. |
The short name is used for literal suffixes and constructor parameters (see examples in the next section). The long name is used for properties of a regular expression that indicate what flags were set during its creation.
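The effect of the flag /m is easiest to see with a two-line string. The following sketch uses an invented input and also combines /m with /g:

```javascript
var text = 'first line\nsecond line';

// Without /m, ^ and $ only match at the start and end of the input
var startNoM = /^second/.test(text);   // false
var endNoM = /first line$/.test(text); // false

// With /m, they match at the start and end of each line
var startM = /^second/m.test(text);    // true
var endM = /first line$/m.test(text);  // true

// Flags can be combined: first word of every line
var firstWords = text.match(/^\w+/gm); // [ 'first', 'second' ]
```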
Instance Properties of Regular Expressions
Regular expressions have the following instance properties:
- Flags: boolean values indicating what flags are set:
  - global: Is flag /g set?
  - ignoreCase: Is flag /i set?
  - multiline: Is flag /m set?
- Data for matching multiple times (flag /g is set):
  - lastIndex is the index where to continue the search next time.
The following is an example of accessing the instance properties for flags:
- > var regex = /abc/i;
- > regex.ignoreCase
- true
- > regex.multiline
- false
Examples of Creating Regular Expressions
In this example, we create the same regular expression first with a literal, then with a constructor, and use the test() method to determine whether it matches a string:
- > /abc/.test('ABC')
- false
- > new RegExp('abc').test('ABC')
- false
In this example, we create a regular expression that ignores case (flag /i):
- > /abc/i.test('ABC')
- true
- > new RegExp('abc', 'i').test('ABC')
- true
RegExp.prototype.test: Is There a Match?
The test() method checks whether a regular expression, regex, matches a string, str:
    regex.test(str)
test() operates differently depending on whether the flag /g is set or not.
If the flag /g is not set, then the method checks whether there is a match somewhere in str. For example:
- > var str = '_x_x';
- > /x/.test(str)
- true
- > /a/.test(str)
- false
If the flag /g is set, then the method returns true as many times as there are matches for regex in str. The property regex.lastIndex contains the index after the last match:
- > var regex = /x/g;
- > regex.lastIndex
- 0
- > regex.test(str)
- true
- > regex.lastIndex
- 2
- > regex.test(str)
- true
- > regex.lastIndex
- 4
- > regex.test(str)
- false
String.prototype.search: At What Index Is There a Match?
The search() method looks for a match with regex within str:
    str.search(regex)
If there is a match, the index where it was found is returned. Otherwise, the result is -1. The properties global and lastIndex of regex are ignored as the search is performed (and lastIndex is not changed).
For example:
- > 'abba'.search(/b/)
- 1
- > 'abba'.search(/x/)
- -1
If the argument of search() is not a regular expression, it is converted to one:
- > 'aaab'.search('^a+b+$')
- 0
RegExp.prototype.exec: Capture Groups
The following method call captures groups while matching regex against str:
    var matchData = regex.exec(str);
If there was no match, matchData is null. Otherwise, matchData is a match result, an array with two additional properties:
- Array elements
  - Element 0 is the match for the complete regular expression (group 0, if you will).
  - Element n ≥ 1 is the capture of group n.
- Properties
  - input is the complete input string.
  - index is the index where the match was found.
First Match (Flag /g Not Set)
If the flag /g is not set, only the first match is returned:
- > var regex = /a(b+)/;
- > regex.exec('_abbb_ab_')
- [ 'abbb',
- 'bbb',
- index: 1,
- input: '_abbb_ab_' ]
- > regex.lastIndex
- 0
All Matches (Flag /g Set)
If the flag /g is set, all matches are returned if you invoke exec() repeatedly. The return value null signals that there are no more matches. The property lastIndex indicates where matching will continue next time:
- > var regex = /a(b+)/g;
- > var str = '_abbb_ab_';
- > regex.exec(str)
- [ 'abbb',
- 'bbb',
- index: 1,
- input: '_abbb_ab_' ]
- > regex.lastIndex
- 5
- > regex.exec(str)
- [ 'ab',
- 'b',
- index: 6,
- input: '_abbb_ab_' ]
- > regex.lastIndex
- 8
- > regex.exec(str)
- null
Here we loop over matches:
    var regex = /a(b+)/g;
    var str = '_abbb_ab_';
    var match;
    while (match = regex.exec(str)) {
        console.log(match[1]);
    }
and we get the following output:
- bbb
- b
String.prototype.match: Capture Groups or Return All Matching Substrings
The following method call matches regex against str:
    var matchData = str.match(regex);
If the flag /g of regex is not set, this method works like RegExp.prototype.exec():
- > 'abba'.match(/a/)
- [ 'a', index: 0, input: 'abba' ]
If the flag is set, then the method returns an array with all matching substrings in str (i.e., group 0 of every match) or null if there is no match:
- > 'abba'.match(/a/g)
- [ 'a', 'a' ]
- > 'abba'.match(/x/g)
- null
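One consequence of the behavior just described is that match() with /g drops the captures. The following sketch, with an invented input, contrasts the two modes and falls back to an exec() loop when the groups of every match are needed:

```javascript
var str = 'a1 b2 c3';

// Without /g: first match only, including its groups
var firstOnly = str.match(/([a-z])(\d)/);   // [ 'a1', 'a', '1', ... ]

// With /g: all matching substrings, but no groups
var allMatches = str.match(/([a-z])(\d)/g); // [ 'a1', 'b2', 'c3' ]

// Groups of every match: invoke exec() repeatedly
var regex = /([a-z])(\d)/g;
var digits = [];
var match;
while ((match = regex.exec(str)) !== null) {
    digits.push(match[2]); // group 2 is the digit
}
// digits is [ '1', '2', '3' ]
```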
String.prototype.replace: Search and Replace
The replace() method searches a string, str, for matches with search and replaces them with replacement:
    str.replace(search, replacement)
There are several ways in which the two parameters can be specified:
- search
- Either a string or a regular expression:
  - String: To be found literally in the input string. Be warned that only the first occurrence of a string is replaced. If you want to replace multiple occurrences, you must use a regular expression with a /g flag. This is unexpected and a major pitfall.
  - Regular expression: To be matched against the input string. Warning: Use the global flag, otherwise only one attempt is made to match the regular expression.
- replacement
- Either a string or a function:
  - String: Describes how to replace what has been found.
  - Function: Computes a replacement and is given matching information via parameters.
Replacement Is a String
If replacement is a string, its content is used verbatim to replace the match. The only exception is the special character dollar sign ($), which starts so-called replacement directives:
- Groups: $n inserts group n from the match. n must be at least 1 ($0 has no special meaning).
- The matching substring:
  - $` (backtick) inserts the text before the match.
  - $& inserts the complete match.
  - $' (apostrophe) inserts the text after the match.
- $$ inserts a single $.
This example refers to the matching substring and its prefix and suffix:
- > 'axb cxd'.replace(/x/g, "[$`,$&,$']")
- 'a[a,x,b cxd]b c[axb c,x,d]d'
This example refers to a group:
- > '"foo" and "bar"'.replace(/"(.*?)"/g, '#$1#')
- '#foo# and #bar#'
Replacement Is a Function
If replacement is a function, it computes the string that is to replace the match. This function has the following signature:
    function (completeMatch, group_1, ..., group_n, offset, inputStr)
completeMatch is the same as $& previously, offset indicates where the match was found, and inputStr is what is being matched against. Thus, you can use the special variable arguments to access groups (group 1 via arguments[1], and so on). For example:
- > function replaceFunc(match) { return 2 * match }
- > '3 apples and 5 oranges'.replace(/[0-9]+/g, replaceFunc)
- '6 apples and 10 oranges'
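The remaining parameters of the signature can be sketched as follows. The helper reformatDates and the input string are made up for illustration; the first call uses the group parameters, the second one the offset.

```javascript
// Hypothetical helper: turns 'YYYY-MM-DD' into 'DD.MM.YYYY'.
// The three groups arrive as parameters 2 to 4.
function reformatDates(str) {
    return str.replace(/(\d{4})-(\d{2})-(\d{2})/g,
        function (completeMatch, year, month, day, offset, inputStr) {
            return day + '.' + month + '.' + year;
        });
}
var input = 'From 2001-12-24 to 2002-01-06';
var reformatted = reformatDates(input);
// 'From 24.12.2001 to 06.01.2002'

// Without groups, offset is the second parameter.
// Returning completeMatch leaves the string unchanged.
var positions = [];
input.replace(/\d{4}-\d{2}-\d{2}/g, function (completeMatch, offset) {
    positions.push(offset);
    return completeMatch;
});
// positions is [ 5, 19 ]
```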
Problems with the Flag /g
Regular expressions whose /g flag is set are problematic if a method invoked on them must be invoked multiple times to return all results. That’s the case for two methods:
- RegExp.prototype.test()
- RegExp.prototype.exec()
Then JavaScript abuses the regular expression as an iterator, as a pointer into the sequence of results. That causes problems:
- Problem 1: /g regular expressions can’t be inlined
- For example:
    // Don’t do that:
    var count = 0;
    while (/a/g.test('babaa')) count++;
The preceding loop is infinite, because a new regular expression is created for each loop iteration, which restarts the iteration over the results. Therefore, the code must be rewritten:
    var count = 0;
    var regex = /a/g;
    while (regex.test('babaa')) count++;
Here is another example:
    // Don’t do that:
    function extractQuoted(str) {
        var match;
        var result = [];
        while ((match = /"(.*?)"/g.exec(str)) != null) {
            result.push(match[1]);
        }
        return result;
    }
Calling the preceding function will again result in an infinite loop. The correct version is (why lastIndex is set to 0 is explained shortly):
    var QUOTE_REGEX = /"(.*?)"/g;
    function extractQuoted(str) {
        QUOTE_REGEX.lastIndex = 0;
        var match;
        var result = [];
        while ((match = QUOTE_REGEX.exec(str)) != null) {
            result.push(match[1]);
        }
        return result;
    }
Using the function:
- > extractQuoted('"hello", "world"')
- [ 'hello', 'world' ]
Tip
It’s a best practice not to inline anyway (then you can give regular expressions descriptive names). But you have to be aware that you can’t do it, not even in quick hacks.
- Problem 2: /g regular expressions as parameters
- Code that wants to invoke test() and exec() multiple times must be careful with a regular expression handed to it as a parameter. Its flag /g must be active and, to be safe, its lastIndex should be set to zero (an explanation is offered in the next example).
- Problem 3: Shared /g regular expressions (e.g., constants)
- Whenever you are referring to a regular expression that has not been freshly created, you should set its lastIndex property to zero, before using it as an iterator (an explanation is offered in the next example). As iteration depends on lastIndex, such a regular expression can’t be used in more than one iteration at the same time.
The following example illustrates problem 2. It is a naive implementation of a function that counts how many matches there are for the regular expression regex in the string str:
    // Naive implementation
    function countOccurrences(regex, str) {
        var count = 0;
        while (regex.test(str)) count++;
        return count;
    }
Here’s an example of using this function:
- > countOccurrences(/x/g, '_x_x')
- 2
The first problem is that this function goes into an infinite loop if the regular expression’s /g flag is not set. For example:
    countOccurrences(/x/, '_x_x') // never terminates
The second problem is that the function doesn’t work correctly if regex.lastIndex isn’t 0, because that property indicates where to start the search. For example:
- > var regex = /x/g;
- > regex.lastIndex = 2;
- > countOccurrences(regex, '_x_x')
- 1
The following implementation fixes the two problems:
    function countOccurrences(regex, str) {
        if (!regex.global) {
            throw new Error('Please set flag /g of regex');
        }
        var origLastIndex = regex.lastIndex; // store
        regex.lastIndex = 0;

        var count = 0;
        while (regex.test(str)) count++;

        regex.lastIndex = origLastIndex; // restore
        return count;
    }
A simpler alternative is to use match():
    function countOccurrences(regex, str) {
        if (!regex.global) {
            throw new Error('Please set flag /g of regex');
        }
        return (str.match(regex) || []).length;
    }
There’s one possible pitfall: str.match() returns null if the /g flag is set and there are no matches. We avoid that pitfall in the preceding code by using [] if the result of match() isn’t truthy.
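Problem 3 (shared /g regular expressions) can be made visible with a small sketch. The constant SHARED and the helper copyRegex are hypothetical; copyRegex relies on the standard property source, which this chapter does not otherwise cover, and rebuilds the flags from the instance properties described earlier.

```javascript
var SHARED = /x/g;

// Two uses of the same regular expression get in each other's way:
var first = SHARED.exec('_x_x');  // match at index 1, lastIndex is now 2
var second = SHARED.exec('xx__'); // search starts at index 2: null
var afterFailure = SHARED.lastIndex; // a failed match resets lastIndex to 0

// Hypothetical helper: creates an independent copy, so that each
// iteration has its own lastIndex
function copyRegex(regex) {
    var flags = (regex.global ? 'g' : '')
        + (regex.ignoreCase ? 'i' : '')
        + (regex.multiline ? 'm' : '');
    return new RegExp(regex.source, flags);
}

var iterA = copyRegex(SHARED);
var iterB = copyRegex(SHARED);
var matchA = iterA.exec('_x_x'); // index 1
var matchB = iterB.exec('xx__'); // index 0, unaffected by iterA
```

Copying costs an extra compilation, so it is a trade-off; resetting lastIndex, as shown previously, is enough when the iterations do not overlap.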
Tips and Tricks
This section gives a few tips and tricks for working with regular expressions in JavaScript.
Quoting Text
Sometimes, when you assemble a regular expression manually, you want to use a given string verbatim. That means that none of the special characters (e.g., *, [) should be interpreted as such; all of them need to be escaped. JavaScript has no built-in means for this kind of quoting, but you can program your own function, quoteText, that would work as follows:
- > console.log(quoteText('*All* (most?) aspects.'))
- \*All\* \(most\?\) aspects\.
Such a function is especially handy if you need to do a search and replace with multiple occurrences. Then the value to search for must be a regular expression with the global flag set. With quoteText(), you can use arbitrary strings. The function looks like this:
    function quoteText(text) {
        return text.replace(/[\\^$.*+?()[\]{}|=!<>:-]/g, '\\$&');
    }
All special characters are escaped, because you may want to quote several characters inside parentheses or square brackets.
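The search-and-replace use case mentioned above could be sketched like this. quoteText() is repeated so that the example is self-contained; the helper replaceAllLiteral is hypothetical.

```javascript
function quoteText(text) {
    return text.replace(/[\\^$.*+?()[\]{}|=!<>:-]/g, '\\$&');
}

// Hypothetical helper: replaces every literal occurrence of search.
// Caveat: a dollar sign in replacement still starts a replacement directive.
function replaceAllLiteral(str, search, replacement) {
    return str.replace(new RegExp(quoteText(search), 'g'), replacement);
}

var input = '1+1=2 and 1+1=2';

// A string as search value only replaces the first occurrence
var onlyFirst = input.replace('1+1', 'x');      // 'x=2 and 1+1=2'

// The quoted regular expression replaces all of them;
// the + is matched literally instead of acting as a quantifier
var all = replaceAllLiteral(input, '1+1', 'x'); // 'x=2 and x=2'
```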
Pitfall: Without an Assertion (e.g., ^, $), a Regular Expression Is Found Anywhere
If you don’t use assertions such as ^
and $
, most regular expression methods find a pattern anywhere. For example:
- > /aa/.test('xaay')
- true
- > /^aa$/.test('xaay')
- false
Matching Everything or Nothing
Matching everything
The empty regular expression matches everything. We can create an instance of RegExp
based on that regular expression like this:
- > new RegExp('').test('dfadsfdsa')
- true
- > new RegExp('').test('')
- true
However, the empty regular expression literal would be //, which is interpreted as a comment by JavaScript. Therefore, the following is the closest you can get via a literal: /(?:)/ (empty noncapturing group). The group matches everything, while not capturing anything, which prevents the group from influencing the result returned by exec(). Even JavaScript itself uses the preceding representation when displaying an empty regular expression:
- > new RegExp('')
- /(?:)/
Matching nothing
The empty regular expression has an inverse—the regular expression that matches nothing:
- > var never = /.^/;
- > never.test('abc')
- false
- > never.test('')
- false
Manually Implementing Lookbehind
Lookbehind is an assertion. Similar to lookahead, a pattern is used to check something about the current position in the input, but otherwise ignored. In contrast to lookahead, the match for the pattern has to end at the current position (not start at it).
The following function replaces each occurrence of the string 'NAME' with the value of the parameter name, but only if the occurrence is not preceded by a quote. We handle the quote by “manually” checking the character before the current match:
    function insertName(str, name) {
        return str.replace(
            /NAME/g,
            function (completeMatch, offset) {
                if (offset === 0 ||
                    (offset > 0 && str[offset - 1] !== '"')) {
                    return name;
                } else {
                    return completeMatch;
                }
            }
        );
    }
- > insertName('NAME "NAME"', 'Jane')
- 'Jane "NAME"'
- > insertName('"NAME" NAME', 'Jane')
- '"NAME" Jane'
An alternative is to include the preceding character in the regular expression, via the character class [^"]. Then you have to temporarily add a prefix to the string you are searching in; otherwise, you’d miss matches at the beginning of that string:
    function insertName(str, name) {
        var tmpPrefix = ' ';
        str = tmpPrefix + str;
        str = str.replace(
            /([^"])NAME/g,
            function (completeMatch, prefix) {
                return prefix + name;
            }
        );
        return str.slice(tmpPrefix.length); // remove tmpPrefix
    }
Regular Expression Cheat Sheet
Atoms (see Atoms: General):
- . (dot) matches everything except line terminators (e.g., newlines). Use [\s\S] to really match everything.
- Character class escapes:
  - \d matches digits ([0-9]); \D matches nondigits ([^0-9]).
  - \w matches Latin alphanumeric characters plus underscore ([A-Za-z0-9_]); \W matches all other characters.
  - \s matches all whitespace characters (space, tab, line feed, etc.); \S matches all nonwhitespace characters.
- Character class (set of characters): […] and [^…]
  - Source characters: [abc] (all characters except \ ] - match themselves)
  - Character class escapes (see previous): [\d\w]
  - Ranges: [A-Za-z0-9]
- Groups:
  - Capturing group: (…); backreference: \1
  - Noncapturing group: (?:…)
Quantifiers (see Quantifiers):
- Greedy: ? * + {n} {n,} {n,m}
- Reluctant: Put a ? after any of the greedy quantifiers.
Assertions (see Assertions):
- Beginning of input, end of input: ^ $
- At a word boundary, not at a word boundary: \b \B
- Positive lookahead: (?=…) (pattern must come next, but is otherwise ignored)
- Negative lookahead: (?!…) (pattern must not come next, but is otherwise ignored)
Disjunction: |
Creating a regular expression (see Creating a Regular Expression):
- Literal: /xyz/i (compiled at load time)
- Constructor: new RegExp('xyz', 'i') (compiled at runtime)
Flags (see Flags):
- global: /g (influences several regular expression methods)
- ignoreCase: /i
- multiline: /m (^ and $ match per line, as opposed to the complete input)
Methods:
- regex.test(str): Is there a match (see RegExp.prototype.test: Is There a Match?)?
  - /g is not set: Is there a match somewhere?
  - /g is set: Return true as many times as there are matches.
- str.search(regex): At what index is there a match (see String.prototype.search: At What Index Is There a Match?)?
- regex.exec(str): Capture groups (see RegExp.prototype.exec: Capture Groups)
  - /g is not set: Capture groups of first match only (invoked once)
  - /g is set: Capture groups of all matches (invoked repeatedly; returns null if there are no more matches)
- str.match(regex): Capture groups or return all matching substrings (see String.prototype.match: Capture Groups or Return All Matching Substrings)
  - /g is not set: Capture groups
  - /g is set: Return all matching substrings in an array
- str.replace(search, replacement): Search and replace (see String.prototype.replace: Search and Replace)
  - search: String or regular expression (use the latter, set /g!)
  - replacement: String (with $1, etc.) or function (arguments[1] is group 1, etc.) that returns a string
For tips on using the flag /g, see Problems with the Flag /g.
Acknowledgments
Mathias Bynens (@mathias) and Juan Ignacio Dopazo (@juandopazo) recommended using match() and test() for counting occurrences, and Šime Vidas (@simevidas) warned me about being careful with match() if there are no matches. The pitfall of the global flag causing infinite loops comes from a talk by Andrea Giammarchi (@webreflection). Claude Pache told me to escape more characters in quoteText().