This page lists the regular expression syntax accepted by RE2. It also lists syntax accepted by PCRE, PERL, and VIM.
Expressions marked (NOT SUPPORTED) are not supported by RE2.
Feature | ICU | PCRE | TRE | RE2 | Boost.Regex |
---|---|---|---|---|---|
“+” quantifier | o | o | o | o | o |
Negated character classes | o | o | o | o | o |
Non-greedy quantifiers | o | o | o | o | o |
Non-capturing groups | o | o | o | o | o |
Recursion | x | o | x | x | o |
Look-ahead | o | o | x | x | o |
Look-behind | o | o | x | x | o |
Backreferences | o | o | o | x | o |
>9 indexable captures | o | o | x | o | o |
Flags modifiers | o | o | o | o | o |
Conditionals | x | o | x | x | o |
Independent sub-expressions | o | o | x | x | o |
Named capture | x | o | x | o | o |
Comments | o | o | o | x | o |
Embedded code | x | o | x | x | x |
Unicode property support | o | o | ? | Some | Some |
Balancing groups | x | x | x | x | x |
Variable-length look-behinds | x | x | x | x | x |
kinds of single-character expressions | examples |
---|---|
any character, possibly including newline (s=true) |
.
|
character class |
[xyz]
|
negated character class |
[^xyz]
|
Perl character class (link) |
\d
|
negated Perl character class |
\D
|
ASCII character class (link) |
[[:alpha:]]
|
negated ASCII character class |
[[:^alpha:]]
|
Unicode character class (one-letter name) |
\pN
|
Unicode character class |
\p{Greek}
|
negated Unicode character class (one-letter name) |
\PN
|
negated Unicode character class |
\P{Greek}
|
Composites | |
---|---|
xy
|
x followed by y
|
x|y
|
x or y (prefer x )
|
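The "prefer x" rule for alternation is easiest to see with overlapping alternatives. The following is a minimal sketch using Go's standard `regexp` package, which documents its syntax as the one accepted by RE2; the helper name `firstMatch` and the sample inputs are illustrative assumptions, not part of this page.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text ("" if none).
// It is a hypothetical helper used only for this illustration.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// x|y prefers x: the first alternative wins even though "ab" is longer.
	fmt.Println(firstMatch(`a|ab`, "ab")) // a
	// Swapping the alternatives changes which text is matched.
	fmt.Println(firstMatch(`ab|a`, "ab")) // ab
	// xy: x followed by y.
	fmt.Println(firstMatch(`ab`, "xxabxx")) // ab
}
```

Ordering alternatives from most to least specific is therefore a design choice the pattern author has to make; the engine does not pick the longest alternative on its own in this mode.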
Repetitions | |
---|---|
x*
|
zero or more x , prefer more
|
x+
|
one or more x , prefer more
|
x?
|
zero or one x , prefer one
|
x{n,m}
|
n or n +1 or … or m x , prefer more
|
x{n,}
|
n or more x , prefer more
|
x{n}
|
exactly n x
|
x*?
|
zero or more x , prefer fewer
|
x+?
|
one or more x , prefer fewer
|
x??
|
zero or one x , prefer zero
|
x{n,m}?
|
n or n +1 or … or m x , prefer fewer
|
x{n,}?
|
n or more x , prefer fewer
|
x{n}?
|
exactly n x
|
x{}
|
(≡ x* ) (NOT SUPPORTED) VIM
|
x{-}
|
(≡ x*? ) (NOT SUPPORTED) VIM
|
x{-n}
|
(≡ x{n}? ) (NOT SUPPORTED) VIM
|
x=
|
(≡ x? ) (NOT SUPPORTED) VIM
|
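The difference between "prefer more" and "prefer fewer" can be sketched with Go's standard `regexp` package, which accepts RE2 syntax. The helper `leftmost` and the sample strings are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// leftmost returns the leftmost match of pattern in text ("" if none).
// Hypothetical helper for illustration only.
func leftmost(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	tag := "<b>bold</b>"
	fmt.Println(leftmost(`<.*>`, tag))  // x* prefers more: <b>bold</b>
	fmt.Println(leftmost(`<.*?>`, tag)) // x*? prefers fewer: <b>

	run := "aaaaaa"
	fmt.Println(leftmost(`a{2,4}`, run))  // prefers more: aaaa
	fmt.Println(leftmost(`a{2,4}?`, run)) // prefers fewer: aa
	fmt.Println(leftmost(`a{3}`, run))    // exactly three: aaa
	fmt.Println(leftmost(`ab??`, "ab"))   // x?? prefers zero: a
}
```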
Implementation restriction: The counting forms x{n,m}, x{n,}, and x{n} reject forms that create a minimum or maximum repetition count above 1000. Unlimited repetitions are not subject to this restriction.
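A quick way to observe the repetition limit is to try compiling patterns on either side of it. This sketch assumes Go's standard `regexp` package, which applies the same 1000-count restriction; the helper name `compiles` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether the pattern is accepted by Go's regexp package.
// Hypothetical helper for illustration only.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`a{1000}`))   // at the limit: accepted
	fmt.Println(compiles(`a{1001}`))   // above the limit: rejected
	fmt.Println(compiles(`a{2,1001}`)) // maximum above the limit: rejected
	fmt.Println(compiles(`a{1000,}`))  // minimum at the limit, no maximum: accepted
	fmt.Println(compiles(`a*`))        // unlimited repetition: accepted
}
```

Checking the error from compilation, rather than assuming a pattern is valid, is the safer habit when counts come from user input.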
Possessive repetitions | |
---|---|
x*+
|
zero or more x , possessive (NOT SUPPORTED)
|
x++
|
one or more x , possessive (NOT SUPPORTED)
|
x?+
|
zero or one x , possessive (NOT SUPPORTED)
|
x{n,m}+
|
n or … or m x , possessive (NOT SUPPORTED)
|
x{n,}+
|
n or more x , possessive (NOT SUPPORTED)
|
x{n}+
|
exactly n x , possessive (NOT SUPPORTED)
|
Grouping | |
---|---|
(re)
|
numbered capturing group (submatch) |
(?P<name>re)
|
named & numbered capturing group (submatch) |
(?<name>re)
|
named & numbered capturing group (submatch) (NOT SUPPORTED) |
(?'name're)
|
named & numbered capturing group (submatch) (NOT SUPPORTED) |
(?:re)
|
non-capturing group |
(?flags)
|
set flags within current group; non-capturing |
(?flags:re)
|
set flags during re; non-capturing |
(?#text)
|
comment (NOT SUPPORTED) |
(?|x|y|z)
|
branch numbering reset (NOT SUPPORTED) |
(?>re)
|
possessive match of re (NOT SUPPORTED)
|
re@>
|
possessive match of re (NOT SUPPORTED) VIM
|
%(re)
|
non-capturing group (NOT SUPPORTED) VIM |
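The supported grouping forms can be combined in one pattern. The sketch below uses Go's standard `regexp` package (RE2 syntax); the pattern `dateRe`, the helper `field`, and the sample text are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateRe mixes named groups, a non-capturing group, and a plain numbered group.
var dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})(?:-(\d{2}))?`)

// field returns the text captured by the named group, or "" if there is
// no match or no such group. Hypothetical helper for illustration only.
func field(text, name string) string {
	m := dateRe.FindStringSubmatch(text)
	i := dateRe.SubexpIndex(name)
	if m == nil || i < 0 {
		return ""
	}
	return m[i]
}

func main() {
	m := dateRe.FindStringSubmatch("released 2024-05-17")
	fmt.Println(m[0], m[1], m[2], m[3]) // 2024-05-17 2024 05 17

	// Named groups are also numbered; (?:...) does not get a number.
	fmt.Println(dateRe.NumSubexp())            // 3
	fmt.Printf("%q\n", dateRe.SubexpNames())   // ["" "year" "month" ""]
	fmt.Println(field("released 2024-05", "month")) // 05
}
```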
Flags | |
---|---|
i
|
case-insensitive (default false) |
m
|
multi-line mode: ^ and $ match begin/end line in addition to begin/end text (default false)
|
s
|
let . match \n (default false)
|
U
|
ungreedy: swap meaning of x* and x*? , x+ and x+? , etc (default false)
|
Flag syntax is xyz (set) or -xyz (clear) or xy-z (set xy, clear z).
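The flag forms (?flags) and (?flags:re) can be tried out directly. This is a small sketch using Go's standard `regexp` package, which accepts the same flag letters; the helper `find` and the inputs are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// find returns the leftmost match of pattern in text ("" if none).
// Hypothetical helper for illustration only.
func find(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	fmt.Println(find(`(?i)hello`, "Say HELLO"))  // i: HELLO
	fmt.Printf("%q\n", find(`a.b`, "a\nb"))      // default: . does not match \n, ""
	fmt.Printf("%q\n", find(`(?s)a.b`, "a\nb"))  // s: "a\nb"
	fmt.Println(find(`(?m)^b$`, "a\nb\nc"))      // m: b
	fmt.Println(find(`(?U)a+`, "aaa"))           // U: a+ now prefers fewer, a
	fmt.Println(find(`(?U)a+?`, "aaa"))          // U: a+? now prefers more, aaa
	fmt.Println(find(`(?i:a)B`, "AB"))           // flag scoped to the group: AB
	fmt.Printf("%q\n", find(`(?i)a(?-i)B`, "Ab")) // i cleared before B: ""
}
```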
Empty strings | |
---|---|
^
|
at beginning of text or line (m =true)
|
$
|
at end of text (like \z not \Z ) or line (m =true)
|
\A
|
at beginning of text |
\b
|
at ASCII word boundary (\w on one side and \W , \A , or \z on the other)
|
\B
|
not at ASCII word boundary |
\g
|
at beginning of subtext being searched (NOT SUPPORTED) PCRE |
\G
|
at end of last match (NOT SUPPORTED) PERL |
\Z
|
at end of text, or before newline at end of text (NOT SUPPORTED) |
\z
|
at end of text |
(?=re)
|
before text matching re (NOT SUPPORTED)
|
(?!re)
|
before text not matching re (NOT SUPPORTED)
|
(?<=re)
|
after text matching re (NOT SUPPORTED)
|
(?<!re)
|
after text not matching re (NOT SUPPORTED)
|
re&
|
before text matching re (NOT SUPPORTED) VIM
|
re@=
|
before text matching re (NOT SUPPORTED) VIM
|
re@!
|
before text not matching re (NOT SUPPORTED) VIM
|
re@<=
|
after text matching re (NOT SUPPORTED) VIM
|
re@<!
|
after text not matching re (NOT SUPPORTED) VIM
|
\zs
|
sets start of match (= \K) (NOT SUPPORTED) VIM |
\ze
|
sets end of match (NOT SUPPORTED) VIM |
\%^
|
beginning of file (NOT SUPPORTED) VIM |
\%$
|
end of file (NOT SUPPORTED) VIM |
\%V
|
on screen (NOT SUPPORTED) VIM |
\%#
|
cursor position (NOT SUPPORTED) VIM |
\%'m
|
mark m position (NOT SUPPORTED) VIM
|
\%23l
|
in line 23 (NOT SUPPORTED) VIM |
\%23c
|
in column 23 (NOT SUPPORTED) VIM |
\%23v
|
in virtual column 23 (NOT SUPPORTED) VIM |
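The supported empty-width assertions behave differently with and without the m flag, which a few matches make concrete. The sketch uses Go's standard `regexp` package (RE2 syntax); `findAll` and the sample text are assumptions for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll returns every non-overlapping match of pattern in text.
// Hypothetical helper for illustration only.
func findAll(pattern, text string) []string {
	return regexp.MustCompile(pattern).FindAllString(text, -1)
}

func main() {
	lines := "one\ntwo\n"
	fmt.Println(findAll(`^\w+`, lines))     // ^ at beginning of text only: [one]
	fmt.Println(findAll(`(?m)^\w+`, lines)) // ^ at each line start: [one two]
	// Without m, $ behaves like \z, so a trailing newline prevents a match.
	fmt.Println(findAll(`\w+$`, lines))     // []
	fmt.Println(findAll(`(?m)\w+$`, lines)) // [one two]
	fmt.Println(findAll(`\A\w+`, lines))    // \A ignores the m flag: [one]

	// \b is an ASCII word boundary.
	fmt.Println(findAll(`\bcat\b`, "cat concat cats")) // [cat]
}
```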
Escape sequences | |
---|---|
\a
|
bell (≡ \007 )
|
\f
|
form feed (≡ \014 )
|
\t
|
horizontal tab (≡ \011 )
|
\n
|
newline (≡ \012 )
|
\r
|
carriage return (≡ \015 )
|
\v
|
vertical tab character (≡ \013 )
|
\*
|
literal * , for any punctuation character *
|
\123
|
octal character code (up to three digits) |
\x7F
|
hex character code (exactly two digits) |
\x{10FFFF}
|
hex character code |
\C
|
match a single byte even in UTF-8 mode |
\Q…\E
|
literal text … even if … has punctuation
|
\1
|
backreference (NOT SUPPORTED) |
\b
|
backspace (NOT SUPPORTED) (use \010 )
|
\cK
|
control char ^K (NOT SUPPORTED) (use \001 etc)
|
\e
|
escape (NOT SUPPORTED) (use \033 )
|
\g1
|
backreference (NOT SUPPORTED) |
\g{1}
|
backreference (NOT SUPPORTED) |
\g{+1}
|
backreference (NOT SUPPORTED) |
\g{-1}
|
backreference (NOT SUPPORTED) |
\g{name}
|
named backreference (NOT SUPPORTED) |
\g<name>
|
subroutine call (NOT SUPPORTED) |
\g'name'
|
subroutine call (NOT SUPPORTED) |
\k<name>
|
named backreference (NOT SUPPORTED) |
\k'name'
|
named backreference (NOT SUPPORTED) |
\lX
|
lowercase X (NOT SUPPORTED)
|
\ux
|
uppercase x (NOT SUPPORTED)
|
\L…\E
|
lowercase text … (NOT SUPPORTED)
|
\K
|
reset beginning of $0 (NOT SUPPORTED)
|
\N{name}
|
named Unicode character (NOT SUPPORTED) |
\R
|
line break (NOT SUPPORTED) |
\U…\E
|
upper case text … (NOT SUPPORTED)
|
\X
|
extended Unicode sequence (NOT SUPPORTED) |
\%d123
|
decimal character 123 (NOT SUPPORTED) VIM |
\%xFF
|
hex character FF (NOT SUPPORTED) VIM |
\%o123
|
octal character 123 (NOT SUPPORTED) VIM |
\%u1234
|
Unicode character 0x1234 (NOT SUPPORTED) VIM |
\%U12345678
|
Unicode character 0x12345678 (NOT SUPPORTED) VIM |
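The supported escapes can be checked against literal text. The following sketch uses Go's standard `regexp` package, which accepts these RE2 escapes; the helper `fullMatch` (which wraps a pattern in \A and \z) and the inputs are illustrative assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches all of text.
// Hypothetical helper for illustration only.
func fullMatch(pattern, text string) bool {
	return regexp.MustCompile(`\A(?:` + pattern + `)\z`).MatchString(text)
}

func main() {
	// \Q...\E treats everything in between as literal text.
	fmt.Println(fullMatch(`\Qa.b*c\E`, "a.b*c")) // true
	fmt.Println(fullMatch(`\Qa.b*c\E`, "aXbbc")) // false
	// Escaping each punctuation character is equivalent.
	fmt.Println(fullMatch(`a\.b\*c`, "a.b*c")) // true
	// Hex (two digits), octal, and braced hex character codes.
	fmt.Println(fullMatch(`\x41\101\x{263A}`, "AA\u263A")) // true
	// \t is a tab and \012 is a newline.
	fmt.Println(fullMatch(`\t\012`, "\t\n")) // true
}
```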
Character class elements | |
---|---|
x
|
single character |
A-Z
|
character range (inclusive) |
\d
|
Perl character class |
[:foo:]
|
ASCII character class foo
|
\p{Foo}
|
Unicode character class Foo
|
\pF
|
Unicode character class F (one-letter name)
|
Named character classes as character class elements | |
---|---|
[\d]
|
digits (≡ \d )
|
[^\d]
|
not digits (≡ \D )
|
[\D]
|
not digits (≡ \D )
|
[^\D]
|
not not digits (≡ \d )
|
[[:name:]]
|
named ASCII class inside character class (≡ [:name:] )
|
[^[:name:]]
|
named ASCII class inside negated character class (≡ [:^name:] )
|
[\p{Name}]
|
named Unicode property inside character class (≡ \p{Name} )
|
[^\p{Name}]
|
named Unicode property inside negated character class (≡ \P{Name} )
|
Perl character classes (all ASCII-only) | |
---|---|
\d
|
digits (≡ [0-9] )
|
\D
|
not digits (≡ [^0-9] )
|
\s
|
whitespace (≡ [\t\n\f\r ] )
|
\S
|
not whitespace (≡ [^\t\n\f\r ] )
|
\w
|
word characters (≡ [0-9A-Za-z_] )
|
\W
|
not word characters (≡ [^0-9A-Za-z_] )
|
\h
|
horizontal space (NOT SUPPORTED) |
\H
|
not horizontal space (NOT SUPPORTED) |
\v
|
vertical space (NOT SUPPORTED) |
\V
|
not vertical space (NOT SUPPORTED) |
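Because the Perl classes are ASCII-only, non-ASCII digits and letters need a Unicode class instead. A minimal sketch, assuming Go's standard `regexp` package (RE2 syntax); the helper `fullMatch` and the sample characters are assumptions for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches all of text.
// Hypothetical helper for illustration only.
func fullMatch(pattern, text string) bool {
	return regexp.MustCompile(`\A(?:` + pattern + `)\z`).MatchString(text)
}

func main() {
	arabicThree := "\u0663" // ARABIC-INDIC DIGIT THREE
	fmt.Println(fullMatch(`\d`, "3"))             // true
	fmt.Println(fullMatch(`\d`, arabicThree))     // false: \d is [0-9]
	fmt.Println(fullMatch(`\p{Nd}`, arabicThree)) // true: Unicode decimal number

	cafe := "caf\u00e9" // ends in e with acute accent
	fmt.Println(fullMatch(`\w+`, cafe))      // false: \w is [0-9A-Za-z_]
	fmt.Println(fullMatch(`[\w\pL]+`, cafe)) // true: add Unicode letters

	// \s does not include vertical tab; [[:space:]] does.
	fmt.Println(fullMatch(`\s`, "\v"))          // false
	fmt.Println(fullMatch(`[[:space:]]`, "\v")) // true
}
```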
ASCII character classes | |
---|---|
[[:alnum:]]
|
alphanumeric (≡ [0-9A-Za-z] )
|
[[:alpha:]]
|
alphabetic (≡ [A-Za-z] )
|
[[:ascii:]]
|
ASCII (≡ [\x00-\x7F] )
|
[[:blank:]]
|
blank (≡ [\t ] )
|
[[:cntrl:]]
|
control (≡ [\x00-\x1F\x7F] )
|
[[:digit:]]
|
digits (≡ [0-9] )
|
[[:graph:]]
|
graphical (≡ [!-~] )
|
[[:lower:]]
|
lower case (≡ [a-z] )
|
[[:print:]]
|
printable (≡ [ -~] == [ [:graph:]] )
|
[[:punct:]]
|
punctuation (≡ [!-/:-@[-`{-~] )
|
[[:space:]]
|
whitespace (≡ [\t\n\v\f\r ] )
|
[[:upper:]]
|
upper case (≡ [A-Z] )
|
[[:word:]]
|
word characters (≡ [0-9A-Za-z_] )
|
[[:xdigit:]]
|
hex digit (≡ [0-9A-Fa-f] )
|
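ASCII classes are written inside a bracket expression and can be negated or combined with other elements. The sketch below uses Go's standard `regexp` package (RE2 syntax); `findAll` and the inputs are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll returns every non-overlapping match of pattern in text.
// Hypothetical helper for illustration only.
func findAll(pattern, text string) []string {
	return regexp.MustCompile(pattern).FindAllString(text, -1)
}

func main() {
	fmt.Println(findAll(`[[:alpha:]]+`, "abc123-def"))  // [abc def]
	fmt.Println(findAll(`[[:^alpha:]]+`, "abc123-def")) // [123-]
	// Several classes can share one bracket expression.
	fmt.Println(findAll(`[[:digit:][:upper:]]+`, "abcX9Yz")) // [X9Y]
	// [:word:] includes the underscore; [:alnum:] does not.
	fmt.Println(findAll(`[[:word:]]+`, "a-b_c"))  // [a b_c]
	fmt.Println(findAll(`[^[:alnum:]]`, "a-b_c")) // [- _]
}
```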
Unicode character class names–general category | |
---|---|
C
|
other |
Cc
|
control |
Cf
|
format |
Cn
|
unassigned code points (NOT SUPPORTED) |
Co
|
private use |
Cs
|
surrogate |
L
|
letter |
LC
|
cased letter (NOT SUPPORTED) |
L&
|
cased letter (NOT SUPPORTED) |
Ll
|
lowercase letter |
Lm
|
modifier letter |
Lo
|
other letter |
Lt
|
titlecase letter |
Lu
|
uppercase letter |
M
|
mark |
Mc
|
spacing mark |
Me
|
enclosing mark |
Mn
|
non-spacing mark |
N
|
number |
Nd
|
decimal number |
Nl
|
letter number |
No
|
other number |
P
|
punctuation |
Pc
|
connector punctuation |
Pd
|
dash punctuation |
Pe
|
close punctuation |
Pf
|
final punctuation |
Pi
|
initial punctuation |
Po
|
other punctuation |
Ps
|
open punctuation |
S
|
symbol |
Sc
|
currency symbol |
Sk
|
modifier symbol |
Sm
|
math symbol |
So
|
other symbol |
Z
|
separator |
Zl
|
line separator |
Zp
|
paragraph separator |
Zs
|
space separator |
Unicode character class names–scripts |
---|
Ahom
|
Anatolian_Hieroglyphs
|
Arabic
|
Armenian
|
Avestan
|
Balinese
|
Bamum
|
Bassa_Vah
|
Batak
|
Bengali
|
Bopomofo
|
Brahmi
|
Braille
|
Buginese
|
Buhid
|
Canadian_Aboriginal
|
Carian
|
Caucasian_Albanian
|
Chakma
|
Cham
|
Cherokee
|
Common
|
Coptic
|
Cuneiform
|
Cypriot
|
Cyrillic
|
Deseret
|
Devanagari
|
Duployan
|
Egyptian_Hieroglyphs
|
Elbasan
|
Ethiopic
|
Georgian
|
Glagolitic
|
Gothic
|
Grantha
|
Greek
|
Gujarati
|
Gurmukhi
|
Han
|
Hangul
|
Hanunoo
|
Hatran
|
Hebrew
|
Hiragana
|
Imperial_Aramaic
|
Inherited
|
Inscriptional_Pahlavi
|
Inscriptional_Parthian
|
Javanese
|
Kaithi
|
Kannada
|
Katakana
|
Kayah_Li
|
Kharoshthi
|
Khmer
|
Khojki
|
Khudawadi
|
Lao
|
Latin
|
Lepcha
|
Limbu
|
Linear_A
|
Linear_B
|
Lisu
|
Lycian
|
Lydian
|
Mahajani
|
Malayalam
|
Mandaic
|
Manichaean
|
Meetei_Mayek
|
Mende_Kikakui
|
Meroitic_Cursive
|
Meroitic_Hieroglyphs
|
Miao
|
Modi
|
Mongolian
|
Mro
|
Multani
|
Myanmar
|
Nabataean
|
New_Tai_Lue
|
Nko
|
Ogham
|
Ol_Chiki
|
Old_Hungarian
|
Old_Italic
|
Old_North_Arabian
|
Old_Permic
|
Old_Persian
|
Old_South_Arabian
|
Old_Turkic
|
Oriya
|
Osmanya
|
Pahawh_Hmong
|
Palmyrene
|
Pau_Cin_Hau
|
Phags_Pa
|
Phoenician
|
Psalter_Pahlavi
|
Rejang
|
Runic
|
Samaritan
|
Saurashtra
|
Sharada
|
Shavian
|
Siddham
|
SignWriting
|
Sinhala
|
Sora_Sompeng
|
Sundanese
|
Syloti_Nagri
|
Syriac
|
Tagalog
|
Tagbanwa
|
Tai_Le
|
Tai_Tham
|
Tai_Viet
|
Takri
|
Tamil
|
Telugu
|
Thaana
|
Thai
|
Tibetan
|
Tifinagh
|
Tirhuta
|
Ugaritic
|
Vai
|
Warang_Citi
|
Yi
|
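General categories and script names are both used through \p and \P. A short sketch with Go's standard `regexp` package follows; note as an assumption that the set of script names Go accepts depends on its own Unicode tables and may differ from the list above. The helper names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll returns every non-overlapping match of pattern in text.
// Hypothetical helper for illustration only.
func findAll(pattern, text string) []string {
	return regexp.MustCompile(pattern).FindAllString(text, -1)
}

// compiles reports whether the pattern is accepted.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	text := "abc \u03b1\u03b2\u03b3 def" // Latin, Greek (alpha beta gamma), Latin
	fmt.Println(findAll(`\p{Greek}+`, text)) // the three Greek letters
	fmt.Println(findAll(`\p{Latin}+`, text)) // [abc def]

	fmt.Println(findAll(`\p{Lu}`, "Go And RE2")) // uppercase letters: [G A R E]
	fmt.Println(findAll(`\pN+`, "v1.23"))        // one-letter name: [1 23]

	// An unknown class name is a compile error rather than an empty class.
	fmt.Println(compiles(`\p{NoSuchScript}`)) // false
}
```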
Vim character classes | |
---|---|
\i
|
identifier character (NOT SUPPORTED) VIM |
\I
|
\i except digits (NOT SUPPORTED) VIM
|
\k
|
keyword character (NOT SUPPORTED) VIM |
\K
|
\k except digits (NOT SUPPORTED) VIM
|
\f
|
file name character (NOT SUPPORTED) VIM |
\F
|
\f except digits (NOT SUPPORTED) VIM
|
\p
|
printable character (NOT SUPPORTED) VIM |
\P
|
\p except digits (NOT SUPPORTED) VIM
|
\s
|
whitespace character (≡ [ \t] ) (NOT SUPPORTED) VIM
|
\S
|
non-white space character (≡ [^ \t] ) (NOT SUPPORTED) VIM
|
\d
|
digits (≡ [0-9] ) VIM
|
\D
|
not \d VIM
|
\x
|
hex digits (≡ [0-9A-Fa-f] ) (NOT SUPPORTED) VIM
|
\X
|
not \x (NOT SUPPORTED) VIM
|
\o
|
octal digits (≡ [0-7] ) (NOT SUPPORTED) VIM
|
\O
|
not \o (NOT SUPPORTED) VIM
|
\w
|
word character VIM |
\W
|
not \w VIM
|
\h
|
head of word character (NOT SUPPORTED) VIM |
\H
|
not \h (NOT SUPPORTED) VIM
|
\a
|
alphabetic (NOT SUPPORTED) VIM |
\A
|
not \a (NOT SUPPORTED) VIM
|
\l
|
lowercase (NOT SUPPORTED) VIM |
\L
|
not lowercase (NOT SUPPORTED) VIM |
\u
|
uppercase (NOT SUPPORTED) VIM |
\U
|
not uppercase (NOT SUPPORTED) VIM |
\_x
|
\x plus newline, for any x (NOT SUPPORTED) VIM
|
\c
|
ignore case (NOT SUPPORTED) VIM |
\C
|
match case (NOT SUPPORTED) VIM |
\m
|
magic (NOT SUPPORTED) VIM |
\M
|
nomagic (NOT SUPPORTED) VIM |
\v
|
verymagic (NOT SUPPORTED) VIM |
\V
|
verynomagic (NOT SUPPORTED) VIM |
\Z
|
ignore differences in Unicode combining characters (NOT SUPPORTED) VIM |
Magic | |
---|---|
(?{code})
|
arbitrary Perl code (NOT SUPPORTED) PERL |
(??{code})
|
postponed arbitrary Perl code (NOT SUPPORTED) PERL |
(?n)
|
recursive call to regexp capturing group n (NOT SUPPORTED)
|
(?+n)
|
recursive call to relative group +n (NOT SUPPORTED)
|
(?-n)
|
recursive call to relative group -n (NOT SUPPORTED)
|
(?C)
|
PCRE callout (NOT SUPPORTED) PCRE |
(?R)
|
recursive call to entire regexp (≡ (?0) ) (NOT SUPPORTED)
|
(?&name)
|
recursive call to named group (NOT SUPPORTED) |
(?P=name)
|
named backreference (NOT SUPPORTED) |
(?P>name)
|
recursive call to named group (NOT SUPPORTED) |
(?(cond)true|false)
|
conditional branch (NOT SUPPORTED) |
(?(cond)true)
|
conditional branch (NOT SUPPORTED) |
(*ACCEPT)
|
make regexps more like Prolog (NOT SUPPORTED) |
(*COMMIT)
|
(NOT SUPPORTED) |
(*F)
|
(NOT SUPPORTED) |
(*FAIL)
|
(NOT SUPPORTED) |
(*MARK)
|
(NOT SUPPORTED) |
(*PRUNE)
|
(NOT SUPPORTED) |
(*SKIP)
|
(NOT SUPPORTED) |
(*THEN)
|
(NOT SUPPORTED) |
(*ANY)
|
set newline convention (NOT SUPPORTED) |
(*ANYCRLF)
|
(NOT SUPPORTED) |
(*CR)
|
(NOT SUPPORTED) |
(*CRLF)
|
(NOT SUPPORTED) |
(*LF)
|
(NOT SUPPORTED) |
(*BSR_ANYCRLF)
|
set \R convention (NOT SUPPORTED) PCRE |
(*BSR_UNICODE)
|
(NOT SUPPORTED) PCRE |
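Constructs marked (NOT SUPPORTED) are rejected when the pattern is compiled rather than silently ignored. The sketch below checks a handful of them with Go's standard `regexp` package, which follows RE2 syntax; the helper `compiles` and the choice of patterns are assumptions for illustration, and the exact error messages are not shown because they are implementation details.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether the pattern is accepted by Go's regexp package.
// Hypothetical helper for illustration only.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	unsupported := []string{
		`a(?=b)`,      // look-ahead
		`(?<=a)b`,     // look-behind
		`(a)\1`,       // backreference
		`(?>a+)b`,     // possessive group
		`a*+`,         // possessive repetition
		`(?R)`,        // recursion
		`(?(1)a|b)`,   // conditional
		`(?#comment)`, // comment
		`(*FAIL)`,     // backtracking control verb
	}
	for _, p := range unsupported {
		fmt.Printf("%-14s accepted=%v\n", p, compiles(p))
	}
	// A supported construct for comparison.
	fmt.Println(compiles(`(?P<n>a+)b`)) // true
}
```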