forked from boostorg/regex
@ -393,7 +393,7 @@
|
|||||||
</h5>
|
</h5>
|
||||||
<p>
|
<p>
|
||||||
A character set is a bracket-expression starting with <code class="literal">[</code>
|
A character set is a bracket-expression starting with <code class="literal">[</code>
|
||||||
and ending with <code class="literal">]</code>, it defines a set of characters, and
|
and ending with <code class="literal"></code>], it defines a set of characters, and
|
||||||
matches any single character that is a member of that set.
|
matches any single character that is a member of that set.
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
@ -568,12 +568,12 @@
|
|||||||
<tr>
|
<tr>
|
||||||
<td>
|
<td>
|
||||||
<p>
|
<p>
|
||||||
<code class="literal">\n</code>
|
<code class="literal"><br> </code>
|
||||||
</p>
|
</p>
|
||||||
</td>
|
</td>
|
||||||
<td>
|
<td>
|
||||||
<p>
|
<p>
|
||||||
<code class="literal">\n</code>
|
<code class="literal"><br> </code>
|
||||||
</p>
|
</p>
|
||||||
</td>
|
</td>
|
||||||
</tr>
|
</tr>
|
||||||
@ -1012,10 +1012,10 @@
|
|||||||
The following escape sequences match the boundaries of words:
|
The following escape sequences match the boundaries of words:
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
<code class="literal">\<</code> Matches the start of a word.
|
<code class="literal"><</code> Matches the start of a word.
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
<code class="literal">\></code> Matches the end of a word.
|
<code class="literal">></code> Matches the end of a word.
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
<code class="literal">\b</code> Matches a word boundary (the start or end of a word).
|
<code class="literal">\b</code> Matches a word boundary (the start or end of a word).
|
||||||
@ -1040,10 +1040,10 @@
|
|||||||
\' Matches at the end of a buffer only.
|
\' Matches at the end of a buffer only.
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
\A Matches at the start of a buffer only (the same as =\`=).
|
\A Matches at the start of a buffer only (the same as <code class="literal">\`</code>).
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
\z Matches at the end of a buffer only (the same as <code class="literal">\\'</code>).
|
\z Matches at the end of a buffer only (the same as <code class="literal">\'</code>).
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
\Z Matches a zero-width assertion consisting of an optional sequence of newlines
|
\Z Matches a zero-width assertion consisting of an optional sequence of newlines
|
||||||
@ -1071,7 +1071,7 @@
|
|||||||
<p>
|
<p>
|
||||||
The escape sequence <code class="literal">\Q</code> begins a "quoted sequence":
|
The escape sequence <code class="literal">\Q</code> begins a "quoted sequence":
|
||||||
all the subsequent characters are treated as literals, until either the end
|
all the subsequent characters are treated as literals, until either the end
|
||||||
of the regular expression or \E is found. For example the expression: <code class="literal">\Q\*+\Ea+</code>
|
of the regular expression or \E is found. For example the expression: <code class="literal">\Q*+\Ea+</code>
|
||||||
would match either of:
|
would match either of:
|
||||||
</p>
|
</p>
|
||||||
<pre class="programlisting"><span class="special">\*+</span><span class="identifier">a</span>
|
<pre class="programlisting"><span class="special">\*+</span><span class="identifier">a</span>
|
||||||
@ -1317,19 +1317,19 @@
|
|||||||
<span class="emphasis"><em>no-pattern</em></span>.
|
<span class="emphasis"><em>no-pattern</em></span>.
|
||||||
</li>
|
</li>
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
=(?(<span class="emphasis"><em>N</em></span>)yes-pattern|no-pattern)= Executes <span class="emphasis"><em>yes-pattern</em></span>
|
<code class="literal">(?(<span class="emphasis"><em>N</em></span>)yes-pattern|no-pattern)</code>
|
||||||
if subexpression <span class="emphasis"><em>N</em></span> has been matched, otherwise executes
|
Executes <span class="emphasis"><em>yes-pattern</em></span> if subexpression <span class="emphasis"><em>N</em></span>
|
||||||
<span class="emphasis"><em>no-pattern</em></span>.
|
|
||||||
</li>
|
|
||||||
<li class="listitem">
|
|
||||||
=(?(<<span class="emphasis"><em>name</em></span>>)yes-pattern|no-pattern)= Executes
|
|
||||||
<span class="emphasis"><em>yes-pattern</em></span> if named subexpression <span class="emphasis"><em>name</em></span>
|
|
||||||
has been matched, otherwise executes <span class="emphasis"><em>no-pattern</em></span>.
|
has been matched, otherwise executes <span class="emphasis"><em>no-pattern</em></span>.
|
||||||
</li>
|
</li>
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
=(?('<span class="emphasis"><em>name</em></span>')yes-pattern|no-pattern)= Executes <span class="emphasis"><em>yes-pattern</em></span>
|
<code class="literal">(?(<<span class="emphasis"><em>name</em></span>>)yes-pattern|no-pattern)</code>
|
||||||
if named subexpression <span class="emphasis"><em>name</em></span> has been matched, otherwise
|
Executes <span class="emphasis"><em>yes-pattern</em></span> if named subexpression <span class="emphasis"><em>name</em></span>
|
||||||
executes <span class="emphasis"><em>no-pattern</em></span>.
|
has been matched, otherwise executes <span class="emphasis"><em>no-pattern</em></span>.
|
||||||
|
</li>
|
||||||
|
<li class="listitem">
|
||||||
|
<code class="literal">(?('<span class="emphasis"><em>name</em></span>')yes-pattern|no-pattern)</code>
|
||||||
|
Executes <span class="emphasis"><em>yes-pattern</em></span> if named subexpression <span class="emphasis"><em>name</em></span>
|
||||||
|
has been matched, otherwise executes <span class="emphasis"><em>no-pattern</em></span>.
|
||||||
</li>
|
</li>
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
<code class="literal">(?(R)yes-pattern|no-pattern)</code> Executes <span class="emphasis"><em>yes-pattern</em></span>
|
<code class="literal">(?(R)yes-pattern|no-pattern)</code> Executes <span class="emphasis"><em>yes-pattern</em></span>
|
||||||
@ -1368,7 +1368,7 @@
|
|||||||
<span class="special">[::]</span> <span class="special">[..]</span></code>
|
<span class="special">[::]</span> <span class="special">[..]</span></code>
|
||||||
</li>
|
</li>
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
Escaped characters <code class="literal">\</code>
|
Escaped characters [^]
|
||||||
</li>
|
</li>
|
||||||
<li class="listitem">
|
<li class="listitem">
|
||||||
Character set (bracket expression) <code class="computeroutput"><span class="special">[]</span></code>
|
Character set (bracket expression) <code class="computeroutput"><span class="special">[]</span></code>
|
||||||
|
@ -198,7 +198,7 @@
|
|||||||
</p>
|
</p>
|
||||||
</div>
|
</div>
|
||||||
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
|
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
|
||||||
<td align="left"><p><small>Last revised: January 31, 2013 at 17:33:20 GMT</small></p></td>
|
<td align="left"><p><small>Last revised: April 20, 2013 at 15:59:03 GMT</small></p></td>
|
||||||
<td align="right"><div class="copyright-footer"></div></td>
|
<td align="right"><div class="copyright-footer"></div></td>
|
||||||
</tr></table>
|
</tr></table>
|
||||||
<hr>
|
<hr>
|
||||||
|
@ -12,7 +12,7 @@
|
|||||||
|
|
||||||
The Perl regular expression syntax is based on that used by the
|
The Perl regular expression syntax is based on that used by the
|
||||||
programming language Perl . Perl regular expressions are the
|
programming language Perl . Perl regular expressions are the
|
||||||
default behavior in Boost.Regex or you can pass the flag =perl= to the
|
default behavior in Boost.Regex or you can pass the flag [^perl] to the
|
||||||
[basic_regex] constructor, for example:
|
[basic_regex] constructor, for example:
|
||||||
|
|
||||||
// e1 is a case sensitive Perl regular expression:
|
// e1 is a case sensitive Perl regular expression:
|
||||||
@ -34,9 +34,9 @@ The single character '.' when used outside of a character set will match
|
|||||||
any single character except:
|
any single character except:
|
||||||
|
|
||||||
* The NULL character when the [link boost_regex.ref.match_flag_type flag
|
* The NULL character when the [link boost_regex.ref.match_flag_type flag
|
||||||
=match_not_dot_null=] is passed to the matching algorithms.
|
[^match_not_dot_null]] is passed to the matching algorithms.
|
||||||
* The newline character when the [link boost_regex.ref.match_flag_type
|
* The newline character when the [link boost_regex.ref.match_flag_type
|
||||||
flag =match_not_dot_newline=] is passed to
|
flag [^match_not_dot_newline]] is passed to
|
||||||
the matching algorithms.
|
the matching algorithms.
|
||||||
|
|
||||||
[h4 Anchors]
|
[h4 Anchors]
|
||||||
@ -47,7 +47,7 @@ A '$' character shall match the end of a line.
|
|||||||
|
|
||||||
[h4 Marked sub-expressions]
|
[h4 Marked sub-expressions]
|
||||||
|
|
||||||
A section beginning =(= and ending =)= acts as a marked sub-expression.
|
A section beginning [^(] and ending [^)] acts as a marked sub-expression.
|
||||||
Whatever matched the sub-expression is split out in a separate field by
|
Whatever matched the sub-expression is split out in a separate field by
|
||||||
the matching algorithms. Marked sub-expressions can also be repeated, or
|
the matching algorithms. Marked sub-expressions can also be repeated, or
|
||||||
referred to by a back-reference.
|
referred to by a back-reference.
|
||||||
@ -58,23 +58,23 @@ A marked sub-expression is useful to lexically group part of a regular
|
|||||||
expression, but has the side-effect of spitting out an extra field in
|
expression, but has the side-effect of spitting out an extra field in
|
||||||
the result. As an alternative you can lexically group part of a
|
the result. As an alternative you can lexically group part of a
|
||||||
regular expression, without generating a marked sub-expression by using
|
regular expression, without generating a marked sub-expression by using
|
||||||
=(?:= and =)= , for example =(?:ab)+= will repeat =ab= without splitting
|
[^(?:] and [^)] , for example [^(?:ab)+] will repeat [^ab] without splitting
|
||||||
out any separate sub-expressions.
|
out any separate sub-expressions.
|
||||||
|
|
||||||
[h4 Repeats]
|
[h4 Repeats]
|
||||||
|
|
||||||
Any atom (a single character, a marked sub-expression, or a character class)
|
Any atom (a single character, a marked sub-expression, or a character class)
|
||||||
can be repeated with the =*=, =+=, =?=, and ={}= operators.
|
can be repeated with the [^*], [^+], [^?], and [^{}] operators.
|
||||||
|
|
||||||
The =*= operator will match the preceding atom zero or more times,
|
The [^*] operator will match the preceding atom zero or more times,
|
||||||
for example the expression =a*b= will match any of the following:
|
for example the expression [^a*b] will match any of the following:
|
||||||
|
|
||||||
b
|
b
|
||||||
ab
|
ab
|
||||||
aaaaaaaab
|
aaaaaaaab
|
||||||
|
|
||||||
The =+= operator will match the preceding atom one or more times, for
|
The [^+] operator will match the preceding atom one or more times, for
|
||||||
example the expression =a+b= will match any of the following:
|
example the expression [^a+b] will match any of the following:
|
||||||
|
|
||||||
ab
|
ab
|
||||||
aaaaaaaab
|
aaaaaaaab
|
||||||
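The `*` and `+` examples above can be replayed with a short sketch. It assumes `std::regex` (ECMAScript grammar) as a stand-in for `boost::regex`, since the two agree on these basic repeat operators; the helper names `matches_whole` and `check_greedy_repeats` are hypothetical and not part of either library.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when the whole of `text` matches `pattern`.
// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
bool matches_whole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// Replays the examples listed in the text above.
void check_greedy_repeats() {
    // a*b: zero or more 'a' followed by 'b'
    assert(matches_whole("a*b", "b"));
    assert(matches_whole("a*b", "ab"));
    assert(matches_whole("a*b", "aaaaaaaab"));
    // a+b: at least one 'a' is required before 'b'
    assert(matches_whole("a+b", "ab"));
    assert(matches_whole("a+b", "aaaaaaaab"));
    assert(!matches_whole("a+b", "b"));
}
```

Calling `check_greedy_repeats()` from a test or from `main` exercises every case; a failed assertion would point at the pattern that disagrees with the description.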
@ -83,7 +83,7 @@ But will not match:
|
|||||||
|
|
||||||
b
|
b
|
||||||
|
|
||||||
The =?= operator will match the preceding atom zero or one times, for
|
The [^?] operator will match the preceding atom zero or one times, for
|
||||||
example the expression ca?b will match any of the following:
|
example the expression ca?b will match any of the following:
|
||||||
|
|
||||||
cb
|
cb
|
||||||
@ -95,11 +95,11 @@ But will not match:
|
|||||||
|
|
||||||
An atom can also be repeated with a bounded repeat:
|
An atom can also be repeated with a bounded repeat:
|
||||||
|
|
||||||
=a{n}= Matches 'a' repeated exactly n times.
|
[^a{n}] Matches 'a' repeated exactly n times.
|
||||||
|
|
||||||
=a{n,}= Matches 'a' repeated n or more times.
|
[^a{n,}] Matches 'a' repeated n or more times.
|
||||||
|
|
||||||
=a{n, m}= Matches 'a' repeated between n and m times inclusive.
|
[^a{n, m}] Matches 'a' repeated between n and m times inclusive.
|
||||||
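The three bounded-repeat forms can be sketched as follows, assuming `std::regex` (ECMAScript grammar) as a stand-in for `boost::regex` and picking n = 2 and m = 4 purely for illustration. The bounds are written without an embedded space, which both grammars accept; `repeat_matches` and `check_bounded_repeats` are hypothetical helper names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when the whole of `text` matches `pattern`.
// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
bool repeat_matches(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

void check_bounded_repeats() {
    // a{2}: exactly two
    assert(repeat_matches("a{2}", "aa"));
    assert(!repeat_matches("a{2}", "aaa"));
    // a{2,}: two or more
    assert(repeat_matches("a{2,}", "aaaaa"));
    assert(!repeat_matches("a{2,}", "a"));
    // a{2,4}: between two and four, inclusive
    assert(repeat_matches("a{2,4}", "aaaa"));
    assert(!repeat_matches("a{2,4}", "aaaaa"));
}
```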
|
|
||||||
For example:
|
For example:
|
||||||
|
|
||||||
@ -120,7 +120,7 @@ be repeated, for example:
|
|||||||
|
|
||||||
a(*)
|
a(*)
|
||||||
|
|
||||||
Will raise an error, as there is nothing for the =*= operator to be applied to.
|
Will raise an error, as there is nothing for the [^*] operator to be applied to.
|
||||||
|
|
||||||
[h4 Non greedy repeats]
|
[h4 Non greedy repeats]
|
||||||
|
|
||||||
@ -128,19 +128,19 @@ The normal repeat operators are "greedy", that is to say they will consume as
|
|||||||
much input as possible. There are non-greedy versions available that will
|
much input as possible. There are non-greedy versions available that will
|
||||||
consume as little input as possible while still producing a match.
|
consume as little input as possible while still producing a match.
|
||||||
|
|
||||||
=*?= Matches the previous atom zero or more times, while consuming as little
|
[^*?] Matches the previous atom zero or more times, while consuming as little
|
||||||
input as possible.
|
input as possible.
|
||||||
|
|
||||||
=+?= Matches the previous atom one or more times, while consuming as
|
[^+?] Matches the previous atom one or more times, while consuming as
|
||||||
little input as possible.
|
little input as possible.
|
||||||
|
|
||||||
=??= Matches the previous atom zero or one times, while consuming
|
[^??] Matches the previous atom zero or one times, while consuming
|
||||||
as little input as possible.
|
as little input as possible.
|
||||||
|
|
||||||
={n,}?= Matches the previous atom n or more times, while consuming as
|
[^{n,}?] Matches the previous atom n or more times, while consuming as
|
||||||
little input as possible.
|
little input as possible.
|
||||||
|
|
||||||
={n,m}?= Matches the previous atom between n and m times, while
|
[^{n,m}?] Matches the previous atom between n and m times, while
|
||||||
consuming as little input as possible.
|
consuming as little input as possible.
|
||||||
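The difference between greedy and non-greedy repeats is easiest to see by comparing what each one actually consumes. The sketch below assumes `std::regex` (ECMAScript grammar) as a stand-in for `boost::regex`; the helper `first_match` and the sample inputs are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text of the first match of `pattern` in `text`,
// or an empty string when nothing matches.
// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
std::string first_match(const std::string& pattern, const std::string& text) {
    std::smatch m;
    return std::regex_search(text, m, std::regex(pattern)) ? m.str(0)
                                                           : std::string();
}

void check_non_greedy_repeats() {
    // Greedy: .+ runs to the last '>' in the input.
    assert(first_match("<.+>", "<a><b>") == "<a><b>");
    // Non-greedy: .+? stops at the first '>' that lets the match succeed.
    assert(first_match("<.+?>", "<a><b>") == "<a>");
    // The same contrast for a bounded repeat with no upper limit.
    assert(first_match("a{2,}", "aaaa") == "aaaa");
    assert(first_match("a{2,}?", "aaaa") == "aa");
}
```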
|
|
||||||
[h4 Possessive repeats]
|
[h4 Possessive repeats]
|
||||||
@ -150,15 +150,15 @@ a match is found. However, this behaviour can sometime be undesireable so there
|
|||||||
also "possessive" repeats: these match as much as possible and do not then allow
|
also "possessive" repeats: these match as much as possible and do not then allow
|
||||||
backtracking if the rest of the expression fails to match.
|
backtracking if the rest of the expression fails to match.
|
||||||
|
|
||||||
=*+= Matches the previous atom zero or more times, while giving nothing back.
|
[^*+] Matches the previous atom zero or more times, while giving nothing back.
|
||||||
|
|
||||||
=++= Matches the previous atom one or more times, while giving nothing back.
|
[^++] Matches the previous atom one or more times, while giving nothing back.
|
||||||
|
|
||||||
=?+= Matches the previous atom zero or one times, while giving nothing back.
|
[^?+] Matches the previous atom zero or one times, while giving nothing back.
|
||||||
|
|
||||||
={n,}+= Matches the previous atom n or more times, while giving nothing back.
|
[^{n,}+] Matches the previous atom n or more times, while giving nothing back.
|
||||||
|
|
||||||
={n,m}+= Matches the previous atom between n and m times, while giving nothing back.
|
[^{n,m}+] Matches the previous atom between n and m times, while giving nothing back.
|
||||||
|
|
||||||
[h4 Back references]
|
[h4 Back references]
|
||||||
|
|
||||||
@ -180,12 +180,12 @@ You can also use the \g escape for the same function, for example:
|
|||||||
|
|
||||||
[table
|
[table
|
||||||
[[Escape][Meaning]]
|
[[Escape][Meaning]]
|
||||||
[[=\g1=][Match whatever matched sub-expression 1]]
|
[[[^\g1]][Match whatever matched sub-expression 1]]
|
||||||
[[=\g{1}=][Match whatever matched sub-expression 1: this form allows for safer
|
[[[^\g{1}]][Match whatever matched sub-expression 1: this form allows for safer
|
||||||
parsing of the expression in cases like =\g{1}2= or for indexes higher than 9 as in =\g{1234}=]]
|
parsing of the expression in cases like [^\g{1}2] or for indexes higher than 9 as in [^\g{1234}]]]
|
||||||
[[=\g-1=][Match whatever matched the last opened sub-expression]]
|
[[[^\g-1]][Match whatever matched the last opened sub-expression]]
|
||||||
[[=\g{-2}=][Match whatever matched the last but one opened sub-expression]]
|
[[[^\g{-2}]][Match whatever matched the last but one opened sub-expression]]
|
||||||
[[=\g{one}=][Match whatever matched the sub-expression named "one"]]
|
[[[^\g{one}]][Match whatever matched the sub-expression named "one"]]
|
||||||
]
|
]
|
||||||
|
|
||||||
Finally the \k escape can be used to refer to named subexpressions, for example [^\k<two>] will match
|
Finally the \k escape can be used to refer to named subexpressions, for example [^\k<two>] will match
|
||||||
@ -193,24 +193,24 @@ whatever matched the subexpression named "two".
|
|||||||
|
|
||||||
[h4 Alternation]
|
[h4 Alternation]
|
||||||
|
|
||||||
The =|= operator will match either of its arguments, so for example:
|
The [^|] operator will match either of its arguments, so for example:
|
||||||
=abc|def= will match either "abc" or "def".
|
[^abc|def] will match either "abc" or "def".
|
||||||
|
|
||||||
Parentheses can be used to group alternations, for example: =ab(d|ef)=
|
Parentheses can be used to group alternations, for example: [^ab(d|ef)]
|
||||||
will match either of "abd" or "abef".
|
will match either of "abd" or "abef".
|
||||||
|
|
||||||
Empty alternatives are not allowed (these are almost always a mistake), but
|
Empty alternatives are not allowed (these are almost always a mistake), but
|
||||||
if you really want an empty alternative use =(?:)= as a placeholder, for example:
|
if you really want an empty alternative use [^(?:)] as a placeholder, for example:
|
||||||
|
|
||||||
=|abc= is not a valid expression, but
|
[^|abc] is not a valid expression, but
|
||||||
|
|
||||||
=(?:)|abc= is and is equivalent, also the expression:
|
[^(?:)|abc] is and is equivalent, also the expression:
|
||||||
|
|
||||||
=(?:abc)??= has exactly the same effect.
|
[^(?:abc)??] has exactly the same effect.
|
||||||
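A minimal sketch of the two alternation examples, assuming `std::regex` (ECMAScript grammar) as a stand-in for `boost::regex`. It covers only `abc|def` and `ab(d|ef)`; the rule about empty alternatives is specific to Boost.Regex and is not exercised here. The helper names are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when the whole of `text` is either "abc" or "def".
// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
bool is_abc_or_def(const std::string& text) {
    static const std::regex re("abc|def");
    return std::regex_match(text, re);
}

// Returns the branch captured by the group in ab(d|ef),
// or an empty string when `text` does not match.
std::string captured_branch(const std::string& text) {
    static const std::regex re("ab(d|ef)");
    std::smatch m;
    return std::regex_match(text, m, re) ? m.str(1) : std::string();
}

void check_alternation() {
    assert(is_abc_or_def("abc"));
    assert(is_abc_or_def("def"));
    assert(!is_abc_or_def("abcdef"));
    assert(captured_branch("abd") == "d");
    assert(captured_branch("abef") == "ef");
    assert(captured_branch("abe").empty());
}
```

Because the parentheses form a marked sub-expression, the chosen branch is available as sub-match 1; a non-marking group would group the alternatives without producing that extra field.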
|
|
||||||
[h4 Character sets]
|
[h4 Character sets]
|
||||||
|
|
||||||
A character set is a bracket-expression starting with =[= and ending with =]=,
|
A character set is a bracket-expression starting with [^[] and ending with [^]],
|
||||||
it defines a set of characters, and matches any single character that is a
|
it defines a set of characters, and matches any single character that is a
|
||||||
member of that set.
|
member of that set.
|
||||||
|
|
||||||
@ -226,14 +226,14 @@ For example [^\[a-c\]] will match any single character in the range 'a' to 'c'.
|
|||||||
By default, for Perl regular expressions, a character x is within the
|
By default, for Perl regular expressions, a character x is within the
|
||||||
range y to z, if the code point of the character lies within the codepoints of
|
range y to z, if the code point of the character lies within the codepoints of
|
||||||
the endpoints of the range. Alternatively, if you set the
|
the endpoints of the range. Alternatively, if you set the
|
||||||
[link boost_regex.ref.syntax_option_type.syntax_option_type_perl =collate= flag]
|
[link boost_regex.ref.syntax_option_type.syntax_option_type_perl [^collate] flag]
|
||||||
when constructing the regular expression, then ranges are locale sensitive.
|
when constructing the regular expression, then ranges are locale sensitive.
|
||||||
|
|
||||||
[h5 Negation]
|
[h5 Negation]
|
||||||
|
|
||||||
If the bracket-expression begins with the ^ character, then it matches the
|
If the bracket-expression begins with the ^ character, then it matches the
|
||||||
complement of the characters it contains, for example [^\[^a-c\]] matches
|
complement of the characters it contains, for example [^\[^a-c\]] matches
|
||||||
any character that is not in the range =a-c=.
|
any character that is not in the range [^a-c].
|
||||||
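The range and negation rules can be checked with a small sketch, assuming `std::regex` (ECMAScript grammar) as a stand-in for `boost::regex` and the default code-point ordering for ranges (no `collate` flag). `in_set` and `check_character_sets` are hypothetical names.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when `text` is exactly one character accepted by the bracket
// expression `set_pattern`, e.g. "[a-c]" or "[^a-c]".
// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
bool in_set(const std::string& set_pattern, const std::string& text) {
    return std::regex_match(text, std::regex(set_pattern));
}

void check_character_sets() {
    // [a-c] accepts any single character in the range 'a' to 'c'.
    assert(in_set("[a-c]", "b"));
    assert(!in_set("[a-c]", "d"));
    // [^a-c] accepts the complement of that range.
    assert(in_set("[^a-c]", "d"));
    assert(!in_set("[^a-c]", "b"));
    // A set matches one character only, so two characters fail.
    assert(!in_set("[a-c]", "ab"));
}
```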
|
|
||||||
[h5 Character classes]
|
[h5 Character classes]
|
||||||
|
|
||||||
@ -255,7 +255,7 @@ As an extension, a collating element may also be specified via it's
|
|||||||
|
|
||||||
[[.NUL.]]
|
[[.NUL.]]
|
||||||
|
|
||||||
matches a =\0= character.
|
matches a [^\0] character.
|
||||||
|
|
||||||
[h5 Equivalence classes]
|
[h5 Equivalence classes]
|
||||||
|
|
||||||
@ -292,24 +292,24 @@ The following escape sequences are all synonyms for single characters:
|
|||||||
|
|
||||||
[table
|
[table
|
||||||
[[Escape][Character]]
|
[[Escape][Character]]
|
||||||
[[=\a=][=\a=]]
|
[[[^\a]][[^\a]]]
|
||||||
[[=\e=][=0x1B=]]
|
[[[^\e]][[^0x1B]]]
|
||||||
[[=\f=][=\f=]]
|
[[[^\f]][[^\f]]]
|
||||||
[[=\n=][=\n=]]
|
[[[^\n]][[^\n]]]
|
||||||
[[=\r=][=\r=]]
|
[[[^\r]][[^\r]]]
|
||||||
[[=\t=][=\t=]]
|
[[[^\t]][[^\t]]]
|
||||||
[[=\v=][=\v=]]
|
[[[^\v]][[^\v]]]
|
||||||
[[=\b=][=\b= (but only inside a character class declaration).]]
|
[[[^\b]][[^\b] (but only inside a character class declaration).]]
|
||||||
[[=\cX=][An ASCII escape sequence - the character whose code point is X % 32]]
|
[[[^\cX]][An ASCII escape sequence - the character whose code point is X % 32]]
|
||||||
[[=\xdd=][A hexadecimal escape sequence - matches the single character whose
|
[[[^\xdd]][A hexadecimal escape sequence - matches the single character whose
|
||||||
code point is 0xdd.]]
|
code point is 0xdd.]]
|
||||||
[[=\x{dddd}=][A hexadecimal escape sequence - matches the single character whose
|
[[[^\x{dddd}]][A hexadecimal escape sequence - matches the single character whose
|
||||||
code point is 0xdddd.]]
|
code point is 0xdddd.]]
|
||||||
[[=\0ddd=][An octal escape sequence - matches the single character whose
|
[[[^\0ddd]][An octal escape sequence - matches the single character whose
|
||||||
code point is 0ddd.]]
|
code point is 0ddd.]]
|
||||||
[[=\N{name}=][Matches the single character which has the
|
[[[^\N{name}]][Matches the single character which has the
|
||||||
[link boost_regex.syntax.collating_names symbolic name] /name/.
|
[link boost_regex.syntax.collating_names symbolic name] /name/.
|
||||||
For example =\N{newline}= matches the single character \\n.]]
|
For example [^\N{newline}] matches the single character \\n.]]
|
||||||
]
|
]
|
||||||
|
|
||||||
[h5 "Single character" character classes:]
|
[h5 "Single character" character classes:]
|
||||||
@ -352,19 +352,19 @@ to the [link boost_regex.syntax.character_classes names used in character classe
|
|||||||
[[`\P{Name}`][Matches any character that does not have the property Name.][`[^[:Name:]]`]]
|
[[`\P{Name}`][Matches any character that does not have the property Name.][`[^[:Name:]]`]]
|
||||||
]
|
]
|
||||||
|
|
||||||
For example =\pd= matches any "digit" character, as does =\p{digit}=.
|
For example [^\pd] matches any "digit" character, as does [^\p{digit}].
|
||||||
|
|
||||||
[h5 Word Boundaries]
|
[h5 Word Boundaries]
|
||||||
|
|
||||||
The following escape sequences match the boundaries of words:
|
The following escape sequences match the boundaries of words:
|
||||||
|
|
||||||
=\<= Matches the start of a word.
|
[^\<] Matches the start of a word.
|
||||||
|
|
||||||
=\>= Matches the end of a word.
|
[^\>] Matches the end of a word.
|
||||||
|
|
||||||
=\b= Matches a word boundary (the start or end of a word).
|
[^\b] Matches a word boundary (the start or end of a word).
|
||||||
|
|
||||||
=\B= Matches only when not at a word boundary.
|
[^\B] Matches only when not at a word boundary.
|
||||||
|
|
||||||
[h5 Buffer boundaries]
|
[h5 Buffer boundaries]
|
||||||
|
|
||||||
@ -376,9 +376,9 @@ context is the whole of the input text that is being matched against
|
|||||||
|
|
||||||
\\' Matches at the end of a buffer only.
|
\\' Matches at the end of a buffer only.
|
||||||
|
|
||||||
\\A Matches at the start of a buffer only (the same as =\\\`=).
|
\\A Matches at the start of a buffer only (the same as [^\\\`]).
|
||||||
|
|
||||||
\\z Matches at the end of a buffer only (the same as =\\'=).
|
\\z Matches at the end of a buffer only (the same as [^\\']).
|
||||||
|
|
||||||
\\Z Matches a zero-width assertion consisting of an optional sequence of newlines at the end of a buffer:
|
\\Z Matches a zero-width assertion consisting of an optional sequence of newlines at the end of a buffer:
|
||||||
equivalent to the regular expression [^(?=\\v*\\z)]. Note that this is subtly different from Perl which
|
equivalent to the regular expression [^(?=\\v*\\z)]. Note that this is subtly different from Perl which
|
||||||
@ -386,39 +386,39 @@ behaves as if matching [^(?=\\n?\\z)].
|
|||||||
|
|
||||||
[h5 Continuation Escape]
|
[h5 Continuation Escape]
|
||||||
|
|
||||||
The sequence =\G= matches only at the end of the last match found, or at
|
The sequence [^\G] matches only at the end of the last match found, or at
|
||||||
the start of the text being matched if no previous match was found.
|
the start of the text being matched if no previous match was found.
|
||||||
This escape is useful if you're iterating over the matches contained within a
|
This escape is useful if you're iterating over the matches contained within a
|
||||||
text, and you want each subsequent match to start where the last one ended.
|
text, and you want each subsequent match to start where the last one ended.
|
||||||
|
|
||||||
[h5 Quoting escape]
|
[h5 Quoting escape]
|
||||||
|
|
||||||
The escape sequence =\Q= begins a "quoted sequence": all the subsequent characters
|
The escape sequence [^\Q] begins a "quoted sequence": all the subsequent characters
|
||||||
are treated as literals, until either the end of the regular expression or \\E
|
are treated as literals, until either the end of the regular expression or \\E
|
||||||
is found. For example the expression: =\Q\*+\Ea+= would match either of:
|
is found. For example the expression: [^\Q\*+\Ea+] would match either of:
|
||||||
|
|
||||||
\*+a
|
\*+a
|
||||||
\*+aaa
|
\*+aaa
|
||||||
|
|
||||||
[h5 Unicode escapes]
|
[h5 Unicode escapes]
|
||||||
|
|
||||||
=\C= Matches a single code point: in Boost regex this has exactly the
|
[^\C] Matches a single code point: in Boost regex this has exactly the
|
||||||
same effect as a "." operator.
|
same effect as a "." operator.
|
||||||
=\X= Matches a combining character sequence: that is any non-combining
|
[^\X] Matches a combining character sequence: that is any non-combining
|
||||||
character followed by a sequence of zero or more combining characters.
|
character followed by a sequence of zero or more combining characters.
|
||||||
|
|
||||||
[h5 Matching Line Endings]
|
[h5 Matching Line Endings]
|
||||||
|
|
||||||
The escape sequence =\R= matches any line ending character sequence, specifically it is identical to
|
The escape sequence [^\R] matches any line ending character sequence, specifically it is identical to
|
||||||
the expression [^(?>\x0D\x0A?|\[\x0A-\x0C\x85\x{2028}\x{2029}\])].
|
the expression [^(?>\x0D\x0A?|\[\x0A-\x0C\x85\x{2028}\x{2029}\])].
|
||||||
|
|
||||||
[h5 Keeping back some text]
|
[h5 Keeping back some text]
|
||||||
|
|
||||||
=\K= Resets the start location of $0 to the current text position: in other words everything to the
|
[^\K] Resets the start location of $0 to the current text position: in other words everything to the
|
||||||
left of \K is "kept back" and does not form part of the regular expression match. $` is updated
|
left of \K is "kept back" and does not form part of the regular expression match. $` is updated
|
||||||
accordingly.
|
accordingly.
|
||||||
|
|
||||||
For example =foo\Kbar= matched against the text "foobar" would return the match "bar" for $0 and "foo"
|
For example [^foo\Kbar] matched against the text "foobar" would return the match "bar" for $0 and "foo"
|
||||||
for $`. This can be used to simulate variable width lookbehind assertions.
|
for $`. This can be used to simulate variable width lookbehind assertions.
|
||||||
|
|
||||||
[h5 Any other escape]
|
[h5 Any other escape]
|
||||||
@ -428,7 +428,7 @@ Any other escape sequence matches the character that is escaped, for example
|
|||||||
|
|
||||||
[h4 Perl Extended Patterns]
|
[h4 Perl Extended Patterns]
|
||||||
|
|
||||||
Perl-specific extensions to the regular expression syntax all start with =(?=.
|
Perl-specific extensions to the regular expression syntax all start with [^(?].
|
||||||
|
|
||||||
[h5 Named Subexpressions]
|
[h5 Named Subexpressions]
|
||||||
|
|
||||||
@ -447,25 +447,25 @@ and can also be refered to by name in a [perl_format] format string for search a
|
|||||||
|
|
||||||
[h5 Comments]
|
[h5 Comments]
|
||||||
|
|
||||||
=(?# ... )= is treated as a comment, its contents are ignored.
|
[^(?# ... )] is treated as a comment, its contents are ignored.
|
||||||
|
|
||||||
[h5 Modifiers]
|
[h5 Modifiers]
|
||||||
|
|
||||||
=(?imsx-imsx ... )= alters which of the perl modifiers are in effect within
|
[^(?imsx-imsx ... )] alters which of the perl modifiers are in effect within
|
||||||
the pattern, changes take effect from the point that the block is first seen
|
the pattern, changes take effect from the point that the block is first seen
|
||||||
and extend to any enclosing =)=. Letters before a '-' turn that perl
|
and extend to any enclosing [^)]. Letters before a '-' turn that perl
|
||||||
modifier on, letters afterward, turn it off.
|
modifier on, letters afterward, turn it off.
|
||||||
|
|
||||||
=(?imsx-imsx:pattern)= applies the specified modifiers to pattern only.
|
[^(?imsx-imsx:pattern)] applies the specified modifiers to pattern only.
|
||||||
|
|
||||||
[h5 Non-marking groups]
|
[h5 Non-marking groups]
|
||||||
|
|
||||||
=(?:pattern)= lexically groups pattern, without generating an additional
|
[^(?:pattern)] lexically groups pattern, without generating an additional
|
||||||
sub-expression.
|
sub-expression.
|
||||||
|
|
||||||
[h5 Branch reset]
|
[h5 Branch reset]
|
||||||
|
|
||||||
=(?|pattern)= resets the subexpression count at the start of each "|" alternative within /pattern/.
|
[^(?|pattern)] resets the subexpression count at the start of each "|" alternative within /pattern/.
|
||||||
|
|
||||||
The sub-expression count following this construct is that of whichever branch had the largest number of
|
The sub-expression count following this construct is that of whichever branch had the largest number of
|
||||||
sub-expressions. This construct is useful when you want to capture one of a number of alternative matches
|
sub-expressions. This construct is useful when you want to capture one of a number of alternative matches
|
||||||
@ -483,7 +483,7 @@ In the following example the index of each sub-expression is shown below the exp
|
|||||||
|
|
||||||
[^(?=pattern)] consumes zero characters, only if pattern matches.
|
[^(?=pattern)] consumes zero characters, only if pattern matches.
|
||||||
|
|
||||||
=(?!pattern)= consumes zero characters, only if pattern does not match.
|
[^(?!pattern)] consumes zero characters, only if pattern does not match.
|
||||||
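Both lookahead forms test what follows the current position without consuming it, which the sketch below makes visible by inspecting the matched text. It assumes `std::regex` (ECMAScript grammar) as a stand-in for `boost::regex`; the patterns, inputs and the helper `matched_text` are invented for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text of the first match of `pattern` in `text`,
// or an empty string when nothing matches.
// Assumption: std::regex (ECMAScript grammar) stands in for boost::regex.
std::string matched_text(const std::string& pattern, const std::string& text) {
    std::smatch m;
    return std::regex_search(text, m, std::regex(pattern)) ? m.str(0)
                                                           : std::string();
}

void check_lookahead() {
    // Positive lookahead: "bar" must follow but is not part of the match.
    assert(matched_text("foo(?=bar)", "foobar") == "foo");
    assert(matched_text("foo(?=bar)", "foobaz").empty());
    // Negative lookahead: "bar" must not follow.
    assert(matched_text("foo(?!bar)", "foobaz") == "foo");
    assert(matched_text("foo(?!bar)", "foobar").empty());
}
```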
|
|
||||||
Lookahead is typically used to create the logical AND of two regular
|
Lookahead is typically used to create the logical AND of two regular
|
||||||
expressions, for example if a password must contain a lower case letter,
|
expressions, for example if a password must contain a lower case letter,
|
||||||
@ -500,13 +500,13 @@ could be used to validate the password.
|
|||||||
against the characters preceding the current position (pattern must be
|
against the characters preceding the current position (pattern must be
|
||||||
of fixed length).
|
of fixed length).
|
||||||
|
|
||||||
=(?<!pattern)= consumes zero characters, only if pattern could not be
|
[^(?<!pattern)] consumes zero characters, only if pattern could not be
|
||||||
matched against the characters preceding the current position (pattern must
|
matched against the characters preceding the current position (pattern must
|
||||||
be of fixed length).
|
be of fixed length).
|
||||||
|
|
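The lookahead and lookbehind assertions above can be sketched as follows. This is a minimal example in Python's `re` module, which is assumed here to behave the same way for these four constructs; the password rule and the price patterns are hypothetical and are not taken from the Boost documentation.

```python
import re

# Hypothetical rule: at least one lower-case letter, one upper-case letter,
# one digit, and a minimum length of 6. Each (?=...) consumes nothing, so the
# three conditions are ANDed together before .{6,} consumes the text.
password = re.compile(r"(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{6,}")

def is_valid(candidate):
    return password.fullmatch(candidate) is not None

# Fixed-length lookbehind: digits that are preceded by a dollar sign.
price = re.compile(r"(?<=\$)[0-9]+")

# Negative lookbehind: whole numbers that are NOT preceded by a dollar sign.
not_price = re.compile(r"(?<!\$)\b[0-9]+")

print(is_valid("abcDEF1"), is_valid("abcdef1"))
print(price.findall("7 items at $42"))
print(not_price.findall("7 items at $42"))
```

Neither assertion becomes part of the matched text: the dollar sign is checked but not returned.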
||||||
[h5 Independent sub-expressions]
|
[h5 Independent sub-expressions]
|
||||||
|
|
||||||
=(?>pattern)= /pattern/ is matched independently of the surrounding patterns,
|
[^(?>pattern)] /pattern/ is matched independently of the surrounding patterns,
|
||||||
the expression will never backtrack into /pattern/. Independent sub-expressions
|
the expression will never backtrack into /pattern/. Independent sub-expressions
|
||||||
are typically used to improve performance; only the best possible match
|
are typically used to improve performance; only the best possible match
|
||||||
for pattern will be considered, if this doesn't allow the expression as a
|
for pattern will be considered, if this doesn't allow the expression as a
|
||||||
@ -516,21 +516,21 @@ whole to match then no match is found at all.
|
|||||||
|
|
||||||
[^(?['N]) (?-['N]) (?+['N]) (?R) (?0) (?&NAME)]
|
[^(?['N]) (?-['N]) (?+['N]) (?R) (?0) (?&NAME)]
|
||||||
|
|
||||||
=(?R)= and =(?0)= recurse to the start of the entire pattern.
|
[^(?R)] and [^(?0)] recurse to the start of the entire pattern.
|
||||||
|
|
||||||
[^(?['N])] executes sub-expression /N/ recursively, for example =(?2)= will recurse to sub-expression 2.
|
[^(?['N])] executes sub-expression /N/ recursively, for example [^(?2)] will recurse to sub-expression 2.
|
||||||
|
|
||||||
[^(?-['N])] and [^(?+['N])] are relative recursions, so for example =(?-1)= recurses to the last sub-expression to be declared,
|
[^(?-['N])] and [^(?+['N])] are relative recursions, so for example [^(?-1)] recurses to the last sub-expression to be declared,
|
||||||
and =(?+1)= recurses to the next sub-expression to be declared.
|
and [^(?+1)] recurses to the next sub-expression to be declared.
|
||||||
|
|
||||||
[^(?&NAME)] recurses to named sub-expression ['NAME].
|
[^(?&NAME)] recurses to named sub-expression ['NAME].
|
||||||
|
|
||||||
[h5 Conditional Expressions]
|
[h5 Conditional Expressions]
|
||||||
|
|
||||||
=(?(condition)yes-pattern|no-pattern)= attempts to match /yes-pattern/ if
|
[^(?(condition)yes-pattern|no-pattern)] attempts to match /yes-pattern/ if
|
||||||
the /condition/ is true, otherwise attempts to match /no-pattern/.
|
the /condition/ is true, otherwise attempts to match /no-pattern/.
|
||||||
|
|
||||||
=(?(condition)yes-pattern)= attempts to match /yes-pattern/ if the /condition/
|
[^(?(condition)yes-pattern)] attempts to match /yes-pattern/ if the /condition/
|
||||||
is true, otherwise matches the NULL string.
|
is true, otherwise matches the NULL string.
|
||||||
|
|
||||||
/condition/ may be either: a forward lookahead assert, the index of
|
/condition/ may be either: a forward lookahead assert, the index of
|
||||||
@ -542,15 +542,15 @@ Here is a summary of the possible predicates:
|
|||||||
|
|
||||||
* [^(?(?\=assert)yes-pattern|no-pattern)] Executes /yes-pattern/ if the forward look-ahead assert matches, otherwise
|
* [^(?(?\=assert)yes-pattern|no-pattern)] Executes /yes-pattern/ if the forward look-ahead assert matches, otherwise
|
||||||
executes /no-pattern/.
|
executes /no-pattern/.
|
||||||
* =(?(?!assert)yes-pattern|no-pattern)= Executes /yes-pattern/ if the forward look-ahead assert does not match, otherwise
|
* [^(?(?!assert)yes-pattern|no-pattern)] Executes /yes-pattern/ if the forward look-ahead assert does not match, otherwise
|
||||||
executes /no-pattern/.
|
executes /no-pattern/.
|
||||||
* =(?(['N])yes-pattern|no-pattern)= Executes /yes-pattern/ if subexpression /N/ has been matched, otherwise
|
* [^(?(['N])yes-pattern|no-pattern)] Executes /yes-pattern/ if subexpression /N/ has been matched, otherwise
|
||||||
executes /no-pattern/.
|
executes /no-pattern/.
|
||||||
* =(?(<['name]>)yes-pattern|no-pattern)= Executes /yes-pattern/ if named subexpression /name/ has been matched, otherwise
|
* [^(?(<['name]>)yes-pattern|no-pattern)] Executes /yes-pattern/ if named subexpression /name/ has been matched, otherwise
|
||||||
executes /no-pattern/.
|
executes /no-pattern/.
|
||||||
* =(?('['name]')yes-pattern|no-pattern)= Executes /yes-pattern/ if named subexpression /name/ has been matched, otherwise
|
* [^(?('['name]')yes-pattern|no-pattern)] Executes /yes-pattern/ if named subexpression /name/ has been matched, otherwise
|
||||||
executes /no-pattern/.
|
executes /no-pattern/.
|
||||||
* =(?(R)yes-pattern|no-pattern)= Executes /yes-pattern/ if we are executing inside a recursion, otherwise
|
* [^(?(R)yes-pattern|no-pattern)] Executes /yes-pattern/ if we are executing inside a recursion, otherwise
|
||||||
executes /no-pattern/.
|
executes /no-pattern/.
|
||||||
* [^(?(R['N])yes-pattern|no-pattern)] Executes /yes-pattern/ if we are executing inside a recursion to sub-expression /N/, otherwise
|
* [^(?(R['N])yes-pattern|no-pattern)] Executes /yes-pattern/ if we are executing inside a recursion to sub-expression /N/, otherwise
|
||||||
executes /no-pattern/.
|
executes /no-pattern/.
|
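The sub-expression-index form of the conditional can be sketched as below. Python's `re` module is used as a stand-in on the assumption that its `(?(N)yes-pattern|no-pattern)` construct behaves the same way; the named, lookahead and recursion predicates listed above are not covered, and both patterns are invented for illustration.

```python
import re

# (\()? optionally captures an opening parenthesis as sub-expression 1.
# (?(1)\)) demands a closing parenthesis only if sub-expression 1 matched;
# with no no-pattern given, the false branch matches the empty string.
optional_parens = re.compile(r"(\()?[0-9]+(?(1)\))")

# Two-branch form: end with '>' if '<' was captured, otherwise end with ';'.
tag_or_statement = re.compile(r"(<)?[a-z]+(?(1)>|;)")

def accepts(pattern, text):
    return pattern.fullmatch(text) is not None

for sample in ["(123)", "123", "(123", "123)"]:
    print(sample, accepts(optional_parens, sample))

for sample in ["<tag>", "tag;", "tag>", "<tag;"]:
    print(sample, accepts(tag_or_statement, sample))
```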
||||||
@ -564,10 +564,10 @@ this is usually used to define one or more named sub-expressions which are refer
|
|||||||
The order of precedence of operators is as follows:
|
The order of precedence of operators is as follows:
|
||||||
|
|
||||||
# Collation-related bracket symbols `[==] [::] [..]`
|
# Collation-related bracket symbols `[==] [::] [..]`
|
||||||
# Escaped characters =\=
|
# Escaped characters [^\]
|
||||||
# Character set (bracket expression) `[]`
|
# Character set (bracket expression) `[]`
|
||||||
# Grouping =()=
|
# Grouping [^()]
|
||||||
# Single-character-ERE duplication =* + ? {m,n}=
|
# Single-character-ERE duplication [^* + ? {m,n}]
|
||||||
# Concatenation
|
# Concatenation
|
||||||
# Anchoring ^$
|
# Anchoring ^$
|
||||||
# Alternation |
|
# Alternation |
|
||||||
@ -586,42 +586,42 @@ with individual elements matched as follows;
|
|||||||
|
|
||||||
[table
|
[table
|
||||||
[[Construct][What gets matched]]
|
[[Construct][What gets matched]]
|
||||||
[[=AtomA AtomB=][Locates the best match for /AtomA/ that has a following match for /AtomB/.]]
|
[[[^AtomA AtomB]][Locates the best match for /AtomA/ that has a following match for /AtomB/.]]
|
||||||
[[=Expression1 | Expression2=][If /Expression1/ can be matched then returns that match,
|
[[[^Expression1 | Expression2]][If /Expression1/ can be matched then returns that match,
|
||||||
otherwise attempts to match /Expression2/.]]
|
otherwise attempts to match /Expression2/.]]
|
||||||
[[=S{N}=][Matches /S/ repeated exactly N times.]]
|
[[[^S{N}]][Matches /S/ repeated exactly N times.]]
|
||||||
[[=S{N,M}=][Matches S repeated between N and M times, and as many times as possible.]]
|
[[[^S{N,M}]][Matches S repeated between N and M times, and as many times as possible.]]
|
||||||
[[=S{N,M}?=][Matches S repeated between N and M times, and as few times as possible.]]
|
[[[^S{N,M}?]][Matches S repeated between N and M times, and as few times as possible.]]
|
||||||
[[=S?, S*, S+=][The same as =S{0,1}=, =S{0,UINT_MAX}=, =S{1,UINT_MAX}= respectively.]]
|
[[[^S?, S*, S+]][The same as [^S{0,1}], [^S{0,UINT_MAX}], [^S{1,UINT_MAX}] respectively.]]
|
||||||
[[=S??, S*?, S+?=][The same as =S{0,1}?=, =S{0,UINT_MAX}?=, =S{1,UINT_MAX}?= respectively.]]
|
[[[^S??, S*?, S+?]][The same as [^S{0,1}?], [^S{0,UINT_MAX}?], [^S{1,UINT_MAX}?] respectively.]]
|
||||||
[[=(?>S)=][Matches the best match for /S/, and only that.]]
|
[[[^(?>S)]][Matches the best match for /S/, and only that.]]
|
||||||
[[[^(?=S), (?<=S)]][Matches only the best match for /S/ (this is only
|
[[[^(?=S), (?<=S)]][Matches only the best match for /S/ (this is only
|
||||||
visible if there are capturing parenthesis within /S/).]]
|
visible if there are capturing parenthesis within /S/).]]
|
||||||
[[=(?!S), (?<!S)=][Considers only whether a match for S exists or not.]]
|
[[[^(?!S), (?<!S)]][Considers only whether a match for S exists or not.]]
|
||||||
[[=(?(condition)yes-pattern | no-pattern)=][If condition is true, then
|
[[[^(?(condition)yes-pattern | no-pattern)]][If condition is true, then
|
||||||
only yes-pattern is considered, otherwise only no-pattern is considered.]]
|
only yes-pattern is considered, otherwise only no-pattern is considered.]]
|
||||||
]
|
]
|
||||||
|
|
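A few rows of the table above can be checked with a short sketch. It uses Python's `re` module, which is assumed to follow the same Perl-style rules for greedy repeats, non-greedy repeats and ordered alternation; the input strings are made up for the example.

```python
import re

text = "aaaaaa"

# S{N,M}: between N and M repeats, as many as possible.
greedy = re.match(r"a{2,4}", text).group()

# S{N,M}?: between N and M repeats, as few as possible.
lazy = re.match(r"a{2,4}?", text).group()

# S* against S*?: the greedy form runs to the last '>', the lazy form
# stops at the first one.
greedy_star = re.match(r"<.*>", "<a><b>").group()
lazy_star = re.match(r"<.*?>", "<a><b>").group()

# Expression1 | Expression2: if the first alternative can match it is
# returned, even though the second would give a longer match.
first_alternative = re.match(r"a|ab", "ab").group()

print(greedy, lazy, greedy_star, lazy_star, first_alternative)
```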
||||||
[h3 Variations]
|
[h3 Variations]
|
||||||
|
|
||||||
The [link boost_regex.ref.syntax_option_type.syntax_option_type_perl options =normal=,
|
The [link boost_regex.ref.syntax_option_type.syntax_option_type_perl options [^normal],
|
||||||
=ECMAScript=, =JavaScript= and =JScript=] are all synonyms for
|
[^ECMAScript], [^JavaScript] and [^JScript]] are all synonyms for
|
||||||
=perl=.
|
[^perl].
|
||||||
|
|
||||||
[h3 Options]
|
[h3 Options]
|
||||||
|
|
||||||
There are a [link boost_regex.ref.syntax_option_type.syntax_option_type_perl
|
There are a [link boost_regex.ref.syntax_option_type.syntax_option_type_perl
|
||||||
variety of flags] that may be combined with the =perl= option when
|
variety of flags] that may be combined with the [^perl] option when
|
||||||
constructing the regular expression, in particular note that the
|
constructing the regular expression, in particular note that the
|
||||||
=newline_alt= option alters the syntax, while the =collate=, =nosubs= and
|
[^newline_alt] option alters the syntax, while the [^collate], [^nosubs] and
|
||||||
=icase= options modify how the case and locale sensitivity are to be applied.
|
[^icase] options modify how the case and locale sensitivity are to be applied.
|
||||||
|
|
||||||
[h3 Pattern Modifiers]
|
[h3 Pattern Modifiers]
|
||||||
|
|
||||||
The perl =smix= modifiers can either be applied using a =(?smix-smix)=
|
The perl [^smix] modifiers can either be applied using a [^(?smix-smix)]
|
||||||
prefix to the regular expression, or with one of the
|
prefix to the regular expression, or with one of the
|
||||||
[link boost_regex.ref.syntax_option_type.syntax_option_type_perl regex-compile time
|
[link boost_regex.ref.syntax_option_type.syntax_option_type_perl regex-compile time
|
||||||
flags =no_mod_m=, =mod_x=, =mod_s=, and =no_mod_s=].
|
flags [^no_mod_m], [^mod_x], [^mod_s], and [^no_mod_s]].
|
||||||
|
|
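The two ways of applying a modifier, as a prefix inside the expression or as a flag given at compile time, can be sketched as below. The example uses Python's `re` module, whose flag names (`re.IGNORECASE`, `re.DOTALL`, `re.VERBOSE`) are Python's own and are only assumed to be analogous to the compile-time flags named above; defaults in Boost.Regex may differ.

```python
import re

# Case-insensitivity: inline prefix versus compile-time flag.
inline_i = re.compile(r"(?i)hello")
flag_i = re.compile(r"hello", re.IGNORECASE)

# The s modifier lets '.' match a newline (in Python it does not by default).
plain_dot = re.compile(r"a.b")
inline_s = re.compile(r"(?s)a.b")
flag_s = re.compile(r"a.b", re.DOTALL)

# The x modifier ignores unescaped whitespace in the pattern and allows
# trailing comments.
inline_x = re.compile(r"(?x) [0-9]+  # one or more digits")

print(bool(inline_i.fullmatch("HeLLo")), bool(flag_i.fullmatch("HeLLo")))
print(bool(plain_dot.fullmatch("a\nb")), bool(inline_s.fullmatch("a\nb")))
print(bool(inline_x.fullmatch("42")))
```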
||||||
[h3 References]
|
[h3 References]
|
||||||
|
|
||||||
|