Added possessive modifiers ++ *+ ?+ {}+.

Added support for \v and \h as character classes as per Perl-5.10. 

[SVN r52558]
John Maddock
2009-04-23 09:51:31 +00:00
parent ccf465daac
commit 7b10b5dac5
96 changed files with 521 additions and 286 deletions


@ -3,7 +3,7 @@
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Perl Regular Expression Syntax</title>
<link rel="stylesheet" href="../../../../../../doc/html/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot_8125">
<meta name="generator" content="DocBook XSL Stylesheets V1.74.0">
<link rel="home" href="../../index.html" title="Boost.Regex">
<link rel="up" href="../syntax.html" title="Regular Expression Syntax">
<link rel="prev" href="../syntax.html" title="Regular Expression Syntax">
@ -28,7 +28,7 @@
Syntax</a>
</h3></div></div></div>
<a name="boost_regex.syntax.perl_syntax.synopsis"></a><h4>
<a name="id535061"></a>
<a name="id812854"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.synopsis">Synopsis</a>
</h4>
<p>
@ -45,7 +45,7 @@
</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e2</span><span class="special">(</span><span class="identifier">my_expression</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">perl</span><span class="special">|</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">icase</span><span class="special">);</span>
</pre>
<a name="boost_regex.syntax.perl_syntax.perl_regular_expression_syntax"></a><h4>
<a name="id535282"></a>
<a name="id813004"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.perl_regular_expression_syntax">Perl
Regular Expression Syntax</a>
</h4>
@ -55,7 +55,7 @@
</p>
<pre class="programlisting">.[{()\*+?|^$</pre>
<a name="boost_regex.syntax.perl_syntax.wildcard"></a><h5>
<a name="id535320"></a>
<a name="id813028"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.wildcard">Wildcard</a>
</h5>
<p>
@ -75,7 +75,7 @@
</li>
</ul></div>
<a name="boost_regex.syntax.perl_syntax.anchors"></a><h5>
<a name="id535401"></a>
<a name="id813079"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.anchors">Anchors</a>
</h5>
<p>
@ -85,7 +85,7 @@
A '$' character shall match the end of a line.
</p>
<a name="boost_regex.syntax.perl_syntax.marked_sub_expressions"></a><h5>
<a name="id535435"></a>
<a name="id813101"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.marked_sub_expressions">Marked
sub-expressions</a>
</h5>
@ -97,7 +97,7 @@
to by a back-reference.
</p>
<a name="boost_regex.syntax.perl_syntax.non_marking_grouping"></a><h5>
<a name="id535490"></a>
<a name="id813132"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.non_marking_grouping">Non-marking
grouping</a>
</h5>
@ -111,7 +111,7 @@
out any separate sub-expressions.
</p>
<a name="boost_regex.syntax.perl_syntax.repeats"></a><h5>
<a name="id535579"></a>
<a name="id813185"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.repeats">Repeats</a>
</h5>
<p>
@ -197,7 +197,7 @@
operator to be applied to.
</p>
<a name="boost_regex.syntax.perl_syntax.non_greedy_repeats"></a><h5>
<a name="id536052"></a>
<a name="id813507"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.non_greedy_repeats">Non greedy
repeats</a>
</h5>
@ -227,8 +227,40 @@
Matches the previous atom between n and m times, while consuming as little
input as possible.
</p>
<a name="boost_regex.syntax.perl_syntax.pocessive_repeats"></a><h5>
<a name="id813599"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.pocessive_repeats">Possessive
repeats</a>
</h5>
<p>
By default, when the rest of an expression fails to match after a repeated
pattern, the engine backtracks into the repeat, giving back input until a
match is found. However, this behaviour can sometimes be undesirable,
so there are also "possessive" repeats: these match as much as possible
and do not then allow backtracking if the rest of the expression fails to
match.
</p>
<p>
<code class="computeroutput"><span class="special">*+</span></code> Matches the previous atom
zero or more times, while giving nothing back.
</p>
<p>
<code class="computeroutput"><span class="special">++</span></code> Matches the previous atom
one or more times, while giving nothing back.
</p>
<p>
<code class="computeroutput"><span class="special">?+</span></code> Matches the previous atom
zero or one times, while giving nothing back.
</p>
<p>
<code class="computeroutput"><span class="special">{</span><span class="identifier">n</span><span class="special">,}+</span></code> Matches the previous atom n or more times,
while giving nothing back.
</p>
<p>
<code class="computeroutput"><span class="special">{</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">}+</span></code>
Matches the previous atom between n and m times, while giving nothing back.
</p>
<a name="boost_regex.syntax.perl_syntax.back_references"></a><h5>
<a name="id536197"></a>
<a name="id813691"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.back_references">Back references</a>
</h5>
<p>
@ -248,7 +280,7 @@
<pre class="programlisting"><span class="identifier">aaabba</span>
</pre>
<a name="boost_regex.syntax.perl_syntax.alternation"></a><h5>
<a name="id536280"></a>
<a name="id813748"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.alternation">Alternation</a>
</h5>
<p>
@ -277,7 +309,7 @@
<code class="computeroutput"><span class="special">(?:</span><span class="identifier">abc</span><span class="special">)??</span></code> has exactly the same effect.
</p>
<a name="boost_regex.syntax.perl_syntax.character_sets"></a><h5>
<a name="id536469"></a>
<a name="id814216"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.character_sets">Character sets</a>
</h5>
<p>
@ -290,7 +322,7 @@
A bracket expression may contain any combination of the following:
</p>
<a name="boost_regex.syntax.perl_syntax.single_characters"></a><h6>
<a name="id536527"></a>
<a name="id814252"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.single_characters">Single characters</a>
</h6>
<p>
@ -298,7 +330,7 @@
or 'c'.
</p>
<a name="boost_regex.syntax.perl_syntax.character_ranges"></a><h6>
<a name="id536578"></a>
<a name="id814283"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.character_ranges">Character
ranges</a>
</h6>
@ -311,7 +343,7 @@
regular expression, then ranges are locale sensitive.
</p>
<a name="boost_regex.syntax.perl_syntax.negation"></a><h6>
<a name="id536658"></a>
<a name="id814334"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.negation">Negation</a>
</h6>
<p>
@ -320,7 +352,7 @@
range <code class="computeroutput"><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span></code>.
</p>
<a name="boost_regex.syntax.perl_syntax.character_classes"></a><h6>
<a name="id536740"></a>
<a name="id814388"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.character_classes">Character
classes</a>
</h6>
@ -330,7 +362,7 @@
<a class="link" href="character_classes.html" title="Character Class Names">character class names</a>.
</p>
<a name="boost_regex.syntax.perl_syntax.collating_elements"></a><h6>
<a name="id536823"></a>
<a name="id814440"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.collating_elements">Collating
Elements</a>
</h6>
@ -354,7 +386,7 @@
character.
</p>
<a name="boost_regex.syntax.perl_syntax.equivalence_classes"></a><h6>
<a name="id536972"></a>
<a name="id814535"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.equivalence_classes">Equivalence
classes</a>
</h6>
@ -371,7 +403,7 @@
or even all locales on one platform.
</p>
<a name="boost_regex.syntax.perl_syntax.escaped_characters"></a><h6>
<a name="id537075"></a>
<a name="id814592"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.escaped_characters">Escaped
Characters</a>
</h6>
@ -383,7 +415,7 @@
is <span class="emphasis"><em>not</em></span> a "word" character.
</p>
<a name="boost_regex.syntax.perl_syntax.combinations"></a><h6>
<a name="id537181"></a>
<a name="id814661"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.combinations">Combinations</a>
</h6>
<p>
@ -391,7 +423,7 @@
<code class="computeroutput"><span class="special">[[:</span><span class="identifier">digit</span><span class="special">:]</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">[.</span><span class="identifier">NUL</span><span class="special">.]]</span></code>.
</p>
<a name="boost_regex.syntax.perl_syntax.escapes"></a><h5>
<a name="id537259"></a>
<a name="id814714"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.escapes">Escapes</a>
</h5>
<p>
@ -584,7 +616,7 @@
</tbody>
</table></div>
<a name="boost_regex.syntax.perl_syntax._quot_single_character_quot__character_classes_"></a><h6>
<a name="id537972"></a>
<a name="id815263"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax._quot_single_character_quot__character_classes_">"Single
character" character classes:</a>
</h6>
@ -676,6 +708,30 @@
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">h</span></code>
</p>
</td>
<td>
<p>
Horizontal whitespace
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">v</span></code>
</p>
</td>
<td>
<p>
Vertical whitespace
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">D</span></code>
@ -735,10 +791,34 @@
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">H</span></code>
</p>
</td>
<td>
<p>
Not Horizontal whitespace
</p>
</td>
</tr>
<tr>
<td>
<p>
<code class="computeroutput"><span class="special">\</span><span class="identifier">V</span></code>
</p>
</td>
<td>
<p>
Not Vertical whitespace
</p>
</td>
</tr>
</tbody>
</table></div>
<a name="boost_regex.syntax.perl_syntax.character_properties"></a><h6>
<a name="id538604"></a>
<a name="id815863"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.character_properties">Character
Properties</a>
</h6>
@ -846,7 +926,7 @@
matches any "digit" character, as does <code class="computeroutput"><span class="special">\</span><span class="identifier">p</span><span class="special">{</span><span class="identifier">digit</span><span class="special">}</span></code>.
</p>
<a name="boost_regex.syntax.perl_syntax.word_boundaries"></a><h6>
<a name="id539013"></a>
<a name="id816175"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.word_boundaries">Word Boundaries</a>
</h6>
<p>
@ -868,7 +948,7 @@
Matches only when not at a word boundary.
</p>
<a name="boost_regex.syntax.perl_syntax.buffer_boundaries"></a><h6>
<a name="id539115"></a>
<a name="id816244"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.buffer_boundaries">Buffer boundaries</a>
</h6>
<p>
@ -893,7 +973,7 @@
to the regular expression <code class="computeroutput"><span class="special">\</span><span class="identifier">n</span><span class="special">*\</span><span class="identifier">z</span></code>
</p>
<a name="boost_regex.syntax.perl_syntax.continuation_escape"></a><h6>
<a name="id539198"></a>
<a name="id816298"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.continuation_escape">Continuation
Escape</a>
</h6>
@ -905,7 +985,7 @@
match to start where the last one ended.
</p>
<a name="boost_regex.syntax.perl_syntax.quoting_escape"></a><h6>
<a name="id539248"></a>
<a name="id816325"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.quoting_escape">Quoting escape</a>
</h6>
<p>
@ -918,7 +998,7 @@
<span class="special">\*+</span><span class="identifier">aaa</span>
</pre>
<a name="boost_regex.syntax.perl_syntax.unicode_escapes"></a><h6>
<a name="id539354"></a>
<a name="id817829"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.unicode_escapes">Unicode escapes</a>
</h6>
<p>
@ -929,7 +1009,7 @@
combining characters.
</p>
<a name="boost_regex.syntax.perl_syntax.any_other_escape"></a><h6>
<a name="id539418"></a>
<a name="id817867"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.any_other_escape">Any other
escape</a>
</h6>
@ -938,7 +1018,7 @@
\@ matches a literal '@'.
</p>
<a name="boost_regex.syntax.perl_syntax.perl_extended_patterns"></a><h5>
<a name="id539447"></a>
<a name="id817884"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.perl_extended_patterns">Perl
Extended Patterns</a>
</h5>
@ -947,7 +1027,7 @@
<code class="computeroutput"><span class="special">(?</span></code>.
</p>
<a name="boost_regex.syntax.perl_syntax.comments"></a><h6>
<a name="id539489"></a>
<a name="id817908"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.comments">Comments</a>
</h6>
<p>
@ -956,7 +1036,7 @@
are ignored.
</p>
<a name="boost_regex.syntax.perl_syntax.modifiers"></a><h6>
<a name="id539542"></a>
<a name="id817942"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.modifiers">Modifiers</a>
</h6>
<p>
@ -971,7 +1051,7 @@
applies the specified modifiers to pattern only.
</p>
<a name="boost_regex.syntax.perl_syntax.non_marking_groups"></a><h6>
<a name="id539669"></a>
<a name="id818026"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.non_marking_groups">Non-marking
groups</a>
</h6>
@ -980,7 +1060,7 @@
an additional sub-expression.
</p>
<a name="boost_regex.syntax.perl_syntax.lookahead"></a><h6>
<a name="id539720"></a>
<a name="id818057"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.lookahead">Lookahead</a>
</h6>
<p>
@ -1003,7 +1083,7 @@
could be used to validate the password.
</p>
<a name="boost_regex.syntax.perl_syntax.lookbehind"></a><h6>
<a name="id539861"></a>
<a name="id818151"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.lookbehind">Lookbehind</a>
</h6>
<p>
@ -1017,7 +1097,7 @@
(pattern must be of fixed length).
</p>
<a name="boost_regex.syntax.perl_syntax.independent_sub_expressions"></a><h6>
<a name="id539939"></a>
<a name="id818202"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.independent_sub_expressions">Independent
sub-expressions</a>
</h6>
@ -1030,7 +1110,7 @@
no match is found at all.
</p>
<a name="boost_regex.syntax.perl_syntax.conditional_expressions"></a><h6>
<a name="id540003"></a>
<a name="id818243"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.conditional_expressions">Conditional
Expressions</a>
</h6>
@ -1050,7 +1130,7 @@
sub-expression has been matched).
</p>
<a name="boost_regex.syntax.perl_syntax.operator_precedence"></a><h5>
<a name="id540172"></a>
<a name="id818361"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.operator_precedence">Operator
precedence</a>
</h5>
@ -1086,7 +1166,7 @@
</li>
</ol></div>
<a name="boost_regex.syntax.perl_syntax.what_gets_matched"></a><h4>
<a name="id540350"></a>
<a name="id818487"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.what_gets_matched">What gets
matched</a>
</h4>
@ -1271,7 +1351,7 @@
</tbody>
</table></div>
<a name="boost_regex.syntax.perl_syntax.variations"></a><h4>
<a name="id541265"></a>
<a name="id819179"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.variations">Variations</a>
</h4>
<p>
@ -1280,7 +1360,7 @@
<code class="computeroutput"><span class="identifier">JavaScript</span></code> and <code class="computeroutput"><span class="identifier">JScript</span></code></a> are all synonyms for <code class="computeroutput"><span class="identifier">perl</span></code>.
</p>
<a name="boost_regex.syntax.perl_syntax.options"></a><h4>
<a name="id541360"></a>
<a name="id819238"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.options">Options</a>
</h4>
<p>
@ -1293,7 +1373,7 @@
sensitivity are to be applied.
</p>
<a name="boost_regex.syntax.perl_syntax.pattern_modifiers"></a><h4>
<a name="id541461"></a>
<a name="id819298"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.pattern_modifiers">Pattern
Modifiers</a>
</h4>
@ -1305,7 +1385,7 @@
and <code class="computeroutput"><span class="identifier">no_mod_s</span></code></a>.
</p>
<a name="boost_regex.syntax.perl_syntax.references"></a><h4>
<a name="id541588"></a>
<a name="id819379"></a>
<a class="link" href="perl_syntax.html#boost_regex.syntax.perl_syntax.references">References</a>
</h4>
<p>