Fix gcc warnings from ICU wrappers.

Add optional support for marked sub-expression location information.
Add support for ${n} in format replacement text.
Fixes #2556.
Fixes #2269.
Fixes #2514.

[SVN r50370]
John Maddock
2008-12-23 11:46:00 +00:00
parent c997a1fcc6
commit b4152cd74d
94 changed files with 1344 additions and 1068 deletions

@ -3,8 +3,8 @@
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>POSIX Extended Regular Expression Syntax</title>
<link rel="stylesheet" href="../../../../../../doc/html/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot_2006-12-17_0120">
<link rel="start" href="../../index.html" title="Boost.Regex">
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot_8125">
<link rel="home" href="../../index.html" title="Boost.Regex">
<link rel="up" href="../syntax.html" title="Regular Expression Syntax">
<link rel="prev" href="perl_syntax.html" title="Perl Regular Expression Syntax">
<link rel="next" href="basic_syntax.html" title="POSIX Basic Regular Expression Syntax">
@ -24,12 +24,12 @@
</div>
<div class="section" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_regex.syntax.basic_extended"></a><a href="basic_extended.html" title="POSIX Extended Regular Expression Syntax"> POSIX Extended Regular
<a name="boost_regex.syntax.basic_extended"></a><a class="link" href="basic_extended.html" title="POSIX Extended Regular Expression Syntax"> POSIX Extended Regular
Expression Syntax</a>
</h3></div></div></div>
<a name="boost_regex.syntax.basic_extended.synopsis"></a><h4>
<a name="id514165"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.synopsis">Synopsis</a>
<a name="id541641"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.synopsis">Synopsis</a>
</h4>
<p>
The POSIX-Extended regular expression syntax is supported by the POSIX C
@ -46,8 +46,8 @@
<a name="boost_regex.posix_extended_syntax"></a><p>
</p>
<a name="boost_regex.syntax.basic_extended.posix_extended_syntax"></a><h4>
<a name="id514430"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.posix_extended_syntax">POSIX
<a name="id541905"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.posix_extended_syntax">POSIX
Extended Syntax</a>
</h4>
<p>
@ -56,8 +56,8 @@
</p>
<pre class="programlisting">.[{()\*+?|^$</pre>
<a name="boost_regex.syntax.basic_extended.wildcard_"></a><h5>
<a name="id514469"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.wildcard_">Wildcard:</a>
<a name="id541945"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.wildcard_">Wildcard:</a>
</h5>
<p>
The single character '.' when used outside of a character set will match
@ -74,8 +74,8 @@
</li>
</ul></div>
<a name="boost_regex.syntax.basic_extended.anchors_"></a><h5>
<a name="id514537"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.anchors_">Anchors:</a>
<a name="id542013"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.anchors_">Anchors:</a>
</h5>
<p>
A '^' character shall match the start of a line when used as the first character
@ -86,8 +86,8 @@
of an expression, or the last character of a sub-expression.
</p>
<a name="boost_regex.syntax.basic_extended.marked_sub_expressions_"></a><h5>
<a name="id514573"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.marked_sub_expressions_">Marked
<a name="id542049"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.marked_sub_expressions_">Marked
sub-expressions:</a>
</h5>
<p>
@ -98,8 +98,8 @@
to by a back-reference.
</p>
<a name="boost_regex.syntax.basic_extended.repeats_"></a><h5>
<a name="id514630"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.repeats_">Repeats:</a>
<a name="id542105"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.repeats_">Repeats:</a>
</h5>
<p>
Any atom (a single character, a marked sub-expression, or a character class)
@ -184,8 +184,8 @@ cab
operator to be applied to.
</p>
<a name="boost_regex.syntax.basic_extended.back_references_"></a><h5>
<a name="id515077"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.back_references_">Back references:</a>
<a name="id542553"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.back_references_">Back references:</a>
</h5>
<p>
An escape character followed by a digit <span class="emphasis"><em>n</em></span>, where <span class="emphasis"><em>n</em></span>
@ -214,8 +214,8 @@ cab
</p></td></tr>
</table></div>
<a name="boost_regex.syntax.basic_extended.alternation"></a><h5>
<a name="id515171"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.alternation">Alternation</a>
<a name="id542647"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.alternation">Alternation</a>
</h5>
<p>
The <code class="computeroutput"><span class="special">|</span></code> operator will match either
@ -227,8 +227,8 @@ cab
will match either of "abd" or "abef".
</p>
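The alternation behaviour described above can be checked with a small sketch. It rests on two assumptions: it uses the C++ standard library's `std::regex` with the `extended` grammar rather than Boost.Regex, so that it is self-contained, and the pattern `ab(d|ef)` is assumed, because the diff context does not show the expression itself. The helper names `matches_extended` and `check_alternation` are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when the whole of `text` matches `pattern`, with the pattern
// compiled using the standard library's POSIX-Extended grammar.
// Assumption: std::regex stands in for Boost.Regex here; the helper name is
// hypothetical and not part of either library.
bool matches_extended(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern, std::regex::extended);
    return std::regex_match(text, re);
}

// Exercises the grouped alternation: either "d" or "ef" may follow "ab".
void check_alternation()
{
    assert(matches_extended("ab(d|ef)", "abd"));
    assert(matches_extended("ab(d|ef)", "abef"));
    // Both alternatives in sequence are not accepted as a whole match.
    assert(!matches_extended("ab(d|ef)", "abdef"));
}
```

Calling `check_alternation()` from a test or from `main` runs the three checks; the parentheses limit the scope of `|` to `d` and `ef`.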
<a name="boost_regex.syntax.basic_extended.character_sets_"></a><h5>
<a name="id515274"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.character_sets_">Character
<a name="id542750"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.character_sets_">Character
sets:</a>
</h5>
<p>
@ -240,8 +240,8 @@ cab
A bracket expression may contain any combination of the following:
</p>
<a name="boost_regex.syntax.basic_extended.single_characters_"></a><h6>
<a name="id515311"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.single_characters_">Single
<a name="id542786"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.single_characters_">Single
characters:</a>
</h6>
<p>
@ -249,8 +249,8 @@ cab
or 'c'.
</p>
<a name="boost_regex.syntax.basic_extended.character_ranges_"></a><h6>
<a name="id515361"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.character_ranges_">Character
<a name="id542837"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.character_ranges_">Character
ranges:</a>
</h6>
<p>
@ -260,13 +260,13 @@ cab
within the range <span class="emphasis"><em>y</em></span> to <span class="emphasis"><em>z</em></span>, if it
collates within that range; this results in locale-specific behavior. This
behavior can be turned off by unsetting the <code class="computeroutput"><span class="identifier">collate</span></code>
<a href="../ref/syntax_option_type.html" title="syntax_option_type">option flag</a> - in
<a class="link" href="../ref/syntax_option_type.html" title="syntax_option_type">option flag</a> - in
which case whether a character appears within a range is determined by comparing
the code points of the characters only.
</p>
<a name="boost_regex.syntax.basic_extended.negation_"></a><h6>
<a name="id515462"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.negation_">Negation:</a>
<a name="id542938"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.negation_">Negation:</a>
</h6>
<p>
If the bracket-expression begins with the ^ character, then it matches the
@ -274,18 +274,18 @@ cab
range <code class="computeroutput"><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span></code>.
</p>
<a name="boost_regex.syntax.basic_extended.character_classes_"></a><h6>
<a name="id515544"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.character_classes_">Character
<a name="id543020"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.character_classes_">Character
classes:</a>
</h6>
<p>
An expression of the form <code class="computeroutput"><span class="special">[[:</span><span class="identifier">name</span><span class="special">:]]</span></code>
matches the named character class "name", for example <code class="computeroutput"><span class="special">[[:</span><span class="identifier">lower</span><span class="special">:]]</span></code> matches any lower case character. See
<a href="character_classes.html" title="Character Class Names">character class names</a>.
<a class="link" href="character_classes.html" title="Character Class Names">character class names</a>.
</p>
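As a minimal sketch of the `[[:lower:]]` class described above, the following uses the C++ standard library's `std::regex` with the `extended` grammar in place of Boost.Regex (an assumption made so the example is self-contained). The function name `is_single_lower` is hypothetical, and the result depends on the locale imbued in the regex, which is the default global locale here.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when `text` is exactly one lower case character according to
// the [[:lower:]] character class.
// Assumption: std::regex with the POSIX-Extended grammar stands in for
// Boost.Regex; the function name is hypothetical.
bool is_single_lower(const std::string& text)
{
    static const std::regex re("[[:lower:]]", std::regex::extended);
    return std::regex_match(text, re);
}
```

With the default "C" locale, `is_single_lower("q")` is expected to hold, while `"Q"` and `"qq"` are rejected: the first is not lower case and the second is longer than the single character the bracket expression matches.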
<a name="boost_regex.syntax.basic_extended.collating_elements_"></a><h6>
<a name="id515627"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.collating_elements_">Collating
<a name="id543103"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.collating_elements_">Collating
Elements:</a>
</h6>
<p>
@ -304,7 +304,7 @@ cab
match either one of the characters 'abc^'.
</p>
<p>
As an extension, a collating element may also be specified via its <a href="collating_names.html" title="Collating Names">symbolic name</a>, for example:
As an extension, a collating element may also be specified via its <a class="link" href="collating_names.html" title="Collating Names">symbolic name</a>, for example:
</p>
<pre class="programlisting"><span class="special">[[.</span><span class="identifier">NUL</span><span class="special">.]]</span>
</pre>
@ -312,15 +312,15 @@ cab
matches a NUL character.
</p>
<a name="boost_regex.syntax.basic_extended.equivalence_classes_"></a><h6>
<a name="id515785"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.equivalence_classes_">Equivalence
<a name="id543264"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.equivalence_classes_">Equivalence
classes:</a>
</h6>
<p>
An expression of the form <code class="computeroutput"><span class="special">[[=</span><span class="identifier">col</span><span class="special">=]]</span></code>,
matches any character or collating element whose primary sort key is the
same as that for collating element <span class="emphasis"><em>col</em></span>, as with collating
elements the name <span class="emphasis"><em>col</em></span> may be a <a href="collating_names.html" title="Collating Names">symbolic
elements the name <span class="emphasis"><em>col</em></span> may be a <a class="link" href="collating_names.html" title="Collating Names">symbolic
name</a>. A primary sort key is one that ignores case, accentation, or
locale-specific tailorings; so for example <code class="computeroutput"><span class="special">[[=</span><span class="identifier">a</span><span class="special">=]]</span></code> matches
any of the characters: a, À, Á, Â, Ã, Ä, Å, A, à, á, â, ã, ä and å. Unfortunately implementation
@ -329,16 +329,16 @@ cab
or even all locales on one platform.
</p>
<a name="boost_regex.syntax.basic_extended.combinations_"></a><h6>
<a name="id515889"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.combinations_">Combinations:</a>
<a name="id543369"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.combinations_">Combinations:</a>
</h6>
<p>
All of the above can be combined in one character set declaration, for example:
<code class="computeroutput"><span class="special">[[:</span><span class="identifier">digit</span><span class="special">:]</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">[.</span><span class="identifier">NUL</span><span class="special">.]]</span></code>.
</p>
<a name="boost_regex.syntax.basic_extended.escapes"></a><h5>
<a name="id515969"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.escapes">Escapes</a>
<a name="id543448"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.escapes">Escapes</a>
</h5>
<p>
The POSIX standard defines no escape sequences for POSIX-Extended regular
@ -363,8 +363,8 @@ cab
extensions are also supported by Boost.Regex:
</p>
<a name="boost_regex.syntax.basic_extended.escapes_matching_a_specific_character"></a><h6>
<a name="id516039"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.escapes_matching_a_specific_character">Escapes
<a name="id543518"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.escapes_matching_a_specific_character">Escapes
matching a specific character</a>
</h6>
<p>
@ -552,8 +552,8 @@ cab
</tbody>
</table></div>
<a name="boost_regex.syntax.basic_extended._quot_single_character_quot__character_classes_"></a><h6>
<a name="id516386"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended._quot_single_character_quot__character_classes_">"Single
<a name="id543866"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended._quot_single_character_quot__character_classes_">"Single
character" character classes:</a>
</h6>
<p>
@ -706,8 +706,8 @@ cab
</tbody>
</table></div>
<a name="boost_regex.syntax.basic_extended.character_properties"></a><h6>
<a name="id517018"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.character_properties">Character
<a name="id544497"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.character_properties">Character
Properties</a>
</h6>
<p>
@ -813,8 +813,8 @@ cab
matches any "digit" character, as does <code class="computeroutput"><span class="special">\</span><span class="identifier">p</span><span class="special">{</span><span class="identifier">digit</span><span class="special">}</span></code>.
</p>
<a name="boost_regex.syntax.basic_extended.word_boundaries"></a><h6>
<a name="id517419"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.word_boundaries">Word Boundaries</a>
<a name="id544898"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.word_boundaries">Word Boundaries</a>
</h6>
<p>
The following escape sequences match the boundaries of words:
@ -888,8 +888,8 @@ cab
</tbody>
</table></div>
<a name="boost_regex.syntax.basic_extended.buffer_boundaries"></a><h6>
<a name="id517612"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.buffer_boundaries">Buffer
<a name="id545091"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.buffer_boundaries">Buffer
boundaries</a>
</h6>
<p>
@ -979,8 +979,8 @@ cab
</tbody>
</table></div>
<a name="boost_regex.syntax.basic_extended.continuation_escape"></a><h6>
<a name="id517847"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.continuation_escape">Continuation
<a name="id545326"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.continuation_escape">Continuation
Escape</a>
</h6>
<p>
@ -991,8 +991,8 @@ cab
match to start where the last one ended.
</p>
<a name="boost_regex.syntax.basic_extended.quoting_escape"></a><h6>
<a name="id517896"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.quoting_escape">Quoting
<a name="id545376"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.quoting_escape">Quoting
escape</a>
</h6>
<p>
@ -1005,8 +1005,8 @@ cab
<span class="special">\*+</span><span class="identifier">aaa</span>
</pre>
<a name="boost_regex.syntax.basic_extended.unicode_escapes"></a><h6>
<a name="id518020"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.unicode_escapes">Unicode
<a name="id545499"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.unicode_escapes">Unicode
escapes</a>
</h6>
<div class="informaltable"><table class="table">
@ -1056,8 +1056,8 @@ cab
</tbody>
</table></div>
<a name="boost_regex.syntax.basic_extended.any_other_escape"></a><h6>
<a name="id518153"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.any_other_escape">Any other
<a name="id545632"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.any_other_escape">Any other
escape</a>
</h6>
<p>
@ -1065,8 +1065,8 @@ cab
\@ matches a literal '@'.
</p>
<a name="boost_regex.syntax.basic_extended.operator_precedence"></a><h5>
<a name="id518183"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.operator_precedence">Operator
<a name="id545662"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.operator_precedence">Operator
precedence</a>
</h5>
<p>
@ -1101,27 +1101,27 @@ cab
</li>
</ol></div>
<a name="boost_regex.syntax.basic_extended.what_gets_matched"></a><h5>
<a name="id518372"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.what_gets_matched">What
<a name="id545852"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.what_gets_matched">What
Gets Matched</a>
</h5>
<p>
When there is more than one way to match a regular expression, the "best"
possible match is obtained using the <a href="leftmost_longest_rule.html" title="The Leftmost Longest Rule">leftmost-longest
possible match is obtained using the <a class="link" href="leftmost_longest_rule.html" title="The Leftmost Longest Rule">leftmost-longest
rule</a>.
</p>
<a name="boost_regex.syntax.basic_extended.variations"></a><h4>
<a name="id518412"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.variations">Variations</a>
<a name="id545892"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.variations">Variations</a>
</h4>
<a name="boost_regex.syntax.basic_extended.egrep"></a><h5>
<a name="id518435"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.egrep">Egrep</a>
<a name="id545915"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.egrep">Egrep</a>
</h5>
<p>
When an expression is compiled with the <a href="../ref/syntax_option_type.html" title="syntax_option_type">flag
When an expression is compiled with the <a class="link" href="../ref/syntax_option_type.html" title="syntax_option_type">flag
<code class="computeroutput"><span class="identifier">egrep</span></code></a> set, then the
expression is treated as a newline separated list of <a href="basic_extended.html#boost_regex.posix_extended_syntax">POSIX-Extended
expression is treated as a newline separated list of <a class="link" href="basic_extended.html#boost_regex.posix_extended_syntax">POSIX-Extended
expressions</a>, a match is found if any of the expressions in the list
match, for example:
</p>
@ -1136,11 +1136,11 @@ cab
used with the -E option.
</p>
<a name="boost_regex.syntax.basic_extended.awk"></a><h5>
<a name="id518593"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.awk">awk</a>
<a name="id546073"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.awk">awk</a>
</h5>
<p>
In addition to the <a href="basic_extended.html#boost_regex.posix_extended_syntax">POSIX-Extended
In addition to the <a class="link" href="basic_extended.html#boost_regex.posix_extended_syntax">POSIX-Extended
features</a> the escape character is special inside a character class
declaration.
</p>
@ -1150,21 +1150,21 @@ cab
these by default anyway.
</p>
<a name="boost_regex.syntax.basic_extended.options"></a><h4>
<a name="id518640"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.options">Options</a>
<a name="id546119"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.options">Options</a>
</h4>
<p>
There are a <a href="../ref/syntax_option_type/syntax_option_type_extended.html" title="Options for POSIX Extended Regular Expressions">variety
There are a <a class="link" href="../ref/syntax_option_type/syntax_option_type_extended.html" title="Options for POSIX Extended Regular Expressions">variety
of flags</a> that may be combined with the <code class="computeroutput"><span class="identifier">extended</span></code>
and <code class="computeroutput"><span class="identifier">egrep</span></code> options when constructing
the regular expression, in particular note that the <a href="../ref/syntax_option_type/syntax_option_type_extended.html" title="Options for POSIX Extended Regular Expressions"><code class="computeroutput"><span class="identifier">newline_alt</span></code></a> option alters the syntax,
while the <a href="../ref/syntax_option_type/syntax_option_type_extended.html" title="Options for POSIX Extended Regular Expressions"><code class="computeroutput"><span class="identifier">collate</span></code>, <code class="computeroutput"><span class="identifier">nosubs</span></code>
the regular expression, in particular note that the <a class="link" href="../ref/syntax_option_type/syntax_option_type_extended.html" title="Options for POSIX Extended Regular Expressions"><code class="computeroutput"><span class="identifier">newline_alt</span></code></a> option alters the syntax,
while the <a class="link" href="../ref/syntax_option_type/syntax_option_type_extended.html" title="Options for POSIX Extended Regular Expressions"><code class="computeroutput"><span class="identifier">collate</span></code>, <code class="computeroutput"><span class="identifier">nosubs</span></code>
and <code class="computeroutput"><span class="identifier">icase</span></code> options</a>
modify how the case and locale sensitivity are to be applied.
</p>
<a name="boost_regex.syntax.basic_extended.references"></a><h4>
<a name="id518768"></a>
<a href="basic_extended.html#boost_regex.syntax.basic_extended.references">References</a>
<a name="id546248"></a>
<a class="link" href="basic_extended.html#boost_regex.syntax.basic_extended.references">References</a>
</h4>
<p>
<a href="http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap09.html" target="_top">IEEE