Fixed typo

[SVN r22438]
John Maddock
2004-03-05 11:32:34 +00:00
parent 4b7f14e72d
commit f90d8c667e
2 changed files with 190 additions and 268 deletions

@@ -1,153 +1,114 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html> <html>
<head> <head>
<meta name="generator" content="HTML Tidy, see www.w3.org"> <title>Boost.Regex: FAQ</title>
<title>Boost.Regex: FAQ</title> <meta name="generator" content="HTML Tidy, see www.w3.org">
<meta http-equiv="Content-Type" content= <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
"text/html; charset=iso-8859-1"> <link rel="stylesheet" type="text/css" href="../../../boost.css">
<link rel="stylesheet" type="text/css" href="../../../boost.css"> </head>
</head> <body>
<body> <p></p>
<p></p> <table id="Table1" cellspacing="1" cellpadding="1" width="100%" border="0">
<tr>
<table id="Table1" cellspacing="1" cellpadding="1" width="100%" <td valign="top" width="300">
border="0"> <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../c++boost.gif" border="0"></a></h3>
<tr> </td>
<td valign="top" width="300"> <td width="353">
<h3><a href="../../../index.htm"><img height="86" width="277" alt= <h1 align="center">Boost.Regex</h1>
"C++ Boost" src="../../../c++boost.gif" border="0"></a></h3> <h2 align="center">FAQ</h2>
</td> </td>
<td width="353"> <td width="50">
<h1 align="center">Boost.Regex</h1> <h3><a href="index.html"><img height="45" width="43" alt="Boost.Regex Index" src="uarrow.gif" border="0"></a></h3>
</td>
<h2 align="center">FAQ</h2> </tr>
</td> </table>
<td width="50"> <br>
<h3><a href="index.html"><img height="45" width="43" alt= <br>
"Boost.Regex Index" src="uarrow.gif" border="0"></a></h3> <hr>
</td> <font color="#ff0000"><font color="#ff0000"></font></font>
</tr> <p><font color="#ff0000"><font color="#ff0000"><font color="#ff0000">&nbsp;Q. Why can't I
</table> use the "convenience" versions of regex_match / regex_search / regex_grep /
regex_format / regex_merge?</font></font></font></p>
<br> <p>A. These versions may or may not be available depending upon the capabilities
<br> of your compiler, the rules determining the format of these functions are quite
complex - and only the versions visible to a standard compliant compiler are
given in the help. To find out what your compiler supports, run
<hr> &lt;boost/regex.hpp&gt; through your C++ pre-processor, and search the output
<font color="#ff0000"><font color="#ff0000"></font></font> file for the function that you are interested in.<font color="#ff0000"><font color="#ff0000"></font></font></p>
<p><font color="#ff0000"><font color="#ff0000"><font color= <p><font color="#ff0000"><font color="#ff0000">Q. I can't get regex++ to work with
"#ff0000">&nbsp;Q. Why can't I use the "convenience" versions of escape characters, what's going on?</font></font></p>
regex_match / regex_search / regex_grep / regex_format / <p>A. If you embed regular expressions in C++ code, then remember that escape
regex_merge?</font></font></font></p> characters are processed twice: once by the C++ compiler, and once by the
regex++ expression compiler, so to pass the regular expression \d+ to regex++,
<p>A. These versions may or may not be available depending upon the you need to embed "\\d+" in your code. Likewise to match a literal backslash
capabilities of your compiler, the rules determining the format of you will need to embed "\\\\" in your code. <font color="#ff0000"></font>
these functions are quite complex - and only the versions visible </p>
to a standard compliant compiler are given in the help. To find out <p><font color="#ff0000">Q. Why does using parenthesis in a POSIX regular expression
what your compiler supports, run &lt;boost/regex.hpp&gt; through change the result of a match?</font></p>
your C++ pre-processor, and search the output file for the function <p>For POSIX (extended and basic) regular expressions, but not for perl regexes,
that you are interested in.<font color="#ff0000"><font color= parentheses don't only mark; they determine what the best match is as well.
"#ff0000"></font></font></p> When the expression is compiled as a POSIX basic or extended regex then
Boost.regex follows the POSIX standard leftmost longest rule for determining
<p><font color="#ff0000"><font color="#ff0000">Q. I can't get what matched. So if there is more than one possible match after considering the
regex++ to work with escape characters, what's going whole expression, it looks next at the first sub-expression and then the second
on?</font></font></p> sub-expression and so on. So...</p>
<pre>
<p>A. If you embed regular expressions in C++ code, then remember
that escape characters are processed twice: once by the C++
compiler, and once by the regex++ expression compiler, so to pass
the regular expression \d+ to regex++, you need to embed "\\d+" in
your code. Likewise to match a literal backslash you will need to
embed "\\\\" in your code. <font color="#ff0000"></font></p>
<p><font color="#ff0000">Q. Why does using parenthesis in a POSIX
regular expression change the result of a match?</font></p>
<p>For POSIX (extended and basic) regular expressions, but not for
perl regexes, parentheses don't only mark; they determine what the
best match is as well. When the expression is compiled as a POSIX
basic or extended regex then Boost.regex follows the POSIX standard
leftmost longest rule for determining what matched. So if there is
more than one possible match after considering the whole
expression, it looks next at the first sub-expression and then the
second sub-expression and so on. So...</p>
<pre>
"(0*)([0-9]*)" against "00123" would produce "(0*)([0-9]*)" against "00123" would produce
$1 = "00" $1 = "00"
$2 = "123" $2 = "123"
</pre> </pre>
<p>where as</p>
<p>where as</p> <pre>
"0*([0-9])*" against "00123" would produce
<pre>
"0*([0-9)*" against "00123" would produce
$1 = "00123" $1 = "00123"
</pre> </pre>
<p>If you think about it, had $1 only matched the "123", this would be "less good"
<p>If you think about it, had $1 only matched the "123", this would than the match "00123" which is both further to the left and longer. If you
be "less good" than the match "00123" which is both further to the want $1 to match only the "123" part, then you need to use something like:</p>
left and longer. If you want $1 to match only the "123" part, then <pre>
you need to use something like:</p>
<pre>
"0*([1-9][0-9]*)" "0*([1-9][0-9]*)"
</pre> </pre>
<p>as the expression.</p>
<p>as the expression.</p> <p><font color="#ff0000">Q. Why don't character ranges work properly (POSIX mode
only)?</font><br>
<p><font color="#ff0000">Q. Why don't character ranges work A. The POSIX standard specifies that character range expressions are locale
properly (POSIX mode only)?</font><br> sensitive - so for example the expression [A-Z] will match any collating
A. The POSIX standard specifies that character range expressions element that collates between 'A' and 'Z'. That means that for most locales
are locale sensitive - so for example the expression [A-Z] will other than "C" or "POSIX", [A-Z] would match the single character 't' for
match any collating element that collates between 'A' and 'Z'. That example, which is not what most people expect - or at least not what most
means that for most locales other than "C" or "POSIX", [A-Z] would people have come to expect from regular expression engines. For this reason,
match the single character 't' for example, which is not what most the default behaviour of boost.regex (perl mode) is to turn locale sensitive
people expect - or at least not what most people have come to collation off by not setting the regex_constants::collate compile time flag.
expect from regular expression engines. For this reason, the However if you set a non-default compile time flag - for example
default behaviour of boost.regex (perl mode) is to turn locale regex_constants::extended or regex_constants::basic, then locale dependent
sensitive collation off by not setting the regex_constants::collate collation will be enabled, this also applies to the POSIX API functions which
compile time flag. However if you set a non-default compile time use either regex_constants::extended or regex_constants::basic internally. <i>[Note
flag - for example regex_constants::extended or - when regex_constants::nocollate in effect, the library behaves "as if" the
regex_constants::basic, then locale dependent collation will be LC_COLLATE locale category were always "C", regardless of what its actually set
enabled, this also applies to the POSIX API functions which use to - end note</i>].</p>
either regex_constants::extended or regex_constants::basic <p><font color="#ff0000">Q. Why are there no throw specifications on any of the
internally. <i>[Note - when regex_constants::nocollate in effect, functions? What exceptions can the library throw?</font></p>
the library behaves "as if" the LC_COLLATE locale category were <p>A. Not all compilers support (or honor) throw specifications, others support
always "C", regardless of what its actually set to - end them but with reduced efficiency. Throw specifications may be added at a later
note</i>].</p> date as compilers begin to handle this better. The library should throw only
three types of exception: boost::bad_expression can be thrown by basic_regex
<p><font color="#ff0000">Q. Why are there no throw specifications when compiling a regular expression, std::runtime_error can be thrown when a
on any of the functions? What exceptions can the library call to basic_regex::imbue tries to open a message catalogue that doesn't
throw?</font></p> exist, or when a call to regex_search or regex_match results in an
"everlasting" search,&nbsp;or when a call to RegEx::GrepFiles or
<p>A. Not all compilers support (or honor) throw specifications, RegEx::FindFiles tries to open a file that cannot be opened, finally
others support them but with reduced efficiency. Throw std::bad_alloc can be thrown by just about any of the functions in this
specifications may be added at a later date as compilers begin to library.</p>
handle this better. The library should throw only three types of <p></p>
exception: boost::bad_expression can be thrown by basic_regex when <hr>
compiling a regular expression, std::runtime_error can be thrown
when a call to basic_regex::imbue tries to open a message catalogue
that doesn't exist, or when a call to regex_search or regex_match
results in an "everlasting" search,&nbsp;or when a call to
RegEx::GrepFiles or RegEx::FindFiles tries to open a file that
cannot be opened, finally std::bad_alloc can be thrown by just
about any of the functions in this library.</p>
<p></p>
<hr>
<p>Revised <p>Revised
<!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan --> <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan -->
24 Oct 2003 24 Oct 2003
<!--webbot bot="Timestamp" endspan i-checksum="39359" --></p> <!--webbot bot="Timestamp" endspan i-checksum="39359" --></p>
<p><i>© Copyright John Maddock&nbsp;1998- <p><i>© Copyright John Maddock&nbsp;1998-
<!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%Y" startspan --> <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%Y" startspan --> 2003<!--webbot bot="Timestamp" endspan i-checksum="39359" --></i></p>
2003<!--webbot bot="Timestamp" endspan i-checksum="39359" --></i></p>
<P><I>Use, modification and distribution are subject to the Boost Software License, <P><I>Use, modification and distribution are subject to the Boost Software License,
Version 1.0. (See accompanying file <A href="../../../LICENSE_1_0.txt">LICENSE_1_0.txt</A> Version 1.0. (See accompanying file <A href="../../../LICENSE_1_0.txt">LICENSE_1_0.txt</A>
or copy at <A href="http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</A>)</I></P> or copy at <A href="http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</A>)</I></P>
</body> </body>
</html> </html>
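
Note on the escape-character answer in the FAQ above: the double processing it describes can be checked with a short sketch. This is only an illustration under stated assumptions: it uses std::regex from the C++11 standard library (default ECMAScript grammar) as a stand-in for Boost.Regex, and the names matches_whole and demo_escapes are hypothetical helpers, not part of either library.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not part of Boost.Regex or the FAQ): returns true when
// the whole of `text` matches `pattern`. std::regex with its default
// ECMAScript grammar is an assumed stand-in for a Perl-mode boost::regex.
bool matches_whole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// Hypothetical demo function exercising the two layers of escape processing.
void demo_escapes() {
    // Layer 1: the C++ compiler reduces "\\d+" to the three characters \d+.
    assert(std::string("\\d+").size() == 3);
    // Layer 2: the regex compiler reads \d+ as "one or more digits".
    assert(matches_whole("\\d+", "12345"));

    // "\\\\" reaches the regex compiler as the two characters \\,
    // which it reads as one literal backslash.
    assert(std::string("\\\\").size() == 2);
    assert(matches_whole("\\\\", "\\"));

    // A C++11 raw string literal skips the first layer entirely.
    assert(matches_whole(R"(\d+)", "12345"));
}
```

Calling demo_escapes() from main runs the checks. With a compiler that supports raw string literals, R"(\d+)" is usually the more readable way to write the same expression.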
