<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>Boost.Regex: FAQ</title>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
</head>
<body>
<p></p>
<table id="Table1" cellspacing="1" cellpadding="1" width="100%" border="0">
<tr>
<td valign="top" width="300">
<h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
</td>
<td width="353">
<h1 align="center">Boost.Regex</h1>
<h2 align="center">FAQ</h2>
</td>
<td width="50">
<h3><a href="index.html"><img height="45" width="43" alt="Boost.Regex Index" src="uarrow.gif" border="0"></a></h3>
</td>
</tr>
</table>
<br>
<br>
<hr>
<p><font color="#ff0000">Q. Why can't I use the "convenience" versions of
regex_match / regex_search / regex_grep / regex_format / regex_merge?</font></p>
<p>A. These versions may or may not be available, depending on the capabilities
of your compiler. The rules that determine the form of these functions are quite
complex, and the documentation lists only the versions visible to a
standard-compliant compiler. To find out what your compiler supports, run
&lt;boost/regex.hpp&gt; through your C++ preprocessor and search the output
file for the function you are interested in.</p>
<p><font color="#ff0000">Q. I can't get regex++ to work with escape characters,
what's going on?</font></p>
<p>A. If you embed regular expressions in C++ code, remember that escape
characters are processed twice: once by the C++ compiler, and once by the
regex++ expression compiler. To pass the regular expression \d+ to regex++,
you therefore need to embed "\\d+" in your code. Likewise, to match a literal
backslash you need to embed "\\\\" in your code.</p>
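<p>A minimal sketch of the double escaping described above. It uses std::regex
rather than Boost.Regex purely so that it builds without Boost (an assumption of
this example); the C++ string-literal rules it illustrates do not depend on which
regex library receives the string. The helper "matches" and the variable names
are illustrative only.</p>

```cpp
#include <cassert>
#include <regex>
#include <string>

// Illustrative helper: true when the whole of `text` matches `pattern`.
// std::regex is an assumption made so the sketch builds without Boost.
bool matches(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// The C++ compiler turns "\\d+" into the three characters \d+,
// which is what the regex compiler then reads.
const std::string digits_escaped = "\\d+";

// A C++11 raw string literal skips the first round of escaping, so the
// expression is written exactly as the regex compiler will see it.
const std::string digits_raw = R"(\d+)";

// The regex that matches one literal backslash is \\ (two characters),
// which has to be written as "\\\\" in an ordinary string literal.
const std::string one_backslash = "\\\\";
```

<p>Both spellings of the digit expression produce the same string, so choosing
between them is a matter of readability; raw string literals require a C++11
compiler.</p>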
<p><font color="#ff0000">Q. Why does using parentheses in a POSIX regular expression
change the result of a match?</font></p>
<p>A. For POSIX (extended and basic) regular expressions, but not for Perl regexes,
parentheses don't only mark sub-expressions; they also determine what the best
match is. When the expression is compiled as a POSIX basic or extended regex,
Boost.Regex follows the POSIX standard leftmost-longest rule for determining
what matched. If there is more than one possible match after considering the
whole expression, it looks next at the first sub-expression, then the second
sub-expression, and so on. So...</p>
<pre>
"(0*)([0-9]*)" against "00123" would produce
$1 = "00"
$2 = "123"
</pre>
<p>whereas</p>
<pre>
"0*([0-9]*)" against "00123" would produce
$1 = "00123"
</pre>
<p>If you think about it, had $1 only matched the "123", this would be "less good"
than the match "00123", which is both further to the left and longer. If you
want $1 to match only the "123" part, then you need to use something like:</p>
<pre>
"0*([1-9][0-9]*)"
</pre>
<p>as the expression.</p>
<p><font color="#ff0000">Q. Why don't character ranges work properly (POSIX mode
only)?</font><br>
A. The POSIX standard specifies that character range expressions are locale
sensitive, so for example the expression [A-Z] will match any collating
element that collates between 'A' and 'Z'. That means that for most locales
other than "C" or "POSIX", [A-Z] would match the single character 't', for
example, which is not what most people expect, or at least not what most
people have come to expect from regular expression engines. For this reason,
the default behaviour of Boost.Regex (Perl mode) is to turn locale-sensitive
collation off by not setting the regex_constants::collate compile-time flag.
However, if you set a non-default compile-time flag, for example
regex_constants::extended or regex_constants::basic, then locale-dependent
collation will be enabled. This also applies to the POSIX API functions, which
use either regex_constants::extended or regex_constants::basic internally. <i>[Note:
when regex_constants::nocollate is in effect, the library behaves "as if" the
LC_COLLATE locale category were always "C", regardless of what it is actually set
to. End note]</i></p>
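<p>To make "collates between 'A' and 'Z'" concrete, the sketch below tests
whether a character falls inside a range according to a locale's collation
order. It is only an illustration built on the standard std::collate facet, not
a description of how Boost.Regex implements ranges; the helper name
"collates_within" is hypothetical.</p>

```cpp
#include <cassert>
#include <locale>

// Hypothetical helper: true when `c` collates between `lo` and `hi`
// (inclusive) under the collation rules of `loc`.
bool collates_within(const std::locale& loc, char lo, char c, char hi) {
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    // compare() returns -1, 0 or 1 for "less", "equal" or "greater".
    const bool not_below = coll.compare(&lo, &lo + 1, &c, &c + 1) <= 0;
    const bool not_above = coll.compare(&c, &c + 1, &hi, &hi + 1) <= 0;
    return not_below && not_above;
}
```

<p>With std::locale::classic() (the "C" locale) the call
collates_within(std::locale::classic(), 'A', 't', 'Z') returns false, because
characters are ordered by code. With a named locale such as
std::locale("en_US.UTF-8"), where one is installed, the same call may return
true, which is the surprise described above.</p>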
<p><font color="#ff0000">Q. Why are there no throw specifications on any of the
functions? What exceptions can the library throw?</font></p>
<p>A. Not all compilers support (or honor) throw specifications; others support
them but with reduced efficiency. Throw specifications may be added at a later
date as compilers begin to handle this better. The library should throw only
three types of exception. boost::bad_expression can be thrown by basic_regex
when compiling a regular expression. std::runtime_error can be thrown when a
call to basic_regex::imbue tries to open a message catalogue that doesn't
exist, when a call to regex_search or regex_match results in an
"everlasting" search, or when a call to RegEx::GrepFiles or
RegEx::FindFiles tries to open a file that cannot be opened. Finally,
std::bad_alloc can be thrown by just about any of the functions in this
library.</p>
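<p>A sketch of how calling code might handle these three kinds of exception. It
is written against std::regex so that it builds without Boost, which is an
assumption of this example: std::regex_error stands in here for
boost::bad_expression, and with Boost.Regex the first handler would name
boost::bad_expression instead. The helper "try_search" is hypothetical.</p>

```cpp
#include <cassert>
#include <new>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper: compiles `pattern`, runs one search over `text`,
// and reports which kind of exception (if any) was raised.
std::string try_search(const std::string& pattern, const std::string& text) {
    try {
        const std::regex re(pattern);  // throws if the pattern is malformed
        std::regex_search(text, re);   // may throw if matching gets too complex
        return "ok";
    } catch (const std::regex_error&) {
        // Most-derived handler first: std::regex_error derives from
        // std::runtime_error, so the order of the handlers matters.
        return "regex error";
    } catch (const std::runtime_error&) {
        return "runtime error";
    } catch (const std::bad_alloc&) {
        return "out of memory";
    }
}
```

<p>For example, try_search("(unclosed", "x") takes the first handler because the
pattern cannot be compiled, while try_search("a+b", "xaab") completes
normally.</p>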
<p></p>
<hr>
<p>Revised
<!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan -->
24 Oct 2003
<!--webbot bot="Timestamp" endspan i-checksum="39359" --></p>
<p><i>&copy; Copyright John Maddock 1998-
<!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%Y" startspan --> 2003<!--webbot bot="Timestamp" endspan i-checksum="39359" --></i></p>
<P><I>Use, modification and distribution are subject to the Boost Software License,
Version 1.0. (See accompanying file <A href="../../../LICENSE_1_0.txt">LICENSE_1_0.txt</A>
or copy at <A href="http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</A>)</I></P>
</body>
</html>