mirror of
https://github.com/boostorg/regex.git
synced 2025-07-06 17:06:35 +02:00
Add optional support for marked sub-expression location information. Add support for ${n} in format replacement text. Fixes #2556. Fixes #2269. Fixes #2514. [SVN r50370]
156 lines
10 KiB
HTML
156 lines
10 KiB
HTML
<html>
|
||
<head>
|
||
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
|
||
<title>FAQ</title>
|
||
<link rel="stylesheet" href="../../../../../../doc/html/boostbook.css" type="text/css">
|
||
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot_8125">
|
||
<link rel="home" href="../../index.html" title="Boost.Regex">
|
||
<link rel="up" href="../background_information.html" title="Background Information">
|
||
<link rel="prev" href="futher.html" title="References and Further Information">
|
||
<link rel="next" href="performance.html" title="Performance">
|
||
</head>
|
||
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
|
||
<table cellpadding="2" width="100%"><tr>
|
||
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
|
||
<td align="center"><a href="../../../../../../index.html">Home</a></td>
|
||
<td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
|
||
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
|
||
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
|
||
<td align="center"><a href="../../../../../../more/index.htm">More</a></td>
|
||
</tr></table>
|
||
<hr>
|
||
<div class="spirit-nav">
|
||
<a accesskey="p" href="futher.html"><img src="../../../../../../doc/html/images/prev.png" alt="Prev"></a><a accesskey="u" href="../background_information.html"><img src="../../../../../../doc/html/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/html/images/home.png" alt="Home"></a><a accesskey="n" href="performance.html"><img src="../../../../../../doc/html/images/next.png" alt="Next"></a>
|
||
</div>
|
||
<div class="section" lang="en">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_regex.background_information.faq"></a><a class="link" href="faq.html" title="FAQ"> FAQ</a>
|
||
</h3></div></div></div>
|
||
<p>
|
||
<span class="bold"><strong>Q.</strong></span> I can't get regex++ to work with escape
|
||
characters, what's going on?
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>A.</strong></span> If you embed regular expressions in C++
|
||
code, then remember that escape characters are processed twice: once by the
|
||
C++ compiler, and once by the Boost.Regex expression compiler, so to pass
|
||
the regular expression \d+ to Boost.Regex, you need to embed "\d+"
|
||
in your code. Likewise to match a literal backslash you will need to embed
|
||
"\\" in your code.
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>Q.</strong></span> No matter what I do regex_match always
|
||
returns false, what's going on?
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>A.</strong></span> The algorithm regex_match only succeeds
|
||
if the expression matches <span class="bold"><strong>all</strong></span> of the text,
|
||
if you want to <span class="bold"><strong>find</strong></span> a sub-string within
|
||
the text that matches the expression then use regex_search instead.
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>Q.</strong></span> Why does using parenthesis in a POSIX
|
||
regular expression change the result of a match?
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>A.</strong></span> For POSIX (extended and basic) regular
|
||
expressions, but not for perl regexes, parentheses don't only mark; they
|
||
determine what the best match is as well. When the expression is compiled
|
||
as a POSIX basic or extended regex then Boost.Regex follows the POSIX standard
|
||
leftmost longest rule for determining what matched. So if there is more than
|
||
one possible match after considering the whole expression, it looks next
|
||
at the first sub-expression and then the second sub-expression and so on.
|
||
So...
|
||
</p>
|
||
<p>
|
||
"(0*)([0-9]*)" against "00123" would produce $1 = "00"
|
||
$2 = "123"
|
||
</p>
|
||
<p>
|
||
where as
|
||
</p>
|
||
<p>
|
||
"0*([0-9])*" against "00123" would produce $1 = "00123"
|
||
</p>
|
||
<p>
|
||
If you think about it, had $1 only matched the "123", this would
|
||
be "less good" than the match "00123" which is both further
|
||
to the left and longer. If you want $1 to match only the "123"
|
||
part, then you need to use something like:
|
||
</p>
|
||
<p>
|
||
"0*([1-9][0-9]*)"
|
||
</p>
|
||
<p>
|
||
as the expression.
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>Q.</strong></span> Why don't character ranges work properly
|
||
(POSIX mode only)?
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>A.</strong></span> The POSIX standard specifies that character
|
||
range expressions are locale sensitive - so for example the expression [A-Z]
|
||
will match any collating element that collates between 'A' and 'Z'. That
|
||
means that for most locales other than "C" or "POSIX",
|
||
[A-Z] would match the single character 't' for example, which is not what
|
||
most people expect - or at least not what most people have come to expect
|
||
from regular expression engines. For this reason, the default behaviour of
|
||
Boost.Regex (perl mode) is to turn locale sensitive collation off by not
|
||
setting the <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">collate</span></code>
|
||
compile time flag. However if you set a non-default compile time flag - for
|
||
example <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">extended</span></code> or <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">basic</span></code>,
|
||
then locale dependent collation will be enabled, this also applies to the
|
||
POSIX API functions which use either <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">extended</span></code>
|
||
or <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">basic</span></code> internally. [Note - when <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">nocollate</span></code> in effect, the library behaves
|
||
"as if" the LC_COLLATE locale category were always "C",
|
||
regardless of what its actually set to - end note].
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>Q.</strong></span> Why are there no throw specifications
|
||
on any of the functions? What exceptions can the library throw?
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>A.</strong></span> Not all compilers support (or honor)
|
||
throw specifications, others support them but with reduced efficiency. Throw
|
||
specifications may be added at a later date as compilers begin to handle
|
||
this better. The library should throw only three types of exception: [boost::regex_error]
|
||
can be thrown by <a class="link" href="../ref/basic_regex.html" title="basic_regex"><code class="computeroutput"><span class="identifier">basic_regex</span></code></a> when compiling a regular
|
||
expression, <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">runtime_error</span></code> can be thrown when a call
|
||
to <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special">::</span><span class="identifier">imbue</span></code> tries to open a message catalogue
|
||
that doesn't exist, or when a call to <a class="link" href="../ref/regex_search.html" title="regex_search"><code class="computeroutput"><span class="identifier">regex_search</span></code></a> or <a class="link" href="../ref/regex_match.html" title="regex_match"><code class="computeroutput"><span class="identifier">regex_match</span></code></a> results in an "everlasting"
|
||
search, or when a call to <code class="computeroutput"><span class="identifier">RegEx</span><span class="special">::</span><span class="identifier">GrepFiles</span></code>
|
||
or <code class="computeroutput"><span class="identifier">RegEx</span><span class="special">::</span><span class="identifier">FindFiles</span></code> tries to open a file that cannot
|
||
be opened, finally <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">bad_alloc</span></code> can be thrown by just about any
|
||
of the functions in this library.
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>Q.</strong></span> Why can't I use the "convenience"
|
||
versions of regex_match / regex_search / regex_grep / regex_format / regex_merge?
|
||
</p>
|
||
<p>
|
||
<span class="bold"><strong>A.</strong></span> These versions may or may not be available
|
||
depending upon the capabilities of your compiler, the rules determining the
|
||
format of these functions are quite complex - and only the versions visible
|
||
to a standard compliant compiler are given in the help. To find out what
|
||
your compiler supports, run <boost/regex.hpp> through your C++ pre-processor,
|
||
and search the output file for the function that you are interested in. Note
|
||
however, that very few current compilers still have problems with these overloaded
|
||
functions.
|
||
</p>
|
||
</div>
|
||
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
|
||
<td align="left"></td>
|
||
<td align="right"><div class="copyright-footer">Copyright <20> 1998 -2007 John Maddock<p>
|
||
Distributed under the Boost Software License, Version 1.0. (See accompanying
|
||
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
|
||
</p>
|
||
</div></td>
|
||
</tr></table>
|
||
<hr>
|
||
<div class="spirit-nav">
|
||
<a accesskey="p" href="futher.html"><img src="../../../../../../doc/html/images/prev.png" alt="Prev"></a><a accesskey="u" href="../background_information.html"><img src="../../../../../../doc/html/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/html/images/home.png" alt="Home"></a><a accesskey="n" href="performance.html"><img src="../../../../../../doc/html/images/next.png" alt="Next"></a>
|
||
</div>
|
||
</body>
|
||
</html>
|