<html>
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
|
|
<title>POSIX Basic Regular Expression Syntax</title>
|
|
<link rel="stylesheet" href="../../../../../../doc/html/boostbook.css" type="text/css">
|
|
<meta name="generator" content="DocBook XSL Stylesheets Vsnapshot_2006-12-17_0120">
|
|
<link rel="start" href="../../index.html" title="Boost.Regex">
|
|
<link rel="up" href="../syntax.html" title="Regular Expression Syntax">
|
|
<link rel="prev" href="basic_extended.html" title="POSIX Extended Regular Expression Syntax">
|
|
<link rel="next" href="character_classes.html" title="Character Class Names">
|
|
</head>
|
|
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
|
|
<table cellpadding="2" width="100%"><tr>
|
|
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
|
|
<td align="center"><a href="../../../../../../index.html">Home</a></td>
|
|
<td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
|
|
<td align="center"><a href="http://www.boost.org/people/people.htm">People</a></td>
|
|
<td align="center"><a href="http://www.boost.org/more/faq.htm">FAQ</a></td>
|
|
<td align="center"><a href="../../../../../../more/index.htm">More</a></td>
|
|
</tr></table>
|
|
<hr>
|
|
<div class="spirit-nav">
|
|
<a accesskey="p" href="basic_extended.html"><img src="../../../../../../doc/html/images/prev.png" alt="Prev"></a><a accesskey="u" href="../syntax.html"><img src="../../../../../../doc/html/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/html/images/home.png" alt="Home"></a><a accesskey="n" href="character_classes.html"><img src="../../../../../../doc/html/images/next.png" alt="Next"></a>
|
|
</div>
|
|
<div class="section" lang="en">
|
|
<div class="titlepage"><div><div><h3 class="title">
|
|
<a name="boost_regex.syntax.basic_syntax"></a><a href="basic_syntax.html" title="POSIX Basic Regular Expression Syntax"> POSIX Basic Regular
|
|
Expression Syntax</a>
|
|
</h3></div></div></div>
|
|
<a name="boost_regex.syntax.basic_syntax.synopsis"></a><h4>
|
|
<a name="id509034"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.synopsis">Synopsis</a>
|
|
</h4>
|
|
<p>
|
|
The POSIX-Basic regular expression syntax is used by the Unix utility <code class="computeroutput"><span class="identifier">sed</span></code>, and variations are used by <code class="computeroutput"><span class="identifier">grep</span></code> and <code class="computeroutput"><span class="identifier">emacs</span></code>.
|
|
You can construct POSIX basic regular expressions in Boost.Regex by passing
|
|
the flag <code class="computeroutput"><span class="identifier">basic</span></code> to the regex
|
|
constructor (see <a href="../ref/syntax_option_type.html" title="syntax_option_type"><code class="computeroutput"><span class="identifier">syntax_option_type</span></code></a>), for example:
|
|
</p>
|
|
<pre class="programlisting"><span class="comment">// e1 is a case sensitive POSIX-Basic expression:
|
|
</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e1</span><span class="special">(</span><span class="identifier">my_expression</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">basic</span><span class="special">);</span>
|
|
<span class="comment">// e2 is a case insensitive POSIX-Basic expression:
|
|
</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e2</span><span class="special">(</span><span class="identifier">my_expression</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">basic</span><span class="special">|</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">icase</span><span class="special">);</span>
|
|
</pre>
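<p>
As a rough illustration of the two constructions above, here is a minimal sketch. It uses the standard library's std::regex, whose basic and icase flags mirror the Boost.Regex ones, purely so that it has no external dependency; that substitution and the helper names matches_basic and matches_basic_icase are assumptions of this sketch, and Boost.Regex may differ in details.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `text` matches `pattern`,
// interpreted as a case sensitive POSIX-Basic expression.
bool matches_basic(const std::string& pattern, const std::string& text)
{
    const std::regex e(pattern, std::regex::basic);
    return std::regex_match(text, e);
}

// Hypothetical helper: as above, but the match ignores case.
bool matches_basic_icase(const std::string& pattern, const std::string& text)
{
    const std::regex e(pattern, std::regex::basic | std::regex::icase);
    return std::regex_match(text, e);
}
```

<p>
With these helpers, the expression a*b is expected to match "aaab" in both modes, and "AAAB" only in the case insensitive one.
</p>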
|
|
<a name="boost_regex.posix_basic"></a><p>
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.posix_basic_syntax"></a><h4>
|
|
<a name="id509325"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.posix_basic_syntax">POSIX
|
|
Basic Syntax</a>
|
|
</h4>
|
|
<p>
|
|
In POSIX-Basic regular expressions, all characters match themselves except
|
|
for the following special characters:
|
|
</p>
|
|
<pre class="programlisting">.[\*^$</pre>
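<p>
One consequence is that characters which are operators in other syntaxes, such as '+', are ordinary here. The sketch below checks this using std::regex with its basic flag as a stand-in for Boost.Regex (an assumption made to keep the example dependency free); basic_match, plus_is_literal and dot_is_wildcard are hypothetical names.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when all of `text` matches the POSIX-Basic
// expression `pattern`.
bool basic_match(const std::string& pattern, const std::string& text)
{
    return std::regex_match(text, std::regex(pattern, std::regex::basic));
}

// '+' is not in the special set, so "a+b" matches only the literal text "a+b".
bool plus_is_literal()
{
    return basic_match("a+b", "a+b") && !basic_match("a+b", "aab");
}

// '.' is in the special set: it stands for any single character.
bool dot_is_wildcard()
{
    return basic_match("a.b", "axb") && basic_match("a.b", "a+b");
}
```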
|
|
<a name="boost_regex.syntax.basic_syntax.wildcard_"></a><h5>
|
|
<a name="id509364"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.wildcard_">Wildcard:</a>
|
|
</h5>
|
|
<p>
|
|
The single character '.' when used outside of a character set will match
|
|
any single character except:
|
|
</p>
|
|
<div class="itemizedlist"><ul type="disc">
|
|
<li>
|
|
The NUL character when the flag <code class="computeroutput"><span class="identifier">match_not_dot_null</span></code>
|
|
is passed to the matching algorithms.
|
|
</li>
|
|
<li>
|
|
The newline character when the flag <code class="computeroutput"><span class="identifier">match_not_dot_newline</span></code>
|
|
is passed to the matching algorithms.
|
|
</li>
|
|
</ul></div>
|
|
<a name="boost_regex.syntax.basic_syntax.anchors_"></a><h5>
|
|
<a name="id509433"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.anchors_">Anchors:</a>
|
|
</h5>
|
|
<p>
|
|
A '^' character shall match the start of a line when used as the first character
|
|
of an expression, or the first character of a sub-expression.
|
|
</p>
|
|
<p>
|
|
A '$' character shall match the end of a line when used as the last character
|
|
of an expression, or the last character of a sub-expression.
|
|
</p>
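<p>
The following sketch shows the two anchors in a search rather than a whole-string match. It relies on std::regex with the basic flag as a stand-in for Boost.Regex, which is an assumption of the example, as is the helper name found.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the POSIX-Basic expression `pattern`
// matches somewhere inside `text`.
bool found(const std::string& pattern, const std::string& text)
{
    return std::regex_search(text, std::regex(pattern, std::regex::basic));
}
```

<p>
Here found("^abc", "abcdef") succeeds while found("^abc", "xabc") does not; likewise found("abc$", "xabc") succeeds while found("abc$", "abcx") does not.
</p>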
|
|
<a name="boost_regex.syntax.basic_syntax.marked_sub_expressions_"></a><h5>
|
|
<a name="id509469"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.marked_sub_expressions_">Marked
|
|
sub-expressions:</a>
|
|
</h5>
|
|
<p>
|
|
A section beginning <code class="computeroutput"><span class="special">\(</span></code> and ending
|
|
<code class="computeroutput"><span class="special">\)</span></code> acts as a marked sub-expression.
|
|
Whatever matched the sub-expression is split out in a separate field by the
|
|
matching algorithms. Marked sub-expressions can also be repeated, or referred to
|
|
by a back-reference.
|
|
</p>
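<p>
To make "split out in a separate field" concrete, the sketch below collects the text matched by each marked sub-expression. It is written against std::regex with the basic flag rather than Boost.Regex (an assumption of this example), and captures is a hypothetical helper.
</p>

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text captured by each marked
// sub-expression of `pattern` (element 0 holds sub-expression 1), or an
// empty vector when `text` does not match as a whole.
std::vector<std::string> captures(const std::string& pattern,
                                  const std::string& text)
{
    const std::regex e(pattern, std::regex::basic);
    std::smatch what;
    std::vector<std::string> fields;
    if (std::regex_match(text, what, e))
    {
        // what[0] is the whole match; the sub-expressions start at index 1.
        for (std::size_t i = 1; i < what.size(); ++i)
            fields.push_back(what[i].str());
    }
    return fields;
}
```

<p>
For the expression \([a-z]*\)=\([0-9]*\) and the text "width=42", the two fields are "width" and "42".
</p>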
|
|
<a name="boost_regex.syntax.basic_syntax.repeats_"></a><h5>
|
|
<a name="id509526"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.repeats_">Repeats:</a>
|
|
</h5>
|
|
<p>
|
|
Any atom (a single character, a marked sub-expression, or a character class)
|
|
can be repeated with the * operator.
|
|
</p>
|
|
<p>
|
|
For example <code class="computeroutput"><span class="identifier">a</span><span class="special">*</span></code>
|
|
will match the letter a repeated zero or more times (an atom
|
|
repeated zero times matches an empty string), so the expression <code class="computeroutput"><span class="identifier">a</span><span class="special">*</span><span class="identifier">b</span></code>
|
|
will match any of the following:
|
|
</p>
|
|
<pre class="programlisting">b
|
|
ab
|
|
aaaaaaaab
|
|
</pre>
|
|
<p>
|
|
An atom can also be repeated with a bounded repeat:
|
|
</p>
|
|
<p>
|
|
<code class="computeroutput"><span class="identifier">a</span><span class="special">\{</span><span class="identifier">n</span><span class="special">\}</span></code> Matches
|
|
'a' repeated exactly n times.
|
|
</p>
|
|
<p>
|
|
<code class="computeroutput"><span class="identifier">a</span><span class="special">\{</span><span class="identifier">n</span><span class="special">,\}</span></code> Matches
|
|
'a' repeated n or more times.
|
|
</p>
|
|
<p>
|
|
<code class="computeroutput"><span class="identifier">a</span><span class="special">\{</span><span class="identifier">n</span><span class="special">,</span> <span class="identifier">m</span><span class="special">\}</span></code> Matches 'a' repeated between n and m times
|
|
inclusive.
|
|
</p>
|
|
<p>
|
|
For example:
|
|
</p>
|
|
<pre class="programlisting">^a\{2,3\}$</pre>
|
|
<p>
|
|
Will match either of:
|
|
</p>
|
|
<pre class="programlisting">aa
|
|
aaa
|
|
</pre>
|
|
<p>
|
|
But neither of:
|
|
</p>
|
|
<pre class="programlisting">a
|
|
aaaa
|
|
</pre>
|
|
<p>
|
|
It is an error to use a repeat operator if the preceding construct cannot
|
|
be repeated, for example:
|
|
</p>
|
|
<pre class="programlisting">a\(*\)</pre>
|
|
<p>
|
|
Will raise an error, as there is nothing for the * operator to be applied
|
|
to.
|
|
</p>
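<p>
The repeat forms in this section can be checked with a short sketch. It assumes std::regex with the basic flag as a dependency-free stand-in for Boost.Regex; basic_match is a hypothetical helper. Note that the braces of a bounded repeat must be escaped in this syntax, otherwise they are ordinary characters.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when all of `text` matches the POSIX-Basic
// expression `pattern`.
bool basic_match(const std::string& pattern, const std::string& text)
{
    return std::regex_match(text, std::regex(pattern, std::regex::basic));
}

// a\{2,3\} accepts two or three a's and nothing else.
bool bounded_repeat_ok()
{
    const std::string e = R"(^a\{2,3\}$)";
    return basic_match(e, "aa") && basic_match(e, "aaa")
        && !basic_match(e, "a") && !basic_match(e, "aaaa");
}

// Unescaped braces are literal characters in POSIX-Basic syntax.
bool unescaped_braces_are_literal()
{
    return basic_match("a{2}", "a{2}") && !basic_match("a{2}", "aa");
}
```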
|
|
<a name="boost_regex.syntax.basic_syntax.back_references_"></a><h5>
|
|
<a name="id509770"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.back_references_">Back references:</a>
|
|
</h5>
|
|
<p>
|
|
An escape character followed by a digit <span class="emphasis"><em>n</em></span>, where <span class="emphasis"><em>n</em></span>
|
|
is in the range 1-9, matches the same string that was matched by sub-expression
|
|
<span class="emphasis"><em>n</em></span>. For example the expression:
|
|
</p>
|
|
<pre class="programlisting">^\(a*\)b*\1$</pre>
|
|
<p>
|
|
Will match the string:
|
|
</p>
|
|
<pre class="programlisting">aaabbaaa</pre>
|
|
<p>
|
|
But not the string:
|
|
</p>
|
|
<pre class="programlisting">aaabba</pre>
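<p>
The sketch below exercises a back-reference with the expression ^\(a*\)b*\1$, in which only b's may separate the two runs of a's, so the second run has to repeat the first exactly. It uses std::regex with the basic flag in place of Boost.Regex, which is an assumption of the example; backref_match is a hypothetical name.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `text` consists of a run of a's, then any
// number of b's, then the same run of a's again (checked via \1).
bool backref_match(const std::string& text)
{
    static const std::regex e(R"(^\(a*\)b*\1$)", std::regex::basic);
    return std::regex_match(text, e);
}
```

<p>
With this helper, "aaabbaaa" is accepted and "aaabba" is rejected, because no choice of the first run of a's lets \1 consume the rest of the string.
</p>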
|
|
<a name="boost_regex.syntax.basic_syntax.character_sets_"></a><h5>
|
|
<a name="id509844"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.character_sets_">Character
|
|
sets:</a>
|
|
</h5>
|
|
<p>
|
|
A character set is a bracket-expression starting with [ and ending with ].
It defines a set of characters, and matches any single character that is
a member of that set.
|
|
</p>
|
|
<p>
|
|
A bracket expression may contain any combination of the following:
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.single_characters_"></a><h6>
|
|
<a name="id509880"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.single_characters_">Single
|
|
characters:</a>
|
|
</h6>
|
|
<p>
|
|
For example <code class="computeroutput"><span class="special">[</span><span class="identifier">abc</span><span class="special">]</span></code> will match any of the characters 'a', 'b',
|
|
or 'c'.
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.character_ranges_"></a><h6>
|
|
<a name="id509930"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.character_ranges_">Character
|
|
ranges:</a>
|
|
</h6>
|
|
<p>
|
|
For example <code class="computeroutput"><span class="special">[</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">]</span></code>
|
|
will match any single character in the range 'a' to 'c'. By default, for
|
|
POSIX-Basic regular expressions, a character <span class="emphasis"><em>x</em></span> is within
|
|
the range <span class="emphasis"><em>y</em></span> to <span class="emphasis"><em>z</em></span>, if it collates
|
|
within that range; this results in locale specific behavior. This behavior
|
|
can be turned off by unsetting the <code class="computeroutput"><span class="identifier">collate</span></code>
|
|
option flag when constructing the regular expression - in which case whether
|
|
a character appears within a range is determined by comparing the code points
|
|
of the characters only.
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.negation_"></a><h6>
|
|
<a name="id510022"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.negation_">Negation:</a>
|
|
</h6>
|
|
<p>
|
|
If the bracket-expression begins with the ^ character, then it matches the
|
|
complement of the characters it contains, for example <code class="computeroutput"><span class="special">[^</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">]</span></code> matches any character that is not in the
|
|
range a-c.
|
|
</p>
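<p>
Single characters, ranges and negation can be tried out with the sketch below. It assumes std::regex with the basic flag (and without the collate flag, so ranges compare code points) as a stand-in for Boost.Regex; basic_match is a hypothetical helper.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when all of `text` matches the POSIX-Basic
// expression `pattern`.
bool basic_match(const std::string& pattern, const std::string& text)
{
    return std::regex_match(text, std::regex(pattern, std::regex::basic));
}

// [abc] and [a-c] accept the same three letters.
bool set_and_range_agree(const std::string& one_char)
{
    return basic_match("[abc]", one_char) == basic_match("[a-c]", one_char);
}

// [^a-c] accepts exactly the single characters that [a-c] rejects.
bool negation_is_complement(const std::string& one_char)
{
    return basic_match("[^a-c]", one_char) != basic_match("[a-c]", one_char);
}
```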
|
|
<a name="boost_regex.syntax.basic_syntax.character_classes_"></a><h6>
|
|
<a name="id510083"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.character_classes_">Character
|
|
classes:</a>
|
|
</h6>
|
|
<p>
|
|
An expression of the form <code class="computeroutput"><span class="special">[[:</span><span class="identifier">name</span><span class="special">:]]</span></code>
|
|
matches the named character class "name", for example <code class="computeroutput"><span class="special">[[:</span><span class="identifier">lower</span><span class="special">:]]</span></code> matches any lower case character. See
|
|
<a href="character_classes.html" title="Character Class Names">character class names</a>.
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.collating_elements_"></a><h6>
|
|
<a name="id510166"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.collating_elements_">Collating
|
|
Elements:</a>
|
|
</h6>
|
|
<p>
|
|
An expression of the form <code class="computeroutput"><span class="special">[[.</span><span class="identifier">col</span><span class="special">.]]</span></code> matches
|
|
the collating element <span class="emphasis"><em>col</em></span>. A collating element is any
|
|
single character, or any sequence of characters that collates as a single
|
|
unit. Collating elements may also be used as the end point of a range, for
|
|
example: <code class="computeroutput"><span class="special">[[.</span><span class="identifier">ae</span><span class="special">.]-</span><span class="identifier">c</span><span class="special">]</span></code>
|
|
matches the character sequence "ae", plus any single character
|
|
in the range "ae"-c, assuming that "ae" is treated as
|
|
a single collating element in the current locale.
|
|
</p>
|
|
<p>
|
|
Collating elements may be used in place of escapes (which are not normally
|
|
allowed inside character sets), for example <code class="computeroutput"><span class="special">[[.^.]</span><span class="identifier">abc</span><span class="special">]</span></code> would
|
|
match any one of the characters 'abc^'.
|
|
</p>
|
|
<p>
|
|
As an extension, a collating element may also be specified via its symbolic
|
|
name, for example:
|
|
</p>
|
|
<pre class="programlisting">[[.NUL.]]</pre>
|
|
<p>
|
|
matches a 'NUL' character. See <a href="collating_names.html" title="Collating Names">collating
|
|
element names</a>.
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.equivalence_classes_"></a><h6>
|
|
<a name="id510315"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.equivalence_classes_">Equivalence
|
|
classes:</a>
|
|
</h6>
|
|
<p>
|
|
An expression of the form <code class="computeroutput"><span class="special">[[=</span><span class="identifier">col</span><span class="special">=]]</span></code>
|
|
matches any character or collating element whose primary sort key is the
|
|
same as that for collating element <span class="emphasis"><em>col</em></span>, as with collating
|
|
elements the name <span class="emphasis"><em>col</em></span> may be a <a href="collating_names.html" title="Collating Names">collating
|
|
symbolic name</a>. A primary sort key is one that ignores case, accentuation,
|
|
or locale-specific tailorings; so for example <code class="computeroutput"><span class="special">[[=</span><span class="identifier">a</span><span class="special">=]]</span></code> matches
|
|
any of the characters: a, À, Á, Â, Ã, Ä, Å, A, à, á, â, ã, ä and å. Unfortunately implementation
|
|
of this is reliant on the platform's collation and localisation support;
|
|
this feature can not be relied upon to work portably across all platforms,
|
|
or even all locales on one platform.
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.combinations_"></a><h6>
|
|
<a name="id510419"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.combinations_">Combinations:</a>
|
|
</h6>
|
|
<p>
|
|
All of the above can be combined in one character set declaration, for example:
|
|
<code class="computeroutput"><span class="special">[[:</span><span class="identifier">digit</span><span class="special">:]</span><span class="identifier">a</span><span class="special">-</span><span class="identifier">c</span><span class="special">[.</span><span class="identifier">NUL</span><span class="special">.]]</span></code>.
|
|
</p>
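<p>
A combined set behaves like the union of its parts. The sketch below checks a character class together with a range, leaving out the collating element for simplicity. It assumes std::regex with the basic flag as a stand-in for Boost.Regex, and in_combined_set is a hypothetical helper.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `text` is made up only of decimal digits
// and the letters a to c, using one set that combines a class and a range.
bool in_combined_set(const std::string& text)
{
    static const std::regex e("[[:digit:]a-c]*", std::regex::basic);
    return std::regex_match(text, e);
}
```

<p>
For example, in_combined_set("7") and in_combined_set("a1b2c3") hold, while in_combined_set("d") does not.
</p>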
|
|
<a name="boost_regex.syntax.basic_syntax.escapes"></a><h5>
|
|
<a name="id510497"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.escapes">Escapes</a>
|
|
</h5>
|
|
<p>
|
|
With the exception of the escape sequences \{, \}, \(, and \), which are
|
|
documented above, an escape followed by any character matches that character.
|
|
This can be used to make the special characters
|
|
</p>
|
|
<pre class="programlisting">.[\*^$</pre>
|
|
<p>
|
|
"ordinary". Note that the escape character loses its special meaning
|
|
inside a character set, so <code class="computeroutput"><span class="special">[\^]</span></code>
|
|
will match either a literal '\' or a '^'.
|
|
</p>
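<p>
Both halves of this rule can be seen in a short sketch: an escaped special character matches literally outside a set, while a backslash inside a set is just another member. It assumes std::regex with the basic flag as a stand-in for Boost.Regex; basic_match is a hypothetical helper.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when all of `text` matches the POSIX-Basic
// expression `pattern`.
bool basic_match(const std::string& pattern, const std::string& text)
{
    return std::regex_match(text, std::regex(pattern, std::regex::basic));
}

// Outside a set, \. is a literal dot rather than a wildcard.
bool escaped_dot_is_literal()
{
    return basic_match(R"(a\.b)", "a.b") && !basic_match(R"(a\.b)", "axb");
}

// Inside a set the backslash is ordinary, so [\^] holds '\' and '^'.
bool backslash_is_ordinary_in_set()
{
    return basic_match(R"([\^])", "\\") && basic_match(R"([\^])", "^")
        && !basic_match(R"([\^])", "a");
}
```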
|
|
<a name="boost_regex.syntax.basic_syntax.what_gets_matched"></a><h4>
|
|
<a name="id510554"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.what_gets_matched">What Gets
|
|
Matched</a>
|
|
</h4>
|
|
<p>
|
|
When there is more than one way to match a regular expression, the "best"
|
|
possible match is obtained using the <a href="leftmost_longest_rule.html" title="The Leftmost Longest Rule">leftmost-longest
|
|
rule</a>.
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.variations"></a><h4>
|
|
<a name="id510594"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.variations">Variations</a>
|
|
</h4>
|
|
<a name="boost_regex.grep_syntax"></a><p>
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.grep"></a><h5>
|
|
<a name="id510626"></a>
|
|
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.grep">Grep</a>
|
|
</h5>
|
|
<p>
|
|
When an expression is compiled with the flag <code class="computeroutput"><span class="identifier">grep</span></code>
|
|
set, then the expression is treated as a newline separated list of <a href="basic_syntax.html#boost_regex.posix_basic">POSIX-Basic expressions</a>, a match
|
|
is found if any of the expressions in the list match, for example:
|
|
</p>
|
|
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e</span><span class="special">(</span><span class="string">"abc\ndef"</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span><span class="special">::</span><span class="identifier">grep</span><span class="special">);</span>
|
|
</pre>
|
|
<p>
|
|
will match either of the <a href="basic_syntax.html#boost_regex.posix_basic">POSIX-Basic
|
|
expressions</a> "abc" or "def".
|
|
</p>
|
|
<p>
|
|
As its name suggests, this behavior is consistent with the Unix utility grep.
|
|
</p>
|
|
<a name="boost_regex.syntax.basic_syntax.emacs"></a><h5>
<a name="id510770"></a>
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.emacs">emacs</a>
</h5>
<p>
In addition to the <a href="basic_syntax.html#boost_regex.posix_basic">POSIX-Basic features</a>
the following characters are also special:
</p>
<div class="informaltable"><table class="table">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th><p>Character</p></th>
<th><p>Description</p></th>
</tr></thead>
<tbody>
<tr>
<td><p>+</p></td>
<td><p>repeats the preceding atom one or more times.</p></td>
</tr>
<tr>
<td><p>?</p></td>
<td><p>repeats the preceding atom zero or one times.</p></td>
</tr>
<tr>
<td><p>*?</p></td>
<td><p>A non-greedy version of *.</p></td>
</tr>
<tr>
<td><p>+?</p></td>
<td><p>A non-greedy version of +.</p></td>
</tr>
<tr>
<td><p>??</p></td>
<td><p>A non-greedy version of ?.</p></td>
</tr>
</tbody>
</table></div>
<p>
And the following escape sequences are also recognised:
</p>
<div class="informaltable"><table class="table">
<colgroup>
<col>
<col>
</colgroup>
<thead><tr>
<th><p>Escape</p></th>
<th><p>Description</p></th>
</tr></thead>
<tbody>
<tr>
<td><p>\|</p></td>
<td><p>specifies an alternative.</p></td>
</tr>
<tr>
<td><p>\(?: ... \)</p></td>
<td><p>
is a non-marking grouping construct: it lets you group part of an
expression lexically without creating an extra marked sub-expression.
</p></td>
</tr>
<tr>
<td><p>\w</p></td>
<td><p>matches any word character.</p></td>
</tr>
<tr>
<td><p>\W</p></td>
<td><p>matches any non-word character.</p></td>
</tr>
<tr>
<td><p>\sx</p></td>
<td><p>
matches any character in the syntax group x. The following emacs
groupings are supported: 's', ' ', '_', 'w', '.', ')', '(', '"',
'\'', '&gt;' and '&lt;'. Refer to the emacs docs for details.
</p></td>
</tr>
<tr>
<td><p>\Sx</p></td>
<td><p>matches any character not in the syntax grouping x.</p></td>
</tr>
<tr>
<td><p>\c and \C</p></td>
<td><p>These are not supported.</p></td>
</tr>
<tr>
<td><p>\`</p></td>
<td><p>
matches zero characters only at the start of a buffer (or string
being matched).
</p></td>
</tr>
<tr>
<td><p>\'</p></td>
<td><p>
matches zero characters only at the end of a buffer (or string being
matched).
</p></td>
</tr>
<tr>
<td><p>\b</p></td>
<td><p>matches zero characters at a word boundary.</p></td>
</tr>
<tr>
<td><p>\B</p></td>
<td><p>matches zero characters only when not at a word boundary.</p></td>
</tr>
<tr>
<td><p>\&lt;</p></td>
<td><p>matches zero characters only at the start of a word.</p></td>
</tr>
<tr>
<td><p>\&gt;</p></td>
<td><p>matches zero characters only at the end of a word.</p></td>
</tr>
</tbody>
</table></div>
<p>
Finally, note that emacs-style regular expressions are matched
according to the <a href="perl_syntax.html#boost_regex.syntax.perl_syntax.what_gets_matched">Perl
"depth first search" rules</a>. Emacs expressions are matched
this way because they contain Perl-like extensions that do not interact
well with the <a href="leftmost_longest_rule.html" title="The Leftmost Longest Rule">POSIX-style
leftmost-longest rule</a>.
</p>
<a name="boost_regex.syntax.basic_syntax.options"></a><h4>
<a name="id511266"></a>
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.options">Options</a>
</h4>
<p>
There are a <a href="../ref/syntax_option_type/syntax_option_type_basic.html" title="Options for POSIX Basic Regular Expressions">variety
of flags</a> that may be combined with the <code class="computeroutput"><span class="identifier">basic</span></code>
and <code class="computeroutput"><span class="identifier">grep</span></code> options when constructing
the regular expression. In particular, note that the <a href="../ref/syntax_option_type/syntax_option_type_basic.html" title="Options for POSIX Basic Regular Expressions"><code class="computeroutput"><span class="identifier">newline_alt</span></code>, <code class="computeroutput"><span class="identifier">no_char_classes</span></code>,
<code class="computeroutput"><span class="identifier">no_intervals</span></code>, <code class="computeroutput"><span class="identifier">bk_plus_qm</span></code>
and <code class="computeroutput"><span class="identifier">bk_vbar</span></code></a> options
all alter the syntax, while the <a href="../ref/syntax_option_type/syntax_option_type_basic.html" title="Options for POSIX Basic Regular Expressions"><code class="computeroutput"><span class="identifier">collate</span></code> and <code class="computeroutput"><span class="identifier">icase</span></code>
options</a> modify how case and locale sensitivity are applied.
</p>
<a name="boost_regex.syntax.basic_syntax.references"></a><h4>
<a name="id511438"></a>
<a href="basic_syntax.html#boost_regex.syntax.basic_syntax.references">References</a>
</h4>
<p>
<a href="http://www.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap09.html" target="_top">IEEE
Std 1003.1-2001, Portable Operating System Interface (POSIX), Base Definitions
and Headers, Section 9, Regular Expressions (FWD.1).</a>
</p>
<p>
<a href="http://www.opengroup.org/onlinepubs/000095399/utilities/grep.html" target="_top">IEEE
Std 1003.1-2001, Portable Operating System Interface (POSIX), Shells and
Utilities, Section 4, Utilities, grep (FWD.1).</a>
</p>
<p>
<a href="http://www.gnu.org/software/emacs/" target="_top">Emacs Version 21.3.</a>
</p>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 1998-2007 John Maddock<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="basic_extended.html"><img src="../../../../../../doc/html/images/prev.png" alt="Prev"></a><a accesskey="u" href="../syntax.html"><img src="../../../../../../doc/html/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/html/images/home.png" alt="Home"></a><a accesskey="n" href="character_classes.html"><img src="../../../../../../doc/html/images/next.png" alt="Next"></a>
</div>
</body>
</html>