From f90d8c667e57d94e745cce57d5ce0c711ddbb5f0 Mon Sep 17 00:00:00 2001
From: John Maddock
Date: Fri, 5 Mar 2004 11:32:34 +0000
Subject: [PATCH] Fixed typo

[SVN r22438]
---
 doc/Attic/faq.html | 229 +++++++++++++++++++--------------------------
 doc/faq.html       | 229 +++++++++++++++++++--------------------------
 2 files changed, 190 insertions(+), 268 deletions(-)

diff --git a/doc/Attic/faq.html b/doc/Attic/faq.html
index c31e42da..99b3cd8a 100644
--- a/doc/Attic/faq.html
+++ b/doc/Attic/faq.html
@@ -1,153 +1,114 @@
- - -Boost.Regex: FAQ - - - - -

- - - - - - - -
-

-"C++

-
-

Boost.Regex

- -

FAQ

-
-

-"Boost.Regex

-
- -
-
- - -
- -

 Q. Why can't I use the "convenience" versions of -regex_match / regex_search / regex_grep / regex_format / -regex_merge?

- -

A. These versions may or may not be available depending upon the -capabilities of your compiler, the rules determining the format of -these functions are quite complex - and only the versions visible -to a standard compliant compiler are given in the help. To find out -what your compiler supports, run <boost/regex.hpp> through -your C++ pre-processor, and search the output file for the function -that you are interested in.

- -

Q. I can't get -regex++ to work with escape characters, what's going -on?

- -

A. If you embed regular expressions in C++ code, then remember -that escape characters are processed twice: once by the C++ -compiler, and once by the regex++ expression compiler, so to pass -the regular expression \d+ to regex++, you need to embed "\\d+" in -your code. Likewise to match a literal backslash you will need to -embed "\\\\" in your code.

- -

Q. Why does using parenthesis in a POSIX -regular expression change the result of a match?

- -

For POSIX (extended and basic) regular expressions, but not for -perl regexes, parentheses don't only mark; they determine what the -best match is as well. When the expression is compiled as a POSIX -basic or extended regex then Boost.regex follows the POSIX standard -leftmost longest rule for determining what matched. So if there is -more than one possible match after considering the whole -expression, it looks next at the first sub-expression and then the -second sub-expression and so on. So...

- -
+   
+      Boost.Regex: FAQ
+      
+      
+      
+   
+   
+      

+ + + + + + +
+

C++ Boost

+
+

Boost.Regex

+

FAQ

+
+

Boost.Regex Index

+
+
+
+
+ +

 Q. Why can't I + use the "convenience" versions of regex_match / regex_search / regex_grep / + regex_format / regex_merge?

+

A. These versions may or may not be available depending upon the capabilities + of your compiler, the rules determining the format of these functions are quite + complex - and only the versions visible to a standard compliant compiler are + given in the help. To find out what your compiler supports, run + <boost/regex.hpp> through your C++ pre-processor, and search the output + file for the function that you are interested in.

+

Q. I can't get regex++ to work with + escape characters, what's going on?

+

A. If you embed regular expressions in C++ code, then remember that escape + characters are processed twice: once by the C++ compiler, and once by the + regex++ expression compiler, so to pass the regular expression \d+ to regex++, + you need to embed "\\d+" in your code. Likewise to match a literal backslash + you will need to embed "\\\\" in your code. +

+

Q. Why does using parenthesis in a POSIX regular expression + change the result of a match?

+

For POSIX (extended and basic) regular expressions, but not for perl regexes, + parentheses don't only mark; they determine what the best match is as well. + When the expression is compiled as a POSIX basic or extended regex then + Boost.regex follows the POSIX standard leftmost longest rule for determining + what matched. So if there is more than one possible match after considering the + whole expression, it looks next at the first sub-expression and then the second + sub-expression and so on. So...

+
 "(0*)([0-9]*)" against "00123" would produce
 $1 = "00"
 $2 = "123"
 
- -

where as

- -
-"0*([0-9)*" against "00123" would produce
+      

where as

+
+"0*([0-9])*" against "00123" would produce
 $1 = "00123"
 
- -

If you think about it, had $1 only matched the "123", this would -be "less good" than the match "00123" which is both further to the -left and longer. If you want $1 to match only the "123" part, then -you need to use something like:

- -
+      

If you think about it, had $1 only matched the "123", this would be "less good" + than the match "00123" which is both further to the left and longer. If you + want $1 to match only the "123" part, then you need to use something like:

+
 "0*([1-9][0-9]*)"
 
- -

as the expression.

- -

Q. Why don't character ranges work -properly (POSIX mode only)?
- A. The POSIX standard specifies that character range expressions -are locale sensitive - so for example the expression [A-Z] will -match any collating element that collates between 'A' and 'Z'. That -means that for most locales other than "C" or "POSIX", [A-Z] would -match the single character 't' for example, which is not what most -people expect - or at least not what most people have come to -expect from regular expression engines. For this reason, the -default behaviour of boost.regex (perl mode) is to turn locale -sensitive collation off by not setting the regex_constants::collate -compile time flag. However if you set a non-default compile time -flag - for example regex_constants::extended or -regex_constants::basic, then locale dependent collation will be -enabled, this also applies to the POSIX API functions which use -either regex_constants::extended or regex_constants::basic -internally. [Note - when regex_constants::nocollate in effect, -the library behaves "as if" the LC_COLLATE locale category were -always "C", regardless of what its actually set to - end -note].

- -

Q. Why are there no throw specifications -on any of the functions? What exceptions can the library -throw?

- -

A. Not all compilers support (or honor) throw specifications, -others support them but with reduced efficiency. Throw -specifications may be added at a later date as compilers begin to -handle this better. The library should throw only three types of -exception: boost::bad_expression can be thrown by basic_regex when -compiling a regular expression, std::runtime_error can be thrown -when a call to basic_regex::imbue tries to open a message catalogue -that doesn't exist, or when a call to regex_search or regex_match -results in an "everlasting" search, or when a call to -RegEx::GrepFiles or RegEx::FindFiles tries to open a file that -cannot be opened, finally std::bad_alloc can be thrown by just -about any of the functions in this library.

- -

- -
+

as the expression.

+

Q. Why don't character ranges work properly (POSIX mode + only)?
+ A. The POSIX standard specifies that character range expressions are locale + sensitive - so for example the expression [A-Z] will match any collating + element that collates between 'A' and 'Z'. That means that for most locales + other than "C" or "POSIX", [A-Z] would match the single character 't' for + example, which is not what most people expect - or at least not what most + people have come to expect from regular expression engines. For this reason, + the default behaviour of boost.regex (perl mode) is to turn locale sensitive + collation off by not setting the regex_constants::collate compile time flag. + However if you set a non-default compile time flag - for example + regex_constants::extended or regex_constants::basic, then locale dependent + collation will be enabled, this also applies to the POSIX API functions which + use either regex_constants::extended or regex_constants::basic internally. [Note + - when regex_constants::nocollate in effect, the library behaves "as if" the + LC_COLLATE locale category were always "C", regardless of what its actually set + to - end note].

+

Q. Why are there no throw specifications on any of the + functions? What exceptions can the library throw?

+

A. Not all compilers support (or honor) throw specifications, others support + them but with reduced efficiency. Throw specifications may be added at a later + date as compilers begin to handle this better. The library should throw only + three types of exception: boost::bad_expression can be thrown by basic_regex + when compiling a regular expression, std::runtime_error can be thrown when a + call to basic_regex::imbue tries to open a message catalogue that doesn't + exist, or when a call to regex_search or regex_match results in an + "everlasting" search, or when a call to RegEx::GrepFiles or + RegEx::FindFiles tries to open a file that cannot be opened, finally + std::bad_alloc can be thrown by just about any of the functions in this + library.

+

+

Revised 24 Oct 2003

© Copyright John Maddock 1998- - - 2003

+ 2003

Use, modification and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

- + - -
diff --git a/doc/faq.html b/doc/faq.html
index c31e42da..99b3cd8a 100644
--- a/doc/faq.html
+++ b/doc/faq.html
@@ -1,153 +1,114 @@
- - -Boost.Regex: FAQ - - - - -

- - - - - - - -
-

-"C++

-
-

Boost.Regex

- -

FAQ

-
-

-"Boost.Regex

-
- -
-
- - -
- -

 Q. Why can't I use the "convenience" versions of -regex_match / regex_search / regex_grep / regex_format / -regex_merge?

- -

A. These versions may or may not be available depending upon the -capabilities of your compiler, the rules determining the format of -these functions are quite complex - and only the versions visible -to a standard compliant compiler are given in the help. To find out -what your compiler supports, run <boost/regex.hpp> through -your C++ pre-processor, and search the output file for the function -that you are interested in.

- -

Q. I can't get -regex++ to work with escape characters, what's going -on?

- -

A. If you embed regular expressions in C++ code, then remember -that escape characters are processed twice: once by the C++ -compiler, and once by the regex++ expression compiler, so to pass -the regular expression \d+ to regex++, you need to embed "\\d+" in -your code. Likewise to match a literal backslash you will need to -embed "\\\\" in your code.

- -

Q. Why does using parenthesis in a POSIX -regular expression change the result of a match?

- -

For POSIX (extended and basic) regular expressions, but not for -perl regexes, parentheses don't only mark; they determine what the -best match is as well. When the expression is compiled as a POSIX -basic or extended regex then Boost.regex follows the POSIX standard -leftmost longest rule for determining what matched. So if there is -more than one possible match after considering the whole -expression, it looks next at the first sub-expression and then the -second sub-expression and so on. So...

- -
+   
+      Boost.Regex: FAQ
+      
+      
+      
+   
+   
+      

+ + + + + + +
+

C++ Boost

+
+

Boost.Regex

+

FAQ

+
+

Boost.Regex Index

+
+
+
+
+ +

 Q. Why can't I + use the "convenience" versions of regex_match / regex_search / regex_grep / + regex_format / regex_merge?

+

A. These versions may or may not be available depending upon the capabilities + of your compiler, the rules determining the format of these functions are quite + complex - and only the versions visible to a standard compliant compiler are + given in the help. To find out what your compiler supports, run + <boost/regex.hpp> through your C++ pre-processor, and search the output + file for the function that you are interested in.

+

Q. I can't get regex++ to work with + escape characters, what's going on?

+

A. If you embed regular expressions in C++ code, then remember that escape + characters are processed twice: once by the C++ compiler, and once by the + regex++ expression compiler, so to pass the regular expression \d+ to regex++, + you need to embed "\\d+" in your code. Likewise to match a literal backslash + you will need to embed "\\\\" in your code. +

+

Q. Why does using parenthesis in a POSIX regular expression + change the result of a match?

+

For POSIX (extended and basic) regular expressions, but not for perl regexes, + parentheses don't only mark; they determine what the best match is as well. + When the expression is compiled as a POSIX basic or extended regex then + Boost.regex follows the POSIX standard leftmost longest rule for determining + what matched. So if there is more than one possible match after considering the + whole expression, it looks next at the first sub-expression and then the second + sub-expression and so on. So...

+
 "(0*)([0-9]*)" against "00123" would produce
 $1 = "00"
 $2 = "123"
 
- -

where as

- -
-"0*([0-9)*" against "00123" would produce
+      

where as

+
+"0*([0-9])*" against "00123" would produce
 $1 = "00123"
 
- -

If you think about it, had $1 only matched the "123", this would -be "less good" than the match "00123" which is both further to the -left and longer. If you want $1 to match only the "123" part, then -you need to use something like:

- -
+      

If you think about it, had $1 only matched the "123", this would be "less good" + than the match "00123" which is both further to the left and longer. If you + want $1 to match only the "123" part, then you need to use something like:

+
 "0*([1-9][0-9]*)"
 
- -

as the expression.

- -

Q. Why don't character ranges work -properly (POSIX mode only)?
- A. The POSIX standard specifies that character range expressions -are locale sensitive - so for example the expression [A-Z] will -match any collating element that collates between 'A' and 'Z'. That -means that for most locales other than "C" or "POSIX", [A-Z] would -match the single character 't' for example, which is not what most -people expect - or at least not what most people have come to -expect from regular expression engines. For this reason, the -default behaviour of boost.regex (perl mode) is to turn locale -sensitive collation off by not setting the regex_constants::collate -compile time flag. However if you set a non-default compile time -flag - for example regex_constants::extended or -regex_constants::basic, then locale dependent collation will be -enabled, this also applies to the POSIX API functions which use -either regex_constants::extended or regex_constants::basic -internally. [Note - when regex_constants::nocollate in effect, -the library behaves "as if" the LC_COLLATE locale category were -always "C", regardless of what its actually set to - end -note].

- -

Q. Why are there no throw specifications -on any of the functions? What exceptions can the library -throw?

- -

A. Not all compilers support (or honor) throw specifications, -others support them but with reduced efficiency. Throw -specifications may be added at a later date as compilers begin to -handle this better. The library should throw only three types of -exception: boost::bad_expression can be thrown by basic_regex when -compiling a regular expression, std::runtime_error can be thrown -when a call to basic_regex::imbue tries to open a message catalogue -that doesn't exist, or when a call to regex_search or regex_match -results in an "everlasting" search, or when a call to -RegEx::GrepFiles or RegEx::FindFiles tries to open a file that -cannot be opened, finally std::bad_alloc can be thrown by just -about any of the functions in this library.

- -

- -
+

as the expression.

+

Q. Why don't character ranges work properly (POSIX mode + only)?
+ A. The POSIX standard specifies that character range expressions are locale + sensitive - so for example the expression [A-Z] will match any collating + element that collates between 'A' and 'Z'. That means that for most locales + other than "C" or "POSIX", [A-Z] would match the single character 't' for + example, which is not what most people expect - or at least not what most + people have come to expect from regular expression engines. For this reason, + the default behaviour of boost.regex (perl mode) is to turn locale sensitive + collation off by not setting the regex_constants::collate compile time flag. + However if you set a non-default compile time flag - for example + regex_constants::extended or regex_constants::basic, then locale dependent + collation will be enabled, this also applies to the POSIX API functions which + use either regex_constants::extended or regex_constants::basic internally. [Note + - when regex_constants::nocollate in effect, the library behaves "as if" the + LC_COLLATE locale category were always "C", regardless of what its actually set + to - end note].

+

Q. Why are there no throw specifications on any of the + functions? What exceptions can the library throw?

+

A. Not all compilers support (or honor) throw specifications, others support + them but with reduced efficiency. Throw specifications may be added at a later + date as compilers begin to handle this better. The library should throw only + three types of exception: boost::bad_expression can be thrown by basic_regex + when compiling a regular expression, std::runtime_error can be thrown when a + call to basic_regex::imbue tries to open a message catalogue that doesn't + exist, or when a call to regex_search or regex_match results in an + "everlasting" search, or when a call to RegEx::GrepFiles or + RegEx::FindFiles tries to open a file that cannot be opened, finally + std::bad_alloc can be thrown by just about any of the functions in this + library.

+

+

Revised 24 Oct 2003

© Copyright John Maddock 1998- - - 2003

+ 2003

Use, modification and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

- + - -