diff --git a/appendix.htm b/appendix.htm index 9acac550..a525dbde 100644 --- a/appendix.htm +++ b/appendix.htm @@ -13,25 +13,25 @@ content="C:\PROGRAM FILES\MICROSOFT OFFICE\OFFICE\html.dot">
-
Regex++, - Appendices.-(version 3.12, 18 April 2000) -Copyright (c) 1998-2000 -Dr John Maddock - -Permission to use, copy, modify, distribute and sell this software -and its documentation for any purpose is hereby granted without fee, -provided that the above copyright notice appear in all copies and -that both that copyright notice and this permission notice appear -in supporting documentation. Dr John Maddock makes no representations -about the suitability of this software for any purpose. -It is provided "as is" without express or implied warranty.+ | Regex++, Appendices.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. |
-
Regex++, - FAQ.-(version 3.12, 18 April 2000) -Copyright (c) 1998-2000 -Dr John Maddock - -Permission to use, copy, modify, distribute and sell this software -and its documentation for any purpose is hereby granted without fee, -provided that the above copyright notice appear in all copies and -that both that copyright notice and this permission notice appear -in supporting documentation. Dr John Maddock makes no representations -about the suitability of this software for any purpose. -It is provided "as is" without express or implied warranty.+ | Regex++, FAQ.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. |
-
- |
- Regex++, Format String Reference.-Copyright (c) 1998-2000 -Dr John Maddock + + + + ++ - |
-
There are three kind of format string: -sed, perl and extended, the extended syntax is the default so this is covered -first.
-Extended format syntax
-In format strings, all characters are treated as literals except: ()$\?: -
-To use any of these as literals you must prefix them with the escape -character \
-The following special sequences are recognized:
-
-
Grouping:
-Use the parenthesis characters ( and ) to group sub-expressions within the
-format string, use \( and \) to represent literal '(' and ')'.
-
-
Sub-expression expansions:
-The following perl like expressions expand to a particular matched
-sub-expression:
-
- | $` | -Expands to all the text from the end of the -previous match to the start of the current match, if there was no previous -match in the current operation, then everything from the start of the input -string to the start of the match. | -- |
- | $' | -Expands to all the text from the end of the match -to the end of the input string. | -- |
- | $& | -Expands to all of the current match. | -- |
- | $0 | -Expands to all of the current match. | -- |
- | $N | -Expands to the text that matched sub-expression -N. | -- |
Conditional expressions:
-Conditional expressions allow two different format strings to be selected -dependent upon whether a sub-expression participated in the match or not:
-?Ntrue_expression:false_expression
-Executes true_expression if sub-expression N participated in the -match, otherwise executes false_expression.
-Example: suppose we search for "(while)|(for)" then the format
-string "?1WHILE:FOR" would output what matched, but in upper case.
-
-
-
Escape sequences:
-The following escape sequences are also allowed:
-
- | \a | -The bell character. | -- |
- | \f | -The form feed character. | -- |
- | \n | -The newline character. | -- |
- | \r | -The carriage return character. | -- |
- | \t | -The tab character. | -- |
- | \v | -A vertical tab character. | -- |
- | \x | -A hexadecimal character - for example \x0D. | -- |
- | \x{} | -A possible unicode hexadecimal character - for -example \x{1A0} | -- |
- | \cx | -The ASCII escape character x, for example \c@ is -equivalent to escape-@. | -- |
- | \e | -The ASCII escape character. | -- |
- | \dd | -An octal character constant, for example \10. | -- |
Perl format strings
-Perl format strings are the same as the default syntax except that the -characters ()?: have no special meaning.
-Sed format strings
-Sed format strings use only the characters \ and & as special -characters.
-\n where n is a digit, is expanded to the nth sub-expression.
-& is expanded to the whole of the match (equivalent to \0).
-Other escape sequences are expanded as per the default syntax.
-
Copyright Dr John -Maddock 1998-2000 all rights reserved.
- - ++
Regex++, Format + String Reference.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. + |
+
Format strings are used by the algorithms regex_format and regex_merge, and are +used to transform one string into another.
+ +There are three kinds of format string: sed, perl and extended. +The extended syntax is the default, so it is covered first.
+ +Extended format syntax
+ +In format strings, all characters are treated as literals +except: ()$\?:
+ +To use any of these as literals you must prefix them with the +escape character \
+ +The following special sequences are recognized:
+
+
Grouping:
+ +Use the parenthesis characters ( and ) to group sub-expressions
+within the format string, use \( and \) to represent literal '('
+and ')'.
+
+
Sub-expression expansions:
+ +The following perl like expressions expand to a particular
+matched sub-expression:
+
+ | $` | +Expands to all the text from + the end of the previous match to the start of the current + match, if there was no previous match in the current + operation, then everything from the start of the input + string to the start of the match. | ++ |
+ | $' | +Expands to all the text from + the end of the match to the end of the input string. | ++ |
+ | $& | +Expands to all of the + current match. | ++ |
+ | $0 | +Expands to all of the + current match. | ++ |
+ | $N | +Expands to the text that + matched sub-expression N. | ++ |
+
Conditional expressions:
+ +Conditional expressions allow two different format strings to +be selected dependent upon whether a sub-expression participated +in the match or not:
+ +?Ntrue_expression:false_expression
+ +Executes true_expression if sub-expression N +participated in the match, otherwise executes false_expression.
+ +Example: suppose we search for "(while)|(for)" then
+the format string "?1WHILE:FOR" would output what
+matched, but in upper case.
+
+
Escape sequences:
+ +The following escape sequences are also allowed:
+
+ | \a | +The bell character. | ++ |
+ | \f | +The form feed character. | ++ |
+ | \n | +The newline character. | ++ |
+ | \r | +The carriage return + character. | ++ |
+ | \t | +The tab character. | ++ |
+ | \v | +A vertical tab character. | ++ |
+ | \x | +A hexadecimal character - + for example \x0D. | ++ |
+ | \x{} | +A possible unicode + hexadecimal character - for example \x{1A0} | ++ |
+ | \cx | +The ASCII escape character + x, for example \c@ is equivalent to escape-@. | ++ |
+ | \e | +The ASCII escape character. | ++ |
+ | \dd | +An octal character constant, + for example \10. | ++ |
+
Perl format strings
+ +Perl format strings are the same as the default syntax except +that the characters ()?: have no special meaning.
+ +Sed format strings
+ +Sed format strings use only the characters \ and & as +special characters.
+ +\n where n is a digit, is expanded to the nth sub-expression.
+ +& is expanded to the whole of the match (equivalent to \0). +
+ +Other escape sequences are expanded as per the default syntax.
+
+
Copyright Dr +John Maddock 1998-2000 all rights reserved.
+ + diff --git a/hl_ref.htm b/hl_ref.htm index 7f1b1996..aa6df2c7 100644 --- a/hl_ref.htm +++ b/hl_ref.htm @@ -15,23 +15,24 @@ content="C:\PROGRAM FILES\MICROSOFT OFFICE\OFFICE\html.dot">Regex++, - RegEx Class Reference.-(version 3.12, 18 April 2000) -Copyright (c) 1998-2000 -Dr John Maddock - -Permission to use, copy, modify, distribute and sell this software -and its documentation for any purpose is hereby granted without fee, -provided that the above copyright notice appear in all copies and -that both that copyright notice and this permission notice appear -in supporting documentation. Dr John Maddock makes no representations -about the suitability of this software for any purpose. -It is provided "as is" without express or implied warranty.+ | Regex++, RegEx Class + Reference.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. |
Regex++, - Index.-(version 3.12, 18 April 2000) -Copyright (c) 1998-2000 -Dr John Maddock - -Permission to use, copy, modify, distribute and sell this software -and its documentation for any purpose is hereby granted without fee, -provided that the above copyright notice appear in all copies and -that both that copyright notice and this permission notice appear -in supporting documentation. Dr John Maddock makes no representations -about the suitability of this software for any purpose. -It is provided "as is" without express or implied warranty.+ | Regex++, Index.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. |
-
Regex++, - Introduction.-(version 3.12, 18 April 2000) -Copyright (c) 1998-2000 -Dr John Maddock - -Permission to use, copy, modify, distribute and sell this software -and its documentation for any purpose is hereby granted without fee, -provided that the above copyright notice appear in all copies and -that both that copyright notice and this permission notice appear -in supporting documentation. Dr John Maddock makes no representations -about the suitability of this software for any purpose. -It is provided "as is" without express or implied warranty.+ | Regex++, Introduction.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. |
-
- |
- Regex++, POSIX API Reference.-Copyright (c) 1998-2000 -Dr John Maddock + + + + ++ - |
-
#include <boost/cregex.hpp> -or: -#include <boost/regex.h>+
-
The following functions are available for users who need a POSIX compatible -C library, they are available in both Unicode and narrow character versions, -the standard POSIX API names are macros that expand to one version or the other -depending upon whether UNICODE is defined or not.
-Important: Note that all the symbols defined here are enclosed inside -namespace boost when used in C++ programs, unless you use #include -<boost/regex.h> instead - in which case the symbols are still defined in -namespace boost, but are made available in the global namespace as well.
-The functions are defined as:
-extern "C" { -int regcompA(regex_tA*, const char*, int); -unsigned int regerrorA(int, const regex_tA*, char*, unsigned int); -int regexecA(const regex_tA*, const char*, unsigned int, regmatch_t*, int); -void regfreeA(regex_tA*); +
Regex++, POSIX API + Reference.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. + |
+
#include <boost/cregex.hpp> +or: +#include <boost/regex.h>+ +
The following functions are available for users who need a +POSIX compatible C library. They are available in both Unicode +and narrow character versions; the standard POSIX API names are +macros that expand to one version or the other depending upon +whether UNICODE is defined or not.
+ +Important: Note that all the symbols defined here are +enclosed inside namespace boost when used in C++ programs, +unless you use #include <boost/regex.h> instead - in which +case the symbols are still defined in namespace boost, but are +made available in the global namespace as well.
+ +The functions are defined as:
+ +extern "C" { +int regcompA(regex_tA*, const char*, int); +unsigned int regerrorA(int, const regex_tA*, char*, unsigned int); +int regexecA(const regex_tA*, const char*, unsigned int, regmatch_t*, int); +void regfreeA(regex_tA*); + +int regcompW(regex_tW*, const wchar_t*, int); +unsigned int regerrorW(int, const regex_tW*, wchar_t*, unsigned int); +int regexecW(const regex_tW*, const wchar_t*, unsigned int, regmatch_t*, int); +void regfreeW(regex_tW*); #ifdef UNICODE #define regcomp regcompW @@ -75,203 +83,234 @@ namespace boost, but are made available in the global namespace as well. #define regfree regfreeA #define regex_t regex_tA #endif -}-All the functions operate on structure regex_t, which exposes two public -members: -
unsigned int re_nsub this is filled in by regcomp and -indicates the number of sub-expressions contained in the regular expression. -
-const TCHAR* re_endp points to the end of the expression to compile -when the flag REG_PEND is set.
-Footnote: regex_t is actually a #define - it is either regex_tA or -regex_tW depending upon whether UNICODE is defined or not, TCHAR is either char -or wchar_t again depending upon the macro UNICODE.
-regcomp takes a pointer to a regex_t, a pointer to the
-expression to compile and a flags parameter which can be a combination of:
-
- | REG_EXTENDED | -Compiles modern regular expressions. Equivalent to -regbase::char_classes | regbase::intervals | regbase::bk_refs. | -- |
- | REG_BASIC | -Compiles basic (obsolete) regular expression -syntax. Equivalent to regbase::char_classes | regbase::intervals | -regbase::limited_ops | regbase::bk_braces | regbase::bk_parens | -regbase::bk_refs. | -- |
- | REG_NOSPEC | -All characters are ordinary, the expression is a -literal string. | -- |
- | REG_ICASE | -Compiles for matching that ignores character -case. | -- |
- | REG_NOSUB | -Has no effect in this library. | -- |
- | REG_NEWLINE | -When this flag is set a dot does not match the -newline character. | -- |
- | REG_PEND | -When this flag is set the re_endp parameter of the -regex_t structure must point to the end of the regular expression to -compile. | -- |
- | REG_NOCOLLATE | -When this flag is set then locale dependent -collation for character ranges is turned off. | -- |
- | REG_ESCAPE_IN_LISTS -, , , |
-When this flag is set, then escape sequences are -permitted in bracket expressions (character sets). | -- |
- | REG_NEWLINE_ALT | -When this flag is set then the newline character -is equivalent to the alternation operator |. | -- |
- | REG_PERL | -A shortcut for perl-like behavior: -REG_EXTENDED | REG_NOCOLLATE | REG_ESCAPE_IN_LISTS | -- |
- | REG_AWK | -A shortcut for awk-like behavior: REG_EXTENDED | -REG_ESCAPE_IN_LISTS | -- |
- | REG_GREP | -A shortcut for grep like behavior: REG_BASIC | -REG_NEWLINE_ALT | -- |
- | REG_EGREP | -A shortcut for egrep like behavior: -REG_EXTENDED | REG_NEWLINE_ALT | -- |
regerror takes the following parameters, it maps an error code
-to a human readable string:
-
- | int code | -The error code. | -- |
- | const regex_t* e | -The regular expression (can be null). | -- |
- | char* buf | -The buffer to fill in with the error message. | -- |
- | unsigned int buf_size | -The length of buf. | -- |
If the error code is OR'ed with REG_ITOA then the message that results is -the printable name of the code rather than a message, for example -"REG_BADPAT". If the code is REG_ATIO then e must not be null -and e->re_pend must point to the printable name of an error code, the -return value is then the value of the error code. For any other value of -code, the return value is the number of characters in the error message, -if the return value is greater than or equal to buf_size then -regerror will have to be called again with a larger buffer.
-regexec finds the first occurrence of expression e within
-string buf. If len is non-zero then *m is filled in with
-what matched the regular expression, m[0] contains what matched the
-whole string, m[1] the first sub-expression etc, see regmatch_t
-in the header file declaration for more details. The eflags parameter
-can be a combination of:
-
- | REG_NOTBOL | -Parameter buf does not represent the start -of a line. | -- |
- | REG_NOTEOL | -Parameter buf does not terminate at the end -of a line. | -- |
- | REG_STARTEND | -The string searched starts at buf + -pmatch[0].rm_so and ends at buf + pmatch[0].rm_eo. | -- |
Finally regfree frees all the memory that was allocated by -regcomp.
-Footnote: this is an abridged reference to the POSIX API functions, it is
-provided for compatibility with other libraries, rather than an API to be used
-in new code (unless you need access from a language other than C++). This
-version of these functions should also happily coexist with other versions, as
-the names used are macros that expand to the actual function names.
-
Copyright Dr John -Maddock 1998-2000 all rights reserved.
- - +} +All the functions operate on structure regex_t, which +exposes two public members:
+ +unsigned int re_nsub this is filled in by regcomp +and indicates the number of sub-expressions contained in the +regular expression.
+ +const TCHAR* re_endp points to the end of the +expression to compile when the flag REG_PEND is set.
+ +Footnote: regex_t is actually a #define - it is either +regex_tA or regex_tW depending upon whether UNICODE is defined or +not, TCHAR is either char or wchar_t again depending upon the +macro UNICODE.
+ +regcomp takes a pointer to a regex_t, a pointer
+to the expression to compile and a flags parameter which can be a
+combination of:
+
+ | REG_EXTENDED | +Compiles modern regular + expressions. Equivalent to regbase::char_classes | + regbase::intervals | regbase::bk_refs. | ++ |
+ | REG_BASIC | +Compiles basic (obsolete) + regular expression syntax. Equivalent to regbase::char_classes + | regbase::intervals | regbase::limited_ops | regbase::bk_braces + | regbase::bk_parens | regbase::bk_refs. | ++ |
+ | REG_NOSPEC | +All characters are ordinary, + the expression is a literal string. | ++ |
+ | REG_ICASE | +Compiles for matching that + ignores character case. | ++ |
+ | REG_NOSUB | +Has no effect in this + library. | ++ |
+ | REG_NEWLINE | +When this flag is set a dot + does not match the newline character. | ++ |
+ | REG_PEND | +When this flag is set the + re_endp parameter of the regex_t structure must point to + the end of the regular expression to compile. | ++ |
+ | REG_NOCOLLATE | +When this flag is set then + locale dependent collation for character ranges is turned + off. | ++ |
+ | REG_ESCAPE_IN_LISTS + , , , |
+ When this flag is set, then + escape sequences are permitted in bracket expressions (character + sets). | ++ |
+ | REG_NEWLINE_ALT | +When this flag is set then + the newline character is equivalent to the alternation + operator |. | ++ |
+ | REG_PERL | +A shortcut for perl-like + behavior: REG_EXTENDED | REG_NOCOLLATE | + REG_ESCAPE_IN_LISTS | ++ |
+ | REG_AWK | +A shortcut for awk-like + behavior: REG_EXTENDED | REG_ESCAPE_IN_LISTS | ++ |
+ | REG_GREP | +A shortcut for grep like + behavior: REG_BASIC | REG_NEWLINE_ALT | ++ |
+ | REG_EGREP | +A shortcut for egrep + like behavior: REG_EXTENDED | REG_NEWLINE_ALT | ++ |
+
regerror maps an error code to a human readable string,
+and takes the following parameters:
+
+ | int code | +The error code. | ++ |
+ | const regex_t* e | +The regular expression (can + be null). | ++ |
+ | char* buf | +The buffer to fill in with + the error message. | ++ |
+ | unsigned int buf_size | +The length of buf. | ++ |
If the error code is OR'ed with REG_ITOA then the message that +results is the printable name of the code rather than a message, +for example "REG_BADPAT". If the code is REG_ATOI then e +must not be null and e->re_endp must point to the +printable name of an error code; the return value is then the +value of the error code. For any other value of code, the +return value is the number of characters in the error message; if +the return value is greater than or equal to buf_size then +regerror will have to be called again with a larger buffer.
+ +regexec finds the first occurrence of expression e
+within string buf. If len is non-zero then *m
+is filled in with what matched the regular expression: m[0]
+contains what matched the whole expression, m[1] the first sub-expression,
+and so on; see regmatch_t in the header file declaration for
+more details. The eflags parameter can be a combination of:
+
+
+ | REG_NOTBOL | +Parameter buf does + not represent the start of a line. | ++ |
+ | REG_NOTEOL | +Parameter buf does + not terminate at the end of a line. | ++ |
+ | REG_STARTEND | +The string searched starts + at buf + pmatch[0].rm_so and ends at buf + pmatch[0].rm_eo. | ++ |
+
Finally regfree frees all the memory that was allocated +by regcomp.
+ +Footnote: this is an abridged reference to the POSIX API
+functions, it is provided for compatibility with other libraries,
+rather than an API to be used in new code (unless you need access
+from a language other than C++). This version of these functions
+should also happily coexist with other versions, as the names
+used are macros that expand to the actual function names.
+
Copyright Dr +John Maddock 1998-2000 all rights reserved.
+ + diff --git a/src/c_regex_traits.cpp b/src/c_regex_traits.cpp index 2a4955ef..53eb4d04 100644 --- a/src/c_regex_traits.cpp +++ b/src/c_regex_traits.cpp @@ -16,7 +16,7 @@ /* * LOCATION: see http://www.boost.org for most recent version. * FILE c_regex_traits.cpp - * VERSION 3.12 + * VERSION seeRegex++, - Regular Expression Syntax.-(version 3.12, 18 April 2000) -Copyright (c) 1998-2000 -Dr John Maddock - -Permission to use, copy, modify, distribute and sell this software -and its documentation for any purpose is hereby granted without fee, -provided that the above copyright notice appear in all copies and -that both that copyright notice and this permission notice appear -in supporting documentation. Dr John Maddock makes no representations -about the suitability of this software for any purpose. -It is provided "as is" without express or implied warranty.+ | Regex++, Regular + Expression Syntax.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. |
-
![]() |
- Regex++, - Template Class and Algorithm Reference.-(version 3.12, 18 April 2000) -Copyright (c) 1998-9 -Dr John Maddock - -Permission to use, copy, modify, distribute and sell this software -and its documentation for any purpose is hereby granted without fee, -provided that the above copyright notice appear in all copies and -that both that copyright notice and this permission notice appear -in supporting documentation. Dr John Maddock makes no representations -about the suitability of this software for any purpose. -It is provided "as is" without express or implied warranty.+ | Regex++ template + class reference.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. |
Where "base_type" defaults to w32_regex_traits on Win32 systems, and c_regex_traits otherwise. The default behaviour can be changed by defining one of -BOOST_REGEX_USE_C_LOCALE (forces use of c_regex_traits by default), -or BOOST_REGEX_USE_CPP_LOCALE (forces use of cpp_regex_traits by -default). Alternatively a specific traits class can be passed to -the reg_expression template.
+BOOST_REGEX_USE_C_LOCALE (forces use of c_regex_traits by +default), or BOOST_REGEX_USE_CPP_LOCALE (forces use of cpp_regex_traits +by default). Alternatively a specific traits class can be passed +to the reg_expression template.The requirements for custom traits classes are documented separately here....
diff --git a/test/c_compiler_checks/posix_api_check.c b/test/c_compiler_checks/posix_api_check.c index c6731ef7..1d999f92 100644 --- a/test/c_compiler_checks/posix_api_check.c +++ b/test/c_compiler_checks/posix_api_check.c @@ -16,7 +16,7 @@ /* * LOCATION: see http://www.boost.org for most recent version. * FILE posix_api_compiler_check.c - * VERSION 3.01 + * VERSION seeRegex++, Traits Class -Reference. (version 3.12, 18 April 2000)--Copyright (c) 1998-2000 -Dr John Maddock + + + + + ++ - |
-
This section describes the traits class requirements of the reg_expression -template class, these requirements are somewhat complex (sorry), and subject to -change as uses ask for new features, however I will try to keep them stable for -a while, and ideally the requirements should lessen rather than increase.
-The reg_expression traits classes encapsulate both the properties of -a character type, and the properties of the locale associated with that type. -The associated locale may be defined at run-time (via std::locale), or -hard-coded into the traits class and determined at compile time.
-The following example class illustrates the interface required by a -"typical" traits class for use with class reg_expression:
-+
Regex++, Traits Class + Reference.+(Version 3.20, 29th Sept 2001) + +Copyright (c) 1998-2001 +Dr John Maddock +Permission to use, copy, modify, + distribute and sell this software and its documentation + for any purpose is hereby granted without fee, provided + that the above copyright notice appear in all copies and + that both that copyright notice and this permission + notice appear in supporting documentation. Dr John + Maddock makes no representations about the suitability of + this software for any purpose. It is provided "as is" + without express or implied warranty. + |
+
This section describes the traits class requirements of the +reg_expression template class. These requirements are somewhat +complex (sorry), and subject to change as users ask for new +features; however I will try to keep them stable for a while, and +ideally the requirements should lessen rather than increase.
+ +The reg_expression traits classes encapsulate both the +properties of a character type, and the properties of the locale +associated with that type. The associated locale may be defined +at run-time (via std::locale), or hard-coded into the traits +class and determined at compile time.
+ +The following example class illustrates the interface required +by a "typical" traits class for use with class +reg_expression:
+ +class mytraits { typedef implementation_defined char_type; @@ -161,711 +169,839 @@ class mytraits mytraits(); ~mytraits(); }; -+ -
The member types required by a traits class are defined as follows:
-
- | Member name | -Description | -- |
- | char_type | -The character type encapsulated -by this traits class, must be a POD type, and be convertible to uchar_type. - | -- |
- | uchar_type | -The unsigned type corresponding -to char_type, must be convertible to size_type. | -- |
- | size_type | -An unsigned integral type, with -at least as much precision as uchar_type. | -- |
- | string_type | -A type that offers the same -facilities as std::basic_string<char_type. This is used for collating -elements, and sort strings, if char_type has no locale dependent collation (it -is not a "character"), then it could be something simpler than -std::basic_string. | -- |
- | locale_type | -A type that encapsulates the -locale used by the traits class, probably std::locale but could be a platform -specific type, or a dummy type if per-instance locales are not supported by the -traits class. | -- |
- | uint32_t | -An unsigned integral type with -at least 32-bits of precision, used as a bitmask type for character -classification. | -- |
- | sentry | -A class or struct type which is
-constructible from an instance of the traits class, and is convertible to
-void*. An instance of type sentry will be constructed before compiling each
-regular expression, it provides an opportunity to carry out prefix/suffix
-operations on the traits class. For example a traits class that -encapsulates the global locale, can use this as an opportunity to synchronize -with the global locale (by updating any cached data). - |
-- |
- The following member constants are used to represent the locale
-independent syntax of a regular expression; the member function
-syntax_type returns one of these values, and is used to convert a locale
-dependent regular expression, into a locale-independent sequence of tokens.
-
- | Member constant | -English language -representation | -- |
- | syntax_char | -All non-special -characters. | -- |
- | syntax_open_bracket | -( | -- |
- | syntax_close_bracket | -) | -- |
- | syntax_dollar | -$ | -- |
- | syntax_caret | -^ | -- |
- | syntax_dot | -. | -- |
- | syntax_star | -* | -- |
- | syntax_plus | -+ | -- |
- | syntax_question | -? | -- |
- | syntax_open_set | -[ | -- |
- | syntax_close_set | -] | -- |
- | syntax_or | -| | -- |
- | syntax_slash | -\ | -- |
- | syntax_hash | -# | -- |
- | syntax_dash | -- | -- |
- | syntax_open_brace | -{ | -- |
- | syntax_close_brace | -} | -- |
- | syntax_digit | -0123456789 | -- |
- | syntax_b | -b | -- |
- | syntax_B | -B | -- |
- | syntax_left_word | -< | -- |
- | syntax_right_word | -> | -- |
- | syntax_w | -w | -- |
- | syntax_W | -W | -- |
- | syntax_start_buffer | -` | -- |
- | syntax_end_buffer | -' | -- |
- | syntax_newline | -\n | -- |
- | syntax_comma | -, | -- |
- | syntax_a | -a | -- |
- | syntax_f | -f | -- |
- | syntax_n | -n | -- |
- | syntax_r | -r | -- |
- | syntax_t | -t | -- |
- | syntax_v | -v | -- |
- | syntax_x | -x | -- |
- | syntax_c | -c | -- |
- | syntax_colon | -: | -- |
- | syntax_equal | -= | -- |
- | syntax_e | -e | -- |
- | syntax_l | -l | -- |
- | syntax_L | -L | -- |
- | syntax_u | -u | -- |
- | syntax_U | -U | -- |
- | syntax_s | -s | -- |
- | syntax_S | -S | -- |
- | syntax_d | -d | -- |
- | syntax_D | -D | -- |
- | syntax_E | -E | -- |
- | syntax_Q | -Q | -- |
- | syntax_X | -X | -- |
- | syntax_C | -C | -- |
- | syntax_Z | -Z | -- |
- | syntax_G | -G | -- |
- | syntax_bang | -! | -- |
- | syntax_and | -& | -- |
-
The following member constants are used to represent particular character
-classifications:
-
- | Member constant | -Description | -- |
- | char_class_none | -No classification, must be zero. - | -- |
- | char_class_alpha | -All alphabetic characters. | -- |
- | char_class_cntrl | -All control characters. | -- |
- | char_class_digit | -All decimal digits. | -- |
- | char_class_lower | -All lower case characters. | -- |
- | char_class_punct | -All punctuation characters. - | -- |
- | char_class_space | -All white-space characters. - | -- |
- | char_class_upper | -All upper case characters. | -- |
- | char_class_xdigit | -All hexadecimal digit -characters. | -- |
- | char_class_blank | -All blank characters (space + -tab). | -- |
- | char_class_unicode | -All extended unicode characters -- those that can not be represented as a single narrow character. | -- |
- | char_class_alnum | -All alpha-numeric characters. - | -- |
- | char_class_graph | -All graphic characters. | -- |
- | char_class_print | -All printable characters. | -- |
- | char_class_word | -All word characters -(alphanumeric characters + the underscore). | -- |
The following member functions are required by all regular expression traits
-classes, those members that are declared here as const, could be
-declared static instead if the class does not contain instance data:
-
- | Member function | -Description | -- |
- | static size_t length(const -char_type* p); | -Returns the length of the -null-terminated string p. | -- |
- | unsigned int -syntax_type(size_type c)const; | - Converts an input
-character into a locale independent token (one of the syntax_xxx member
-constants). Called when parsing the regular expression into a
-locale-independent parse tree. Example: in English language regular -expressions we would use "[[:word:]]" to represent the character -class of all word characters, and "\w" as a shortcut for this. -Consequently syntax_type('w') returns syntax_w. In French language regular -expressions, we would use "[[:mot:]]" in place of -"[[:word:]]" and therefore "\m" in place of "\w", -therefore it is syntax_type('m') that returns syntax_w. - |
-- |
- | char_type translate(char_type c, -bool icase)const; | - Translates an input
-character into a unique identifier that represents the equivalence class that
-that character belongs to. If icase is true, then the returned value is
-insensitive to case. [An equivalence class is the set of all -characters that must be treated as being equivalent to each other.] - |
-- |
- | void transform(string_type& -out, const string_type& in)const; | -Transforms the string -in, into a locale-dependent sort key, and stores the result in -out. | -- |
- | void -transform_primary(string_type& out, const string_type& in)const; | -Transforms the string -in, into a locale-dependent primary sort key, and stores the result in -out. | -- |
- | bool is_separator(char_type -c)const; | -Returns true only if -c is a line separator. | -- |
- | bool is_combining(char_type -c)const; | -Returns true only if -c is a unicode combining character. | -- |
- | bool is_class(char_type c, -uint32_t f)const; | -Returns true only if -c is a member of one of the character classes represented by the bitmap -f. | -- |
- | int toi(char_type c)const; | - Converts the character
-c to a decimal integer. [Precondition: -is_class(c,char_class_digit)==true] - |
-- |
- | int toi(const char_type*& -first, const char_type* last, int radix)const; | - Converts the string
-[first-last) into an integral value using base radix. Stops when it
-finds the first non-digit character, and sets first to point to that
-character. [Precondition: is_class(*first,char_class_digit)==true] - - |
-- |
- | uint32_t lookup_classname(const -char_type* first, const char_type* last)const; | -Returns the bitmap -representing the character class [first-last), or char_class_none if -[first-last) is not recognized as a character class name. | -- |
- | bool -lookup_collatename(string_type& buf, const char_type* first, const -char_type* last)const; | -If the sequence [first-last) is -the name of a known collating element, then stores the collating element in -buf, and returns true, otherwise returns false. | -- |
- | locale_type imbue(locale_type -l); | -Imbues the class with the -locale l. | -- |
- | locale_type getloc()const; | -Returns the traits-class -locale. | -- |
- | std::string -error_string(unsigned id)const; | -Returns the -locale-dependent error-string associated with the error-number id. The -parameter id is one of the REG_XXX error codes described by the POSIX -standard, and defined in <boost/cregex.hpp>. | -- |
- | mytraits(); | -Constructor. | -- |
- | ~ mytraits(); | -Destructor. | -- |
- -
Copyright Dr John -Maddock 1998-2000 all rights reserved.
- - +The member types required by a traits class are defined as
+follows:
+
+ | Member + name | +Description + | ++ |
+ | char_type | +The + character type encapsulated by this traits class, must be + a POD type, and be convertible to uchar_type. | ++ |
+ | uchar_type + | +The + unsigned type corresponding to char_type, must be + convertible to size_type. | ++ |
+ | size_type | +An + unsigned integral type, with at least as much precision + as uchar_type. | ++ |
+ | string_type + | +A type + that offers the same facilities as std::basic_string<char_type>. + This is used for collating elements, and sort strings, if + char_type has no locale dependent collation (it is not a + "character"), then it could be something + simpler than std::basic_string. | ++ |
+ | locale_type + | +A type + that encapsulates the locale used by the traits class, + probably std::locale but could be a platform specific + type, or a dummy type if per-instance locales are not + supported by the traits class. | ++ |
+ | uint32_t | +An + unsigned integral type with at least 32-bits of + precision, used as a bitmask type for character + classification. | ++ |
+ | sentry | +A class or
+ struct type which is constructible from an instance of
+ the traits class, and is convertible to void*. An
+ instance of type sentry will be constructed before
+ compiling each regular expression, it provides an
+ opportunity to carry out prefix/suffix operations on the
+ traits class. For example a traits class that + encapsulates the global locale, can use this as an + opportunity to synchronize with the global locale (by + updating any cached data). + |
+ + |
+ The following member constants are used to represent the
+locale independent syntax of a regular expression; the member
+function syntax_type returns one of these values, and is
+used to convert a locale dependent regular expression, into a
+locale-independent sequence of tokens.
+
+ | Member + constant | +English + language representation | ++ |
+ | syntax_char + | +All non-special + characters. | ++ |
+ | syntax_open_bracket + | +( | ++ |
+ | syntax_close_bracket + | +) | ++ |
+ | syntax_dollar + | +$ | ++ |
+ | syntax_caret + | +^ | ++ |
+ | syntax_dot + | +. | ++ |
+ | syntax_star + | +* | ++ |
+ | syntax_plus + | ++ | ++ |
+ | syntax_question + | +? | ++ |
+ | syntax_open_set + | +[ | ++ |
+ | syntax_close_set + | +] | ++ |
+ | syntax_or + | +| | ++ |
+ | syntax_slash + | +\ | ++ |
+ | syntax_hash + | +# | ++ |
+ | syntax_dash + | +- | ++ |
+ | syntax_open_brace + | +{ | ++ |
+ | syntax_close_brace + | +} | ++ |
+ | syntax_digit + | +0123456789 + | ++ |
+ | syntax_b + | +b | ++ |
+ | syntax_B + | +B | ++ |
+ | syntax_left_word + | +< + | ++ |
+ | syntax_right_word + | +> + | ++ |
+ | syntax_w + | +w | ++ |
+ | syntax_W + | +W | ++ |
+ | syntax_start_buffer + | +` | ++ |
+ | syntax_end_buffer + | +' | ++ |
+ | syntax_newline + | +\n | ++ |
+ | syntax_comma + | +, | ++ |
+ | syntax_a + | +a | ++ |
+ | syntax_f + | +f | ++ |
+ | syntax_n + | +n | ++ |
+ | syntax_r + | +r | ++ |
+ | syntax_t + | +t | ++ |
+ | syntax_v + | +v | ++ |
+ | syntax_x + | +x | ++ |
+ | syntax_c + | +c | ++ |
+ | syntax_colon + | +: | ++ |
+ | syntax_equal + | += | ++ |
+ | syntax_e + | +e | ++ |
+ | syntax_l + | +l | ++ |
+ | syntax_L + | +L | ++ |
+ | syntax_u + | +u | ++ |
+ | syntax_U + | +U | ++ |
+ | syntax_s + | +s | ++ |
+ | syntax_S + | +S | ++ |
+ | syntax_d + | +d | ++ |
+ | syntax_D + | +D | ++ |
+ | syntax_E + | +E | ++ |
+ | syntax_Q + | +Q | ++ |
+ | syntax_X + | +X | ++ |
+ | syntax_C + | +C | ++ |
+ | syntax_Z + | +Z | ++ |
+ | syntax_G + | +G | ++ |
+ | syntax_bang + | +! | ++ |
+ | syntax_and + | +& + | ++ |
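As a rough illustration of how these constants might be used, the sketch below maps input characters to locale-independent tokens. It is not the library's implementation: the enum values, the reduced set of constants, and the free functions english_syntax_type and french_syntax_type are all assumptions made for this example. The French variant follows the "\m" in place of "\w" example given in the description of syntax_type.

```cpp
#include <cstddef>

// Hypothetical subset of the syntax_xxx constants. The numeric values are
// illustrative only and are not taken from the library.
enum syntax_token : unsigned int {
    syntax_char = 0,
    syntax_open_bracket,
    syntax_close_bracket,
    syntax_star,
    syntax_slash,
    syntax_w
};

// Maps a character to its token under English-language conventions.
inline unsigned int english_syntax_type(std::size_t c)
{
    switch (c) {
    case '(':  return syntax_open_bracket;
    case ')':  return syntax_close_bracket;
    case '*':  return syntax_star;
    case '\\': return syntax_slash;
    case 'w':  return syntax_w;
    default:   return syntax_char;
    }
}

// Hypothetical French-language variant: 'm' takes the role of 'w',
// and 'w' becomes an ordinary character.
inline unsigned int french_syntax_type(std::size_t c)
{
    if (c == 'm') return syntax_w;
    if (c == 'w') return syntax_char;
    return english_syntax_type(c);
}
```

With a mapping of this kind the parser only ever sees tokens such as syntax_w, so the same parse tree results whichever language the expression was written in.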
The following member constants are used to represent
+particular character classifications:
+
+ | Member + constant | +Description + | ++ |
+ | char_class_none + | +No + classification, must be zero. | ++ |
+ | char_class_alpha + | +All + alphabetic characters. | ++ |
+ | char_class_cntrl + | +All + control characters. | ++ |
+ | char_class_digit + | +All + decimal digits. | ++ |
+ | char_class_lower + | +All lower + case characters. | ++ |
+ | char_class_punct + | +All + punctuation characters. | ++ |
+ | char_class_space + | +All white-space + characters. | ++ |
+ | char_class_upper + | +All upper + case characters. | ++ |
+ | char_class_xdigit + | +All + hexadecimal digit characters. | ++ |
+ | char_class_blank + | +All blank + characters (space + tab). | ++ |
+ | char_class_unicode + | +All + extended unicode characters - those that can not be + represented as a single narrow character. | ++ |
+ | char_class_alnum + | +All alpha-numeric + characters. | ++ |
+ | char_class_graph + | +All + graphic characters. | ++ |
+ | char_class_print + | +All + printable characters. | ++ |
+ | char_class_word + | +All word + characters (alphanumeric characters + the underscore). | ++ |
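The classification constants are combined as a bitmask, so a compound class such as char_class_word can be expressed as the union of simpler ones. The sketch below shows one possible arrangement for narrow characters. The bit assignments, the helper bit char_class_underscore, and the function classify are assumptions made for this example; the only requirement taken from the table above is that char_class_none is zero.

```cpp
#include <cctype>
#include <cstdint>

typedef std::uint32_t mask_type;  // stands in for the traits member uint32_t

// Hypothetical bit assignments; only char_class_none == 0 comes from the table.
const mask_type char_class_none       = 0;
const mask_type char_class_alpha      = 1u << 0;
const mask_type char_class_digit      = 1u << 1;
const mask_type char_class_space      = 1u << 2;
const mask_type char_class_underscore = 1u << 3;  // helper bit, not a documented constant
const mask_type char_class_alnum      = char_class_alpha | char_class_digit;
const mask_type char_class_word       = char_class_alnum | char_class_underscore;

// Computes the set of classes that c belongs to, using the C locale functions.
inline mask_type classify(char c)
{
    const unsigned char uc = static_cast<unsigned char>(c);
    mask_type m = char_class_none;
    if (std::isalpha(uc)) m |= char_class_alpha;
    if (std::isdigit(uc)) m |= char_class_digit;
    if (std::isspace(uc)) m |= char_class_space;
    if (c == '_')         m |= char_class_underscore;
    return m;
}

// True if c is a member of at least one of the classes in the bitmap f.
inline bool is_class(char c, mask_type f)
{
    return (classify(c) & f) != 0;
}
```

Because membership is tested with a bitwise AND, passing a union of classes asks whether the character belongs to any one of them.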
The following member functions are required by all regular
+expression traits classes, those members that are declared here
+as const, could be declared static instead if the
+class does not contain instance data:
+
+ | Member + function | +Description + | ++ |
+ | static + size_t length(const char_type* p); | +Returns + the length of the null-terminated string p. | ++ |
+ | unsigned + int syntax_type(size_type c)const; | + Converts
+ an input character into a locale independent token (one
+ of the syntax_xxx member constants). Called when parsing
+ the regular expression into a locale-independent parse
+ tree. Example: in English language regular + expressions we would use "[[:word:]]" to + represent the character class of all word characters, and + "\w" as a shortcut for this. Consequently + syntax_type('w') returns syntax_w. In French language + regular expressions, we would use "[[:mot:]]" + in place of "[[:word:]]" and therefore "\m" + in place of "\w", therefore it is syntax_type('m') + that returns syntax_w. + |
+ + |
+ | char_type + translate(char_type c, bool icase)const; | + Translates
+ an input character into a unique identifier that
+ represents the equivalence class that that character
+ belongs to. If icase is true, then the returned value is
+ insensitive to case. [An equivalence class is + the set of all characters that must be treated as being + equivalent to each other.] + |
+ + |
+ | void + transform(string_type& out, const string_type& in)const; + | +Transforms + the string in, into a locale-dependent sort key, + and stores the result in out. | ++ |
+ | void + transform_primary(string_type& out, const + string_type& in)const; | +Transforms + the string in, into a locale-dependent primary + sort key, and stores the result in out. | ++ |
+ | bool + is_separator(char_type c)const; | +Returns + true only if c is a line separator. | ++ |
+ | bool + is_combining(char_type c)const; | +Returns + true only if c is a unicode combining character. | ++ |
+ | bool + is_class(char_type c, uint32_t f)const; | +Returns + true only if c is a member of one of the character + classes represented by the bitmap f. | ++ |
+ | int toi(char_type + c)const; | + Converts
+ the character c to a decimal integer. [Precondition: + is_class(c,char_class_digit)==true] + |
+ + |
+ | int toi(const + char_type*& first, const char_type* last, int radix)const; + | + Converts
+ the string [first-last) into an integral value using base
+ radix. Stops when it finds the first non-digit
+ character, and sets first to point to that
+ character. [Precondition: is_class(*first,char_class_digit)==true] + + |
+ + |
+ | uint32_t + lookup_classname(const char_type* first, const char_type* + last)const; | +Returns + the bitmap representing the character class [first-last), + or char_class_none if [first-last) is not recognized as a + character class name. | ++ |
+ | bool + lookup_collatename(string_type& buf, const char_type* + first, const char_type* last)const; | +If the + sequence [first-last) is the name of a known collating + element, then stores the collating element in buf, and + returns true, otherwise returns false. | ++ |
+ | locale_type + imbue(locale_type l); | +Imbues + the class with the locale l. | ++ |
+ | locale_type + getloc()const; | +Returns + the traits-class locale. | ++ |
+ | std::string + error_string(unsigned id)const; | +Returns + the locale-dependent error-string associated with the + error-number id. The parameter id is one of + the REG_XXX error codes described by the POSIX standard, + and defined in <boost/cregex.hpp>. | ++ |
+ | mytraits(); + | +Constructor. + | ++ |
+ | ~ mytraits(); + | +Destructor. + | ++ |
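The two toi overloads described above can be sketched for plain char as follows. This is a minimal illustration and not the library's code: the helper digit_value is hypothetical, the sketch assumes an ASCII-compatible character set and a radix of at most 16, and it performs no overflow checking.

```cpp
#include <cassert>

// Hypothetical helper: value of a digit character, or -1 if c is not a digit
// in any base up to 16. Assumes 'a'..'f' and 'A'..'F' are contiguous (ASCII).
inline int digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Converts a single decimal digit. The precondition is checked with assert.
inline int toi(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}

// Converts the sequence [first, last) using base radix. Stops at the first
// character that is not a digit in that base and leaves first pointing at it.
inline int toi(const char*& first, const char* last, int radix)
{
    int result = 0;
    while (first != last) {
        const int d = digit_value(*first);
        if (d < 0 || d >= radix) break;  // not a digit in this base
        result = result * radix + d;
        ++first;
    }
    return result;
}
```

Taking first by reference lets the caller resume parsing at the character that ended the number, for example the comma or closing brace in a bounded repeat such as "{2,5}".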
Copyright Dr +John Maddock 1998-2000 all rights reserved.
+ +