Initial commit of quickbook conversion of docs.

[SVN r37942]
2007-06-08 09:13:34 +00:00
parent f4877f6698
commit 5f96b68080
52 changed files with 8859 additions and 0 deletions
--- a/doc/unicode.qbk
+++ b/doc/unicode.qbk
@@ -0,0 +1,34 @@
+
+[section:unicode Unicode and Boost.Regex]
+
+There are two ways to use Boost.Regex with Unicode strings:
+
+[h4 Rely on wchar_t]
+
+If your platform's `wchar_t` type can hold Unicode characters, and your 
+platform's C/C++ runtime correctly handles wide characters 
+(when passed to `std::iswspace`, `std::iswlower`, etc.), then you can use 
+`boost::wregex` to process Unicode.  However, there are several 
+disadvantages to this approach:
+
+* It's not portable: there's no guarantee about the width of `wchar_t`, or 
+even whether the runtime treats wide characters as Unicode at all. 
+Most Windows compilers do so, but many Unix systems do not.
+* There's no support for Unicode-specific character classes: `[[:Nd:]]`, `[[:Po:]]`, etc.
+* You can only search strings that are encoded as sequences of wide 
+characters: it is not possible to search UTF-8, or even UTF-16 on many platforms.
+
+[h4 Use a Unicode-Aware Regular Expression Type]
+
+If you have the 
+[@http://www.ibm.com/software/globalization/icu/ ICU library], then 
+Boost.Regex can be 
+[link boost_regex.install.building_with_unicode_and_icu_support 
+configured to make use 
+of it] to provide a distinct regular expression type (`boost::u32regex`) 
+that supports both Unicode-specific character properties and the searching 
+of text that is encoded as UTF-8, UTF-16, or UTF-32.  See: 
+[link boost_regex.ref.non_std_strings.icu 
+ICU string class support].
+
+[endsect]
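
The portability caveats in the "Rely on wchar_t" part of the section above can be probed with a few lines of standard C++. The following is only a rough sketch, not part of the commit and not a Boost.Regex API: the helper names (`wchar_bits`, `wchar_holds_any_code_point`, `runtime_knows_ideographic_space`) are hypothetical, and the choice of U+3000 as a probe character is an assumption made for illustration.

```cpp
#include <climits>   // CHAR_BIT
#include <cwchar>    // WCHAR_MAX
#include <cwctype>   // std::iswspace, std::wint_t

// Hypothetical helper: number of bits in one wchar_t on this platform
// (commonly 16 on Windows and 32 on many Unix systems).
constexpr unsigned wchar_bits() {
    return static_cast<unsigned>(sizeof(wchar_t) * CHAR_BIT);
}

// Hypothetical helper: true when a single wchar_t can represent every
// Unicode code point (the highest is U+10FFFF) without surrogate pairs.
constexpr bool wchar_holds_any_code_point() {
    return static_cast<unsigned long long>(WCHAR_MAX) >= 0x10FFFFull;
}

// Hypothetical helper: asks the C runtime whether it classifies
// U+3000 (IDEOGRAPHIC SPACE) as whitespace in the current locale.
// A false result suggests the runtime is not treating wide characters
// as Unicode here; the answer depends on platform and locale.
bool runtime_knows_ideographic_space() {
    return std::iswspace(static_cast<std::wint_t>(0x3000)) != 0;
}
```

If `wchar_holds_any_code_point()` is false, or the runtime probe fails under the locale you intend to use, that points towards the ICU-based `boost::u32regex` route described in the second half of the section rather than `boost::wregex`.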