mirror of
https://github.com/boostorg/regex.git
synced 2025-07-17 06:12:10 +02:00
121 lines
2.5 KiB
Plaintext
121 lines
2.5 KiB
Plaintext
![]() |
|
||
|
[section:collating_names Collating Names]
|
||
|
|
||
|
[section:digraphs Digraphs]
|
||
|
|
||
|
The following are treated as valid digraphs when used as a collating name:
|
||
|
|
||
|
"ae", "Ae", "AE", "ch", "Ch", "CH", "ll", "Ll", "LL", "ss", "Ss", "SS", "nj", "Nj", "NJ", "dz", "Dz", "DZ", "lj", "Lj", "LJ".
|
||
|
|
||
|
So for example the expression:
|
||
|
|
||
|
[pre \[\[.ae.\]-c\] ]
|
||
|
|
||
|
will match any character that collates between the digraph "ae" and the character "c".
|
||
|
|
||
|
[endsect]
|
||
|
|
||
|
[section:posix_symbolic_names POSIX Symbolic Names]
|
||
|
|
||
|
The following symbolic names are recognised as valid collating element names,
|
||
|
in addition to any single character, this allows you to write for example:
|
||
|
|
||
|
[pre \[\[.left-square-bracket.\]\[.right-square-bracket.\]\]]
|
||
|
|
||
|
if you wanted to match either "\[" or "\]".
|
||
|
|
||
|
[table
|
||
|
[[Name][Character]]
|
||
|
[[NUL] [\\x00]]
|
||
|
[[SOH] [\\x01]]
|
||
|
[[STX] [\\x02]]
|
||
|
[[ETX] [\\x03]]
|
||
|
[[EOT] [\\x04]]
|
||
|
[[ENQ] [\\x05]]
|
||
|
[[ACK] [\\x06]]
|
||
|
[[alert] [\\x07]]
|
||
|
[[backspace] [\\x08]]
|
||
|
[[tab] [\\t]]
|
||
|
[[newline] [\\n]]
|
||
|
[[vertical-tab] [\\v]]
|
||
|
[[form-feed] [\\f]]
|
||
|
[[carriage-return] [\\r]]
|
||
|
[[SO] [\\xE]]
|
||
|
[[SI] [\\xF]]
|
||
|
[[DLE] [\\x10]]
|
||
|
[[DC1] [\\x11]]
|
||
|
[[DC2] [\\x12]]
|
||
|
[[DC3] [\\x13]]
|
||
|
[[DC4] [\\x14]]
|
||
|
[[NAK] [\\x15]]
|
||
|
[[SYN] [\\x16]]
|
||
|
[[ETB] [\\x17]]
|
||
|
[[CAN] [\\x18]]
|
||
|
[[EM] [\\x19]]
|
||
|
[[SUB] [\\x1A]]
|
||
|
[[ESC] [\\x1B]]
|
||
|
[[IS4] [\\x1C]]
|
||
|
[[IS3] [\\x1D]]
|
||
|
[[IS2] [\\x1E]]
|
||
|
[[IS1] [\\x1F]]
|
||
|
[[space] [\\x20]]
|
||
|
[[exclamation-mark] [!]]
|
||
|
[[quotation-mark] ["]]
|
||
|
[[number-sign] [#]]
|
||
|
[[dollar-sign] [$]]
|
||
|
[[percent-sign] [%]]
|
||
|
[[ampersand] [&]]
|
||
|
[[apostrophe] [\']]
|
||
|
[[left-parenthesis] [(]]
|
||
|
[[right-parenthesis] [)]]
|
||
|
[[asterisk] [\*]]
|
||
|
[[plus-sign] [+]]
|
||
|
[[comma] [,]]
|
||
|
[[hyphen] [-]]
|
||
|
[[period] [.]]
|
||
|
[[slash] [ / ]]
|
||
|
[[zero] [0]]
|
||
|
[[one] [1]]
|
||
|
[[two] [2]]
|
||
|
[[three] [3]]
|
||
|
[[four] [4]]
|
||
|
[[five] [5]]
|
||
|
[[six] [6]]
|
||
|
[[seven] [7]]
|
||
|
[[eight] [8]]
|
||
|
[[nine] [9]]
|
||
|
[[colon] [\:]]
|
||
|
[[semicolon] [;]]
|
||
|
[[less-than-sign] [<]]
|
||
|
[[equals-sign] [=]]
|
||
|
[[greater-than-sign] [>]]
|
||
|
[[question-mark] [?]]
|
||
|
[[commercial-at] [@]]
|
||
|
[[left-square-bracket] [\[]]
|
||
|
[[backslash][\\]]
|
||
|
[[right-square-bracket][\]]]
|
||
|
[[circumflex][~]]
|
||
|
[[underscore][_]]
|
||
|
[[grave-accent][`]]
|
||
|
[[left-curly-bracket][{]]
|
||
|
[[vertical-line][|]]
|
||
|
[[right-curly-bracket][}]]
|
||
|
[[tilde][~]]
|
||
|
[[DEL][\\x7F]]
|
||
|
]
|
||
|
|
||
|
[endsect]
|
||
|
|
||
|
[section:named_unicode Named Unicode Characters]
|
||
|
|
||
|
When using [link boost_regex.unicode Unicode aware regular expressions] (with the `u32regex` type), all
|
||
|
the normal symbolic names for Unicode characters (those given in Unidata.txt)
|
||
|
are recognised. So for example:
|
||
|
|
||
|
[pre \[\[.CYRILLIC CAPITAL LETTER I.\]\] ]
|
||
|
|
||
|
would match the Unicode character 0x0418.
|
||
|
|
||
|
[endsect]
|
||
|
[endsect]
|