Pattern Matching
Opie provides pattern matching using regular expressions or wildcards.
It supports Unicode and character sets.
Regular Expressions
In regular expression mode, the following primitives are recognized:
- c matches the character 'c'
- . matches any character
- ^ matches start of input
- $ matches end of input
- [] matches a defined set of characters - see below.
- a* matches a sequence of zero or more a's
- a+ matches a sequence of one or more a's
- a? matches an optional a
- \c is an escape code for matching special characters such
as \, [, *, +, ., etc.
- \t matches the TAB character (9)
- \n matches newline (10)
- \r matches return (13)
- \s matches a white space character, defined as any character
for which QChar::isSpace() returns TRUE. This includes at least
ASCII characters 9 (TAB), 10 (LF), 11 (VT), 12 (FF), 13 (CR) and 32
(Space).
- \d matches a digit, defined as any character for
which QChar::isDigit() returns TRUE. This includes at least ASCII
characters '0'-'9'.
- \x1f6b matches the character with Unicode code point U+1F6B
(hexadecimal 1f6b). \x0012 matches the ASCII/Latin1 character
0x12 (18 decimal, 12 hexadecimal).
- \022 matches the ASCII/Latin1 character 022 (18
decimal, 22 octal).
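The primitives listed above can be tried out with a short sketch. Python's `re` module is used here only as a runnable stand-in and is not the engine this page documents; it is assumed to behave the same way for the primitives shown. The helper `first_match` is a hypothetical name introduced for illustration. The hexadecimal escape is left out because Python spells it differently (`\x` takes exactly two hex digits).

```python
import re

# Assumption: Python's `re` is a stand-in for the engine described above.
# Only primitives that both share are exercised here.
def first_match(pattern, text):
    """Return the first substring of `text` matched by `pattern`, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

examples = [
    (r"c.t", "a cat"),         # . matches any character
    (r"^ab*", "abbbc"),        # ^ anchors at the start, b* is zero or more b's
    (r"ab+", "a abb"),         # b+ needs at least one b
    (r"colou?r$", "my color"), # u? is optional, $ anchors at the end
    (r"\d+\s\d+", "x 10 20"),  # digits, one white space character, digits
    (r"1\+1", "1+1=2"),        # \+ matches a literal plus sign
    (r"\022", "\x12"),         # octal 22 is character 18
]

for pattern, text in examples:
    print(repr(pattern), repr(text), "->", repr(first_match(pattern, text)))

# The two numeric escapes in the list name the same character:
# hexadecimal 12 and octal 22 are both 18 decimal.
assert int("12", 16) == int("22", 8) == 18
```

Searching for the first match, rather than requiring the whole string to match, is a choice made for this sketch so that the anchors ^ and $ have a visible effect.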
Wildcard Mode
In wildcard mode, only four primitives are recognized:
- c matches the character 'c'
- ? matches any character
- * matches any sequence of characters
- [] matches a defined set of characters - see below.
Unicode is supported both in the pattern strings and in the
strings to be matched.
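One way to picture the four wildcard primitives is as a translation into regular expression syntax. The following is a minimal sketch of that idea, not the implementation used by Opie or Qt. The names `wildcard_to_regex` and `wildcard_match` are hypothetical, Python's `re` is assumed as the underlying engine, and matching against the whole string is an assumption of the sketch.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a regular expression (sketch)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            out.append(".")    # ? matches any single character
        elif ch == "*":
            out.append(".*")   # * matches any sequence of characters
        elif ch == "[":
            # Find the closing ']', skipping a leading '^' and a ']'
            # placed first, since neither of those ends the set.
            j = i + 1
            if pattern[j:j + 1] == "^":
                j += 1
            if pattern[j:j + 1] == "]":
                j += 1
            end = pattern.find("]", j)
            if end == -1:
                # Assumption: an unterminated '[' is treated literally.
                out.append(re.escape(ch))
            else:
                out.append(pattern[i:end + 1])  # pass the set through as-is
                i = end
        else:
            out.append(re.escape(ch))  # any other character matches itself
        i += 1
    return "".join(out)

def wildcard_match(pattern, text):
    """Return True if all of `text` matches the wildcard `pattern`."""
    regex = wildcard_to_regex(pattern)
    return re.fullmatch(regex, text, re.DOTALL) is not None

print(wildcard_match("*.txt", "notes.txt"))
print(wildcard_match("caf?", "caf\u00e9"))  # non-ASCII text is one character
```

Note that '.' has no special meaning in wildcard mode, so the sketch escapes it: the pattern a.b matches only the literal text a.b. Escaped characters inside a set are not handled by this sketch.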
A character set matches a defined set of characters. For example,
[BSD] matches any of 'B', 'D' and 'S'. Within a character set, the
special characters '.', '*', '?', '^', '$', '+' and '[' lose their
special meanings. The following special characters apply:
- ^ When placed first in the list, changes the
character set to match any character not in the list. To include
the character '^' itself in the set, escape it or place it anywhere
but first.
- - Defines a range of characters. To include the
character '-' itself in the set, escape it or place it last.
- ] Ends the character set definition. To include the
character ']' itself in the set, escape it or place it first (but
after the negation operator '^', if present).
Thus, [a-zA-Z0-9.] matches upper and lower case ASCII letters,
digits and dot; and [^\s] matches everything except white space.
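The character set rules above can be checked with a few small cases. As a labeled assumption, Python's `re` module again stands in for the engine described here; it follows the same placement rules for '^', '-' and ']' in the cases shown. The helper `matches` is a hypothetical name, and it requires the whole string to match.

```python
import re

def matches(pattern, text):
    """Return True if all of `text` matches `pattern` (Python `re` stand-in)."""
    return re.fullmatch(pattern, text) is not None

checks = [
    (r"[BSD]", "S"),                # any one of 'B', 'S' or 'D'
    (r"[.*?+$]", "*"),              # special characters are literal in a set
    (r"[a-zA-Z0-9.]+", "file.v2"),  # ranges plus a literal dot
    (r"[^\s]+", "nospace"),         # negated set: anything but white space
    (r"[a^]", "^"),                 # '^' is literal when not placed first
    (r"[a-]", "-"),                 # '-' is literal when placed last
    (r"[]a]", "]"),                 # ']' is literal when placed first
    (r"[^]a]", "b"),                # ... or first after the negation '^'
]

for pattern, text in checks:
    print(repr(pattern), repr(text), "->", matches(pattern, text))
```

Each pair in `checks` is expected to match. Strings such as "file_v2" for the third pattern or "has space" for the fourth are expected not to, because they contain a character outside the set.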
Bugs and limitations:
- Case-insensitive matching is not supported for non-ASCII/Latin1
(non-8bit) characters.
Taken from the Qt Documentation.
This file is part of the Qt toolkit,
copyright © 1995-2001 Trolltech, all rights reserved.