The Juggling Lab siteswap generator allows regular expressions to be used as output filters. This page is not intended to be a comprehensive discussion of regular expressions; for that, see one of the several good tutorials available on the web.
Juggling Lab uses standard regular expression syntax, with some important differences.
In standard regular expressions, the characters []()| act as metacharacters with special non-literal meaning. Doing a literal match of one of these characters requires a preceding backslash '\'; for example, the standard regex \[ matches the string [ literally. In siteswap notation the characters []()| are part of the notation itself, so relative to standard regular expressions we swap the roles of [ and \[, and likewise for the other four characters. So within Juggling Lab the regex [ is a literal match for [, while \[ and \] are used to define character classes (see below).
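The swap described above can be sketched as a small translation step. This is only an illustration, not Juggling Lab's actual code: the helper name jlab_to_standard is hypothetical, Python's re module stands in for the regex engine, and edge cases such as these characters appearing inside a character class are ignored.

```python
import re

# Characters whose escaped and unescaped roles are swapped in Juggling Lab.
SWAPPED = set("[]()|")


def jlab_to_standard(pattern):
    """Translate a Juggling Lab filter term to standard syntax (hypothetical helper)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in SWAPPED:
                out.append(nxt)        # \[ in Juggling Lab is the metacharacter [
            else:
                out.append(ch + nxt)   # other escapes such as \d pass through unchanged
            i += 2
        elif ch in SWAPPED:
            out.append("\\" + ch)      # bare [ in Juggling Lab is a literal, i.e. \[
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(jlab_to_standard("[33]"))        # \[33\]
print(jlab_to_standard(r"\[0-9\]+"))   # [0-9]+
print(bool(re.fullmatch(jlab_to_standard("[33]"), "[33]")))  # True
```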
The "include" patterns are applied once, after a pattern is generated but before it is printed. Therefore the boundary matchers ^ and $ do what one expects, matching the beginning or end of the pattern respectively. For example, the include filter 4$ results in patterns ending with a 4 throw, since $ matches the end of the pattern.
By contrast, for efficiency reasons the "exclude" filters are applied as the pattern is being built up, throw by throw. So the beginning matcher ^ always matches the beginning of the pattern, but the end matcher $ can match the end of any throw. (For this purpose, a "throw" is any set of events occurring simultaneously, e.g., (4,[2x2]) counts as a single throw.) So an exclude filter of 4$ excludes patterns containing 4 throws anywhere, not just at the end of the pattern.
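The difference in timing between the two filter types can be sketched as follows. This is a simplified model under stated assumptions, not the generator's real implementation: the function names are hypothetical, the regexes are written in standard syntax with the leading .* already in place, and a whole-string match in Python's re module stands in for the actual matching step.

```python
import re
from typing import List


def include_matches(regex: str, pattern: str) -> bool:
    """Include filter: tested once against the finished pattern."""
    return re.fullmatch(regex, pattern) is not None


def exclude_matches(regex: str, throws: List[str]) -> bool:
    """Exclude filter: tested after every throw against the pattern built so far."""
    built = ""
    for throw in throws:
        built += throw
        if re.fullmatch(regex, built) is not None:
            return True   # $ matched at the end of this throw
    return False


print(include_matches(".*4$", "534"))            # True: the pattern ends with 4
print(include_matches(".*4$", "441"))            # False: the pattern ends with 1
print(exclude_matches(".*4$", ["4", "4", "1"]))  # True: a 4 ends the partial pattern
print(exclude_matches(".*4$", ["5", "3", "1"]))  # False: no 4 throw at all
```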
If the beginning matcher ^ is not supplied in a given filter term, then a .* wildcard match is prepended to it. For include filters only, the same .* is appended to the pattern if no ending matcher $ is supplied. This is done for convenience, so that for example an include filter of 4 will match a 4 throw anywhere in the pattern (it is converted to .*4.* before the regex matching is done).
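The wildcard rule above amounts to a short rewriting step. A minimal sketch, assuming a filter term is a plain string and using a hypothetical helper name expand_filter:

```python
def expand_filter(term: str, is_include: bool) -> str:
    """Add the implicit .* wildcards to a filter term (hypothetical helper)."""
    expanded = term
    if not term.startswith("^"):
        expanded = ".*" + expanded      # no beginning matcher: allow any prefix
    if is_include and not term.endswith("$"):
        expanded = expanded + ".*"      # include filters only: allow any suffix
    return expanded


print(expand_filter("4", is_include=True))    # .*4.*
print(expand_filter("4$", is_include=True))   # .*4$
print(expand_filter("^5", is_include=True))   # ^5.*
print(expand_filter("33", is_include=False))  # .*33
```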
Note that .* is not automatically added to the end of exclude filters. Thus for example an exclude filter 33 will match two successive 3 throws anywhere in the pattern, but it will not match the siteswap throw [33]. One could exclude the latter with a filter pattern of [33] or 33] instead.
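To see why an exclude filter of 33 misses the throw [33], consider the sketch below. It is an illustration only: the function name is hypothetical, the filters are written in standard syntax (so Juggling Lab's [33] becomes \[33\] and 33] becomes 33\]), and the throw sequences are chosen to show the matching behavior rather than to be valid siteswaps.

```python
import re


def excluded(throws, standard_regex):
    """Return True if the exclude filter fires on any partial pattern."""
    built = ""
    for throw in throws:
        built += throw
        # Only a leading .* is added, so the regex must match right up to
        # the end of the most recent throw.
        if re.fullmatch(".*" + standard_regex, built):
            return True
    return False


print(excluded(["5", "3", "3", "1"], "33"))     # True: "533" ends with 33
print(excluded(["4", "[33]", "2"], "33"))       # False: "4[33]" ends with ]
print(excluded(["4", "[33]", "2"], r"\[33\]"))  # True
print(excluded(["4", "[33]", "2"], r"33\]"))    # True
```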
Char         Matches any identical character

\[abc\]      Simple character class
\[a-zA-Z\]   Character class with ranges
\[^abc\]     Negated character class

.            Matches any character other than newline
\d           Matches a digit character
\D           Matches a non-digit character

^            Matches only at the beginning of a pattern
$            Matches at the end of a pattern, or throw (see note above)

A*           Matches A 0 or more times (greedy)
A+           Matches A 1 or more times (greedy)
A?           Matches A 1 or 0 times (greedy)
A{n}         Matches A exactly n times (greedy)
A{n,}        Matches A at least n times (greedy)
A{n,m}       Matches A at least n but not more than m times (greedy)

A*?          Matches A 0 or more times (reluctant)
A+?          Matches A 1 or more times (reluctant)
A??          Matches A 0 or 1 times (reluctant)

AB           Matches A followed by B
A\|B         Matches either A or B
\(A\)        Used for subexpression grouping
\(?:A\)      Used for subexpression clustering (just like grouping but no backrefs)

\1           Backreference to 1st parenthesized subexpression
\2           Backreference to 2nd parenthesized subexpression
\3           Backreference to 3rd parenthesized subexpression
\4           Backreference to 4th parenthesized subexpression
\5           Backreference to 5th parenthesized subexpression
\6           Backreference to 6th parenthesized subexpression
\7           Backreference to 7th parenthesized subexpression
\8           Backreference to 8th parenthesized subexpression
\9           Backreference to 9th parenthesized subexpression
You can refer to the contents of a parenthesized expression within a regular expression itself. This is called a 'backreference'. The first backreference in a regular expression is denoted by \1, the second by \2 and so on. So the expression:
\(\[0-9\]+\)=\1
will match any string of the form n=n (like 0=0 or 2=2).
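The same backreference can be tried out in standard syntax, where the Juggling Lab expression above corresponds to ([0-9]+)=\1. The sketch below uses Python's re module as a stand-in engine; the function name is hypothetical.

```python
import re

# Standard-syntax equivalent of the Juggling Lab expression \(\[0-9\]+\)=\1
N_EQUALS_N = re.compile(r"([0-9]+)=\1")


def is_n_equals_n(text: str) -> bool:
    """True if text has the form n=n, with the same digits on both sides."""
    return N_EQUALS_N.fullmatch(text) is not None


for text in ["0=0", "2=2", "12=12", "2=3"]:
    print(text, is_n_equals_n(text))
# 0=0 True
# 2=2 True
# 12=12 True
# 2=3 False
```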
All closure operators (+, *, ?, {n,m}) are greedy by default, meaning that they match as many elements of the string as possible without causing the overall match to fail. If you want a closure to be reluctant (non-greedy), you can simply follow it with a '?'. A reluctant closure will match as few elements of the string as possible when finding matches. {n,m} closures don't currently support reluctant matching.
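The difference between greedy and reluctant closures is easiest to see on a string with several possible stopping points. A small sketch using Python's re module as a stand-in engine (the input string is just an illustrative run of throw values):

```python
import re

text = "4514414"

# Greedy: .* takes as much as possible, so the match runs to the last 4.
greedy = re.search(r"4.*4", text).group()

# Reluctant: .*? takes as little as possible, so the match stops at the next 4.
reluctant = re.search(r"4.*?4", text).group()

print(greedy)     # 4514414
print(reluctant)  # 4514
```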