The regular-expressions Library
Overview
The regular-expressions library exports the regular-expressions module, which contains functions that compile and search for regular expressions. The module has the same semantics as Perl (version 4) unless otherwise noted.
A regular expression that is grammatically correct may still be illegal if it contains an infinitely quantified sub-regex that matches the empty string. That is, if R is a regex that can match the empty string, then any regex containing R*, R+, or R{n,} is illegal. In this case, the regular-expressions library signals an <invalid-regex> error when the regex is parsed.
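The rule above can be checked by handling the condition at compile time. This is a minimal sketch, not part of the library: valid-regex? is a hypothetical helper, and the code assumes the enclosing library and module use common-dylan, regular-expressions, and format-out (from the io library).

```dylan
// Hypothetical helper: returns #t if pattern compiles, or #f if
// compile-regex signals <invalid-regex>.
define function valid-regex? (pattern :: <string>) => (ok? :: <boolean>)
  block ()
    compile-regex(pattern);
    #t
  exception (err :: <invalid-regex>)
    // regex-error-pattern returns the pattern that caused the error.
    format-out("rejected: %s\n", regex-error-pattern(err));
    #f
  end block
end function;

// a* can match the empty string, so by the rule above (a*)* is illegal.
valid-regex?("(a*)*");
// a+ always consumes at least one character, so (a+)* is legal.
valid-regex?("(a+)*");
```

Handling <invalid-regex> rather than the broader <regex-error> keeps unrelated errors visible to the caller.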
Quick Start
The most common use of regular expressions is probably to perform a search and figure out what text matched and/or where it occurred in a string. You need to use regular-expressions; in both your library and your module, and then…
define constant $re :: <regex> = compile-regex("^abc(.+)123$");

let match :: false-or(<regex-match>) = regex-search($re, "abcdef123");
// match is #f if search failed.
if (match)
  let text = match-group(match, 1);
  // text = "def"
  let (text, start, _end) = match-group(match, 1);
  // text = "def", start = 3, _end = 6
  match-group(match, 2) => error: <invalid-match-group>
  // group 0 is the entire match
  ...
end;

compile-regex("*") => error: <invalid-regex>
Reference
- <regex> Sealed Class
  A compiled regular expression object. These should only be created via compile-regex.
- <regex-error> Sealed Class
  The superclass of all regular expression-related errors.
  - Superclasses: <format-string-condition>, <error>
- <invalid-regex> Sealed Class
  Signalled by compile-regex when the given regular expression cannot be compiled.
  - Superclasses:
  - Init-Keywords: pattern
- regex-error-pattern Sealed Generic function
  Return the pattern that caused an <invalid-regex> error.
  - Signature: regex-error-pattern error => pattern
  - Parameters:
    - error – An <invalid-regex>.
  - Values:
    - pattern – A <string>.
- <invalid-match-group> Sealed Class
  Signalled when an invalid group identifier is passed to match-group.
  - Superclasses:
- <regex-match> Sealed Class
  Stores the match groups and other information about a specific regex search result.
  - Superclasses:
  - Init-Keywords: regular-expression
- compile-regex Sealed Generic function
  Compile a string into a <regex>.
  - Signature: compile-regex pattern #key case-sensitive verbose multi-line dot-matches-all use-cache => regex
  - Parameters:
    - pattern – A <string>.
    - case-sensitive (#key) – A <boolean>, default #t.
    - verbose (#key) – A <boolean>, default #f.
    - multi-line (#key) – A <boolean>, default #f.
    - dot-matches-all (#key) – A <boolean>, default #f.
    - use-cache (#key) – A <boolean>, default #t. If true, the resulting regular expression will be cached and re-used the next time the same string is compiled.
  - Values:
    - regex – A <regex>.
  - Conditions: <invalid-regex> is signalled if pattern can’t be compiled.
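As an illustration of the keyword arguments, the sketch below compiles a pattern with multi-line: #t so that "^" can also match after a newline. The function name starts-a-line? is hypothetical, and the code assumes the enclosing library and module use common-dylan and regular-expressions.

```dylan
// Hypothetical helper: does word appear at the start of any line in text?
// word is spliced into the pattern unescaped, so it must not contain
// regex metacharacters.
define function starts-a-line? (word :: <string>, text :: <string>)
 => (found? :: <boolean>)
  // use-cache: #f because the pattern is built on the fly and is
  // unlikely to be compiled again.
  let re = compile-regex(concatenate("^", word),
                         multi-line: #t,
                         use-cache: #f);
  regex-search(re, text) ~= #f
end function;

starts-a-line?("b", "a\nb");  // "b" follows a newline
starts-a-line?("b", "ab");    // "b" is in the middle of a line
```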
- regex-position Sealed Generic function
  Find the position of pattern in text.
  - Signature: regex-position pattern text #key start end case-sensitive => regex-start, #rest marks
  - Parameters:
  - Values:
    - regex-start – An instance of false-or(<integer>).
    - #rest marks – An instance of <object>.
  A match will only be found if it fits entirely within the range specified by start and end.
  If the regular expression is not found, return #f, otherwise return a variable number of indices marking the start and end of groups.
  This is a low-level API. Use regex-search if you want to get a <regex-match> object back.
- regex-replace Sealed Generic function
  Replace occurrences of pattern within big with replacement.
  - Signature: regex-replace big pattern replacement #key start end count case-sensitive => new-string
  - Parameters:
    - big – The <string> within which to search.
    - pattern – The <regex> to search for.
    - replacement – The <string> to replace pattern with.
    - start (#key) – An <integer>, default 0. The index in big at which to start searching.
    - end (#key) – An <integer>, default big.size. The index at which to end the search.
    - case-sensitive (#key) – A <boolean>, default #t.
    - count (#key) – An instance of false-or(<integer>), default #f. The number of matches to replace. #f means to replace all.
  - Values:
    - new-string – An instance of <string>.
  A match will only be found if it fits entirely within the range specified by start and end.
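A short sketch of the argument order (big, then pattern, then replacement) and of the count keyword. The name mask-digits and its limit keyword are hypothetical; the code assumes the enclosing library and module use common-dylan and regular-expressions.

```dylan
// Hypothetical helper: replace digits in text with "#".
// limit is passed through as count:, so #f replaces every match.
define function mask-digits
    (text :: <string>, #key limit :: false-or(<integer>) = #f)
 => (masked :: <string>)
  regex-replace(text, compile-regex("[0-9]"), "#", count: limit)
end function;

mask-digits("a1b22");            // all three digits are replaced
mask-digits("a1b22", limit: 1);  // only the first digit is replaced
```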
- regex-search Sealed Generic function
  Search for a pattern within text.
  - Signature: regex-search pattern text #key anchored start end case-sensitive => match
  - Parameters:
    - pattern – The <regex> to search for.
    - text – The <string> in which to search.
    - anchored (#key) – A <boolean>, default #f. Whether or not the search should be anchored at the start position. This is useful because “^…” will only match at the beginning of a string, or after \n if the regex was compiled with multi-line = #t.
    - start (#key) – An <integer>, default 0. The index in text at which to start searching.
    - end (#key) – An <integer>, default text.size. The index at which to end the search.
    - case-sensitive (#key) – A <boolean>, default #t.
  - Values:
    - match – An instance of false-or(<regex-match>). #f is returned if no match was found.
  A match will only be found if it fits entirely within the range specified by start and end.
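The start keyword can be used to resume a search partway through a string. The following is a sketch only; find-word and its offset keyword are hypothetical names, and the code assumes the enclosing library and module use common-dylan and regular-expressions.

```dylan
// Hypothetical helper: return the first run of lowercase letters found
// at or after offset, or #f if there is none.
define function find-word (text :: <string>, #key offset :: <integer> = 0)
 => (word :: false-or(<string>))
  let match = regex-search(compile-regex("[a-z]+"), text, start: offset);
  if (match)
    // Group 0 is the entire match; only the first return value is used.
    match-group(match, 0)
  else
    #f
  end
end function;

find-word("123 abc def");             // searches from index 0
find-word("123 abc def", offset: 7);  // skips past "abc"
find-word("123");                     // no lowercase letters, so no match
```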
- regex-search-strings Sealed Generic function
  Find all matches for a regular expression within a string.
  - Signature: regex-search-strings pattern text #key anchored start end case-sensitive => #rest strings
  - Parameters:
    - pattern – An instance of <regex>.
    - text – An instance of <string>.
    - anchored (#key) – An instance of <boolean>.
    - start (#key) – An <integer>, default 0. The index in text at which to start searching.
    - end (#key) – An <integer>, default text.size. The index at which to end the search.
    - case-sensitive (#key) – A <boolean>, default #t.
  - Values:
    - #rest strings – An instance of <object>.
  A match will only be found if it fits entirely within the range specified by start and end.
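Because the results come back as multiple values rather than as a collection, a caller can gather them with a #rest binding. The sketch below only shows how to receive and print the values; it makes no claim about which strings are returned. print-search-strings is a hypothetical name, and the code assumes the enclosing library and module use common-dylan, regular-expressions, and format-out (from the io library).

```dylan
// Hypothetical helper: print every value returned by regex-search-strings.
define function print-search-strings (pattern :: <string>, text :: <string>)
 => ()
  // Collect all returned values into one sequence.
  let (#rest strings) = regex-search-strings(compile-regex(pattern), text);
  for (s in strings)
    // %= prints the value in a debug-friendly form.
    format-out("%=\n", s);
  end for;
end function;

print-search-strings("([a-z]+)@([a-z]+)", "bob@example");
```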
- match-group Sealed Generic function
  Return information about a specific match group in a <regex-match>.
  - Signature: match-group match group => text start-index end-index
  - Parameters:
    - match – An instance of <regex-match>.
  - Values:
    - text – An instance of false-or(<string>).
    - start-index – An instance of false-or(<integer>).
    - end-index – An instance of false-or(<integer>).
  - Conditions: <invalid-match-group> is signalled if group does not name a valid group.
  The requested group may be an <integer> to access groups by number, or a <string> to access groups by name. Accessing groups by name only works if they were given names in the compiled regular expression via the (?<foo>...) syntax.
  Group 0 is always the entire regular expression match.
  It is possible for the group identifier to be valid and for #f to be returned. This can happen, for example, if the group was in the part of an | (or) expression that didn’t match.
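The two points above, access by name and a valid group that returns #f, can be sketched as follows. parse-date and which-branch are hypothetical helpers, and the code assumes the enclosing library and module use common-dylan and regular-expressions.

```dylan
// Hypothetical helper: access groups by the names given with (?<name>...).
define function parse-date (text :: <string>)
 => (year :: false-or(<string>), month :: false-or(<string>))
  let re = compile-regex("(?<year>[0-9]+)-(?<month>[0-9]+)");
  let match = regex-search(re, text);
  if (match)
    // In argument position only the first value (the text) is used.
    values(match-group(match, "year"), match-group(match, "month"))
  else
    values(#f, #f)
  end
end function;

// Hypothetical helper: groups 1 and 2 are both valid, but only the
// branch of the | that matched has text; the other yields #f.
define function which-branch (text :: <string>) => (branch :: <symbol>)
  let match = regex-search(compile-regex("(cat)|(dog)"), text);
  case
    ~match                => #"none";
    match-group(match, 1) => #"cat";
    otherwise             => #"dog";
  end case
end function;

parse-date("2024-05");
which-branch("hot dog");
```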