The regular-expressions Library

Overview

The regular-expressions library exports the regular-expressions module, which contains functions that compile and search for regular expressions. The module has the same semantics as Perl (version 4) unless otherwise noted.

A regular expression that is grammatically correct may still be illegal if it contains an infinitely quantified sub-regex that matches the empty string. That is, if R is a regex that can match the empty string, then any regex containing R*, R+, or R{n,} is illegal. In this case, the regular-expressions library signals an <invalid-regex> error when the regex is parsed.

Quick Start

The most common use of regular expressions is to search a string and find out what text matched and where in the string it occurred. Add the clause use regular-expressions; to both your library definition and your module definition, and then…

define constant $re :: <regex> = compile-regex("^abc(.+)123$");

let match :: false-or(<regex-match>) = regex-search($re, "abcdef123");
// match is #f if search failed.

if (match)
  let text = match-group(match, 1);
  // text = "def"

  let (text, start, _end) = match-group(match, 1);
  // text = "def", start = 3, _end = 6

  // Group 0 is the entire match. This regex has only one other group, so
  // match-group(match, 2) signals <invalid-match-group>.
  ...
end;

// compile-regex("*") signals <invalid-regex>.
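
The two use clauses mentioned at the start of this section could look like the following library definition file. This is only a minimal sketch: the name my-app is hypothetical, and the common-dylan, io, and format-out clauses are assumptions about what a small program typically needs in addition to regular-expressions.

```dylan
Module: dylan-user

// Hypothetical library for a small program that uses regular expressions.
define library my-app
  use common-dylan;
  use io;                   // assumed, for the format-out module
  use regular-expressions;
end library my-app;

// Hypothetical module; it must also use regular-expressions.
define module my-app
  use common-dylan;
  use format-out;           // assumed, for printing results
  use regular-expressions;
end module my-app;
```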

Reference

<regex> Sealed Class

A compiled regular expression object. These should only be created via compile-regex.

<regex-error> Sealed Class

The superclass of all regular expression-related errors.

Superclasses:

<format-string-condition>, <error>

<invalid-regex> Sealed Class

Signalled by compile-regex when the given regular expression cannot be compiled.

Superclasses:

<regex-error>

Init-Keywords:
  • pattern

regex-error-pattern Sealed Generic function

Return the pattern that caused an <invalid-regex> error.

Signature:

regex-error-pattern error => pattern

Parameters:
  • error – An <invalid-regex>.

Values:
  • pattern – A <string>.

<invalid-match-group> Sealed Class

Signalled when an invalid group identifier is passed to match-group.

Superclasses:

<regex-error>

<regex-match> Sealed Class

Stores the match groups and other information about a specific regex search result.

Superclasses:

<object>

Init-Keywords:
  • regular-expression

compile-regex Sealed Generic function

Compile a string into a <regex>.

Signature:

compile-regex pattern #key case-sensitive verbose multi-line dot-matches-all use-cache => regex

Parameters:
  • pattern – A <string>.

  • case-sensitive (#key) – A <boolean>, default #t.

  • verbose (#key) – A <boolean>, default #f.

  • multi-line (#key) – A <boolean>, default #f.

  • dot-matches-all (#key) – A <boolean>, default #f.

  • use-cache (#key) – A <boolean>, default #t. If true, the resulting regular expression will be cached and re-used the next time the same string is compiled.

Values:
  • regex – A <regex>.

Conditions:

<invalid-regex> is signalled if pattern can’t be compiled.
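
As a sketch of how the keyword arguments and the <invalid-regex> condition fit together, the function below compiles a pattern case-insensitively and returns #f instead of letting the error propagate. The name try-compile is hypothetical, and the code assumes it lives in a module that uses common-dylan, format-out, and regular-expressions.

```dylan
// try-compile is a hypothetical helper, not part of the library.
define function try-compile
    (pattern :: <string>) => (regex :: false-or(<regex>))
  block ()
    // case-sensitive: #f requests case-insensitive matching;
    // use-cache: #f asks compile-regex not to cache the result.
    compile-regex(pattern, case-sensitive: #f, use-cache: #f)
  exception (err :: <invalid-regex>)
    // regex-error-pattern returns the pattern that caused the error.
    format-out("Invalid regex: %=\n", regex-error-pattern(err));
    #f
  end block
end function try-compile;

try-compile("^abc(.+)123$");  // returns a <regex>
try-compile("*");             // reports the bad pattern and returns #f
```

Returning #f on failure is one possible design; callers that need to distinguish error causes may prefer to let the condition propagate.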

regex-pattern Sealed Generic function

Return the <string> from which regex was created.

Signature:

regex-pattern regex => pattern

Parameters:
  • regex – A <regex>.

Values:
  • pattern – A <string>.

regex-group-count Sealed Generic function

Return the number of groups in a <regex>.

Signature:

regex-group-count regex => num-groups

Parameters:
  • regex – A <regex>.

Values:
  • num-groups – An <integer>.

regex-position Sealed Generic function

Find the position of pattern in text.

Signature:

regex-position pattern text #key start end case-sensitive => regex-start, #rest marks

Parameters:
  • pattern – A <regex>.

  • text – A <string>.

  • start (#key) – An <integer>, default 0. The index at which to start the search.

  • end (#key) – An <integer>, default text.size. The index at which to end the search.

  • case-sensitive (#key) – A <boolean>, default #t.

Values:
  • regex-start – An instance of false-or(<integer>).

  • #rest marks – An instance of <object>.

A match will only be found if it fits entirely within the range specified by start and end.

If the regular expression is not found, #f is returned. Otherwise, a variable number of indices marking the start and end of groups is returned.

This is a low-level API. Use regex-search if you want to get a <regex-match> object back.
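
A minimal sketch of calling regex-position follows. It only shows how to bind the first value and collect the remaining marks; the exact number and order of the marks is not spelled out above, so the code prints them rather than assuming a layout. The name show-marks is hypothetical, and the enclosing module is assumed to use common-dylan, format-out, and regular-expressions.

```dylan
// show-marks is a hypothetical helper, not part of the library.
define function show-marks (text :: <string>) => ()
  let re = compile-regex("b(c+)d");
  // The first value is the start of the match, or #f if there is none.
  // Any further values are collected into the sequence marks.
  let (start, #rest marks) = regex-position(re, text);
  if (start)
    format-out("match starts at %d; remaining marks: %=\n", start, marks);
  else
    format-out("no match\n");
  end if;
end function show-marks;

show-marks("abcccde");
show-marks("xyz");
```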

regex-replace Sealed Generic function

Replace occurrences of pattern within big with replacement.

Signature:

regex-replace big pattern replacement #key start end count case-sensitive => new-string

Parameters:
  • big – The <string> within which to search.

  • pattern – The <regex> to search for.

  • replacement – The <string> to replace pattern with.

  • start (#key) – An <integer>, default 0. The index in big at which to start searching.

  • end (#key) – An <integer>, default big.size. The index at which to end the search.

  • case-sensitive (#key) – A <boolean>, default #t.

  • count (#key) – An instance of false-or(<integer>), default #f. The number of matches to replace. #f means to replace all.

Values:
  • new-string – An instance of <string>.

A match will only be found if it fits entirely within the range specified by start and end.
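
The following sketch illustrates the count keyword, using a plain replacement string with no special characters. The name replace-demo is hypothetical, the enclosing module is assumed to use common-dylan, format-out, and regular-expressions, and the results noted in the comments follow from the parameter descriptions above rather than from a test run.

```dylan
// replace-demo is a hypothetical helper, not part of the library.
define function replace-demo () => ()
  let digits = compile-regex("[0-9]+");
  // With the default count (#f), every run of digits is replaced,
  // so this should print "a#b#c#".
  format-out("%=\n", regex-replace("a1b22c333", digits, "#"));
  // With count: 1, only the first match is replaced,
  // so this should print "a#b22c333".
  format-out("%=\n", regex-replace("a1b22c333", digits, "#", count: 1));
end function replace-demo;

replace-demo();
```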

regex-search Sealed Generic function

Search for a pattern within text.

Signature:

regex-search pattern text #key anchored start end case-sensitive => match

Parameters:
  • pattern – The <regex> to search for.

  • text – The <string> in which to search.

  • anchored (#key) – A <boolean>, default #f. Whether or not the search should be anchored at the start position. This is useful because “^…” will only match at the beginning of a string, or after \n if the regex was compiled with multi-line = #t.

  • start (#key) – An <integer>, default 0. The index in text at which to start searching.

  • end (#key) – An <integer>, default text.size. The index at which to end the search.

  • case-sensitive (#key) – A <boolean>, default #t.

Values:
  • match – An instance of false-or(<regex-match>). #f is returned if no match was found.

A match will only be found if it fits entirely within the range specified by start and end.
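
To illustrate how anchored and start interact, here is a small sketch. The name anchored-demo is hypothetical, the enclosing module is assumed to use common-dylan, format-out, and regular-expressions, and the outcomes noted in the comments are what the parameter descriptions above imply.

```dylan
// anchored-demo is a hypothetical helper, not part of the library.
define function anchored-demo () => ()
  let re = compile-regex("def");
  // Unanchored: "def" may be found anywhere in the text.
  let anywhere = regex-search(re, "abcdef");
  // Anchored at the default start position (0), where "def" does not
  // begin, so no match is expected.
  let at-zero = regex-search(re, "abcdef", anchored: #t);
  // Anchored at index 3, which is where "def" begins.
  let at-three = regex-search(re, "abcdef", anchored: #t, start: 3);
  format-out("anywhere: %=, at 0: %=, at 3: %=\n",
             anywhere ~== #f, at-zero ~== #f, at-three ~== #f);
end function anchored-demo;

anchored-demo();
```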

regex-search-strings Sealed Generic function

Find all matches for a regular expression within a string.

Signature:

regex-search-strings pattern text #key anchored start end case-sensitive => #rest strings

Parameters:
  • pattern – An instance of <regex>.

  • text – An instance of <string>.

  • anchored (#key) – An instance of <boolean>.

  • start (#key) – An <integer>, default 0. The index in text at which to start searching.

  • end (#key) – An <integer>, default text.size. The index at which to end the search.

  • case-sensitive (#key) – A <boolean>, default #t.

Values:
  • #rest strings – An instance of <object>.

A match will only be found if it fits entirely within the range specified by start and end.
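
Because regex-search-strings returns a variable number of values, a caller will usually collect them with #rest. The sketch below does that and prints whatever comes back, without assuming how many strings are returned or which parts of the match they correspond to. The name show-strings is hypothetical, and the enclosing module is assumed to use common-dylan, format-out, and regular-expressions.

```dylan
// show-strings is a hypothetical helper, not part of the library.
define function show-strings () => ()
  let re = compile-regex("([a-z]+)=([0-9]+)");
  // Collect every returned value into the sequence strings.
  let (#rest strings) = regex-search-strings(re, "width=80");
  for (s in strings)
    format-out("%=\n", s);
  end for;
end function show-strings;

show-strings();
```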

match-group Sealed Generic function

Return information about a specific match group in a <regex-match>.

Signature:

match-group match group => text start-index end-index

Parameters:
  • match – A <regex-match>.

  • group – An <integer> or a <string>.

Values:
  • text – An instance of false-or(<string>).

  • start-index – An instance of false-or(<integer>).

  • end-index – An instance of false-or(<integer>).

Conditions:

<invalid-match-group> is signalled if group does not name a valid group.

The requested group may be an <integer> to access groups by number, or a <string> to access groups by name. Accessing groups by name only works if they were given names in the compiled regular expression via the (?<foo>...) syntax.

Group 0 is always the entire regular expression match.

It is possible for the group identifier to be valid and for #f to be returned. This can happen, for example, if the group was in the part of an | (or) expression that didn’t match.
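
The sketch below combines named groups with the #f case just described. The name describe-token and the group names word and number are hypothetical, and the enclosing module is assumed to use common-dylan, format-out, and regular-expressions.

```dylan
// describe-token is a hypothetical helper, not part of the library.
define function describe-token (text :: <string>) => ()
  // Two named groups, one on each side of an | (or) expression.
  let re = compile-regex("(?<word>[a-z]+)|(?<number>[0-9]+)");
  let match = regex-search(re, text);
  if (match)
    // Group 0 is always the entire match.
    let whole = match-group(match, 0);
    // Both names are valid identifiers, but the group on the side of
    // the | that did not match is expected to come back as #f.
    let word = match-group(match, "word");
    let number = match-group(match, "number");
    format-out("whole: %=, word: %=, number: %=\n", whole, word, number);
  else
    format-out("no match in %=\n", text);
  end if;
end function describe-token;

describe-token("hello");  // number should be #f
describe-token("42");     // word should be #f
```

Checking each group's text against #f before using it avoids passing a false value to string operations.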