Using Regex in Golf


Regex (short for "REGular EXpressions") is a powerful way to search strings and to replace parts of them. Regular expressions are usually short, which makes it easier to do both reliably; at the same time, there is a learning curve to becoming proficient. However, given that regex is ubiquitous, it is worth learning at least some of it, as it can come in handy.

Golf implements regex via the match-regex statement, which allows for both searching and replacing in a single statement. It also has a caching capability (meaning the compiled regex is cached for reuse), which can increase performance by up to 500%.

This article shows a few basic ways to use the match-regex statement in your Golf code.

Create a directory for your application:
mkdir -p regex
cd regex

Create the "reg" application:
gg -k reg

Copy the following code to file "reg.golf":
%% /reg public
    // Use backreferences to swap two words, with "Reverse order word" as a result
    match-regex "(word)\\s+(order)" in "Reverse word order" replace-with "\\2 \\1" result res
    print-out res new-line

    // Recognize a pattern, in this case 3 found
    match-regex "[abc]{3}" in "Recognize 'aaa' or 'aa' or 'abc' or 'cab'" status st
    print-out st new-line

    // Recognize a pattern, in this case not found
    match-regex "[abc]{3}" in "Recognize 'aa' or 'aa' or 'bc' or 'ca'" status st
    print-out st new-line

    // Use case insensitive search to recognize a pattern, in this case 3 found
    match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" case-insensitive status st
    print-out st new-line

    // Use case insensitive search to recognize a pattern and replace, with "Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'" as a result
    match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" replace-with "XXX" result res case-insensitive status st
    print-out res new-line
    print-out st new-line
%%

Build your application server as a native executable:
gg -q

Run it from the command line:
gg -r --req="/reg" --exec --silent-header

The result is:
Reverse order word
3
0
3
Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
3

Let's go over this one by one.
Backreferences
The first statement uses backreferences, which are a way to refer to something that was matched in the string:
// Use backreferences to swap two words
match-regex "(word)\\s+(order)" in "Reverse word order" replace-with "\\2 \\1" result res
print-out res new-line

Here, "word" and "order" are found. Since they are in parentheses (meaning within "()"), they can be used as backreferences: the first one is \1, the second one \2, and so on. In the "replace-with" clause, we refer to them as "\\1" and "\\2" because the backslash is a special character in a string (it is used to escape other characters), so it needs to be escaped itself. "\\s+" means a whitespace character ("\\s") repeated at least one time ("+"). So we're essentially looking for a snippet like "word order" or "word   order", and we then replace it with "\\2 \\1". Since "\\1" refers to "word" and "\\2" refers to "order", the matched snippet becomes "order word", i.e. the two words are output in reverse order, and the final result is "Reverse order word".
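To cross-check the pattern outside of Golf, the same swap can be sketched with Python's `re` module. This is only a stand-in under the assumption that both engines treat groups, `\s+` and `\1`/`\2` the same way for this simple case; `swap_words` is a hypothetical helper name, not part of Golf.

```python
import re

def swap_words(text: str) -> str:
    # (word) is group 1 and (order) is group 2;
    # \s+ matches one or more whitespace characters between them.
    # The replacement emits group 2, a single space, then group 1.
    return re.sub(r"(word)\s+(order)", r"\2 \1", text)

print(swap_words("Reverse word order"))    # Reverse order word
print(swap_words("Reverse word   order"))  # Reverse order word
```

Note that a raw string (`r"..."`) is used in Python so the backslashes do not need to be doubled, whereas the Golf string literal in the article writes them as "\\".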
Finding patterns and counting them
A common use of regex is to find out whether a pattern appears in a string, and how many times. Consider this:
// Recognize a pattern, found
match-regex "[abc]{3}" in "Recognize 'aaa' or 'aa' or 'abc' or 'cab'" status st
print-out st new-line

Here, you're looking for 3 consecutive characters (which is what "{3}" does), each of which is "a", "b" or "c". "aaa", "abc" and "cab" fit that bill, while "aa" does not, so the output is 3.

Conversely, in this case there are no instances of 3 such characters in a row, since all of the candidates are of length 2 (such as "aa", "bc" etc.), so the result will be 0:
// Recognize a pattern, in this case not found
match-regex "[abc]{3}" in "Recognize 'aa' or 'aa' or 'bc' or 'ca'" status st
print-out st new-line
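
The two counts above can be double-checked with a short Python sketch. It assumes that the status reported by match-regex corresponds to the number of non-overlapping matches, which is consistent with the outputs shown in this article; `count_matches` is a hypothetical helper name.

```python
import re

def count_matches(pattern: str, text: str) -> int:
    # re.findall returns every non-overlapping match, scanning left to right;
    # with no capture groups in the pattern, each list item is one full match.
    return len(re.findall(pattern, text))

found = count_matches(r"[abc]{3}", "Recognize 'aaa' or 'aa' or 'abc' or 'cab'")
not_found = count_matches(r"[abc]{3}", "Recognize 'aa' or 'aa' or 'bc' or 'ca'")

print(found)      # 3
print(not_found)  # 0
```

The lone "c" in "Recognize" does not count, because it is not part of a run of 3 characters from the set.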

Case insensitive search
By default, the search is case sensitive. You can make it case insensitive with the "case-insensitive" clause:
// Use case insensitive search to recognize a pattern
match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" case-insensitive status st
print-out st new-line

In this case, there are also 3 matches ("aAa", "aBc" and "Cab").
Search and replace
In the following example, we search for a pattern and replace it with something:
// Use case insensitive search to recognize a pattern and replace
match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" replace-with "XXX" result res case-insensitive status st
print-out res new-line
print-out st new-line

Just like in the previous example, 3 matches are found; each one is replaced with "XXX", and the status is 3. The result is:
Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
3
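
As a rough cross-check, Python's `re.subn` does a comparable job: it returns both the replaced string and the number of replacements made. This is a sketch only, assuming the `re.IGNORECASE` flag plays the role of the "case-insensitive" clause; `replace_and_count` is a hypothetical helper name.

```python
import re

def replace_and_count(pattern: str, replacement: str, text: str) -> tuple[str, int]:
    # re.subn returns a (new_string, number_of_substitutions) pair;
    # re.IGNORECASE makes [abc] also match "A", "B" and "C".
    return re.subn(pattern, replacement, text, flags=re.IGNORECASE)

res, st = replace_and_count(
    r"[abc]{3}", "XXX", "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'"
)
print(res)  # Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
print(st)   # 3
```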

Lookahead and lookbehind
Sometimes you'd like to search for a pattern, but only if there's another pattern before it ("lookbehind") or after it ("lookahead").
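
To make the two ideas concrete, here is a minimal sketch using Python's `re` module as a stand-in. The `(?=...)` and `(?<=...)` syntax shown is the common Perl-style notation; whether match-regex accepts exactly the same syntax is an assumption that should be checked against the Golf documentation. The sample text and variable names are made up for illustration.

```python
import re

text = "100 USD, 200 EUR, USD 300"

# Lookahead: match digits only if they are followed by " USD".
# The " USD" part is checked but is not included in the match.
before_usd = re.findall(r"\d+(?= USD)", text)

# Lookbehind: match digits only if they are preceded by "USD ".
# Again, "USD " is checked but is not part of the match.
after_usd = re.findall(r"(?<=USD )\d+", text)

print(before_usd)  # ['100']
print(after_usd)   # ['300']
```

In both cases "200" is skipped, since it is neither followed by " USD" nor preceded by "USD ".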


Copyright (c) 2019-2025 Gliim LLC. All contents on this web site is "AS IS" without warranties or guarantees of any kind.