Using Regex in Golf
Regex (short for "REGular EXpressions") is a powerful way to search strings and to replace parts of them. Because a regular expression is usually short, it lets you describe such an operation compactly and reliably; at the same time, there is a learning curve to becoming proficient. Still, given that regex is ubiquitous, it is worth learning at least some of it, as it can come in handy.
Golf implements regex through the match-regex statement, which allows both searching and replacing in a single statement. It can also cache the compiled regex, which can increase performance by up to 500%.
This article will show a few basic ways to use the match-regex statement in your Golf code.
Create a directory for your application:
mkdir -p regex
cd regex
Create the "reg" application:
gg -k reg
Copy the following code to the file "reg.golf":
%% /reg public
// Use backreferences to swap two words, with "Reverse order word" as a result
match-regex "(word)\\s+(order)" in "Reverse word order" replace-with "\\2 \\1" result res
print-out res new-line
// Recognize a pattern, in this case 3 found
match-regex "[abc]{3}" in "Recognize 'aaa' or 'aa' or 'abc' or 'cab'" status st
print-out st new-line
// Recognize a pattern, in this case not found
match-regex "[abc]{3}" in "Recognize 'aa' or 'aa' or 'bc' or 'ca'" status st
print-out st new-line
// Use case insensitive search to recognize a pattern, in this case 3 found
match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" case-insensitive status st
print-out st new-line
// Use case insensitive search to recognize a pattern and replace, with "Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'" as a result
match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" replace-with "XXX" result res case-insensitive status st
print-out res new-line
print-out st new-line
%%
Build your application server as a native executable:
gg -q
Run it from the command line:
gg -r --req="/reg" --exec --silent-header
The result is:
Reverse order word
3
0
3
Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
3
Let's go over this one by one.
The first statement uses back-references, which are a way to refer to something that was found in the string:
// Use backreferences to swap two words
match-regex "(word)\\s+(order)" in "Reverse word order" replace-with "\\2 \\1" result res
print-out res new-line
Here, "word" and "order" are found. Since they are in parentheses (that is, within "()"), they can be used as back-references: the first one is \1, the second one \2, and so on. In the "replace-with" clause, we write them as "\\1" and "\\2" because the backslash is a special character used to escape others and needs to be escaped itself. "\\s+" means any whitespace character ("\\s") repeated at least once ("+"). So we are essentially looking for a snippet like "word order", with one or more whitespace characters between the two words, and replacing it with "\\2 \\1". Keep in mind that "\\1" refers to "word" and "\\2" refers to "order". The result is therefore "order word", i.e. the two words are output in reverse order.
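To see what "\\s+" buys you, here is a minimal sketch that is not part of the original example: the same statement, but with several spaces between the two words in the input. It assumes you place it inside the same request handler in "reg.golf" (between the "%%" lines), where "res" is already in use.

```golf
// Sketch (not from the original example): same pattern, but the input
// has several spaces between "word" and "order"; "\\s+" matches them all
match-regex "(word)\\s+(order)" in "Reverse word     order" replace-with "\\2 \\1" result res
print-out res new-line
```

Since the whole match, spaces included, is replaced with "\\2 \\1", the output should again be "Reverse order word", with a single space between the swapped words.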
Finding patterns and counting them
A common use of regex is to find out whether a pattern appears in a string, and how many times. Consider this:
// Recognize a pattern, found
match-regex "[abc]{3}" in "Recognize 'aaa' or 'aa' or 'abc' or 'cab'" status st
print-out st new-line
Here, you're looking for three consecutive characters, each of which is "a", "b" or "c" ("[abc]" matches any one of those characters, and "{3}" repeats it 3 times). "aaa", "abc" and "cab" fit that bill, while "aa" does not, so the output is 3.
Conversely, in this case there are no runs of 3 such characters, since every candidate is only 2 characters long (such as "aa", "bc" etc.), so the result will be 0:
// Recognize a pattern, in this case not found
match-regex "[abc]{3}" in "Recognize 'aa' or 'aa' or 'bc' or 'ca'" status st
print-out st new-line
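One detail worth keeping in mind is how matches are counted when the matching characters run together. The sketch below is an illustration rather than part of the original example, and assumes matches are counted left to right without overlapping, which is how regex engines normally search; add it to the same handler to try it.

```golf
// Sketch: "abcabc" should count as 2 matches ("abc" twice),
// assuming non-overlapping, left-to-right matching
match-regex "[abc]{3}" in "abcabc" status st
print-out st new-line
// Sketch: "aaaa" should count as 1 match ("aaa"); the leftover "a" is too short
match-regex "[abc]{3}" in "aaaa" status st
print-out st new-line
```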
By default, the search is case sensitive. You can make it case insensitive with the "case-insensitive" clause:
// Use case insensitive search to recognize a pattern
match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" case-insensitive status st
print-out st new-line
In this case, there are also 3 matches ("aAa", "aBc" and "Cab").
In the following example, we search for a pattern and replace it with something:
// Use case insensitive search to recognize a pattern and replace
match-regex "[abc]{3}" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" replace-with "XXX" result res case-insensitive status st
print-out res new-line
print-out st new-line
Just like in the previous example, 3 matches are found; each is replaced with "XXX", and the status is 3. The result is:
Recognize 'XXX' or 'aa' or 'XXX' or 'XXX'
3
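Searching, replacing and back-references can be combined. As a sketch that goes beyond the original example, the statement below puts the pattern in parentheses so that each match can be referred to as "\\1" and wrapped in square brackets instead of being overwritten. It uses only clauses already shown above, and assumes that square brackets have no special meaning in "replace-with".

```golf
// Sketch: keep each match and wrap it in brackets via back-reference "\\1"
match-regex "([abc]{3})" in "Recognize 'aAa' or 'aa' or 'aBc' or 'Cab'" replace-with "[\\1]" result res case-insensitive status st
print-out res new-line
print-out st new-line
```

If that assumption holds, the output should be "Recognize '[aAa]' or 'aa' or '[aBc]' or '[Cab]'" followed by 3.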
Sometimes you'd like to search for a pattern, but only if there's another pattern before it ("lookbehind") or after it ("lookahead").
Copyright (c) 2019-2025 Gliim LLC. All contents on this web site is "AS IS" without warranties or guarantees of any kind.