Emmanuel John I'm a full-stack software developer, mentor, and writer. I am an open source enthusiast. In my spare time, I enjoy watching sci-fi movies and cheering for Arsenal FC.

A deep dive into regular expressions with Golang

Regular expressions are a key feature of most programming languages. They come in handy when you need to create data validation logic that compares input values to a pattern. Golang comes with a built-in regexp package that lets you match, split, and replace text using regular expressions.

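In this article, we’ll cover basic matching with regexp functions, compiling and reusing patterns, splitting strings, subexpressions, and replacing substrings.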

Basic matches with regex functions

The regexp package provides a number of functions for handling pattern matches.
In this article, we’ll experiment with the most basic and most useful of them.

MatchString

The MatchString function checks whether the input string contains any match of the regular expression pattern. It accepts a pattern and a string as parameters and returns true if there is a match. If the pattern fails to compile, an error is returned.

Here is a snippet for the above explanation:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    inputText := "I love new york city"
    // [A-Za-z] matches a single ASCII letter.
    match, err := regexp.MatchString("[A-Za-z]ork", inputText)
    if err == nil {
        fmt.Println("Match:", match)
    } else {
        fmt.Println("Error:", err)
    }
}

Compile and execute the above code, and you will receive the following result:

Match: true
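
Because MatchString reports a match anywhere in the input, validation logic usually needs the ^ and $ anchors so that the whole string has to fit the pattern. The following is a minimal sketch of that idea; the fiveDigits pattern and the isFiveDigitCode helper are hypothetical names invented for this illustration, not part of the regexp package:

```go
package main

import (
	"fmt"
	"regexp"
)

// fiveDigits is a hypothetical validation rule: exactly five ASCII digits.
var fiveDigits = regexp.MustCompile(`^[0-9]{5}$`)

// isFiveDigitCode reports whether the whole input consists of five digits.
func isFiveDigitCode(input string) bool {
	return fiveDigits.MatchString(input)
}

func main() {
	// Unanchored: true, because the input contains five digits somewhere.
	loose, err := regexp.MatchString(`[0-9]{5}`, "id-12345-x")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("Unanchored:", loose)

	// Anchored: the entire input has to match the pattern.
	fmt.Println("Anchored:", isFiveDigitCode("id-12345-x"))
	fmt.Println("Anchored:", isFiveDigitCode("12345"))
}
```

Here the unanchored call prints true, while the anchored helper prints false for "id-12345-x" and true for "12345".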

FindStringIndex

The FindStringIndex method returns only the first match, progressing from left to right. The match is expressed as an int slice, with the first value indicating the index where the match starts in the string and the second value indicating the index just past the end of the match. A nil result means no matches were found.

Here is a sample use case for the FindStringIndex method:

package main
import (
    "fmt"
    "regexp"
)
// getSubstring returns the part of s between the start and end offsets in indices.
func getSubstring(s string, indices []int) string {
    return s[indices[0]:indices[1]]
}
func main() {
    pattern := regexp.MustCompile("H[a-z]{4}|[A-Za-z]ork")
    welcomeMessage := "Hello guys, welcome to new york city"
    firstMatchIndex := pattern.FindStringIndex(welcomeMessage)
    fmt.Println("First matched index", firstMatchIndex[0], "-", firstMatchIndex[1],
        "=", getSubstring(welcomeMessage, firstMatchIndex))
}

Here, we wrote a custom getSubstring function that returns the substring identified by the values in the int slice. The regexp package has a built-in FindString method that achieves the same result.

Compile and execute the above code, and you will receive the following result:

First matched index 0 - 5 = Hello
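
Since a nil result means there was no match, indexing the returned slice without a check would panic on input that does not match. One way to guard against that is sketched below; the yorkPattern variable and the firstMatch helper are assumptions made for this example:

```go
package main

import (
	"fmt"
	"regexp"
)

// yorkPattern is a hypothetical package-level pattern used by firstMatch.
var yorkPattern = regexp.MustCompile("[A-Za-z]ork")

// firstMatch returns the first match and true, or "" and false when
// FindStringIndex returns nil (no match).
func firstMatch(s string) (string, bool) {
	loc := yorkPattern.FindStringIndex(s)
	if loc == nil {
		return "", false
	}
	// loc[0] is the start offset; loc[1] is the offset just past the match.
	return s[loc[0]:loc[1]], true
}

func main() {
	for _, text := range []string{"I love new york city", "I love dogs"} {
		if m, ok := firstMatch(text); ok {
			fmt.Printf("%q: found %q\n", text, m)
		} else {
			fmt.Printf("%q: no match\n", text)
		}
	}
}
```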

FindString

This method returns only the first substring matched by the compiled pattern, progressing from left to right.

If there is no match, the return value is an empty string, but it will also be empty if the regular expression successfully matches an empty string. If you need to distinguish between these cases, use the FindStringIndex or FindStringSubmatch methods.

Here is a sample use case for the FindString method:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("H[a-z]{4}|[A-Za-z]ork")
    welcomeMessage := "Hello guys, welcome to new york city"
    firstMatchSubstring := pattern.FindString(welcomeMessage)
    fmt.Println("First matched substring:", firstMatchSubstring)
}

Compile and execute the above code, and you will receive the following result:

First matched substring: Hello
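
To illustrate the difference between "no match" and "matched an empty string", here is a small sketch. The hasMatch helper is a hypothetical name for this example, and it calls MustCompile, which panics on an invalid pattern, so it is only suitable for patterns you control:

```go
package main

import (
	"fmt"
	"regexp"
)

// hasMatch reports whether pattern matches s at all, including an
// empty match, by checking FindStringIndex for a non-nil result.
func hasMatch(pattern, s string) bool {
	return regexp.MustCompile(pattern).FindStringIndex(s) != nil
}

func main() {
	input := "bbb"
	// "a*" can match the empty string; "a+" needs at least one "a".
	for _, p := range []string{"a*", "a+"} {
		re := regexp.MustCompile(p)
		fmt.Printf("pattern %s: FindString=%q, matched=%t\n",
			p, re.FindString(input), hasMatch(p, input))
	}
}
```

Both patterns make FindString return an empty string for "bbb", but only "a*" actually matched.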

FindAllString(s string, n int)

This method takes a string and an int argument and returns a string slice containing all the matches that the compiled pattern finds in the input string. The int argument n specifies the maximum number of matches, with -1 indicating no limit. If there are no matches, a nil result is returned.

Here is a sample use case for the FindAllString method:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("H[a-z]{4}|[A-Za-z]ork")
    welcomeMessage := "Hello guys, welcome to new york city"
    allSubstringMatches := pattern.FindAllString(welcomeMessage, -1)
    fmt.Println(allSubstringMatches)
}

Compile and execute the above code, and you will receive the following result:

[Hello york]

FindAllStringIndex(s string, n int)

The FindAllStringIndex method is the All version of FindStringIndex. It takes a string and an int argument and returns a slice of int slices, one for each successive match of the pattern. If there are no matches, a nil result is returned.

Calling a FindAll method with an int parameter of 0 returns no results, and calling it with 1 returns only the first match from the left. An int parameter of -1 returns all successive matches of the pattern.

Each inner slice contains two int values: the first is the index where the match starts in the string, and the second is the index just past the end of the match.



Here is a sample use case for the FindAllStringIndex method:

package main
import (
    "fmt"
    "regexp"
)
// getSubstring returns the part of s between the start and end offsets in indices.
func getSubstring(s string, indices []int) string {
    return s[indices[0]:indices[1]]
}
func main() {
    pattern := regexp.MustCompile("H[a-z]{4}|[A-Za-z]ork")
    welcomeMessage := "Hello guys, welcome to new york city"
    allIndices := pattern.FindAllStringIndex(welcomeMessage, -1)
    fmt.Println(allIndices)
    for i, idx := range allIndices {
        fmt.Println("Index", i, "=", idx[0], "-",
            idx[1], "=", getSubstring(welcomeMessage, idx))
    }
}

Here, we wrote a basic for loop that calls our custom getSubstring function that returns a substring composed with the values from the int slice and the input string.

Compile and execute the above code, and you will receive the following result:

[[0 5] [27 31]]
Index 0 = 0 - 5 = Hello
Index 1 = 27 - 31 = york
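
The effect of the int parameter on the FindAll methods can be sketched with the same input. The countMatches helper and the package-level pattern variable are assumptions for this example:

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern is the same expression used in the example above.
var pattern = regexp.MustCompile("H[a-z]{4}|[A-Za-z]ork")

// countMatches returns how many matches FindAllStringIndex reports
// when it is limited to at most n matches (n < 0 means no limit).
func countMatches(s string, n int) int {
	return len(pattern.FindAllStringIndex(s, n))
}

func main() {
	welcomeMessage := "Hello guys, welcome to new york city"
	for _, n := range []int{0, 1, -1} {
		fmt.Println("n =", n, "->", countMatches(welcomeMessage, n), "match(es)")
	}
}
```

With this input, the counts are 0, 1, and 2 respectively.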

Compiling and reusing regular expression patterns

The Compile function compiles a regular expression pattern so it can be reused in more complex queries:

pattern, compileErr := regexp.Compile("[A-Za-z]ork")

Here is how to compile and reuse a pattern in Golang:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern, compileErr := regexp.Compile("[A-Za-z]ork")
    correctAnswer := "Yes, I love new york city"
    question := "Do you love new york city?"
    wrongAnswer := "I love dogs"
    if compileErr == nil {
        fmt.Println("Question:", pattern.MatchString(question))
        fmt.Println("Correct Answer:", pattern.MatchString(correctAnswer))
        fmt.Println("Wrong Answer:", pattern.MatchString(wrongAnswer))
    } else {
        fmt.Println("Error:", compileErr)
    }
}

Because the pattern only needs to be compiled once, this is more efficient than calling regexp.MatchString repeatedly. Here, MatchString is a method on the *regexp.Regexp value returned by the Compile function.

Compiling a pattern also gives you access to the other Regexp methods covered in this article, such as FindString, Split, and ReplaceAllString.

Compile and execute the above code, and you will receive the following result:

Question: true
Correct Answer: true
Wrong Answer: false 
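
Several examples in this article use MustCompile instead of Compile. MustCompile panics if the pattern is invalid, which suits fixed patterns known at build time, while Compile returns an error that you can handle. The sketch below contrasts the two; yorkPattern and isValidPattern are hypothetical names for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// A fixed pattern can be compiled once at package level. MustCompile
// panics at start-up if the expression is invalid.
var yorkPattern = regexp.MustCompile("[A-Za-z]ork")

// isValidPattern uses Compile so that a bad expression, for example one
// supplied at run time, is reported instead of crashing the program.
func isValidPattern(expr string) bool {
	_, err := regexp.Compile(expr)
	return err == nil
}

func main() {
	fmt.Println(yorkPattern.MatchString("new york city")) // true
	fmt.Println(isValidPattern("[A-Za-z]ork"))            // true
	fmt.Println(isValidPattern("[A-Za-z"))                // false: unclosed character class
}
```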

Using a regular expression to split strings

The Split method splits a string using the matches made by a regular expression as separators. It takes a string and an int argument and returns a slice of the substrings between those matches. An int parameter of -1 returns all of the substrings.

Split(s string, n int)

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("guys|york")
    welcomeMessage := "Hello guys, welcome to new york city"
    split := pattern.Split(welcomeMessage, -1)
    fmt.Println(split)
    for _, s := range split {
        if s != "" {
            fmt.Println("Substring:", s)
        }
    }
}

Here, we wrote a basic for loop to print each split substring.

You can experiment more by replacing -1 with other numbers.


Compile and execute the above code, and you will receive the following result:

[Hello  , welcome to new   city]
Substring: Hello 
Substring: , welcome to new 
Substring:  city
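
As a rough illustration of what other values of the int parameter do, the sketch below calls Split with -1, 2, and 0. The separator variable and the splitParts helper are assumptions made for this example:

```go
package main

import (
	"fmt"
	"regexp"
)

// separator is the same expression used in the example above.
var separator = regexp.MustCompile("guys|york")

// splitParts splits s around matches of separator, returning at most n
// substrings when n > 0, all substrings when n < 0, and nil when n == 0.
func splitParts(s string, n int) []string {
	return separator.Split(s, n)
}

func main() {
	welcomeMessage := "Hello guys, welcome to new york city"
	for _, n := range []int{-1, 2, 0} {
		fmt.Printf("n=%d -> %q\n", n, splitParts(welcomeMessage, n))
	}
}
```

With a limit of 2, the second element is the unsplit remainder, so "york" stays in the output.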

Subexpressions

Subexpressions make it easier to retrieve substrings from within a matched region by allowing sections of a regular expression to be accessed individually. Subexpressions are denoted with parentheses, and they can be used to identify the regions of content that are important within the pattern:

pattern := regexp.MustCompile("welcome ([A-Za-z]*) new ([A-Za-z]*) city")

FindStringSubmatch

The FindStringSubmatch method does the same thing as the FindString method but additionally returns the substrings matched by the subexpressions. The result is a string slice whose first element is the whole match, followed by one element per subexpression:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("welcome ([A-Za-z]*) new ([A-Za-z]*) city")
    welcomeMessage := "Hello guys, welcome to new york city"
    subStr := pattern.FindStringSubmatch(welcomeMessage)
    fmt.Println(subStr)

    for _, s := range subStr {
        fmt.Println("Match:", s)
    }
}

Here, we’ve defined two subexpressions, one for each variable component of the pattern.

Compile and execute the above code, and you will receive the following result:

[welcome to new york city to york]
Match: welcome to new york city
Match: to
Match: york

Naming subexpressions

Subexpressions can also be named to ease processing of the resulting outputs.

Here is how to name a subexpression: within parentheses, add a question mark, followed by an uppercase P, followed by the name within angle brackets.

Here is a snippet for the above explanation:

pattern := regexp.MustCompile("welcome (?P<val1>[A-Za-z]*) new (?P<val2>[A-Za-z]*) city")

Here, the subexpressions are given the names val1 and val2.

The regexp methods for subexpressions

SubexpNames()

This method returns a slice containing the names of the subexpressions, in the order in which they are defined. The first element is always an empty string, because index 0 stands for the whole match:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("welcome (?P<val1>[A-Za-z]*) new (?P<val2>[A-Za-z]*) city")
    names := pattern.SubexpNames()
    fmt.Println(names)
}

Compile and execute the above code, and you will receive the following result:

[ val1 val2]
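
One common use of SubexpNames is to pair each name with the text it captured. The following is a sketch of that approach, not the only way to do it; cityPattern and namedMatches are hypothetical names for this example:

```go
package main

import (
	"fmt"
	"regexp"
)

// cityPattern is the named-subexpression pattern from this section.
var cityPattern = regexp.MustCompile("welcome (?P<val1>[A-Za-z]*) new (?P<val2>[A-Za-z]*) city")

// namedMatches returns a map from subexpression name to the text it
// matched, or nil when the pattern does not match s.
func namedMatches(s string) map[string]string {
	match := cityPattern.FindStringSubmatch(s)
	if match == nil {
		return nil
	}
	result := make(map[string]string)
	for i, name := range cityPattern.SubexpNames() {
		// Index 0 (the whole match) and unnamed groups have an empty name.
		if name != "" {
			result[name] = match[i]
		}
	}
	return result
}

func main() {
	values := namedMatches("Hello guys, welcome to new york city")
	fmt.Println("val1:", values["val1"])
	fmt.Println("val2:", values["val2"])
}
```

For the sample input, val1 is "to" and val2 is "york".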

NumSubexp()

This method returns the number of subexpressions:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("welcome (?P<val1>[A-Za-z]*) new (?P<val2>[A-Za-z]*) city")
    count := pattern.NumSubexp()
    fmt.Println(count)
}

Compile and execute the above code, and you will receive the following result:

2

Check out the Go documentation for more regexp methods for subexpressions.

Using a regular expression to replace substrings

The regexp package provides a number of methods for replacing substrings.

ReplaceAllString(src, repl string)

This method takes the source string (src) and a template (repl) and replaces each matched portion of src with the template, which is expanded before it is included in the result so that it can reference subexpressions:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("welcome (?P<val1>[A-Za-z]*) new (?P<val2>[A-Za-z]*) city")
    welcomeMessage := "Hello guys, welcome to new york city"
    template := "(value 1: ${val1}, value 2: ${val2})"
    replaced := pattern.ReplaceAllString(welcomeMessage, template)
    fmt.Println(replaced)
}

Compile and execute the above code, and you will receive the following result:

Hello guys, (value 1: to, value 2: york)
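
Templates can also refer to subexpressions by number, such as $1 or ${1}. The braces matter when the reference is followed by letters, digits, or underscores, because the name is read as far as possible. The sketch below uses a hypothetical rangePattern and rewrite helper to show this:

```go
package main

import (
	"fmt"
	"regexp"
)

// rangePattern is a hypothetical pattern with two numbered subexpressions.
var rangePattern = regexp.MustCompile(`([0-9]+)-([0-9]+)`)

// rewrite replaces every match in src with the expanded template.
func rewrite(src, template string) string {
	return rangePattern.ReplaceAllString(src, template)
}

func main() {
	// Swap the two numbers using numbered references.
	fmt.Println(rewrite("10-20", "$2-$1")) // 20-10

	// ${1}x is subexpression 1 followed by a literal "x".
	fmt.Println(rewrite("10-20", "${1}x")) // 10x

	// $1x is read as the name "1x", which does not exist in the
	// pattern, so it expands to an empty string.
	fmt.Printf("%q\n", rewrite("10-20", "$1x")) // ""
}
```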

ReplaceAllLiteralString(src, repl string)

This method takes the source string (src) and a replacement string (repl) and replaces each matched portion of src with repl, which is included in the result literally, without being expanded for subexpressions:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("welcome (?P<val1>[A-Za-z]*) new (?P<val2>[A-Za-z]*) city")
    welcomeMessage := "Hello guys, welcome to new york city"
    template := "(value 1: ${val1}, value 2: ${val2})"
    replaced := pattern.ReplaceAllLiteralString(welcomeMessage, template)
    fmt.Println(replaced)
}

Compile and execute the above code, and you will receive the following result:

Hello guys, (value 1: ${val1}, value 2: ${val2})

Using a function to replace matched content

The ReplaceAllStringFunc method substitutes content generated by a function for the matched portion of the input string:

package main
import (
    "fmt"
    "regexp"
)
func main() {
    pattern := regexp.MustCompile("welcome ([A-Za-z]*) new ([A-Za-z]*) city")
    welcomeMessage := "Hello guys, welcome to new york city"
    replaced := pattern.ReplaceAllStringFunc(welcomeMessage, func(s string) string {
        return "here is the replacement content for the matched string."
    })
    fmt.Println(replaced)
}

Compile and execute the above code, and you will receive the following result:

Hello guys, here is the replacement content for the matched string.
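
The function passed to ReplaceAllStringFunc receives the matched text, so the replacement can depend on it. As a small sketch, the example below upper-cases each match with strings.ToUpper; yorkPattern and shout are hypothetical names for this illustration:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// yorkPattern matches a single ASCII letter followed by "ork".
var yorkPattern = regexp.MustCompile("[A-Za-z]ork")

// shout upper-cases every match of yorkPattern in s by passing
// strings.ToUpper as the replacement function.
func shout(s string) string {
	return yorkPattern.ReplaceAllStringFunc(s, strings.ToUpper)
}

func main() {
	fmt.Println(shout("Hello guys, welcome to new york city"))
	// prints: Hello guys, welcome to new YORK city
}
```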

Conclusion

In this article, we’ve explored the Go regexp package and its built-in methods for handling basic to complex matches, compiling and reusing regular expression patterns, splitting strings, and replacing substrings.

You can check out the Go regexp documentation for more information on regular expressions with Golang.
