While using regexp usually yields an elegant and compact solution, it's often not the fastest.
For tasks where you have to replace certain substrings with others, the standard library provides a really efficient solution in the form of strings.Replacer:
Replacer replaces a list of strings with replacements. It is safe for concurrent use by multiple goroutines.
You can create a reusable replacer with strings.NewReplacer(), passing it pairs of the substrings to replace and their replacements. To perform a replacement, you simply call Replacer.Replace().
Here's what it would look like:
const replacement = "<br>\n"

var replacer = strings.NewReplacer(
	"\r\n", replacement,
	"\r", replacement,
	"\n", replacement,
	"\v", replacement,
	"\f", replacement,
	"\u0085", replacement,
	"\u2028", replacement,
	"\u2029", replacement,
)

func replaceReplacer(s string) string {
	return replacer.Replace(s)
}
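Note that "\r\n" is listed before "\r" and "\n" above. According to the strings.Replacer documentation, old-string comparisons are done in argument order, so the order of the pairs can matter. The following is a minimal sketch of that effect, using a shortened pair list; the names br, crlfFirst and crFirst are made up for this illustration and are not part of the solution above.

```go
package main

import (
	"fmt"
	"strings"
)

// br is a hypothetical name for the replacement text used in this sketch.
const br = "<br>\n"

// crlfFirst lists the two-byte sequence before its single-byte parts,
// so a Windows line ending is replaced as one unit.
var crlfFirst = strings.NewReplacer("\r\n", br, "\r", br, "\n", br)

// crFirst lists "\r" first, so "\r" wins wherever both "\r" and "\r\n"
// could match, and the following "\n" is then replaced separately.
var crFirst = strings.NewReplacer("\r", br, "\n", br, "\r\n", br)

func main() {
	in := "a\r\nb"
	fmt.Printf("%q\n", crlfFirst.Replace(in)) // "a<br>\nb"
	fmt.Printf("%q\n", crFirst.Replace(in))   // "a<br>\n<br>\nb"
}
```

With the second ordering, every "\r\n" would produce two line breaks instead of one, which is why the longer sequence should come first.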
Here's what the regexp solution from Wiktor's answer looks like:
var re = regexp.MustCompile(`\r\n|[\r\n\v\f\x{0085}\x{2028}\x{2029}]`)

func replaceRegexp(s string) string {
	return re.ReplaceAllString(s, "<br>\n")
}
The strings.Replacer implementation is actually quite fast. Here's a simple benchmark comparing it to the pre-compiled regexp solution above:
const input = "1st\nsecond\r\nthird\r4th\u0085fifth\u2028sixth"

func BenchmarkReplacer(b *testing.B) {
	for i := 0; i < b.N; i++ {
		replaceReplacer(input)
	}
}

func BenchmarkRegexp(b *testing.B) {
	for i := 0; i < b.N; i++ {
		replaceRegexp(input)
	}
}
And the benchmark results:
BenchmarkReplacer-4 3000000 495 ns/op
BenchmarkRegexp-4 500000 2787 ns/op
For our test input, strings.Replacer was more than 5 times faster.
There's another advantage, too. In the example above, both solutions return the result as a new string value, which requires allocating a new string. If we need to write the result to an io.Writer (e.g. we're creating an HTTP response or writing the result to a file), strings.Replacer lets us avoid creating that string: its handy Replacer.WriteString() method takes an io.Writer and writes the result directly into it instead of allocating and returning it as a string. This further increases the performance gain compared to the regexp solution.