I have an old piece of code that performs find and replace of tokens within a string.
It receives a map of `from` and `to` pairs, iterates over them and, for each pair, iterates over the target string, looks for `from` using `indexOf()`, and replaces it with the value of `to`. It does all the work on a `StringBuffer` and eventually returns a `String`.
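For illustration, the approach described above can be sketched as follows. This is a hypothetical reconstruction and not the original code: the class name `TokenReplacer`, the method name `replaceTokens`, and the guard against empty tokens are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TokenReplacer {
    // Hypothetical reconstruction: replaces every occurrence of each map key
    // with its mapped value, working on a StringBuffer as described.
    static String replaceTokens(String input, Map<String, String> replacements) {
        StringBuffer buffer = new StringBuffer(input);
        for (Map.Entry<String, String> pair : replacements.entrySet()) {
            String from = pair.getKey();
            String to = pair.getValue();
            if (from.isEmpty()) {
                continue; // assumption: skip empty tokens, which would never advance
            }
            int index = buffer.indexOf(from);
            while (index != -1) {
                buffer.replace(index, index + from.length(), to);
                // Resume after the inserted text so the replacement is not rescanned.
                index = buffer.indexOf(from, index + to.length());
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put(",", "");
        pairs.put(".", "");
        pairs.put(" ", "");
        System.out.println(replaceTokens("a, b. c", pairs)); // prints "abc"
    }
}
```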
I replaced that code with this line: replaceAll("[,. ]*", "");
And I ran some comparative performance tests.
When comparing for 1,000,000 iterations, I got this:
Old Code: 1287ms
New Code: 4605ms
About 3.6 times longer!
I then tried replacing it with 3 calls to `replace`:
replace(",", "");
replace(".", "");
replace(" ", "");
This gave the following results:
Old Code: 1295ms
New Code: 3524ms
About 2.7 times longer!
Any idea why `replace` and `replaceAll` are so inefficient? Can I do something to make it faster?
Edit: Thanks for all the answers. The main problem was indeed that `[,. ]*` did not do what I wanted it to do: `*` also matches the empty string, so the regex matches at every position in the input. Changing it to `[,. ]+` brought the performance almost level with the non-regex solution.
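The difference between the two quantifiers can be seen by counting matches. The sketch below is only an illustration: the class `StarVersusPlus`, the helper `countMatches`, and the sample input are assumptions, not part of the test code. Both patterns produce the same output when the replacement is empty, but the `*` version also reports a zero-length match at positions where no separator is present, so it performs more replacements.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StarVersusPlus {
    // Counts every match the regex finds in the input, including empty matches.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "a, b. c";
        // Same result either way, because every match is replaced by "".
        System.out.println(input.replaceAll("[,. ]*", "")); // prints "abc"
        System.out.println(input.replaceAll("[,. ]+", "")); // prints "abc"
        // The number of matches (and thus replacements) differs.
        System.out.println("matches with *: " + countMatches("[,. ]*", input));
        System.out.println("matches with +: " + countMatches("[,. ]+", input));
    }
}
```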
Using a pre-compiled regex helped, but only marginally. (It is a solution very applicable to my problem.)
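A minimal sketch of the pre-compiled variant, assuming the pattern is fixed and can live in a static field. The class name `PrecompiledStripper`, the field `SEPARATORS`, and the method `strip` are hypothetical. `String.replaceAll` compiles its regex on every call, whereas here `Pattern.compile` runs once and only a new `Matcher` is created per input.

```java
import java.util.regex.Pattern;

final class PrecompiledStripper {
    // Compiled once when the class is loaded, then reused for every call.
    private static final Pattern SEPARATORS = Pattern.compile("[,. ]+");

    // Removes every run of commas, periods and spaces from the input.
    static String strip(String input) {
        return SEPARATORS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("a, b. c")); // prints "abc"
    }
}
```

A `Pattern` instance is immutable and safe to share between threads; a `Matcher` is not, which is why a fresh one is created per call.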
Test code:
Replace string with Regex: [,. ]*
Replace string with Regex: [,. ]+
Replace string with Regex: [,. ]+ and Pre-Compiled Pattern