How do I improve the performance of this simple piece of python code ?
Isn't re.search
the best way to find a matching line, since it is almost ~6x slower than Perl or am I doing something wrong ?
#!/usr/bin/env python
import re
import time
import sys
i=0
j=0
time1=time.time()
base_register =r'DramBaseAddress\d+'
for line in open('rndcfg.cfg'):
i+=1
if(re.search(base_register, line)):
j+=1
time2=time.time()
print (i,j)
print (time2-time1)
print (sys.version)
This code takes about 0.96 seconds to complete (Average of 10 runs)
Output:
168197 2688
0.8597519397735596
3.3.2 (default, Sep 24 2013, 15:14:17)
[GCC 4.1.1]
while the following Perl code does it in 0.15 seconds.
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw(time);
my $i=0;my $j=0;
my $time1=time;
open(my $fp, 'rndcfg.cfg');
while(<$fp>)
{
$i++;
if(/DramBaseAddress\d+/)
{
$j++;
}
}
close($fp);
my $time2=time;
printf("%d,%d\n",$i,$j);
printf("%f\n",$time2-$time1);
printf("%s\n",$]);
Output:
168197,2688
0.135579
5.012001
EDIT: Corrected regular expression - Which worsened the performance slightly