I have a script in python to process a log file - it parses the values and joins them simply with a tab
.
p = re.compile(
"([0-9/]+) ([0-9]+):([0-9]+):([0-9]+) I.*"+
"worker\\(([0-9]+)\\)(?:@([^]]*))?.*\\[([0-9]+)\\] "+
"=RES= PS:([0-9]+) DW:([0-9]+) RT:([0-9]+) PRT:([0-9]+) IP:([^ ]*) "+
"JOB:([^!]+)!([0-9]+) CS:([\\.0-9]+) CONV:([^ ]*) URL:[^ ]+ KEY:([^/]+)([^ ]*)"
)
for line in sys.stdin:
line = line.strip()
if len(line) == 0: continue
result = p.match(line)
if result != None:
print "\t".join([x if x is not None else "." for x in result.groups()])
However, the scripts behaves quite slowly and it takes a long time to process the data.
How can I achieve the same behaviour in faster way? Perl/SED/PHP/Bash/...?
Thanks