Parsing Karma Coverage Output in Bash for a Jenkins Job (Scripting)

Question

I'm working with the following output:

=============================== Coverage summary ===============================
Statements   : 26.16% ( 1681/6425 )
Branches     : 6.89% ( 119/1727 )
Functions    : 23.82% ( 390/1637 )
Lines        : 26.17% ( 1680/6420 )
================================================================================

I would like to extract the four coverage percentages, without the percent sign, into a comma-separated list using a regex.

Any suggestions for a good regular expression for this? Or another good approach?

Hi Jeff. These _all_ seem like good answers, except maybe the `bash` one, and people have put work into them. I think the `grep` one is the clearest, better than my `sed`. Anyway, I am voting them all up. — Joseph Quinsey, Aug 03 '18 at 02:31

Joseph Quinsey · Answer 1 · 2018-08-02T02:06:07.323

The sed command:

  sed -n '/ .*% /{s/.* \(.*\)% .*/\1/;p;}' input.txt | sed ':a;N;$!ba;s/\n/,/g'

gives the output:

  26.16,6.89,23.82,26.17

Edit: A better answer, with only a single sed, would be:

  sed -n '/ .*% /{s/.* \(.*\)% .*/\1/;H;};${g;s/\n/,/g;s/,//;p;}' input.txt

Explanation:

/ .*% / search for lines with a percentage value (note spaces)
s/.* \(.*\)% .*/\1/ and delete everything except the percentage value
H and then append it to the hold space, prefixed with a newline
$ then for the last line
g get the hold space
s/\n/,/g replace all the newlines with commas
s/,// and delete the initial comma
p and then finally output the result

To harden the regex, you could replace the search for the percentage value .*% with for example [0-9.]*%.
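As a minimal sketch of that hardening, the single-sed command could look like the following. The temporary file and the variable names `input` and `hardened` are assumptions made for illustration, and the command is written for GNU sed, like the one-liners above.

```shell
#!/bin/bash
# Hypothetical sample file reproducing the summary from the question.
input=$(mktemp)
cat > "$input" <<'EOF'
=============================== Coverage summary ===============================
Statements   : 26.16% ( 1681/6425 )
Branches     : 6.89% ( 119/1727 )
Functions    : 23.82% ( 390/1637 )
Lines        : 26.17% ( 1680/6420 )
================================================================================
EOF

# Same single-sed approach, but only digits and dots may precede the '%'.
hardened=$(sed -n '/ [0-9.]*% /{s/.* \([0-9.]*\)% .*/\1/;H;};${g;s/\n/,/g;s/,//;p;}' "$input")
echo "$hardened"   # prints 26.16,6.89,23.82,26.17

rm -f "$input"
```

With `[0-9.]*` in both the address and the capture group, a line that merely contains some other text followed by `% ` is no longer picked up.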

wp78de · Accepted Answer · 2018-08-13T04:09:17.073

I think this is a grep job. This should help:

$ grep -oE "[0-9]{1,2}\.[0-9]{2}" input.txt | xargs | tr " " ","

Output:

26.16,6.89,23.82,26.17

The input file just contains what you have shown above. Obviously, there are other ways like cat to feed the input to the command.

Explanation:

grep -oE: only show matches using extended regex
xargs: put all results onto a single line
tr " " ",": translate the spaces into commas

This is actually a nice shell tool belt example, I would say.

Taking Joseph Quinsey's suggestion into account, the regex can be made more robust with a lookahead that asserts a % sign after the numeric value, using a Perl-compatible RE pattern:

grep -oP "[0-9]{1,2}\.[0-9]{2}(?=%)" input.txt | xargs | tr " " ","

edited Aug 13 '18 at 04:09

answered Aug 01 '18 at 06:08

wp78de


Maybe you could add a `%` to the end of your RE, to make it more robust, and then remove the `%` with `tr`? – Joseph Quinsey Aug 03 '18 at 02:43

@JosephQuinsey good idea. I used a PCRE with a lookahead to do so. – wp78de Aug 03 '18 at 03:08
Thank you @wp78de. However, it has a bit of trouble when a value is rounded to one decimal place, e.g. Branches 8.1% ( 140/1729 ), so using "[0-9]{1,2}\.[0-9]{1,2}(?=%)" did the trick. Thanks! – TheJeff Aug 12 '18 at 10:16
Bonus: https://stackoverflow.com/questions/51807953/parse-comma-separated-string-of-numbers-into-variables-scripting-bash – TheJeff Aug 12 '18 at 10:27
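The two comment suggestions above (require a trailing `%`, then strip it, and allow a variable number of decimals) can be combined without PCRE support. The following is only a sketch: the `sample` text with a one-decimal value and the variable name `coverage` are assumptions for illustration, and `paste -sd,` is swapped in for the `xargs | tr` pair from the answer.

```shell
#!/bin/bash
# Hypothetical sample with one single-decimal value (8.1%), as in the comment above.
sample='Statements   : 26.16% ( 1681/6425 )
Branches     : 8.1% ( 140/1729 )
Functions    : 23.82% ( 390/1637 )
Lines        : 26.17% ( 1680/6420 )'

# Match digits, an optional fractional part, and a literal '%';
# then drop the '%' characters and join the lines with commas.
coverage=$(printf '%s\n' "$sample" \
    | grep -oE '[0-9]+(\.[0-9]+)?%' \
    | tr -d '%' \
    | paste -sd, -)

echo "$coverage"   # prints 26.16,8.1,23.82,26.17
```

Because the fractional part is optional, this pattern would also accept a whole-number value such as `100%`, which the `{1,2}` quantifiers in the answer would not match in full.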

CWLiu · Answer 3 · 2018-08-03T02:25:04.097

Would you consider using awk? Here's a command you could try:

$ awk 'match($0,/[0-9.]*%/){s=(s=="")?"":s",";s=s substr($0,RSTART,RLENGTH-1)}END{print s}' file
26.16,6.89,23.82,26.17

Brief explanation:

match($0,/[0-9.]*%/): select the records that contain a match for the regex [0-9.]*%
s=(s=="")?"":s",": since the output must be comma-separated, append a comma to s before each match except the first one.
s=s substr($0,RSTART,RLENGTH-1): append the matched text, minus the trailing %, to s
END{print s}: print the accumulated string after the last record
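The same logic can be laid out over several lines, which may be easier to maintain inside a Jenkins shell step. This is a sketch rather than the answer's exact command: the variable names `summary`, `csv`, and `value` are assumptions, and the regex uses `+` instead of `*` so that a bare `%` with no digits in front of it is not treated as a match.

```shell
#!/bin/bash
# Hypothetical input: the coverage summary from the question held in a variable.
summary='=============================== Coverage summary ===============================
Statements   : 26.16% ( 1681/6425 )
Branches     : 6.89% ( 119/1727 )
Functions    : 23.82% ( 390/1637 )
Lines        : 26.17% ( 1680/6420 )
================================================================================'

csv=$(printf '%s\n' "$summary" | awk '
    # Only records containing digits or dots followed by a percent sign.
    match($0, /[0-9.]+%/) {
        # RLENGTH - 1 drops the trailing percent sign from the match.
        value = substr($0, RSTART, RLENGTH - 1)
        # Put a comma in front of every value except the first.
        s = (s == "") ? value : (s "," value)
    }
    END { print s }
')

echo "$csv"   # prints 26.16,6.89,23.82,26.17
```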

score 1 · Answer 4 · answered Aug 01 '18 at 06:10

Assuming the item names (Statements, Branches, ...) do not contain whitespace, how about:

#!/bin/bash

declare -a keys
declare -a values

while read -r line; do
    if [[ "$line" =~ ^([^\ ]+)\ *:\ *([0-9.]+)% ]]; then
        keys+=("${BASH_REMATCH[1]}")      # item name, e.g. Statements
        values+=("${BASH_REMATCH[2]}")    # percentage without the % sign
    fi
done < output.txt

ifsback=$IFS        # backup IFS
IFS=,
echo "${keys[*]}"
echo "${values[*]}"
IFS=$ifsback        # restore IFS

which yields:

Statements,Branches,Functions,Lines
26.16,6.89,23.82,26.17

score 1 · Answer 5 · answered Aug 01 '18 at 22:54

Yet another option, with perl:

cat the_file | perl -e 'while(<>){/(\d+\.\d+)%/ and $x.="$1,"}chop $x; print $x;'

The code, unrolled and explained:

while(<>){  # Read line by line. Put lines into $_
  /(\d+\.\d+)%/ and $x.="$1,"
  # Equivalent to:
  # if ($_ =~ /(\d+\.\d+)%/) {$x.="$1,"}
  # The regex matches "numbers", "dot", "numbers" and "%", 
  # stores just numbers on $1 (first capturing group)
}
chop $x; # Remove extra ',' and print result
print $x;

Somewhat shorter, with an extra sed:

cat the_file | perl -ne '/(\d+\.\d+)%/ and print "$1,"'|sed 's/.$//'

This uses the -n switch, which wraps the code in an implicit while(<>){} loop. The trailing ',' is then removed with sed.
