
I have multiple files in a directory that I need to reformat, writing the combined output to a single file. Each file has the following structure:

========================================================
Daily KPIs  -   DATE:  24/04/2013
========================================================

--------------------------------------------------------
Number of des         =  5270
--------------------------------------------------------
Number of users       =  210
--------------------------------------------------------
Number of active      =  520
--------------------------------------------------------
Total non             =  713
--------------------------------------------------------

========================================================

I need the output format to be:

Date,Numberofdes,Numberofusers,Numberofactive,Totalnon
24042013,5270,210,520,713

The directory has around 1500 files with the same format, and I'm using CentOS 7.

Thanks

Ali Jaber
  • [Stack Overflow](http://stackoverflow.com/tour) is a question and answer site for professional and enthusiast programmers. Please show your coding efforts. – Cyrus Nov 24 '16 at 10:46
  • Hi Cyrus, this is what I have so far: cat file.txt |sed 's/ //g'|sed 's/-//g'|sed 's/========================================================//g'|awk -F'[=;]' '{print $2}'|sed '/^$/d'|tr "/n" ",". My output is 1128,718,7308,9154, but I'm unable to get the date into the first column. – Ali Jaber Nov 24 '16 at 11:02
  • Please add this to your question. – Cyrus Nov 24 '16 at 11:02
  • Please [edit] your question to show [what you have tried so far](http://whathaveyoutried.com). You should include a [mcve] of the code that you are having problems with, then we can try to help with the specific problem. You should also read [ask]. – Toby Speight Nov 24 '16 at 13:26

1 Answer


First we need a function that joins the elements of an array into a single string (cf. Join elements of an array?):

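# Usage: join_array SEPARATOR ELEMENT...
function join_array()
{
    # "$*" joins the positional parameters with the first character of IFS.
    local IFS=$1
    shift
    echo "$*"
}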
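
As a quick check of how the function behaves, here is a minimal sketch that calls it on a small array. The array name `values` and its contents are made up for illustration (they mirror the numbers in the sample file); the function body is the one shown above.

```shell
#!/usr/bin/env bash

# Same function as above: the first argument is the separator,
# the remaining arguments are the elements to join.
join_array() {
    local IFS=$1
    shift
    echo "$*"
}

# Hypothetical sample data, taken from the report in the question.
values=(5270 210 520 713)

joined=$(join_array , "${values[@]}")
echo "$joined"
```

Because `"$*"` only uses the first character of `IFS`, the separator should be a single character such as `,` or `;`.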

Then we can loop over the files and convert each one into a comma-separated list (assuming that the original files have names ending in .txt).

for f in *.txt
do
    # Quote "$f" so that file names containing spaces survive the redirection.
    sed -n 's/[^:=]\+[:=] *\(.*\)/\1/p' < "$f" | {
        mapfile -t fields
        join_array , "${fields[@]}"
    }
done

Here, the sed command looks inside each input file for lines that:

  1. begin with a non-empty substring that contains neither a : nor a = character (the [^:=]\+ part);
  2. continue with a : or a = followed by any number of spaces (the [:=] * part);
  3. finally, end with an arbitrary substring (the \(.*\) part).

The final substring is captured and printed in place of the whole line. Every other line in the input file is discarded.
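
To see what this filter keeps, the following sketch runs the same sed expression on a copy of the sample report from the question. The temporary file and the variable names `sample` and `extracted` are only there for the demonstration. Note that `\+` is a GNU sed extension, which should be available on CentOS 7.

```shell
#!/usr/bin/env bash

# Write the sample report from the question to a throwaway file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
========================================================
Daily KPIs  -   DATE:  24/04/2013
========================================================

--------------------------------------------------------
Number of des         =  5270
--------------------------------------------------------
Number of users       =  210
--------------------------------------------------------
Number of active      =  520
--------------------------------------------------------
Total non             =  713
--------------------------------------------------------

========================================================
EOF

# Keep only the text that follows "label:" or "label =" on each matching line.
extracted=$(sed -n 's/[^:=]\+[:=] *\(.*\)/\1/p' < "$sample")
printf '%s\n' "$extracted"

rm -f "$sample"
```

The ruler lines made only of = or - characters never match, because the pattern requires at least one character that is neither : nor = before the separator.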

After that, the output of sed is read by mapfile into the indexed array variable fields (the -t option strips the trailing newline from each line read), and finally the elements are joined by our previously defined join_array function.
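
The loop above prints the date with its slashes and does not emit a header line. One possible way to get closer to the format requested in the question is sketched below, under some assumptions: the first captured value is always the date, the header text is fixed, and the names `report_to_csv_row`, `workdir` and `kpis.csv` are hypothetical. The demo runs on a throwaway directory with a single sample file rather than on the real data.

```shell
#!/usr/bin/env bash

join_array() {
    local IFS=$1
    shift
    echo "$*"
}

# Turn one report file into one CSV row (hypothetical helper name).
report_to_csv_row() {
    sed -n 's/[^:=]\+[:=] *\(.*\)/\1/p' < "$1" | {
        mapfile -t fields
        # Assumption: the first captured value is the date; drop its slashes.
        fields[0]=${fields[0]//\//}
        join_array , "${fields[@]}"
    }
}

# Demo setup: a temporary directory holding one shortened sample report.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
========================================================
Daily KPIs  -   DATE:  24/04/2013
========================================================
Number of des         =  5270
Number of users       =  210
Number of active      =  520
Total non             =  713
EOF

# Header first, then one row per report, all redirected into a single file.
{
    echo "Date,Numberofdes,Numberofusers,Numberofactive,Totalnon"
    for f in "$workdir"/*.txt; do
        report_to_csv_row "$f"
    done
} > "$workdir/kpis.csv"

csv=$(< "$workdir/kpis.csv")
printf '%s\n' "$csv"

rm -rf "$workdir"
```

Grouping the header and the loop in one pair of braces means the output file is opened only once, which matters little for a single file but avoids reopening it for each of the roughly 1500 reports.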

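The reason we need to group mapfile and join_array inside braces is that each part of a pipeline runs in its own subshell, so an array filled by mapfile at the end of a pipe would otherwise be lost before join_array could use it. This is explained here: readarray (or pipe) issue.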
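
The effect can be reproduced with a small experiment. This is only a sketch and assumes bash's default settings, in particular that the `lastpipe` shell option is not enabled; the array names `lost` and `kept` are made up for the demonstration.

```shell
#!/usr/bin/env bash

# Case 1: mapfile runs in a subshell at the end of the pipeline,
# so the array it fills is gone once the pipeline finishes.
lost=()
printf '%s\n' a b c | mapfile -t lost
echo "after the pipeline: ${#lost[@]} element(s)"

# Case 2: grouping keeps mapfile and the code that uses the array
# in the same subshell, so the elements are still there when needed.
inside=$(printf '%s\n' a b c | { mapfile -t kept; echo "${#kept[@]}"; })
echo "inside the group: $inside element(s)"
```

With default settings, the first message should report 0 elements and the second 3.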

Roberto Reale