
I am new to Unix shell scripting. I need to parse a fixed-length data file and convert it to comma-delimited output. I managed to do this using the code below:

awk '{
 one=substr($0,1,1)
 two=substr($0,2,10)
 three=substr($0,12,4)
 four=substr($0,16,2)
 rest=substr($0,18)
 printf ("%s,%s,%s,%s,%s\n", one, two, three, four, rest)
}' data.txt > out.txt

data.txt:

k12582927001611USNA
k12582990001497INAS
k12583053001161LNEU

out.txt:

k,1258292700,1611,US,NA
k,1258299000,1497,IN,AS
k,1258305300,1161,LN,EU

The problem is that I am required to read the column positions from a config file.

My config file (configfile.txt) is as follows:

one=substr($0,1,1)
two=substr($0,2,10)
three=substr($0,12,4)
four=substr($0,16,2)
rest=substr($0,18)

To meet the requirement, I created the script below:

configparam=`cat configfile.txt`
awk '{
$configparam
printf ("%s,%s,%s,%s,%s\n", one, two, three, four, rest)
}' data.txt > out.txt

but it's not working. Can anybody show me the correct way to achieve this?

zimzim
  • Possible duplicate of [How do I use shell variables in an awk script?](https://stackoverflow.com/questions/19075671/how-do-i-use-shell-variables-in-an-awk-script) – JNevill Oct 06 '17 at 16:26
  • Does your config file REALLY contain statements like `one=substr($0,1,1)`? If so - why that instead of just `1 10 4 ...`? – Ed Morton Oct 06 '17 at 17:21

3 Answers


I've reorganized it as

cat cfg.awk

{ 
   one=substr($0,1,1)
   two=substr($0,2,10)
   three=substr($0,12,4)
   four=substr($0,16,2)
   rest=substr($0,18)
}

cat printer.awk

{ printf ("%s,%s,%s,%s,%s\n", one, two, three, four, rest) }

run as

awk -f cfg.awk -f printer.awk data.txt

output

k,1258292700,1611,US,NA
k,1258299000,1497,IN,AS
k,1258305300,1161,LN,EU

The only difference is that you need to add opening and closing curly braces ({ ... }) around your var=substr code.
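For context, the original attempt fails because the awk program is in single quotes, so the shell never expands $configparam; awk then reads it as a field reference on an unset variable. If configfile.txt has to stay exactly as given, the braces can be added at run time instead of by hand. A minimal sketch, assuming the file names from the question and a scratch directory created only for this demonstration:

```shell
# Sketch only: recreate the question's sample files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > configfile.txt <<'EOF'
one=substr($0,1,1)
two=substr($0,2,10)
three=substr($0,12,4)
four=substr($0,16,2)
rest=substr($0,18)
EOF

cat > data.txt <<'EOF'
k12582927001611USNA
k12582990001497INAS
k12583053001161LNEU
EOF

# Wrap the unmodified config lines in braces so they form one awk action.
{ echo '{'; cat configfile.txt; echo '}'; } > cfg.awk

cat > printer.awk <<'EOF'
{ printf ("%s,%s,%s,%s,%s\n", one, two, three, four, rest) }
EOF

# awk concatenates the -f files in order, so the assignments run first.
result=$(awk -f cfg.awk -f printer.awk data.txt)
printf '%s\n' "$result"
```

Note that this runs whatever the config file contains as awk code, so it is only reasonable when the config file is trusted.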

IHTH

shellter

The easiest way is to create a file that contains each field's start position and number of characters, as below, so you don't have to write one=substr($0,start,n_char); so many times:

Input:

$ cat infile 
k12582927001611USNA
k12582990001497INAS
k12583053001161LNEU

Position file:

$ cat pos 
1,1
2,10
12,4
16,2
18

One-liner:

$ awk 'BEGIN{FS=OFS=","}FNR==NR{pos[++i,"s"]=$1;pos[i,"e"]=$2;next}{for(j=1; j<=i; j++) printf("%s%s", (pos[j,"e"]+0 ? substr($0,pos[j,"s"],pos[j,"e"]) : substr($0,pos[j,"s"])), j==i?ORS:OFS)}' pos infile 
k,1258292700,1611,US,NA
k,1258299000,1497,IN,AS
k,1258305300,1161,LN,EU

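More readable:

awk 'BEGIN{
            FS=OFS=","
     }
     FNR==NR{
            # first file (pos): remember start and length of each field
            pos[++i,"s"]=$1;
            pos[i,"e"]=$2;      # empty when no length is given
            next
     }
     {
          # second file (infile): no length means "to the end of the line"
          for(j=1; j<=i; j++) 
             printf("%s%s", (pos[j,"e"]+0 ? substr($0,pos[j,"s"],pos[j,"e"]) : substr($0,pos[j,"s"])), j==i?ORS:OFS)
     }' pos infile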
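Along the lines of the comment suggesting a config of just `1 10 4 ...`, the start positions can also be derived from the widths alone. A rough sketch, assuming a hypothetical widths.txt holding the four fixed widths on one line, with everything after the last width treated as the final field:

```shell
# Sketch only: widths.txt is a hypothetical config file, not from the question.
workdir=$(mktemp -d)
printf '%s\n' '1 10 4 2' > "$workdir/widths.txt"

cat > "$workdir/infile" <<'EOF'
k12582927001611USNA
k12582990001497INAS
k12583053001161LNEU
EOF

result=$(awk -v OFS=',' '
    # First file: remember each width.
    FNR == NR { n = split($0, width, " "); next }
    {
        start = 1
        line = ""
        for (j = 1; j <= n; j++) {
            line = line substr($0, start, width[j]) OFS
            start += width[j]    # next field begins where this one ended
        }
        # Whatever follows the last fixed-width column is the final field.
        print line substr($0, start)
    }' "$workdir/widths.txt" "$workdir/infile")
printf '%s\n' "$result"
```

This only works when the fields are contiguous; the start,length file above is more flexible if columns need to be skipped.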
Akshay Hegde

The following awk may also help. It reads the original config file as-is and extracts the substr arguments from each line.

awk '
function check(val, re){
  split(val, array,",");
  re=array[1] && array[2]?substr($0,array[1],array[2]):substr($0,array[1]);
  return re
}
FNR==NR{
  match($0,/\(.*\)/);
  a[FNR]=substr($0,RSTART+4,RLENGTH-5);
  count++;
  next}
{
for(i=1;i<=count;i++){
  val=val?val "," check(a[i]):check(a[i])
};
  print val;
  val=""
}
' Input_file_config   Input_file

Output will be as follows.

k,1258292700,1611,US,NA
k,1258299000,1497,IN,AS
k,1258305300,1161,LN,EU
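To see what the FNR==NR block above stores for each config line, the extraction step can be run on its own. A small sketch, assuming every config line has the form name=substr($0,...), which is what the `RSTART+4` and `RLENGTH-5` offsets rely on:

```shell
# match() finds the parenthesised part, e.g. "($0,2,10)".
# RSTART+4 skips the leading "($0," and RLENGTH-5 drops that prefix plus ")".
result=$(printf '%s\n' 'two=substr($0,2,10)' 'rest=substr($0,18)' |
    awk '{
        match($0, /\(.*\)/)
        print substr($0, RSTART + 4, RLENGTH - 5)
    }')
printf '%s\n' "$result"
```

This prints `2,10` and `18`, which check() then splits on the comma to decide between the two-argument and three-argument form of substr.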
RavinderSingh13