3

I have a file with content like this:

03:14.27,"31K" 
03:13.59,"50M" 
04:11.51,"435K" 

How can I convert these sizes to bytes and replace the old values with the results, so that I get the following? (Getting rid of the quotes would also be useful.)

03:14.27,"31744"
...... 

Which is better to use, grep or awk? Thanks!

Book Of Zeus
Ali Bek
  • The edit shows `\n` after each quoted value. Is that the way the actual file is, or is that an editorial assumption? – Jonathan M Dec 16 '11 at 14:23
  • @JonathanM the code has `\n`; I simply made it visible in my edit. View the original source: http://stackoverflow.com/revisions/62bc4eba-9597-4981-a71e-af32e239a732/view-source – Book Of Zeus Dec 16 '11 at 14:24

3 Answers

4

perl!

fge@erwin ~ $ cat t.pl
#!/usr/bin/perl -W

use strict;

my %suffixes = (
        "K" => 10,
        "M" => 20,
        "G" => 30
);

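while (my $line = <STDIN>) {
    # Shifting left by 10/20/30 bits multiplies by 1024, 1024^2, 1024^3.
    # Only the known suffixes are matched, so other quoted text is left alone.
    $line =~ s/"(\d+)([KMG])"/ '"' . ($1 << $suffixes{$2}) . '"'/ge;
    print $line;
}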
fge@erwin ~ $ cat <<EOF | perl t.pl
> 03:14.27,"31K" 
> 03:13.59,"50M" 
> 04:11.51,"435K"
> EOF
03:14.27,"31744" 
03:13.59,"52428800" 
04:11.51,"445440"

(edit: new input)

fge
2

awk way:

 awk 'BEGIN{k=1024;m=1024*k;g=1024*m;FS=OFS="\""}
        {x=substr($2,1,length($2)-1)*1}
        $2~/[Kk]$/{x*=k}
        $2~/[mM]$/{x*=m}
        $2~/[Gg]$/{x*=g}
        {print $1,x"\""}' yourFile

test with your example:

kent$  cat tt
03:14.27,"31K" 
03:13.59,"50M" 
04:11.51,"435K"

kent$  awk 'BEGIN{k=1024;m=1024*k;g=1024*m;FS=OFS="\""}
        {x=substr($2,1,length($2)-1)*1}
        $2~/[Kk]$/{x*=k}
        $2~/[mM]$/{x*=m}
        $2~/[Gg]$/{x*=g}
        {print $1,x"\""}' tt

output:

03:14.27,"31744"
03:13.59,"52428800"
04:11.51,"445440"

if you don't want the quotes:

 awk 'BEGIN{k=1024;m=1024*k;g=1024*m;FS="\""}
        {x=substr($2,1,length($2)-1)*1}
        $2~/[Kk]$/{x*=k}
        $2~/[mM]$/{x*=m}
        $2~/[Gg]$/{x*=g}
        {print $1 x}' yourFile
Kent
1

Grep doesn't do replacements; you'd need sed for that. But sed can't do arithmetic, so if you want the full ×1024 conversion for K/M you'll need awk. If you can live with ×1000, you can easily use sed to replace the K/M with the appropriate number of zeroes:

sed -e s/K/000/ -e s/M/000000/
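
A slightly more defensive sketch of the same ×1000 idea anchors each suffix to the closing quote, so a K or M elsewhere on the line is not touched. The function name `to_decimal_bytes` is hypothetical, and the G rule is an assumption added for symmetry:

```shell
# to_decimal_bytes: hypothetical helper; reads lines such as 03:14.27,"31K" on stdin.
to_decimal_bytes() {
    # Replace a suffix only when it sits directly before the closing quote.
    sed -e 's/K"/000"/' -e 's/M"/000000"/' -e 's/G"/000000000"/'
}

printf '%s\n' '03:14.27,"31K"' '03:13.59,"50M"' | to_decimal_bytes
# 03:14.27,"31000"
# 03:13.59,"50000000"
```

Appending `-e 's/"//g'` to the sed call would also strip the quotes.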

Awk code for the full 1024, if you have gawk or another interpreter with switch:

#!/usr/bin/awk -f
BEGIN { FS = "\""; OFS = "\"" }
{
        N = $2+0
        if(N == 0) { print; next }       
        M = substr($2,length($2),1)
        switch(M) {
                # Add T, P, X, etc. if you need them.  Or just for fun.
                case "G": N *= 1024
                case "M": N *= 1024
                case "K": N *= 1024
        }
        $2 = N
        print
}

If there's the possibility of more quotation marks before this field, change $2 to $(NF-1); with " as the separator, $NF is whatever follows the final quote, not the quoted value itself. If your interpreter doesn't have switch, you can use if statements with multiplied-out products or use Kent's answer. I just wanted to show using " as a separator and a proper use of switch fallthroughs.
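
For interpreters without switch, the if-statement variant with multiplied-out products might look roughly like this. It is a sketch, not a tested drop-in: the function name `to_bytes` is made up, and accepting lowercase suffixes via `toupper` is an added assumption:

```shell
# to_bytes: hypothetical helper; reads lines such as 03:14.27,"31K" on stdin.
to_bytes() {
    awk 'BEGIN { FS = OFS = "\"" }
    {
        n = $2 + 0                            # numeric prefix of the quoted field
        if (n == 0) { print; next }           # nothing to convert on this line
        s = toupper(substr($2, length($2)))   # last character is the unit suffix
        if      (s == "K") n *= 1024
        else if (s == "M") n *= 1048576       # 1024 * 1024
        else if (s == "G") n *= 1073741824    # 1024 * 1024 * 1024
        $2 = n                                # rebuilds the line using OFS
        print
    }'
}

printf '%s\n' '03:14.27,"31K"' '03:13.59,"50M"' '04:11.51,"435K"' | to_bytes
# 03:14.27,"31744"
# 03:13.59,"52428800"
# 04:11.51,"445440"
```

Each product is written out in full because, without fallthrough, every branch has to apply its whole multiplier on its own.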

Kevin
  • Actually he does, as per the question's title – fge Dec 16 '11 at 14:34
  • @fge I wanted to provide a quick answer from a mobile device before I got to work and could test it. Full answer is now up. – Kevin Dec 16 '11 at 16:24