-2

I have a CSV file whose header row contains the title of each column. I want to convert all the numbers to scientific notation with only 2 decimal places, e.g. 23452 should become 2.35e+04 and 0.00023452 should become 2.35e-04.

However, I want the first column to be left untouched. Its values have the form text_number, e.g. ABC_100. I don't want ABC_100 to be converted to ABC_1e+2.

Put simply, everything except the first column and the first row should be converted to scientific notation with 2 decimal places.

Example file:

Name,ClassA,ClassB,ClassC
File_10,2342,0.0212,34.234
File_50,43.234,7834,0.0024
File_100,300,0.0024,2.2341e-5

Expected Output:

Name,ClassA,ClassB,ClassC
File_10,2.34e+03,2.12e-02,3.42e+01
File_50,4.32e+01,7.83e+03,2.40e-03
File_100,3.00e+02,2.40e-03,2.23e-05
polm23
user402940
  • please add 3-5 lines of sample text and complete expected output for that, it'd help in testing solutions... also, please add the code you are having trouble with.. – Sundeep Jul 04 '18 at 06:26
  • @Sundeep : I have added a small example. I don't have much experience with scripting languages, so I don't have any code right now. Sorry !!! – user402940 Jul 04 '18 at 06:34
  • Sorry, but you are expected to show the code you need help with. Tag wikis (e.g. https://stackoverflow.com/tags/awk/info) could get you started. Try searching online as well, e.g. https://stackoverflow.com/questions/42814746/how-change-a-number-to-scientific-number-and-get-the-minimum-value-scientific-n – Sundeep Jul 04 '18 at 06:42
  • @Sundeep: Will definitely go through awk documentation to understand its working on csv files. Thanks. – user402940 Jul 04 '18 at 07:05

3 Answers

0

Here you go.

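awk -F, 'NR == 1 { print }           # header row: print unchanged
  NR > 1 {
    printf "%s", $1                   # first column as plain text, never as a format string
    for (ii = 2; ii <= NF; ii++) {
      printf(",%1.2e", $ii)           # remaining columns: 2-decimal scientific notation
    }
    print ""                          # terminate the output line
  }' input.txt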

The reference for printf may come in handy.

polm23
  • Never do `printf $X`, always use `printf "%s", $X` instead since the former will fail if/when `$X` contains `printf` formatting characters like `%s`. – Ed Morton Jul 04 '18 at 10:57
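
As a minimal sketch of the safe form described in the comment above, the snippet below feeds awk a hypothetical input line (not from the question) whose first field contains a literal percent sign. With `printf "%s", $1` the field is treated as data and printed verbatim. With a bare `printf $1` the same field would be interpreted as a format string, and the result depends on the awk implementation.

```shell
# Hypothetical one-line input: the first field contains a literal percent sign.
line='50%_A,2342,0.0212'

# The field is passed as an argument to "%s", so the % is printed verbatim.
out=$(printf '%s\n' "$line" | awk -F, '{
    printf "%s", $1                 # first column, untouched
    for (i = 2; i <= NF; i++)
        printf ",%.2e", $i          # remaining columns in scientific notation
    print ""                        # end the output line
}')
echo "$out"
```

This prints `50%_A,2.34e+03,2.12e-02`.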
0

Another in awk:

$ awk '
BEGIN { FS=OFS="," }                # set field separators
{
    for(i=1;i<=NF;i++)              # iterate all fields
        if($i+0==$i)                # if $i is numeric
            $i=sprintf("%1.2e",$i)  # convert to scientific form
}
1' file                             # output
Name,ClassA,ClassB,ClassC
File_10,2.34e+03,2.12e-02,3.42e+01
File_50,4.32e+01,7.83e+03,2.40e-03
File_100,3.00e+02,2.40e-03,2.23e-05
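
Note that the numeric test above applies to every field, so a purely numeric value in the header row or in the first column would be converted too. The following is a minimal sketch, not part of the original answer, that restricts the conversion to rows after the header and to fields 2 through NF. The sample data (a numeric first column and an `n/a` cell) is hypothetical and chosen only to show the difference.

```shell
# Hypothetical sample: the first column holds plain numbers that must stay as-is.
out=$(printf '%s\n' 'ID,ClassA,ClassB' '100,2342,0.0212' '200,n/a,7834' |
awk 'BEGIN { FS = OFS = "," }
NR > 1 {                                  # leave the header row alone
    for (i = 2; i <= NF; i++)             # start at 2 to skip the first column
        if ($i + 0 == $i)                 # same numeric test as above
            $i = sprintf("%.2e", $i)
}
1')                                       # print every record
echo "$out"
```

Here the `100` and `200` in the first column and the non-numeric `n/a` cell pass through unchanged, while the other cells are reformatted.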
James Brown
0

Yet another awk solution, based on @JamesBrown's answer (for the numeric test), using GNU awk and no explicit loop:

awk '
BEGIN{RS="[,\n]"}
$1+0==$1{$1=sprintf("%1.2e",$1)}
{printf "%s%s",$0,RT}' file

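Setting the record separator RS to a regex that matches a comma or a newline makes every value its own record, so no explicit loop is needed. RT, a GNU awk variable, holds the separator text that actually ended each record, so printing it back restores the original commas and newlines.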
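
To see how the input is split, here is a small sketch on hypothetical two-line input. It brackets each record and prints a label for the separator held in RT. It assumes `gawk` is installed and checks for it first, because RT and a regex-valued RS are GNU awk features.

```shell
# RT and regex RS are GNU awk extensions, so check that gawk is available.
if command -v gawk >/dev/null 2>&1; then
    out=$(printf 'a,1\nb,2\n' | gawk 'BEGIN { RS = "[,\n]" }
    # Bracket each record and label the separator that terminated it.
    { printf "[%s]%s", $0, (RT == "\n" ? "<NL>\n" : "<COMMA>") }')
    echo "$out"
else
    echo "gawk not found; RT is a GNU awk extension" >&2
fi
```

With gawk available, each value shows up as its own record, followed by the separator that ended it:

```
[a]<COMMA>[1]<NL>
[b]<COMMA>[2]<NL>
```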

oliv
  • You might want to replace that `$i` in `sprintf("%1.2e",$i)` with `$1`, even though it evaluates to `$0` and works anyway. :D – James Brown Jul 04 '18 at 09:01