0

I have a dataframe like this

0 1 0 1 0 0....
1 1 1 1 0 0
0 0 1 1 0 1
.
.
.

I want to multiply each column by the corresponding term of a geometric sequence

1, 10, 100, 1000, 10000 ... 10^(n-1)

so the result will be

0 10 0 1000 0 0....
1 10 100 1000 0 0
0 0 100 1000 0 100000
.
.
.

I have tried with

awk '{n=0 ; x=0 ; for (i = 1; i <= NF; i++) if ($i == 1)  {n=10**i ; x = x+n } print x }' test.txt

But the results were not what I expected.

(Screenshot of the output; image not available.)

SG Kwon
  • I wonder why many people try to write an awk script on one line even when the code contains multiple statements and blocks. It is less readable and harder for the author to debug. There are no benefits. Please stop it. – tshiono Oct 27 '21 at 05:49
  • Please, never show text with images. They are not searchable, not copy-paste-able and much heavier than needed. Copy-paste the text in your question and [format it properly](https://stackoverflow.com/help/formatting), instead. – Renaud Pacalet Oct 27 '21 at 06:23
  • @tshiono, it's a myth honestly; even most of the people I know often ask me for a one-liner `awk` code only :). Maybe over time people have written so many one-liners that everybody thinks only one-liners can be written with `awk` :) So I don't blame users for it :) But yes, we should not write big one-liners with `awk`; it's possible, but very difficult to read and maintain. Cheers. – RavinderSingh13 Oct 27 '21 at 06:25
  • @RavinderSingh13 I do appreciate your polite comment. I understand. – tshiono Oct 27 '21 at 06:49

4 Answers

3

With GNU awk:

awk '{for (i=1; i<=NF; i++){if($i==1){n=10**(i-1); $i=$i*n}} print}' test.txt

Output:

0 10 0 1000 0 0
1 10 100 1000 0 0
0 0 100 1000 0 100000
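
The `**` operator is not specified by POSIX, so an awk other than GNU awk may reject it. As a minimal sketch, the same loop can be written with the POSIX `^` operator. The wrapper function `scale_fields` is a hypothetical name used only for illustration, and the input is piped in rather than read from test.txt:

```shell
# scale_fields: hypothetical helper; replaces every field i that equals 1
# with 10^(i-1). Uses ^ because it is the exponentiation operator POSIX defines.
scale_fields() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == 1)
                $i = 10 ^ (i - 1)
        print
    }' "$@"
}

# Example: one line on standard input; a file name argument works as well.
printf '0 1 0 1 0 0\n' | scale_fields
```

Like the original, this still relies on floating-point arithmetic, so it is only suitable for lines with a modest number of fields.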
Cyrus
2

Note: In this answer, we always assume a single digit per column.

There are a couple of things you have to take into account:

  1. If you have a sequence given by:

    a b c d e
    

    Then the final number will be edcba

  2. awk has no integer type; it stores every number as a floating-point value, so the largest integer up to which it can count without gaps is 2^53 (see "biggest integer that can be stored in a double"). This means that multiplication is not the way forward. Even outside awk the problem remains for integer arithmetic, where the maximum value is 2^64-1 for unsigned 64-bit integers.
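
A quick way to see this limit is to compare 2^53 with 2^53 + 1. The sketch below assumes an awk that stores numbers as IEEE 754 doubles, which is the usual case; the function name `check_double_limit` is hypothetical:

```shell
# check_double_limit: hypothetical helper; reports whether this awk can
# tell 2^53 + 1 apart from 2^53.
check_double_limit() {
    awk 'BEGIN {
        a = 2 ^ 53      # every integer up to this value is exactly representable
        b = a + 1       # not representable as a double, so it rounds back to 2^53
        print ((a == b) ? "equal" : "different")
    }'
}

check_double_limit
```

With double-precision numbers this prints `equal`, which is why a value such as 10^16 plus a small term can silently lose its low digits.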

That said, it is better to just write the number with n places and use 0 as padding. For example, if you want to compute 3 × 10^4, you can do:

awk 'BEGIN{printf "%0.*d",4+1,3}' | rev

Here we make use of rev to reverse the strings (00003 → 30000)

Solution 1: In the OP, the code suggests that the final sum is requested (a b c d e → edcba). So we can just do either of the following:

sed 's/ //g' file | rev
awk -v OFS="" '{$1=$1}1' file | rev

If you want to get rid of possible leading zeros in the result, strip them after reversing:

sed 's/ //g' file | rev | sed 's/^0*//'

Solution 2: If the OP only wants the multiplied columns as output, we can do:

awk '{for(i=NF;i>0;--i) printf("%0.*d"(i==1?ORS:OFS),i,$i)}' file | rev

Solution 3: If the OP only wants the multiplied columns as output and the sum:

awk '{ s=$0; gsub(/ /,"",s); printf "%s%s", s, OFS }
     { for(i=NF;i>0;--i) printf("%0.*d"(i==1?ORS:OFS),i,$i) }
    ' file | rev
kvantour
  • Up-voted for the nice ideas and for taking the overflow problem into account. But the last `awk` script is syntactically incorrect and, even if fixed, does not do what the OP wants. For an input line like `0 1 1 0 1` the OP wants `0 10 100 0 10000`. – Renaud Pacalet Oct 27 '21 at 14:37
  • Based on your ideas, I think that the following would do what the OP wants, without the overflow problem: `awk '{for(i=1;i<=NF;i++)$i=$i?sprintf("%s%0.*d",$i,i-1,0):$i} 1' test.txt`. – Renaud Pacalet Oct 27 '21 at 14:47
  • @RenaudPacalet I've fixed the issues you indicated. – kvantour Oct 28 '21 at 07:53
  • I do not know what version of `awk` you are using but with GNU `awk` and BSD `awk` your first `awk` program (`awk 'BEGIN{printf "%s0.*d",4+1,3}'`) outputs `50.*d`, not `00003`. – Renaud Pacalet Oct 28 '21 at 08:00
  • @RenaudPacalet Yea, there was a typo. Fixed now. – kvantour Oct 28 '21 at 08:57
1

What you wrote is absolutely not what you want. Your awk program parses each line of the input and computes only one number per line which happens to be 10 times the integer you would see if you were writing the 0's and 1's in reverse order. So, for a line like:

1 0 0 1 0 1

your awk program computes 10+0+0+10000+0+1000000=1010010. As you can see, this is the same as 10 times 101001 (100101 reversed).
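
The arithmetic above can be checked directly. The sketch below is the program from the question, reformatted over several lines and with `^` substituted for `**` for portability; the wrapper name `op_sum` is hypothetical:

```shell
# op_sum: the question's program, reformatted; prints one sum per input line.
op_sum() {
    awk '{
        x = 0
        for (i = 1; i <= NF; i++)
            if ($i == 1)
                x += 10 ^ i     # 10^i rather than 10^(i-1): the extra factor of 10
        print x
    }' "$@"
}

printf '1 0 0 1 0 1\n' | op_sum   # prints 1010010
```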

To do what you want, you can loop over all fields and modify them on the fly by multiplying each by the corresponding power of 10, as shown in another answer.

Note: another awk solution, a bit more compact, but strictly equivalent for your inputs, could be:

awk '{for(i=1;i<=NF;i++) $i*=10**(i-1)} {print}' test.txt

The first block loops over the fields and modifies them on the fly by multiplying them by the corresponding power of 10. The second block prints the modified line.

As noted in another answer, there is a potential overflow issue with the pure arithmetic approach. If you have lines with many fields, you could exceed the largest integer that the floating-point format represents exactly. It could be that the strange 1024 values in the output you show are due to this.

If there is a risk of overflow, as suggested in the other answer, you could use another approach where the trailing zeroes are added not by multiplying by a power of 10, but by concatenating value 0 represented on the desired number of digits, something that printf and sprintf can do:

$ awk 'BEGIN {printf("%s%0.*d\n",1,4,0)}' /dev/null
10000

So, a GNU awk solution based on this could be:

awk '{for(i=1;i<=NF;i++) $i = $i ? sprintf("%s%0.*d",$i,i-1,0) : $i} 1' test.txt
Renaud Pacalet
0

How about not doing any math at all:

{m,n,g}awk '{ for(_+=_^=___=+(__="");_<=NF;_++) { 
                $_=$_    ( \
                __=__""___) } } gsub((_=" "(___))"+",_)^+_'

=

1 0 0 0 10000 0 0 0 0 1000000000 10000000000
1 0 0 0 10000 0 0 10000000 0 0 10000000000
1 0 0 0 10000 100000 0 0 0 0 10000000000
1 0 0 1000 0 0 1000000 0 100000000 1000000000
1 0 0 1000 10000 0 0 0 0 1000000000 10000000000

1 0 100 0 0 0 1000000 10000000 0 0 10000000000
1 0 100 0 0 100000 1000000 10000000 100000000 1000000000
1 0 100 0 10000 0 1000000 0 100000000
1 0 100 1000 0 100000 0 0 0 0 10000000000
1 0 100 1000 10000 0 0 10000000

1 10 0 0 0 0 1000000 10000000 0 1000000000
1 10 0 1000 0 100000 0 0 100000000
1 10 0 1000 0 100000 0 0 100000000 1000000000 10000000000
1 10 0 1000 0 100000 0 10000000 100000000 1000000000
1 10 100 1000 10000 100000 0 0 0 1000000000
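
For readers who find the program above hard to follow, here is a plainer sketch of what appears to be the same idea: grow a string of zeros as the field index increases and append it to each nonzero field. It differs from the original in that it skips zero fields instead of collapsing their padding afterwards, and the function name `append_zeros` is hypothetical:

```shell
# append_zeros: hypothetical helper; appends i-1 zeros to every nonzero
# field i by string concatenation, so no arithmetic is involved.
append_zeros() {
    awk '{
        zeros = ""
        for (i = 2; i <= NF; i++) {
            zeros = zeros "0"       # zeros now holds i-1 characters
            if ($i != 0)
                $i = $i zeros
        }
        print
    }' "$@"
}

printf '0 1 0 1 0 1\n' | append_zeros   # prints 0 10 0 1000 0 100000
```

Because the result is built as text, the number of fields per line is not limited by floating-point precision.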
RARE Kpop Manifesto