I noticed some odd behavior while populating an array in awk. The indices and values were both numbers, so adding 0 shouldn't have had any effect. For the sake of understanding, let's take the following example:
Here is a file that I wish to use for this demo:
$ cat file
2.60E5-2670161065730303122012098 Invnum987678
2.60E5-2670161065846403042011098 Invnum987912
2.60E5-2670161065916903012012075 Invnum987654
2.60E5-2670161066813503042011075 Invnum987322
2.60E5-2670161066835008092012075 Invnum987323
2.60E5-2670161067040701122012075 Invnum987324
2.60E5-2670161067106602122010074 Invnum987325
What I would like to do is create an index from $1 and assign it a value from $2. I will extract the relevant pieces of $1 and $2 using the substr function.
$ awk '{p=substr($1,12)+0; A[p]=substr($2,7)+0;next}END{for(x in A) print x,A[x]}' file
Ideally, the output should have been as follows (ignore the fact that associative arrays may be traversed in arbitrary order):
161065730303122012098 987678
161065846403042011098 987912
161065916903012012075 987654
161066813503042011075 987322
161066835008092012075 987323
161067040701122012075 987324
161067106602122010074 987325
But, the output I got was as follows:
161066835008092012544 987323
161065846403042017280 987912
161067040701122019328 987324
161067106602122018816 987325
161066813503041994752 987322
161065916903012007936 987654
161065730303122014208 987678
If I remove the +0 from the above awk one-liner, the output seems to be what I expect. What I would like to know is: why would adding 0 corrupt the keys?
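For reference, here is a minimal sketch of the variant without +0. It recreates the sample file shown earlier and pipes the result through sort, which is my addition only to make the unordered for-in output stable. The out variable is likewise just for illustration. Without +0, the results of substr are used as they come back, as strings, so no numeric conversion is involved.

```shell
# Recreate the sample input from the question (same name, "file").
cat > file <<'EOF'
2.60E5-2670161065730303122012098 Invnum987678
2.60E5-2670161065846403042011098 Invnum987912
2.60E5-2670161065916903012012075 Invnum987654
2.60E5-2670161066813503042011075 Invnum987322
2.60E5-2670161066835008092012075 Invnum987323
2.60E5-2670161067040701122012075 Invnum987324
2.60E5-2670161067106602122010074 Invnum987325
EOF

# Same one-liner without +0: substr() returns strings, so both the
# array keys and the values are stored as strings.
# sort is added only to give the for-in output a stable order.
out=$(awk '{ p = substr($1, 12); A[p] = substr($2, 7) }
           END { for (x in A) print x, A[x] }' file | sort)
printf '%s\n' "$out"
```

With this variant the 21-digit keys come out digit for digit as they appear in the input, which matches the expected listing above.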
The above test was done on:
$ awk -version
awk version 20070501