Well, I have the following file:

12721   2 2 2 1 1 2 1 2 1 1 1 2 2 1 1 2 1 2 1 1 2
12722   2 2 2 1 1 2 1 2 1 1 1 2 2 1 1 2 1 2 1 1 2
12734   2 2 2 1 1 2 1 2 1 1 1 2 2 1 1 2 1 2 1 1 2
12753   2 2 2 1 1 2 1 2 1 1 1 2 2 1 1 2 1 2 1 1 2
12756   2 2 2 1 1 2 1 2 1 1 1 2 2 1 1 2 1 2 1 1 2

I need to remove the spaces starting from the second column, so that my file looks like this:

12721 222112121112211212112
12722 222112121112211212112
12734 222112121112211212112
12753 222112121112211212112
12756 222112121112211212112

I tried this sed command:

sed '1,$s/ //g' snpdata > snpdata1

It did not work: it removed every space, including the one after the first column, and I got this:

12721222112121112211212112
12722222112121112211212112
12734222112121112211212112
12753222112121112211212112
12756222112121112211212112

Any suggestions for removing the spaces only from the second column onward?

Note: my original dataset has thousands of columns and rows.

Greg Rov

3 Answers

EDIT: Since the OP changed the expected output, I have tweaked the code a bit; this should produce the latest expected output.

awk '{val=$1;$1="";gsub(/[[:space:]]+/,"");print val,$0}'  Input_file
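A quick way to check this command without touching the real file is to feed it a single sample row. This is only a minimal sketch: the function name `strip_after_first` is hypothetical, introduced here for illustration, and the row is a shortened copy of the data in the question.

```shell
# Hypothetical helper wrapping the awk one-liner above.
# Saves the first field, blanks it, strips all whitespace from the
# rest of the record, then prints "first-field rest".
strip_after_first() {
  awk '{val=$1; $1=""; gsub(/[[:space:]]+/,""); print val, $0}' "$@"
}

printf '12721   2 2 2 1 1\n' | strip_after_first
# prints: 12721 22211
```

Because it works on fields rather than character positions, this form should not depend on how wide the first column is.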

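The following awk may also help you here. Note that it splits at a fixed character position (the first four characters), so it does not match the five-digit IDs shown in the question.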

awk '{val=substr($0,5);gsub(/ +/,"",val);print substr($0,1,4), val;val=""}' Input_file

RavinderSingh13

Using sed:

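sed 's/ //g;s/\([0-9]\{5\}\)\([0-9]\+\)/\1 \2/' file

The first command removes all spaces, and the second re-inserts a single space after the first five digits. This assumes the first column is always exactly five digits wide; `\+` is a GNU sed extension.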
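The two steps can be tried on sample rows as sketched below. The function name `split_after_five` is hypothetical, `[0-9][0-9]*` is used as the portable spelling of `[0-9]\+`, and the second row is made-up data that is not in the question; it is there only to show what happens when the first column is not five digits wide.

```shell
# Hypothetical helper: the same two steps, one -e per step.
split_after_five() {
  sed -e 's/ //g' \
      -e 's/\([0-9]\{5\}\)\([0-9][0-9]*\)/\1 \2/' "$@"
}

# Five-digit first column, as in the question.
printf '%s\n' '12721   2 2 2 1 1 2' | split_after_five
# prints: 12721 222112

# Made-up six-digit first column: the split lands in the wrong place.
printf '%s\n' '123456   2 1 2' | split_after_five
# prints: 12345 6212
```

If the width of the first column varies, an approach that keys on the first run of spaces rather than on a digit count is likely a safer choice.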

oliv

This might work for you (GNU sed):

sed 's/\([^ ]\)  */\1\n/1;s/ //g;s/\n/ /' file

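Replace the first run of spaces after the first column with a newline, remove all remaining spaces, then replace the newline with a single space. A newline is a safe placeholder because it cannot otherwise occur inside a line that sed has just read.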
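The three steps above can be checked on a couple of sample rows as sketched below. The function name `squeeze_rest` is hypothetical, the second row is made-up data with a shorter first column, and the `\n` in the replacement text assumes GNU sed, as the answer already notes.

```shell
# Hypothetical helper wrapping the GNU sed command above.
squeeze_rest() {
  sed 's/\([^ ]\)  */\1\n/1;s/ //g;s/\n/ /' "$@"
}

printf '%s\n' '12721   2 2 2 1' '7   1 2 1' | squeeze_rest
# prints:
# 12721 2221
# 7 121
```

Since the split is keyed on the first run of spaces, the width of the first column does not matter here.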

potong