I have a data set that I cut-n-pasted from a Google Spreadsheet into my text editor (Sublime Text 2), and the data set doesnt' quite match my needs for processing.
In the form that it has coming from the spreadsheet, the data starts with one line of strings, one for each column, and then a number of rows with data; in the data rows, each column has either the value 1
or is blank. I don't know if the data is tab separated when it comes from the spreadsheet, but after pasting it in the text file it is not. If the last 1
in a row is not in the last column, the line is padded with spaces up until but not including the last column.
I tried doing something with awk
, but I couldn't figure out how to tackle the fact that space is both separator and column value. Next, I tried a few commands with sed
, including replacing repeated spaces with zeros and piping to another sed
which replaced 10
with 1 0
, but then I sometimes got extra zeros inserted and I don't know where in the respective rows that happened.
This is some example data (there are 13 columns in the real file). I've added $
as the character after the last one on the line, so you can see how far the lines are padded.
"1" "2" "3" "4" "1" "2" "3" "4"
1 1 $ 0 1 1 0
1 1 $ 1 0 0 1
1 $ 0 1 0 0
1 1 1 $ 1 1 0 1
I would like to end up with something like the right (and then I don't care about where the line ends) so I can process it with awk
.
And by the way, I have seen this question, which doesn't solve my problem since the solution there is based on the fact that the file is tab-delimited, with no value at all in the "empty" cells. To reiterate, my file is space-delimited, with spaces in the empty cells.