using awk or sed to print all columns from the n-th to the last

Question

This is NOT a duplicate of another question. All previous questions/solutions posted on Stack Overflow have the same issue: runs of multiple spaces get collapsed into a single space.

Example (1.txt)

filename Nospaces
filename One space
filename Two  spaces
filename Three   spaces

Result:

awk '{$1="";$0=$0;$1=$1}1' 1.txt
One space
Two spaces
Three spaces

awk '{$1=""; print substr($0,2)}' 1.txt
One space
Two spaces
Three spaces

@hek2mgl This is NOT a duplicate of another question. All previous questions/solutions posted on stackoverflow have got the same issue: additional spaces get replaced into a single space. — meso_2600, Mar 30 '16 at 13:57
not all of them have that issue. See the answers to http://stackoverflow.com/q/29514679/1745001 for example. — Ed Morton, Mar 30 '16 at 14:51
Hang on - that was YOUR question! You accepted the right answer almost exactly a year ago and now you're back asking the same question again. What's going on? — Ed Morton, Mar 30 '16 at 14:59

Tom Fenech · Answer 1 · 2016-03-30T13:40:55.440

If you define a field as one or more non-space characters followed by any number of space characters (including none), then you can remove the first N like this:

$ sed -E 's/([^[:space:]]+[[:space:]]*){1}//' file
Nospaces
One space
Two  spaces
Three   spaces

Change {1} to {N}, where N is the number of fields to remove. If you only want to remove 1 field from the start, then you can remove the {1} entirely (as well as the parentheses which are used to create a group):

sed -E 's/[^[:space:]]+[[:space:]]*//' file
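
As a quick check of the {N} form, here is a sketch that drops the first two fields. The sample line `a b  c   d` is made up for illustration and is not from the question:

```shell
# Hypothetical input: four fields separated by 1, 2 and 3 spaces.
line='a b  c   d'

# {2} repeats the "field plus trailing whitespace" group twice,
# so the first two fields are removed and the rest is left untouched.
rest=$(printf '%s\n' "$line" | sed -E 's/([^[:space:]]+[[:space:]]*){2}//')
printf '%s\n' "$rest"    # prints: c   d
```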

Some versions of sed (e.g. GNU sed) allow you to use the shorthand:

sed -E 's/(\S+\s*){1}//' file

If there may be some white space at the start of the line, you can add a \s* (or [[:space:]]*) to the start of the pattern, outside of the group:

sed -E 's/\s*(\S+\s*){1}//' file

The problem with using awk is that whenever you modify any of the fields of a given record, the entire record is rebuilt, with each field separated by OFS (the Output Field Separator), which is a single space by default. You could use awk with sub if you wanted to, but since this is a simple substitution, sed is the right tool for the job.
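
The rebuilding of the record is easy to see side by side. A minimal sketch, using one line from the question's example file: assigning to a field makes awk rejoin the fields with OFS, while sub() on the whole record only edits the text.

```shell
line='filename Three   spaces'

# Assigning to $1 makes awk rebuild $0 from its fields, joined by OFS (one space).
rebuilt=$(printf '%s\n' "$line" | awk '{ $1 = $1; print }')
printf '%s\n' "$rebuilt"   # prints: filename Three spaces

# sub() without a target edits $0 as a string; no field is assigned, so spacing survives.
kept=$(printf '%s\n' "$line" | awk '{ sub(/^[^ \t]+[ \t]*/, ""); print }')
printf '%s\n' "$kept"      # prints: Three   spaces
```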

"doesn't work" isn't a very clear description of the problem. Anyway, I changed it slightly (use a `*` instead of a `+`). I guess that it should do what you expect now. — Tom Fenech, Mar 30 '16 at 13:42

score 2 · Answer 2 · answered Mar 30 '16 at 12:47

Specify the field separator (FS) with the -F option so that awk does not collapse runs of multiple spaces:

awk -F "[ ]" '{$1="";$0=$0;$1=$1}1' 1.txt
awk -F "[ ]" '{$1=""; print substr($0,2)}' 1.txt
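
To see why this keeps the spacing, and where the caveat raised in the comments comes from, it helps to look at how awk splits a line with this separator. A small sketch using one line from the question: with `-F "[ ]"` every single space is a separator, so a run of two spaces produces an empty field in between.

```shell
line='filename Two  spaces'

# Default splitting: a run of blanks counts as one separator -> 3 fields.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# -F "[ ]": each space is its own separator -> 4 fields, the third one empty.
single_nf=$(printf '%s\n' "$line" | awk -F "[ ]" '{ print NF }')
third=$(printf '%s\n' "$line" | awk -F "[ ]" '{ print "[" $3 "]" }')

printf '%s %s %s\n' "$default_nf" "$single_nf" "$third"   # prints: 3 4 []
```

Because empty fields are counted, field numbers no longer line up with the visible columns once a line contains a run of spaces.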

jijinp

works great. just wondering what are the caveats – meso_2600 Mar 30 '16 at 12:54
Hmm, now the problem is if I want to start from the second column by using $1=$2="" – meso_2600 Mar 30 '16 at 13:02
This works even for the nth column. – jijinp Mar 30 '16 at 13:59
No, it doesn't, as there are multiple spaces in later columns – meso_2600 Mar 30 '16 at 14:00
Edit your question to show the expected output. – jijinp Mar 30 '16 at 14:04

hek2mgl · Answer 3 · 2016-03-30T11:51:26.293

1

Use cut:

cut -d' ' -f2- a.txt

prints all columns from the second to the last and preserves whitespace.
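
A sketch of how this behaves on one line from the question, plus a made-up line (`a  b c`, an assumption for illustration) where two spaces follow the first column. `cut` treats every single space as a delimiter, so an empty field is counted between the two spaces.

```shell
# Line from the question: spacing after the first column is kept as-is.
kept=$(printf 'filename Two  spaces\n' | cut -d' ' -f2-)
printf '%s\n' "$kept"       # prints: Two  spaces

# Hypothetical line with two spaces after the first column:
# field 2 is the empty string between them, so -f2- starts with a space.
shifted=$(printf 'a  b c\n' | cut -d' ' -f2-)
printf '[%s]\n' "$shifted"  # prints: [ b c]
```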

edited Mar 30 '16 at 11:51

answered Mar 30 '16 at 11:37

hek2mgl

please read the question first – meso_2600 Mar 30 '16 at 11:39
@meso_2600 Uh sorry, I thought you meant rows. I would use `cut`. Updated the answer. – hek2mgl Mar 30 '16 at 11:51
and again. please read the question, it is awk or sed ;) – meso_2600 Mar 30 '16 at 13:12
I read it. However, `cut` is obviously the right tool for the job. Usually, asking something like "do this with tool A or B" is frowned upon because you are already suggesting a probably wrong solution in the question. Why are you forced to use `awk` or `sed`? – hek2mgl Mar 30 '16 at 13:22
cut encounters issues when the field separator consists of a variable number of spaces, while awk handles this properly. This is why I have mentioned the n-th column in the question. cut has problems as soon as I want to print the 2nd->last column – meso_2600 Mar 30 '16 at 13:29
@meso_2600 I cannot reproduce that. The example above shows how to print the 2nd to last column. If your "real" problem differs from the one described in the question, how should I give the right answer then? – hek2mgl Mar 30 '16 at 13:37

score 1 · Accepted Answer · answered Mar 30 '16 at 13:24

To preserve whitespace in awk, you'll have to use regular expression substitutions or use substrings. As soon as you start modifying individual fields, awk has to recalculate $0 using the defined (or implicit) OFS.

Referencing Tom's sed answer:

awk '{sub(/^([^[:blank:]]+[[:blank:]]+){1}/, "", $0); print}' 1.txt
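
For a variable column count, one possible sketch (not part of the original answer) applies the same kind of substitution in a loop, so no interval expression such as `{1}` is needed. The function name `drop_cols`, the variable `n` and the sample input are assumptions; `[ \t]` is written out in place of `[[:blank:]]` only to stay portable to older awks.

```shell
# drop_cols N: remove the first N columns from each input line
# (hypothetical helper name; n is the number of leading columns to drop).
drop_cols() {
    awk -v n="$1" '{
        # Each pass removes one field plus the blanks that follow it from $0.
        # $0 is edited as a string, so the remaining spacing is left alone.
        for (i = 0; i < n; i++)
            sub(/^[ \t]*[^ \t]+[ \t]*/, "")
        print
    }'
}

out=$(printf 'a b  c   d\n' | drop_cols 2)
printf '%s\n' "$out"    # prints: c   d
```

With this variant, a line that has fewer than n columns comes out empty instead of being printed unchanged.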

glenn jackman

this code doesnt work – meso_2600 Mar 30 '16 at 13:30
It's fine with GNU awk. What awk are you using? – glenn jackman Mar 30 '16 at 13:38
With that old version, use `gawk --re-interval '...'` – glenn jackman Mar 30 '16 at 13:53
yup, works. but shorter, no subs needed already posted as an answer – meso_2600 Mar 30 '16 at 13:56
One issue: if one of the lines contains fewer columns than the defined n-th column, it gets printed. Fix: awk '{for(i=0;i<[column_id];i++)sub(/[^[:space:]]+[[:space:]]*/,"")}1' Just wondering how :space: differs from :blank: in this case – meso_2600 Apr 01 '16 at 08:54
"space" includes "vertical" whitespace like newline. – glenn jackman Apr 01 '16 at 09:50

score -1 · Answer 5 · answered Mar 30 '16 at 13:39

Working code in awk, no leading space, supporting multiple spaces in the columns and printing from the n-th column:

awk -v column_id=2 '{ print substr($0, index($0,$column_id)) }' 1.txt
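
A short sketch of how this approach behaves, assuming `column_id` is passed in with `-v`; the wrapper name `from_col` is hypothetical. `index()` returns the position of the first occurrence of the field's text, so the result is only right when that text does not appear earlier in the line. The second input is the one from glenn jackman's comment.

```shell
from_col() {
    # $1 = column number, passed to awk as column_id.
    awk -v column_id="$1" '{ print substr($0, index($0, $column_id)) }'
}

# Unique field text: output starts at column 2, inner spacing preserved.
ok=$(printf 'filename Two  spaces\n' | from_col 2)
printf '%s\n' "$ok"     # prints: Two  spaces

# Field 4 ("bar") also occurs as field 2, so index() finds the earlier one.
bad=$(printf 'foo bar baz bar\n' | from_col 4)
printf '%s\n' "$bad"    # prints: bar baz bar
```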

meso_2600

this is just childish downvoting proper answer :) – meso_2600 Mar 30 '16 at 13:47
Good answer. Potential downfall: same content in two different columns. – glenn jackman Mar 30 '16 at 13:59
please share example as I can't reproduce such issue – meso_2600 Mar 30 '16 at 14:02
ok i can see the issue. thanks. your answer is correct – meso_2600 Mar 30 '16 at 14:05
For others: `echo foo bar baz bar | awk -v column_id=4 '{print substr($0, index($0,$column_id))}'` will print too many columns – glenn jackman Mar 30 '16 at 14:14
You should delete this before it accumulates more downvotes. – Ed Morton Mar 30 '16 at 14:53
Here is a `sed` way: `FIELD=2 && sed "s/^$[^ ]*\([ ]*$\)\{${FIELD}\}/\2/;s/ //" s.txt` – Mar 30 '16 at 19:34
