Change AWK field separator on the fly

Question

I'd like to use AWK to take the following spread sheet where the first name and last name are in one column:

Peter Griffin, 31 Spooner St, Quahog
Homer Simpson, 732 Evergreen Terr, Springfield
Fred Flintstone, 301 Cobblestone Way, Bedrock

and output to a new spreadsheet where the first name and last name have their own columns:

Peter, Griffin, 31 Spooner St, Quahog
Homer, Simpson, 732 Evergreen Terr, Springfield
Fred, Flintstone, 301 Cobblestone Way, Bedrock

I've tried changing field separators on the fly doing something like:

awk '{print $1 "," $2} {FS=","} {print $3} {FS=" "}' spreadsheet.csv

but it doesn't seem to work that way, and I get a jumbled mess. Is this possible using AWK?

FYI a comma is a terrible field separator for a name/address spreadsheet as they are commonly used within names and addresses. You should use a tab instead. — Ed Morton, Aug 30 '16 at 15:56
Hey @Ed, do you know why updating FS doesn't work here? I couldn't find an explanation in POSIX spec — oguz ismail, Sep 18 '19 at 06:09
@oguzismail because the OP is setting the FS to `,` **after** the record has already been split into fields and then setting it back to `" "` again before splitting the next record. — Ed Morton, Sep 18 '19 at 14:43
@Ed Oh yeah, but, is there a way to force awk to resplit fields after changing FS? — oguz ismail, Sep 18 '19 at 14:44

score 2 · Accepted Answer · answered Aug 30 '16 at 15:41

2

Just add a comma whenever a space is found in the first ,-based field:

awk 'BEGIN {FS=OFS=","} {sub(/ /, ", ", $1)}1' file
#                             ^    ^^
#               find a space...    ... replace it with , plus space

With your file:

$ awk 'BEGIN {FS=OFS=","} {sub(/ /, ", ", $1)}1' file
Peter, Griffin, 31 Spooner St, Quahog
Homer, Simpson, 732 Evergreen Terr, Springfield
Fred, Flintstone, 301 Cobblestone Way, Bedrock

This uses the function sub() to perform the replacement in the first field.

answered Aug 30 '16 at 15:41

fedorqui

275,237
103
548
598

Not `whenever a space` but `when the first space` otherwise `Harry Connick Jr` can't be in your database :-). Your code is correct it's just your text that's wrong. – Ed Morton Aug 30 '16 at 15:51
@EdMorton uhms, true, although thinking about it maybe `gsub()` just on `$1` may be best, since you Americans have a lot of senior, jr and III in your names ;D – fedorqui Aug 30 '16 at 19:37
@fedorqui This is what I was looking for, thanks! What is the purpose of the 1 after the closing curly brace? I notice that I can put any number in place of the 1 and the command still works. – Ryan R Aug 30 '16 at 21:04
@RyanR yes this is because `1`, `5` or any number evaluates to True and triggers awk's default action, consisting in printing the current line. So `1` is equivalent to `{print $0}`. – fedorqui Aug 30 '16 at 21:10
`you Americans`? I'm not American. I'm Scottish. Anyway, if you add more commas it'll break the (ill-advised!) comma-separated layout of the OPs file. – Ed Morton Aug 30 '16 at 22:01
@fedorqui your comment is still there on my screen, idk why you're not seeing it. I didn't find it offensive, just inaccurate :-)., so I certainly didn't flag it anyway. Yes, the OP wants to create exactly 2 fields out of that one field. – Ed Morton Aug 31 '16 at 13:59

John1024 · Answer 2 · 2016-08-30T15:59:51.467

2

Replace the first space with a comma-space:

$ sed 's/ /, /' file.csv
Peter, Griffin, 31 Spooner St, Quahog
Homer, Simpson, 732 Evergreen Terr, Springfield
Fred, Flintstone, 301 Cobblestone Way, Bedrock

Here, s/ /, / is a substitute command. It replaces the first found with ,.

To change the file in place, use the -i option:

sed -i.bak 's/ /, /' file.csv

edited Aug 30 '16 at 15:59

answered Aug 30 '16 at 15:43

John1024

109,961
14
137
171

The in-place option is a GNU extension, and does not work in unix/solaris. – Daniel Liston Dec 01 '20 at 20:08
@DanielListon You are right about [solaris sed not having a `-i` option](https://stackoverflow.com/questions/3576380/alternative-to-sed-i-on-solaris). The -`i option is not required by POSIX but it is available on GNU sed, MacOS sed, and [FreeBSD sed](https://www.freebsd.org/cgi/man.cgi?query=sed). The OP tagged this question for Linux. – John1024 Dec 08 '20 at 06:06
Understood. However, the OP requested and tagged an AWK solution, not sed. If an alternate solution is offered, it should be clear there is a difference between utilities with the same name offered by GNU and their original program versions. Especially if a modifier provided in an example does not work across ALL unix/linux platforms. – Daniel Liston Dec 31 '20 at 03:20

score 2 · Answer 3 · answered Aug 31 '16 at 00:47

2

Another possibility.

awk '{$1=$1","}1' file

Peter, Griffin, 31 Spooner St, Quahog
Homer, Simpson, 732 Evergreen Terr, Springfield
Fred, Flintstone, 301 Cobblestone Way, Bedrock

answered Aug 31 '16 at 00:47

Claes Wikner

1,457
1
9
8

What does the 1 after } I tried with other number and we have same result. I can't find in the man I even read awk source code but can't guess. – Et7f3XIV Nov 04 '21 at 10:03

Chem-man17 · Answer 4 · 2016-08-30T15:56:14.643

1

You can use multiple separators as-

awk -F '[ ,]' '{print $1 ", " $2 ", " $3 $4 " " $5 " " $6 ", " $7 " " $8}' file

Output-

Peter, Griffin,  31 Spooner St,  Quahog
Homer, Simpson,  732 Evergreen Terr,  Springfield
Fred, Flintstone,  301 Cobblestone Way,  Bedrock

You have to keep a track of the 'columns' that get defined though.

edited Aug 30 '16 at 15:56

answered Aug 30 '16 at 15:44

Chem-man17

1,700
1
12
27

Don't do that. It's unnecessarily hard-coding everything and will fail when the address is any other format, e.g. `19 Martin Luther King Dr` or the name is `Harry Connick Jr`. – Ed Morton Aug 30 '16 at 15:50
Yup, this becomes a specific solution. Not portable at all. But it addresses the problem of how to use multiple separators in the same awk line. – Chem-man17 Aug 30 '16 at 15:51
1

But using multiple separators is the wrong approach to this question as it adds more problems than it solves and if it were necessary you'd write it as `-F '[ ,]'`, not `-F ' |,'` (consider the difference if you had 10 separator characters). – Ed Morton Aug 30 '16 at 15:53
1

Conceded. Edited to reflect the better way. – Chem-man17 Aug 30 '16 at 15:56

Change AWK field separator on the fly

4 Answers4