
I am having a file in the following format

Column1    Column2
str1       1
str2       2
str3       3

I want the columns to be rearranged. I tried below command

cut -f2,1 file.txt

The command doesn't reorder the columns. Any idea why it's not working?

aioobe
Boolean

9 Answers


From the cut(1) man page:

Use one, and only one of -b, -c or -f. Each LIST is made up of one range, or many ranges separated by commas. Selected input is written in the same order that it is read, and is written exactly once.

It reaches field 1 first, so that is printed, followed by field 2.

Use awk instead:

awk '{ print $2 " " $1}' file.txt
bcorso
Ignacio Vazquez-Abrams
    It's too bad `cut` doesn't support this intuitive re-ordering command. Anyway, another tip: you can use `awk`'s `-FS` and `-OFS` options to use custom input and output field separators (like `-d` and `--output-delimiter` for `cut`). – malana Aug 29 '11 at 13:30
    Sorry, `FS` is an option, `OFS` is a variable. e.g. `awk -v OFS=";" -F"\t" '{print $2,$1}'` – malana Aug 29 '11 at 13:39
    Note to Windows users of Git Bash: if you have weird output from the command above, looking like columns overriding each other, the carriage return is to blame. Change EOL in your file from CRLF to LF. – jakub.g Apr 09 '15 at 12:05
    Alternatively if you don't want to change the input file, you can pipe it through `| sed 's/\r//' | ` before piping to `awk` – jakub.g Apr 09 '15 at 12:15
    This one is very simple but might be useful for some, simply replace space with \t for reordering by tabs, and in case you want more columns, you can do it as for example `awk '{print $4 "\t" $2 "\t" $6 "\t" $7}' file` – FatihSarigol Jul 25 '17 at 04:10
  • So the reason it doesn't work is that `cut` can't reorder. It should just say that in the manual – CervEd Dec 20 '21 at 18:29
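As the comments above note, awk's input and output field separators are configurable. A minimal sketch (assuming file.txt is tab-separated, as in the question) that swaps the columns while keeping tabs on output:

```shell
# FS is set with -F, OFS with -v; the comma in print inserts OFS
# between the fields, so the output stays tab-delimited.
awk -v OFS='\t' -F'\t' '{ print $2, $1 }' file.txt
```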

You may also combine cut and paste:

paste <(cut -f2 file.txt) <(cut -f1 file.txt)

via comments: It's possible to avoid process substitution and remove one instance of cut by doing:

paste file.txt file.txt | cut -f2,3
CervEd
Justin Kaeser
    Not sure if this qualifies as "cleverly", but: f=file.txt paste <(cut -f2 $f) <(cut -f1 $f). Also, I note that this method is the easiest when you have lots of columns and want to move around large blocks of them. – Michael Rusch Mar 22 '16 at 20:30
  • doesn't work with cells of variable lengths in same column – kraymer Apr 29 '16 at 08:10
    @kraymer What do you mean? `cut` works fine for variable-length columns as long as you have a unique column separator. – tripleee Mar 29 '17 at 05:06
    To eliminate the redundant file you could probably use tee: – JJW5432 Nov 15 '17 at 13:32
  • worked like a charm !! @JJW5432 can you please provide an edit or an answer on how to avoid redundant file using tee..that will be helpful too.. – ArigatoManga Aug 15 '18 at 10:08
    Sorry I commented prematurely: I couldn't figure out how to merge the output of the `cut`s with paste. I think you need to use named pipes or coprocesses or something that's a lot more trouble than it's worth, unless for some reason you really don't want to store it in a file. – JJW5432 Aug 16 '18 at 06:41
    It's possible to avoid *`bash`isms* and remove one instance of `cut` by doing: `paste file.txt file.txt | cut -f2,3` – agc Nov 18 '18 at 20:36
  • Note that the second solution would only work when your delimiter is the same as paste's default output delimiter `\t`, which was the delimiter in the question. – Isin Altinkaya Apr 18 '22 at 10:30
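Per the last comment, the shorter variant depends on paste's default tab output delimiter matching the file's delimiter. For a different delimiter, a sketch (assuming a hypothetical comma-separated file.csv) passes `-d` to both tools:

```shell
# Duplicate the columns with paste joining on a comma, then have cut
# split on commas as well and select the swapped pair.
paste -d',' file.csv file.csv | cut -d',' -f2,3
```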

Using join:

join -t $'\t' -o 1.2,1.1 file.txt file.txt

Notes:

  • -t $'\t' In GNU join the more intuitive -t '\t' without the $ fails, (coreutils v8.28 and earlier?); it's probably a bug that a workaround like $ should be necessary. See: unix join separator char.

  • Even though there's just one file being worked on, join syntax requires two filenames. Repeating the file name allows join to perform the desired action.

  • For systems with low resources, join offers a smaller footprint than some of the tools used in other answers:

     wc -c $(realpath `which cut join sed awk perl`) | head -n -1
       43224 /usr/bin/cut
       47320 /usr/bin/join
      109840 /bin/sed
      658072 /usr/bin/gawk
     2093624 /usr/bin/perl
    
agc
  • To be fair if you wanted low footprint you could just write your own custom utility in C... all of the utilities listed except Perl are coreutils. –  Jun 22 '21 at 04:43

using just the shell,

while read -r col1 col2
do
  echo $col2 $col1
done <"file"
ghostdog74
  • This is very often inefficient. Typically, you will find that the corresponding Awk script is a lot faster, for example. You should also be careful to quote the values `"$col2"` and `"$col1"` -- there could be shell metacharacters or other shenanigans in the data. – tripleee Mar 29 '17 at 05:08
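Following the comment's advice, a more defensive sketch of the same loop quotes the fields and replaces echo with printf (it's still slower than awk on large files):

```shell
# Quoting "$col2" and "$col1" protects whitespace and glob characters
# in the data; printf avoids echo's option/escape pitfalls.
while read -r col1 col2; do
  printf '%s %s\n' "$col2" "$col1"
done < file
```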

You can use Perl for that:

perl -ane 'print "$F[1] $F[0]\n"' < file.txt
  • -e option means execute the command after it
  • -n means read line by line (open the file, in this case STDIN, and loop over its lines)
  • -a means split each such line into a vector called @F ("F" like "Field"). Perl indexes vectors starting from 0, unlike cut, which indexes fields starting from 1.
  • You can add -F pattern (with no space between -F and pattern) to use pattern as a field separator when reading the file instead of the default whitespace

The advantage of running perl is that (if you know Perl) you can do much more computation on F than rearranging columns.

Met

I've just been working on something very similar; I am not an expert, but I thought I would share the commands I used. I had a multi-column CSV from which I only needed a few columns, and then I needed to reorder them.

My file was pipe '|' delimited but that can be swapped out.

LC_ALL=C cut -d$'|' -f1,2,3,8,10 ./file/location.txt | sed -E "s/(.*)\|(.*)\|(.*)\|(.*)\|(.*)/\3\|\5\|\1\|\2\|\4/" > ./newcsv.csv

Admittedly it is really rough and ready but it can be tweaked to suit!

Chris Rymer

Just an addition to the answers that suggest duplicating the columns and then doing cut. For duplication, paste and the like only work on files, not on streams. In that case, use sed instead.

cat file.txt | sed s/'.*'/'&\t&'/ | cut -f2,3

This works on both files and streams, which is useful when, instead of just reading from a file with cat, you do some processing before re-arranging the columns.

By comparison, the following does not work:

cat file.txt | paste - - | cut -f2,3

Here, the double stdin placeholder does not make paste duplicate stdin; each - simply reads the next line.
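A quick demonstration of that behavior: with the double placeholder, paste consumes two input lines per output line, pairing line 1 with line 2 rather than repeating line 1:

```shell
# paste - - reads two successive lines from stdin per output row,
# so the second field is the *next* record, not a copy of the first.
printf 'a 1\nb 2\n' | paste - -
```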

Chiarcos
  • to clarify, with sed, `&` is used to print the matched text. so here we're printing matched text twice split by a tab character – oradwell Jan 17 '22 at 15:34

Using sed

Use sed with basic regular expressions' nested subexpressions to capture and reorder the column content. This approach is best suited when only a limited number of columns needs reordering, as in this case.

The basic idea is to surround interesting portions of the search pattern with \( and \), which can be played back in the replacement pattern with \# where # represents the sequential position of the subexpression in the search pattern.

For example:

$ echo "foo bar" | sed "s/\(foo\) \(bar\)/\2 \1/"

yields:

bar foo

Text outside a subexpression is scanned but not retained for playback in the replacement string.

Although the question did not discuss fixed-width columns, we will consider them here, as this is a worthy measure of any solution posed. For simplicity, let's assume the file is space-delimited, although the solution can be extended to other delimiters.

Collapsing Spaces

To illustrate the simplest usage, let's assume that multiple spaces can be collapsed into single spaces, and that the second column's values are terminated with EOL (and not space-padded).

File:

bash-3.2$ cat f
Column1    Column2
str1       1
str2       2
str3       3
bash-3.2$ od -a f
0000000    C   o   l   u   m   n   1  sp  sp  sp  sp   C   o   l   u   m
0000020    n   2  nl   s   t   r   1  sp  sp  sp  sp  sp  sp  sp   1  nl
0000040    s   t   r   2  sp  sp  sp  sp  sp  sp  sp   2  nl   s   t   r
0000060    3  sp  sp  sp  sp  sp  sp  sp   3  nl 
0000072

Transform:

bash-3.2$ sed "s/\([^ ]*\)[ ]*\([^ ]*\)[ ]*/\2 \1/" f
Column2 Column1
1 str1
2 str2
3 str3
bash-3.2$ sed "s/\([^ ]*\)[ ]*\([^ ]*\)[ ]*/\2 \1/" f | od -a
0000000    C   o   l   u   m   n   2  sp   C   o   l   u   m   n   1  nl
0000020    1  sp   s   t   r   1  nl   2  sp   s   t   r   2  nl   3  sp
0000040    s   t   r   3  nl
0000045

Preserving Column Widths

Let's now extend the method to a file with constant width columns, while allowing columns to be of differing widths.

File:

bash-3.2$ cat f2
Column1    Column2
str1       1
str2       2
str3       3
bash-3.2$ od -a f2
0000000    C   o   l   u   m   n   1  sp  sp  sp  sp   C   o   l   u   m
0000020    n   2  nl   s   t   r   1  sp  sp  sp  sp  sp  sp  sp   1  sp
0000040   sp  sp  sp  sp  sp  nl   s   t   r   2  sp  sp  sp  sp  sp  sp
0000060   sp   2  sp  sp  sp  sp  sp  sp  nl   s   t   r   3  sp  sp  sp
0000100   sp  sp  sp  sp   3  sp  sp  sp  sp  sp  sp  nl
0000114

Transform:

bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f2
Column2 Column1
1       str1      
2       str2      
3       str3      
bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f2 | od -a
0000000    C   o   l   u   m   n   2  sp   C   o   l   u   m   n   1  sp
0000020   sp  sp  nl   1  sp  sp  sp  sp  sp  sp  sp   s   t   r   1  sp
0000040   sp  sp  sp  sp  sp  nl   2  sp  sp  sp  sp  sp  sp  sp   s   t
0000060    r   2  sp  sp  sp  sp  sp  sp  nl   3  sp  sp  sp  sp  sp  sp
0000100   sp   s   t   r   3  sp  sp  sp  sp  sp  sp  nl 
0000114

Lastly, although the question's example does not have strings of unequal length, this sed expression supports that case as well.

File:

bash-3.2$ cat f3
Column1    Column2
str1       1      
string2    2      
str3       3      

Transform:

bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f3
Column2 Column1   
1       str1      
2       string2   
3       str3    
bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f3 | od -a
0000000    C   o   l   u   m   n   2  sp   C   o   l   u   m   n   1  sp
0000020   sp  sp  nl   1  sp  sp  sp  sp  sp  sp  sp   s   t   r   1  sp
0000040   sp  sp  sp  sp  sp  nl   2  sp  sp  sp  sp  sp  sp  sp   s   t
0000060    r   i   n   g   2  sp  sp  sp  nl   3  sp  sp  sp  sp  sp  sp
0000100   sp   s   t   r   3  sp  sp  sp  sp  sp  sp  nl 
0000114

Comparison to other methods of column reordering under shell

  • Surprisingly for a file manipulation tool, awk is not well-suited for cutting from a field to end of record. In sed this can be accomplished using regular expressions, e.g. \(xxx.*$\) where xxx is the expression to match the column.

  • Using paste and cut subshells gets tricky when implementing inside shell scripts. Code that works from the commandline fails to parse when brought inside a shell script. At least this was my experience (which drove me to this approach).

agc
Bill Gale

Expanding on the answer from @Met, also using Perl:
If the input and output are TAB-delimited:

perl -F'\t' -lane 'print join "\t", @F[1, 0]' in_file

If the input and output are whitespace-delimited:

perl -lane 'print join " ", @F[1, 0]' in_file

Here,

  • -e tells Perl to look for the code inline, rather than in a separate script file,
  • -n reads the input 1 line at a time,
  • -l removes the input record separator (\n on *NIX) after reading the line (similar to chomp), and adds the output record separator (\n on *NIX) to each print,
  • -a splits the input line on whitespace into array @F,
  • -F'\t', in combination with -a, splits the input line on TABs instead of whitespace into array @F.

@F[1, 0] is the array made up of the 2nd and 1st elements of array @F, in this order. Remember that arrays in Perl are zero-indexed, while fields in cut are 1-indexed. So fields in @F[0, 1] are the same fields as the ones in cut -f1,2.

Note that such notation enables more flexible manipulation of input than in some other answers posted above (which are fine for a simple task). For example:

# reverses the order of fields:
perl -F'\t' -lane 'print join "\t", reverse @F' in_file

# prints last and first fields only:
perl -F'\t' -lane 'print join "\t", @F[-1, 0]' in_file
Timur Shtatland