5

I have a CSV file with this general format:

date,  
2013.04.04,
2013.04.04,
2012.04.02,
2013.02.01,
2013.04.05,
2013.04.02,

A script I run adds data to this file, and the new rows are not necessarily in date order. How can I sort the file into date order (ignoring the header) and overwrite the existing file rather than writing to STDOUT?

I have used awk

awk 'NR == 1; NR > 1 {print $0 | "sort -n"}' file > file_sorted
mv file_sorted file

Is there a more effective way to do this without creating an additional file and moving?

moadeep

2 Answers

12

You can do the following:

sort -n -o your_file your_file

-o names the output file. POSIX specifies that this file can be the same as one of the input files, so it is safe to use here: the original is not truncated before it has been read.

Output

$ cat s
date,  
2013.04.04,
2013.04.04,
2012.04.02,
2013.02.01,
2013.04.05,
2013.04.02,

$ sort -n -o s s

$ cat s
date,  
2012.04.02,
2013.02.01,
2013.04.02,
2013.04.04,
2013.04.04,
2013.04.05,
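
To illustrate the difference between -o and a plain shell redirection, here is a minimal sketch. It works on a scratch file created with mktemp (an assumption made for the demonstration, not part of the original setup) and pins LC_ALL=C so that the numeric comparison does not depend on the locale.

```shell
# Demonstration on a throwaway file; the variable names are illustrative only.
export LC_ALL=C
tmp=$(mktemp)
printf '%s\n' 'date,' '2013.04.04,' '2012.04.02,' '2013.02.01,' > "$tmp"

# sort reads all of its input before it opens the -o file for writing,
# so using the same name for input and output keeps every line.
sort -n -o "$tmp" "$tmp"
sorted=$(cat "$tmp")

# With a redirection the shell truncates the file before sort even starts,
# so sort reads an empty file and the data is lost.
sort -n "$tmp" > "$tmp"
after_redirect=$(cat "$tmp")

printf 'after -o:\n%s\n' "$sorted"
printf 'after redirection: [%s]\n' "$after_redirect"
rm -f "$tmp"
```

The header stays on top here only because sort -n treats the non-numeric line "date," as 0, which is lower than any of the dates.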
fedorqui
  • It would be great if you could elaborate on why this does not mangle the file (like a lot of other commands do when reading from and writing to the same file). Does `-o` magically prevent files from being mangled like `sed -i`? To me it seems like `sort` just doesn't start to output until everything was read because the last line could always be the first in the sorted output. I guess it could fail when using `sort` in `--merge` mode. If I'm right `-o` is just as good as a shell redirection `>`. – Socowi May 02 '19 at 11:55
  • @Socowi good question! This is [defined by POSIX](http://pubs.opengroup.org/onlinepubs/9699919799//utilities/sort.html). You can read more about it in [this answer](https://stackoverflow.com/a/29244387/1983854) to the question *How to sort a file in-place*. – fedorqui May 02 '19 at 12:06
  • Oh wow, thank you for the links. The note »*This file can be the same as one of the input files.*« was not given in my `man sort` so I thought there would be no guarantees. – Socowi May 02 '19 at 12:08
  • @Freedo that your original does not get truncated by mistake, leaving you with an empty file – fedorqui Jul 15 '19 at 09:52
2

Note that there is a race condition if the script and the sort are running at the same time.
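
One way to avoid that race, sketched here under the assumption that flock from util-linux is available and that the script which appends rows can be changed to take the same lock, is to serialize both operations on a lock file. The lock file name and the scratch directory are hypothetical.

```shell
# Sketch only: requires flock (util-linux); paths and names are illustrative.
export LC_ALL=C
dir=$(mktemp -d)
file="$dir/dates.csv"
lock="$dir/dates.lock"   # hypothetical lock file shared by writer and sorter
printf '%s\n' 'date,' '2013.04.04,' '2012.04.02,' > "$file"

# Writer: append a row while holding the lock.
flock "$lock" sh -c 'printf "%s\n" "2013.02.01," >> "$1"' sh "$file"

# Sorter: sort in place while holding the same lock, so no append
# can land between sort reading the file and rewriting it.
flock "$lock" sort -n -o "$file" "$file"

locked_result=$(cat "$file")
printf '%s\n' "$locked_result"
rm -rf "$dir"
```

The lock only helps if every process that touches the file goes through it; a writer that appends without taking the lock can still lose data.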

If the file header sorts before the data, you can use the solution suggested by fedorqui, since sort -o file file is safe (at least with GNU sort; see info sort).

Running sort from within awk seems kind of convoluted. An alternative is to use head and tail (assuming the bash shell):

{ head -n1 file; tail -n+2 file | sort -n; } > file_sorted

Now, about replacing the existing file. AFAIK, you have two options: create a new file and replace the old file with it, as you describe in your question, or use sponge from moreutils, like this:

{ head -n1 file; tail -n+2 file | sort -n; } | sponge file

Note that sponge still creates a temporary file.
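
For the first option, a minimal sketch of the temporary-file route follows. It assumes a scratch directory and sample data made up for the demonstration, creates the temporary file next to the target so that mv is a rename on the same filesystem, and sorts on explicit year, month and day keys instead of relying on -n over the whole line.

```shell
# Sketch only: the directory, file names and sample rows are illustrative.
export LC_ALL=C
dir=$(mktemp -d)
file="$dir/dates.csv"
printf '%s\n' 'date,' '2013.04.04,' '2012.04.02,' '2013.02.01,' '2013.04.05,' > "$file"

# Temporary file in the same directory as the target.
tmp=$(mktemp "$dir/sorted.XXXXXX")

# Copy the header unchanged, sort the remaining lines by year, month, day,
# and replace the original only if everything succeeded.
{
  head -n 1 "$file"
  tail -n +2 "$file" | sort -t . -k 1,1n -k 2,2n -k 3,3n
} > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
printf '%s\n' "$result"
rm -rf "$dir"
```

Because the header is handled separately, this variant does not depend on the header happening to sort before the data.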

glenn jackman
Thor