
I currently run some Python code that writes a set of values alongside a datetime string into approximately 400 text files. The outputs have different values but the same format.

At some point I changed the format of the datetime string to have a colon between the date and the time instead of a space, so now I need to edit the older lines to match the current format. Sample:

2016/02/11 12:54:28 0071754407 7599 4727 2690
2016/02/11 14:07:41 0071754407 7599 4726 2690 
2016/02/11:15:26:58 0071754407 7599 4725 2690 
2016/02/11:17:12:59 0071754407 7599 4722 2690 
2016/02/11:19:01:21 0071754407 7599 4721 2690 

I am looking at using sed for this. My current attempt looks something like this:

find . -name '*.txt' | sed -ei 's'\d\/\d\s\d\d\:''\d\/\d\:\d\d\:g'

As you can see, I don't know how sed handles regex. I am also unsure about sed's options: I read that -i lets sed write back to the same file and that -e is for expressions.

Any guidance would be appreciated.

Astro David

2 Answers


As correctly noted in the comments, sed cannot modify files in place when their names arrive on stdin (it would treat the list of names as the text to edit), so you need a while loop fed by find that operates on each filename instead. Something similar to:

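# Caution: this replaces the first space on every line, so lines that are
# already in the new format are changed as well.
while IFS= read -r name; do
    sed -i 's/^\([^ ]*\)[ ]\(.*$\)/\1:\2/' "$name"
done < <(find . -name "*.txt")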

(Note: you can use -i.bak in place of -i to have sed create a backup of each file it modifies.)
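Before running an in-place edit over every file, it may be worth trying it on a scratch copy with a backup. The following is only a sketch: the scratch directory and the file name `sample.txt` are made up for the demonstration, and the two lines are copied from the question. It assumes a sed that accepts `-i.bak` with the suffix attached, as both GNU sed and the BSD sed on OS X do.

```shell
# Work in a throwaway directory so no real data is touched.
scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT

# Hypothetical sample file holding two old-format lines from the question.
printf '%s\n' \
    '2016/02/11 12:54:28 0071754407 7599 4727 2690' \
    '2016/02/11 14:07:41 0071754407 7599 4726 2690' > "$scratch/sample.txt"

# -i.bak edits sample.txt in place and keeps the original as sample.txt.bak.
sed -i.bak 's/^\([^ ]*\)[ ]\(.*$\)/\1:\2/' "$scratch/sample.txt"

# Show what changed; diff exits non-zero when the files differ, hence "|| true".
diff "$scratch/sample.txt.bak" "$scratch/sample.txt" || true

# If the result is wrong, the original can be restored with:
#   mv "$scratch/sample.txt.bak" "$scratch/sample.txt"
```

Keeping the `.bak` files around until the results have been checked makes a bad substitution easy to undo.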

Example Input

$ cat file
2016/02/11 12:54:28 0071754407 7599 4727 2690
2016/02/11 14:07:41 0071754407 7599 4726 2690

Output

$ sed -e 's/^\([^ ]*\)[ ]\(.*$\)/\1:\2/' <file
2016/02/11:12:54:28 0071754407 7599 4727 2690
2016/02/11:14:07:41 0071754407 7599 4726 2690
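Because the files in the question mix old-format and new-format lines, a stricter pattern may be safer than replacing the first space on every line. The sketch below is one possible variant, not part of the original answer: `fix_timestamp` is a hypothetical helper name, and the pattern assumes every timestamp starts the line as `YYYY/MM/DD` followed by `HH:`.

```shell
# fix_timestamp is a hypothetical helper; it filters lines from stdin to stdout.
fix_timestamp() {
    # Match a date (YYYY/MM/DD) at the start of the line, exactly one space,
    # then the hour and its colon. Lines already using a colon do not match.
    sed 's|^\([0-9]\{4\}/[0-9]\{2\}/[0-9]\{2\}\) \([0-9]\{2\}:\)|\1:\2|'
}

# One old-format and one new-format line from the question.
printf '%s\n' \
    '2016/02/11 14:07:41 0071754407 7599 4726 2690' \
    '2016/02/11:15:26:58 0071754407 7599 4725 2690' | fix_timestamp
```

The first line comes out as `2016/02/11:14:07:41 0071754407 7599 4726 2690`, while the second is printed unchanged because the character after the date is already a colon.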
David C. Rankin
  • Don't pipe find's output to another command. Use the `-exec` option instead. – hek2mgl Feb 22 '16 at 12:28
  • So I tried this with one file. `cat myfile1.txt` and then `sed -e 's/^\([^ ]*\)[ ]\(.*$\)/\1:\2/'` – Astro David Feb 22 '16 at 12:38
  • duh... Didn't even think about it. I guess it wouldn't do much good. – David C. Rankin Feb 22 '16 at 12:38
  • Thanks for the update. I did receive an error, which I quickly resolved from [here](http://stackoverflow.com/questions/19456518/invalid-command-code-despite-escaping-periods-using-sed). Something something OSX. All txt files in current directory edited but the strings that were correct now have a colon at the end of them instead of whitespace, guessing regex matched it to be changed. `2016/02/11:15:26:58:0071754407 7599 4725 2690 1497796` – Astro David Feb 22 '16 at 12:50
  • I missed the part -i.bak, should definitely have used that. haha. Now to work out the regex string to reverse it. – Astro David Feb 22 '16 at 13:02
  • Test this for the reversal `sed -i 's/^\(.*[:].*[:].*[:].*\)[:]\(.*\)$/\1 \2/' file` (on a couple of files before unleashing it on all of them). – David C. Rankin Feb 22 '16 at 13:09
  • Thanks, the combined use of both has had the overall desired result. Thanks very much, I'm just terrible with regex. – Astro David Feb 22 '16 at 15:04

This should do the trick:

find . -name '*.txt' -exec sed -i -e 's_^\(..../../..\) \(..:..:..\)_\1:\2_' {} \;

Explanation:

The find -exec command calls sed once for each matching file it finds under the current directory (.).

-i tells sed to modify the original input files instead of dumping its output to the standard output.

-e tells sed that the next argument on the command line is the expression to run, here the substitution.

We used _ as a separator for the s substitution command.

We freely used the any-character match (.) to match what we assume to be the timestamp. We start at the beginning of the line (^) and capture the two parts of the timestamp as groups, which the replacement refers to as \1 and \2.

(If a line does not match our pattern containing the slashes and colons, we will not touch it.)

The last part, {} \;, ends the -exec clause of find: {} is replaced by the current filename, and \; terminates the command.
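To see the pieces explained above working together, here is a small sketch run against throwaway files. The scratch directory and the names `a.txt` and `sub/b.txt` are hypothetical, the sample lines come from the question, and `-i.bak` is used instead of plain `-i` (a change from the command above) so that each file keeps a backup and the call behaves the same with GNU and BSD sed.

```shell
# Build a throwaway tree with one file at the top level and one in a subdirectory.
scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
mkdir -p "$scratch/sub"

# Each file mixes an old-format line with an already converted one.
printf '%s\n' \
    '2016/02/11 14:07:41 0071754407 7599 4726 2690' \
    '2016/02/11:15:26:58 0071754407 7599 4725 2690' > "$scratch/a.txt"
cp "$scratch/a.txt" "$scratch/sub/b.txt"

# find descends into sub/ as well and runs sed once per matching file.
find "$scratch" -name '*.txt' -exec \
    sed -i.bak -e 's_^\(..../../..\) \(..:..:..\)_\1:\2_' {} \;

# Print the edited files; the already converted line should be untouched.
cat "$scratch/a.txt" "$scratch/sub/b.txt"
```

Only the line with a space between date and time is rewritten, which matches the remark that non-matching lines are left alone.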

ishahak