Given a file with:

2014-08-01 20:13:17.666 xxxxxxxxxx
2014-08-01 20:13:17.666 xxxxxxxxxx
2014-08-01 20:13:17.666 xxxxxxxxxx
......

I am attempting to remove the fractional seconds (the three digits after the period) using sed:

GNU sed version 4.2.1
Copyright (C) 2009 Free Software Foundation, Inc.

The following fails with the error message "sed: -e expression #1, char 38: Invalid range end":

sed 's/\([0-9][0-9\- :]*\)\.[0-9]\{3\}/\1/g' < a.csv

However, opening the file with vi a.csv and searching for

\([0-9][0-9\- :]*\)\.[0-9]\{3\}

works fine.

The root cause is the escaped hyphen. If I remove the escaped hyphen, sed no longer complains, but it does not match the intended pattern. I have tried different ways of escaping the hyphen, to no avail.

The workaround is to explicitly write out the entire date-time format as follows:

sed 's/\([0-9][0-9]*-[0-9][0-9]-[0-9][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]\)\.[0-9]\{3\}/\1/g'

The workaround seems ugly and cumbersome. I realize that the underlying RE engine differs between sed and vi. However, I would like to:

  1. understand why even escaping the hyphen fails in sed, and
  2. learn how to revise the RE for sed to make it more elegant.

Related to, but not resolved by, sed error "Invalid range end"

mbsf

1 Answer

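In a character class (a bracket expression), a literal hyphen must be first or last. Attempting to escape it with a backslash does not work, because the backslash is not an escape character inside brackets; it is simply added to the class as a literal. In [0-9\- :], sed therefore reads the backslash, the hyphen and the following space as a range running from backslash to space. Backslash sorts after space in ASCII order, so the range is reversed, hence "Invalid range end".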
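
A small sketch of the point about the backslash, assuming GNU sed and a made-up sample string (the variable names are only for illustration). With the hyphen last, the class [\-] lists two literal characters, so both the backslash and the hyphen in the input get replaced:

```shell
# Made-up sample containing one backslash and one hyphen: a\b-c
sample='a\b-c'

# [\-] is a class of two literals: backslash and hyphen (hyphen is last,
# so it is not a range operator). The backslash does not escape anything.
replaced=$(printf '%s\n' "$sample" | sed 's/[\-]/X/g')

printf '%s\n' "$replaced"   # prints aXbXc
```

If the backslash really escaped the hyphen, only the hyphen would have been replaced.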

There are multiple sed dialects and multiple other regex implementations which work differently, but in this case, the diagnosis is fairly straightforward. And the fix:

sed 's/\([0-9][-0-9 :]*\)\.[0-9]\{3\}/\1/' < a.csv

(I also removed the /g flag because it appears to be redundant here. Surely, you have no more than one occurrence per line of this pattern?)
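
To try this without touching a.csv, here is a minimal sketch that feeds two sample lines through the same command. The first line is from the question; the second line and the variable names are made up for illustration.

```shell
# Sample input in the format from the question (second line is invented).
input='2014-08-01 20:13:17.666 xxxxxxxxxx
2014-12-31 23:59:59.001 yyyyyyyyyy'

# Hyphen placed first in the bracket expression, so it is taken literally.
output=$(printf '%s\n' "$input" | sed 's/\([0-9][-0-9 :]*\)\.[0-9]\{3\}/\1/')

printf '%s\n' "$output"
# 2014-08-01 20:13:17 xxxxxxxxxx
# 2014-12-31 23:59:59 yyyyyyyyyy
```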

tripleee
  • As a further aside, I don't understand why you would want to permit a hyphen just before the period. It should always be colon, two digits, period, microseconds; shouldn't it? But that doesn't answer your question (-: – tripleee Aug 06 '14 at 22:00
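
Following that aside, a tighter pattern could anchor on the colon and the two seconds digits instead of accepting any run of digits, hyphens, spaces and colons. This is only a sketch under that assumption, not something from the answer itself; it sidesteps the hyphen question entirely.

```shell
# One line in the format from the question.
line='2014-08-01 20:13:17.666 xxxxxxxxxx'

# Match colon, two digits, period, three digits; keep only the colon and seconds.
tight=$(printf '%s\n' "$line" | sed 's/\(:[0-9][0-9]\)\.[0-9]\{3\}/\1/')

printf '%s\n' "$tight"   # prints 2014-08-01 20:13:17 xxxxxxxxxx
```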