I have a file in which each line contains a number followed by a file path, for a large number of files. Each line looks like this:

      7653 /home/usr123/file123456

The problem is that there are 6 leading spaces before the number, which throws off the rest of my script. The line that produces this output is below:

cat temp | uniq -c | sed 's/  */ /g' > temp2

I have narrowed it down to the uniq command, which produces the unnecessary white space. I tried adding a sed command to delete the white space, but for some reason it deletes all but one space. How can I modify my sed statement or my uniq statement to get rid of these spaces? Any help would be greatly appreciated!

Sal
  • Separately, note that you'll want to avoid the [useless use of `cat`](https://stackoverflow.com/questions/11710552/useless-use-of-cat) – tripleee Oct 13 '22 at 04:53

2 Answers

To remove all the leading blanks, use:

sed 's/^ *//g'

To remove all leading white space of any kind:

sed 's/^[[:space:]]*//g'

The ^ symbol matches only at the beginning of a line. Because of this, the above will leave alone any whitespace not at the beginning of the line.
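
As a rough sketch of how this could fit into the pipeline from the question (the sample paths fed in by `printf` and the helper name `strip_leading` are made up for illustration, not taken from the original script):

```shell
# strip_leading is a hypothetical helper name, not part of sed or uniq.
# It removes whitespace only at the start of each line.
strip_leading() {
  sed 's/^[[:space:]]*//'
}

# Stand-in for the question's temp file: two identical adjacent lines,
# so uniq -c reports a count of 2, typically padded with leading blanks.
result=$(printf '%s\n' /home/usr123/file123456 /home/usr123/file123456 \
  | uniq -c \
  | strip_leading)

printf '%s\n' "$result"   # 2 /home/usr123/file123456
```

With the real file, the same idea would probably read `uniq -c < temp | sed 's/^[[:space:]]*//' > temp2`, which also drops the extra `cat`.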

John1024

Breaking down your current sed pattern 's/  */ /g': the search pattern, '  *', matches a single space followed by zero or more spaces. The replacement, ' ', is a single space. The result is that every run of one or more spaces is replaced by a single space, which removes all but one of the spaces. To remove them all, make the replacement the empty string, '':

sed 's/  *//g'
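
To see the original command's behaviour on the sample line from the question, here is a small sketch (the variable names are arbitrary, and the brackets in the output are only there to make the leftover leading space visible):

```shell
line='      7653 /home/usr123/file123456'

# Original command: every run of spaces collapses to one space,
# so a single leading space is left behind.
squeezed=$(printf '%s\n' "$line" | sed 's/  */ /g')

printf '[%s]\n' "$squeezed"   # [ 7653 /home/usr123/file123456]
```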

Note, however, that the g flag makes the substitution global, so it will also remove the space between the number and the file path. If you only want to remove the leading spaces, drop the g. You can also use the + quantifier (written \+ here), which means "at least one", instead of *, which means "at least zero". That would leave you with:

sed 's/ \+//'
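
The effect of the g flag can be checked on the sample line. This sketch uses the portable '  *' form instead of \+, which is a GNU extension in basic regular expressions; the variable names are arbitrary:

```shell
line='      7653 /home/usr123/file123456'

# With g: every run of spaces is deleted, including the separator.
with_g=$(printf '%s\n' "$line" | sed 's/  *//g')

# Without g: only the first run of spaces on the line is deleted.
without_g=$(printf '%s\n' "$line" | sed 's/  *//')

printf '%s\n' "$with_g"      # 7653/home/usr123/file123456
printf '%s\n' "$without_g"   # 7653 /home/usr123/file123456
```

Dropping the g only gives the right result when the first run of spaces is the leading one. On a line with no leading spaces, the unanchored pattern would delete the separator instead, which is one reason to prefer an anchored pattern.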

ETA: As John1024 points out, you can also use the ^ anchor to match only at the beginning of the line, and the \s shorthand to match any whitespace character instead of just spaces:

sed 's/^\s\+//'
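
The difference between matching spaces and matching any whitespace only shows up when the line starts with something like a tab. A minimal sketch, assuming a made-up line that begins with a tab and two spaces, and using the POSIX `[[:space:]]` class rather than \s so that it does not depend on GNU sed:

```shell
# Hypothetical line that starts with a tab followed by two spaces.
line=$(printf '\t  7653 /home/usr123/file123456')

# ' *' anchored at the start matches zero spaces here, because the tab comes first.
spaces_only=$(printf '%s\n' "$line" | sed 's/^ *//')

# [[:space:]] covers tabs as well as spaces, so the whole prefix is removed.
any_blank=$(printf '%s\n' "$line" | sed 's/^[[:space:]]*//')

printf '[%s]\n' "$spaces_only"   # line is unchanged, tab still in front
printf '[%s]\n' "$any_blank"     # [7653 /home/usr123/file123456]
```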

eewallace
    Note that the \s character class shorthand is a GNU extension. John1024's [[:space:]] is POSIX compliant. – eewallace Apr 21 '15 at 00:29