
I would like to replace all leading whitespace characters with an equal number of tabs for every line in a file, using grep or sed. Each line has a few spaces followed by a dash and some text.

 -Line 1  
  -Line 2  
   -Line 3

Finding them is not a problem, but I don't see how to replace these characters using backreferences. Something like:

sed 's/^([\s]+)(-.*)/\1\2/' file.txt

How can I solve this? Or is it even possible?

ganzpopp

2 Answers


Depending on your tab width, you might want to replace each block of, for example, 4 or 8 spaces with a tab:

sed 's/ \{4\}/\t/g' infile

or

sed 's/ \{8\}/\t/g' infile

This turns a file that looks like

$ cat infile
no space
 1 space
  2 spaces
   3 spaces
    4 spaces
     5 spaces
      6 spaces
       7 spaces
        8 spaces
         9 spaces
          10 spaces
           11 spaces

into this (replacing tabs with ^I so we can see them):

$ sed 's/ \{4\}/\t/g' infile | cat -T
no space
 1 space
  2 spaces
   3 spaces
^I4 spaces
^I 5 spaces
^I  6 spaces
^I   7 spaces
^I^I8 spaces
^I^I 9 spaces
^I^I  10 spaces
^I^I   11 spaces

or this

$ sed 's/ \{8\}/\t/g' infile | cat -T
no space
 1 space
  2 spaces
   3 spaces
    4 spaces
     5 spaces
      6 spaces
       7 spaces
^I8 spaces
^I 9 spaces
^I  10 spaces
^I   11 spaces

The tab width can be parameterized (note the double quotes, which let the shell expand $tw):

$ tw=7
$ sed "s/ \{$tw\}/\t/g" infile | cat -T
no space
 1 space
  2 spaces
   3 spaces
    4 spaces
     5 spaces
      6 spaces
^I7 spaces
^I 8 spaces
^I  9 spaces
^I   10 spaces
^I    11 spaces
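
The `\t` escape in the replacement is understood by GNU sed, but POSIX does not guarantee it, so other sed implementations may emit a literal `t` instead. A minimal sketch of a more portable variant, assuming a POSIX shell; the function name `spaces_to_tabs` is made up for illustration and is not part of the commands above:

```shell
# Hypothetical helper: replace every run of WIDTH spaces on stdin with one tab.
# Usage: spaces_to_tabs WIDTH < infile
spaces_to_tabs() {
    tab=$(printf '\t')        # literal tab character, so sed never sees \t
    sed "s/ \{$1\}/$tab/g"    # same substitution as above, width taken from $1
}

# Demo on three sample lines; cat -T (GNU coreutils) prints tabs as ^I
printf '%s\n' 'no space' '    4 spaces' '         9 spaces' | spaces_to_tabs 4 | cat -T
```

As with the commands above, this converts runs of spaces anywhere on the line, not only at the start.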

Note that this can also be done easily in vim; see this question.

Spaces only at start of line

The commands above replace any group of four or eight spaces with a tab. If you want to only replace spaces at the start of a line, say for a file like this:

$ cat infile 
    4 spaces    word
     5 spaces    word
      6 spaces    word
       7 spaces    word
        8 spaces    word 
         9 spaces    word

you can use

$ sed ':a;s/^\(\t*\) \{4\}/\1\t/;/^\t* \{4\}/ba' infile | cat -T
^I4 spaces    word
^I 5 spaces    word
^I  6 spaces    word
^I   7 spaces    word
^I^I8 spaces    word 
^I^I 9 spaces    word

What this does:

# Label to branch to
:a

# Replace optional leading tabs followed by four spaces
# with the same tabs plus one more tab
s/^\(\t*\) \{4\}/\1\t/

# If there are still four spaces after leading tabs, branch to a
/^\t* \{4\}/ba
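
The explicit re-test in the last step can also be expressed with sed's `t` command, which branches only if a substitution has succeeded since the last input line was read or the last `t` was taken. A sketch along those lines, with a hypothetical function name (`leading_spaces_to_tabs`), the tab width passed as the first argument, and a literal tab used instead of `\t` so that it does not depend on GNU sed:

```shell
# Hypothetical helper: convert each block of WIDTH leading spaces to a tab,
# leaving spaces elsewhere on the line alone.
# Usage: leading_spaces_to_tabs WIDTH < infile
leading_spaces_to_tabs() {
    tab=$(printf '\t')
    # :a   label to branch to
    # s    keep the tabs already produced, turn the next WIDTH spaces into a tab
    # ta   loop for as long as the substitution keeps succeeding
    sed -e ':a' -e "s/^\($tab*\) \{$1\}/\1$tab/" -e 'ta'
}

# Demo; cat -T (GNU coreutils) prints tabs as ^I
printf '%s\n' '     5 spaces    word' '        8 spaces    word' |
    leading_spaces_to_tabs 4 | cat -T
```

Splitting the script into separate `-e` expressions keeps the label on its own, which some sed implementations require.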

Update

It turns out the question was actually about replacing each space at the start of a line with one tab.

For this input

0 spaces
 1 space
  2 spaces
   3 spaces

the following sed command works:

$ sed ':a;s/^\(\t*\) /\1\t/;ta' infile | cat -A
0 spaces$
^I1 space$
^I^I2 spaces$
^I^I^I3 spaces$

Explained:

:a                # Label to branch to
s/^\(\t*\) /\1\t/ # Capture tabs at start of line, replace next space with tab
ta                # Branches to :a if there was a substitution
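
Applied to the sample from the question, the same loop could look like the following. This is only a sketch: the helper name `each_leading_space_to_tab` is invented here, and a literal tab stands in for `\t`.

```shell
# Hypothetical helper: turn every leading space into one tab (stdin to stdout).
each_leading_space_to_tab() {
    tab=$(printf '\t')
    sed -e ':a' -e "s/^\($tab*\) /\1$tab/" -e 'ta'
}

# The three lines from the question: 1, 2 and 3 leading spaces before the dash
printf '%s\n' ' -Line 1' '  -Line 2' '   -Line 3' | each_leading_space_to_tab | cat -T
```

The space inside `-Line 1` is left alone because the pattern is anchored with `^` and allows only tabs before the space it replaces.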
Benjamin W.
  • Thanks for your answer, but this doesn't entirely cover my case. I want to replace every space at the beginning by a tab, not every 4 or 8 spaces. – ganzpopp Jan 16 '17 at 09:49
  • @ganzpopp I see - I've added a solution that does that. – Benjamin W. Jan 23 '17 at 05:57
  • The example `$ sed 's/^ \{4\}/\t/g' infile | cat -T` doesn't work as shown, at least not on Linux. It only replaces the first match. – jcomeau_ictx Jul 13 '22 at 20:12
  • 1
    @jcomeau_ictx Right, the first few commands shouldn't use an anchor, I've removed it. The output didn't correspond to the commands shown, now it does. – Benjamin W. Jul 13 '22 at 23:12

Keep it simple and just use awk:

$ awk '{s=$0; sub(/[^ ].*/,"",s); gsub(/ /,"\t",s); sub(/^ +/,s)} 1' file
        -Line 1
                -Line 2
                        -Line 3
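
For readers who want to see what the one-liner does, here is the same program spread over several lines with comments. It is wrapped in a shell function whose name (`leading_spaces_to_tabs_awk`) is made up for this sketch, and it reads standard input instead of `file`.

```shell
# Hypothetical wrapper around the awk program above (stdin to stdout).
leading_spaces_to_tabs_awk() {
    awk '{
        s = $0                 # work on a copy of the current line
        sub(/[^ ].*/, "", s)   # cut from the first non-space on: s holds only the leading spaces
        gsub(/ /, "\t", s)     # turn each of those spaces into a tab
        sub(/^ +/, s)          # swap the leading spaces of the line for the tabs in s
    } 1'                       # the pattern 1 is always true, so every line is printed
}

# Demo on the sample from the question; cat -T (GNU coreutils) prints tabs as ^I
printf '%s\n' ' -Line 1' '  -Line 2' '   -Line 3' | leading_spaces_to_tabs_awk | cat -T
```

Lines without leading spaces pass through unchanged, since the final `sub` finds nothing to replace.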
Ed Morton