118

How do I replace whitespace with tabs in Linux in a given text file?

Jonathan
biznez

11 Answers

190

Use the unexpand(1) program


UNEXPAND(1)                      User Commands                     UNEXPAND(1)

NAME
       unexpand - convert spaces to tabs

SYNOPSIS
       unexpand [OPTION]... [FILE]...

DESCRIPTION
       Convert  blanks in each FILE to tabs, writing to standard output.  With
       no FILE, or when FILE is -, read standard input.

       Mandatory arguments to long options are  mandatory  for  short  options
       too.

       -a, --all
              convert all blanks, instead of just initial blanks

       --first-only
              convert only leading sequences of blanks (overrides -a)

       -t, --tabs=N
              have tabs N characters apart instead of 8 (enables -a)

       -t, --tabs=LIST
              use comma separated LIST of tab positions (enables -a)

       --help display this help and exit

       --version
              output version information and exit
. . .
STANDARDS
       The expand and unexpand utilities conform to IEEE Std 1003.1-2001
       (``POSIX.1'').
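A minimal sketch of how the options above differ, assuming GNU coreutils `unexpand` (the `--first-only` long option is not available everywhere). The scratch directory and `sample.txt` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical scratch file: 8 leading spaces, "key", then 5 spaces,
# so that "value" starts exactly on a tab stop (column 16).
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
printf '        key     value\n' > "$sample"

# Default: only the leading blanks are converted (one tab for 8 spaces).
leading_only=$(unexpand "$sample")

# -a: interior runs of blanks that end on a tab stop are converted too.
all_blanks=$(unexpand -a "$sample")

# -t 4 enables -a, so --first-only is added to limit it to indentation again.
indent_4=$(unexpand -t 4 --first-only "$sample")

# Print each result with tabs made visible as "<TAB>".
tab=$(printf '\t')
for result in "$leading_only" "$all_blanks" "$indent_4"; do
    printf '%s\n' "$result" | sed "s/$tab/<TAB>/g"
done

rm -rf "$workdir"
```

Note that the interior spaces are only converted in the `-a` case because they end on a tab stop; a single space between two words is left alone.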
DigitalRoss
  • 5
    Woah, never knew expand/unexpand existed. I was trying to do the opposite and expand was perfect rather than having to mess around with `tr` or `sed`. – Ibrahim Jan 09 '13 at 05:43
  • 4
    For the record, expand/unexpand are [standard utilities](http://pubs.opengroup.org/onlinepubs/009695299/utilities/unexpand.html). – kojiro Oct 30 '13 at 20:50
  • 5
    So cool that these are standard. I love the [UNIX philosophy](https://en.wikipedia.org/wiki/UNIX_philosophy). Would be nice if it could do in place though. – Matthew Flaschen Nov 14 '13 at 03:26
  • 2
    What's the easiest way to get `expand` to write back to the original file? If I try `expand main.cpp > main.cpp`, it wipes the file. – Craig McQueen Nov 29 '13 at 02:52
  • 3
    I don't think unexpand will work here. It only converts leading spaces, and only runs of two or more spaces. See here: http://lists.gnu.org/archive/html/bug-textutils/2001-01/msg00025.html – olala Dec 17 '13 at 04:26
  • 1
    this worked best to me `unexpand --first-only -t 2 ...`, thx! – Aquarius Power Aug 23 '14 at 03:52
  • 17
    Just a caution - unexpand will not convert a single space into a tab. If you need to blindly convert all runs of 0x20 characters into a single tab, you need a different tool. – Steve S. Feb 04 '15 at 16:47
  • 3
    Can't edit comment above. For me, `sed "s/ \+/\t/g"` did the trick. – Steve S. Feb 04 '15 at 16:56
  • remember what computers were at the dawn of computering @MatthewFlaschen. of course this is built-in! – Randy L Oct 19 '16 at 00:34
  • 1
    Example one-liner, how to replace spaces with tabs in an entire directory. Find to pick out the files you want to use (php files in my case), then a bash loop to run each file through unexpand, create a temporary file with the result, and then rename back to the original. find /path/to/directory -name '*.php' -print0 | while read -d '' -r filename; do unexpand --tabs=4 --first-only "$filename" > "$filename"-notab && mv "$filename"-notab "$filename"; done; – Daniel Howard Jun 28 '17 at 15:18
  • The only reasonable answer. Using perl or awk is a total overkill if you have a dedicated command for it. – 9ilsdx 9rvj 0lo Sep 28 '18 at 08:37
  • How do I overwrite the file instead of printing it? – Aaron Franke Nov 06 '19 at 20:08
  • NEVER try to overwrite the file or you will end up deleting it, always write to a new file. I used `unexpand --first-only -t 4 file > file2` to convert spaces to tabs. – CrazyPyro Sep 04 '20 at 16:24
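Several comments above ask how to write the result back to the original file. One way to sketch it, assuming GNU `unexpand` and `mktemp`, is to convert into a temporary file and copy it back only if the conversion succeeded. The function name `unexpand_in_place` and its default width of 4 are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert leading spaces in FILE to tabs, in place.
# Usage: unexpand_in_place FILE [TAB_WIDTH]
unexpand_in_place() {
    file=$1
    width=${2:-4}
    tmp=$(mktemp) || return 1
    # The redirection targets the temporary file, so the original is never
    # truncated before unexpand has read it.
    if unexpand --first-only -t "$width" "$file" > "$tmp" &&
       cat "$tmp" > "$file"; then   # cat keeps the original permissions
        status=0
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

For example, `unexpand_in_place main.cpp 4` rewrites `main.cpp`. If moreutils is installed, piping through `sponge` (`unexpand ... file | sponge file`) is a common alternative.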
54

I think you can try awk:

awk -v OFS="\t" '{ $1 = $1; print }' file1

or sed, if you prefer (GNU sed syntax):

sed -E 's/[[:blank:]]+/\t/g' thefile.txt > the_modified_copy.txt

or even tr

tr -s ' ' < thefile.txt | tr ' ' '\t' > the_modified_copy.txt

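or a simplified version of the tr solution suggested by Sam Bisbee (written to a separate output file, because redirecting onto the input file truncates it before it is read)

tr ' ' '\t' < someFile > someFile.tabbed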
Jonathan
  • 4
    In your sed example, best practices dictate that you use tr to replace single characters over sed for efficiency/speed reasons. Also, tr example is much easier this way: `tr ' ' \\t < someFile > someFile` – Sam Bisbee Sep 14 '09 at 22:12
  • 2
    Of course, tr performs better than sed, but the main reason I love Unix is that there are many ways to do something. If you plan to do this substitution many times, you will look for a solution with good performance; if you are going to do it only once, you will look for a solution that involves a command you feel comfortable with. – Jonathan Sep 15 '09 at 14:37
  • 2
    arg. I had to use trial and error to make the sed work. I have no idea why I had to escape the plus sign like this: `ls -l | sed "s/ \+/ /g"` – Jess Apr 11 '13 at 19:37
  • With `awk -v OFS="\t" '$1=$1' file1` I noticed that if you have a line beginning with the number 0 (e.g. `0 1 2`), then the line will be omitted from the result. – Nikola Novak Jun 29 '14 at 18:13
  • @Jess You found "correct default syntax" regex. By default sed treat single (unescaped) plus sign as simple character. The same is true for some other characters like '?', ... You can find more info here: https://www.gnu.org/software/sed/manual/html_node/Extended-regexps.html#Extended-regexps . Similar syntax details can be found here (note that this is man for grep, not sed): http://www.gnu.org/software/grep/manual/grep.html#Basic-vs-Extended . – Victor Yarema Feb 22 '16 at 19:29
  • How do I do this recursively? – Aaron Franke Jul 29 '19 at 17:44
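The comment above about lines beginning with 0 can be reproduced with a short check. In the pattern form `'$1=$1'`, the value of the assignment doubles as the print condition, so a first field of 0 counts as false. The sample input below is made up for illustration.

```shell
#!/bin/sh
# Made-up sample: the middle line starts with the number 0.
input=$(printf 'a b c\n0 1 2\nx  y\n')

# Pattern form: the line is printed only if the assigned value is "true",
# so "0 1 2" is silently dropped.
pattern_form=$(printf '%s\n' "$input" | awk -v OFS='\t' '$1=$1')

# Action form: rebuild the record with OFS, then print unconditionally.
action_form=$(printf '%s\n' "$input" | awk -v OFS='\t' '{ $1 = $1; print }')

printf 'pattern form:\n%s\n\naction form:\n%s\n' "$pattern_form" "$action_form"
```

The action form keeps every line, including empty ones, which the pattern form would also drop.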
15

Using Perl:

perl -p -i -e 's/ /\t/g' file.txt
djule5
John Millikin
13

A better tr command (quote the arguments so the shell cannot glob-expand [:blank:]):

tr '[:blank:]' '\t'

This will clean up the output of, say, unzip -l for further processing with grep, cut, etc.

e.g.,

unzip -l some-jars-and-textfiles.zip | tr '[:blank:]' '\t' | cut -f 5 | grep jar
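Because this form maps each blank to its own tab, the field number passed to cut depends on how many spaces separate the columns. A small sketch of the difference with and without `-s`; the sample row is made up and only resembles the shape of `unzip -l` output.

```shell
#!/bin/sh
# Made-up row: 3 spaces, size, 2 spaces, date, 1 space, time, 3 spaces, name.
row='   1024  2020-01-01 10:00   lib/app.jar'

# One tab per blank: runs of spaces become runs of tabs, so field 5 is empty here.
one_to_one=$(printf '%s\n' "$row" | tr '[:blank:]' '\t' | cut -f 5)

# -s squeezes each run into a single tab; the leading run still produces
# an empty first field, so the file name ends up in field 5.
squeezed=$(printf '%s\n' "$row" | tr -s '[:blank:]' '\t' | cut -f 5)

printf 'one-to-one: [%s]\nsqueezed:   [%s]\n' "$one_to_one" "$squeezed"
```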
Tarkin
3

Example command for converting each .js file under the current dir to tabs (only leading spaces are converted):

find . -name "*.js" -exec bash -c 'unexpand -t 4 --first-only "$0" > /tmp/totabbuff && mv /tmp/totabbuff "$0"' {} \;
arkod
3

Download and run the following script to recursively convert soft tabs to hard tabs in plain text files.

Place and execute the script from inside the folder which contains the plain text files.

#!/bin/bash

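# Find text files (grep -I skips binaries), ignoring .git; names are NUL-delimited so unusual filenames survive.
find . -type f -not -path './.git/*' -exec grep -Iq . {} \; -print0 | while IFS= read -r -d '' file; do
    echo "Converting... $file"
    # Convert into a temporary file first, then copy it back, so a failure never destroys the original.
    tmp=$(mktemp) || exit 1
    unexpand --first-only -t 4 "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
done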
olfek
2

This will squeeze each run of repeated spaces (or repeated tabs) into a single character, without converting anything to tabs.

tr -s '[:blank:]'

This will replace consecutive spaces with a tab.

tr -s '[:blank:]' '\t'
shrewmouse
mel
2

Using sed:

T=$(printf "\t")
sed "s/[[:blank:]]\+/$T/g"

or

sed "s/[[:space:]]\+/$T/g"
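The `\+` operator used above is a GNU extension to basic regular expressions. A hedged sketch of two spellings that should behave the same on other sed implementations as well; the sample input is made up.

```shell
#!/bin/sh
tab=$(printf '\t')

# Basic regular expression: "one blank followed by zero or more blanks"
# matches the same runs as [[:blank:]]\+ without relying on the extension.
portable=$(printf 'a   b \t c\n' | sed "s/[[:blank:]][[:blank:]]*/$tab/g")

# Extended regular expression: with -E, a bare + means "one or more".
extended=$(printf 'a   b \t c\n' | sed -E "s/[[:blank:]]+/$tab/g")

# Show both results with tabs made visible as "<TAB>".
printf '%s\n%s\n' "$portable" "$extended" | sed "s/$tab/<TAB>/g"
```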
Tibor
1

You can also use astyle. I found it quite useful and it has several options too:

Tab and Bracket Options:
   If  no  indentation  option is set, the default option of 4 spaces will be used. Equivalent to -s4 --indent=spaces=4.  If no brackets option is set, the
   brackets will not be changed.

   --indent=spaces, --indent=spaces=#, -s, -s#
          Indent using # spaces per indent. Between 1 to 20.  Not specifying # will result in a default of 4 spaces per indent.

   --indent=tab, --indent=tab=#, -t, -t#
          Indent using tab characters, assuming that each tab is # spaces long.  Between 1 and 20. Not specifying # will result in a default assumption  of
          4 spaces per tab.
Ankur Agarwal
0

If you are talking about replacing all consecutive spaces on a line with a tab, then use tr -s '[:blank:]' '\t'.

[root@sysresccd /run/archiso/img_dev]# sfdisk -l -q -o Device,Start /dev/sda
Device         Start
/dev/sda1       2048
/dev/sda2     411648
/dev/sda3    2508800
/dev/sda4   10639360
/dev/sda5   75307008
/dev/sda6   96278528
/dev/sda7  115809778
[root@sysresccd /run/archiso/img_dev]# sfdisk -l -q -o Device,Start /dev/sda | tr -s '[:blank:]' '\t'
Device  Start
/dev/sda1       2048
/dev/sda2       411648
/dev/sda3       2508800
/dev/sda4       10639360
/dev/sda5       75307008
/dev/sda6       96278528
/dev/sda7       115809778

If you are talking about replacing all whitespace (e.g. space, tab, newline, etc.), then use tr -s '[:space:]' '\t'. Note that newlines are converted too, so the output becomes a single line.

[root@sysresccd /run/archiso/img_dev]# sfdisk -l -q -o Device,Start /dev/sda | tr -s '[:space:]' '\t'
Device  Start   /dev/sda1       2048    /dev/sda2       411648  /dev/sda3       2508800 /dev/sda4       10639360        /dev/sda5       75307008        /dev/sda6     96278528        /dev/sda7       115809778  

If you are talking about fixing a tab-damaged file then use expand and unexpand as mentioned in other answers.

shrewmouse
0
sed 's/[[:blank:]]\+/\t/g' original.out > fixed_file.out

This will, for example, collapse each run of tabs or spaces into a single tab.

You can also collapse each run of spaces/tabs into a single space:

sed 's/[[:blank:]]\+/ /g' original.out > fixed_file.out
Dexter