295

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

Also, is there a way of setting the number of spaces per tab?

kenorb
cnd

19 Answers

381

Simple replacement with sed is okay but not the best possible solution. If there are "extra" spaces between the tabs they will still be there after substitution, so the margins will be ragged. Tabs expanded in the middle of lines will also not work correctly. In bash, we can say instead

find . -name '*.java' ! -type d -exec bash -c 'expand -t 4 "$0" > /tmp/e && mv /tmp/e "$0"' {} \;

to apply expand to every Java file in the current directory tree. Remove or replace the -name argument if you're targeting other file types. As one of the comments mentions, be very careful when removing -name or using a weak wildcard. You can easily clobber repository and other hidden files unintentionally. This is why the original answer included this:

You should always make a backup copy of the tree before trying something like this in case something goes wrong.
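
For anyone who would rather avoid the shell quoting questions altogether, the same idea can be sketched in Python. This is only an illustration and not part of the original command: the name `expand_tree` and its parameters are made up here, `str.expandtabs` stands in for `expand -t`, and files are assumed to be UTF-8 text.

```python
import fnmatch
import os


def expand_tree(root, pattern="*.java", tabsize=4):
    """Expand tabs in regular files under root whose names match pattern.

    Hypothetical helper, roughly `find ROOT -name PATTERN -type f` plus
    `expand -t TABSIZE`. Returns the paths that were rewritten.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata so it is never touched.
        dirnames[:] = [d for d in dirnames if d not in (".git", ".hg", ".svn")]
        for name in fnmatch.filter(filenames, pattern):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            # newline="" keeps the file's original line endings untouched.
            with open(path, encoding="utf-8", newline="") as fh:
                original = fh.read()
            # expandtabs() pads to the next tab stop and resets the column
            # at each line break.
            expanded = original.expandtabs(tabsize)
            if expanded != original:
                with open(path, "w", encoding="utf-8", newline="") as fh:
                    fh.write(expanded)
                changed.append(path)
    return changed
```

Calling `expand_tree(".", "*.java", 4)` would then play the role of the one-liner above. The backup advice applies just as much, since the files are rewritten in place.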

Gene
  • Could someone explain why to use the _ in the command, rather than omit it and use $0? – Jeffrey Martinez Nov 26 '13 at 01:13
  • 2
    @JeffreyMartinez Great question. gniourf_gniourf edited my original answer on 11 November and made disparaging remarks about not knowing the proper way to use `{}`. Looks like he didn't know about `$0` when `-c` is used. Then dimo414 changed from my use of a temp in the conversion directory to `/tmp`, which will be much slower if `/tmp` is on a different mount point. Unfortunately I don't have a Linux box available to test your `$0` proposal. But I think you are correct. – Gene Nov 26 '13 at 02:12
  • 1
    @Gene, thanks for the clarification, that sounds like stackoverflow alright :p . While I'm at it though, I'll add I had to use quotes around '*.java' for proper escaping of the *.java. – Jeffrey Martinez Nov 26 '13 at 03:34
  • @JeffreyMartinez @Gene `Don't omit the _ and try to use $0 inside the mini-script -- not only would that be more confusing, but it is also prone to failure if the filename provided by find has special meaning as an argument to the shell.` http://mywiki.wooledge.org/UsingFind – sabgenton Nov 30 '13 at 10:21
  • The $0 positional parameter was really meant to expand to the name of the program; if you leave out all arguments, it actually expands to `bash` in that scenario. – sabgenton Nov 30 '13 at 10:21
  • @sabgenton But according to the bash documentation for the `-c` option: "If Bash is started with the -c option, then $0 is set to the first argument after the string to be executed, if one is present." It seems they really meant to make the first argument available as `$0`. – Gene Nov 30 '13 at 15:51
  • It's obvious what the `-c` option allows, but there seem to be plenty of bash hackers who don't recommend using it that way in this context. I'm no expert, but I suppose $1 will go blank if no args are given, whereas $0 will expand to 'bash', which is not what you want. gniourf_gniourf was not confused or anything, he was following the convention; I see no reason to challenge it. – sabgenton Dec 01 '13 at 03:03
  • @sabgenton, I don't follow what you mean by "...also prone to failure if the filename...has special meaning...to the shell". I'm missing the part about how it would cause a problem whether we use the '_' or not. Could you clarify that point? I get that this might be a "convention", but whenever I hear "convention" I'm wary that it might be code for "cause that's how everyone else does it" :p – Jeffrey Martinez Dec 03 '13 at 01:04
  • @JeffreyMartinez The guy behind that website is heavily respected in the bash IRC community (more than anyone I've seen). But I'm no expert and all I can tell you is $0 is across the board normally the name of the program. Run `bash -c 'echo "$0"'` with out arguments and you will see 0 gives the name of the program 'bash' , if your argument returned nothing you will still get the string 'bash' in the program why would you possibly want that? – sabgenton Dec 03 '13 at 01:44
  • You're right about $0 being 'bash' if I don't have an argument. But this is using 'find', which always supplies an argument (the filename), which, if there's an argument, gets assigned to $0, not 'bash'. Here is what I see, straight from my terminal: $ bash -c 'echo "|$0|$1|"' ----> |bash|| ...... $ bash -c 'echo "|$0|$1|"' one ----> |one|| ....... $ bash -c 'echo "|$0|$1|"' one two ------> |one|two| – Jeffrey Martinez Dec 04 '13 at 17:56
  • Furthermore, and I'll admit I searched for hardly 2 minutes, but I can't find where it's documented that 'bash' will show up in $0 when no arguments are passed in, whereas it's very clearly documented that you can expect the first argument after the string to appear in $0. I also disagree that '_' is more readable since unless you already know it's intention (because the guru told you once upon a time), it's impossible to determine what it's for. – Jeffrey Martinez Dec 04 '13 at 18:04
  • Fair call though people get used to shell scripting using `./somefile bla` and when using a file for scripting bla is "$1". That's the norm :) – sabgenton Dec 05 '13 at 08:31
  • Warning for Windows users: [expand](http://technet.microsoft.com/en-us/library/cc722332%28v=ws.10%29.aspx) means something entirely different. – tomByrer Dec 07 '13 at 16:57
  • 2
    If anybody is having a 'unknown primary or operator' error from find, then here is the full command which will fix it: `find . -name '*.java' ! -type d -exec bash -c 'expand -t 4 "$0" > /tmp/e && mv /tmp/e "$0"' {} \;` – Doge Apr 04 '14 at 19:58
  • 1
    @Micro Thanks. I made the original post that worked, but people keep editing it, breaking it in various ways. Thanks for fixing it (again). – Gene Apr 04 '14 at 22:57
  • any idea how to do this on a windows machine using git-bash/msysgit ? Most Linux commands work for me. Getting "missing argument for '-exec'" on this one – isimmons Apr 13 '14 at 16:27
  • @isimmons That's the setup I used to test it on Windows. Works fine for me. Is `bash` in your path? If not, you could try `\bin\bash` in place of `bash` in the command line. – Gene Apr 15 '14 at 20:51
  • @Gene yes bash is in the path. I can type bash and get bash-3.1$ But, I was using cmder. When I tried it with git-bash it works. Maybe when not run through git-bash it is trying to use the windows expand command or a problem with single vs double quotes. Don't know but it works in git-bash. Thanks – isimmons Apr 20 '14 at 18:57
  • @isimmons As the post says, it's a bash command. It will only make sense to a bash shell. – Gene Apr 21 '14 at 01:03
  • Was having issues again because of some changes to my system PATH which led me to find out in cmder I can type 'sh' which takes me into a shell prompt via msysgit/bin/sh.exe and then this command works perfectly. Better than having to go open git-bash to run it. – isimmons Jul 05 '14 at 00:33
  • 4
    I thought this answer hadn't enough comments as it was, so this is mine: if you use `sponge` from https://joeyh.name/code/moreutils/, you can write `find . -name '*.py' ! -type d -exec bash -c 'expand -t 8 "$0" | sponge "$0"' {} \;` – tokland Oct 09 '14 at 09:40
  • 8
    Don't be stupid and use `find . -name '*'`, I just destroyed my local git repo – Gautam Mar 22 '15 at 03:18
  • 1
    Thank you, I used this for unexpand: find . -name "*.js" -exec bash -c 'unexpand -t 4 --first-only "$0" > /tmp/totabbuff && mv /tmp/totabbuff "$0"' {} \; – arkod Nov 04 '15 at 09:18
  • This doesn't work anymore. find: missing argument to `-exec' – ABCD Jul 08 '16 at 11:43
  • @Gene sorry for commenting here, but the answer you posted on my question was very good. I only edited the question because I was having trouble expressing precisely what I wanted, but your answer was definitely what I was looking for. – MaiaVictor Aug 28 '16 at 22:04
  • be advised: this will replace symlinks with the actual files. Replace `! -type d` with `-type f` if you don't want this. – orestisf Jan 12 '17 at 00:13
  • Worked for me with the following caveats; 1) On Mac had to use -type d @orestisf and 2) only worked with the name wildcard not in quotations. – Syntax Feb 20 '17 at 07:37
  • This is my version: `find . -name '*.js' ! -type d -exec bash -c 'expand -t 2 "$0" > /tmp/e && mv /tmp/e "$0"' {} \; ` – Pencilcheck Jun 27 '17 at 03:19
  • I added `alias tabs_to_spaces="echo 'this-url'; echo \"find . -name '*.java' ! -type d -exec bash -c 'expand -t 4 "'\"\$0\"'" > /tmp/e && mv /tmp/e "'\"\$0\"'"' {} \\;\""` to remind me. It prints out the command to run C-c C-v style. – Karl Jan 17 '18 at 02:03
  • I often use this one to avoid hidden folders like `.git`: `find . -not -path '*/.git/*' -name '*.cs' -type f -exec bash -c 'expand -i -t 4 "$0" > /tmp/e && mv /tmp/e "$0"' {} \;` – mja Apr 09 '18 at 13:00
  • This changes file permissions. I have fixed that with `chmod --reference`: https://stackoverflow.com/a/52136507/895245 – Ciro Santilli OurBigBook.com Sep 02 '18 at 11:55
  • 1
    @CiroSantilli新疆改造中心六四事件法轮功 This doesn't change permissions. It creates new files with whatever default permissions (`umask`) usually set in `bashrc`. – Gene Sep 02 '18 at 15:58
  • @Gene yes, and therefore it sometimes changes file permissions, which is likely not what people want, notably for executable script files. – Ciro Santilli OurBigBook.com Sep 02 '18 at 16:01
  • If that's a problem, you can change the final `mv` to `cp`. Of course you must have write permissions on the original file. – Gene Sep 02 '18 at 16:03
  • this doesn't work. the find command shows it is finding all of the files, but the expand command is only operating on like five of them, and i can't figure out why.. they're all `*.ts` files... – ChaseMoskal Dec 30 '19 at 23:24
  • @ChaseMoskal Try giving the `expand` and `mv` commands yourself on a file that wasn't transformed. Perhaps there's some permission or similar problem you'll be able to see. This has worked well for hundreds of people, so there must be something unique about your use case. – Gene Dec 31 '19 at 03:28
219

Try the command line tool expand.

expand -i -t 4 input | sponge output

where

  • -i is used to expand only leading tabs on each line;
  • -t 4 sets a tab stop every 4 columns, so each tab is padded with spaces up to the next stop (the default is every 8 columns).
  • sponge is from the moreutils package, and avoids clearing the input file. On macOS, the package moreutils is available via Homebrew (brew install moreutils) or MacPorts (sudo port install moreutils).

Finally, you can use gexpand on macOS, after installing coreutils with Homebrew (brew install coreutils) or MacPorts (sudo port install coreutils).
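
To make the effect of -i concrete, here is a rough Python model of what it does to a single line. It is a sketch under assumptions, not how expand is implemented: the function name `expand_initial` is invented for this example, and "leading" is taken to mean the run of spaces and tabs before the first other character.

```python
def expand_initial(line, tabsize=4):
    """Expand only the tabs in the leading whitespace of line."""
    # Find where the leading run of spaces and tabs ends.
    end = 0
    while end < len(line) and line[end] in " \t":
        end += 1
    # Tabs in the indentation are padded to tab stops; the rest is untouched.
    return line[:end].expandtabs(tabsize) + line[end:]


print(repr(expand_initial("\tx\ty")))  # '    x\ty'
print(repr(expand_initial("  \tx")))   # '    x'
```

The second call shows why tab stops matter: two spaces followed by a tab end up as four columns of indentation, not six.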

0 _
kev
  • 6
    It's one of [GNU_Core_Utilities](http://en.wikipedia.org/wiki/GNU_Core_Utilities) – kev Jun 19 '12 at 04:57
  • 2
    And for those systems that don't use the GNU Core Utilities, you have a decent chance of `expand` being installed since it is standardized by The Open Group's Single Unix Specification. See Issue 6, which is from 2001, though some updates were applied, hence the year of publication being 2004: [`expand`](http://pubs.opengroup.org/onlinepubs/009695399/utilities/expand.html) –  Jul 24 '13 at 22:12
  • 36
    You should pass `-i` to `expand` to only replace leading tabs on each line. This helps avoids replacing tabs that might be part of code. – Quolonel Questions Aug 08 '14 at 16:00
  • 3
    Can this be put inside a for loop? When I try that I get empty output files – ThorSummoner Feb 01 '15 at 02:11
  • 12
    how about for every single file in a directory recursively? – ahnbizcad Jun 10 '15 at 18:44
  • 4
    Every time I try to use this it blanks some (usually all) of the files. :\ – ThorSummoner Jun 23 '15 at 19:16
  • 5
    @ThorSummoner: if `input` is the same file as `output`, bash clobbers the content before even starting `expand`. This is how `>` works. – Robert Siemer Sep 16 '15 at 10:51
  • 3
    @ThorSummoner You should look into `sponge`, which is useful for taking stdout and redirecting it back to the original file. It works by saving all the output coming to its stdin, waiting until the pipe is done, and only then opening and writing the original file. It is part of the `moreutils` package (often not installed by default). – RaveTheTadpole Oct 07 '16 at 21:36
  • 4
    Note: You're creating a new file and the new file might have different permissions as the file you started with. I had some files with permission `0600`, after using `expand` the new file had the default permission of `0664`. Using `sponge` and creating a new file had the same effect. Using `sponge` and NOT creating a new file retained the original permissions. Example: `expand --tabs=4 input | sponge input`. Please note the use of `|` and not `>` in the `sponge` example. – dutoitns Dec 18 '16 at 04:22
  • 2
    expand -t 4 Foo | sponge Foo is the invocation I needed – Chris Hamons Jan 06 '17 at 20:26
  • @ahnbizcad I've added an answer below – olfek Jan 31 '17 at 18:35
  • Thanks, I had no idea about this utility. – nikhil Jul 12 '17 at 18:00
73

Warning: This will break your repo.

This will corrupt binary files, including those under svn, .git! Read the comments before using!

find . -iname '*.java' -type f -exec sed -i.orig 's/\t/ /g' {} +

The original file is saved as [filename].orig.

Replace '*.java' with the file ending of the file type you are looking for. This way you can prevent accidental corruption of binary files.

Downsides:

  • Will replace tabs everywhere in a file.
  • Will take a long time if you happen to have a 5GB SQL dump in this directory.
Martin Beckett
  • 12
    for visual space that are a mix of tabs and spaces, this approach give incorrect expansion. – pizza Jun 19 '12 at 07:32
  • 7
    I would also add a file matcher like for example for only .php files find ./ -iname "*.php" -type f -exec sed -i 's/\t/ /g' {} \; – Daniel Luca CleanUnicorn Mar 26 '13 at 10:04
  • 102
    DO NOT USE SED! If there's an embedded tab in a string, you may end up mangling your code. This is what [expand](http://man.cx/expand) command was meant to handle. Use `expand`. – David W. Nov 12 '13 at 17:11
  • 6
    @DavidW. I would simply update this command to only replace tabs from the beginning of the line. ```find ./ -type f -exec sed -i 's/^\t/####/g' {} \;```. But I wasn't aware of the expand command - very useful! – Martin Konecny May 07 '14 at 16:08
  • 4
    The answer's command just destroyed my local git repository. YMMV. – Martin T. Jun 17 '14 at 09:24
  • 31
    DO NOT USE! This answer also just wrecked my local git repository. If you have files containing mixed tabs and spaces it will insert sequences of #'s. Use the answer by Gene or the comment by Doge below instead. – puppet Aug 18 '14 at 13:06
  • 2
    I don't know why this would kill your local repository, it didn't do that for me. The `#` characters might need to be replaced by actual spaces, I assume that's what the "The # are spaces" in the answer meant. But the `^` does not help: you end up replacing just the first tab, subsequent tabs will not be replaced, i.e. useless! – Sander Verhagen Nov 06 '14 at 20:11
  • 2
    Obviously, the amount of space that a tab expands to depends on the context. Thus, `sed` is a completely inappropriate tool for the task. – Sven Mar 30 '15 at 20:13
  • Yes, do not use. It messed up my files. Use the command from Gene. – user1097111 Mar 15 '16 at 22:06
  • See my answer below for a safer alternative. http://stackoverflow.com/a/41609013/1924979 – Harsh Vakharia Jan 12 '17 at 09:09
  • -1 because (as @Sven pointed it out) it apparently disregards the tab size, which means this can only work correctly by *accidentally* matching the desired tab size, and will mess up the indentation (and any other sort of tabbed positioning) in every other case. – Sz. Mar 14 '18 at 20:22
  • This is a pretty bad answer. It should probably be removed. – Dakkaron Dec 05 '19 at 12:49
  • Jesus christ people, this answer has been edited into a complete incoherent mess. The warnings about it breaking the repo are now all false - they are left over from back when the answer's `find` command did not filter by file extension, and thus hit every file indiscriminately. – mtraceur Feb 25 '21 at 22:32
46

Collecting the best comments from Gene's answer, the best solution by far is to use sponge from moreutils.

sudo apt-get install moreutils
# The complete one-liner:
find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;

Explanation:

  • ./ is recursively searching from current directory
  • -iname is a case insensitive match (for both *.java and *.JAVA likes)
  • -type f finds only regular files (no directories or symlinks); it does not filter out binary files
  • -exec bash -c runs the quoted command in a new bash process for each file name, {}
  • expand -t 4 expands all tabs, using a tab stop every 4 columns
  • sponge soaks up standard input (from expand) and writes it to a file (the same one)*.

NOTE: * A simple file redirection (> "$0") won't work here because it would overwrite the file too soon.

Advantage: All original file permissions are retained and no intermediate tmp files are used.
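
The reason sponge works where > fails can be modelled in a few lines of Python. This is a rough sketch of the idea only, not sponge's actual implementation, and the function below simply borrows its name for illustration.

```python
def sponge(chunks, path):
    """Consume every chunk first, then overwrite path with the result."""
    data = "".join(chunks)        # soak up all input before touching the target
    with open(path, "w") as out:  # only now is the target truncated
        out.write(data)
```

A caller can feed it a lazy generator that is still reading from the very file being rewritten, for example `sponge((line.expandtabs(4) for line in src), "Foo.java")` with `src` being an open handle on `Foo.java` (a made-up file name). Because the existing file is reopened rather than replaced, its permission bits stay as they were.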

not2qubit
22

Use backslash-escaped sed.

On linux:

  • Replace all tabs with 1 hyphen inplace, in all *.txt files:

    sed -i $'s/\t/-/g' *.txt
    
  • Replace all tabs with 1 space inplace, in all *.txt files:

    sed -i $'s/\t/ /g' *.txt
    
  • Replace all tabs with 4 spaces inplace, in all *.txt files:

    sed -i $'s/\t/    /g' *.txt
    

On a mac:

  • Replace all tabs with 4 spaces inplace, in all *.txt files:

    sed -i '' $'s/\t/    /g' *.txt
    
e9t
10

You can use the generally available pr command (man page here). For example, to convert tabs to four spaces, do this:

pr -t -e4 file > file.expanded
  • -t suppresses headers
  • -enum sets a tab stop every num columns and expands tabs to spaces accordingly (the number follows -e directly)

To convert all files in a directory tree recursively, while skipping binary files:

#!/bin/bash
num=4
shopt -s globstar nullglob
for f in **/*; do
  [[ -f "$f" ]]   || continue # skip if not a regular file
  grep -qI . "$f" || continue # skip binary (and empty) files
  pr -t -e"$num" "$f" > "$f.expanded.$$" && mv "$f.expanded.$$" "$f"
done

The logic for skipping binary files is from this post.

NOTE:

  1. Doing this could be dangerous in a git or svn repo
  2. This is not the right solution if you have code files that have bare tabs embedded in string literals
codeforester
6

My recommendation is to use:

find . -name '*.lua' -exec ex '+%s/\t/  /g' -cwq {} \;

Comments:

  1. Use in place editing. Keep backups in a VCS. No need to produce *.orig files. It's good practice to diff the result against your last commit to make sure this worked as expected, in any case.
  2. sed is a stream editor. Use ex for in place editing. This avoids creating extra temp files and spawning shells for each replacement as in the top answer.
  3. WARNING: This messes with all tabs, not only those used for indentation. Also it does not do context aware replacement of tabs. This was sufficient for my use case. But might not be acceptable for you.
  4. EDIT: An earlier version of this answer used find|xargs instead of find -exec. As pointed out by @gniourf-gniourf this leads to problems with spaces, quotes and control chars in file names cf. Wheeler.
Heinrich Hartmann
  • `ex` might not be available on every Unix system. Substituting it with `vi -e` might work on more machines. Also, your regex replaces any number of starting tab characters with two spaces. Replace the regex with `+%s/\t/ /g` to not destroy multi-level indentation. However, this also affects tab characters that are not used for indentation. – Lukas Schmelzeisen Jun 14 '16 at 11:43
  • ex is part of POSIX [1] so should be available. Good point about multi-level indentation. I had actually used the `/\t/ /` variant on my files, but opted for `/\t\+//` to not break non-indenting tabs. Missed the issues with multi-indentation! Updating the Answer. [1] http://man7.org/linux/man-pages/man1/ex.1p.html#SEE%C2%A0ALSO – Heinrich Hartmann Jun 14 '16 at 13:24
  • 2
    Using `xargs` in this way is useless, inefficient and broken (think of filenames containing spaces or quotes). Why don't you use `find`'s `-exec` switch instead? – gniourf_gniourf Jun 14 '16 at 13:33
  • I'd argue that filenames with spaces and quotes are broken ; ) If you need to support that I'd opt for: `-print0` options to find / xargs. I like xargs over `-exec` since: a) Separation of concerns b) it can be swapped with GNU parallel more easily. – Heinrich Hartmann Jun 14 '16 at 13:43
  • Updated adding @gniourf_gniourf comments. – Heinrich Hartmann Jun 14 '16 at 16:19
  • Who said that filenames with spaces are broken? we're not on Windows, we're on Linux. A filename can be any valid non-empty C string that doesn't contain `/` (since I mentioned C string, the nil byte is of course forbidden). The robust, portable and efficient way is to use `-exec`: `find . -name '*.lua' -exec ex '+%s/\t/ /g' -cwq {} \;`. `xargs` should only be used in very very specific situations (that I yet have to discover). – gniourf_gniourf Jun 14 '16 at 16:22
  • Wheeler has elaborated the UNIX file name problem quite eloquently: http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html Allowing arbitrary C-strings was probably not the best design decision in the first place. When I control the names, I try to avoid whitespace and control characters. But again, you clearly have a point. I added a comment to the answer: `-print0/-0` is also correct, and IMHO cleaner. – Heinrich Hartmann Jun 14 '16 at 17:12
  • Thought about it again. Adopted -exec variant. Thx for the comments! – Heinrich Hartmann Jun 15 '16 at 09:30
6

You can use find with tabs-to-spaces package for this.

First, install tabs-to-spaces

npm install -g tabs-to-spaces

then, run this command from the root directory of your project;

find . -name '*' -exec t2s --spaces 2 {} \;

This will replace every tab character with 2 spaces in every file.

Harsh Vakharia
5

I like the "find" example above for the recursive application. To adapt it to be non-recursive, changing only files in the current directory that match a wildcard, shell glob expansion can be sufficient for a small number of files:

ls *.java | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh -v

If you want it silent after you trust that it works, just drop the -v on the sh command at the end.

Of course you can pick any set of files in the first command. For example, list only a particular subdirectory (or directories) in a controlled manner like this:

ls mod/*/*.php | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh

Or in turn run find(1) with some combination of depth parameters etc:

find mod/ -name '*.php' -mindepth 1 -maxdepth 2 | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh
Josip Rodin
drchuck
  • 1
    Shell globbing will break sooner or later, because the total amount of filenames can only be of `ARG_MAX` length. This is 128k on Linux systems, but I've encountered this limit enough times to not rely on shell globbing. – Martin Tournoij Aug 12 '15 at 14:14
  • 1
    You don't really need to adapt them. `find` can be told `-maxdepth 1`, and it only processes the entries of the directory being modified, not the whole tree. – ShadowRanger Oct 29 '15 at 23:24
5

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

This is usually not what you want.

Do you want to do this for png images? PDF files? The .git directory? Your Makefile (which requires tabs)? A 5GB SQL dump?

You could, in theory, pass a whole lot of exclude options to find or whatever else you're using; but this is fragile, and will break as soon as you add other binary files.

What you want, is at least:

  1. Skip files over a certain size.
  2. Detect if a file is binary by checking for the presence of a NULL byte.
  3. Only replace tabs at the start of a line (expand -i does this, sed doesn't).

As far as I know, there is no "standard" Unix utility that can do this, and it's not very easy to do with a shell one-liner, so a script is needed.

A while ago I created a little script called sanitize_files which does exactly that. It also fixes some other common stuff like replacing \r\n with \n, adding a trailing \n, etc.

You can find a simplified script without the extra features and command-line arguments below, but I recommend you use the above script as it's more likely to receive bugfixes and other updates than this post.

I would also like to point out, in response to some of the other answers here, that using shell globbing is not a robust way of doing this, because sooner or later you'll end up with more files than will fit in ARG_MAX (on modern Linux systems it's 128k, which may seem a lot, but sooner or later it's not enough).


#!/usr/bin/env python
#
# http://code.arp242.net/sanitize_files
#

import os, re, sys


def is_binary(data):
    return data.find(b'\000') >= 0


def should_ignore(path):
    keep = [
        # VCS systems
        '.git/', '.hg/', '.svn/', 'CVS/',

        # These files have significant whitespace/tabs, and cannot be edited
        # safely
        # TODO: there are probably more of these files..
        'Makefile', 'BSDmakefile', 'GNUmakefile', 'Gemfile.lock'
    ]

    for k in keep:
        if '/%s' % k in path:
            return True
    return False


def run(files):
    indent_width = 4  # number of spaces that replaces each leading tab
    indent_find = b'\t'
    indent_replace = b' ' * indent_width

    for f in files:
        if should_ignore(f):
            print('Ignoring %s' % f)
            continue

        try:
            size = os.stat(f).st_size
        # Unresolvable symlink, just ignore those
        except FileNotFoundError as exc:
            print('%s is unresolvable, skipping (%s)' % (f, exc))
            continue

        if size == 0: continue
        if size > 1024 ** 2:
            print("Skipping `%s' because it's over 1MiB" % f)
            continue

        try:
            data = open(f, 'rb').read()
        except (OSError, PermissionError) as exc:
            print("Error: Unable to read `%s': %s" % (f, exc))
            continue

        if is_binary(data):
            print("Skipping `%s' because it looks binary" % f)
            continue

        data = data.split(b'\n')

        fixed_indent = False
        for i, line in enumerate(data):
            # Fix indentation
            repl_count = 0
            while line.startswith(indent_find):
                fixed_indent = True
                repl_count += 1
                line = line.replace(indent_find, b'', 1)

            if repl_count > 0:
                line = indent_replace * repl_count + line
                # Store the fixed line; otherwise the change is lost
                data[i] = line

        data = list(filter(lambda x: x is not None, data))

        try:
            open(f, 'wb').write(b'\n'.join(data))
        except (OSError, PermissionError) as exc:
            print("Error: Unable to write to `%s': %s" % (f, exc))


if __name__ == '__main__':
    allfiles = []
    for root, dirs, files in os.walk(os.getcwd()):
        for f in files:
            p = '%s/%s' % (root, f)
            allfiles.append(p)

    run(allfiles)
Martin Tournoij
4

I used astyle to re-indent all my C/C++ code after finding mixed tabs and spaces. It also has options to force a particular brace style if you'd like.

Theo Belaire
4

One can use vim for that:

find -type f \( -name '*.css' -o -name '*.html' -o -name '*.js' -o -name '*.php' \) -execdir vim -c retab -c wq {} \;

As Carpetsmoker stated, it will retab according to your vim settings. And modelines in the files, if any. Also, it will replace tabs not only at the beginning of the lines. Which is not what you generally want. E.g., you might have literals, containing tabs.

x-yuri
  • `:retab` will change all the tabs in a file, not those at the start. it also depends on what your `:tabstop` and `:expandtab` settings are in the vimrc or modeline, so this may not work at all. – Martin Tournoij Aug 12 '15 at 14:17
  • @Carpetsmoker Good point about tabs at the start of the lines. Does any of the solutions here handles this case? As for the `tabstop` and `expandtab` settings, it will work out if you're using `vim`. Unless you have mode lines in the files. – x-yuri Aug 12 '15 at 17:13
  • @x-yuri good question, but generally moot. Most people use \t not actual tabs in literals. – Ricardo Magalhães Cruz Dec 04 '15 at 17:02
4

To convert all Java files recursively in a directory to use 4 spaces instead of a tab:

find . -type f -name '*.java' -exec bash -c 'expand -t 4 {} > /tmp/stuff;mv /tmp/stuff {}' \;
Raffi Khatchadourian
  • How is this answer different from [this](http://stackoverflow.com/a/11094620/1275169) which was posted 4 years ago? – P.P Jun 29 '16 at 17:06
  • 2
    So does your answer. In fact, this is an inferior version of Gene's answer: 1) Gene's answer take care of directories with same name. 2) It doesn't *move* if expand failed. – P.P Jun 30 '16 at 07:52
4

Git repository friendly method

git-tab-to-space() (
  d="$(mktemp -d)"
  git grep --cached -Il '' | grep -E "${1:-.}" | \
    xargs -I'{}' bash -c '\
    f="${1}/f" \
    && expand -t 4 "$0" > "$f" && \
    chmod --reference="$0" "$f" && \
    mv "$f" "$0"' \
    '{}' "$d" \
  ;
  rmdir "$d"
)

Act on all files under the current directory:

git-tab-to-space

Act only on C or C++ files:

git-tab-to-space '\.(c|h)(|pp)$'

You likely want this notably because of those annoying Makefiles which require tabs.

The command git grep --cached -Il '':

  • lists only the tracked files, so nothing inside .git
  • excludes directories, binary files (would be corrupted), and symlinks (would be converted to regular files)

as explained at: How to list all text (non-binary) files in a git repository?

chmod --reference keeps the file permissions unchanged: https://unix.stackexchange.com/questions/20645/clone-ownership-and-permissions-from-another-file Unfortunately I can't find a succinct POSIX alternative.
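
Outside the shell, the same "write a temp file, copy the mode, then move" sequence is short. The following Python sketch is only an illustration of that pattern: the function name `replace_keeping_mode` is hypothetical, and it copies permission bits only, not ownership or timestamps.

```python
import os
import shutil
import tempfile


def replace_keeping_mode(path, new_text):
    """Replace the contents of path with new_text, keeping its permission bits."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename
    # never has to cross a filesystem boundary.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as fh:
            fh.write(new_text)
        shutil.copymode(path, tmp)  # same role as chmod --reference="$0" "$f"
        os.replace(tmp, path)       # same role as mv "$f" "$0"
    except BaseException:
        os.unlink(tmp)              # do not leave a stray temp file behind
        raise
```

An executable script converted this way stays executable, which is the case the `chmod --reference` step above is there to protect.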

If your codebase had the crazy idea to allow functional raw tabs in strings, use:

expand -i

and then have fun going over all non start of line tabs one by one, which you can list with: Is it possible to git grep for tabs?

Tested on Ubuntu 18.04.

Ciro Santilli OurBigBook.com
4

Nobody mentioned rpl? Using rpl you can replace any string. To convert tabs to spaces,

rpl -R -e "\t" "    "  .

very simple.

PeopleMoutainPeopleSea
  • 1
    This corrupted all binary files in my repo. – Aaron Franke Nov 06 '19 at 20:14
  • 1
    An excellent command, but potentially dangerous with the recursive and all files in folder option as specified above. I would add the --dry-run option "just in case" to make sure you are sitting in the right folder. – MortimerCat Jan 06 '20 at 10:59
3

Download and run the following script to recursively convert hard tabs to soft tabs in plain text files.

Execute the script from inside the folder which contains the plain text files.

#!/bin/bash

find . -type f -and -not -path './.git/*' -exec grep -Iq . {} \; -and -print | while read -r file; do {
    echo "Converting... $file";
    data=$(expand --initial -t 4 "$file");
    rm "$file";
    echo "$data" > "$file";
}; done;
olfek
2

The use of expand as suggested in other answers seems the most logical approach for this task alone.

That said, it can also be done with Bash and Awk in case you may want to do some other modifications along with it.

If using Bash 4.0 or greater, the shopt builtin globstar can be used to search recursively with **.

With GNU Awk version 4.1 or greater, sed like "inplace" file modifications can be made:

shopt -s globstar
gawk -i inplace '{gsub("\t","    ")}1' **/*.ext

In case you want to set the number of spaces per tab:

gawk -i inplace -v n=4 'BEGIN{for(i=1;i<=n;i++) c=c" "}{gsub("\t",c)}1' **/*.ext
John B
-1

Converting tabs to spaces in just ".lua" files [tabs -> 2 spaces]

find . -iname "*.lua" -exec sed -i "s#\t#  #g" '{}' \;
Makah
  • Obviously, the amount of space that a tab expands to depends on the context. Thus, sed is a completely inappropriate tool for the task. – Sven Mar 30 '15 at 20:15
  • ?? @Sven, my sed command does the same thing that expand command does (`expand -t 4 input >output`) – Makah Mar 31 '15 at 19:32
  • 3
    Of course not. `expand -t 4` will expand the tab in `a\tb` to 3 spaces and the tab in `aa\tb` to 2 spaces, just as it should be. `expand` takes the context of a tab into account, `sed` does not and will replace the tab with the number of spaces you specify, regardless of the context. – Sven Mar 31 '15 at 20:43
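
The arithmetic in that last comment is easy to check. The sketch below uses Python's `str.expandtabs` as a stand-in for `expand -t` and a plain string replace as a stand-in for the sed substitution; the two function names are made up for this comparison.

```python
def tab_aware(text, tabsize=4):
    # Pad each tab up to the next multiple of tabsize (tab-stop behaviour).
    return text.expandtabs(tabsize)


def fixed_width(text, tabsize=4):
    # Replace every tab with exactly tabsize spaces, whatever the column.
    return text.replace("\t", " " * tabsize)


print(repr(tab_aware("a\tb")))     # 'a   b'   (3 spaces)
print(repr(tab_aware("aa\tb")))    # 'aa  b'   (2 spaces)
print(repr(fixed_width("a\tb")))   # 'a    b'  (always 4 spaces)
print(repr(fixed_width("aa\tb")))  # 'aa    b' (always 4 spaces)
```

With the tab-aware version both `b`s land in column 4, so columns that were aligned with tabs stay aligned. With the fixed-width version they drift apart.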
-1

Use the vim-way:

$ ex +'bufdo retab' -cxa **/*.*
  • Make a backup before executing the above command, as it can corrupt your binary files.
  • To use globstar (**) for recursion, activate by shopt -s globstar.
  • To specify specific file type, use for example: **/*.c.

To modify tabstop, add +'set ts=2'.

However the down-side is that it can replace tabs inside the strings.

So for slightly better solution (by using substitution), try:

$ ex -s +'bufdo %s/^\t\+/  /ge' -cxa **/*.*

Or by using ex editor + expand utility:

$ ex -s +'bufdo!%!expand -t2' -cxa **/*.*

For trailing spaces, see: How to remove trailing whitespaces for multiple files?


You may add the following function into your .bash_profile:

# Convert tabs to spaces.
# Usage: retab *.*
# See: https://stackoverflow.com/q/11094383/55075
retab() {
  ex +'set ts=2' +'bufdo retab' -cxa "$@"
}
kenorb
  • I downvoted many answers in this thread, not just yours ;-) Reasons are: [`:retab` may not work at all](http://stackoverflow.com/questions/11094383/how-can-i-convert-tabs-to-spaces-in-every-file-of-a-directory/31967701#comment51842782_26839565), [shell globbing is a bad solution for this sort of thing](http://stackoverflow.com/questions/11094383/how-can-i-convert-tabs-to-spaces-in-every-file-of-a-directory/31967701#comment51842591_23369021), your `:s` command will replace *any* amount of tabs with 2 spaces (which you almost never want), starting ex just to run an `:!expand` process is silly... – Martin Tournoij Aug 12 '15 at 14:22
  • ...and all your solutions will clobber binary files and such (like .png files, .pdf files, etc.) – Martin Tournoij Aug 12 '15 at 14:22
  • This is frankly a horrible suggestion for documentation - one has to be intimately acquainted with a number of fairly opaque syntax and semantic issues of several programs to be able to comprehend this. – Josip Rodin May 22 '16 at 10:46