140

How can I remove trailing whitespace from every file in a project? I want to start at a root directory and strip the trailing whitespace from all files in all folders below it.

I also want to modify the files in place, not just print the result to stdout.

l0b0
iamjwc

15 Answers

90

Here is an OS X >= 10.6 Snow Leopard solution.

It ignores .git and .svn folders and their contents, and it won't leave a backup file.

(export LANG=C LC_CTYPE=C
find . -not \( -name .svn -prune -o -name .git -prune \) -type f -print0 | perl -0ne 'print if -T' | xargs -0 sed -i '' -E 's/[[:blank:]]+$//'
)

The enclosing parentheses run the commands in a subshell, so the L* variables of the current shell are left unchanged.
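Before rewriting anything, it can help to preview which files would be touched. The following is only a sketch: `list_trailing_ws` is a hypothetical helper name, and it assumes a grep that supports `-I` for skipping binary files (GNU and BSD grep both do, but it is not required by POSIX).

```shell
# Hypothetical helper: list (do not modify) text files under the given
# directory that contain trailing spaces or tabs, skipping .git and .svn.
list_trailing_ws() {
  find "${1:-.}" \( -name .git -o -name .svn \) -prune -o \
    -type f -exec grep -lI '[[:blank:]]$' {} +
}
```

Running `list_trailing_ws .` prints one path per line and changes nothing, so the list can be reviewed before running the sed command.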

Hans Ginzel
deepwell
  • 2
    How can I exclude binary files, such as .jpg, .jar, .png, etc, without having to list each file type specifically? – i_am_jorf Oct 23 '11 at 22:23
  • 2
    Maybe something like this is slightly better (but can be optimized): `find . -not \( -name .svn -prune -o -name .git -prune \) -type f -print0 | xargs -0 file -In | grep -v binary | cut -d ":" -f1 | xargs -0 sed -i '' -E "s/[[:space:]]*$//"` (uses `file` and `grep` to filter out binary files) – Noyo Jan 31 '12 at 19:38
  • 11
    You can make it faster by using `\+` instead of `*` in the replacement string - Otherwise it matches on every single line. – l0b0 Mar 05 '12 at 13:20
  • 11
    You could use [[:blank:]] to remove both tabs and spaces. – Leif Gruenwoldt Mar 13 '12 at 15:32
  • This does not exclude .DS_Store files, or at least in my case – Jonathan Lin Jan 10 '13 at 06:07
  • 21
    In Mountain Lion this returns `sed: RE error: illegal byte sequence` for me. – Bryson Feb 01 '13 at 02:02
  • 13
    For those of you having issues with "illegal byte sequence": Enter `export LANG=C` and try again – Georg Ledermann Mar 07 '13 at 08:53
  • 3
    In OS X 10.9 I also needed `export LC_CTYPE=C ` as found here: http://stackoverflow.com/questions/19242275/sed-re-error-illegal-byte-sequence-on-mac-os-x – kissgyorgy Feb 20 '14 at 00:32
  • 2
    This also seems to append an empty new line at EOF. I only want it to trim trailing space. Is this possible? –  Sep 16 '14 at 13:23
  • 2
    Without using xargs (Tested on GNU Linux): `find . -not \( -name .svn -prune -o -name .git -prune \) -type f -exec sed -i "s/[[:space:]]*$//g" "{}" \;` – jpoppe May 07 '15 at 23:26
  • What if you want to ignore `.jpg`, `.png`, `.ttf` too? – Kirk Strobeck Aug 30 '15 at 19:56
  • 1
    @l0b0 just `+`, not `\+` – bschlueter Sep 15 '15 at 20:16
  • @KirkStrobeck Just add them to the `-not` section: `find . -not \( -name '*.jpg' -prune -o -name '*.png' -prune -o -name '*.ttf' -prune \) -type f -print0 | xargs -0 sed -i '' -E "s/[[:blank:]]+$//"` – bschlueter Sep 15 '15 at 20:19
  • 1
    need to export LC_ALL=C on mac. export LANG=C still does not work for me. – jichi May 30 '16 at 09:54
40

Use:

find . -type f -print0 | xargs -0 perl -pi.bak -e 's/ +$//'

If you don't want the ".bak" files generated:

find . -type f -print0 | xargs -0 perl -pi -e 's/ +$//'

As a zsh user, you can omit the call to find and instead use:

perl -pi -e 's/ +$//' **/*

Note: To avoid corrupting the .git directory, add -not -iwholename '*.git*' to the find command.

kenorb
Sec
  • 48
    Don't try this in a git repo, as it can corrupt git's internal storage. – mgold Sep 13 '14 at 16:07
  • 13
    @mgold Too late, grrr;/ – kenorb Apr 17 '15 at 22:21
  • 4
    To clarify, it's alright to run this inside a subfolder of a git repo, just not inside any folders that contain git repo(s) as descendants, i.e. not inside any folders that have `.git` directories, no matter how deeply nested. – Illya Moskvin Nov 23 '16 at 17:42
  • 2
    Combining this answer with @deepwell's to avoid git/svn issues `find . -not \( -name .svn -prune -o -name .git -prune \) -type f -print0 | xargs -0 perl -pi -e 's/ +$//'` – William Denniss Aug 12 '17 at 02:47
  • 1
    There's probably a better way, but I recovered from mangling a git repo with this by cloning out the repo in a separate folder and then doing `rsync -rv --exclude=.git repo/ repo2/` after which the local changes in `repo` were also in the (undamaged) `repo2`. – MatrixManAtYrService Sep 18 '18 at 19:08
  • I accidentally ran this but it just messed with my `.git/index` which you can normally fix using https://stackoverflow.com/a/47109640/1507124 – CervEd Apr 27 '21 at 06:33
36

Two alternative approaches which also work with DOS newlines (CR/LF) and do a pretty good job at avoiding binary files:

Generic solution which checks that the MIME type starts with text/:

while IFS= read -r -d '' -u 9
do
    if [[ "$(file -bs --mime-type -- "$REPLY")" = text/* ]]
    then
        sed -i 's/[ \t]\+\(\r\?\)$/\1/' -- "$REPLY"
    else
        echo "Skipping $REPLY" >&2
    fi
done 9< <(find . -type f -print0)

Git repository-specific solution by Mat which uses the -I option of git grep to skip files which Git considers to be binary:

git grep -I --name-only -z -e '' | xargs -0 sed -i 's/[ \t]\+\(\r\?\)$/\1/'
Community
l0b0
  • 3
    So I really like this git solution. It should really be on the top. I don't want to save carriage returns though. But I prefer this to the one I combined in 2010. – odinho - Velmont Nov 30 '12 at 16:10
  • My git complains that the -e expression is empty, but it works great using -e '.*' – muirbot Jul 30 '14 at 20:39
  • @okor In GNU `sed` the suffix option to `-i` is *optional*, but in [BSD `sed`](http://www.freebsd.org/cgi/man.cgi?query=sed) it's not. It's strictly speaking not necessary here anyway, so I'll just remove it. – l0b0 Oct 30 '14 at 17:50
28

In Bash:

find dir -type f -exec sed -i 's/ *$//' '{}' ';'

Note: If the directory is a Git repository, exclude its metadata by adding: -not -iwholename '*/.git/*'.

Drew Noakes
Adam Rosenfield
  • This generates errors like this for every file found. sed: 1: "dir/file.txt": command a expects \ followed by text – iamjwc Sep 29 '08 at 15:10
  • Replacing ';' with \; should work. (Also quotes around {} are not strictly needed). – agnul Sep 29 '08 at 15:20
  • 4
    To remove all whitespace not just spaces you should replace the space character with [:space:] in your sed regular expression. – WMR Sep 30 '08 at 13:17
  • Another side note: This only works with sed versions >= 4, smaller versions do not support in place editing. – WMR Sep 30 '08 at 13:18
  • This is a faster and safer variant: find dir -type f -print0 | xargs -r0 sed -i 's/ *$//' – pixelbeat Sep 30 '08 at 23:52
  • 3
    This broke my git :( – CrabMan Feb 09 '17 at 12:24
  • You almost certainly don't want to include hidden folders (i.e. temp stuff of all kinds, .git, .svn., .vscode, .idea), and neither node modules or similar 3rd party package-folders... `find . -type f -name "*.scss" -regextype posix-extended -not -regex ".*\/(\.|node_modules).*"`. If, _*and only if*_ the output looks fully ok to you, _*then*_ attach the `-exec` part... (folder backups beforehand never hurt, naturally) – Frank N Apr 04 '17 at 15:41
14

Ack was made for this kind of task.

It works just like grep, but knows not to descend into places like .svn, .git, .cvs, etc.

ack --print0 -l '[ \t]+$' | xargs -0 -n1 perl -pi -e 's/[ \t]+$//'

Much easier than jumping through hoops with find/grep.

Ack is available via most package managers (as either ack or ack-grep).

It's just a Perl program, so it's also available in a single-file version that you can just download and run. See: Ack Install

jbbuckley
14

This worked for me in OSX 10.5 Leopard, which does not use GNU sed or xargs.

find dir -type f -print0 | xargs -0 sed -i.bak -E "s/[[:space:]]*$//"

Just be careful with this if you have files that need to be excluded (I did)!

You can use -prune to ignore certain directories or files. For Python files in a git repository, you could use something like:

find dir -name .git -prune -o -type f -iname '*.py' -print
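The differences between GNU and BSD `sed -i` can be sidestepped by writing to a temporary file instead. This is a minimal sketch, not part of the original answer: `strip_trailing_ws` is a hypothetical helper, and it assumes `mktemp` is available.

```shell
# Hypothetical helper: strip trailing whitespace from each file given as an
# argument without relying on sed -i, whose syntax differs between GNU and BSD.
strip_trailing_ws() {
  for f in "$@"; do
    tmp=$(mktemp) || return 1
    # Write the cleaned text to a temp file, then copy it back so the
    # original file keeps its permissions.
    if sed 's/[[:space:]]*$//' "$f" > "$tmp"; then
      cat "$tmp" > "$f"
    fi
    rm -f "$tmp"
  done
}
```

Called as `strip_trailing_ws dir/*.py`, it rewrites each named file in place and leaves no backup files behind.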
Ondra Žižka
pojo
  • Any chance you could clarify this? I'd like a command that will remove trailing whitespace from all files in a directory recursively, while ignoring the ".git" directory. I can't quite follow your example... – Trevor Turk Feb 12 '10 at 00:39
  • If you're using tcsh you'll need to change the double quotes to single quotes. Otherwise, you'll get an "Illegal variable name." error. – Brandon Fosdick May 29 '11 at 01:02
  • GNU sed is similar but you do -i.bak or --in-place=.bak, ending up with a full command of `find dir -not -path '.git' -iname '*.py' -print0 | xargs -0 sed --in-place=.bak 's/[[:space:]]*$//'`. Replace `dir` with the directory in question as the top-level to recurse from. – David Gardner Jun 21 '11 at 12:55
  • `sed -i .bak` ? Shouldn't it be `sed -i.bak` (without the space)? – Ondra Žižka Nov 02 '16 at 02:21
9

ex

Try using Ex editor (part of Vim):

$ ex +'bufdo!%s/\s\+$//e' -cxa **/*.*

Note: The recursive glob (**/*.*) requires Bash 4 or later, or zsh. In Bash, enable it with shopt -s globstar.

You may add the following function into your .bash_profile:

# Strip trailing whitespace from the given files.
# Usage: trim *.*
# See: https://stackoverflow.com/q/10711051/55075
trim() {
  ex +'bufdo!%s/\s\+$//e' -cxa "$@"
}

sed

For using sed, check: How to remove trailing whitespaces with sed?

find

The following script (e.g. remove_trail_spaces.sh) removes trailing whitespace from files recursively:

#!/usr/bin/env bash
# Script to remove trailing whitespace of all files recursively.
# Bash is required because $OSTYPE is set by Bash, not by plain POSIX sh.
# See: https://stackoverflow.com/questions/149057/how-to-remove-trailing-whitespace-of-all-files-recursively

case "$OSTYPE" in
  darwin*) # OSX 10.5 Leopard, which does not use GNU sed or xargs.
    find . -type f -not -iwholename '*.git*' -print0  | xargs -0 sed -i .bak -E "s/[[:space:]]*$//"
    find . -type f -name \*.bak -print0 | xargs -0 rm -v
    ;;
  *)
    find . -type f -not -iwholename '*.git*' -print0 | xargs -0 perl -pi -e 's/ +$//'
esac

Run this script from the directory you want to scan. On OS X, it finishes by deleting all files ending in .bak.

Or just:

find . -type f -name "*.java" -exec perl -p -i -e "s/[ \t]$//g" {} \;

which is the way recommended by the Spring Framework Code Style.

vgoff
kenorb
  • `find . -type f -name "*.java" -exec perl -p -i -e "s/[ \t]$//g" {} \;` only removes one trailing space instead of all. – Kalle Richter Jul 29 '18 at 15:08
6

I ended up not using find and not creating backup files.

sed -i '' 's/[[:space:]]*$//g' **/*.*

Depending on the depth of the file tree, this (shorter version) may be sufficient for your needs.

Note: this also processes binary files.

Jesper Rønn-Jensen
  • For specific files: find . -name '*.rb' | xargs -I{} sed -i '' 's/[[:space:]]*$//g' {} – Gautam Rege Oct 16 '13 at 12:55
  • You don't need the '' parameter for sed; or I might be missing something. I tried it on all files in a given directory, like this: sed -i 's/[[:space:]]*$//g' util/*.m – Mircea Jan 10 '18 at 16:33
6

Instead of excluding files, here is a variation of the above that explicitly whitelists the files you want to strip, based on file extension. Feel free to season to taste:

find . \( -name '*.rb' -or -name '*.html' -or -name '*.js' -or -name '*.coffee' -or \
-name '*.css' -or -name '*.scss' -or -name '*.erb' -or -name '*.yml' -or -name '*.ru' \) \
-print0 | xargs -0 sed -i '' -E "s/[[:space:]]*$//"
ChicagoBob
5

1) Many other answers use -E. I am not sure why, as in GNU sed that's an undocumented BSD compatibility option. -r should be used instead.

2) Other answers use -i ''. With GNU sed that should be just -i (or -i'' if preferred), because -i takes the suffix immediately after it, with no space.

3) Git specific solution:

git config --global alias.check-whitespace \
'!git diff-tree --check $(git hash-object -t tree /dev/null) HEAD'

git check-whitespace | grep trailing | cut -d: -f1 | sort -u | tr '\n' '\0' | xargs -0 sed --in-place -r -e 's/[ \t]+$//'

The first one registers a git alias check-whitespace which lists the files with trailing whitespaces. The second one runs sed on them.

I only use [ \t] rather than [[:space:]], as I don't typically see vertical tabs, form feeds and non-breaking spaces. Your mileage may vary.

Community
Ondra Žižka
  • For the first I get `expansion of alias 'check-whitespace' failed; 'git' is not a git command sed: no input files`, and then if I remove the "extra git", I get `fatal: ambiguous argument '$(git': unknown revision or path not in the working tree.` – eri0o Apr 19 '23 at 00:34
5

I ended up running this, which is a mix of pojo's and Adam's versions.

It will clean both trailing whitespace, and also another form of trailing whitespace, the carriage return:

find . -not \( -name .svn -prune -o -name .git -prune \) -type f \
  -exec sed -i 's/[[:space:]]\+$//' \{} \;  \
  -exec sed -i 's/\r$//' \{} \;

It won't touch the .git folder if there is one.

Edit: Made it a bit safer after the comment, not allowing to take files with ".git" or ".svn" in it. But beware, it will touch binary files if you've got some. Use -iname "*.py" -or -iname "*.php" after -type f if you only want it to touch e.g. .py and .php-files.

Update 2: It now replaces all kinds of spaces at end of line (which means tabs as well)

odinho - Velmont
  • 4
    I don't know what's going on, but this totally fubared my git repo and messed with my images. PEOPLE, BE MORE CAREFUL THAN I WAS! – mattalxndr Apr 26 '11 at 22:48
  • Yes, it will ruin binary files. However, it shouldn't touch your git repo at all, because it skips whatever resides inside a .git-folder. But maybe only if you're in the same folder. – odinho - Velmont May 25 '11 at 12:29
5

I use regular expressions. 4 steps:

  1. Open the root folder in your editor (I use Visual Studio Code).
  2. Tap the Search icon on the left, and enable the regular expression mode.
  3. Enter " +\n" in the Search bar and "\n" in the Replace bar.
  4. Click "Replace All".

This removes all trailing spaces at the end of each line in all files. You can also exclude files that should not be changed.

roedeercuco
4

This works well. Add or remove --include options for specific file types:

egrep -rl ' $' --include '*.c' *  | xargs sed -i 's/\s\+$//g'
Grant Murphy
4

Ruby:

irb
Dir['lib/**/*.rb'].each{|f| x = File.read(f); File.write(f, x.gsub(/[ \t]+$/,"")) }
grosser
1

This is what works for me (Mac OS X 10.8, GNU sed installed by Homebrew):

find . -path ./vendor -prune -o \
  \( -name '*.java' -o -name '*.xml' -o -name '*.css' \) \
  -exec gsed -i -E 's/\t/    /' \{} \; \
  -exec gsed -i -E 's/[[:space:]]*$//' \{} \; \
  -exec gsed -i -E 's/\r\n/\n/' \{} \;

Removes trailing spaces, replaces tabs with spaces, and replaces Windows CRLF with Unix \n.

What's interesting is that I have to run this 3-4 times before all of the gsed cleanup steps have fixed every file.
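A likely reason for needing several runs is that `s/\t/    /` has no `g` flag, so each run replaces only the first tab on every line. Also, sed strips the newline before applying a script, so `s/\r\n/\n/` cannot match, although the `[[:space:]]*$` rule already removes a trailing CR. A single-pass sketch, written as a filter under those assumptions (`clean_stream` is a hypothetical name, and a literal tab is used because `\t` in sed is not portable):

```shell
# Hypothetical filter: expand every tab to four spaces, then strip trailing
# whitespace. [[:space:]] also matches the CR of a CRLF line ending.
tab=$(printf '\t')   # literal tab character for use inside the sed script

clean_stream() {
  sed -e "s/${tab}/    /g" -e 's/[[:space:]]*$//'
}
```

It could be applied to one file with `clean_stream < Foo.java > Foo.java.tmp && mv Foo.java.tmp Foo.java`, where `Foo.java` is just an example name.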

yegor256