I have a simple shell script that removes trailing whitespace from a file. Is there any way to make this script more compact (without creating a temporary file)?
sed 's/[ \t]*$//' $1 > $1__.tmp
cat $1__.tmp > $1
rm $1__.tmp
You can use the in-place option -i of sed for Linux and Unix:
sed -i 's/[ \t]*$//' "$1"
Be aware that this expression will delete trailing t's on OSX (you can use gsed to avoid this problem). It may delete them on BSD too.
If you don't have gsed, here is the correct (but hard-to-read) sed syntax on OSX:
sed -i '' -E 's/[ '$'\t'']+$//' "$1"
Three single-quoted strings ultimately become concatenated into a single argument/expression. There is no concatenation operator in bash; you just place strings one after the other with no space in between.
The $'\t' resolves as a literal tab character in bash (using ANSI-C quoting), so the tab is correctly concatenated into the expression.
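To see the concatenation at work, here is a minimal sketch that builds an equivalent expression from a variable and applies it to standard input instead of editing a file. The function name strip_trailing_blanks is made up for illustration, and the sketch assumes bash, since ANSI-C quoting is not available in every shell.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above): strip trailing spaces and
# tabs from stdin using a literal tab instead of the \t escape.
strip_trailing_blanks() {
    local tab=$'\t'   # ANSI-C quoting yields a real tab character
    # The shell expands ${tab} inside the double quotes, so sed receives
    # a bracket expression containing a space and a real tab.
    sed -E "s/[ ${tab}]+\$//"
}

# A trailing letter "t" must survive; trailing spaces and tabs must not.
printf 'best \t \nright\n' | strip_trailing_blanks
```

Keeping the tab in a variable avoids the quote juggling of the one-liner while handing sed the same bytes.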
At least on Mountain Lion, Viktor's answer will also remove the character 't' when it is at the end of a line. The following fixes that issue:
sed -i '' -e's/[[:space:]]*$//' "$1"
Thanks to codaddict for suggesting the -i option.
The following command solves the problem on Snow Leopard:
sed -i '' -e's/[ \t]*$//' "$1"
var1="\t\t Test String trimming "
echo $var1
Var2=$(echo "${var1}" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
echo $Var2
I have a script in my .bashrc that works under OSX and Linux (bash only!):
function trim_trailing_space() {
  if [[ $# -eq 0 ]]; then
    echo "$FUNCNAME will trim (in place) trailing spaces in the given file (remove unwanted spaces at end of lines)"
    echo "Usage :"
    echo "$FUNCNAME file"
    return
  fi
  local file=$1
  unamestr=$(uname)
  if [[ $unamestr == 'Darwin' ]]; then
    #specific case for Mac OSX
    # "$file" is quoted so that names containing spaces are handled
    sed -E -i '' 's/[[:space:]]*$//' "$file"
  else
    sed -i 's/[[:space:]]*$//' "$file"
  fi
}
to which I add:
SRC_FILES_EXTENSIONS="js|ts|cpp|c|h|hpp|php|py|sh|cs|sql|json|ini|xml|conf"
function find_source_files() {
  if [[ $# -eq 0 ]]; then
    echo "$FUNCNAME will list source files (having extensions $SRC_FILES_EXTENSIONS)"
    echo "Usage :"
    echo "$FUNCNAME folder"
    return
  fi
  local folder=$1
  unamestr=$(uname)
  if [[ $unamestr == 'Darwin' ]]; then
    #specific case for Mac OSX
    find -E "$folder" -iregex '.*\.('$SRC_FILES_EXTENSIONS')'
  else
    #Rhahhh, lovely
    local extensions_escaped=$(echo $SRC_FILES_EXTENSIONS | sed s/\|/\\\\\|/g)
    #echo "extensions_escaped:$extensions_escaped"
    find "$folder" -iregex '.*\.\('$extensions_escaped'\)$'
  fi
}
function trim_trailing_space_all_source_files() {
for f in $(find_source_files .); do trim_trailing_space $f;done
}
For those who look for efficiency (many files to process, or huge files), using the + repetition operator instead of * makes the command more than twice as fast.
With GNU sed:
sed -Ei 's/[ \t]+$//' "$1"
sed -i 's/[ \t]\+$//' "$1" # The same without extended regex
I also quickly benchmarked something else: using [ \t] instead of [[:space:]] also significantly speeds up the process (GNU sed v4.4):
sed -Ei 's/[ \t]+$//' "$1"
real 0m0,335s
user 0m0,133s
sys 0m0,193s
sed -Ei 's/[[:space:]]+$//' "$1"
real 0m0,838s
user 0m0,630s
sys 0m0,207s
sed -Ei 's/[ \t]*$//' "$1"
real 0m0,882s
user 0m0,657s
sys 0m0,227s
sed -Ei 's/[[:space:]]*$//' "$1"
real 0m1,711s
user 0m1,423s
sys 0m0,283s
In the specific case of sed, the -i option that others have already mentioned is far and away the simplest and sanest one.
In the more general case, sponge, from the moreutils collection, does exactly what you want: it lets you replace a file with the result of processing it, in a way specifically designed to keep the processing step from tripping over itself by overwriting the very file it's working on. To quote the sponge man page:
sponge reads standard input and writes it out to the specified file. Unlike a shell redirect, sponge soaks up all its input before writing the output file. This allows constructing pipelines that read from and write to the same file.
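As a rough sketch of how that looks for the whitespace case: the function name trim_via_sponge is hypothetical, and the fallback branch is an addition for machines where moreutils is not installed (it does use a temporary file, so it is only there to keep the sketch runnable).

```shell
#!/bin/bash
# Sketch: rewrite a file with the output of a filter that reads the same file.
trim_via_sponge() {
    local file=$1 tmp status
    if command -v sponge >/dev/null 2>&1; then
        # sponge soaks up all of its input before it writes "$file",
        # so sed has finished reading by the time the file is replaced.
        sed 's/[[:space:]]*$//' "$file" | sponge "$file"
    else
        # Assumed fallback, not part of the answer above: plain temp file.
        tmp=$(mktemp) || return 1
        sed 's/[[:space:]]*$//' "$file" > "$tmp" && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
}
```

The same pattern works for filters that have no in-place option of their own, such as sort or grep.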
Just for fun:
#!/bin/bash
FILE=$1
if [[ -z $FILE ]]; then
echo "You must pass a filename -- exiting" >&2
exit 1
fi
if [[ ! -f $FILE ]]; then
echo "There is no file '$FILE' here -- exiting" >&2
exit 1
fi
BEFORE=`wc -c "$FILE" | cut --delimiter=' ' --fields=1`
# >>>>>>>>>>
sed -i.bak -e's/[ \t]*$//' "$FILE"
STATUS=$? # keep sed's exit status; $? would otherwise reflect the wc pipeline below
# <<<<<<<<<<
AFTER=`wc -c "$FILE" | cut --delimiter=' ' --fields=1`
if [[ $STATUS != 0 ]]; then
echo "Some error occurred" >&2
else
echo "Filtered '$FILE' from $BEFORE characters to $AFTER characters"
fi
These answers confused me. Both of these sed commands worked for me on a Java source file:
sed 's/\s\+$//' filename
sed 's/[[:space:]]\+$//' filename
For test purposes, I used:
$ echo " abc " | sed 's/\s\+$/-xx/'
abc-xx
$ echo -e " abc \t\t " | sed 's/\s\+$/-xx/'
abc-xx
This replaces all trailing whitespace with "-xx".
@Viktor wishes to avoid a temporary file; personally, I would only use -i (in-place) with a backup suffix, at least until I know the command works.
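A minimal sketch of that backup approach follows. The function name trim_keep_backup and the file name notes.txt are made up for illustration. Attaching the suffix directly to -i (no space) is accepted by both GNU and BSD sed.

```shell
#!/bin/bash
# Sketch: edit in place but keep the original as "<file>.bak".
trim_keep_backup() {
    local file=$1
    # -i.bak writes the result to "$file" and saves the old content
    # next to it as "$file.bak".
    sed -i.bak 's/[[:space:]]*$//' "$file"
}

# Typical use: inspect the change, then delete the backup once satisfied.
# trim_keep_backup notes.txt
# diff notes.txt.bak notes.txt
# rm notes.txt.bak
```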
Sorry, I just found the existing responses a little oblique. sed is a straightforward tool. It is easier to approach it in a straightforward way 90% of the time. Or perhaps I missed something; happy to be corrected there.
To remove trailing whitespace for all files in the current directory, I use
ls | xargs sed -i 's/[ \t]*$//'
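One caveat: ls | xargs splits names on whitespace, so a file called "my notes.txt" would be passed to sed as two arguments. A possible variant, sketched below, lets find hand the names to sed directly. The function name trim_dir is hypothetical, and the GNU/BSD check assumes that BSD sed rejects --version. Unlike ls, find also picks up hidden files.

```shell
#!/bin/bash
# Sketch: trim trailing whitespace in every regular file directly inside
# a directory (defaults to the current one), without descending further.
trim_dir() {
    local dir=${1:-.}
    if sed --version >/dev/null 2>&1; then
        # GNU sed: -i takes no separate argument
        find "$dir" -maxdepth 1 -type f -exec sed -i 's/[[:space:]]*$//' {} +
    else
        # BSD/macOS sed: -i needs an explicit (here empty) backup suffix
        find "$dir" -maxdepth 1 -type f -exec sed -i '' 's/[[:space:]]*$//' {} +
    fi
}
```

Because -exec ... {} + passes each name as its own argument, spaces and other odd characters in file names are preserved.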
To strip whitespace (in my case spaces and tabs) only from lines with at least one non-whitespace character (this way empty indented lines are not touched):
sed -i -r 's/([^ \t]+)[ \t]+$/\1/' "$file"
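To make the effect visible without touching a file, here is a small sketch that applies an equivalent substitution to standard input. The function name trim_nonblank_lines is made up, and [[:blank:]] (space or tab) stands in for [ \t] on the assumption that the command should also run with BSD sed, where \t inside a bracket expression is not a tab.

```shell
#!/bin/bash
# Sketch: strip trailing blanks only when the line also contains at least
# one non-blank character; whitespace-only lines pass through unchanged.
trim_nonblank_lines() {
    sed -E 's/([^[:blank:]]+)[[:blank:]]+$/\1/'
}

# "sed -n l" prints each line with a "$" marking its end, which makes
# any remaining trailing whitespace easy to spot.
printf 'code  \n    \n' | trim_nonblank_lines | sed -n l
```

The first line loses its trailing spaces, while the second line, which consists only of spaces, has nothing for the capture group to match and is left alone.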