0

I have multiple files, each containing just one line of simple text. I want to remove the last character of every word in each file. The length of the text differs from file to file.

The closest I got is to edit one file:

awk '{ print substr($1, 1, length($1)-1); print substr($2, 1, length($2)-1); }' file.txt

But I cannot figure out how to generalize this to files with different word counts.

fedorqui
Tomáš Šíma
  • To be sure: do *1 line of simple text* and *every word* mean that each file holds one line with zero to several words, and that every word has to be modified? (I see a lot of replies removing only the last character of the line.) – NeronLeVelu Dec 15 '16 at 13:18
  • Yes, there are actually one to several words in each file – Tomáš Šíma Dec 15 '16 at 13:30

5 Answers

3
awk '{for(x=1;x<=NF;x++)sub(/.$/,"",$x)}7' file

This should do the removal.

Once you have tested it and are happy with the result, you can overwrite your file:

awk '{for(x=1;x<=NF;x++)sub(/.$/,"",$x)}7' file > tmp && mv tmp file
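Since the question involves several files, the overwrite step can be wrapped in a loop. This is only a sketch: the function name `strip_last_chars` and the `.tmp` suffix are made up for illustration, and it assumes no file named `<file>.tmp` already exists next to each input.

```shell
# Hypothetical helper: strip the last character of every word in each given file.
strip_last_chars() {
    for f in "$@"; do
        # Same awk program as above, with 1 as the always-true print pattern.
        # The original file is replaced only if awk succeeded.
        awk '{for (x = 1; x <= NF; x++) sub(/.$/, "", $x)} 1' "$f" > "$f.tmp" &&
            mv "$f.tmp" "$f"
    done
}

# Usage (the glob is an assumption about where the files live):
# strip_last_chars dir/*.txt
```

Note that assigning to a field makes awk rebuild the line with single spaces between words, so runs of whitespace in the input are collapsed.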

Example:

kent$  awk '{for(x=1;x<=NF;x++)sub(/.$/,"",$x)}7' <<<"foo bar foobar"   
fo ba fooba
Kent
  • Funny to use `7` for printing a line. Is there a special meaning to 7 instead of the usual 1 for this purpose? – NeronLeVelu Dec 15 '16 at 14:22
  • Yes, the special meaning is: lucky number! :-D Just kidding. 7 is easier for me to reach; I think the right index finger is more convenient than the left little finger. You know, a heavy vim user cares about keystrokes – Kent Dec 15 '16 at 14:25
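For readers puzzled by the trailing `7`: in awk, a pattern with no action runs the default action, which prints the current line, and any non-zero number counts as a true pattern. A small illustration (the sample input is made up):

```shell
# A non-zero numeric pattern is true, so awk runs the default action { print }.
printf 'foo bar\n' | awk '7'    # prints: foo bar
printf 'foo bar\n' | awk '1'    # same effect, the more common spelling
printf 'foo bar\n' | awk '0'    # prints nothing, because the pattern is false
```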
2

Use awk to loop over every field in the line, from 1 up to NF, and apply the substr function to each one.

awk '{for (i=1; i<=NF; i++) {printf "%s ", substr($i, 1, length($i)-1)}}END{printf "\n"}' file

For a sample input file

ABCD ABC BC

the awk command produces the output

ABC AB B

Another way is to set the output record separator (ORS) to an empty string and just use print:

awk 'BEGIN{ORS="";}{for (i=1; i<=NF; i++) {print substr($i, 1, length($i)-1); print " "}}END{print "\n"}' file
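Both commands above leave a trailing space after the last word and emit only one newline at the very end. A possible variant, sketched here with a made-up function name `strip_words`, chooses the separator per field so that each input line ends with its own newline:

```shell
# Hypothetical wrapper: reads from the given files, or from stdin if none are given.
strip_words() {
    awk '{
        # Print a space after every word except the last, which gets a newline.
        for (i = 1; i <= NF; i++)
            printf "%s%s", substr($i, 1, length($i) - 1), (i < NF ? " " : "\n")
    }' "$@"
}

printf 'ABCD ABC BC\n' | strip_words    # prints: ABC AB B
```

One difference to be aware of: an empty input line has no fields, so this variant prints nothing for it.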
Inian
2

I would go for a Bash approach, since ${var%?} removes the last character of a variable's value:

$ var="hello"
$ echo "${var%?}"
hell

And you can use the same approach on arrays:

$ arr=("hello" "how" "are" "you")
$ printf "%s\n" "${arr[@]%?}"
hell
ho
ar
yo

So you can loop through the files, read their only line (you said each file consists of just one line) into an array, and use the expansion shown above to remove the last character of each word:

for file in dir/*; do
   read -r -a myline < "$file"
   printf "%s " "${myline[@]%?}"
done
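The loop above prints the words of all files on one line, each followed by a space. If one output line per file is preferred, a Bash-only sketch could join the trimmed words with `[*]` instead of `[@]`. The function name `strip_files` is hypothetical, and the default `IFS` is assumed:

```shell
# Bash-specific sketch (uses arrays and "local").
strip_files() {
    local file
    local -a words
    for file in "$@"; do
        # Split the first line of the file into an array of words.
        read -r -a words < "$file"
        # "${words[*]%?}" trims each element, then joins the results with the
        # first character of IFS (a space by default).
        printf '%s\n' "${words[*]%?}"
    done
}

# Usage: strip_files dir/*
```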
fedorqui
  • My only concern was the size constraint of the files when using pure `bash` logic like this. Is it really slow when processing huge files compared to `awk`? – Inian Dec 15 '16 at 12:16
  • 2
    @Inian we should test it. However, parsing a number of one-line files does not seem to be a very CPU intensive task, so bothering about performance is more of an academic debate. – fedorqui Dec 15 '16 at 12:17
  • 1
    That being said, it is also advisable to keep this answer in mind: [Why is using a shell loop to process text considered bad practice?](http://unix.stackexchange.com/a/169765/40596). – fedorqui Dec 15 '16 at 12:27
  • Thanks! I asked the question in the first place with that particular topic in mind :) – Inian Dec 15 '16 at 12:29
0

Sed version, assuming words are composed only of letters (if not, just adapt the class [[:alpha:]] to your needs) and are separated by spaces and punctuation:

sed 's/$/ /;s/[[:alpha:]]\([[:blank:][:punct:]]\)/\1/g;s/ $//' YourFile
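To make the letters-only assumption concrete, here is the same sed command wrapped in a function (the name `strip_letters` is made up) and run on two sample lines of my own:

```shell
# Hypothetical wrapper around the sed command above; reads stdin or the given files.
strip_letters() {
    sed 's/$/ /;s/[[:alpha:]]\([[:blank:][:punct:]]\)/\1/g;s/ $//' "$@"
}

printf 'foo bar, foobar\n' | strip_letters    # prints: fo ba, fooba
printf 'word1 word2\n' | strip_letters        # prints: word1 word2
```

The second line is left unchanged because its words end in a digit, which [[:alpha:]] does not match.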

awk version (gawk, in fact, for the regex word boundaries):

 gawk '{gsub(/.\>/, "");print}' YourFile

 # or, optimized by @kent ;-) thanks for the tip
 gawk '4+gsub(/.\>/, "")' YourFile
NeronLeVelu
0
$ cat foo
word1
word2 word3
$ sed 's/\([^ ]*\)[^ ]\( \|$\)/\1\2/g' foo
word
word word

Here a word is any run of characters other than a space ([^ ]).

EDIT: If you want to enforce POSIX (--posix), you can use:

$ sed --posix 's/\([^ ]*\)[^ ]\([ ]\{,1\}\)/\1\2/g' foo
word
word word

Here \( \|$\) is replaced by \([ ]\{,1\}\), i.e. there is an optional space at the end.
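With extended regular expressions the same idea can be written without the backslashes. This is a sketch that assumes a sed supporting the -E option (GNU and BSD sed both do); the function name `strip_last` is hypothetical:

```shell
# Hypothetical wrapper: delete the non-space character that precedes
# a space or the end of the line.
strip_last() {
    sed -E 's/[^ ]( |$)/\1/g' "$@"
}

printf 'word1\nword2 word3\n' | strip_last
# prints:
# word
# word word
```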

James Brown