4

How can I count how many characters appear on each line of a file, minus those from a specific list? Here is an example file:

你好吗?
我很好,你呢?
我也很好。

I want to exclude any occurrences of "?", "," and "。" from the count. The output would look like this:

3
5
4
gniourf_gniourf
Village

5 Answers

3

A pure bash solution:

while IFS= read -r l; do
    l=${l//[?,。]/}
    echo "${#l}"
done < file
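
The loop above can be wrapped in a function so that the file and the list of excluded characters become arguments. This is only a sketch: the name count_chars and its parameters are made up for illustration, the exclusion list is dropped into a bracket expression as is (so "]", "^" and "-" would need extra care), and character counting with ${#l} assumes a UTF-8 locale.

```shell
#!/usr/bin/env bash
# count_chars FILE [EXCLUDE]  (hypothetical helper, not part of the original answer)
# Prints, for each line of FILE, the number of characters left after
# removing every character listed in EXCLUDE (default: ? , 。).
count_chars() {
    local file=$1 exclude=${2:-'?,。'} l
    # "|| [[ -n $l ]]" also processes a last line that lacks a trailing newline
    while IFS= read -r l || [[ -n $l ]]; do
        l=${l//[$exclude]/}   # delete every excluded character
        echo "${#l}"          # length in characters under a UTF-8 locale
    done < "$file"
}
```

Calling count_chars file on the example from the question should give the expected three counts when run in a UTF-8 locale; in the C locale ${#l} counts bytes instead.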
gniourf_gniourf
2

Try

sed 's/[,。?]//g' file | perl -C -nle 'print length'

The sed part removes unwanted characters, and the perl part counts the remaining characters.

Hari Menon
2

One way is to remove those characters from the stream and then use wc -m. Here is an example that uses perl to remove the characters:

perl -pe 's/(\?|,|。)//g' file.txt |
  while IFS= read -r line; do
    printf '%s' "$line" | wc -m
  done
jordanm
2

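Or, more simply, if a single total for the whole file is enough:

tr -d '?,。' <file | wc -m

Note that wc -m also counts the newline characters, and that some tr implementations (GNU tr among them) operate on bytes rather than characters, so the multibyte 。 may not be removed cleanly there.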
thom
1

A simple solution, similar to this one, but using awk:

sed 's/[?,。]//g' file | awk '{ print length($0) }'
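
The sed step can probably be folded into awk itself with gsub, which saves one process. The following is a rough sketch under some assumptions: the function name count_chars_awk and its arguments are invented for illustration, and length() only counts characters (rather than bytes) in an awk that is locale-aware, such as gawk in a UTF-8 locale.

```shell
# count_chars_awk FILE [EXCLUDE]  (hypothetical helper for illustration)
# Prints the per-line character count of FILE after deleting every
# character listed in EXCLUDE (default: ? , 。).
count_chars_awk() {
    # cls becomes a bracket expression such as [?,。], used as a dynamic regex
    awk -v cls="[${2:-?,。}]" '{ gsub(cls, ""); print length($0) }' "$1"
}
```

For example, count_chars_awk file uses the default list, while count_chars_awk file '?,' would exclude only the two ASCII characters.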
Community
Radu Rădeanu