-2

I tried this, but it displays the number of lines instead.

declare -i x=0
while IFS="" read -r p || [ -n "$p" ]
do
    x=x+1
done <test.txt
echo "$x"

I would be thankful if someone could explain this, since I am a beginner.

  • 4
    `wc -w test.txt` ? – tkausl Sep 10 '18 at 06:39
  • Hi @tkausl. This is working. Thank you so much. But I want to iterate through the words, not only count them. Thanks in advance – Chitti_the_robot Sep 10 '18 at 06:42
  • 1
    Please, post some sample data with expected output to avoid misunderstanding of the question. – James Brown Sep 10 '18 at 06:59
  • `for i in $(cat file); do something $i; done` instead of using read & redirections is probably the simplest solution – Sam Sep 10 '18 at 07:17
  • @Sam `for i in $(cat file)` is a well-known anti-pattern. There is always a better solution than that. – Ed Morton Sep 11 '18 at 14:30
  • What would be your preferred solution, and why? I am well aware that the pattern is frequently misused, but to me that alone does not mean it should never be used. – Sam Sep 11 '18 at 14:48
  • @Sam, if it contains `*`, you get a list of filenames being iterated over. Why would you ever use it, when there are alternatives that don't have the side effects and bugs? `while read -r -a words; do for word in "${words[@]}"; do ...; done; done` – Charles Duffy Sep 11 '18 at 15:54
  • you have a point in that it was reckless of me to suggest that without a reminder to toggle globbing with `set -f` / `set +f` if there is the slightest possibility the file may contain any special characters. – Sam Sep 11 '18 at 17:00
  • do note however that `set -f; for i in $(cat file); do echo $i >/dev/null; done; set +f` times about twice as fast as the equivalent `while read -r -d' ' i; do echo $i >/dev/null; done` for a large file on my system and that the array solution may fail for very long lines. – Sam Sep 11 '18 at 17:10
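
The globbing concern raised in the comments above can be reproduced with a small sketch. The temporary directory and the file names (`a.txt`, `b.txt`, `words.in`) are made up for illustration, and the result depends on which files happen to be in the current directory.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory so the glob result is predictable.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
touch a.txt b.txt
printf 'one * two\n' > words.in

# Unquoted $(cat ...) undergoes pathname expansion: "*" turns into file names.
glob_words=()
for w in $(cat words.in); do glob_words+=("$w"); done

# read -r -a only splits on IFS, so "*" stays a literal word.
safe_words=()
while read -r -a words; do
    for w in "${words[@]}"; do safe_words+=("$w"); done
done < words.in

echo "for/cat : ${glob_words[*]}"
echo "read -a : ${safe_words[*]}"
cd - >/dev/null && rm -rf "$work_dir"
```

With the three files above, the `for` loop sees five items (the `*` is replaced by the directory listing), while the `read -r -a` loop sees the three words actually in the file.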

3 Answers

2

Assuming your words are separated by tabs, spaces and newlines, the following snippet:

echo $'word1 word2! word3
\tword4\t\t\t\t\t\tword5\tword6
word7 word8


word9 word10' | \
while IFS=$'\t ' read -ra linewords; do
    for i in "${linewords[@]}"; do
        echo word is "'$i'"
    done
done

will output:

word is 'word1'
word is 'word2!'
word is 'word3'
word is 'word4'
word is 'word5'
word is 'word6'
word is 'word7'
word is 'word8'
word is 'word9'
word is 'word10'

It sets IFS to several delimiter characters and uses read -a to split each line into an array; see this answer on how to split a string on a delimiter.
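
As a rough sketch of how this applies to the question: the loop in the question increments once per line, so nesting a loop over the array increments once per word instead. The temporary file and the variable names (`input_file`, `word_count`, `words_seen`) are placeholders, not part of the original code; with a real file you would redirect from `test.txt`.

```shell
#!/usr/bin/env bash
# Hypothetical sample input; the file in the question is test.txt.
input_file=$(mktemp)
printf 'alpha beta\tgamma\n\ndelta  epsilon' > "$input_file"   # last line has no trailing newline

declare -i word_count=0
words_seen=()

# read -ra splits each line on the IFS characters (tab and space here).
# The "|| [ ... -gt 0 ]" test keeps a final line that lacks a trailing newline.
while IFS=$'\t ' read -ra line_words || [ "${#line_words[@]}" -gt 0 ]; do
    for word in "${line_words[@]}"; do
        word_count+=1            # integer attribute makes += an addition
        words_seen+=("$word")
        printf 'word %d is %s\n' "$word_count" "$word"
    done
done < "$input_file"

echo "total: $word_count"
rm -f "$input_file"
```

Empty lines simply produce an empty array, so the inner loop does nothing for them.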

KamilCuk
  • 2
    You chose a convenient input for which your code works :) Try to use a tab between `word5` and `word6` instead of the space. The issue is that you want to use `$'...'` instead of `$"..."`. See [manual](https://www.gnu.org/software/bash/manual/bash.html#Locale-Translation) for explanation of `$"..."`. Also, since `read` reads lines by default, the `\n` is not necessary. – PesaThe Sep 10 '18 at 09:24
1

I'd use awk for that:

$ echo "Lorem ipsum dolor sit amet,
        consectetur adipisci elit,
        ..." | 
awk '{
    for(i=1;i<=NF;i++)
        print "iterating " $i
}'

Output:

iterating Lorem
iterating ipsum
iterating dolor
iterating sit
iterating amet,
iterating consectetur
iterating adipisci
iterating elit,
iterating ...
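
If a running count is also wanted, the same loop can carry a counter. This is only a sketch: the sample text and the names `n` and `awk_output` are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample text; replace the printf with your own file or command.
awk_output=$(printf 'Lorem ipsum dolor\n\nsit amet,\n' |
    awk '{
        # NF is the number of whitespace-separated fields on the current line.
        for (i = 1; i <= NF; i++) {
            n++
            print n ": " $i
        }
    }
    END { print "total: " (n + 0) }')   # n + 0 prints 0 for empty input

echo "$awk_output"
```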
James Brown
0
grep -oE '\w+' YOUR_FILE.txt

writes the words in YOUR_FILE.txt to standard output. Pipe this into your loop, and you have an iteration over the words.

This assumes that a "word" in your case is one or more characters described by \w, i.e. either an underscore or what your current locale defines to be an alphanumeric character. If your idea of a "word" is different, you can of course tailor the regular expression according to your needs.
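
A minimal sketch of feeding that output to a loop, assuming a grep that supports `\w` (GNU grep does) and using a made-up sample file in place of YOUR_FILE.txt. Process substitution is used here instead of a plain pipe because a `grep ... | while` loop runs in a subshell in bash, so a counter set inside it would be lost afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical sample file standing in for YOUR_FILE.txt.
sample_file=$(mktemp)
printf 'Lorem ipsum, dolor_sit\n  amet 42!\n' > "$sample_file"

declare -i grep_count=0
grep_words=()

# grep -o prints each match on its own line, so read gets one word per iteration.
while IFS= read -r word; do
    grep_count+=1
    grep_words+=("$word")
    echo "iterating $word"
done < <(grep -oE '\w+' "$sample_file")

echo "total: $grep_count"
rm -f "$sample_file"
```

Note that punctuation is dropped and `dolor_sit` counts as a single word, since `\w` includes the underscore.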

user1934428