How can I remove duplicate words from the sentences in a file, where each sentence is written on a separate line?

Thank you.

I have these sentences in a file:

hello every body hello
word I should remove the word
how can can i remove it ?

The expected output should be:

hello every body
word I should remove the
how can i remove it ?
dawg
AnasNss11
  • https://stackoverflow.com/questions/952268/how-to-remove-duplicate-words-from-a-plain-text-file-using-linux-command might help – rohitt Dec 07 '20 at 18:08
  • Consider reviewing [how do I ask a good question](https://stackoverflow.com/help/how-to-ask) and then come back and update your question accordingly; in particular, provide your sample input, the code you've written, the (wrong) output generated by your script and the (correct) desired output – markp-fuso Dec 07 '20 at 18:17
  • Please [edit] your question and show the expected output that you want to get from the example input. – Bodo Dec 07 '20 at 18:19
  • @dawg it's the same, but it didn't work. – AnasNss11 Dec 07 '20 at 18:26
  • @dawg, it won't help him - while it will strip the dupes it will cull the sentence structures ... – tink Dec 07 '20 at 18:28

1 Answer

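To print each distinct word once, deduplicating across the whole file and writing one word per line, you can do: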

awk '{for(i=1;i<=NF;i++) if(++arr[$i]==1) print $i}' file

Prints:

hello
every
body
word
I
should
remove
the
how
can
i
it
?

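To maintain the line structure while still deduplicating across the whole file (so "remove" is dropped from the third line because it already appeared in the second):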

awk '{for(i=1;i<=NF;i++) 
       if(++arr[$i]==1) 
          printf "%s%s", $i, OFS
       print ""}' file

Prints:

hello every body 
word I should remove the 
how can i it ? 

If the deduplication should apply only within each line, which matches the expected output in the question:

awk '{delete arr                  # forget the words seen on the previous line
      for(i=1;i<=NF;i++) 
         if(++arr[$i]==1) printf "%s%s", $i, OFS   # first occurrence only
      print ""}' file

Prints:

hello every body 
word I should remove the 
how can i remove it ? 
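
Note that the versions above leave a trailing space at the end of each output line, because OFS is printed after every kept word. If that matters, here is a minimal sketch of a per-line variant that builds each line in a variable and prints it once. The function name dedup_words_per_line and the variable names seen, out, and result are only illustrative, and split("", seen) is used here as a widely supported way to empty an array in awks that may not accept delete on a whole array:

```shell
# dedup_words_per_line is a hypothetical helper name used for illustration.
# It reads from the files given as arguments, or from stdin if there are none.
dedup_words_per_line() {
    awk '{
        split("", seen)            # empty the array at the start of each line
        out = ""
        for (i = 1; i <= NF; i++)
            if (!seen[$i]++)       # true only for the first occurrence on this line
                out = (out == "" ? $i : out OFS $i)
        print out                  # no trailing separator
    }' "$@"
}

# Feed it the sample input from the question.
result=$(printf '%s\n' 'hello every body hello' \
                       'word I should remove the word' \
                       'how can can i remove it ?' | dedup_words_per_line)
printf '%s\n' "$result"
```

Like the versions above, this compares words exactly, so "Hello" and "hello" count as different words, and runs of whitespace in the input are collapsed to a single OFS in the output.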
dawg