0

How to give all words from one file to tr for searching and deleting in text from another file?

For example, I have two files, vocabulary.txt and loveStory.txt. I'm trying to delete all words that are in the vocabulary from the love story.

$ voc="one free" # the files look like these strings
$ love="one two free four"
$ tr "$voc" '' <<< $love

Example output (it doesn't matter whether the words are separated by spaces or by newlines):

two
four
Benjamin W.
Dmytro
  • `tr` isn't the right tool for this; its purpose is to replace (or remove) individual characters. `tr -d 'abc'` is the same as `tr -d 'cba'` – it has no notion of character sequences. – Benjamin W. Mar 27 '19 at 19:32
  • Also, the second set can't be empty if you're not truncating the first set. To remove characters, you have to use `tr -d` – but again, you don't want to use `tr` in the first place. – Benjamin W. Mar 27 '19 at 19:34
  • You say you have two files, but your example uses strings instead of files. What do the two files look like? One word per line? – Benjamin W. Mar 27 '19 at 19:35
  • @BenjaminW. Yes, all the text in the files looks like these strings. What should I use instead of tr? – Dmytro Mar 27 '19 at 19:38
  • No linebreaks? One super long line? Or not? What's the desired output? – Benjamin W. Mar 27 '19 at 19:40
  • @BenjaminW. It's possible to use separators or line breaks if that makes it easier to solve. – Dmytro Mar 27 '19 at 19:43
  • You mean you want to remove common elements in both lists? Use `comm`. – KamilCuk Mar 27 '19 at 19:47
  • It really matters what the input and the output exactly look like. For example, if the input files are both lists, this is a duplicate of [this question](https://stackoverflow.com/q/4366533/3266847); if they are space separated, it's another problem. It's unclear what exactly you want. – Benjamin W. Mar 27 '19 at 19:57
  • @BenjaminW. They're like two one-line strings. – Dmytro Mar 27 '19 at 20:01
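
To illustrate the point made in the comments about `tr` working on character sets rather than words, here is a minimal sketch using the strings from the question. The variable names `a` and `b` are only for illustration.

```shell
love="one two free four"

# tr -d deletes every occurrence of each listed character,
# so the order of the characters in the set makes no difference.
a=$(printf '%s\n' "$love" | tr -d 'one')
b=$(printf '%s\n' "$love" | tr -d 'eno')

# Both results are identical, and letters inside other words are removed too.
printf '[%s]\n[%s]\n' "$a" "$b"
```

Every `o`, `n` and `e` is deleted wherever it occurs, which is why `tr` cannot be used to remove whole words.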

3 Answers

3

I'm assuming your input files look like this:

$ cat lovestory.txt
one two free four
$ cat vocabulary.txt
one free

In Bash, I can then use grep, process substitution and tr to remove every word from lovestory.txt that exists in vocabulary.txt like this:

$ grep -vFxf <(tr ' ' '\n' < vocabulary.txt) <(tr ' ' '\n' < lovestory.txt)
two
four

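tr ' ' '\n' < file replaces every space in file with a newline, so each word ends up on its own line. grep -vFxf then reads the patterns from the first input (-f) and prints only the lines of the second input that match none of them (-v), treating each pattern as a fixed string rather than a regular expression (-F) that has to match the complete line (-x).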
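
If the words are held in shell variables rather than files, as in the question, the same filter might be sketched as follows. This is only a sketch: the function name `remove_words` is hypothetical, and it passes the word list to grep with -e (a newline-separated pattern list) instead of -f so that no process substitution is needed.

```shell
# Hypothetical helper: print the words of "$2" that are not in "$1", one per line.
remove_words() {
  # Turn the space-separated vocabulary into a newline-separated pattern list.
  patterns=$(printf '%s\n' "$1" | tr -s ' ' '\n')
  # -v: invert the match, -F: fixed strings, -x: whole-line matches only.
  printf '%s\n' "$2" | tr -s ' ' '\n' | grep -vFx -e "$patterns"
}

voc="one free"
love="one two free four"
remove_words "$voc" "$love"
```

Note that grep exits with status 1 when every word is filtered out, which matters if the script runs under `set -e`.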

Benjamin W.
1

If the files are not too big, you could give the sed utility a try:

# Define the text which replaces the searched words
replace="<Replacement string here>"

for word in $(cat /path/to/<file_containing_words>); do
  sed -i "s/${word}/${replace}/g" <file_to_be_replaced>
done

So, for your specific example

replace=""

for word in $(cat /path/to/voc); do
  sed -i "s/${word}/${replace}/g" /path/to/love
done
akskap
  • How can I replace `/path/to/love` and `/path/to/voc` with some strings? – Dmytro Mar 27 '19 at 20:00
  • Both sets of words are in individual files, right? (At least that's what I understood from your question above.) You can just replace `/path/to/voc` and `/path/to/love` with the respective absolute file paths on your system – akskap Mar 27 '19 at 20:05
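
For the string case asked about in the comments, the loop could be adapted roughly like this. It is a sketch under two assumptions: GNU sed is available (`\b` is a GNU extension for a word boundary), and the words contain no regex metacharacters, slashes or glob characters. The variable `result` is introduced here only for illustration.

```shell
voc="one free"
love="one two free four"

result=$love
# $voc is left unquoted on purpose so the shell splits it into single words.
for word in $voc; do
  # \b (GNU sed) anchors the match at word boundaries, so "one" is
  # not removed from inside a longer word such as "someone".
  result=$(printf '%s\n' "$result" | sed "s/\b${word}\b//g")
done

# Squeeze the leftover runs of spaces and trim both ends.
result=$(printf '%s\n' "$result" | tr -s ' ' | sed 's/^ //; s/ $//')
printf '%s\n' "$result"
```

Without the word boundaries, `s/one//g` would also strip "one" out of any longer word that contains it.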
0

With GNU awk for multi-char RS:

$ awk -v RS='\\s+' 'NR==FNR{a[$0];next} !($0 in a)' vocabulary.txt lovestory.txt
two
four
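
Where GNU awk is not available, a similar result might be obtained by looping over the fields of each line instead of setting a multi-character RS. This is a sketch, not part of the original answer; the temporary directory and the array name `seen` are illustrative, and the `NR == FNR` test assumes the first file is not empty.

```shell
# Recreate the two sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'one free\n' > "$tmpdir/vocabulary.txt"
printf 'one two free four\n' > "$tmpdir/lovestory.txt"

filtered=$(awk '
  # First file: remember every whitespace-separated word.
  NR == FNR { for (i = 1; i <= NF; i++) seen[$i]; next }
  # Second file: print each word that was not remembered.
  { for (i = 1; i <= NF; i++) if (!($i in seen)) print $i }
' "$tmpdir/vocabulary.txt" "$tmpdir/lovestory.txt")

printf '%s\n' "$filtered"
rm -r "$tmpdir"
```

This variant also handles files in which the words are spread over several lines.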
Ed Morton