Assuming your sed understands \< and \> for word boundaries,
sed 's/\<\(i\|me\|my\|myself\|we\|our\|ours\|ourselves\|you\|your\|yours\|yourself\)\> \?//g' Hamlet.txt >newHam.txt
You want to make sure you include word boundaries; your original attempt would replace e.g. i everywhere n the nput.
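To see the difference the word boundaries make, here is a small sketch that runs both variants on a made-up sample line instead of Hamlet.txt, with a hypothetical two-word list. It assumes GNU sed, which understands \|, \?, \< and \> in basic regular expressions.

```sh
#!/bin/sh
# Made-up sample input; the two-word list is only for illustration.
sample='i think in Italian'

# Without word boundaries, every lowercase "i" disappears.
printf '%s\n' "$sample" | sed 's/\(i\|me\)//g'         # prints " thnk n Italan"

# With \< and \>, only the standalone word "i" (plus one following space) goes.
printf '%s\n' "$sample" | sed 's/\<\(i\|me\)\> \?//g'  # prints "think in Italian"
```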
If you already have the words in a string, you can interpolate it in Bash with
sed "s/\\<\\(${list// /\\|}\\)\\> \\?//g" Hamlet.txt >newHam.txt
but the ${variable//pattern/substitution} parameter expansion is not portable to e.g. /bin/sh. Notice also how double quotes instead of single are necessary for the shell to be allowed to perform variable substitutions within the script, and how all literal backslashes need to be escaped with another backslash within double quotes.
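If the script has to run under plain /bin/sh, one possible way to build the same alternation without the Bash-only expansion is to let sed itself turn the spaces into \|. This is only a sketch under some assumptions: the list is a shortened, hypothetical one, its words are separated by single spaces and contain no regex metacharacters, the function name strip_words is made up for illustration, and sed is GNU sed (for \|, \?, \< and \>).

```sh
#!/bin/sh
# Hypothetical, shortened word list; single spaces between words.
list='i me my myself we our'

# Replace each space with \| using sed instead of ${list// /\\|}.
re=$(printf '%s\n' "$list" | sed 's/ /\\|/g')

# strip_words is an illustrative helper: it filters the named files, or stdin.
strip_words() {
    sed "s/\\<\\($re\\)\\> \\?//g" "$@"
}

# Made-up sample line standing in for Hamlet.txt.
printf '%s\n' 'i saw my dog in our yard' | strip_words   # prints "saw dog in yard"
```

With a real file, the call would be strip_words Hamlet.txt >newHam.txt.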
Unfortunately, many details of sed are poorly standardized. Ironically, switching to a tool which isn't standard at all might be the most portable solution.
perl -pe 'BEGIN {
@list = qw(i me my myself we our ours ourselves you your yours yourself .....);
$re = join("|", @list); }
s/\b($re)\b ?//go' Hamlet.txt >newHam.txt
If you want this as a standalone script,
#!/usr/bin/perl
BEGIN {
@list = qw(i me my myself we our ours ourselves you your yours yourself .....);
$re = join("|", @list);
}
while (<>) {
s/\b($re)\b ?//go;
print
}
These words are pronouns, not prepositions.
Finally, take care to fix the shebang of your script; the first line of the script needs to start with exactly the two characters #! because that's what makes it a shebang. You'll also want to avoid the useless cat in the future.
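As a rough illustration of both points, the sketch below starts with a proper shebang and runs the same substitution once through cat and once with the file name passed straight to sed. The sample file and the three-word list are made up so that the example is self-contained, and GNU sed is assumed.

```sh
#!/bin/sh
# The line above is the very first line and begins with exactly "#!".

# Hypothetical three-word list, kept short for illustration.
script='s/\<\(i\|me\|my\)\> \?//g'

# Temporary stand-in for Hamlet.txt.
sample=$(mktemp)
printf '%s\n' 'you and i like my hat' >"$sample"

cat "$sample" | sed "$script"   # useless cat: an extra process and a pipe
sed "$script" "$sample"         # same output; sed opens the file itself

rm -f "$sample"
```

Both commands print "you and like hat"; the second simply does it with one process fewer.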