I need to delete all the strings in a file that have fewer than 4 unique characters in them.
Input:
hello
cabby
pabba
lokka
lappa
coool
apple
Expected Output:
hello
cabby
lokka
apple
I tried to come up with a regular expression to do this, but I can't see how it would even be possible.
I did find a sed command that seems promising: it deletes all duplicate characters. However, I am not sure how to get sed to test whether the deduplicated result has at least 4 characters and, if it does, keep the original string.
sed ':1;s/\(\(.\).*\)\2/\1/g;t1'
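One way the idea above might be wired together is to stash the original line in sed's hold space, deduplicate a working copy, and print the stashed original only if at least 4 characters survive. This is a minimal sketch, not a tested solution for every sed implementation. The function name `keep_min_unique` is made up for illustration, and the threshold of 4 is hard-coded as four dots in the final pattern.

```shell
# Hypothetical helper: print only the lines that contain at least
# 4 distinct characters. Reads the named files, or stdin if none are given.
keep_min_unique() {
  # h       copy the original line into the hold space
  # :1      loop label
  # s///    drop one later occurrence of a character that appeared earlier
  # t1      if a substitution happened, jump back to label 1 and try again
  # /..../  once no duplicates remain, check that 4+ characters are left
  # g;p     restore the original line from the hold space and print it
  sed -n \
    -e 'h' \
    -e ':1' \
    -e 's/\(\(.\).*\)\2/\1/' \
    -e 't1' \
    -e '/..../{g;p;}' \
    "$@"
}
```

It could then be called as `keep_min_unique input.txt`, where `input.txt` is a placeholder file name. Splitting the script across several `-e` options avoids relying on a semicolon directly after the label, which some sed implementations do not accept.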