My table in SQL Server has some entries like the ones shown below.
2934046 Kellogg’s Share Your Breakfast 74672 2407522 Kellogg?s Share Your Breakfast ACTIVE 2015-09-01 9999-12-31
2934046 Kellogg?s Share Your Breakfast 74672 2407522 Kellogg?s Share Your Breakfast ACTIVE 2015-09-01 9999-12-31
Another example:
2939508 UOL Ação Social 81534 1527484 UOL Ac?o Social ACTIVE 2015-09-01 9999-12-31
2939508 UOL Ac?o Social 81534 1527484 UOL Ac?o Social ACTIVE 2015-09-01 9999-12-31
As can be seen, the two entries in each pair are identical except for the question mark character in the second entry. Even if I run something like
SELECT DISTINCT * FROM my_table
it does not help, because the rows still differ by that one character. I need a way to remove duplicate entries like these, which differ only in their special characters. My manager says that the entries with question marks are bad data and that I should remove them. Does anyone have an idea how to do this?