While crawling an RSS feed, I do not want duplicate items added to my list. The problem is that some duplicates are not detected by my if title not in mylist check, because the titles differ slightly even though the news items are basically the same. Take a look at these two:
"Kom igjen, norsk ungdom, de eldre trenger oss!"
and
"Kom igjen norsk ungdom, de eldre trenger oss"
As you can see, the first one has a comma after Kom igjen and an exclamation mark at the end, while the second one has neither.
Since the feed provides no unique ID that distinguishes individual items, I do not know how to detect duplicates like the pair above.
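
To make the goal concrete, here is a minimal sketch of the kind of check I have in mind, assuming it is acceptable to compare titles after removing punctuation and ignoring case. The helper name normalize_title and the seen set are my own hypothetical additions, not anything the feed provides, and string.punctuation only covers ASCII punctuation, so other characters would need separate handling.

```python
import string

# Hypothetical helper: reduce a title to a key used only for comparison.
def normalize_title(title):
    # Remove ASCII punctuation, ignore case, and collapse runs of whitespace.
    stripped = title.translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.casefold().split())

titles = [
    "Kom igjen, norsk ungdom, de eldre trenger oss!",
    "Kom igjen norsk ungdom, de eldre trenger oss",
]

mylist = []   # original titles, in the order first seen
seen = set()  # normalized keys of the titles already kept

for title in titles:
    key = normalize_title(title)
    if key not in seen:
        seen.add(key)
        mylist.append(title)

print(mylist)  # only the first of the two titles is kept
```

This only catches titles that differ in punctuation, case, or spacing. Titles that differ by a word or a typo would probably need an approximate comparison instead, for example difflib.SequenceMatcher from the standard library with a similarity threshold I would have to choose.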