I have a bunch of strings such as:
Super Mario Bros. 8 (En,Fr,De,Es,It)
Donald Duck in Whacky Land (En,Fr,De,Es,Sv)
Toadstool Adventures 3D (En)
Chinaland (En,De)
A title which doesn't have any such thing
...
That is, a product title, sometimes followed by a list of one or more language codes in parentheses.
I am really struggling to come up with a (PCRE) regexp that removes these lists from the strings safely, that is, without being likely to touch the titles.
I know that ([A-Z]{1}[a-z]{1})
must be involved somewhere, to match a single language code such as "It" or "De". But how to handle any number of them in a row, separated by commas (or with no comma when there is just one), is beyond my regular expression skills.
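To make the goal concrete, here is a minimal sketch of the kind of pattern I imagine, written in Python only so it can be run. It assumes every code is exactly one capital letter followed by one lowercase letter, that codes are separated by a bare comma with no spaces, and that the whitespace just before the opening parenthesis should go too. The names LANG_LIST and strip_languages are made up for illustration. The pattern uses only constructs that PCRE also supports, but I have not verified it against real data.

```python
import re

# One two-letter code (capital + lowercase), then zero or more ",Xx" repeats,
# all wrapped in parentheses, plus any whitespace directly before the "(".
# Assumption: no spaces after the commas, as in the examples above.
LANG_LIST = re.compile(r"\s*\([A-Z][a-z](?:,[A-Z][a-z])*\)")

def strip_languages(title: str) -> str:
    """Remove a parenthesised language list such as "(En,Fr,De)" if present."""
    return LANG_LIST.sub("", title)

if __name__ == "__main__":
    for s in [
        "Super Mario Bros. 8 (En,Fr,De,Es,It)",
        "Toadstool Adventures 3D (En)",
        "A title which doesn't have any such thing",
    ]:
        print(strip_languages(s))
```

The pattern is not anchored, so it would also remove such a list from the middle of a string. Appending $ to it would restrict the match to the very end, which might be safer if nothing ever follows the list.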
I really wish that they had used some kind of unambiguous separator between the title part and the "metadata" part of the filenames... Then I wouldn't need to do all this manual trial-and-error removal. But they didn't.