We have several texts (strings) that contain annotations which are not part of the spoken text, such as [inaudible] and [Laughter]. We want to delete those elements from our strings. They always have the same structure: a description written in square brackets, [...]. Example:
text="I think I could pretty much say, Mike, most of them have become stars, if not all. Because you won. Winning is a wonderful thing. [Laughter] So I thought what I'd do is go around the room"
This is what we have tried so far:
text2=re.sub('[.*]', '', text)
or
text2=re.sub('/[.*/]', '', text)
If the text contains two or more of these elements ([inaudible] and so on), it also deletes all the text between them. That should not happen, and we don't know how to avoid it. The first attempt sometimes deletes . characters and sometimes doesn't, which is confusing as well. We are Python beginners :)
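For reference, here is a minimal sketch of what seems to be going on, using a shortened, made-up sample sentence rather than the full transcript. As far as I can tell, unescaped square brackets define a character class, so '[.*]' matches a single . or * character instead of a bracketed annotation; the '/[.*/]' variant behaves similarly, because in Python the escape character is the backslash, not the forward slash. Once the brackets are escaped, a greedy .* runs from the first [ to the last ], which would explain the text between two annotations disappearing. A non-greedy .*? (or the class [^\]]*) stops at the nearest closing bracket. The variable names and the final whitespace cleanup are assumptions added for illustration.

```python
import re

# Hypothetical sample with two bracketed annotations (not the original transcript).
text = "Winning is a wonderful thing. [Laughter] So I thought [inaudible] what I'd do."

# '[.*]' is a character class: it matches one '.' or one '*' at a time,
# so this removes the periods and leaves the annotations in place.
only_dots = re.sub('[.*]', '', text)

# Escaped brackets with a greedy '.*' match from the first '[' to the LAST ']',
# swallowing the text between the two annotations.
greedy = re.sub(r'\[.*\]', '', text)

# Non-greedy '.*?' stops at the nearest ']', so each annotation is removed separately.
cleaned = re.sub(r'\[.*?\]', '', text)

# Optional: collapse the double spaces left behind where an annotation was removed.
cleaned = re.sub(r'\s{2,}', ' ', cleaned).strip()

print(only_dots)
print(greedy)
print(cleaned)
```

Note that this sketch assumes annotations never contain nested brackets and never span multiple lines; by default . does not match a newline, so a multi-line annotation would need the re.DOTALL flag.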