I'm trying to split the text of academic papers into sentences. Normally, splitting sentences would simply be:
sentence = 'This is a sentence. This is another sentence'
separate = sentence.split('.')
# ['This is a sentence', ' This is another sentence']
However, this logic does not work for sentences that contain in-text citations, because the period in "et al." is also treated as a sentence boundary. For example:
This is a sentence is a paper with a citation (author et al., 2020a) and it contains more
information. This is similar to the examples I have (author et al., 2020a).
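To make the failure concrete, here is a minimal sketch of what the naive split does to that sample. The variable names `text` and `pieces` are only mine for illustration, and I have joined the sample into a single string:

```python
# Sample text from above, joined into one string.
text = (
    "This is a sentence is a paper with a citation (author et al., 2020a) "
    "and it contains more information. This is similar to the examples "
    "I have (author et al., 2020a)."
)

# Naive split: every period is treated as a sentence boundary,
# including the one in "et al." and the final one.
pieces = text.split('.')

for piece in pieces:
    print(repr(piece))
```

The split produces five pieces instead of two: each citation is cut in half after "et al", and the trailing period leaves an empty string at the end.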
How could I split sentences (like the sample above) so the output would look something like this:
['This is a sentence is a paper with a citation (author et al., 2020a) and it contains more information', 'This is similar to the examples I have (author et al., 2020a)']
Is there an easy solution to this problem? I'd appreciate any suggestions.