Does anyone have any advice for removing separators of split quotes in a piece of text? I am using Python, and am still a beginner.

For example: "Well," he said, "I suppose I could take a break." Here, the "he said," between the two quoted parts is the separator and needs to be removed. The quote should then be treated as a single quoted string: "Well, I suppose I could take a break." I haven't been able to find code that does this yet, and was hoping someone could point me in the right direction.

Thanks!

BigBlue
  • You can string replace `'he said'`, no? – OneCricketeer Sep 29 '16 at 18:54
  • looks like a pretty basic regex – njzk2 Sep 29 '16 at 18:55
  • It is not clear what the input data is (a paragraph of text, a whole book, a list of sentences, a list of text lines?) nor what should be done with it. It could range from removing everything between the second and third quote to complete [NLP](https://en.wikipedia.org/wiki/Natural_language_processing). – zvone Sep 29 '16 at 19:03
  • This is for a complete NLP project. I am analyzing the syntax of the dialogue within the text (a .txt file). Thanks for your help! – BigBlue Oct 03 '16 at 15:34

1 Answer

To extract only the text inside the double quotes in your string, you can use the `re` module:

import re

my_string = '"Well," he said, "I suppose I could take a break."'
# Non-greedy match of each double-quoted segment
quoted_string = re.findall(r'".*?"', my_string)
# 'quoted_string' is -> ['"Well,"', '"I suppose I could take a break."']
# Join with a space so the segments do not run together, then drop the quotes
new_string = ' '.join(quoted_string).replace('"', '')
# 'new_string' is -> 'Well, I suppose I could take a break.'

You can write the same thing as a one-liner:

' '.join(re.findall(r'".*?"', my_string)).replace('"', '')
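
For a whole text file, extracting every quoted segment also throws away the narration and glues unrelated quotes together. An alternative is to rewrite only the split quotes in place. The following is a minimal sketch, not a general solution: it assumes a split quote always has the shape of a quoted part ending in a comma, a short dialogue tag ending in a comma, and then a second quoted part, and it only handles straight double quotes. The names `SPLIT_QUOTE` and `merge_split_quotes` are hypothetical, chosen for illustration.

```python
import re

# Assumed shape of a split quote: "first part," tag, "second part"
# Group 1: first quoted part, which must end with a comma.
# Middle:  a dialogue tag with no quotes or sentence-ending punctuation,
#          ending with a comma (for example: he said,).
# Group 2: the second quoted part.
SPLIT_QUOTE = re.compile(r'"([^"]*,)"\s*[^".!?]*?,\s*"([^"]*)"')

def merge_split_quotes(text):
    """Merge each split quote into one quoted string; leave other text alone."""
    return SPLIT_QUOTE.sub(r'"\1 \2"', text)

sample = '"Well," he said, "I suppose I could take a break." She nodded. "Good."'
merged = merge_split_quotes(sample)
print(merged)
```

Quotes that do not match the assumed shape, such as the standalone "Good." above, are left unchanged. Real prose has many more patterns (tags ending in a period, curly quotes, nested quotes), so for an NLP project this would likely need to be extended or replaced by a proper tokenizer.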
Moinuddin Quadri