How do I scan a sentence in Python

Question

Suppose I have a text file:

As a manager, he told FIFA TV he communicates his messages in a measured way. “I’m not one of the lads,” Southgate explained.

Is there a way to get the sentence inside the quotes (") and save that sentence as a variable? I know I have to use the scanner method, but I'm new to Python and I don't know how. Can someone give me an example of how to store this?

Kindly Have a look at ["How to Extract a string between double quotes"](https://stackoverflow.com/questions/22735440/extract-a-string-between-double-quotes) — Abhi, Jul 07 '18 at 03:56
The sentence inside `"` has special characters, so using it as a variable name is not possible. You would need to substitute characters such as `_` for them to make that work. Check this https://stackoverflow.com/questions/5036700/how-can-you-dynamically-create-variables-via-a-while-loop too. — hygull, Jul 07 '18 at 04:03
I would prefer not to use modules such as re as discussed in that topic, just my opinion, but I think they complicate stuff. — Elodin, Jul 07 '18 at 04:04
My doubt is, you want to use the string within `"` as a variable name (creating variables dynamically) or you just want to assign this to other variable? — hygull, Jul 07 '18 at 04:07
I just want to get the string inside the " and transform it with another method @RishikeshAgrawani — Syafiqur Rahman, Jul 07 '18 at 04:20
There can be multiple substrings inside a string enclosed within `"` or only 1? — hygull, Jul 07 '18 at 04:59
Okay thanks for update. The text `“I’m not one of the lads,”` is not really inside `"` as it has `”`. Please make sure if it is `"` or `”`? I checked it in my editor. If it is `”` then we need to add support for **utf-8** texts, if it is `"` then it is fine. — hygull, Jul 07 '18 at 05:23

blhsing · Accepted Answer · 2018-07-07T05:27:01.043

1

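If you can be sure that there are always exactly two double-quotes in the string you're trying to parse, you can simply use str.split('"')[1] to extract what's between them. Splitting on the quote gives three pieces (before, between, after), and index 1 is the piece between the quotes.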

>>> s = '''As a manager, he told FIFA TV he communicates his messages in a measured way. "I’m not one of the lads," Southgate explained.'''
>>> s.split('"')[1]
'I’m not one of the lads,'

Edit: I now see that your input string actually uses the slanted double-quotes, “ and ”, not the standard double-quote, ", in which case I suggest that you use the following instead:

s = '''As a manager, he told FIFA TV he communicates his messages in a measured way. “I’m not one of the lads,” Southgate explained.'''
print(s[s.find('“') + 1:s.find('”')])

This outputs:

I’m not one of the lads,
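As a rough sketch of the same idea with the steps spelled out: `str.find` returns -1 when the character is missing, so it can help to check for that before slicing. The helper name `between_curly_quotes` is made up for illustration and is not part of the original answer.

```python
s = ('As a manager, he told FIFA TV he communicates his messages in a '
     'measured way. “I’m not one of the lads,” Southgate explained.')


def between_curly_quotes(text):
    """Return the text between the first “ and the next ”, or None if either is missing."""
    start = text.find('“')            # index of the opening quote, -1 if absent
    if start == -1:
        return None
    end = text.find('”', start + 1)   # look for the closing quote only after the opening one
    if end == -1:
        return None
    return text[start + 1:end]        # +1 skips the opening quote itself


quote = between_curly_quotes(s)
print(quote)
```

Returning None for a missing quote is just one possible choice; raising an exception would work equally well if a missing quote should be treated as an error.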

edited Jul 07 '18 at 05:27

answered Jul 07 '18 at 04:10


this is just split the sentence into word. what i mean is, to get the sentence inside the " – Syafiqur Rahman Jul 07 '18 at 05:13
Please look closer. It's properly getting the sentence between the quotes: `I’m not one of the lads,`. I just edited my answer to make it easier to read. – blhsing Jul 07 '18 at 05:17
Doesn't work even when I paste your code. The error says `IndexError: list index out of range` – Syafiqur Rahman Jul 07 '18 at 05:20
I now see that it is because your input string actually uses the slanted double-quotes, `“` and `”`, not the standard double-quote, `"`. – blhsing Jul 07 '18 at 05:22
they are different? – Syafiqur Rahman Jul 07 '18 at 05:25
Yes they are completely different characters. I've updated my answer with a new solution to account for the special double-quotes that you are using. Please check it out. – blhsing Jul 07 '18 at 05:27
that works. thanks mate. now what is this +1:s.find('”') – Syafiqur Rahman Jul 07 '18 at 05:32
You're welcome. `s.find('“')` returns the index of the first `“` in the string, and by adding one to it, we get the starting index of the sentence you're looking for, up to the ending index, which is found with `s.find('”')`. – blhsing Jul 07 '18 at 05:35

Elodin · Answer 2 · 2018-07-07T04:25:30.777

You may want to do something like this:

with open("filename.txt", "r") as file:  # Opens the file and closes it when done
    sentence = file.readline()  # First line of the file, as one string
startQuote = sentence.index('"')  # Finds first occurrence of a double quote
endQuote = sentence.index('"', startQuote + 1)  # Finds the next double quote after the first
stringSentence = sentence[startQuote + 1:endQuote]  # Slices out the text between the quotes

If you want the code to accommodate 'smart quotes', you can change it to:

with open("filename.txt", "r", encoding="utf-8") as file:  # Opens the file as UTF-8 (Python 3)
    sentence = file.readline()  # First line of the file, as one string
startQuote = sentence.index('“')  # Finds first occurrence of the opening smart quote
endQuote = sentence.index('”', startQuote + 1)  # Finds first closing smart quote after it
stringSentence = sentence[startQuote + 1:endQuote]  # Slices out the text between the quotes

You may have to fix up some 'fence-post-errors' in this code as I have not tested it.

I hope this gives you an idea of what you need to do.
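A possible way to make this more forgiving, sketched under a few assumptions: `str.index` raises `ValueError` when the character is not found, so the lookup can be wrapped in `try`/`except`. The function name `first_quoted`, its parameters, and the temporary sample file are all hypothetical and only there so the snippet runs on its own (Python 3).

```python
import os
import tempfile


def first_quoted(line, open_q='“', close_q='”'):
    """Return the text between the first open_q and the next close_q, or None."""
    try:
        start = line.index(open_q)
        end = line.index(close_q, start + 1)  # search only after the opening quote
    except ValueError:                        # raised by str.index when nothing is found
        return None
    return line[start + 1:end]


# Write a small sample file so the example is self-contained.
sample = ('As a manager, he told FIFA TV he communicates his messages in a '
          'measured way. “I’m not one of the lads,” Southgate explained.\n'
          'No quotation on this line.\n')
with tempfile.NamedTemporaryFile('w', encoding='utf-8', suffix='.txt',
                                 delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

# Read the file line by line and pull the first quotation out of each line.
with open(path, encoding='utf-8') as f:
    results = [first_quoted(line) for line in f]
os.remove(path)

print(results)
```

For straight quotes, the same function could be called as `first_quoted(line, '"', '"')`.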

this ones give me error. the error said startQuote = sentence.index("\"") # Finds first occurence ValueError: '"' is not in list — Syafiqur Rahman, Jul 07 '18 at 04:17
That may be because, the quotes you used in the example, were not quotes. They were 'smart quotes', if you look carefully, you will see what I mean. They are slightly slanted. — Elodin, Jul 07 '18 at 04:25

hygull · Answer 3 · 2018-07-07T05:58:07.923

0

Let's suppose you have your line of text in a file named sentence.txt.

sentence.txt

As a manager, he told FIFA TV he communicates his messages in a measured way. "I'm not one of the lads," Southgate explained.

Now you can try the code below to read that line and extract the substring enclosed within " (double quotes).

# -*- coding: utf-8 -*-
import io

# io.open accepts encoding= on both Python 2.7 and Python 3
with io.open('sentence.txt', encoding='utf-8') as f:
    sentence = f.read().strip()

# Exactly two double quotes split the line into three parts
words = sentence.split('"')

if len(words) == 3:
    string_in_double_quote = words[1]
    print(string_in_double_quote)  # I'm not one of the lads,
else:
    print('WARNING: String in text file does not have 2 double quotes, make sure to have it')
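Since the text in the question turned out to use curly quotes, and a line might contain more than one quotation, here is a minimal sketch of one way to cover both cases without `re`. It assumes curly and straight double quotes can be treated as interchangeable; the function name `quoted_segments` is hypothetical.

```python
def quoted_segments(text):
    """Return every substring enclosed in double quotes (straight or curly)."""
    # Map curly double quotes onto the straight one so a single split handles both styles.
    normalized = text.replace('“', '"').replace('”', '"')
    parts = normalized.split('"')
    # An even number of quote marks yields an odd number of parts;
    # an even number of parts means a quote was opened but never closed.
    if len(parts) % 2 == 0:
        raise ValueError('unbalanced double quotes')
    # Parts at odd indexes are the ones that sat between a pair of quotes.
    return parts[1::2]


line = ('As a manager, he told FIFA TV he communicates his messages in a '
        'measured way. “I’m not one of the lads,” Southgate explained.')
print(quoted_segments(line))
```

The apostrophe in `I’m` is a different character (a right single quote) and is left untouched by the replacement.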

edited Jul 07 '18 at 05:58

answered Jul 07 '18 at 05:38


This works, but the output is WARNING even though I copy-pasted the text and the code – Syafiqur Rahman Jul 07 '18 at 05:47
Paste me your string, I will check and update it. Let me check what is the issue. – hygull Jul 07 '18 at 05:48
This is what in my Windows CMD, `e:\Users\Rishikesh\Python3\Practice\DoubleQuoteStringSeparation>python main.py I'm not one of the lads,`. – hygull Jul 07 '18 at 05:51
dude, i just paste the text above. im using py 2.7 – Syafiqur Rahman Jul 07 '18 at 05:51
Actually you copied the old code (where I'd forgotten to remove `[1]`). Now you please try the current code. – hygull Jul 07 '18 at 05:57
