-4

Suppose I have a text file containing this line:

As a manager, he told FIFA TV he communicates his messages in a measured way. “I’m not one of the lads,” Southgate explained.

Is there a way to get the sentence inside the quotes (") and save it in a variable? I know I have to use the scanner method, but I'm new to Python and I don't know how. Can someone give me an example of how to store it?

  • Kindly Have a look at ["How to Extract a string between double quotes"](https://stackoverflow.com/questions/22735440/extract-a-string-between-double-quotes) – Abhi Jul 07 '18 at 03:56
  • The sentence inside `"` has special characters, so using it as a variable name is not possible. You would need to add characters like `_` to fill the gaps for that to work. Check this https://stackoverflow.com/questions/5036700/how-can-you-dynamically-create-variables-via-a-while-loop too. – hygull Jul 07 '18 at 04:03
  • I would prefer not to use modules such as re as discussed in that topic, just my opinion, but I think they complicate stuff. – Elodin Jul 07 '18 at 04:04
  • My doubt is, you want to use the string within `"` as a variable name (creating variables dynamically) or you just want to assign this to other variable? – hygull Jul 07 '18 at 04:07
  • I just want to get the string inside the " and transform it with another method @RishikeshAgrawani – Syafiqur Rahman Jul 07 '18 at 04:20
  • There can be multiple substrings inside a string enclosed within `"` or only 1? – hygull Jul 07 '18 at 04:59
  • only 1 sentence inside " – Syafiqur Rahman Jul 07 '18 at 05:02
  • Okay, thanks for the update. The text `“I’m not one of the lads,”` is not really inside `"`, as it uses `“` and `”`. Please check whether it is `"` or `”`; I checked it in my editor. If it is `”`, then we need to add support for **utf-8** text; if it is `"`, then it is fine. – hygull Jul 07 '18 at 05:23
  • if it is using " how? and if it using ” how? why – Syafiqur Rahman Jul 07 '18 at 05:27
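
To make the difference in the comments above concrete: the straight quote and the two curly quotes are three separate Unicode characters, so searching for one never matches the others. The sketch below is only an illustration using the standard `unicodedata` module; the helper name `describe` is made up for this example.

```python
import unicodedata

# The three characters discussed in the comments above.
quotes = ['"', '“', '”']

def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch))

for ch in quotes:
    print(describe(ch))
```

Because the code points differ, `'"' in text` is False for a text that only contains `“` and `”`, which is consistent with the `ValueError` and `IndexError` reported further down.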

3 Answers

1

If you can be sure that there are always exactly two double quotes in the strings you're trying to parse, you can simply use str.split('"')[1] to extract what's between them.

>>> s = '''As a manager, he told FIFA TV he communicates his messages in a measured way. "I’m not one of the lads," Southgate explained.'''
>>> s.split('"')[1]
'I’m not one of the lads,'

Edit: I now see that your input string actually uses the slanted double quotes, “ and ”, not the standard double quote, ", in which case I suggest that you use the following instead:

s = '''As a manager, he told FIFA TV he communicates his messages in a measured way. “I’m not one of the lads,” Southgate explained.'''
print(s[s.find('“') + 1:s.find('”')])

This outputs:

I’m not one of the lads,
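
One caveat with the slice: `str.find` returns -1 when the character is missing, so the one-liner silently returns the wrong text instead of failing. A slightly more defensive sketch follows, assuming you would rather get `None` in that case; the function name `between_curly_quotes` is hypothetical.

```python
def between_curly_quotes(text):
    """Return the text between the first “ and the next ”, or None if either is missing."""
    start = text.find('“')
    if start == -1:
        return None
    # Search for the closing quote only after the opening one.
    end = text.find('”', start + 1)
    if end == -1:
        return None
    return text[start + 1:end]

s = '''As a manager, he told FIFA TV he communicates his messages in a measured way. “I’m not one of the lads,” Southgate explained.'''
quote = between_curly_quotes(s)
print(quote)
```

The result is stored in the variable `quote`, so it can be passed on to whatever other method needs it.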
blhsing
  • this is just split the sentence into word. what i mean is, to get the sentence inside the " – Syafiqur Rahman Jul 07 '18 at 05:13
  • Please look closer. It's properly getting the sentence between the quotes: `I’m not one of the lads,`. I just edited my answer to make it easier to read. – blhsing Jul 07 '18 at 05:17
  • doesnt work even when i paste ur code. the eror says IndexError: list index out of range – Syafiqur Rahman Jul 07 '18 at 05:20
  • I now see that it is because your input string actually uses the slanted double-quotes, `“` and `”`, not the standard double-quote, `"`. – blhsing Jul 07 '18 at 05:22
  • they are different? – Syafiqur Rahman Jul 07 '18 at 05:25
  • Yes they are completely different characters. I've updated my answer with a new solution to account for the special double-quotes that you are using. Please check it out. – blhsing Jul 07 '18 at 05:27
  • that works. thanks mate. now what is this +1:s.find('”') – Syafiqur Rahman Jul 07 '18 at 05:32
  • You're welcome. `s.find('“')` returns the index of the first `“` in the string, and by adding one to it, we get the starting index of the sentence you're looking for, up to the ending index, which is found with `s.find('”')`. – blhsing Jul 07 '18 at 05:35
0

You may want to do something like this:

with open("filename.txt", "r") as file:  # Opens the file and closes it afterwards
    sentence = file.readline()  # Reads the first line as a single string
startQuote = sentence.index('"')  # Finds the first occurrence of the quote
endQuote = sentence.index('"', startQuote + 1)  # Finds the next quote after the first one
stringSentence = sentence[startQuote + 1:endQuote]  # Slices out the text between the quotes

If you want the code to accommodate 'smart quotes', you can change it to:

with open("filename.txt", "r", encoding="utf-8") as file:  # Opens the file as UTF-8 text
    sentence = file.readline()  # Reads the first line as a single string
startQuote = sentence.index('“')  # Finds the first occurrence of the opening quote
endQuote = sentence.index('”', startQuote + 1)  # Finds the closing quote after it
stringSentence = sentence[startQuote + 1:endQuote]  # Slices out the text between the quotes

You may have to fix up some 'fence-post-errors' in this code as I have not tested it.

I hope this gives you an idea of what you need to do.

Elodin
  • this ones give me error. the error said startQuote = sentence.index("\"") # Finds first occurence ValueError: '"' is not in list – Syafiqur Rahman Jul 07 '18 at 04:17
  • That may be because, the quotes you used in the example, were not quotes. They were 'smart quotes', if you look carefully, you will see what I mean. They are slightly slanted. – Elodin Jul 07 '18 at 04:25
0

Suppose you have your line of text in a file named sentence.txt.

sentence.txt

As a manager, he told FIFA TV he communicates his messages in a measured way. "I'm not one of the lads," Southgate explained.

Now you can try the code below to read that line and extract the substring enclosed within " (double quotes).

# -*- coding: utf-8 -*-
with open('sentence.txt', encoding='utf-8') as f:
    sentence = f.read().strip()

# Splitting on " gives exactly three parts when the text has two double quotes
words = sentence.split('"')

if len(words) == 3:
    string_in_double_quote = words[1]
    print(string_in_double_quote)  # I'm not one of the lads,
else:
    print('WARNING: String in text file does not have 2 double quotes, make sure to have it')
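
If the file might contain either straight or curly double quotes, one option is to normalize the curly ones first and then reuse the same split. This is only a sketch under that assumption; `CURLY_TO_STRAIGHT` and `extract_quoted` are names invented for the example.

```python
# Translation table that maps both curly double quotes to the straight one.
CURLY_TO_STRAIGHT = str.maketrans({'“': '"', '”': '"'})

def extract_quoted(sentence):
    """Return the text between two double quotes (straight or curly), else None."""
    parts = sentence.translate(CURLY_TO_STRAIGHT).split('"')
    # Exactly two quotes produce exactly three parts.
    if len(parts) != 3:
        return None
    return parts[1]

curly = 'He said “I’m not one of the lads,” today.'
straight = 'He said "I\'m not one of the lads," today.'
print(extract_quoted(curly))
print(extract_quoted(straight))
```

The apostrophe in `I’m` is a different character from the double quotes, so it is left untouched by the translation.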
hygull