
I have a Python string that contains a number of quoted substrings. I want to write a function that extracts those substrings from the given string. I'll explain this with an example:

I have this string:

'"Row Number","Status","Timestamp","Requestor, Email, id"'

As one can see, this string contains several strings, each enclosed in double quotes (""). I want to write a function that takes this string as input and returns the following list:

['Row Number', 'Status', 'Timestamp', 'Requestor, Email, id']

Currently, I have very trivial logic. I split the string at every double quote (`"`), which gives me an awkward intermediate list like the following:

['', 'Row Number', ',', 'Status', ',', 'Timestamp', ',', 'Requestor, Email, id', '']

I pick the odd-indexed items from this list to get my desired list. I want to know whether there is a better approach to this problem. For example, could re do this in a single line of code? Or is there something else?
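For reference, a minimal sketch of the current approach described above, assuming the input always starts and ends with a double quote (the variable names `parts` and `fields` are only illustrative):

```python
s = '"Row Number","Status","Timestamp","Requestor, Email, id"'

# Splitting at every double quote leaves the quoted contents at the odd
# indices; the even indices hold '' or the ',' separators between fields.
parts = s.split('"')

# Take every second element, starting at index 1.
fields = parts[1::2]
print(fields)
```

This relies on the quotes being perfectly balanced; a stray or escaped quote inside a field would shift the indices.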

Edit: I'm not reading a CSV file, and my data is not coming from any CSV source. It comes from the Google Sheets API. I cannot use functions like read_csv or split the string at commas, because some of the strings inside the string also contain commas. SO, I MUST EXTRACT ALL THE STRINGS INSIDE "" first.

Ruchit Patel
  • Your best bet is probably the `csv` module, which will handle not only quoted strings but also strings that contain commas that aren't meant to be delimiters. – TigerhawkT3 Mar 30 '21 at 08:20
  • Since the substrings you're looking for are really separated by commas rather than double-quotes, you should instead split them by commas, and then strip the double-quotes at both ends of each substring. – blhsing Mar 30 '21 at 08:21
  • @blhsing - With that route I'd say `replace` the double quotes with nothing and then split on comma. – TigerhawkT3 Mar 30 '21 at 08:22
  • Splitting by commas is not an option for me, since I'm pretty sure that several strings in the string contain commas. So `string.split(',')` is not a solution for me. – Ruchit Patel Mar 30 '21 at 08:22
  • In that case, yes, use the `csv` module as I originally said. – TigerhawkT3 Mar 30 '21 at 08:22
  • The data you describe looks like it came from a `.csv` file. If that is the case, then you should *explicitly say so up front* when you ask this sort of question. Except actually you should first do your own research, for example putting something like `python read csv file` into a search engine. If you believe the format is custom, there are two reasons for that: you made it up (then re-design it to be easier for you!) or it was given to you by someone else (talk to them or read their documentation to understand what they had in mind). – Karl Knechtel Mar 30 '21 at 08:22
  • Google Sheets is a spreadsheet service, and it seems to be emitting data in CSV format. Of course, the best option would be to see if the API allows you to extract data by column instead of just getting one long string. Either way, SHOUTING DOESN'T HELP. – TigerhawkT3 Mar 30 '21 at 08:27
  • Your edit makes no sense. There is no good reason why you cannot use the `csv` module. At least not because there are commas inside the quotes. – blhsing Mar 30 '21 at 08:28
  • If you are not reading a `csv`, but just parsing a simple string, you can do it using regular expressions:
    ```python
    import re
    s = '"Row Number","Status","Timestamp","Requestor Email id"'
    pattern = re.compile(r'\"([\w\s]+)+\"')
    print(pattern.findall(s))
    ```
    – Fabio Caccamo Mar 30 '21 at 08:30
  • @MikePatel check out my answer. I have covered the scenario where your strings contain commas too. – an4s911 Mar 30 '21 at 08:40
  • Yes, as @TigerhawkT3 suggested, using the `csv` reader works. It takes the quote character as a parameter, so it does not split at commas that are not supposed to be delimiters. Thanks for this suggestion. I was not aware of the `quotechar` parameter of the csv reader. – Ruchit Patel Mar 30 '21 at 08:46
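
The two suggestions from the comments above can be sketched as follows. This is a minimal illustration, not a definitive solution: it assumes the data arrives as a single line of text, and the names `row` and `quoted` are made up for the example. `csv.reader` accepts any iterable of strings, so no file is needed.

```python
import csv
import re

s = '"Row Number","Status","Timestamp","Requestor, Email, id"'

# csv.reader takes any iterable of lines, so a one-element list works.
# quotechar='"' is already the default; it is spelled out here for clarity.
row = next(csv.reader([s], quotechar='"'))
print(row)

# Regex alternative: capture everything between each pair of double quotes.
# Assumption: fields never contain a double quote themselves.
quoted = re.findall(r'"([^"]*)"', s)
print(quoted)
```

Both calls give the same four fields for this input. The `csv` version also copes with doubled quotes (`""`) inside a field, which the simple regex does not.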

1 Answer


Split the string on `","` (a closing quote, a comma, and an opening quote), then use strip('"') on each piece to get rid of the leftover double quotes. Like this:

def returnList(string):
    # Split on '","' so that only the places where a closing quote,
    # a comma, and an opening quote meet act as separators
    splitted_strings = string.split('","')
    LIST = []

    for item in splitted_strings:
        # Remove the leftover quotes at the start of the first item
        # and at the end of the last item
        stripped_item = item.strip('"')
        LIST.append(stripped_item)

    return LIST

string = '"Item1","Item2","Item3"'
print(returnList(string))

#OUTPUT: ['Item1', 'Item2', 'Item3']

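Even if you have commas in your items, this will work, because it splits only at the points where a double quote, a comma, and another double quote come together. It will not work if an item itself contains that exact sequence, or if an item begins or ends with a double quote.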
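
As a quick check of that claim, here is a compact sketch of the same idea applied to the string from the question. The helper name `split_quoted` is hypothetical and not part of the code above.

```python
def split_quoted(text):
    # Drop the outer quotes, then split only where a closing quote,
    # a comma, and an opening quote meet.
    return text.strip('"').split('","')

s = '"Row Number","Status","Timestamp","Requestor, Email, id"'
fields = split_quoted(s)
print(fields)

# Limitation: a field that itself contains the sequence '","'
# is split in the wrong place.
tricky = '"a","x"",""y"'   # two CSV fields: a  and  x","y
broken = split_quoted(tricky)
print(broken)
```

The first call keeps `Requestor, Email, id` together as one item. The second shows the limitation: the input holds two fields, but the function returns three pieces.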

Hope this helps

an4s911