I have some data that looks something like this:

s = "string,string,string;otherstring,otherstring,otherstring"

I want to first split it by the '; '. I used s.split('; '), so I then have:

s = ["string, string, string", "otherstring, otherstring, otherstring"]

Then I want to split it again, this time by the ', '.

I have tried a few different things that have not worked. How can I solve this?

First I started with

s = "string,string,string;otherstring,otherstring,otherstring"
s.split('; ')

for i in range(len(s)):
   s[i].split(', ')

But that didn't work so I then tried

s = "string,string,string;otherstring,otherstring,otherstring"
[i.split('; ') for i in s.split(', ')]

And that also didn't work.

I'm assuming there is a solution using re.split(), but I didn't find anything in my five-minute Google search.

Patrick
  • What is your desired result? [string, string, string, otherstring, otherstring, otherstring] or [[string, string, string], [otherstring, otherstring, otherstring]]? – Gilseung Ahn Jul 25 '23 at 00:40
  • So what do you want the final result to look like? A flat list, as if there were no difference between comma and semicolon? Or two lists, one from each side of the semicolon, delimited by commas? – Mark Reed Jul 25 '23 at 01:11
  • "First I started with" - this creates all the split-up results, but doesn't do anything with the sub-lists. "But that didn't work so I then tried" - that creates a list of sub-lists, but then doesn't give it a name, so it won't be useful later. – Karl Knechtel Jul 25 '23 at 04:02
  • "I'm assuming there is a solution using re.split()" - well, from your search, did you understand how `re.split` works? Do you know how to write regular expressions? Can you think of a regular expression that describes the pattern you want to use for splitting? Where exactly are you stuck? – Karl Knechtel Jul 25 '23 at 04:25

4 Answers

You are trying to split the entire string s directly, rather than first splitting it into a list of substrings and then splitting each substring individually. Additionally, pay attention to the separators: your data has no space after ';' or ',', so '; ' and ', ' never match:

s = "string,string,string;otherstring,otherstring,otherstring"

# First split by ';' to get individual substrings
substrings = s.split(';')  # separator is ';' with no trailing space

# Then split each substring by ',' to get the final result
result = [substring.split(',') for substring in substrings]
print(result)

Output:

[['string', 'string', 'string'], ['otherstring', 'otherstring', 'otherstring']]

Now you have a list of lists where each inner list contains the individual elements after the second split.
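
To make the two points above concrete, here is a minimal sketch (the names no_match, parts and rows are illustrative, not from the question). It shows that str.split returns a new list and leaves the original string untouched, and that a separator which never occurs in the string yields a one-element list:

```python
s = "string,string,string;otherstring,otherstring,otherstring"

# str.split returns a new list; calling it without storing the
# result leaves s as the same, unsplit string.
s.split(";")
print(type(s).__name__)  # still a str

# '; ' (with a space) never occurs in s, so nothing is split:
# the result is a list holding the whole string.
no_match = s.split("; ")
print(len(no_match))

# Store each result, then split the pieces individually.
parts = s.split(";")
rows = [part.split(",") for part in parts]
print(rows)
```

The key design point is simply to assign every split result to a name (or build it inside a comprehension) so it is available afterwards.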

Andrej Kesely
zoldxk

It's probably easier to use the standard-library re module, which has its own version of split that lets you split on a regex and therefore on multiple delimiters at once. Note that this produces a single flat list.

import re

s = "string,string,string;otherstring,otherstring,otherstring"

re.split(r'[,;]', s)
# ['string', 'string', 'string', 'otherstring', 'otherstring', 'otherstring']
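
If the real data may contain stray spaces around the delimiters (an assumption; the sample string in the question has none), the pattern can absorb them. The input text and the names flat and nested below are made up for illustration:

```python
import re

# Hypothetical input with inconsistent spacing around ',' and ';'.
text = "a, b ,c ; d,e"

# Flat result: split on either delimiter, swallowing surrounding whitespace.
flat = re.split(r"\s*[,;]\s*", text.strip())
print(flat)

# Nested result: split on ';' first, then split each group on ','.
nested = [re.split(r"\s*,\s*", group)
          for group in re.split(r"\s*;\s*", text.strip())]
print(nested)
```

The strip() call only removes whitespace at the very ends of the input; whitespace next to a delimiter is consumed by the \s* parts of the pattern.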
Mark

I am not sure whether the others have misunderstood you, but I feel like this is the result you're going for: one flat list from the two splits, without any sub-lists.

s = "string,string,string;otherstring,otherstring,otherstring"

output = [y for x in s.split(";") for y in x.split(",")]
print(output)

Output

['string', 'string', 'string', 'otherstring', 'otherstring', 'otherstring']

Edit: This can also be done more easily with the re module.

SomeSimpleton

I'm assuming you'd like to get a single list of all elements:

['string', 'string', 'string', 'otherstring', 'otherstring', 'otherstring']

Three steps to solve this:

  1. Split by ;
  2. Use a list comprehension to split by ,.
  3. Combine the individual elements of two lists by using sum() with an empty list as the second argument:
    s = "string,string,string;otherstring,otherstring,otherstring"

    split_on_semi = s.split(";")

    split_on_comma = [item.split(",") for item in split_on_semi]

    # sum() with [] as the start value concatenates the sub-lists
    final_list = sum(split_on_comma, [])
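
As an alternative to sum(..., []) in the last step, which builds a new list for every sub-list it adds and so gets slow on large inputs, the flattening can be sketched with itertools.chain.from_iterable. The names groups and flat are illustrative:

```python
from itertools import chain

s = "string,string,string;otherstring,otherstring,otherstring"

# Steps 1 and 2: split on ';', then split each piece on ','.
groups = [item.split(",") for item in s.split(";")]

# Step 3: flatten the list of lists in a single pass.
flat = list(chain.from_iterable(groups))
print(flat)
```

For a handful of short lists the difference is negligible, so either form works for data the size of the example.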
FlightPlan