Each string in the list below corresponds to two tags:

 tags = ['Club House Folk Pop ', 'alternative rock electro ']

I would like to split each string into a sublist containing its two genres, as in:

['Club House', 'Folk Pop'] and ['alternative rock', 'electro']

I know I can split each string with:

for t in tags:
    tag = t.split(" ")

But that would disrupt the meaning of the tags.

Is there a way to split each string at one specific space " ", like so:

tags = ['Club House Folk Pop ', 'alternative rock electro ']

                    ^                             ^
                    |                             |
                    |                             |
                   here                          here
8-Bit Borges
  • What is considered a "correct" genre? Do you have a list of valid genres available? – Brendan Abel Feb 09 '17 at 01:54
  • Is it always the second space? What if the *first* genre is a single word (e.g. `'electro alternative rock'`)? You're probably better off trying to find matches against a list of known genres (if possible). – Mac Feb 09 '17 at 01:54
  • Possible duplicate of [Split string at nth occurrence of a given character](http://stackoverflow.com/questions/17060039/split-string-at-nth-occurrence-of-a-given-character) – Jonathan von Schroeder Feb 09 '17 at 01:55
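
To illustrate the suggestion in the comments about matching against known genres, here is a minimal sketch. The `KNOWN_GENRES` list and the `split_by_known_genres` helper are hypothetical names made up for this example; a real genre list would have to come from your own data. Matching is done in lowercase, so the returned tags are lowercase as well.

```python
# Hypothetical list of valid genres (an assumption, not from the question).
KNOWN_GENRES = ['club house', 'folk pop', 'alternative rock', 'electro']


def split_by_known_genres(tag, known_genres=KNOWN_GENRES):
    """Greedily match known genres against the words of `tag`, left to right."""
    words = tag.lower().split()
    # Try genres with more words first, so multi-word genres win over shorter ones.
    candidates = sorted((g.split() for g in known_genres), key=len, reverse=True)
    found = []
    i = 0
    while i < len(words):
        for genre_words in candidates:
            if words[i:i + len(genre_words)] == genre_words:
                found.append(" ".join(genre_words))
                i += len(genre_words)
                break
        else:
            # No known genre starts at this word; keep the word as its own tag.
            found.append(words[i])
            i += 1
    return found


print(split_by_known_genres('Club House Folk Pop '))
print(split_by_known_genres('electro alternative rock'))
```

Unlike a fixed split position, this also handles the case where the first genre is a single word, at the cost of needing a genre list up front.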

1 Answer

Assuming that it is always after the second space, you can split the list using the following:

x = [[" ".join(words[:2]), " ".join(words[2:])] for words in map(str.split, tags)]

This iterates over every item in the list and splits it into words. It then joins the first two words into one tag and everything after the first two words into the other. Assuming that the example data you posted is representative of the whole data set, this should work.
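
The same idea can be sketched as a small helper, in case the split point is not always the second space. The name `split_after_nth_space` and its parameter `n` are made up for this example. It calls `str.split()` with no argument, which splits on any run of whitespace and drops empty strings, so the trailing space in each tag does not end up in the result.

```python
def split_after_nth_space(tag, n=2):
    """Split `tag` into two parts: the first n words and the remaining words."""
    # With no argument, split() ignores leading/trailing whitespace.
    words = tag.split()
    return [" ".join(words[:n]), " ".join(words[n:])]


tags = ['Club House Folk Pop ', 'alternative rock electro ']
result = [split_after_nth_space(tag) for tag in tags]
print(result)
# [['Club House', 'Folk Pop'], ['alternative rock', 'electro']]
```

This still assumes you know where the first genre ends; with `n=1`, for instance, `'electro alternative rock'` would split into `'electro'` and `'alternative rock'`.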

Wso