I am trying to separate this type of data using Python:

['ALCOHOL','BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES','MILK AND DAIRY PRODUCTS'],['BREAD','CAKES AND SWEETS','DIPS','MILK AND DAIRY PRODUCTS','PASTA'],['HOT FOOD','OTHERS'],['ALCOHOL','BREAD','CAKES AND SWEETS'],['BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES','MILK AND DAIRY PRODUCTS','OTHERS','SNACKS','SPICES','WATER'],['BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES'],['BREAD','CAKES AND SWEETS']

At the moment I am splitting the string on '],[', but the characters used as the separator are lost in the result. Is there any way to split this string while retaining the characters that I am splitting on?

Sean goodlip

3 Answers

1

Another, shorter way is to replace the ',' between ']' and '[' with a character or string that doesn't occur in your data, e.g. replace '],[' with ']###['.

After the replace you can split on '###':

elements = input.replace('],[', ']###[').split('###')
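A minimal sketch of the same replace-then-split idea, wrapped in a helper that refuses to run if the marker already occurs in the data. The function name `split_groups`, the `marker` parameter and the shortened `sample` string are illustrative assumptions, not part of the original answer.

```python
def split_groups(text, marker="###"):
    """Split on the comma between ']' and '[' while keeping the brackets."""
    # The marker is a hypothetical sentinel: it must not appear in the data,
    # otherwise the final split would cut in the wrong places.
    if marker in text:
        raise ValueError("marker occurs in the data; pick another one")
    # '],[' becomes ']###[', then splitting on '###' leaves ']' and '[' intact.
    return text.replace("],[", "]" + marker + "[").split(marker)


sample = "['HOT FOOD','OTHERS'],['ALCOHOL','BREAD'],['BREAD','CAKES AND SWEETS']"
groups = split_groups(sample)
for group in groups:
    print(group)
```

Each printed line is one bracketed group, with the separating comma removed.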
Mace
0

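Check this out:

x is your string and d is your delimiter. split() drops the delimiter, so add it back to every piece except the last:

parts = x.split(d)
print([p + d for p in parts[:-1]] + parts[-1:])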

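Or use a regex with a capturing group, which makes re.split() return the delimiters as separate list items:

import re
print(re.split(r'(\],\[)', x))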
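If the goal is to keep the brackets attached to each piece rather than get the delimiter back as its own item, one possible variation is to split only on the comma that sits between ']' and '[', using a lookbehind and a lookahead. This is a sketch of a different pattern from the one above; the shortened `sample` string is an assumption for illustration.

```python
import re

sample = "['HOT FOOD','OTHERS'],['ALCOHOL','BREAD'],['BREAD','CAKES AND SWEETS']"

# (?<=\]) requires a ']' just before the comma and (?=\[) a '[' just after it.
# Lookarounds consume no characters, so only the comma itself is removed.
pieces = re.split(r"(?<=\]),(?=\[)", sample)

for piece in pieces:
    print(piece)
```

Commas inside a group are untouched because they are never surrounded by ']' and '['.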
davidbilla
0

Assuming that you want to keep the '[' and ']', you can use split() to get the elements, but split() also removes the separator '],['. So you have to post-process the resulting list to re-add the '[' and ']'.

input = "['ALCOHOL','BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES','MILK AND DAIRY PRODUCTS'],['BREAD','CAKES AND SWEETS','DIPS','MILK AND DAIRY PRODUCTS','PASTA'],['HOT FOOD','OTHERS'],['ALCOHOL','BREAD','CAKES AND SWEETS'],['BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES','MILK AND DAIRY PRODUCTS','OTHERS','SNACKS','SPICES','WATER'],['BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES'],['BREAD','CAKES AND SWEETS']"

elements = []

# remove leading '[' and ending ']' otherwise ---------------
# you get '[[' and ']]' at first and last element
input = input[1:-1]

# split on '],[' and re-add '[' and ']' -------------------------
temp_elements = input.split('],[')
for temp_element in temp_elements:
    elements.append('[' + temp_element + ']')

# result -----------------------------------------------------
for element in elements:
    print(element)

Result

['ALCOHOL','BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES','MILK AND DAIRY PRODUCTS']
['BREAD','CAKES AND SWEETS','DIPS','MILK AND DAIRY PRODUCTS','PASTA']
['HOT FOOD','OTHERS']
['ALCOHOL','BREAD','CAKES AND SWEETS']
['BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES','MILK AND DAIRY PRODUCTS','OTHERS','SNACKS','SPICES','WATER']
['BREAD','CAKES AND SWEETS','FRUIT AND VEGETABLES']
['BREAD','CAKES AND SWEETS']

If you also want to keep the ',' after each element (including the last one), use:

for temp_element in temp_elements:
    elements.append('[' + temp_element + '],')
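If what is ultimately needed is real Python lists rather than bracketed strings, a different technique from the split-based approach above may fit better: wrapping the text in one more pair of brackets and parsing it with `ast.literal_eval`. This is a sketch that assumes the data is always valid Python literal syntax, as in the question; the shortened `sample` string is illustrative.

```python
import ast

sample = "['HOT FOOD','OTHERS'],['ALCOHOL','BREAD'],['BREAD','CAKES AND SWEETS']"

# Adding outer brackets turns the text into a single list-of-lists literal.
# literal_eval only accepts literals, so it does not execute arbitrary code.
nested = ast.literal_eval("[" + sample + "]")

for group in nested:
    print(group)
```

Here `nested[0]` is the list `['HOT FOOD', 'OTHERS']`, so the individual items can be used directly without further string handling.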
Mace