How to split values of list at point where lower case and upper case letter are merged

Question

I am trying to create a list of lists of the facilities offered by different restaurants from a pandas DataFrame that holds information about each restaurant. One of its columns is named "Facility" and contains the facilities each restaurant offers. The column looks like this:

0      Home Delivery   Vegetarian Only  Valet Parking...
1      Wheelchair Accessible   Serves Non Veg   Full ...
2      Home Delivery   Full Bar Available  Live Music...
3      Full Bar Available  Valet Parking AvailableInd...
4      Full Bar Available  Table booking recommendedS...
                             ...                        
170                        Home Delivery  Indoor Seating
171    Wheelchair Accessible   Full Bar Available  Ni...
172    Full Bar Available  Smoking AreaBrunchLive Spo...
173    Full Bar Available  Free ParkingLive Sports Sc...
174                     Indoor SeatingDesserts and Bakes

After splitting on two spaces ("  ") I got this output:

#code for splitting
split_facility = facility_series.apply(lambda x: x.split("  "))
split_facility[0]  #first element

output:
['Home Delivery',
 ' Vegetarian Only',
 'Valet Parking AvailableIndoor SeatingKid FriendlyPet FriendlyVegan OptionsOutdoor SeatingCatering AvailableSelf ServiceServes Jain FoodDesserts and Bakes']

Here I noticed that in some values of the list consecutive facilities have been merged, like ["Valet Parking AvailableIndoor Seating"], which should be ["Valet Parking Available", "Indoor Seating"].
I want the output to look like this:

[['Home Delivery','Vegetarian Only','Valet Parking Available','Indoor Seating','Kid Friendly','Pet Friendly','Vegan Options','Outdoor Seating','Catering Available','Self Service','Serves Jain Food','Desserts and Bakes'],
 ['Wheelchair Accessible','Serves Non Veg','Full Bar Available','Indoor Seating','Free Wifi','Free Parking','Outdoor Seating','Table booking recommended','Brunch']]

Try reading a [regular expression primer](https://www.google.com/search?q=regular+expression+primer), there are many out there. — joanis, Oct 01 '19 at 12:22
There's also this [regex reference question](https://stackoverflow.com/q/22937618/3216427) here on SO. — joanis, Oct 01 '19 at 12:23
Thank you for the links I'll surely go through it, but as of now due to time constraint, it will be a world of good if I can get a solution to it. — Devesh, Oct 01 '19 at 18:11

score 0 · Answer 1 · answered Oct 10 '19 at 15:22


Try using ([A-Z][a-z]+)([A-Z][a-z]+) to identify words that are joined in the pattern RegExp (the first letter of each word capitalized), and replace each match with "$1 $2".

Here, $1 is the first word group and $2 is the second word group. In Python's re.sub, the same groups are written \1 and \2 in the replacement string.
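
As a rough illustration in Python, here is a minimal sketch of the same idea. It is a variation rather than the exact two-group pattern: it uses a zero-width lookaround for the lowercase-to-uppercase boundary, because a pattern that consumes both words can miss a word that is merged on both sides (as in "AreaBrunchLive"). The names `MERGE_POINT` and `split_facilities` are made up for this example, and the sketch assumes a lowercase letter followed directly by an uppercase letter always marks a merge, so a name like "WiFi" would be split incorrectly.

```python
import re

# Zero-width boundary between a lowercase letter and the uppercase letter
# right after it, e.g. between "e" and "I" in "AvailableIndoor".
# (Hypothetical name, chosen for this example.)
MERGE_POINT = re.compile(r"(?<=[a-z])(?=[A-Z])")


def split_facilities(text):
    """Split one raw facility string into a list of facility names."""
    # Insert two spaces at every merge point so merged names use the same
    # separator as the rest of the string.
    separated = MERGE_POINT.sub("  ", text)
    # Split on runs of two or more whitespace characters, trim each piece,
    # and drop any empty pieces.
    parts = re.split(r"\s{2,}", separated)
    return [part.strip() for part in parts if part.strip()]


sample = (
    "Home Delivery   Vegetarian Only  Valet Parking AvailableIndoor Seating"
    "Kid FriendlyPet FriendlyVegan Options"
)
print(split_facilities(sample))
```

With the Series from the question, something like `facility_series.apply(split_facilities).tolist()` should then give the list of lists.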

answered Oct 10 '19 at 15:22

rbhattad
