
I am trying to create a list of lists of the facilities offered by different restaurants from a pandas DataFrame of restaurant information. One of the columns is named "Facility" and contains the facilities each restaurant offers. The Facility column looks like this:

0      Home Delivery   Vegetarian Only  Valet Parking...
1      Wheelchair Accessible   Serves Non Veg   Full ...
2      Home Delivery   Full Bar Available  Live Music...
3      Full Bar Available  Valet Parking AvailableInd...
4      Full Bar Available  Table booking recommendedS...
                             ...                        
170                        Home Delivery  Indoor Seating
171    Wheelchair Accessible   Full Bar Available  Ni...
172    Full Bar Available  Smoking AreaBrunchLive Spo...
173    Full Bar Available  Free ParkingLive Sports Sc...
174                     Indoor SeatingDesserts and Bakes

After splitting on two spaces ("  "), I got this output:

#code for splitting
split_facility = facility_series.apply(lambda x: x.split("  "))
split_facility[0]  #first element

output:
['Home Delivery',
 ' Vegetarian Only',
 'Valet Parking AvailableIndoor SeatingKid FriendlyPet FriendlyVegan OptionsOutdoor SeatingCatering AvailableSelf ServiceServes Jain FoodDesserts and Bakes']

Here I noticed that in some values of the list, consecutive facilities have merged, like ["Valet Parking AvailableIndoor Seating"], which should instead be ["Valet Parking Available", "Indoor Seating"].
I want the output to be like this:

[['Home Delivery','Vegetarian Only','Valet Parking Available','Indoor Seating','Kid Friendly','Pet Friendly','Vegan Options','Outdoor Seating','Catering Available','Self Service','Serves Jain Food','Desserts and Bakes'],
['Wheelchair Accessible','Serves Non Veg','Full Bar Available','Indoor Seating','Free Wifi','Free Parking','Outdoor Seating','Table booking recommended','Brunch']]
Devesh

1 Answer


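Try using ([a-z])([A-Z]) to find the places where two facilities have merged, i.e. a lowercase letter (the end of one facility) immediately followed by an uppercase letter (the start of the next), and replace each match with "$1 $2". In Python's re.sub the same replacement is written r"\1 \2".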

Here, $1 is the first captured group (the lowercase letter) and $2 is the second captured group (the uppercase letter).
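
A minimal sketch of that idea in Python, using only the standard re module. The helper name split_facilities and the sample string are made up for illustration, and the sketch assumes that a lowercase letter directly followed by an uppercase letter always marks a merged boundary, so a facility name with an internal capital (for example "WiFi") would be split incorrectly. It inserts two spaces at each merged boundary so that those boundaries look like the separators already in the data, then splits on runs of two or more whitespace characters.

```python
import re

# A lowercase letter immediately followed by an uppercase letter is treated
# as the point where two facility names were merged (an assumption).
MERGED_BOUNDARY = re.compile(r"([a-z])([A-Z])")


def split_facilities(text):
    """Split one raw 'Facility' string into a list of facility names."""
    # Put two spaces at each merged boundary so it matches the other separators.
    separated = MERGED_BOUNDARY.sub(r"\1  \2", text)
    # Split on runs of two or more whitespace characters.
    parts = re.split(r"\s{2,}", separated)
    # Trim stray spaces and drop empty pieces.
    return [part.strip() for part in parts if part.strip()]


# Hypothetical sample modelled on the first row shown in the question.
sample = "Home Delivery   Vegetarian Only  Valet Parking AvailableIndoor SeatingKid Friendly"
print(split_facilities(sample))
```

With the series from the question, something like facility_series.apply(split_facilities).tolist() should then give the list of lists, assuming every value in the column is a string.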

rbhattad