-2

I have a dataframe having the following keys:

tourist_spot,
nearest landmark,
longitute(landmark),
latitude(landmark),
nearest police station,
longitute(PS),
latitude(PS),
nearest bus stop,
longitute(BS),
latitude(BS),
nearest taxi stand,
longitute(TS),
latitude(TS)

and several other columns like this. What I want to achieve is something like this:

name,
type,
latitude,
longitude,
nearest_to 

The name will have the name of the landmark or PS or BS etc. Type will indicate the type of place. For example, PS for police station etc. latitude and longitude will be the locations of each place and nearest_to will be the name of the tourist_spot.

Sample data:

tourist_spot                ts1
nearest landmark            nl1
longitute(landmark)         4
latitude(landmark)          5
nearest police station      ps1
longitute(PS)               7
latitude(PS)                8
nearest bus stop            bs1
longitute(BS)               9
latitude(BS)                10
nearest taxi stand          ts1
longitute(TS)               11 
latitude(TS)                12

Please convert into a table like structure. Its tough to make a big table here.

Desired output:

 name   type   longitude   latitude   nearest_to
    nl1    landmark   4         5           ts1
    ps1    PS         7         8           ts1                        

My code so far:

import pandas as pd
df = pd.read_excel("C:\\Users\\NCOG1\\Desktop\\Meghalaya\\Daribokgre.xlsx")
df1 = pd.DataFrame(columns=['name','type_id','longitude','latitude', 'nearby_to'])
df1['name'] = df['Nearest Landmark Name'] 
df1['type_id'] = df['Nearest Landmark Name']
df1['longitude'] = df['Longitude (of Nearest Landmark)']
df1['latitude'] = df['Latitude (of Nearest Landmark)']
df1['nearby_to'] = df['Name of Tourist Spot']
Karan Gupta
  • 529
  • 2
  • 7
  • 21

1 Answers1

1

First is necessary parse columns with Latitude and Longitude with one column before:

df = pd.read_excel('Daribokgre.xlsx', sheet_name='Sheet2', index_col=0)
c = df.columns[df.columns.str.startswith('Latitude')]
idx = df.columns.get_indexer(c)
df1 = df.iloc[:, np.sort(np.r_[idx-1, idx, idx + 1])[1:]].reset_index()
#print (df1)

Then extract data to new columns in helper DataFrame:

df2 = df1.columns.to_frame(index=False)
m = df2[0].str.contains('Latitude|Longitude')
df2['type'] = df2[0].mask(m).ffill()
df2['b'] = np.where(m, df2[0].str.split(r'\s+\(', n=1).str[0], 'name')

Last create MultiIndex and reshape by stack:

df1.columns = [df2['type'], df2['b']]
df1 = (df1.stack(0)
         .reset_index(level=0, drop=True)
         .reset_index().rename(columns={1:'type'})
         .rename_axis(None, axis=1))
print (df1)
                                                type     Latitude  \
0                                           Bus Stop  25⁰33'05" N   
1                                           Landmark  25⁰33'05" N   
2                               Name of Tourist Spot  25⁰29'24" N   
3                              Nearest Hospital Name  25⁰36'01" N   
4                            Nearest Taxi Stand Name  25⁰30'45" N   
5                                     Police Station  25⁰35'53" N   
6          Restaurants or other eating joints nearby   2529'22' N   
7  Staying Options nearby (Hotels, Resorts, Homes...  25⁰29'22" N   

     Longitude                                               name  
0  90⁰19'42" E           Oragitok Jnc. Old Williamnagar-Tura road  
1  90⁰19'42" E               Oragitok Williamnagar-Tura  old road  
2  90⁰19'22" E                                         Daribokgre  
3  90⁰16'25" E                                        Asanang PHC  
4  90⁰13'01" E                                  Tura Bazaar, Tura  
5  90⁰15'14" E  Rongram Beat House whoever Daribokgre falls un...  
6   9019'21" E                                Daribokgre Homestay  
7  90⁰19'21" E                            1. Daribokgre Homestay,  
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252