The script below takes a single encoded polyline string as input and returns a data frame with the corresponding latitude/longitude pairs.
I would like the input to be a data set like this:
ActivityID | Polyline |
---|---|
1 | PolyLineValue1 |
2 | PolyLineValue2 |
3 | PolyLineValue3 |
and the output to be as follows (keeping the ActivityID):
ActivityID | latitude | longitude |
---|---|---|
1 | 123 | 123 |
1 | 123 | 123 |
1 | 123 | 123 |
2 | 123 | 123 |
2 | 123 | 123 |
2 | 123 | 123 |
3 | 123 | 123 |
3 | 123 | 123 |
3 | 123 | 123 |
I was thinking of iterating over the rows of the input data set to do this, but I've read here that this is not a great idea performance-wise.
Could someone please advise how to do this in Python?
import pandas as pd
polyline_str = 'eldyHyOOCKBuA~@_Ar@[LgC`CaAdAaBtA[h@Sv@a@|@St@M\\g@l@TIP?FMDi@Pw@L_@Ra@XH^jAfAtFF|@@~@AhB@z@Aj@]g@Uq@k@oAUu@Ow@UmBWgAUVs@zAc@pBu@xAg@vAm@lBaAtCYn@Y~@Qz@gArCc@\\]`@y@j@e@M}AyCM]Ou@_@kAe@mBqAeGaAcFWo@e@eCWwBY_BWwB[kBAu@LY^a@^a@f@w@d@]n@o@\\q@r@_AVa@Vm@TSv@yAhBoBv@cA`B_BnB_C~@cA\\c@^]pAyAVUJn@Az@_@pCCz@YxBUrBEv@Vt@b@Rf@JpBTd@LnBThAPpB`@b@@jAPd@JJL@VK`@eAbAaAlA]T_@b@_@\\w@xA_AjAa@b@OJ[h@[l@Sr@a@bAg@jBLW`@yA`@e@NKNPXdA|@xEL|@@zCFnBGt@a@To@RkARuA@gAKa@@s@vA_@d@URm@z@aAnB[b@Up@q@tA_@d@_AbAc@Vm@RYNe@`@cAt@a@`@_@PYLg@SA?ORCf@EJa@TeCtBKJmC~Au@La@hAJnA?XcUaD@^Cg@BGBBMJOPIVC\\?f@Hx@ZlBH|@F`AXtDD~@Hr@B~@BJf@Pd@FBTC`BHrCAx@WjCKz@E`A]lBOvBSpBc@|Aa@`Cq@zC_@rBmArE[j@o@~AkAdC}@hAy@jA}@`A[h@_@f@]T}@_C]k@G@c@Xs@sAUOcApAeA~@wA`AQBa@Tc@LiAd@i@Hc@@e@AmAFgAAe@Hg@Pc@`@{@bAe@HkAJKHExBWv@F|@\\x@BrAQPe@@e@My@AmAIkAUaAYc@Dg@Kg@E{CFkAMkAHoGAoBCe@Kg@EoASkAc@kAK_@o@g@FMDYj@GV]b@aA\\{CbAcAfAc@Xc@N", "_ujyHpoDb@Ub@]^c@`@[pBu@hAYb@Q`@Wp@qA\\FBBVp@hABf@Ld@Zb@LPCnBZd@?t@FlAARBjA@l@Kd@[hAEz@Nd@Np@?r@I`AUdBp@fBTlCJx@Hd@Ab@WNu@SgACw@GS]u@Yu@BGnAg@hAWd@Yl@k@^g@d@S\\GfAEf@?b@Hj@?f@FTAhASd@Yb@U`@[`@_@~@kAb@S^]~AkBZk@F?Zf@\\t@Vn@Nz@nAxBHCz@_Az@gAt@s@RY^a@t@wAZc@h@uAT_@~@oDv@qBH{@Nu@Xk@Nq@Bs@Lu@PuAPy@F{@Ps@Lw@Ru@Fk@Rw@@}@Kw@NaAP_DGy@?_AF}@Gw@M{@IQa@Du@iGMu@KeBQ{AQw@Iy@Ss@H}@\\g@Z]?CCChTpB@HHJBERGDOXMt@Od@@RBvAZFCTAPFv@AFGH?x@u@\\c@Vi@EMeAqCA_@Xm@RW~@cA~@w@|@cAt@c@Pe@Kw@_BeGcAcF}@uDmBoHKu@G{@YuBI}@i@yCK{@WkAJw@^c@`@y@lAc@^YrA}B|@iAPe@t@uAd@s@z@cArAsAnAmAn@}@~@aA^e@t@iA`AaAtA{A\\HEz@UjBQvBSlAG~@YnB?|@Vl@`@NfARf@B\\FhBNdBRbF`AjAP\\d@`@lA'
index, lat, lng = 0, 0, 0
# List of decoded (lat, lng) tuples
coordinates = []
# Dictionary holding the most recent latitude/longitude deltas
changes = {'latitude': 0, 'longitude': 0}
# Coordinates have variable length when encoded, so just keep
# track of whether we've hit the end of the string. In each
# while loop iteration, a single coordinate is decoded.
while index < len(polyline_str):
    # Gather lat/lon changes, store them in a dictionary to apply them later
    for unit in ['latitude', 'longitude']:
        shift, result = 0, 0
        while True:
            byte = ord(polyline_str[index]) - 63
            index += 1
            result |= (byte & 0x1f) << shift
            shift += 5
            if byte < 0x20:
                break
        if (result & 1):
            changes[unit] = ~(result >> 1)
        else:
            changes[unit] = (result >> 1)
    lat += changes['latitude']
    lng += changes['longitude']
    coordinates.append((lat / 100000.0, lng / 100000.0))
df = pd.DataFrame(coordinates, columns=['lat', 'lng'])
print(df)
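To make the target shape concrete, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the input is a pandas DataFrame with columns named `ActivityID` and `Polyline`, as in the table above. The names `decode_polyline`, `explode_polylines`, and the `precision` parameter are hypothetical and introduced only for illustration. The decoding loop is the same as in the script, wrapped in a function. Since each string has to be decoded character by character anyway, the row iteration itself is probably not the dominant cost.

```python
import pandas as pd


def decode_polyline(polyline_str, precision=5):
    """Decode one encoded polyline into a list of (lat, lng) tuples."""
    index, lat, lng = 0, 0, 0
    coordinates = []
    factor = 10 ** precision  # 5 decimal places, as in the script above
    while index < len(polyline_str):
        deltas = []
        for _ in range(2):  # latitude delta first, then longitude delta
            shift, result = 0, 0
            while True:
                byte = ord(polyline_str[index]) - 63
                index += 1
                result |= (byte & 0x1f) << shift
                shift += 5
                if byte < 0x20:  # no continuation bit: value is complete
                    break
            # The lowest bit carries the sign of the delta
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lng += deltas[1]
        coordinates.append((lat / factor, lng / factor))
    return coordinates


def explode_polylines(activities):
    """Return one row per decoded point, keeping the ActivityID."""
    rows = [
        (activity_id, lat, lng)
        for activity_id, polyline in zip(activities['ActivityID'],
                                         activities['Polyline'])
        for lat, lng in decode_polyline(polyline)
    ]
    return pd.DataFrame(rows, columns=['ActivityID', 'latitude', 'longitude'])


# Short sample polyline used here only as stand-in data for both activities
sample = '_p~iF~ps|U_ulLnnqC_mqNvxq`@'
activities = pd.DataFrame({'ActivityID': [1, 2], 'Polyline': [sample, sample]})
result = explode_polylines(activities)
print(result)
```

Building a flat list of tuples and constructing the data frame once at the end avoids appending to a data frame inside the loop, which is usually the slow part of row-wise approaches.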