How to get first and last element of a list which is saved as a string of list of tuples

Question

I have a dataframe column which consists of trajectories of GPS lat, long points. One sample element of this column looks like:

'[(41.141412, -8.618643), (41.141376, -8.618499), (41.14251, -8.620326), (41.143815, -8.622153), (41.144373, -8.623953), (41.144778, -8.62668), (41.144697, -8.627373), (41.14521, -8.630226), (41.14692, -8.632746), (41.148225, -8.631738), (41.150385, -8.629938), (41.151213, -8.62911), (41.15124, -8.629128), (41.152203, -8.628786), (41.152374, -8.628687), (41.152518, -8.628759), (41.15268, -8.630838), (41.153022, -8.632323), (41.154489, -8.631144), (41.154507, -8.630829), (41.154516, -8.630829), (41.154498, -8.630829), (41.154489, -8.630838)]'

I would like to access the first and last point of each such element in the dataframe (the starting and ending latitude/longitude points) by doing something like df["trajectory"][:,0] and df["trajectory"][:,-1]. However, this gives the error:

KeyError: 'key of type tuple not found and not a MultiIndex'

Doing df["trajectory"][:][0][0] instead gives '['.

I tried to force the datatype of the column by doing list(df["trajectory"][0]), however this gives:

['[',
 '(',
 '4',
 '1',
 '.',
 '1',
 '4',
 '1',...

The column has dtype object and each element seems to be interpreted as a string. How can I work around this issue? I appreciate any help!

It's a string **which represents** a list of tuples. If you want to work with it as a list of tuples, it's necessary to **create** that list of tuples, which entails **converting** the string. See the linked duplicate for details. — Karl Knechtel, Jan 17 '23 at 15:41
@robot, `df["trajectory"].str.strip("[]").str.split(",\s(?=[(])", expand=True).iloc[:, [0,-1]]` will get you a _dataframe_ with the first and last point in each row. — Timeless, Jan 17 '23 at 15:48
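The conversion described in the first comment could look roughly like the sketch below. It assumes every cell is a valid Python literal for a non-empty list of tuples, and it uses `ast.literal_eval` from the standard library to parse each string. The small `df` built here and the new column names `start` and `end` are made up for illustration; only the `trajectory` column name comes from the question.

```python
import ast

import pandas as pd

# Hypothetical sample data in the same shape as the question's column:
# each cell is a *string* that looks like a list of (lat, long) tuples.
df = pd.DataFrame({
    "trajectory": [
        "[(41.141412, -8.618643), (41.141376, -8.618499), (41.14251, -8.620326)]",
        "[(41.154489, -8.631144), (41.154507, -8.630829)]",
    ]
})

# Parse each string into a real list of tuples.
# literal_eval only accepts Python literals, so it does not run arbitrary code.
parsed = df["trajectory"].apply(ast.literal_eval)

# Take the first and last point of each parsed list.
# An empty list would raise IndexError here (assumed not to occur).
df["start"] = parsed.map(lambda points: points[0])
df["end"] = parsed.map(lambda points: points[-1])

print(df[["start", "end"]])
```

Because each row is parsed on its own, rows with different numbers of points are handled independently. With the `str.split(..., expand=True)` approach, shorter rows are padded with missing values, so the last column may not hold the last point of every row, and the extracted values remain strings rather than tuples.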
