
I am pretty new to data science. I am trying to work with DataFrames stored inside a list. I have read almost every post about "string indices must be integers", but none of them helped.

My DataFrame looks like this: [image]

And my list looks like this:

myList -> [0098b710-3259-4794-9075-3c83fc1ba058 1.561642e+09    32.775882   39.897459],
          [0098b710-3259-4794-9075-3c83fc1ba057 1.561642e+09    32.775882   39.897459],
and so on...

This is the data, in case you need to reproduce anything.

I need to access the list items (DataFrames) one by one, and then split each DataFrame wherever the difference between two consecutive timestamps is greater than 60000.

I wrote the code below, but it raises an error whenever I try to access timestamp. Can you help me with this problem?

My code:

a = []
for i in range(0,len(data_one_user)):
   x = data_one_user[i]
   x['label'] = (x['timestamp'] - x['timestamp'].shift(1))
   x['trip'] = np.where(x['label'] > 60000, True, False)
   x = x.drop('label', axis=1)
   x['trip'] = np.where(x['trip'] == True, a.append(x) , a.extend(x))
   #a = a.drop('trip', axis=1)
   x = a

Edit: In case you are wondering about the object types:

data_one_user -> list
data_one_user[0] = x -> pandas.core.frame.DataFrame
data_one_user[0]['timestamp'] = x['timestamp'] -> pandas.core.series.Series

Edit 2: I added the error output: [image]

Edit 3: Output of x: [image]

Kaan Taha Köken
  • If you do `x.dtypes` on one of the dataframes, what is the output? – Jarad Jul 12 '19 at 08:41
  • Please do not paste images of code or errors. Since your question revolves around `pandas`, I suggest you read: [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – sophros Jul 12 '19 at 08:43
  • @Jarad The output is -> user_id: object, timestamp: float64, longitude: float64, latitude: float64, dtype: object – Kaan Taha Köken Jul 12 '19 at 08:44
  • @sophros As you can see, I added the CSV file and the code I wrote. Additionally, I added the error and a screenshot of how the data looks. – Kaan Taha Köken Jul 12 '19 at 08:46
  • The error may occur because x is a string or a list, while `x['label']` indexes it with a string inside the square brackets. Please check what type x actually is: a list, a string, or a dictionary? If x is a list or a string, the square brackets must contain an integer. – Abhishek-Saini Jul 12 '19 at 08:52
  • @Abhishek-Saini I have already put the data types, under edit1 – Kaan Taha Köken Jul 12 '19 at 08:55
  • I think you really need to print out `x`. For example, if you do `abc = 'abc123'` and then do `abc['a_string']` you can recreate the `string indices must be integers` error. So something is wrong with `x`. – Jarad Jul 12 '19 at 08:57
  • @Jarad I printed it right after `x = data_one_user[i]`, and when I tried to access it by integer index, like `x[0]`, it gave me a KeyError. – Kaan Taha Köken Jul 12 '19 at 09:01
  • @Kaan Taha Köken I think that when you do `x = data_one_user[i]`, x holds a string value, so `x['timestamp']` raises the error. – Abhishek-Saini Jul 12 '19 at 09:07
  • @Abhishek-Saini I tried `x = data_one_user[i]['timestamp']`, but it gave the same `indices` error again. When I do `x = data_one_user[i][0]`, it gives a `KeyError`. – Kaan Taha Köken Jul 12 '19 at 09:12

1 Answer

I found the cause of the error: the labels are repeated at the end of the list.
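For anyone hitting the same thing, here is a minimal sketch of how such a list could be cleaned and then split. It assumes the stray items at the end of the list are plain strings (which would explain the "string indices must be integers" message) and that each real item has a `timestamp` column, as in the question. The helper names `keep_dataframes` and `split_on_gaps` and the sample data are hypothetical, made up only for illustration.

```python
import pandas as pd

GAP_THRESHOLD = 60000  # threshold from the question; same units as the timestamp column


def keep_dataframes(items):
    """Return only the items that are pandas DataFrames (hypothetical helper)."""
    return [item for item in items if isinstance(item, pd.DataFrame)]


def split_on_gaps(df, threshold=GAP_THRESHOLD):
    """Split df into consecutive chunks wherever the timestamp jumps by more than threshold."""
    # diff() is NaN on the first row, and NaN > threshold is False, so the first row never starts a new chunk.
    gaps = df["timestamp"].diff() > threshold
    # The cumulative sum of the boolean flags gives each row a chunk number.
    chunk_id = gaps.cumsum()
    return [chunk for _, chunk in df.groupby(chunk_id)]


# Made-up data: one DataFrame followed by stray label strings, mimicking the broken list.
good = pd.DataFrame({
    "user_id": ["u1"] * 4,
    "timestamp": [0.0, 1000.0, 100000.0, 101000.0],
})
data_one_user = [good, "user_id", "timestamp"]

frames = keep_dataframes(data_one_user)
trips = [chunk for df in frames for chunk in split_on_gaps(df)]
```

Printing `type(item)` for every element of the list is a quick way to confirm whether non-DataFrame items are present before filtering them out.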

Kaan Taha Köken