I'm a pandas beginner.

I have the following data:

a = [{"content": '1', "time": 1577870427}, 
     {"content": '4', "time": 1577870427},
     {"content": '2', "time": 1577956827},
     {"content": '3', "time": 1580548827}, 
     {"content": '4', "time": 1580635227},
     {"content": '5', "time": 1583054427}, 
     {"content": '6', "time": 1583140827}]

And I want:

2020-01: [
     {"content": '1', "time": '2020-01-01'},
     {"content": '4', "time": '2020-01-01'},
     {"content": '2', "time": '2020-01-02'},
]

2020-02: [
     {"content": '3', "time": '2020-02-01'},
     {"content": '4', "time": '2020-02-02'},
]

2020-03: [
     {"content": '5', "time": '2020-03-01'},
     {"content": '6', "time": '2020-03-02'}
]
Alan Kavanagh
xin.chen

2 Answers

You can convert the time column to datetimes with to_datetime and its unit parameter; for a custom string format, use Series.dt.strftime:

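import pandas as pd

df = pd.DataFrame(a)
# Interpret the integers as seconds since the Unix epoch
d = pd.to_datetime(df['time'], unit='s')
df['time'] = d.dt.strftime('%Y-%m-%d')
# Year-month string used as the grouping key
g = d.dt.strftime('%Y-%m')

# Spell out 'records'; recent pandas versions reject the abbreviation 'r'
d1 = {k: v.to_dict('records') for k, v in df.groupby(g)}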
jezrael

First, convert your list of dictionaries into a pandas DataFrame. This is rather easy:

import pandas as pd
data = pd.DataFrame(a)

Next, convert your time column into datetime objects rather than integers. The best way I know to do this is the to_datetime function in pandas. Please see the documentation for further details.

data['time'] = pd.to_datetime(data['time'], unit='s')  # treat the integers as seconds since the Unix epoch
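
As a quick check of the conversion, here is a minimal sketch using three of the timestamps from the question. It assumes the integers are seconds since the Unix epoch (hence unit="s"); the resulting datetimes are timezone-naive and correspond to UTC. The names raw and converted are only illustrative.

```python
import pandas as pd

# Epoch-second timestamps taken from the question's data
raw = pd.Series([1577870427, 1577956827, 1580548827])

# unit="s" tells pandas the integers count seconds since 1970-01-01 (UTC)
converted = pd.to_datetime(raw, unit="s")

print(converted.dt.strftime("%Y-%m-%d").tolist())
# ['2020-01-01', '2020-01-02', '2020-02-01']
```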

For the actual grouping, pandas provides the very powerful groupby function, which is implemented for all DataFrame objects. Again, the docs will provide detailed information.

data.groupby(['time'])

Please note that grouping on the time column itself creates one group per distinct timestamp. If that is not exactly what you want, you can easily change it, because groupby accepts mappings, functions, labels, or lists of labels as its argument. This should allow you to get exactly what you want if you play with it a little.
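
To illustrate the point about passing a function to groupby, here is a minimal sketch that groups the question's data by month. It assumes the datetimes are moved into the index first, because a function passed to groupby is called on each index value. The names by_month and contents are only illustrative, and this is one possible approach rather than the only one.

```python
import pandas as pd

a = [{"content": "1", "time": 1577870427},
     {"content": "4", "time": 1577870427},
     {"content": "2", "time": 1577956827},
     {"content": "3", "time": 1580548827},
     {"content": "4", "time": 1580635227},
     {"content": "5", "time": 1583054427},
     {"content": "6", "time": 1583140827}]

data = pd.DataFrame(a)
data["time"] = pd.to_datetime(data["time"], unit="s")

# With the datetimes as the index, groupby can take a function that is
# called on each index value and returns that row's group label.
by_month = data.set_index("time").groupby(lambda ts: ts.strftime("%Y-%m"))

# Collect the content values that fall into each month
contents = {month: group["content"].tolist() for month, group in by_month}
print(contents)
# {'2020-01': ['1', '4', '2'], '2020-02': ['3', '4'], '2020-03': ['5', '6']}
```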

Chris
  • The data conversion did not work in this case; I used `data.time = data.time.apply(lambda d: dt.datetime.fromtimestamp(d).strftime('%Y-%m-%d'))` instead. – ipj Feb 20 '20 at 10:06
  • Glad you found a way. I don't deal with timestamps that often. I am sure there is a way to use `to_datetime`, but I am not the expert here. – Chris Feb 20 '20 at 10:14
  • I've checked once again and `pd.to_datetime` is OK, but there was a typo: it should be `unit = 's'`, not `units = 's'`. So the lambda function is not needed. – ipj Feb 20 '20 at 10:59
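
For anyone comparing the two approaches from the comments, here is a small sketch that puts them side by side. One assumption differs from the comment's code: tz=dt.timezone.utc is passed to fromtimestamp, because without it that function uses the machine's local time zone, whereas pd.to_datetime with unit="s" yields UTC-based values. The names via_apply and via_pandas are only illustrative.

```python
import datetime as dt
import pandas as pd

# Three of the epoch-second timestamps from the question
times = pd.Series([1577870427, 1580548827, 1583054427])

# Element-wise conversion as in the comment, pinned to UTC so the result
# does not depend on the local time zone (the tz argument is an addition)
via_apply = times.apply(
    lambda d: dt.datetime.fromtimestamp(int(d), tz=dt.timezone.utc).strftime("%Y-%m-%d")
)

# Vectorized conversion with pandas
via_pandas = pd.to_datetime(times, unit="s").dt.strftime("%Y-%m-%d")

print(via_apply.tolist() == via_pandas.tolist())
# True
```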