
Data:

a = [{"content": 1, "time": 1577870427}, {"content": 4, "time": 1577870427},
     {"content": 2, "time": 1577956827},
     {"content": 3, "time": 1580548827}, {"content": 5, "time": 1580635227},
     {"content": 6, "time": 1583054427}, {"content": 7, "time": 1583140827}]

I want to keep only the items whose content is greater than 5.

Expected result:

[{"content": 6, "time": 1583054427}, {"content": 7, "time": 1583140827}]

My code:

import pandas as pd

index = pd.to_datetime([i['time'] for i in a], unit='s')
df = pd.Series(a, index)
df.gt(5)  # raises TypeError: each element is a dict, not a number

But it raises an error.

xin.chen

1 Answer


The problem is that your Series holds dictionaries, which pandas does not handle well: working with them is only possible in loops (apply, a list comprehension, or a for loop).

index = pd.to_datetime([i['time'] for i in a], unit='s')
df = pd.Series(a,index)
print (df.head().apply(type))
2020-01-01 09:20:27    <class 'dict'>
2020-01-01 09:20:27    <class 'dict'>
2020-01-02 09:20:27    <class 'dict'>
2020-02-01 09:20:27    <class 'dict'>
2020-02-02 09:20:27    <class 'dict'>
dtype: object
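
Since the elements are plain dicts, the list comprehension mentioned above can also do the filtering without pandas. A minimal sketch, assuming the list `a` from the question (the name `filtered` is only illustrative):

```python
# Sample records from the question.
a = [{"content": 1, "time": 1577870427}, {"content": 4, "time": 1577870427},
     {"content": 2, "time": 1577956827},
     {"content": 3, "time": 1580548827}, {"content": 5, "time": 1580635227},
     {"content": 6, "time": 1583054427}, {"content": 7, "time": 1583140827}]

# Keep only the records whose "content" value is greater than 5.
filtered = [d for d in a if d["content"] > 5]
print(filtered)
```

This keeps the original dicts untouched, so the result is already in the list-of-dicts shape the question asks for.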

If you want to filter, extract content into a Series of scalars, which can then be compared:

print (df[df.str.get('content').gt(5)])
2020-03-01 09:20:27    {'content': 6, 'time': 1583054427}
2020-03-02 09:20:27    {'content': 7, 'time': 1583140827}
dtype: object

How it works:

print (df.str.get('content'))
2020-01-01 09:20:27    1
2020-01-01 09:20:27    4
2020-01-02 09:20:27    2
2020-02-01 09:20:27    3
2020-02-02 09:20:27    5
2020-03-01 09:20:27    6
2020-03-02 09:20:27    7
dtype: int64

print (df.str.get('content').gt(5))
2020-01-01 09:20:27    False
2020-01-01 09:20:27    False
2020-01-02 09:20:27    False
2020-02-01 09:20:27    False
2020-02-02 09:20:27    False
2020-03-01 09:20:27     True
2020-03-02 09:20:27     True
dtype: bool
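
If the result has to be a plain list of dicts again, the filtered Series can probably be converted back with `tolist()`. A small sketch, assuming the same `a` as in the question (the names `mask` and `final` are only illustrative):

```python
import pandas as pd

# Sample records from the question.
a = [{"content": 1, "time": 1577870427}, {"content": 4, "time": 1577870427},
     {"content": 2, "time": 1577956827},
     {"content": 3, "time": 1580548827}, {"content": 5, "time": 1580635227},
     {"content": 6, "time": 1583054427}, {"content": 7, "time": 1583140827}]

index = pd.to_datetime([i['time'] for i in a], unit='s')
s = pd.Series(a, index)

# Boolean mask built from the scalar "content" values inside each dict.
mask = s.str.get('content').gt(5)

# Select the matching dicts and turn the Series back into a list.
final = s[mask].tolist()
print(final)
```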

If you want to process the data inside the dicts, you need apply with a custom function:

def f(x):
    x['time'] = pd.to_datetime(x['time'], unit='s')
    return x

df = df.apply(f)
print (df)
2020-01-01 09:20:27    {'content': 1, 'time': 2020-01-01 09:20:27}
2020-01-01 09:20:27    {'content': 4, 'time': 2020-01-01 09:20:27}
2020-01-02 09:20:27    {'content': 2, 'time': 2020-01-02 09:20:27}
2020-02-01 09:20:27    {'content': 3, 'time': 2020-02-01 09:20:27}
2020-02-02 09:20:27    {'content': 5, 'time': 2020-02-02 09:20:27}
2020-03-01 09:20:27    {'content': 6, 'time': 2020-03-01 09:20:27}
2020-03-02 09:20:27    {'content': 7, 'time': 2020-03-02 09:20:27}
dtype: object

So it is better to create a DataFrame:

df = pd.DataFrame(a)
print (df)
   content        time
0        1  1577870427
1        4  1577870427
2        2  1577956827
3        3  1580548827
4        5  1580635227
5        6  1583054427
6        7  1583140827

Then processing is easy, e.g. comparisons, because the columns hold scalars:

print (df['content'].gt(5))
0    False
1    False
2    False
3    False
4    False
5     True
6     True
Name: content, dtype: bool

df['time'] = pd.to_datetime(df['time'], unit='s')
print (df)
   content                time
0        1 2020-01-01 09:20:27
1        4 2020-01-01 09:20:27
2        2 2020-01-02 09:20:27
3        3 2020-02-01 09:20:27
4        5 2020-02-02 09:20:27
5        6 2020-03-01 09:20:27
6        7 2020-03-02 09:20:27
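
To go from the DataFrame back to the list of dicts the question expects, one option is to filter first and then call `to_dict` with the `'records'` orientation. A minimal sketch, assuming the original `a` and leaving time as integer seconds (the name `result` is only illustrative):

```python
import pandas as pd

# Sample records from the question.
a = [{"content": 1, "time": 1577870427}, {"content": 4, "time": 1577870427},
     {"content": 2, "time": 1577956827},
     {"content": 3, "time": 1580548827}, {"content": 5, "time": 1580635227},
     {"content": 6, "time": 1583054427}, {"content": 7, "time": 1583140827}]

df = pd.DataFrame(a)

# Filter on the scalar column, then convert each remaining row to a dict.
result = df[df['content'].gt(5)].to_dict('records')
print(result)
```

Filtering before converting time to datetimes keeps the original integer timestamps in the output.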
jezrael
  • But the final data must be `[{"content": 6, "time": 1583054427}, {"content": 7, "time": 1583140827}]` – xin.chen Feb 21 '20 at 09:02
  • @xin.chen - I know, but in pandas processing a Series like `df = pd.Series(a,index)` is really complicated, because it holds dicts. – jezrael Feb 21 '20 at 09:03
  • Actually, my data is [{},{}]; can you use pandas.Series? – xin.chen Feb 21 '20 at 09:04
  • @xin.chen - Yes, it is possible, but data with dicts, lists or similar is not easy to process, and it is also slow, because pandas works fastest with arrays of scalars. – jezrael Feb 21 '20 at 09:07
  • @xin.chen - The answer is about lists, but the same applies to a list of dicts, check [this](https://stackoverflow.com/a/52563718/2901002) – jezrael Feb 21 '20 at 09:11
  • @xin.chen - Added to the answer how to convert the datetimes in the dicts. – jezrael Feb 21 '20 at 09:17