-1

I have a list of dictionaries of market data below, how do I convert it into a Pandas DataFrame using its keys as columns/

[
    '{"level":0,"side":"Ask","price":"13745.75000","volume":"2"}',
    '{"level":1,"side":"Ask","price":"13745.50000","volume":"5"}',
    '{"level":2,"side":"Ask","price":"13745.25000","volume":"6"}',
    '{"level":3,"side":"Ask","price":"13745.00000","volume":"15"}',
    '{"level":4,"side":"Ask","price":"13744.75000","volume":"4"}',
    '{"level":5,"side":"Ask","price":"13744.50000","volume":"5"}',
    '{"level":6,"side":"Ask","price":"13744.25000","volume":"3"}',
    '{"level":7,"side":"Ask","price":"13744.00000","volume":"3"}',
    '{"level":8,"side":"Ask","price":"13743.75000","volume":"2"}',
    '{"level":9,"side":"Ask","price":"13743.50000","volume":"1"}',
    '{"level":10,"side":"Bid","price":"13743.00000","volume":"2"}',
    '{"level":11,"side":"Bid","price":"13742.75000","volume":"4"}',
    '{"level":12,"side":"Bid","price":"13742.50000","volume":"3"}',
    '{"level":13,"side":"Bid","price":"13742.25000","volume":"4"}',
    '{"level":14,"side":"Bid","price":"13742.00000","volume":"4"}',
    '{"level":15,"side":"Bid","price":"13741.75000","volume":"4"}',
    '{"level":16,"side":"Bid","price":"13741.50000","volume":"4"}',
    '{"level":17,"side":"Bid","price":"13741.25000","volume":"4"}',
    '{"level":18,"side":"Bid","price":"13741.00000","volume":"4"}',
    '{"level":19,"side":"Bid","price":"13740.75000","volume":"2"}',
]

user5305519
  • 3,008
  • 4
  • 26
  • 44

2 Answers2

1

You have a list of string representation of dicts. It is easier to convert to pandas dataframe if you can convert it to list of dicts.

For that you can use ast.literal_eval which is safer than plain eval like this

from ast import literal_eval

data = ['{"level":0,"side":"Ask","price":"13745.75000","volume":"2"}', '{"level":1,"side":"Ask","price":"13745.50000","volume":"5"}', '{"level":2,"side":"Ask","price":"13745.25000","volume":"6"}', '{"level":3,"side":"Ask","price":"13745.00000","volume":"15"}', '{"level":4,"side":"Ask","price":"13744.75000","volume":"4"}', '{"level":5,"side":"Ask","price":"13744.50000","volume":"5"}', '{"level":6,"side":"Ask","price":"13744.25000","volume":"3"}', '{"level":7,"side":"Ask","price":"13744.00000","volume":"3"}', '{"level":8,"side":"Ask","price":"13743.75000","volume":"2"}', '{"level":9,"side":"Ask","price":"13743.50000","volume":"1"}', '{"level":10,"side":"Bid","price":"13743.00000","volume":"2"}', '{"level":11,"side":"Bid","price":"13742.75000","volume":"4"}', '{"level":12,"side":"Bid","price":"13742.50000","volume":"3"}', '{"level":13,"side":"Bid","price":"13742.25000","volume":"4"}', '{"level":14,"side":"Bid","price":"13742.00000","volume":"4"}', '{"level":15,"side":"Bid","price":"13741.75000","volume":"4"}', '{"level":16,"side":"Bid","price":"13741.50000","volume":"4"}', '{"level":17,"side":"Bid","price":"13741.25000","volume":"4"}', '{"level":18,"side":"Bid","price":"13741.00000","volume":"4"}', '{"level":19,"side":"Bid","price":"13740.75000","volume":"2"}']

data_new = [literal_eval(i) for i in data]

Output will look like this,

[{'level': 0, 'side': 'Ask', 'price': '13745.75000', 'volume': '2'},
 {'level': 1, 'side': 'Ask', 'price': '13745.50000', 'volume': '5'},
 {'level': 2, 'side': 'Ask', 'price': '13745.25000', 'volume': '6'},
 {'level': 3, 'side': 'Ask', 'price': '13745.00000', 'volume': '15'},
 {'level': 4, 'side': 'Ask', 'price': '13744.75000', 'volume': '4'},
 {'level': 5, 'side': 'Ask', 'price': '13744.50000', 'volume': '5'},
 {'level': 6, 'side': 'Ask', 'price': '13744.25000', 'volume': '3'},
 {'level': 7, 'side': 'Ask', 'price': '13744.00000', 'volume': '3'},
 {'level': 8, 'side': 'Ask', 'price': '13743.75000', 'volume': '2'},
 {'level': 9, 'side': 'Ask', 'price': '13743.50000', 'volume': '1'},
 {'level': 10, 'side': 'Bid', 'price': '13743.00000', 'volume': '2'},
 {'level': 11, 'side': 'Bid', 'price': '13742.75000', 'volume': '4'},
 {'level': 12, 'side': 'Bid', 'price': '13742.50000', 'volume': '3'},
 {'level': 13, 'side': 'Bid', 'price': '13742.25000', 'volume': '4'},
 {'level': 14, 'side': 'Bid', 'price': '13742.00000', 'volume': '4'},
 {'level': 15, 'side': 'Bid', 'price': '13741.75000', 'volume': '4'},
 {'level': 16, 'side': 'Bid', 'price': '13741.50000', 'volume': '4'},
 {'level': 17, 'side': 'Bid', 'price': '13741.25000', 'volume': '4'},
 {'level': 18, 'side': 'Bid', 'price': '13741.00000', 'volume': '4'},
 {'level': 19, 'side': 'Bid', 'price': '13740.75000', 'volume': '2'}]

As you can see it is a list of dicts. Pandas can work on it easily. For converting to pandas dataframe do this,

df = pd.DataFrame(data_new)

    level   side    price   volume
0   0   Ask 13745.75000 2
1   1   Ask 13745.50000 5
2   2   Ask 13745.25000 6
3   3   Ask 13745.00000 15
4   4   Ask 13744.75000 4
5   5   Ask 13744.50000 5
6   6   Ask 13744.25000 3
7   7   Ask 13744.00000 3
8   8   Ask 13743.75000 2
9   9   Ask 13743.50000 1
10  10  Bid 13743.00000 2
11  11  Bid 13742.75000 4
12  12  Bid 13742.50000 3
13  13  Bid 13742.25000 4
14  14  Bid 13742.00000 4
15  15  Bid 13741.75000 4
16  16  Bid 13741.50000 4
17  17  Bid 13741.25000 4
18  18  Bid 13741.00000 4
19  19  Bid 13740.75000 2

You can set level as index too if needed by doing,

df = pd.DataFrame(df_new).set_index("level")
Sreeram TP
  • 11,346
  • 7
  • 54
  • 108
0
data=['{"level":0,"side":"Ask","price":"13745.75000","volume":"2"}', '{"level":1,"side":"Ask","price":"13745.50000","volume":"5"}', '{"level":2,"side":"Ask","price":"13745.25000","volume":"6"}', '{"level":3,"side":"Ask","price":"13745.00000","volume":"15"}', '{"level":4,"side":"Ask","price":"13744.75000","volume":"4"}', '{"level":5,"side":"Ask","price":"13744.50000","volume":"5"}', '{"level":6,"side":"Ask","price":"13744.25000","volume":"3"}', '{"level":7,"side":"Ask","price":"13744.00000","volume":"3"}', '{"level":8,"side":"Ask","price":"13743.75000","volume":"2"}', '{"level":9,"side":"Ask","price":"13743.50000","volume":"1"}', '{"level":10,"side":"Bid","price":"13743.00000","volume":"2"}', '{"level":11,"side":"Bid","price":"13742.75000","volume":"4"}', '{"level":12,"side":"Bid","price":"13742.50000","volume":"3"}', '{"level":13,"side":"Bid","price":"13742.25000","volume":"4"}', '{"level":14,"side":"Bid","price":"13742.00000","volume":"4"}', '{"level":15,"side":"Bid","price":"13741.75000","volume":"4"}', '{"level":16,"side":"Bid","price":"13741.50000","volume":"4"}', '{"level":17,"side":"Bid","price":"13741.25000","volume":"4"}', '{"level":18,"side":"Bid","price":"13741.00000","volume":"4"}', '{"level":19,"side":"Bid","price":"13740.75000","volume":"2"}']

import pandas as pd

You can use:

df=pd.Series(data)

Finally make use of apply() and eval():

df=df.apply(lambda x:eval(x))

data=df.tolist()

Finally:

df=pd.DataFrame(data).set_index('level')

Now if you print df you will get your desired output

Anurag Dabas
  • 23,866
  • 9
  • 21
  • 41
  • 2
    https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice – jezrael Apr 21 '21 at 05:06
  • thank you @jezrael I read this many times but I thinks It's in the case of oop...i.e classes...But sometimes eval easily manupulates the data..In case of classes setattr is good practice – Anurag Dabas Apr 21 '21 at 05:12
  • 1
    I've used ast.literal_eval in my answer. I've found it to be safer. – Sreeram TP Apr 21 '21 at 05:15
  • 1
    all answered did Work but this one have no imports! the data coming from realtime socket pandas series did faster process thank you – SismacJamstack Apr 21 '21 at 06:18