1

python Original dataframe ( 2 column ):

matchNum accumulatedscore
78       [{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '0', 'away': '0'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '1', 'away': '0'}]
56       [{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '2', 'away': '1'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '4', 'away': '3'}]

How can I change them into original dataframe I hope...

matchNum home1 away1 home2 away2

78                 0     0     1     0
56                 2     1     4     3

It is so difficult.....

Original dataframe: enter image description here

I hope this : enter image description here

3 Answers3

1

Assuming your pandas DataFrame be like:

d = {'matchNum': [78, 56], 
     'accumulatedscore':["[{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '0', 'away': '0'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '1', 'away': '0'}]",
                        "[{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '2', 'away': '1'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '4', 'away': '3'}]"
                        ]}

import pandas as pd
df = pd.DataFrame(d)

You can simply convert the string which has the form of a python dictionary (refer here) :

import ast
df['home1']= df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[0]['home'] , axis=1)
df['away1']= df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[0]['away'], axis=1)
df['home2']= df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[1]['home'], axis=1)
df['away2']= df.apply(lambda x: ast.literal_eval(x['accumulatedscore'])[1]['away'], axis=1)
df = df.drop(columns = 'accumulatedscore')

Your df would be like:

    matchNum    home1   away1   home2   away2
0   78  0   0   1   0
1   56  2   1   4   3
Fatemeh Sangin
  • 558
  • 1
  • 4
  • 19
  • error: malformed node or string: [{'periodvalue': 'FirstHalf', 'periodstatus': 'ResultFinal', 'home': '2', 'away': '1'}, {'periodvalue': 'SecondHalf', 'periodstatus': 'ResultFinal', 'home': '5', 'away': '1'}] – Chai Christo Dec 07 '21 at 17:23
  • I assumed your dataFrame is as shown in the first box. @ChaiChristo – Fatemeh Sangin Dec 07 '21 at 22:48
1

You can extract only the relevant key-value pairs from each dictionary in df['accumulatedscore'], explode the Series, convert it to a DataFrame, and combine duplicate indices:

df1 = (df.merge(df['accumulatedscore']
                .apply(lambda lst:tuple({'home'+str(i): d['home'], 'away'+str(i): d['away']} 
                                        for i, d in enumerate(lst, 1)))
                .explode()
                .apply(pd.Series)
                .groupby(level=0).first(), 
                left_index=True, right_index=True)
       .drop('accumulatedscore', axis=1))

Output:

   matchNum home1 away1 home2 away2
0        78     0     0     1     0
1        56     2     1     4     3
1

The simplest way- without lambdas, just only transformations :). As accumulatedscore actually contains json values.

Preparation

import pandas as pd
import json

d = {
    "matchNum": [78, 56],
    "accumulatedscore":
        [
            '[{"periodvalue": "FirstHalf", "periodstatus": "ResultFinal", "home": "0", "away": "0"}, {"periodvalue": "SecondHalf", "periodstatus": "ResultFinal", "home": "1", "away": "0"}]',
            '[{"periodvalue": "FirstHalf", "periodstatus": "ResultFinal", "home": "2", "away": "1"}, {"periodvalue": "SecondHalf", "periodstatus": "ResultFinal", "home": "4", "away": "3"}]'
        ]
}
df = pd.DataFrame(d)

Solution

dfa = (
    df.join(pd.json_normalize(df["accumulatedscore"].apply(json.loads)))
        .rename(columns={0: "dict1", 1: "dict2"})
        .drop("accumulatedscore", axis=1)
)
dfb = (
    dfa.join(pd.json_normalize(dfa["dict1"]))
       .join(pd.json_normalize(dfa["dict2"]), rsuffix="2")
       .rename(columns={"home": "home1", "away": "away1"})[["matchNum", "home1", "away1", "home2", "away2"]]
)

Result

dfb
   matchNum home1 away1 home2 away2
0        78     0     0     1     0
1        56     2     1     4     3
Ihor Konovalenko
  • 1,298
  • 2
  • 16
  • 21