I have a CSV file in which one column holds a dictionary. This column contains three attributes, each of which I want as a separate column in the resulting DataFrame.
Following the answer to "How to split a single column into three columns in pandas (python)?", I am trying to use the following line of code to achieve the desired result:
df[['one', 'two', 'three']] = pd.DataFrame([ x.split(',') for x in df['statistics'].tolist() ])
But when I execute the above line of code I get the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_15084\3622721730.py in <module>
----> 1 df[['one', 'two', 'three']] = pd.DataFrame([ x.split(',') for x in df['statistics'].tolist() ])
~\AppData\Local\Temp\ipykernel_15084\3622721730.py in <listcomp>(.0)
----> 1 df[['one', 'two', 'three']] = pd.DataFrame([ x.split(',') for x in df['statistics'].tolist() ])
AttributeError: 'dict' object has no attribute 'split'
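The message suggests that the cells of the statistics column are already Python dict objects rather than strings, so there is no split method to call on them. A minimal sketch of how this could be checked, assuming a hypothetical two-row stand-in for the real data (only the column name statistics is taken from the code above):

```python
import pandas as pd

# Hypothetical stand-in for the real frame; only the column name
# 'statistics' comes from the question.
df = pd.DataFrame({
    "statistics": [
        {"viewCount": "133155", "likeCount": "9199"},
        {"viewCount": "103436", "likeCount": "3930"},
    ]
})

# Collect the Python types actually stored in the column.
cell_types = {type(cell) for cell in df["statistics"]}
print(cell_types)

# split is a string method, so calling it on a dict raises AttributeError.
try:
    df["statistics"].iloc[0].split(",")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)
```

If the set printed first contains dict, the string-splitting approach from the linked answer does not apply directly to this column.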
I attach the df for ready reference:
kind etag id statistics
youtube#video WerWpr9_ht7SPd646jeYvOrMdFU 1isoZVQ9DxY {'viewCount': '133155', 'likeCount': '9199', 'favoriteCount': '0', 'commentCount': '1271'}
youtube#video IkNRr3T_bPnPpfsRPJ9ZmpFLWQI 2izTju-uxrk {'viewCount': '103436', 'likeCount': '3930', 'favoriteCount': '0', 'commentCount': '712'}
youtube#video ea_8Q2h6XDamfLZNhIL0HM3UZw4 oUOUI4_mS5c {'viewCount': '61008', 'likeCount': '3119', 'favoriteCount': '0', 'commentCount': '210'}
youtube#video LjxX4UdBSR88LO41UtUf6cSBsV4 ONrmi30DkJc {'viewCount': '58111', 'likeCount': '2885', 'favoriteCount': '0', 'commentCount': '141'}
youtube#video D98h38VbjEri485pD7dYrOyfoGM RA7t76Ie1TE {'viewCount': '77895', 'likeCount': '3394', 'favoriteCount': '0', 'commentCount': '216'}
youtube#video 4sa3me5UXvRmHb_4rNUKG0XhuVs boomn3StWJ0 {'viewCount': '57257', 'likeCount': '3187', 'favoriteCount': '0', 'commentCount': '159'}
youtube#video e37d1Q_PIJj0ckLAE1Sv-ukVHDw AV3vptOJVaE {'viewCount': '67967', 'likeCount': '3371', 'favoriteCount': '0', 'commentCount': '207'}
youtube#video Ly4sowP9gxeM-3iNgLUUWydTiaU vq6PEiPXGVk {'viewCount': '213144', 'likeCount': '8917', 'favoriteCount': '0', 'commentCount': '550'}
youtube#video ubupKrV7LSJJCmyw4PBPY91BmPo toDp4JS5cwI {'viewCount': '316336', 'likeCount': '9160', 'favoriteCount': '0', 'commentCount': '747'}
youtube#video g6W6BiuT7Af1alJmvmNtgXzZVLw qFOcxBGmOjQ {'viewCount': '468641', 'likeCount': '16106', 'favoriteCount': '0', 'commentCount': '1021'}
youtube#video jhRggyXoTq_PAghKVfqVaZptT8I 6SOKGnf84Ik {'viewCount': '210653', 'likeCount': '10222', 'favoriteCount': '0', 'commentCount': '591'}
youtube#video 2kXYv_ycWt_AhVLV7ZfQ7KR6zFo q-wZ1819y7c {'viewCount': '214089', 'likeCount': '11232', 'favoriteCount': '0', 'commentCount': '571'}
youtube#video p7RePnFd9fXm6PU_UEBCSDs-iyQ 8I4S5Ery92s {'viewCount': '352246', 'likeCount': '15854', 'favoriteCount': '0', 'commentCount': '655'}
youtube#video mJ3OiBk5QpRTlJs-TH_rzEDHLJE aeSqTAwm5NI {'viewCount': '347399', 'likeCount': '13567', 'favoriteCount': '0', 'commentCount': '713'}
youtube#video iQWVTcoYkgmNjJTy93eo6fqdbrM yPwIprzFfF0 {'viewCount': '361987', 'likeCount': '15262', 'favoriteCount': '0', 'commentCount': '559'}
youtube#video XArq68sxje-985r9BAvs05Jj-HA Gg0wYPxbmjA {'viewCount': '1466364', 'likeCount': '52941', 'favoriteCount': '0', 'commentCount': '4278'}
youtube#video F0_58PVsa6pPEmphN1sEYZBe0sU ZcjXo8KtWRY {'viewCount': '230492', 'likeCount': '7322', 'favoriteCount': '0', 'commentCount': '622'}
youtube#video emkAGoMq-kgWTEwJeNOh3EshkiU ur7hLYv404I {'viewCount': '279350', 'likeCount': '9968', 'favoriteCount': '0', 'commentCount': '1187'}
youtube#video fXqmKxY3vFPYnutf0MqQKoyZQV4 wpgA-rRBqs8 {'viewCount': '215555', 'likeCount': '7564', 'favoriteCount': '0', 'commentCount': '451'}
youtube#video 2ml-vwsPQ_5jdgA2UdxoTc4ZXnk sG5rnRb-FI8 {'viewCount': '283075', 'likeCount': '9599', 'favoriteCount': '0', 'commentCount': '747'}
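One possible way to turn such a dictionary column into separate columns is to build a new DataFrame from the list of dictionaries and join it back onto the other columns. The following is only a sketch, using the first two rows of the data above; the helper to_dict and the names stats and result are hypothetical, and the ast.literal_eval branch is an assumption covering the case where the cells are strings (for example after reading the file with read_csv) rather than dicts:

```python
import ast
import pandas as pd

# First two rows copied from the sample data above.
df = pd.DataFrame({
    "kind": ["youtube#video", "youtube#video"],
    "etag": ["WerWpr9_ht7SPd646jeYvOrMdFU", "IkNRr3T_bPnPpfsRPJ9ZmpFLWQI"],
    "id": ["1isoZVQ9DxY", "2izTju-uxrk"],
    "statistics": [
        {"viewCount": "133155", "likeCount": "9199",
         "favoriteCount": "0", "commentCount": "1271"},
        {"viewCount": "103436", "likeCount": "3930",
         "favoriteCount": "0", "commentCount": "712"},
    ],
})

def to_dict(cell):
    """Hypothetical helper: parse string cells, pass dict cells through."""
    return ast.literal_eval(cell) if isinstance(cell, str) else cell

# One column per dictionary key, aligned on the original row index.
stats = pd.DataFrame([to_dict(cell) for cell in df["statistics"]], index=df.index)

# Swap the dictionary column for the expanded columns.
result = pd.concat([df.drop(columns="statistics"), stats], axis=1)
print(result)
```

The expanded values stay as strings, as in the source dictionaries; they could be converted with pd.to_numeric if numbers are needed. The sample dictionaries carry four keys, so selecting only some of them would be a matter of indexing stats with the wanted column names before the concat.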
I require the resultant df to be as follows: