-1

i have a data frame like below

enter image description here

i tried using explode but didn't worked. and i would like to convert to the below format enter image description here

i am using the below code

import json
import pandas as pd
from pandas.io.json import json_normalize
import requests

url ="https://url/Token"
api_url = "url/Responses?pageSize=1"
r = requests.post('url/Token', data={'userName': 'userName','password': 'password','grant_type' :'password'})
d = r.json()
t = d['access_token']
headers =  {"Content-Type":"application/json", "Authorization": f"Bearer {t}"}
data = requests.get(api_url,headers=headers).json()
result = pd.json_normalize(data,record_path=['Data'])
df= result.explode('ResponseFields').reset_index(drop=True)
df = df.join(pd.json_normalize(df.pop('ResponseFields')))
final = df.pivot_table(index=['TicketNumber','SubmitDateTimeUtc'],columns='FormFieldKey',values='Value', aggfunc='first')



Appreciate your help.

Thanks

sample Data

{'Data': [{'Uid': '770d152d-1f07-ea11-828a-00155df8e0e9', 'UserUid': '119a203c-45f0-4901-bd2b-d347210cf250', 'FormUid': '7204be0c-1307-ea11-828a-00155df8e0e9', 'Title': 'reimbursement test - 12', 'TicketNumber': 'AD8F', 'SyncDateTimeUtc': '2019-11-14T20:41:56.043Z', 'SubmitDateTimeUtc': '2019-11-14T20:41:56.043Z', 'ResponseFields': [{'FormFieldKey': 'FIRST_NAME', 'Value': 'Chan', 'Files': []}, {'FormFieldKey': 'LAST_NAME', 'Value': 'Marlyn', 'Files': []}, {'FormFieldKey': 'EMP_', 'Value': '12', 'Files': []}, {'FormFieldKey': 'EMAIL', 'Value': 'cmarlyn945@gmail.com ', 'Files': []}, {'FormFieldKey': 'FACILITY', 'Value': 'ABC Remote south', 'Files': []}, {'FormFieldKey': 'TYPE', 'Value': 'Short call out 3', 'Files': []}, {'FormFieldKey': 'AMOUNT', 'Value': '24.00', 'Files': []}, {'FormFieldKey': 'RECEIPT_DAY_DATE', 'Value': '11/14/2019', 'Files': []}, {'FormFieldKey': 'BEG_DAY__DATE', 'Value': '11/12/2019', 'Files': []}, {'FormFieldKey': 'END_DAY_DATE', 'Value': '11/14/2019', 'Files': []}, {'FormFieldKey': 'SUBMIT_DAY_DATE', 'Value': '11/14/2019', 'Files': []}, {'FormFieldKey': 'RECEIPT_IMAGE_FILE', 'Value': '{"displayName":"1573764018026424446650638108755.jpg","systemName":"3c7e4b58-273b-4d27-92c8-f79f9be55940.jpg","url":""}', 'Files': [{'UploadedFileName': '1573764018026424446650638108755.jpg', 'Url': 'https://test.net/tpf/aa/Forms/Response462250/3c7e4b58-273b-4d27-92c8-f79f9be55940.jpg?md5=5a8e6bdd6612ff29d26016a43348e477&tk365=exp=1668939087~acl=%2Ftpf%2Faa%2FForms%2FResponse462250%2F3c7e4b58-273b-4d27-92c8-f79f9be55940.jpg%2A~hmac=384b663258dc19e1e5ef2b4bd71c9bd6ae5b05117e4e0f654078863cc09fd985', 'Md5': '5a8e6bdd6612ff29d26016a43348e477'}]}, {'FormFieldKey': 'SIGNATURE_CANVAS', 'Value': '[[{"x":1286,"y":536},{"x":1286,"y":656},{"x":1381,"y":394},{"x":1386,"y":376},{"x":1386,"y":369},{"x":1388,"y":362},{"x":1388,"y":357},{"x":1388,"y":353}],[{"x":304,"y":195},{"x":299,"y":191},{"x":292,"y":181},{"x":283,"y":172},{"x":274,"y":158},{"x":260,"y":144},{"x":239,"y":128},{"x":218,"y":112},{"x":202,"y":105},{"x":186,"y":105},{"x":174,"y":117},{"x":163,"y":140},{"x":151,"y":186},{"x":149,"y":239},{"x":153,"y":292},{"x":158,"y":341},{"x":163,"y":378},{"x":170,"y":403},{"x":179,"y":424},{"x":188,"y":438},{"x":202,"y":447},{"x":218,"y":447},{"x":232,"y":443},{"x":250,"y":438},{"x":269,"y":431},{"x":290,"y":427},{"x":308,"y":422},{"x":322,"y":415},{"x":334,"y":406}],[{"x":352,"y":267},{"x":352,"y":260},{"x":352,"y":258},{"x":352,"y":258},{"x":352,"y":258},{"x":357,"y":265},{"x":357,"y":281},{"x":361,"y":304},{"x":361,"y":327},{"x":361,"y":353},{"x":361,"y":373},{"x":361,"y":390},{"x":361,"y":399},{"x":361,"y":406},{"x":361,"y":399},{"x":361,"y":378},{"x":359,"y":346},{"x":352,"y":290},{"x":345,"y":244},{"x":345,"y":214},{"x":345,"y":191},{"x":345,"y":172},{"x":350,"y":161},{"x":359,"y":151},{"x":380,"y":144},{"x":403,"y":144},{"x":429,"y":144},{"x":447,"y":149},{"x":466,"y":158},{"x":479,"y":175},{"x":489,"y":195},{"x":489,"y":216},{"x":486,"y":235},{"x":477,"y":255},{"x":466,"y":272},{"x":447,"y":286},{"x":424,"y":295},{"x":398,"y":297},{"x":373,"y":297},{"x":355,"y":288}],[{"x":572,"y":163},{"x":572,"y":163},{"x":572,"y":163},{"x":567,"y":175},{"x":567,"y":198},{"x":565,"y":242},{"x":572,"y":295},{"x":579,"y":350},{"x":583,"y":394},{"x":590,"y":427},{"x":595,"y":445},{"x":595,"y":457},{"x":590,"y":457},{"x":577,"y":438},{"x":560,"y":403},{"x":537,"y":341},{"x":523,"y":290},{"x":519,"y":265},{"x":519,"y":244},{"x":521,"y":225},{"x":535,"y":212},{"x":572,"y":198},{"x":637,"y":184},{"x":692,"y":175},{"x":750,"y":163},{"x":775,"y":156},{"x":796,"y":147},{"x":810,"y":142},{"x":810,"y":135},{"x":799,"y":135},{"x":771,"y":140}],[{"x":644,"y":471},{"x":669,"y":447},{"x":688,"y":424},{"x":720,"y":392},{"x":787,"y":339},{"x":808,"y":320},{"x":824,"y":302},{"x":817,"y":295},{"x":803,"y":295},{"x":778,"y":299},{"x":752,"y":316},{"x":734,"y":341},{"x":725,"y":366},{"x":725,"y":390},{"x":736,"y":410},{"x":757,"y":424},{"x":782,"y":429},{"x":810,"y":429},{"x":842,"y":417},{"x":868,"y":394},{"x":886,"y":369},{"x":896,"y":334},{"x":891,"y":302},{"x":879,"y":279},{"x":863,"y":260},{"x":854,"y":249},{"x":849,"y":244},{"x":849,"y":244},{"x":866,"y":249},{"x":893,"y":260},{"x":967,"y":281},{"x":1044,"y":297},{"x":1129,"y":304},{"x":1206,"y":309},{"x":1245,"y":316},{"x":1270,"y":323},{"x":1293,"y":329},{"x":1300,"y":341},{"x":1293,"y":362},{"x":1256,"y":390},{"x":1164,"y":429},{"x":1060,"y":447},{"x":923,"y":452},{"x":805,"y":454},{"x":572,"y":464}]]', 'Files': []}], 'IpAddress': '111.11.11.67', 'DeviceId': None}], 'TotalRecords': 10020, 'ErrorCode': '', 'ErrorMessage': '', 'IsSuccess': True}

Above code works perfectly fine for page size =1 . When i change to api_url = "url/Responses?pageSize=15" i am getting the below error

enter image description here

Jijju
  • 17
  • 4
  • 1
    Please add your starting and ending dataframes as text instead of images so that we can more easily copy and work with them. Check out this post for help on posting good reproducible pandas examples: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – scotscotmcc Nov 20 '22 at 18:57

1 Answers1

1

you can use pivot_table:

# if ResponseFields' type is string:
#import ast
#df['ResponseFields']=df['ResponseFields'].apply(ast.literal_eval) convert string to dict

result = pd.json_normalize(data,record_path=['Data'])
df= result.explode('ResponseFields').reset_index(drop=True)
df = df.join(pd.json_normalize(df.pop('ResponseFields')))
final = df.pivot_table(index=['TicketNumber','SubmitDateTimeUtc'],columns='FormFieldKey',values='Value', aggfunc='first')

#if you want to see all columns
df['Files']=df['Files'].astype(str)
final = df.pivot_table(index=['Uid', 'UserUid', 'FormUid', 'Title', 'TicketNumber', 'Files','SyncDateTimeUtc','SubmitDateTimeUtc', 'IpAddress'],columns='FormFieldKey',values='Value', aggfunc='first')
final=final.reset_index()
Bushmaster
  • 4,196
  • 3
  • 8
  • 28
  • Thank you. But i am getting only the First Name Not the other columns(LAST_NAME.......) Not sure if it is something with "aggfunc" – Jijju Nov 20 '22 at 19:45
  • TicketNumber SubmitDateTimeUtc FIRST_NAME AD8F 2010-11-14T20:41:56.043Z Chan – Jijju Nov 20 '22 at 19:47
  • please make sure you use it correctly. Because it works fine for me. You are still getting an error, write this:`print(df.to_dict())` and add the output to the question. – Bushmaster Nov 20 '22 at 19:52
  • {'Uid': {0: '770d152d-1f07-ea11-828a-00155df8e0e9'}, 'UserUid': {0: '119a203c-45f0-4901-bd2b-d347210cf250'}, 'FormUid': {0: '7204be0c-1307-ea11-828a-00155df8e0e9'}, 'Title': {0: 'reimbursement test - 12'}, 'TicketNumber': {0: 'AD8F'}, 'SyncDateTimeUtc': {0: '2019-11-14T20:41:56.043Z'}, 'SubmitDateTimeUtc': {0: '2019-11-14T20:41:56.043Z'}, 'IpAddress': {0: '111.11.11.67'}, 'DeviceId': {0: None}, 'FormFieldKey': {0: 'FIRST_NAME'}, 'Value': {0: 'Chan'}, 'Files': {0: []}} – Jijju Nov 20 '22 at 20:14
  • This is the transformed state. Try this: `print(df.head().to_dict('list'))` – Bushmaster Nov 20 '22 at 20:20
  • Not letting me paste the data. How do i paste it? – Jijju Nov 20 '22 at 20:29
  • to question or to my answer ? – Bushmaster Nov 20 '22 at 20:31
  • Not letting me paste the answer data set because of character limit – Jijju Nov 20 '22 at 20:37
  • 1
    Can you paste the whole code you used to the question? – Bushmaster Nov 20 '22 at 20:40
  • okey i edited my answer. Can you check it ? – Bushmaster Nov 20 '22 at 21:09
  • Copied code along with sample json data to my initial thread. – Jijju Nov 20 '22 at 21:10
  • if you want to see all columns in final df check my answer. – Bushmaster Nov 20 '22 at 21:15
  • 1 result = pd.json_normalize(data,record_path=['Data']) 2 df= result.explode('ResponseFields').reset_index(drop=True) ----> 3 df = df.join(pd.json_normalize(df.pop('ResponseFields'))) 4 final = df.pivot_table(index=['TicketNumber','SubmitDateTimeUtc'],columns='FormFieldKey',values='Value', aggfunc='first') AttributeError: 'float' object has no attribute 'items' – Jijju Nov 20 '22 at 21:22