I have been struggling with this logic. I am getting data from google analytics in this format, per user per site. (So these are all the activities that one user performed on the site) I cannot change the format in which I receive the data.
PROBLEM: I am running a loop through all of the users and get this output for each user. I want to put this data into a data frame in order to use it later. The problem I have, is the 'activities':[{.....},{......}] part, I cannot figure out to store all of this data in a way that makes sense.
{'sampleRate': 1,
'sessions': [{'activities': [{'activityTime': '2020-01-08T16:00:44.399101Z',
'activityType': 'PAGEVIEW',
'campaign': '(not set)',
'channelGrouping': 'Direct',
'customDimension': [{'index': 1}],
'hostname': 'company.domain.com',
'keyword': '(not set)',
'landingPagePath': '/login',
'medium': '(none)',
'pageview': {'pagePath': '/thepath',
'pageTitle': 'thecurrentwebpage'},
'source': '(direct)'},
{'activityTime': '2020-01-08T15:58:43.077293Z',
'activityType': 'PAGEVIEW',
'campaign': '(not set)',
'channelGrouping': 'Direct',
'customDimension': [{'index': 1}],
'hostname': 'company.domain.com',
'keyword': '(not set)',
'landingPagePath': '/login',
'medium': '(none)',
'pageview': {'pagePath': '/theotherpath',
'pageTitle': 'thecurrentwebpage'},
'source': '(direct)'}],
'dataSource': 'web',
'deviceCategory': 'desktop',
'platform': 'Windows',
'sessionDate': '2020-01-08',
'sessionId': '1578491x03d'},
{'activities': [{'activityTime': '2019-12-28T21:58:48.993944Z',
'activityType': 'PAGEVIEW',
'campaign': '(not set)',.....
EXPECTED OUTPUT:
For each user's data to be stored in tables organised as such:
If there is some logic error in the picture, I am happy to change what I have. I just need the data to work.
PS: I need to use SQL and ERD's in LucidChart, I have never before had to manipulate data in this format. Any help, to get the data -that is structured like the example above- into a dataframe(s).
EDITED:
Example of two different types of activities (the activity is always classified as either 'pageview' or 'event'):
{'activityTime':
# Pageview activity
'2020-01-08T15:48:38.012671Z',
'activityType': 'PAGEVIEW',
'campaign': '(not set)',
'channelGrouping': 'Direct',
'customDimension': [{'index': 1}],
'hostname': 'company.domain.com',
'keyword': '(not set)',
'landingPagePath': '/login',
'medium': '(none)',
'pageview': {'pagePath': '/login',
'pageTitle': 'titleofthepage'},
'source': '(direct)'},
# Event activity
{'activityTime': '2020-01-08T15:48:37.915105Z',
'activityType': 'EVENT',
'campaign': '(not set)',
'channelGrouping': 'Direct',
'customDimension': [{'index': 1}],
'event': {'eventAction': 'Successfully Logged '
'In',
'eventCategory': 'Auth',
'eventCount': '1',
'eventLabel': '(not set)'},
'hostname': 'company.domain.com',
'keyword': '(not set)',
'landingPagePath': '/login',
'medium': '(none)',
'source': '(direct)'}]