Creating an empty JSON file and updating it from a pandas DataFrame's rows in Python

Question

I have a pandas DataFrame with a large number of rows. A single row might look like this:

data_store.iloc[0]
Out[5]: 
mac_address                                                         00:03:7f:05:c0:06
Visit                                                                        1/4/2016
storeid                                                          Ritika - Bhubaneswar
Mall or not                                                               High Street
Address                                 794, Sahid Nagar, Janpath, Bhubaneswar-751007
Unnamed: 4                                                                         OR
Locality                                                                      JanPath
Affluence Index Locality                                                          4
Lifestyle Index                                                                   5
Tourist Attraction In the locality?                                               0
City                                                                      Bhubaneswar
Pop Density City                                                                 2131
Population Density of City                                                        NaN
City Affluence Index                                                           Medium
Mall / Shopping Complex                                                   High Street
Mall Premiumness Index                                                            NaN
Multiplex                                                                         NaN
Offices Nearby                                                                    NaN
Food Court                                                                        NaN
Average Footfall                                                                  NaN
Average Rental                                                                    NaN
Size of the mall                                                                  NaN
Area                                                                              NaN
Upscale Street                                                                    NaN
Place of Worship in vicinity                                                      NaN
High Street / Mall                                                        High Street
Brand Premiumness Index                                                             4
Restaurant Nearby?                                                                0
Store Size                                                                      Large
Area.1                                                                           2600

Some of the NaN fields may hold real values in other rows; treat this row as an example. The unique key here is mac_address, so I want to start with an empty JSON document and update it for each row of data, like this:

{
   mac_address: "00:03:7f:05:c0:06"
     {
     "cities" : [
       {
         "City Name1" : "Wittenbergplatz",
         "City count" : "12"
       },
       {
         "City Name2" : "Spichernstrasse",
         "City Count" : "19"
       },
       {
        "City Name3" : "Weberwiese",
        "City count" : "30"
       }
      ]
     }
 }

The city count is the number of times a mac_address visited a city. From this particular row, I would like to add a city named Bhubaneswar with a count of 1. For each new row, I want to check whether its mac_address is already in the JSON; for that I would probably have to load the JSON into Python as a dictionary or something similar (please suggest). If the mac_address is already there, I want to update the existing entry for that mac_address with the row's information. If it is not there, I want to add the mac_address as a new field and store the row's information under it. I have to do this in Python with a pandas DataFrame, as I have some familiarity with pandas. Any help on this?
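
A minimal sketch of the check-then-update logic described above, assuming the JSON document is held in memory as a plain dict keyed by mac_address and shaped as {mac_address: {"cities": {city: count}}}. The helper name update_counts and the sample rows (including the second city) are hypothetical and only stand in for data_store:

```python
import pandas as pd


def update_counts(store, row):
    """Record one visit to row['City'] under row['mac_address'] in `store`."""
    # Create the entry the first time a mac_address is seen.
    entry = store.setdefault(row["mac_address"], {"cities": {}})
    cities = entry["cities"]
    # Increment the count, starting from 0 for a city not seen before.
    cities[row["City"]] = cities.get(row["City"], 0) + 1
    return store


# Hypothetical sample rows standing in for data_store.
data_store = pd.DataFrame({
    "mac_address": ["00:03:7f:05:c0:06", "00:03:7f:05:c0:06", "00:03:7f:05:c0:07"],
    "City": ["Bhubaneswar", "Bhubaneswar", "Cuttack"],
})

store = {}  # the "empty JSON document", held as a dict
for _, row in data_store.iterrows():
    update_counts(store, row)

print(store)
```

Looping with iterrows mirrors the row-by-row description but is slow on large frames; a groupby-based count is usually preferable when all rows are available at once.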

can you post an output of the following command: `print(data_store.head(10)[['mac_address','City']])` ? — MaxU - stand with Ukraine, Jun 11 '16 at 13:57
your desired JSON file is not a valid JSON file - please correct it — MaxU - stand with Ukraine, Jun 11 '16 at 14:22
@MaxU Another user asked me to post my actual update question separately and leave this one as is, so I have posted another question on the same topic with the required modifications. Please have a look at it: http://stackoverflow.com/questions/37784660/creating-a-empty-dictionary-and-updating-it-by-a-pandas-data-frame-in-python — , Jun 13 '16 at 08:20

MaxU - stand with Ukraine · Accepted Answer · 2016-06-11T14:27:22.230

Your desired JSON is not valid. Here is the error message from an online JSON validator:

Input:

{
    "mac_address": "00:03:7f:05:c0:06" {
        "cities": [{
            "City Name1": "Wittenbergplatz",
            "City count": "12"
        }, {
            "City Name2": "Spichernstrasse",
            "City Count": "19"
        }, {
            "City Name3": "Weberwiese",
            "City count": "30"
        }]
    }
}

Error

Error: Parse error on line 2:
..."00:03:7f:05:c0:06" {        "cities": [{            
-----------------------^
Expecting 'EOF', '}', ':', ',', ']', got '{'

This solution might help you get started:

In [440]: (df.groupby(['mac_address','City'])
   .....:    .size()
   .....:    .reset_index()
   .....:    .rename(columns={0:'count'})
   .....:    .groupby('mac_address')
   .....:    .apply(lambda x: x[['City','count']].to_dict('records'))
   .....:    .to_dict()
   .....: )
Out[440]:
{'00:03:7f:05:c0:01': [{'City': 'aaa', 'count': 1}],
 '00:03:7f:05:c0:02': [{'City': 'bbb', 'count': 1}],
 '00:03:7f:05:c0:03': [{'City': 'ccc', 'count': 2}],
 '00:03:7f:05:c0:05': [{'City': 'xxx', 'count': 1},
  {'City': 'zzz', 'count': 1}],
 '00:03:7f:05:c0:06': [{'City': 'aaa', 'count': 1},
  {'City': 'bbb', 'count': 1}],
 '00:03:7f:05:c0:07': [{'City': 'aaa', 'count': 3},
  {'City': 'bbb', 'count': 1}]}

data:

In [441]: df
Out[441]:
          mac_address City
0   00:03:7f:05:c0:06  aaa
1   00:03:7f:05:c0:06  bbb
2   00:03:7f:05:c0:07  aaa
3   00:03:7f:05:c0:07  bbb
4   00:03:7f:05:c0:07  aaa
5   00:03:7f:05:c0:01  aaa
6   00:03:7f:05:c0:02  bbb
7   00:03:7f:05:c0:03  ccc
8   00:03:7f:05:c0:03  ccc
9   00:03:7f:05:c0:07  aaa
10  00:03:7f:05:c0:05  xxx
11  00:03:7f:05:c0:05  zzz
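
To fold such a result into a JSON file that already exists, one option is to load the file with json.load, add the new counts to it, and write it back with json.dump. A minimal sketch, assuming the file stores the same {mac_address: [records]} shape as the output above; the function name merge_counts_into_file and the file name visits.json are hypothetical:

```python
import json
import os
import tempfile


def merge_counts_into_file(new_counts, path):
    """Merge {mac: [{'City': ..., 'count': ...}, ...]} into the JSON file at `path`."""
    # Start from an empty document if the file does not exist yet.
    if os.path.exists(path):
        with open(path) as fh:
            existing = json.load(fh)
    else:
        existing = {}

    for mac, records in new_counts.items():
        stored = existing.setdefault(mac, [])
        # Index the stored records by city so counts can be added in place.
        by_city = {rec["City"]: rec for rec in stored}
        for rec in records:
            count = int(rec["count"])  # plain int, so json.dump accepts NumPy integers too
            if rec["City"] in by_city:
                by_city[rec["City"]]["count"] += count
            else:
                new_rec = {"City": rec["City"], "count": count}
                stored.append(new_rec)
                by_city[rec["City"]] = new_rec

    with open(path, "w") as fh:
        json.dump(existing, fh, indent=2)
    return existing


# Usage: merging the same batch twice doubles the stored counts.
path = os.path.join(tempfile.mkdtemp(), "visits.json")
batch = {"00:03:7f:05:c0:07": [{"City": "aaa", "count": 3},
                               {"City": "bbb", "count": 1}]}
merge_counts_into_file(batch, path)
result = merge_counts_into_file(batch, path)
print(result)
```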

I just typed the JSON as an example, so it has errors; sorry for that. Your output is a dict, so how do I update an existing JSON file with it? — , Jun 11 '16 at 14:45
What can be done to get output like this: `{ "00:08:22:24:f8:02": { "cities": { "aaa": 12, "bbb": 4, "ccc": 6 } } }`? The numbers (12, 4, 6) are the counts. — , Jun 13 '16 at 12:48
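
One possible way to get the nested {mac_address: {"cities": {city: count}}} shape asked about in the last comment is to count with groupby(...).size() and build the dict from the resulting Series. This is a sketch under assumptions, not the accepted answer's method: the sample frame is a hypothetical subset of the data shown in the answer, and the variable names are illustrative.

```python
import json

import pandas as pd

# Hypothetical sample with the two relevant columns.
df = pd.DataFrame({
    "mac_address": ["00:03:7f:05:c0:06", "00:03:7f:05:c0:06",
                    "00:03:7f:05:c0:07", "00:03:7f:05:c0:07",
                    "00:03:7f:05:c0:07"],
    "City": ["aaa", "bbb", "aaa", "bbb", "aaa"],
})

# Series indexed by (mac_address, City) holding the number of rows per pair.
counts = df.groupby(["mac_address", "City"]).size()

result = {}
for (mac, city), n in counts.items():
    # int() converts the NumPy integer so json.dumps can serialize it.
    result.setdefault(mac, {"cities": {}})["cities"][city] = int(n)

print(json.dumps(result, indent=2))
```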
