0

I want to transform the following example csv data:

platform_region_combined,platform,cloudregion,on_demand_price_usd,on_demand_price_eur,capacity_storage_price_usd,capacity_storage_price_eur,standard_tier_price_eur,standard_tier_price_usd,standard_tier_price_gbp,enterprise_tier_price_eur,enterprise_tier_price_usd,enterprise_tier_price_gbp,business-critical_tier_price_eur,business-critical_tier_price_usd,business-critical_tier_price_gbp amazonwebservicesaws_asiapacificmumbai,amazonwebservicesaws,asiapacificmumbai,46.00,38.33,25.00,20.83,1.83,2.20,,2.75,3.30,,3.67,4.40, amazonwebservicesaws_asiapacificosaka,amazonwebservicesaws,asiapacificosaka,46.00,38.33,25.00,20.83,2.38,2.85,,3.58,4.30,,4.75,5.70, amazonwebservicesaws_asiapacificseoul,amazonwebservicesaws,asiapacificseoul,46.00,38.33,25.00,20.83,2.29,2.75,,3.38,4.05,,4.58,5.50, amazonwebservicesaws_asiapacificsingapore,amazonwebservicesaws,asiapacificsingapore,46.00,38.33,25.00,20.83,2.08,2.50,,3.08,3.70,,3.38,5.00, amazonwebservicesaws_asiapacificsydney,amazonwebservicesaws,asiapacificsydney,46.00,38.33,25.00,20.83,2.29,2.75,,3.38,4.05,,4.58,5.50, googlecloudplatform_europewest2london,googlecloudplatform,europewest2london,40.00,33.33,23.00,19.17,2.25,2.70,,3.33,4.00,,4.50,5.40, googlecloudplatform_europewest4netherlands,googlecloudplatform,europewest4netherlands,35.00,33.33,20.00,19.17,2.17,2.60,,3.25,3.90,,4.33,5.20, googlecloudplatform_uscentral1iowa,googlecloudplatform,uscentral1iowa,35.00,29.17,20.00,16.67,1.67,2.00,,2.50,3.00,,3.33,4.00, microsoftazure_centralusiowa,microsoftazure,centralusiowa,40.00,33.33,23.00,19.17,1.67,2.00,,2.50,3.00,,3.33,4.00, microsoftazure_eastus2virginia,microsoftazure,eastus2virginia,40.00,33.33,23.00,19.17,1.67,2.00,,2.50,3.00,,3.33,4.00, microsoftazure_japaneasttokyo,microsoftazure,japaneasttokyo,46.00,38.33,25.00,20.83,2.38,2.85,,3.58,4.30,,4.75,5.70,

To the following JSON output:

{
  "platform": {
    "amazonwebserviceaws": {
      "asiapacificosaka": {
        "on_demand_price": {
          "eur": "1.00",
          "usd": "3.00",
          "gbp": "4.00"
        },
        "capacity_storage_price": {
          "eur": "5.00",
          "usd": "6.00",
          "gbp": "7.00"
        },
        "tier": {
          "standard": {
            "eur": "1.00",
            "usd": "2.00",
            "gbp": "3.00"
          },
          "enterprise": {
            "eur": "4.00",
            "usd": "5.00",
            "gbp": "6.00"
          },
          "business-critical": {
            "eur": "7.00",
            "usd": "8.00",
            "gbp": "9.00"
          }
        }
      },
      "asia_test": {
        "on_demand_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },
        "capacity_storage_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },
        "tier": {
          "standard": {
            "eur": "1.00",
            "usd": "2.00",
            "gbp": "3.00"
          },
          "enterprise": {
            "eur": "4.00",
            "usd": "5.00",
            "gbp": "6.00"
          },
          "business-critical": {
            "eur": "7.00",
            "usd": "8.00",
            "gbp": "9.00"
          }
        }
      }
    },
    "googlecloudplatform": {
      "europe-west": {
        "on_demand_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },
        "capacity_storage_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },
        "tier": {
          "standard": {
            "eur": "1.00",
            "usd": "2.00",
            "gbp": "3.00"
          },
          "enterprise": {
            "eur": "4.00",
            "usd": "5.00",
            "gbp": "6.00"
          },
          "business-critical": {
            "eur": "7.00",
            "usd": "8.00",
            "gbp": "9.00"
          }
        }
      },
      "europe_east": {
        "on_demand_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },
        "capacity_storage_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },
        "tier": {
          "standard": {
            "eur": "1.00",
            "usd": "2.00",
            "gbp": "3.00"
          },
          "enterprise": {
            "eur": "4.00",
            "usd": "5.00",
            "gbp": "6.00"
          },
          "business-critical": {
            "eur": "7.00",
            "usd": "8.00",
            "gbp": "9.00"
          }
        }
      }
    }
  }
}

So the CSV file has to be grouped in both platform and cloudregion. Then it has to add an extra key value 'tier' in which the columns standard_tier_price_eur, standard_tier_price_gbp, standard_tier_price_usd etc. have to be splitted by the column name to get the key:value pairs shown in the JSON.

    "tier": {
      "standard": {
        "eur": "1.00",
        "usd": "2.00",
        "gbp": "3.00"
      }, 

Similar for the on_demand_price_xxx or capacity_storage_price_xxx columns which have to be transformed to:

        "on_demand_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },
        "capacity_storage_price": {
          "eur": "1.00",
          "usd": "1.00",
          "gbp": "1.00"
        },

Anyone knows if this is even possible?

1 Answers1

0

The Python csv module has a class that is called csv.DictReader which will convert every CSV line in a flat dictionary you can iterate over. From here it is probably some manual coding to get the data in the shape you want.

You can iterate over keys in a dictionary like:

for key in dict:
   parts = key.split('_')

This might be helpful to use the resulting list to access your nested structure:

Access nested dictionary items via a list of keys?

Falk Schuetzenmeister
  • 1,497
  • 1
  • 16
  • 35