G'day!
I am trying to find the best way to convert the following data from a dataframe into YAML.
Region Country State City County Address Rep
0 AMER USA CA Brea Orange Street 1 Mr. X
1 AMER USA CA Brea Orange Street 2 Mr. Y
2 AMER USA CA Brea Orange Street 3 Mrs. Doe
3 AMER USA CA Brea Buena Park Street 1 Person 1
4 AMER USA CA Brea Buena Park Street 2 Person 2
5 AMER USA CA Brea Buena Park Street 3 Person 3
6 AMER USA NY NYC Bronx Street 40 Person 4
7 EMEA UK - London Kent Street 10 Mrs. Jane
8 APAC Singapore - Singapore Sabana Street 20 Mrs. Xi
In pandas, I can use pivot table to get this structure
df.pivot_table(index=['Region','Country','State','City','County','Address'],aggfunc='first')
Rep
Region Country State City County Address
AMER USA CA Brea Buena Park Street 1 Person 1
Street 2 Person 2
Street 3 Person 3
Orange Street 1 Mr. X
Street 2 Mr. Y
Street 3 Mrs. Doe
NY NYC Bronx Street 40 Person 4
APAC Singapore - Singapore Sabana Street 20 Mrs. Xi
EMEA UK - London Kent Street 10 Mrs. Jane`
But how do I take this to YAML (in Pandas way!) in order to achieve the following format:
- Region: AMER
- Country: USA
- State:
- NY:
- City:
- NYC:
- County:
- Bronx
- Address:
- Street40
- Rep:
- Person 4
- CA:
- City:
- Brea:
- County:
- Orange
- Buena Park
- Address:
- Street 1
- Street 2
- Street 3
- Street 1
- Street 2
- Street 3
- Street 4
- Rep:
- Mr. X
- Mr. Y
- Mrs. Doe
- Person 1
- Person 2
- Person 3
One way I found doing this is doing df.iterrows() and then handling the data manually into dictionary and then to yaml, but I just don't like this approach - there must be more elegant 'pandas' way of doing it, if that makes sense :-)
Any help will be much appreciated.