I'm trying to read one excel file which looks like below:
I also have one script which will convert this xlsx file into the csv files with its sheet names (If three sheets available then it will create three different csv files).
It's csv file looks like below:
Unnamed: 0,Gender A,Unnamed: 2,Gender B,Unnamed: 4,Gender C,Gender D
date,Male,Female,Male,Female,Male,Female
2017-01-01 00:00:00,2,3,3,2,3,3
2017-01-02 00:00:00,5,7,7,42,3,5
2017-01-03 00:00:00,4,6,6,12,2,7
2017-01-04 00:00:00,6,7,3,6,4,8
2017-01-05 00:00:00,6,8,8,3,5,3
2017-01-06 00:00:00,54,3,3,6,3,5
2017-01-07 00:00:00,3,4,6,3,6,5
2017-01-08 00:00:00,3,6,6,3,6,4
2017-01-09 00:00:00,2,2,8,7,5,2
2017-01-10 00:00:00,4,3,2,4,5,5
2017-01-11 00:00:00,12,10,10,3,1,6
2017-01-12 00:00:00,9,7,7,3,4,1
So, my first question is which is better choice to process these files - xlsx or csv?
Next, I just want to read first two rows as a column header. So that I can understand In which Gender how many males and females are available.
Expected Output:
0 date Gender A_Male Gender A_Female Gender B_Male Gender B_Female Gender C_Male Gender D_Female
1 2017-01-01 00:00:00 2 3 3 2 3 3
2 2017-01-02 00:00:00 5 7 7 42 3 5
3 2017-01-03 00:00:00 4 6 6 12 2 7
4 2017-01-04 00:00:00 6 7 3 6 4 8
5 2017-01-05 00:00:00 6 8 8 3 5 3
6 2017-01-06 00:00:00 54 3 3 6 3 5
7 2017-01-07 00:00:00 3 4 6 3 6 5
8 2017-01-08 00:00:00 3 6 6 3 6 4
9 2017-01-09 00:00:00 2 2 8 7 5 2
10 2017-01-10 00:00:00 4 3 2 4 5 5
11 2017-01-11 00:00:00 12 10 10 3 1 6
12 2017-01-12 00:00:00 9 7 7 3 4 1