0

I am struggling to find a way to transform excel data set to a tabular form using Pandas dataframe.

Source data set description: Column 1 lists metrics names and column 2 onward, all columns have corresponding site wise and date wise metric values. All columns in the row number 1 from column 2 onward have a date to identify which day the data belongs to and row number 3 from column 2 onward to identify which site the data belongs to. Row 4 is where metric name (column 1) and metric values (column 2) start.

Desired output description: Column 1 should list metric names column 2 should list corresponding date of the data and site column 3 should list corresponding site of data and date column 4 should list corresponding metric value

Hope this makes sense. I could have uploaded a sample file but being new to stack overflow, guess need some points to do so.

So far i have tried transpose, melt etc with pandas dataframe but really not able to get the results.

Guess this could be a real challenger for data wrangling experts here.

Nabeel
  • 1
  • Instead of uploading a sample file, see [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and maybe add some sample data and the expected output accordingly. – BigBen Jun 02 '20 at 14:40

1 Answers1

0

Hard to say without seeing the data but would you it help you to put each column in a list, manipulate the lists, then put them back in a DataFrame in a multi-dim list?

col1 = df['column 1'].tolist()
col2 = df['column 2'].tolist()
col3 = df['column 3'].tolist()

for i in range(len(col1)):
    lst.append([col1[i],col2[i],col3[i]])

transposed_df = pd.DataFrame(lst, columns = ['metric names','date','site']
Ed Kloczko
  • 408
  • 4
  • 6
  • desired result is `Output_DF = pd.DataFrame(columns = ['metric_names','metric_value','date','site'])`. With list appending method, the site code and date become part of rows along with the metric values. In the original data read from excel, site code and date appear on the row 1 and 2 in each column barring the column 1 where metric names are listed. Each of these columns correspond to the daily metrics values for a site, metrics names being listed in column 1. Any way to pick up site code and date from their original position and repeat them against metric name and metric values? – Nabeel Jun 03 '20 at 11:55
  • Could you use the index to track their original position? – Ed Kloczko Jun 03 '20 at 12:09