0

I would like to calculate the rate of change between the numeric columns of two dataframes based on a common unique row identifier and unique row-column combination.

Here is an example. I opted to present the tables as images in order to use colors to highlight the peculiarities of the two datasets. That is, each dataframe contains numeric and non-numeric columns, and rows and columns may not be in the same order. Also, the numeric columns on which the calculation should take place are always those after the 'Time' column.

F

The df.divide() approach doesn't work here because the rows and columns are not in the same order. I also saw the top answer in this thread, but again the approach doesn't generalize to mine.

StatsScared
  • 517
  • 6
  • 20
  • Do you have corresponding columns and rows in both dfs? Can you order the colums to one way eg. lst=df1.columns.tolist df2[[lst]] and then order the rows also? – uxke Jan 08 '20 at 05:34

1 Answers1

0

If your problem is essentially with the columns and rows not being in the right order, that can be solved by essentially reordering the columns and rows.

#Identifying the columns for which the difference is to be computed. Since #'Time' is the 4th column, we take all columns after that
valCols = list(df.columns)[4:]

#Sorting the datasets so that the rows align
df1 = df1.sort_values('ID')
df2 = df2.sort_values('ID')

#Keeping only the value columns. This also ensures that the columns are in the same order now
df1 = df1[valCols]
df2 = df1[valCols]


Roshan Santhosh
  • 677
  • 3
  • 9