0

I have a Dataframe, simlified as follows.

ID | year | y | A1 | A2 | ... | A52 | B1 | ...
1    1980 | 10| 1  | 2  | ... | 52  | 53 |
1    1981 | 20| 1  | 2  | ... | 52  | 53 |
1    1982 | 30| 1  | 2  | ... | 52  | 53 |
2    1980
2    1981
2    1982
...
1000

I need to reorganize it into:

ID | 1982_y | A1(from 1980) | ... | A52 | A1(from 1981) | ... | A1(from 1982)| ... | A52 |
1  | 30     | 1 | 2 | 3 | 4 | ... 
2
3
4
...
1000

where I need to concatenate all the values from A1 to A52 for each year of the same location. Only the last y of each ID is kept as 1982_y. Other values are discarded. By the end of the operation, there should be one row for each location, so 1000 rows.

For example, we will merge 1980, 1981, 1982 for ID=1 into one row. The last value of y, 30 is kept as 1982_y. Then we concatenate the rest of the values for all 3 years into one line. If we have A B C, 52 for each, then we have 52x3 = 156 values.

I'm thinking I should iterate over the rows first, and then iterate over the columns, store A1-A52 and y of last year to a Datafame. Append this to the dataframe for each iteration. I'm not sure how to carry out this operation in pandas. Many thanks!

Andy
  • 167
  • 9
  • 2
    Please present your example more clearly than this. An actual small working example would be best. – piRSquared Apr 05 '22 at 14:58
  • You need a `pivot`: `df2 = df.pivot(index='ID', columns='year') ; df2.columns = df2.columns.map(lambda x: f'{x[0]}_{x[1]}')` – mozway Apr 05 '22 at 15:03

0 Answers0