I have the following dataframe:
import datetime
import random
import pandas as pd
dates = [str(datetime.datetime(2020, 1, 1, 0, 0, 0, 0) + datetime.timedelta(days=i)) for i in range(3)]
repetitions = [3, 6, 4]
dates = [i for i, j in zip(dates, repetitions) for k in range(j)]
cities_ = ['Paris', 'Tokyo', 'Sydney', 'New-York', 'Rio', 'Berlin']
cities = [cities_[0: repetitions[i]] for i in range(len(repetitions))]
cities = [i for j in cities for i in j]
temperatures = [round(random.normalvariate(20, 5), 1) for _ in range(len(cities))]
humidities = [round(random.normalvariate(0.5, 0.4), 1) for _ in range(len(cities))]
humidities = [min(i, 1) for i in humidities]
humidities = [max(i, 0) for i in humidities]
df = pd.DataFrame(data=list(zip(dates, cities, temperatures, humidities)), columns=['date', 'city', 'temperature', 'humidity'])
I need to remove the index levels after applying the pivot function. The code below
values = ['temperature', 'humidity']
df_ = df.pivot(index='date', columns='city', values=values)
Col = list(set(df['city'].values))
for value in values:
    df_.rename(columns={i: value + '_' + i for i in Col}, inplace=True)
outputs:
                            temperature                                                            ...           humidity
city                 temperature_Berlin  temperature_New-York  temperature_Paris  temperature_Rio  ...  temperature_Paris  temperature_Rio  temperature_Sydney  temperature_Tokyo
date                                                                                               ...
2020-01-01 00:00:00                 NaN                   NaN               21.2              NaN  ...                0.3              NaN                 1.0                1.0
2020-01-02 00:00:00                18.4                  14.2               19.3             28.7  ...                0.6              0.6                 0.1                0.2
2020-01-03 00:00:00                 NaN                  31.6               25.9              NaN  ...                0.8              NaN                 0.1                0.0
and I need the following result:
                     temperature_Paris  humidity_Paris  temperature_Tokyo  humidity_Tokyo  temperature_Sydney  ...  humidity_New-York  temperature_Rio  humidity_Rio  temperature_Berlin  humidity_Berlin
2020-01-01 00:00:00               21.2             0.3               17.5             1.0                26.3  ...                NaN              NaN           NaN                 NaN              NaN
2020-01-02 00:00:00               19.3             0.6               15.1             0.2                22.8  ...                0.1             28.7           0.6                18.4              0.4
2020-01-03 00:00:00               25.9             0.8               27.5             0.0                29.7  ...                0.6              NaN           NaN                 NaN              NaN
The solutions offered for similar-looking questions, which essentially amount to:
df_ = df_.reset_index().rename_axis([None, None], axis=1)
do not work here: they only clear the level names, and the columns remain a two-level MultiIndex.
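
For reference, here is a minimal sketch of one way the target layout might be reached. It assumes the goal is a single level of column names of the form value_city, ordered city by city in order of first appearance, with no name left on the index. The small frame below is a hypothetical stand-in for df with made-up values, and the names wide and cities are introduced only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for `df` (same columns, made-up values).
df = pd.DataFrame({
    'date': ['2020-01-01', '2020-01-01', '2020-01-02', '2020-01-02', '2020-01-02'],
    'city': ['Paris', 'Tokyo', 'Paris', 'Tokyo', 'Sydney'],
    'temperature': [21.2, 17.5, 19.3, 15.1, 22.8],
    'humidity': [0.3, 1.0, 0.6, 0.2, 0.1],
})
values = ['temperature', 'humidity']

# Pivot as in the question: columns become a MultiIndex of (value, city).
wide = df.pivot(index='date', columns='city', values=values)

# Reorder so each city's columns sit together, cities in order of first appearance.
cities = df['city'].unique().tolist()
wide = wide[[(value, city) for city in cities for value in values]]

# Collapse the two column levels into flat 'value_city' strings.
wide.columns = [f'{value}_{city}' for value, city in wide.columns]

# Drop the leftover 'date' label on the row index.
wide = wide.rename_axis(None)

print(wide)
```

Assigning a plain list to wide.columns replaces the MultiIndex outright, which is why no separate step is needed to clear the 'city' level name. Renaming inside a loop over values, by contrast, matches labels on every level each time and so prefixes the same city labels twice.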