I have a dataset which has data recorded in time-frames:
| ID | ITEM | TIME | AMOUNT |
| --- | --------- | --------------------- | ------ |
| 1 | 5000 |10/12/2101 6:17:00 PM | 10 |
| 1 | 5200 |10/12/2101 6:17:00 PM | 2 |
| 1 | 5100 |10/12/2101 6:17:00 PM | 1.70 |
| 1 | 5300 |10/13/2101 3:00:00 AM | 1.52 |
| 1 | 5000 |10/13/2101 3:00:00 AM | 11 |
I want to convert this long format to a column-wise (wide) format, with the item numbers as column names and the corresponding amount at each time-stamp as the values. I first created an empty data frame with all of the item numbers as columns, and I am running a for loop over each time-stamp to get the item and amount at that time and append them as a dictionary to the empty data frame.
The problem is that this produces a data frame which looks like this:
| ID | TIME | 5000 | 5100 | 5200 | 5300 |
| --- | ---------------------- | -------|-------|-------|-------|
| 1 | 10/12/2101 6:17:00 PM | 10 | | | |
| 1 | 10/12/2101 6:17:00 PM | | 1.70 | | |
| 1 | 10/12/2101 6:17:00 PM | | | 2 | |
| 1 | 10/13/2101 3:00:00 AM | 11 | | | |
| 1 | 10/13/2101 3:00:00 AM | | | | 1.52 |
But I want to have only one row for each time-stamp, with all of the item columns for that time-stamp filled in that row. For example:
| ID | TIME | 5000 | 5100 | 5200 | 5300 |
| --- | ---------------------- | -------|-------|-------|-------|
| 1 | 10/12/2101 6:17:00 PM | 10 | 1.70 | 2 | |
| 1 | 10/13/2101 3:00:00 AM | 11 | | | 1.52 |
Here is my code:
import pandas as pd

# Create an empty dataframe of the selected items
df_to_fill = pd.DataFrame(columns=['ID', 'TIME', '5000', '5100', '5200', '5300'])
# Find list of time-frames
list_time_frames = df['TIME'].unique()
# For each time-frame
for time_frame in list_time_frames:
    # For item(s) in each time-frame
    for i, r in df.iterrows():
        if r['TIME'] == time_frame:
            # Build a single-item row and append it to the result
            dict_ = {'TIME': time_frame, 'ID': r['ID'], str(r['ITEM']): r['AMOUNT']}
            df_to_fill = df_to_fill.append(dict_, ignore_index=True)
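
For reference, the one-row-per-time-stamp layout described above can be sketched without a loop using `DataFrame.pivot`. This is only a minimal sketch under some assumptions: the sample frame is rebuilt by hand from the first table, each (`ID`, `TIME`, `ITEM`) combination is assumed to appear at most once (`pivot` raises an error on duplicates), and the name `wide` is just illustrative.

```python
import pandas as pd

# Sample data copied by hand from the first table (TIME kept as plain strings)
df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 1],
    'ITEM': [5000, 5200, 5100, 5300, 5000],
    'TIME': ['10/12/2101 6:17:00 PM'] * 3 + ['10/13/2101 3:00:00 AM'] * 2,
    'AMOUNT': [10, 2, 1.70, 1.52, 11],
})

# One row per (ID, TIME), one column per ITEM, AMOUNT as the cell values.
# Assumption: no (ID, TIME, ITEM) combination is repeated.
wide = (
    df.pivot(index=['ID', 'TIME'], columns='ITEM', values='AMOUNT')
      .rename_axis(columns=None)  # drop the leftover 'ITEM' label on the columns
      .reset_index()              # turn ID and TIME back into ordinary columns
)

# Use string column names, matching the '5000', '5100', ... names in the question
wide.columns = [str(c) for c in wide.columns]

print(wide)
```

Items that were not recorded at a given time-stamp come out as `NaN`. If the same item can occur more than once at the same time-stamp, `pivot_table` with an explicit `aggfunc` (for example `'first'` or `'sum'`) may be a better fit, depending on how duplicates should be combined.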