I have a DataFrame with 15,000 rows of binary data, where each string is 365 characters long. I convert each binary string into one row per day (365 rows), starting from 13/12/2020.
Because the data is large, my program runs very slowly. Is there a way I can optimize it?
Data example:
ID | Nature | Binary |
---|---|---|
1122 | M | 1001100100100010010001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100110110010011001001100100110010011001000000100110011011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100100110010011001001100110110010000001001100100110010011001001100 |
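
For reproducibility, here is a minimal frame with the same shape (the binary string is truncated here; the column names match my code below):

```python
import pandas as pd

# Minimal input with the same shape -- the real 'binairy' strings
# are 365 characters long (truncated here for readability)
df = pd.DataFrame({
    'id': [1122],
    'Nature': ['M'],
    'binairy': ['1001100100100010010001001100'],
})
```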
Output:
ID | Nature | Date | Code |
---|---|---|---|
1122 | M | 13/12/2020 | 1 |
1122 | M | 14/12/2020 | 0 |
1122 | M | .......... | ... |
1122 | M | 12/12/2021 | 0 |
Code:
```python
import pandas as pd

start_date = '2020-12-13'  # 13/12/2020, as in the expected output
table_ = pd.DataFrame({'ID': df.id[0], 'Nature': df.Nature[0],
                       'Date': pd.date_range(start_date, periods=len(df.binairy[0]), freq='D'),
                       'Code': list(df.binairy[0])})
for i in range(1, len(df)):  # appending one frame per row is the slow part
    table_i = pd.DataFrame({'ID': df.id[i], 'Nature': df.Nature[i],
                            'Date': pd.date_range(start_date, periods=len(df.binairy[i]), freq='D'),
                            'Code': list(df.binairy[i])})
    table_ = pd.concat([table_, table_i], ignore_index=True)
table_
```
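
From what I have read, the repeated `pd.concat` is the likely bottleneck, so I tried sketching a loop-free version with `explode` (assuming columns `id`, `Nature`, `binairy` as above), but I am not sure it is the idiomatic way:

```python
import pandas as pd

# Sketch of a loop-free version: a single explode instead of ~15000 concats.
# Assumes df has columns 'id', 'Nature' and 'binairy', as in my code above.
start_date = '2020-12-13'

out = (df.rename(columns={'id': 'ID'})
         .assign(Code=df['binairy'].map(list))  # one list of 365 chars per row
         .explode('Code'))                      # one output row per character
# explode repeats the original index, so cumcount() is the day offset 0..364
out['Date'] = pd.to_datetime(start_date) + pd.to_timedelta(
    out.groupby(level=0).cumcount(), unit='D')
out = out[['ID', 'Nature', 'Date', 'Code']].reset_index(drop=True)
```

My understanding is that the original loop is roughly quadratic, because each `pd.concat` copies the growing `table_`, while building everything in one pass should be linear; I would appreciate confirmation that `explode` is the right tool here.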