I have some CSVs that I need to edit based on some checks. The problem is that some of the CSVs are very large (around 40,000 lines), and sometimes my program runs for hours to complete the needed checks. Below is a part of my code:
Sample input:

   Status        Date
1  Active  12/03/2020
2  Locked  12/03/2020
3  Active         NaN
import pandas as pd

# Rows that fail any check get collected here.
newdf = pd.DataFrame()

for i in range(len(df)):
    # Date check: missing dates are read in as float NaN.
    if type(df.at[i, 'Date']) == float:
        aa = df.loc[[i]]
        newdf = pd.concat([newdf, aa])  # DataFrame.append was removed in pandas 2.0
        df = df.drop([i])
df = df.reset_index(drop=True)
print("Passed date check")

for i in range(len(df)):
    # Status check: move rows whose Status does not contain "ACTIVE" to newdf.
    if "ACTIVE" not in df.at[i, 'Status']:
        aa = df.loc[[i]]
        newdf = pd.concat([newdf, aa])
        df = df.drop([i])
print(newdf)
Output:

Status        Date
Locked  12/03/2020
Active         NaN
I have a few more loops like these. How can I rewrite the code so that it processes these CSVs faster?
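
From what I've read, whole-column boolean masks might be the way to go instead of row-by-row loops. Below is a minimal, untested sketch of what I have in mind. Two assumptions on my part: missing dates come in as NaN, and the Status comparison is meant to be case-insensitive (otherwise "ACTIVE" would never match "Active" and the sample output above wouldn't come out this way). Is this the right direction?

import pandas as pd

# Same sample frame as above (Date uses NaN for the missing value).
df = pd.DataFrame({
    'Status': ['Active', 'Locked', 'Active'],
    'Date': ['12/03/2020', '12/03/2020', float('nan')],
})

# Both checks evaluated once over the whole columns instead of per row.
# Assumption: upper-casing Status to make the comparison case-insensitive.
failed = df['Date'].isna() | ~df['Status'].str.upper().str.contains('ACTIVE', na=False)

newdf = df[failed]                        # rows that fail either check
df = df[~failed].reset_index(drop=True)   # rows that pass both checks
print(newdf)

If this is equivalent, it should avoid the per-row append and drop calls entirely, which I suspect are what make my version so slow.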