My Python code:
import pandas as pd

student_dict = {
    "ID": [101, 102, 103, 104, 105],
    "Student": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "Mark": [50, 100, 99, 60, 80],
    "Address": ["St.AAA", "St.BBB", "St.CCC", "St.DDD", "St.EEE"],
    "PhoneNo": [1111111111, 2222222222, 3333333333, 4444444444, 5555555555]
}
df = pd.DataFrame(student_dict)
print("First dataframe")
print(df)

fName = "Student_CSVresult.csv"

def dups(df):
    # keep=False drops every row that occurs more than once
    df = df.drop_duplicates(keep=False)
    return df

try:
    data = pd.read_csv(fName)
    print("CSV file data")
    print(data)
    # combine the existing file contents with the new dataframe, then drop duplicates
    df_merged = pd.concat([data, df])
    df = dups(df_merged)
    print("After removing dups")
    print(df)
    df.to_csv(fName, mode='a', index=False, header=False)
except FileNotFoundError:
    print("File Not Found Error")
    df = df.drop_duplicates()
    df.to_csv(fName, index=False)
    print("New file created and data imported")
except Exception as e:
    print(e)
On the first run, all data was imported without any duplicates. On the next run, I passed a different dataframe:
student_dict = {
    "ID": [101, 102, 103, 104, 105, 101, 102, 103, 104, 105, 106, 107],
    "Student": ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "BBB", "CCC", "DDD", "EEE", "YYY", "ZZZ"],
    "Mark": [50, 100, 99, 60, 80, 50, 100, 99, 60, 80, 100, 80],
    "Address": ["St.AAA", "St.BBB", "St.CCC", "St.DDD", "St.EEE", "St.AAA", "St.BBB", "St.CCC", "St.DDD", "St.EEE", "St.AYE", "St.ZZZ"],
    "PhoneNo": [1111111111, 2222222222, 3333333333, 4444444444, 5555555555, 1111111111, 2222222222, 3333333333, 4444444444, 5555555555, 6666666666, 7777777777]
}
Again there were no issues. Then I passed the first dataframe again:
student_dict = {
    "ID": [101, 102, 103, 104, 105],
    "Student": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "Mark": [50, 100, 99, 60, 80],
    "Address": ["St.AAA", "St.BBB", "St.CCC", "St.DDD", "St.EEE"],
    "PhoneNo": [1111111111, 2222222222, 3333333333, 4444444444, 5555555555]
}
and this time it created duplicate rows in the CSV.
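To show concretely what I mean by duplicates, here is a small standalone example (simplified to one ID column, not my real file) of what drop_duplicates(keep=False) does when the new dataframe only repeats rows that are already in the file:

import pandas as pd

old = pd.DataFrame({"ID": [101, 102, 106, 107]})  # rows already in the CSV
new = pd.DataFrame({"ID": [101, 102]})            # new batch only repeats old rows

merged = pd.concat([old, new])
print(merged.drop_duplicates(keep=False))
# only 106 and 107 remain, and those get appended to the file a second time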
Can someone help me solve this issue? I don't want to overwrite the main file (Student_CSVresult.csv), only append to it.
Also, is there any way to add a new column to the file that automatically captures the timestamp of each data entry?
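Something like this is roughly what I have in mind for the timestamp (the column name "EntryTime" is just an example), but I'm not sure it is the right approach:

# hypothetical: stamp every row of the current batch before appending it
df["EntryTime"] = pd.Timestamp.now()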