I have a CSV file with values like
68,68
70,70
80,90
Here I would like it to remove the duplicates, i.e. give the output
68
70
80,90
Or
68,
70,
80,90
But I tried searching everywhere and was not able to find out how to do this.
Depending on the size of your input, a naive approach could be fine:
$ cat test
68,68
70,70
80,90
$ cat readvals.py
#!/usr/bin/env python
import csv

vals = []  # a list for the entire file
with open('test') as infile:
    lines = csv.reader(infile, delimiter=',')
    for i, line in enumerate(lines):
        vals.append([])  # append a sub-list for this row
        for val in line:
            if val not in vals[i]:
                vals[i].append(val)  # add only unseen values for the row
print(vals)
$ python readvals.py
[['68'], ['70'], ['80', '90']]
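If you also need to write the de-duplicated rows back out, the same idea extends naturally with csv.writer. A minimal sketch (the output filename 'deduped.csv' is my own choice, and the sample file is created inline so the example is self-contained):

```python
import csv

# Sample input matching the question, written here so the example runs as-is.
with open('test', 'w') as f:
    f.write('68,68\n70,70\n80,90\n')

def dedupe_row(row):
    """Keep only the first occurrence of each value in a row."""
    seen = []
    for val in row:
        if val not in seen:
            seen.append(val)
    return seen

# Read each row, drop repeated values within it, and write the result.
with open('test') as infile, open('deduped.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for row in csv.reader(infile):
        writer.writerow(dedupe_row(row))

print(open('deduped.csv').read())
```

This produces your second desired output shape: rows keep only their unique values, so short rows simply have fewer fields.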
For removing duplicate rows (i.e. rows that are exact copies of other rows, not repeated values within a row), I use this code:
import pandas as pd
df = pd.read_csv('myfile.csv')
df.drop_duplicates(inplace=True)
df.to_csv('myfile.csv', index=False)
Since your exact requirement isn't clear, I'd suggest having a look at these:
pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Example ref: https://www.journaldev.com/33488/pandas-drop-duplicate-rows-drop_duplicates-function
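Note that drop_duplicates removes whole rows that duplicate other rows; if the goal is to drop repeated values within each row (as the sample output suggests), one hedged sketch is to apply a small dedup function row-wise and pad with None so every row keeps the same length (the inline DataFrame here stands in for your CSV, which appears to have no header row):

```python
import pandas as pd

# Sample data matching the question; in practice you'd use
# pd.read_csv('myfile.csv', header=None) since the file has no header.
df = pd.DataFrame([[68, 68], [70, 70], [80, 90]])

def dedupe_row(row):
    """Keep the first occurrence of each value; pad with None."""
    seen = []
    for val in row:
        if val not in seen:
            seen.append(val)
    return pd.Series(seen + [None] * (len(row) - len(seen)))

df = df.apply(dedupe_row, axis=1)
print(df)
```

Rows with a removed duplicate end up with NaN in the trailing column, which pandas writes as an empty field, matching the `68,` style of output.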