
I have a CSV file with values like

68,68
70,70
80,90

I would like to remove the duplicates within each row, i.e. get the output

68
70
80,90

Or

 68,
 70,
 80,90

I tried searching everywhere but was not able to find how to do this.

3 Answers


Depending on the size of your input, a naive approach could be fine:

$ cat test 
68,68
70,70
80,90
$ cat readvals.py 
#! /usr/bin/env python
import csv
vals = [] # a list for the entire file
with open('test') as infile:
    lines = csv.reader(infile, delimiter=',')
    for i, line in enumerate(lines):
        vals.append([]) # append a sub-list for this row.
        for val in line:
            if val not in vals[i]:
                vals[i].append(val) # add values for the row
print(vals)
$ python readvals.py
[['68'], ['70'], ['80', '90']]
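A shorter variant of the same idea: on Python 3.7+, `dict.fromkeys` preserves insertion order, so it deduplicates a row in one step while keeping the first occurrence of each value. This sketch uses an in-memory string in place of the `test` file, and `csv.writer` to produce the deduplicated output:

```python
#!/usr/bin/env python
# Sketch: per-row deduplication with dict.fromkeys (order-preserving
# on Python 3.7+). io.StringIO stands in for reading/writing real files.
import csv
import io

data = "68,68\n70,70\n80,90\n"  # stands in for the 'test' file

out = io.StringIO()
writer = csv.writer(out)
for row in csv.reader(io.StringIO(data)):
    # dict keys are unique and keep first-seen order; writerow
    # iterates the dict's keys, so duplicates within the row vanish.
    writer.writerow(dict.fromkeys(row))

print(out.getvalue())
```

To work on a real file, replace the two `io.StringIO` objects with `open('test')` and `open('out.csv', 'w', newline='')`.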
Chase LP

For removing duplicate rows, I use this code:

import pandas as pd

df = pd.read_csv('myfile.csv')

df.drop_duplicates(inplace=True)

df.to_csv('myfile.csv', index=False)
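Note that `drop_duplicates` removes duplicate *rows* (rows identical to an earlier row), not repeated values within a single row. A small self-contained demo, with an inline DataFrame standing in for `myfile.csv`:

```python
# Demo of DataFrame.drop_duplicates: it drops rows that duplicate an
# earlier row; values repeated inside one row are untouched.
import pandas as pd

df = pd.DataFrame({"a": [68, 70, 68], "b": [68, 70, 68]})
df.drop_duplicates(inplace=True)  # row 2 duplicates row 0, so it is dropped
print(df)
```

If the goal is instead to deduplicate values within each row, `drop_duplicates` alone will not do it.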
sushanth

I would suggest you have a look at the links below, since your requirement is not entirely clear.

pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

Example ref: https://www.journaldev.com/33488/pandas-drop-duplicate-rows-drop_duplicates-function

thirdeye
  • 150
  • 7