How to print the count of occurrences of a string in the same CSV file using Python?

Question

I have a CSV file that contains one column (column1). I want to check whether the element in each cell repeats and how many times (occcurance_count), and then print that count of occurrences in the same CSV file using Python.
In the example below, "241682-27638-USD-OCOF" is not repeated, so its count is 1; "241942-37190-USD-DIV" appears twice, so its count is 2; and so on.

I want the output as below, in CSV format:

    column1,occcurance_count
    1682-27638-USD-OGGCOF,1
    241682-27638-USD-OGGINT,1
    241682-27638-USD-CIGGNT,1
    241682-27638-USD-OCGGINT,1
    241942-37190-USD-GGDIV,2
    241942-37190-USD-CHYOF,1
    241942-37190-USD-EQPL,1
    241942-37190-USD-INT,1
    242066-15343-USD-CYJOF,3
    242066-15343-USD-CYJOF,3
    242066-15343-USD-CYJOF,3
    242066-15343-USD-ETHQPL,1
    242066-15343-USD-INFRT,1
    241942-37190-USD-GGDIV,2
    242066-33492-USD-CJHOF,1

Similar to [this question about counting elements in a tuple of tuples](http://stackoverflow.com/questions/24347482/how-can-i-convert-this-tuple-of-tuples-into-a-count-of-its-elements), maybe? — Augusta, May 30 '15 at 03:37

Padraic Cunningham · Answer 1 · 2014-10-27T10:54:00.807

As the count repeats you just need a normal dict:

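d = {}
with open(infile) as f:  # infile is the path to the CSV that has both columns
    next(f)  # skip the header row
    for line in f:
        spl = line.rstrip().split(",")
        d[spl[0].strip()] = spl[1].strip()  # key -> its (repeated) count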

for k,v in d.items():
    print("key = {} count = {}".format(k,v))

If the file you posted is actually the expected output, and you are trying to count each occurrence in a file with a single string on each line and then write each line with its count:

from collections import Counter

d = Counter()
with open("i.csv") as f, open("out.csv","w") as out:
    for line in f:
        d.update([line.rstrip()])  # get counts
    f.seek(0)  # go back to the start of the file
    out.write("column1, occcurance_count\n")  # header row, terminated with a newline
    for line in f:
        out.write("{}, {}\n".format(line.rstrip(), d[line.rstrip()]))  # write line plus count of that line
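
Since the question asks for the counts in the same file, here is a minimal sketch of that variant. It assumes the file is small enough to hold in memory and that its first row is a header; the function name `add_counts_in_place` is hypothetical, not something from the question.

```python
import csv
from collections import Counter


def add_counts_in_place(path):
    """Rewrite a one-column CSV so that each row also carries its count."""
    # Read every row first; the file is closed before it is reopened for writing.
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]  # skip blank lines
    if not rows:
        return  # nothing to do for an empty file

    header, body = rows[0], rows[1:]  # assumption: the first row is a header
    values = [row[0].strip() for row in body]
    counts = Counter(values)  # tally each distinct string once

    # Overwrite the same file with the original value plus its count.
    with open(path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow([header[0].strip(), "occcurance_count"])
        for value in values:
            writer.writerow([value, counts[value]])
```

Reading everything before reopening the file for writing avoids truncating the input while it is still being read.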

better though if you can also answer the `output to csv` part of the question — Anzel, Oct 27 '14 at 10:30

score 1 · Accepted Answer · answered Oct 27 '14 at 14:09

I think the code below is what you are looking for. The logic is simple, if a little lengthy: first open the CSV file for reading and collect all the elements in a list; then use the list's count method to find the number of occurrences of each item; finally open a new CSV file and write each item with its count.

There is surely a more optimized way of doing the same thing, but here is code that comes to mind quickly.

    import csv
    import sys

    try:
        fr = open("mycsv.csv", newline="")
        fw = open("mscsv_counter.csv", "w", newline="")
    except OSError:
        print("Couldn't open the file")
        sys.exit(1)  # stop here; the file handles below would be undefined

    reader = csv.reader(fr)

    # Collect the first column of every non-empty row.
    counterlist = list()
    for row in reader:
        if len(row) > 0:
            counterlist.append(row[0])

    writer = csv.writer(fw)
    data = ["column 1", "counter"]
    writer.writerow(data)
    # Write each item together with the number of times it occurs.
    for item in counterlist:
        rowdata = [item, counterlist.count(item)]
        writer.writerow(rowdata)

    fr.close()
    fw.close()
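
On the "more optimized way" mentioned above: `list.count` rescans the whole list for every item, so the loop is quadratic in the number of rows. A sketch of a single-pass alternative with `collections.Counter`, assuming the values fit in memory (`rows_with_counts` is a hypothetical helper name):

```python
from collections import Counter


def rows_with_counts(items):
    """Return [item, count] pairs in the original order."""
    counts = Counter(items)  # one pass to tally every item
    return [[item, counts[item]] for item in items]


sample = ["a", "b", "a", "c", "a"]  # made-up values for illustration
pairs = rows_with_counts(sample)
# Same pairs as the list.count version, without rescanning the list per item.
assert pairs == [[item, sample.count(item)] for item in sample]
```

Each pair can be passed to `writer.writerow` exactly as `rowdata` is in the code above.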

score 0 · Answer 3 · answered Oct 27 '14 at 10:26

You could use Counter:

>>> from collections import Counter
>>> # values is the open file, with one string per line
>>> counter = Counter(line.strip() for line in values)

>>> counter['242066-15343-USD-CYJOF']
3

>>> counter['241942-37190-USD-GGDIV']
2

Peter Wood

score 0 · Answer 4 · answered Oct 27 '14 at 10:41

Here is some simple code. I hope it helps:

>>> import numpy as np
>>> data=np.loadtxt('a.csv', dtype=str)
>>> data
array(['241682-27638-USD-OCOF', '241682-27638-USD-OINT',
       '241682-27638-USD-CINT', '241682-27638-USD-OCINT',
       '241942-37190-USD-DIV', '241942-37190-USD-COF',
       '241942-37190-USD-EQPL', '241942-37190-USD-INT',
       '242066-15343-USD-COF', '242066-15343-USD-COF',
       '242066-15343-USD-COF', '242066-15343-USD-EQPL',
       '242066-15343-USD-INT', '241942-37190-USD-DIV',
       '242066-33492-USD-COF'], 
      dtype='|S22')
>>> count = [len(np.where(data==i)[0]) for i in data]
>>> count
[1, 1, 1, 1, 2, 1, 1, 1, 3, 3, 3, 1, 1, 2, 1]
>>> fp = open('a.csv', 'w')
>>> for i in range(data.shape[0]):
...     fp.write(str(data[i]) + ' , ' + str(count[i]) + '\n')
...
>>> fp.close()
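
The list comprehension above compares the whole array against every element. A possible alternative, assuming a NumPy version where `np.unique` accepts `return_counts` (1.9 or later), is to tally the unique values once and map the counts back through the inverse index. The sample array below is made up for illustration.

```python
import numpy as np

data = np.array(["a", "b", "a", "c", "a"])  # illustrative values, not the real file

# uniq: sorted unique values
# inverse: for each element of data, its index into uniq
# freq: how many times each unique value occurs
uniq, inverse, freq = np.unique(data, return_inverse=True, return_counts=True)

count = freq[inverse]  # per-row counts, aligned with data
```

`count` can then be written out next to `data` in the same way as in the session above.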
