
I have three comma-separated columns (column1, column2, column3), as shown below. In the example below, "241682-27638-USD-OCOF" does not repeat, so its count is 1; "241942-37190-USD-DIV" appears twice, so its count is 2; and so on.

column1,column2,column3 ,occcurance_count_of_column3
name1,empId1,241682-27638-USD-CIGGNT ,1
name2,empId2,241682-27638-USD-OCGGINT ,1
name3,empId3,241942-37190-USD-GGDIV ,2
name4,empId4,241942-37190-USD-CHYOF ,1
name5,empId5,241942-37190-USD-EQPL ,1
name6,empId6,241942-37190-USD-INT ,1
name7,empId7,242066-15343-USD-CYJOF ,3
name8,empId8,242066-15343-USD-CYJOF ,3
name9,empId9,242066-15343-USD-CYJOF ,3
name10,empId10,241942-37190-USD-GGDIV ,2
name11,empId11,242066-33492-USD-CJHOF ,1

I have column1, column2 and column3 in a CSV file, and I want to add occcurance_count_of_column3 as the next column. That is, I want to check whether each element in column3 repeats and how many times (the occurrence count), and then write that count to the same CSV file using Python.

Jakob
Rohit
  • Can you show us what you have done so far? – Jakob Oct 29 '14 at 07:28
  • @Jakob Imho the "show us what you have done so far" clause mostly applies when you have a very broad Q with many possible answers (or the Q smells of 'homework' from across the road ;). This Q is not well written, but it's specific; it doesn't have an obvious solution (i.e., if you're a beginner and you don't know the `dict.get` idiom or the `Counter` class in `collections`); its answer is likely useful to all of the above-mentioned beginners; and in the end you don't need much context to show the OP how to overcome her/his specific problem, "How do I count occurrences of a so-and-so item in a table?" – gboffi Oct 29 '14 at 11:15
  • @gboffi the OP should at least show some basic knowledge and attempts and not just ask the community to solve his/her problem. See the [stackoverflow tour](http://stackoverflow.com/tour) : "Don't ask about...Questions you haven't tried to find an answer for (show your work!)" – Jakob Oct 29 '14 at 12:06
  • What format is your data coming in? Are `column1`, `column2`, and `column3` all elements in a list (or tuple) of lists (or tuples)? Is each line a string (commas and all) in a list? Is the whole thing a single block of unprocessed text? What are we starting with for data? – Augusta May 30 '15 at 02:00

1 Answer


You need a counter. A `Counter` class is available from the stdlib's `collections` module, but we can do without it. You need two passes over your data: the first to count the values, the second to print each row with its count. I assume that you're able to slurp the file content into a list-of-lists data structure that we will conveniently name `table`:

counts = {}
for row in table:
    # use the `get` method of a dict with the optional default argument
    # set to 0 (see ">>> help(dict.get)" if get is new for you)
    counts[row[2]] = counts.get(row[2], 0) + 1

for row in table:
    # append the count as a fourth field and print the row as a CSV line
    print(','.join(list(row) + [str(counts[row[2]])]))
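
For completeness, here is a minimal sketch of the same two-pass idea using the `Counter` class mentioned above, with the stdlib `csv` module doing the reading and writing. It rests on a few assumptions that are not in the question: the input file has no header row, the value to count is always the third field, and surrounding whitespace should be ignored when comparing values. The function name `add_occurrence_counts` and its parameters are made up for illustration. It writes to a separate output file rather than overwriting the input in place, which is safer while you are still testing.

```python
import csv
from collections import Counter


def add_occurrence_counts(in_path, out_path, key_index=2):
    """Copy a CSV file, appending how often each row's key column occurs.

    Hypothetical helper: assumes no header row and that every non-blank
    row has at least key_index + 1 fields.
    """
    with open(in_path, newline="") as src:
        table = [row for row in csv.reader(src) if row]  # skip blank lines

    # First pass: count every value of the key column (whitespace stripped).
    counts = Counter(row[key_index].strip() for row in table)

    # Second pass: write each row back with its count appended.
    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in table:
            writer.writerow(row + [counts[row[key_index].strip()]])
    return counts
```

A call such as `add_occurrence_counts("data.csv", "data_with_counts.csv")` (both file names are placeholders) would produce the four-column output; once you are happy with the result, you can replace the original file with the new one.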
gboffi