
I have a CSV file with a date in each row, and I want to count how many times each date occurs so that I can plot the counts as a bar graph using matplotlib.

How do I count the number of instances of each date?

I have the following code to read the CSV file, but I'm not sure where to go from here:

def Readtoarray():
    with open('Book1.csv','r') as file:
        reader = csv.reader(file, delimiter=',')
        next(reader, None)  # skip the headers
        for row in reader:
            XXXXXXXXX

Example data:

23/03/2020,6630997
23/03/2020,6630990
20/03/2020,6630390
20/03/2020,6630386

3 Answers


I would recommend using a collections.Counter, a dict subclass made for counting (see https://docs.python.org/3/library/collections.html#collections.Counter):

import collections
import csv

def Readtoarray():
    c = collections.Counter()
    with open('Book1.csv', 'r') as file:
        reader = csv.reader(file, delimiter=',')
        next(reader, None)  # skip the headers
        for row in reader:
            c[row[0]] += 1  # or whatever the row index is for the date(s)
    return c

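Calling the function and printing the counts gives output like the following (the dates shown here come from a different sample file than the one in the question):

>>> c = Readtoarray()
>>> for k, v in c.items():
...     print(k, ": ", v)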

2-Sep :  4
3-Sep :  2
23-Sep :  2
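
To get from the counts to a bar graph, one possible sketch is shown below. It is not part of the original answer: it assumes the dates are in dd/mm/yyyy form as in the question's example data, uses a made-up header line (`Date,ID`), and reads from an in-memory string instead of `Book1.csv` so it can be run as is. The helper names `count_dates`, `bar_data` and `plot_counts` are hypothetical.

```python
import collections
import csv
import datetime
import io

# Rows copied from the question; the header line is a hypothetical stand-in.
SAMPLE = """Date,ID
23/03/2020,6630997
23/03/2020,6630990
20/03/2020,6630390
20/03/2020,6630386
"""

def count_dates(file):
    """Count how often each date appears in the first column."""
    reader = csv.reader(file, delimiter=',')
    next(reader, None)  # skip the header row
    counts = collections.Counter()
    for row in reader:
        if not row:  # ignore blank lines
            continue
        # Parse so that the keys can later be sorted chronologically.
        day = datetime.datetime.strptime(row[0], "%d/%m/%Y").date()
        counts[day] += 1
    return counts

def bar_data(counts):
    """Return (labels, heights) in chronological order."""
    days = sorted(counts)
    labels = [d.strftime("%d/%m/%Y") for d in days]
    heights = [counts[d] for d in days]
    return labels, heights

def plot_counts(counts):
    """Draw the bar chart; requires matplotlib to be installed."""
    import matplotlib.pyplot as plt  # imported here so counting works without it
    labels, heights = bar_data(counts)
    plt.bar(labels, heights)
    plt.xticks(rotation=90)
    plt.ylabel("Occurrences")
    plt.tight_layout()
    plt.show()

counts = count_dates(io.StringIO(SAMPLE))
labels, heights = bar_data(counts)
print(labels, heights)  # ['20/03/2020', '23/03/2020'] [2, 2]
```

With a real file, `count_dates(open('Book1.csv', newline=''))` followed by `plot_counts(counts)` should give one bar per date; there is no need to store each date in its own variable, since the two lists hold everything the plot needs.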
j-berg

I guess the simplest way would be to read the CSV file into a pandas DataFrame and then use the value_counts() method.

See also the related question: Count the frequency that a value occurs in a dataframe column

import pandas as pd

def Readtoarray():
    df = pd.read_csv('Book1.csv')
    # assuming the date column is called "Dates"
    return df["Dates"].value_counts()
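
As a rough sketch of how this could be taken through to a bar chart in date order: the example below assumes a column named `Dates` (a hypothetical header, since the question does not show one), dd/mm/yyyy dates as in the question's sample, and an in-memory string in place of `Book1.csv`. The function name `plot_counts` is made up for illustration.

```python
import io

import pandas as pd

# Rows copied from the question; the header line is a hypothetical stand-in.
SAMPLE = """Dates,ID
23/03/2020,6630997
23/03/2020,6630990
20/03/2020,6630390
20/03/2020,6630386
"""

df = pd.read_csv(io.StringIO(SAMPLE))

# Parse the strings so the counts can be sorted chronologically.
dates = pd.to_datetime(df["Dates"], format="%d/%m/%Y")

# value_counts() orders by frequency; sort_index() puts the dates in order.
counts = dates.value_counts().sort_index()

def plot_counts(counts):
    """Draw the bar chart; requires matplotlib to be installed."""
    import matplotlib.pyplot as plt
    labelled = counts.copy()
    labelled.index = labelled.index.strftime("%d/%m/%Y")  # shorter tick labels
    labelled.plot(kind="bar")
    plt.tight_layout()
    plt.show()

print(counts.tolist())  # [2, 2]
```

Note that `plot()` without arguments draws a line chart, so `kind="bar"` is what produces bars here.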

guipleite
  • Thanks, I have added example data to the original post. I'm not sure how to save the dates (there is a year's worth of dates in here) into variables so that I can then use them in my plot. – user3663785 Feb 22 '21 at 22:06
  • Have you tried `df["Dates"].value_counts().plot()`? – guipleite Feb 22 '21 at 22:16
  • Maybe it'd also be helpful to have a look at this other [thread](https://stackoverflow.com/questions/28022227/sorted-bar-charts-with-pandas-matplotlib-or-seaborn). – guipleite Feb 22 '21 at 22:20

Could you provide an example of your CSV dates? To count the occurrences of something in Python, the best way is to use collections.Counter:

import collections

strings = ["a", "b", "a", "c", "b", "a", "a", "b"]
count = collections.Counter(strings)

count == {"a": 4, "b": 3, "c": 1}  # True

Edit

With the date as the first value in each row, and assuming you don't need to do anything with the dates that requires parsing them:

import collections

# reader is the csv.reader from the question, with the header row already skipped
count = collections.Counter([row[0] for row in reader])

This returns a Counter (a dict subclass) with the dates as keys (as strings, for instance "23/03/2020") and their number of occurrences as values. If you want the dates as datetime.date objects, you can parse them in the list comprehension:

import collections
import datetime

# strptime() returns a datetime.datetime; .date() keeps only the date part
dates = [datetime.datetime.strptime(row[0], "%d/%m/%Y").date() for row in reader]
count = collections.Counter(dates)
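
One reason parsing may be worth the extra step is ordering: dd/mm/yyyy strings sort by day first, while datetime.date objects sort chronologically. A small sketch, using a made-up list `raw` in place of the CSV rows:

```python
import collections
import datetime

# Hypothetical sample values standing in for row[0] of each CSV row.
raw = ["23/03/2020", "02/04/2020", "23/03/2020", "20/03/2020"]

as_strings = collections.Counter(raw)
as_dates = collections.Counter(
    datetime.datetime.strptime(s, "%d/%m/%Y").date() for s in raw
)

# String keys sort character by character, so 2 April lands before 20 March.
print(sorted(as_strings))  # ['02/04/2020', '20/03/2020', '23/03/2020']

# date keys sort chronologically.
print([d.isoformat() for d in sorted(as_dates)])  # ['2020-03-20', '2020-03-23', '2020-04-02']
```

The counts themselves are the same either way; only the order in which the keys can be sorted differs, which matters once they go on the x-axis of a plot.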
Michael Marx