So basically, I'm writing out statistics to a CSV file like this:

date,students
2022-11-16,22
2022-11-17,29

I want to read this CSV back in, pull the col2 value from yesterday's row, and compare it to the col2 value from today's row, looking for a difference above some threshold, something like a 5% variance. The last part is straightforward, but I'm having a heck of a time pulling the right rows and recapturing the 'students' count for comparison.

I can do the lookup well enough with pandas, but I lose the second column in the match, and it's just not clicking for me.

import pandas as pd
from datetime import date
from datetime import timedelta

today = date.today()
yesterday = date.today() - timedelta(1)

print("today is ", today, " and yesterday was ", yesterday)

df = pd.read_csv('test.csv')
col1 = df['date']      # column names match the CSV header shown above
col2 = df['students']

for row in col1:
    if row == str(yesterday):
        print(row)

Any ideas are greatly appreciated! I'm sure this is something goofy that I'm overlooking at 1am.

n3tl0kr
    Why don't you just generate the date string in `YYYY-MM-DD` format, then do `y0 = df[df['date'] == yesterday]['students']`, and the same for today? – Tim Roberts Nov 17 '22 at 06:33

2 Answers

You may consider that pandas is somewhat "heavyweight" for something so trivial.

So, without pandas how about:

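from datetime import date, timedelta

# ISO-format strings (YYYY-MM-DD) match the dates stored in the CSV
today = date.today().isoformat()
yesterday = (date.today() - timedelta(days=1)).isoformat()

tv = None
yv = None

with open('test.csv') as data:
    next(data, None)  # skip the header row
    for line in data:
        if not line.strip():
            continue  # ignore blank lines
        d, s = line.strip().split(',')
        if d == today:
            tv = float(s)
        elif d == yesterday:
            yv = float(s)
        # tv may legitimately be 0; yv must be non-zero to divide by it
        if tv is not None and yv:
            variance = (tv - yv) / yv * 100
            print(f'Variance={variance:.2f}%')
            break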
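
The 5% threshold check mentioned in the question could be layered on top of this approach. The following is only a sketch: `percent_change` and `exceeds_threshold` are hypothetical helper names, the sample data is held in memory instead of being read from `test.csv`, and the dates are hard-coded to match the sample rows.

```python
import csv
import io


def percent_change(rows, today, yesterday):
    """Return the percent change in 'students' from yesterday to today, or None."""
    counts = {row["date"]: float(row["students"]) for row in rows}
    tv = counts.get(today)
    yv = counts.get(yesterday)
    # None if either row is missing, or if yesterday's count was 0
    if tv is None or not yv:
        return None
    return (tv - yv) / yv * 100


def exceeds_threshold(change, threshold=5.0):
    """True when the change is known and larger than the threshold in either direction."""
    return change is not None and abs(change) > threshold


# Sample data from the question, kept in memory instead of test.csv
sample = "date,students\n2022-11-16,22\n2022-11-17,29\n"
rows = list(csv.DictReader(io.StringIO(sample)))

change = percent_change(rows, "2022-11-17", "2022-11-16")
if exceeds_threshold(change):
    print(f"Variance={change:.2f}% is over the 5% threshold")
```

With a real file, the rows would come from `csv.DictReader(open('test.csv', newline=''))` and the two date strings from `date.today().isoformat()` and the day before.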
DarkKnight

You can try this:


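    import pandas as pd
    from datetime import date, timedelta

    # Convert to strings so they compare equal to the CSV's date column
    today = str(date.today())
    yesterday = str(date.today() - timedelta(1))
    
    print("today is ", today, " and yesterday was ", yesterday)
    
    df = pd.read_csv('test.csv')
    
    # .values[0] raises IndexError if there is no row for that date
    today_value = df.loc[df['date'] == today, 'students'].values[0]
    yesterday_value = df.loc[df['date'] == yesterday, 'students'].values[0]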

This is really a question of how to extract a column value based on another column in pandas: https://stackoverflow.com/a/36685531/10787867
Also make sure you compare a string to a string, not to a datetime.date object.
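
To tie the lookup and the comparison together, here is one possible sketch. `day_over_day_change` is a hypothetical helper name, the CSV content is the sample from the question held in memory, and the code assumes each date appears only once in the file. It uses `reindex` so that a missing date gives `NaN` instead of raising.

```python
import io
from datetime import date, timedelta

import pandas as pd


def day_over_day_change(df, today):
    """Percent change in 'students' from the day before `today`, or None."""
    yesterday = today - timedelta(days=1)
    # One student count per date string; assumes dates are unique in the file
    counts = df.set_index("date")["students"]
    # isoformat() gives 'YYYY-MM-DD' strings, so we compare string to string
    pair = counts.reindex([yesterday.isoformat(), today.isoformat()])
    yv, tv = pair.iloc[0], pair.iloc[1]
    if pd.isna(yv) or pd.isna(tv) or yv == 0:
        return None
    return float((tv - yv) / yv * 100)


# Sample data from the question, kept in memory instead of test.csv
sample = io.StringIO("date,students\n2022-11-16,22\n2022-11-17,29\n")
df = pd.read_csv(sample)

# date(2022, 11, 17) stands in for date.today() so the sample rows match
change = day_over_day_change(df, date(2022, 11, 17))
if change is not None and abs(change) > 5:
    print(f"Variance of {change:.2f}% exceeds the 5% threshold")
```

Returning `None` when a row is missing keeps the threshold check separate from the lookup, so a gap in the data is not mistaken for a 0% change.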

Chana Drori