I have a problem with grouping data and plotting it over time to show incremental change. The incoming data has the structure below and is added to a pandas dataframe:
"DateTime","Classification", "Confidence"
What I want to do is show the unique values of classification and count how many times they occur every 5 minutes. I then want to plot this in a graph that will update every 5 minutes showing the incremental values over time.
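To make the counting step concrete, here is a minimal sketch of what I mean, assuming the timestamps are parsed as datetimes. The sample rows, the `count_per_interval` name, and the `"5min"` bin width are made up for illustration; `pd.Grouper` puts each row in a 5-minute bin and `size()` counts rows per bin and class.

```python
import pandas as pd

# Hypothetical sample rows in the "DateTime", "Classification", "Confidence" layout.
df = pd.DataFrame(
    {
        "DateTime": pd.to_datetime(
            [
                "2021-01-01 10:00:30",
                "2021-01-01 10:02:10",
                "2021-01-01 10:04:59",
                "2021-01-01 10:06:00",
                "2021-01-01 10:07:45",
                "2021-01-01 10:11:20",
            ]
        ),
        "Classification": ["Car", "Car", "Truck", "Boat", "Car", "Boat"],
        "Confidence": [0.91, 0.85, 0.77, 0.66, 0.93, 0.58],
    }
)


def count_per_interval(frame, freq="5min"):
    """Return one row per time bin and one column per class, holding counts."""
    return (
        frame.groupby([pd.Grouper(key="DateTime", freq=freq), "Classification"])
        .size()                  # number of rows per (bin, class) pair
        .unstack(fill_value=0)   # classes become columns, missing pairs become 0
    )


counts = count_per_interval(df)
print(counts)
```

Calling `counts.cumsum()` on the result would give running totals per class, if "incremental" is read as cumulative.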
I have tried different approaches but I just can't get my head around it. The dataframe I can get is:
Index | class | count |
---|---|---|
0 | Car | 2 |
1 | Truck | 1 |
2 | Boat | 3 |
So I have the columns 'Index', 'Class' and 'Count'. I can either regenerate this dataframe every 5 minutes, or append it to a list of ('TimeStamp', 'Dataframe') pairs, where each dataframe looks like the one above.
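For the list variant, this is a rough sketch of how such snapshots might be combined into one table, assuming each snapshot has `class` and `count` columns. The snapshot values and the `snapshots_to_wide` name are hypothetical.

```python
import pandas as pd

# Hypothetical list of (timestamp, snapshot dataframe) pairs, one per 5 minutes.
snapshots = [
    (
        pd.Timestamp("2021-01-01 10:00"),
        pd.DataFrame({"class": ["Car", "Truck", "Boat"], "count": [2, 1, 3]}),
    ),
    (
        pd.Timestamp("2021-01-01 10:05"),
        pd.DataFrame({"class": ["Car", "Boat"], "count": [4, 3]}),
    ),
]


def snapshots_to_wide(snapshots):
    """Stack the snapshots and pivot to one row per timestamp, one column per class."""
    # Tag every snapshot row with its timestamp, then stack them all.
    long = pd.concat(
        [snap.assign(time=ts) for ts, snap in snapshots], ignore_index=True
    )
    # Classes missing from a snapshot get a count of 0.
    return long.pivot_table(
        index="time", columns="class", values="count", aggfunc="sum", fill_value=0
    )


wide = snapshots_to_wide(snapshots)
print(wide)
```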
The output in a chart, that I would like to have, is one line per class in different colors, showing how many they are in the dataframe every 5 minutes.
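As an illustration of the chart I have in mind, here is a minimal sketch that draws one line per class from a table with one row per 5-minute bin and one column per class. The sample counts and the `plot_running_totals` name are invented, and plotting cumulative totals is only my assumption about what "incremental" should mean here.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-bin counts: one row per 5-minute bin, one column per class.
counts = pd.DataFrame(
    {"Car": [2, 1, 0], "Truck": [1, 0, 0], "Boat": [0, 1, 1]},
    index=pd.to_datetime(
        ["2021-01-01 10:00", "2021-01-01 10:05", "2021-01-01 10:10"]
    ),
)


def plot_running_totals(counts):
    """Draw one line per class showing the cumulative count over time."""
    totals = counts.cumsum()  # running total per class
    fig, ax = plt.subplots()
    for name in totals.columns:
        # matplotlib assigns a different default color to each line
        ax.plot(totals.index, totals[name], marker="o", label=name)
    ax.set_xlabel("Time (5-minute bins)")
    ax.set_ylabel("Cumulative count")
    ax.legend(title="Class")
    fig.autofmt_xdate()  # tilt the time labels so they do not overlap
    return fig, ax


fig, ax = plot_running_totals(counts)
# plt.show()  # uncomment to display the window when running interactively
```

To refresh the chart every 5 minutes, the function could be called again once a new row has been added to `counts`.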
How can I do this with pandas and matplotlib in python? I attach my junk code below just to show what I have been using as starting point...
Any support is much appreciated.
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime

def CreateStats():
    print("Reading from file")
    fo = open("/home/User/Temp/test_data.txt", "r")
    df = pd.DataFrame(columns=['time', 'class', 'conf'])
    ndf = pd.DataFrame(columns=['class', 'class count'])
    pos = 0
    nPos = 0
    for t in range(1):
        # Rewind so the file is read from the start on every pass
        fo.seek(0, 0)
        for line in fo:
            #print(str(datetime.now())+" - " + line)
            #time.sleep(1)
            # Each line is split on ";" (class first, then confidence).
            # Note: right() is not defined in this snippet.
            splitted = line.split(";")
            df.loc[pos] = [datetime.now().strftime("%Y-%m-%d %H:%M:%S"), splitted[0], right(splitted[1], 1)]
            pos = pos + 1
        #time.sleep(1)
        df['time'] = pd.to_datetime(df['time'])
        # Count how many rows there are per class
        ndf = df.groupby('class').agg({'class': ['count']}).reset_index()
        #ndf = df.groupby('class').count().reset_index()
        #ndf = df.groupby('class').agg('count').reset_index()
        #print(df.head())
        #newDf = [datetime.now(),ndf]
        print(ndf)
        #ndf.plot.scatter(x='class', y='time count')
        #plt.show()
    fo.close()