I have some data about jobs; the important parts are the start time and the end time of each job. I would like to plot the number of simultaneously running jobs, with time on the x-axis and the count of jobs running at that point in time on the y-axis.
Since I'm new to R, I started with some preprocessing steps: merging the date and time columns, converting them to POSIXlt, calculating durations with difftime(), and so on. Now I'm a bit stuck. I don't need complete code, but I would appreciate any hints on how to approach this.
Specifically, I don't know how to treat a job as an interval spanning its whole processing time, rather than as a single point at its start time.
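To make the "interval instead of point" idea concrete, here is a minimal sketch on made-up toy values (not my real data). The helper name running_at is hypothetical, and I am assuming the times are converted to POSIXct so they compare cleanly. A job counts as running at time t if it has started and has not yet ended:

```r
# Toy data: column names mirror my data frame, the values are invented.
jobs <- data.frame(
  Process_start = as.POSIXct(c("2009-12-23 03:44:38",
                               "2009-12-23 03:44:40",
                               "2009-12-23 03:44:45"), tz = "UTC"),
  Process_end   = as.POSIXct(c("2009-12-23 03:44:42",
                               "2009-12-23 03:44:50",
                               "2009-12-23 03:44:46"), tz = "UTC")
)

# Hypothetical helper: counts jobs whose interval [start, end) contains t.
running_at <- function(t, start, end) sum(start <= t & t < end)

t0 <- as.POSIXct("2009-12-23 03:44:41", tz = "UTC")
n_running <- running_at(t0, jobs$Process_start, jobs$Process_end)
n_running  # the first two toy jobs overlap t0, the third has not started
```

This answers the question for one chosen time only; evaluating it on a grid of times would work but scales with (number of jobs) x (number of grid points).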
This is my data frame:
'data.frame': 10000 obs. of 7 variables:
$ Process_name : Factor
$ Process_start : POSIXlt, format: "2009-12-23 03:44:38"
$ Process_end : POSIXlt, format: "2009-12-23 03:44:42"
$ Process_duration(s) : Class 'difftime' atomic [1:10000] 4 75 1 2 1
$ ProcessIncludedInJob : Factor
I want to know how many jobs are running simultaneously at a given point in time. A job is a process that runs for some time; while it runs, another job may start and run at the same time. I want to calculate and plot this overlap for further analysis. My first approach was to put the date on the x-axis and use either the start date or the end date for the y-axis. But since every job is an interval and not just a point in time (start or end), that plot does not show how many jobs are running simultaneously. So I guess I must somehow combine the job start column with the job duration column (or, equivalently, the job end column).
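One way the whole curve could be obtained, as far as I can tell, is to turn every job into two events (+1 at its start, -1 at its end), sort the events by time, and take a running sum. The sketch below uses invented toy values, and the names events, delta and running are my own placeholders. I am also assuming POSIXct rather than POSIXlt, since POSIXct sorts and combines more easily (as.POSIXct() should convert the existing columns):

```r
# Toy start/end times (made up); with real data these would be
# as.POSIXct(df$Process_start) and as.POSIXct(df$Process_end).
start <- as.POSIXct(c("2009-12-23 03:44:38",
                      "2009-12-23 03:44:40",
                      "2009-12-23 03:44:45"), tz = "UTC")
end   <- as.POSIXct(c("2009-12-23 03:44:42",
                      "2009-12-23 03:44:50",
                      "2009-12-23 03:44:46"), tz = "UTC")

# One row per event: +1 when a job starts, -1 when it ends.
events <- data.frame(
  time  = c(start, end),
  delta = c(rep(1, length(start)), rep(-1, length(end)))
)

# Sort by time; on ties, ends (-1) come before starts (+1), so a job that
# ends exactly when another starts is not counted as overlapping.
events <- events[order(events$time, events$delta), ]

# The running sum is the number of jobs active from each event onward.
events$running <- cumsum(events$delta)

# Step plot: time on the x-axis, number of running jobs on the y-axis.
plot(events$time, events$running, type = "s",
     xlab = "Time", ylab = "Jobs running")
```

The tie-breaking rule is a design choice; if a job that ends at the same second another starts should count as overlapping, sorting with starts first would be the alternative.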