
I have a huge data frame with server name, Date, CPU, and memory columns. It contains multiple server names. I would like to select a given server name, order its rows by the Date column, and create time series graphs.

This is a small subset of the data frame:

     Hostname                Date     5 60 61 CPUAVG CPUAVG+Sev CPUMaximum MemoryAVG
1 server1 2012-01-29 01:00:00 23.79 NA NA   2.33       0.72       2.33     23.76
2 server1 2012-01-29 02:00:00 23.91 NA NA   2.86       2.38       2.86     23.82
3 server1 2012-01-29 03:00:00 25.65 NA NA   6.25       9.59       6.25     24.85
4 server2 2012-01-29 04:00:00 26.30 NA NA  18.41      31.09      18.41     25.87
5 server3 2012-01-29 05:00:00 24.33 NA NA   1.92       0.42       1.92     24.24
6 server3 2012-01-29 06:00:00 24.40 NA NA   2.65       1.79       2.65     24.31
george willy

2 Answers

Check out the 'subset' command.

thisServer <- subset(servers, Hostname == "server1")

Then, to order the rows by date:

thisServerSorted <- thisServer[order(thisServer$Date),]

Then you can plot from there.
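
As a minimal sketch of that last step, assuming base R graphics and a toy `servers` data frame shaped like the sample in the question (the values here are illustrative, and `Date` is assumed to already be a date-time column):

```r
# Toy data shaped like the question's sample; rows deliberately out of order
servers <- data.frame(
  Hostname = c("server1", "server1", "server1", "server2"),
  Date = as.POSIXct(c("2012-01-29 03:00:00", "2012-01-29 01:00:00",
                      "2012-01-29 02:00:00", "2012-01-29 04:00:00"),
                    tz = "UTC"),
  CPUAVG = c(6.25, 2.33, 2.86, 18.41),
  stringsAsFactors = FALSE
)

# Keep one server, then sort its rows by date
thisServer <- subset(servers, Hostname == "server1")
thisServerSorted <- thisServer[order(thisServer$Date), ]

# Draw CPUAVG over time as a line
plot(thisServerSorted$Date, thisServerSorted$CPUAVG, type = "l",
     xlab = "Date", ylab = "CPUAVG", main = "server1")
```

Sorting matters here because a base R line plot connects points in row order.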

Jeff Allen
  • you can also subset directly: `servers[servers$Hostname=='server1',]` – Justin Feb 02 '12 at 15:58
  • Thank you so much. How would I do this if there was a large data set and I needed to automatically retrieve the distinct server names, order by date, and graph them all in one chart? – george willy Feb 02 '12 at 15:59
  • "Large" means different things to different people. It would really help if you just said how many GB it was, or how many rows and columns, or something like that. – Matt Dowle Feb 02 '12 at 16:12
  • Quick answer would be to use the split() command. More robust answer would be to use something like ddply (see http://stackoverflow.com/questions/1395191/how-to-split-a-data-frame-by-rows-and-then-process-the-blocks ) – Jeff Allen Feb 02 '12 at 16:18
  • Large meaning there are 100 different servers. I have to read the file and create graphs where each server name gets a different line in the chart. I need to be able to choose the server names dynamically, not knowing ahead of time what they are. The below example assumes that I know the server names. – george willy Feb 02 '12 at 18:29
  • To get the list of server names from the data frame, use `unique(servers$Hostname)`. – dynamo Aug 29 '13 at 11:21
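
The `split()` and `unique()` suggestions in the comments above could be combined roughly as follows. This is only a sketch in base R, assuming a toy `servers` data frame built from the question's sample rows; the names `byHost` and `cols` are introduced here for illustration.

```r
# Toy data taken from the sample rows in the question
servers <- data.frame(
  Hostname = c("server1", "server1", "server1",
               "server2", "server3", "server3"),
  Date = as.POSIXct(c("2012-01-29 01:00:00", "2012-01-29 02:00:00",
                      "2012-01-29 03:00:00", "2012-01-29 04:00:00",
                      "2012-01-29 05:00:00", "2012-01-29 06:00:00"),
                    tz = "UTC"),
  CPUAVG = c(2.33, 2.86, 6.25, 18.41, 1.92, 2.65),
  stringsAsFactors = FALSE
)

# Sort once by date, then split into one data frame per server.
# No server name is hard-coded; split() discovers them from the data.
servers <- servers[order(servers$Date), ]
byHost <- split(servers, servers$Hostname)

# Empty plot spanning the full range, then one line per server
plot(range(servers$Date), range(servers$CPUAVG, na.rm = TRUE),
     type = "n", xlab = "Date", ylab = "CPUAVG")
cols <- seq_along(byHost)
for (i in seq_along(byHost)) {
  lines(byHost[[i]]$Date, byHost[[i]]$CPUAVG, col = cols[i])
}
legend("topleft", legend = names(byHost), col = cols, lty = 1)
```

With around 100 servers a single legend gets crowded, so one panel per server may be easier to read than one chart with 100 lines.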
#convert Date to a date field (if needed)
library(lubridate)
servers$Date <- ymd_hms(servers$Date)
#select the servers you need
SelectedServers <- subset(servers, Hostname %in% c("server1", "server3"))
library(ggplot2)
#no need for sorting with ggplot2
ggplot(SelectedServers, aes(x = Date, y = CPUAVG, colour = Hostname)) + geom_line()
ggplot(SelectedServers, aes(x = Date, y = CPUAVG)) + geom_line() + facet_wrap(~Hostname)
Thierry