
I used the aggregate function to get the range by factor level. I am trying to rename the columns, but the output from the aggregate function does not have the min and max as separate columns.

# example data
size_cor <- data.frame(SpCode = rep(c(200, 400, 401), 3),
                       Length = c(45, 23, 56, 89, 52, 85, 56, 45, 78))

# aggregate function
spcode_range <- with(size_cor, aggregate(Length, list(SpCode), FUN = range))

Output:

spcode_range 

  Group.1 x.1 x.2
1     200  45  89
2     400  23  52
3     401  56  85

Data structure:

str(spcode_range)

'data.frame':   3 obs. of  2 variables:
 $ Group.1: num  200 400 401
 $ x      : num [1:3, 1:2] 45 23 56 89 52 85

dim(spcode_range)
[1] 3 2

The printed output has three columns: Group.1, x.1 (min) and x.2 (max), but the data frame itself has only 2 columns. I have tried setNames, rename and names with no success, because I am supplying three names when the data frame has only 2 columns.

Roman
user41509
  • You can use: `names(spcode_range) <- c("group","min","max")` – Jaap Jul 23 '15 at 14:40
  • possible duplicate of [Changing column names of a data frame in R](http://stackoverflow.com/questions/6081439/changing-column-names-of-a-data-frame-in-r) – Jaap Jul 23 '15 at 14:50
  • I can't use names. Here is the error message R gives when names is called: Error in names(spcode_range) <- c("group", "min", "max") : 'names' attribute [3] must be the same length as the vector [2] – user41509 Jul 23 '15 at 14:53
  • Sorry I hit enter too fast. The dataframe that comes out of the aggregate function has only 2 columns. – user41509 Jul 23 '15 at 14:54

1 Answer


Basically, what happened here is that you called the range function by group, and it returns two values (the min and the max) for each group. The aggregate function returned a data.frame (which it always does unless the data set is of class ts) that stores those values as a matrix in a single column, which is why the data.frame has only two columns.

When you print it, the print.data.frame method is triggered, which in turn calls format.data.frame. That converts each column of the matrix column into a separate column (see str(format.data.frame(spcode_range))), so the printed result does not reflect the actual structure of the data.frame you are printing (don't ask me why, probably for convenience, as it is not clear how to print a matrix within a data.frame).
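To see the matrix column for yourself, something like the following sketch should work. It only rebuilds the example data from the question and inspects the result; nothing here is new API.

```r
# Rebuild the example from the question
size_cor <- data.frame(SpCode = rep(c(200, 400, 401), 3),
                       Length = c(45, 23, 56, 89, 52, 85, 56, 45, 78))
spcode_range <- with(size_cor, aggregate(Length, list(SpCode), FUN = range))

names(spcode_range)          # only two names, one per real column
is.matrix(spcode_range$x)    # the second column is itself a matrix
dim(spcode_range$x)          # one row per group, one column per value of range()

# The min and max can be pulled out of the matrix column directly
spcode_range$x[, 1]          # minima per group
spcode_range$x[, 2]          # maxima per group

# The formatted copy that print() works from
str(format(spcode_range))
```

This is also why names(spcode_range) <- c("group", "min", "max") fails: there are only two elements to name.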

So basically, one way to fix this is to combine do.call and cbind.data.frame, e.g.

res <- do.call(cbind.data.frame, aggregate(Length ~ SpCode, size_cor, range))
str(res)
# 'data.frame': 3 obs. of  3 variables:
# $ SpCode  : num  200 400 401
# $ Length.1: num  45 23 56
# $ Length.2: num  89 52 85
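Once the matrix column has been flattened like this, renaming works as usual. A minimal sketch, assuming you want the names SpCode, min and max (the new names are just an illustration); the second variant, which returns a named vector from the aggregating function, is an alternative that is not part of the approach above:

```r
size_cor <- data.frame(SpCode = rep(c(200, 400, 401), 3),
                       Length = c(45, 23, 56, 89, 52, 85, 56, 45, 78))

# Flatten the matrix column, then rename the three real columns
res <- do.call(cbind.data.frame, aggregate(Length ~ SpCode, size_cor, range))
names(res) <- c("SpCode", "min", "max")
res

# Alternative: return a named vector so the flattened columns
# carry the element names instead of .1 / .2 suffixes
res2 <- do.call(cbind.data.frame,
                aggregate(Length ~ SpCode, size_cor,
                          function(x) c(min = min(x), max = max(x))))
names(res2)
```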

Or just use other packages such as dplyr or data.table, which were designed to (among other things) replace or improve data manipulation operations in R.
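For instance, a rough dplyr version could look like the sketch below. It assumes dplyr is installed, and the column names min_len and max_len are made up for illustration. Each summary becomes its own column, so there is no matrix column to deal with.

```r
library(dplyr)  # assumes the dplyr package is installed

size_cor <- data.frame(SpCode = rep(c(200, 400, 401), 3),
                       Length = c(45, 23, 56, 89, 52, 85, 56, 45, 78))

# One row per SpCode, with min and max as separate, already-named columns
res_dplyr <- summarise(group_by(size_cor, SpCode),
                       min_len = min(Length),
                       max_len = max(Length))
res_dplyr
```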

David Arenburg