I am trying to extract data from a data frame for analysis.

heightweight <- function(person, health) {
    ## Read in data
    data <- read.csv("heightweight.csv", header = TRUE,
                     colClasses = "character")
    ## Check that the outcomes are valid
    measure = c("height", "weight")
    if(health %in% measure == FALSE){
        stop("Valid inputs are height and weight")
    }
    ## Truncate the data matrix to only what columns are needed
    data <- data[c(1, 5, 7)]
    ## Rename columns
    names(data)[1] <- "Name"
    names(data)[2] <- "Height"
    names(data)[3] <- "Weight"
    ## Convert numeric columns to numeric
    data[, 2] <- as.numeric(data[, 3])
    data[, 3] <- as.numeric(data[, 4])
    ## Convert NAs to 0 after coercion
    data[is.na(data)] <- 0
    ## Check that the name is valid
    name <- data[, 1]
    name <- unique(name)
    if(person %in% name == FALSE){
        stop("Invalid person")
    }
    ## Return person with lowest height or weight
    list <- data[data$name == person & data[health],]
    outcomes <- list[, health]
    minumum <- which.min(outcomes)
    ## Min Rate
    minimum[rowNum, ]$name
}

The problem occurs at this line:

list <- data[data$name == person & data[health],]

That is, when I run heightweight("Bob", "weight"), I get the following message:

Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr,  : 
  length of 'dimnames' [2] not equal to array extent

I have Googled this message and checked out some threads here but can't determine what the problem is.

Arun
dustin
  • Did you mean `list <- data[data$name == person & data[health]==health,]`? – Marat Talipov Jan 12 '15 at 20:21
  • @MaratTalipov doing that produces: `Error in [.data.frame(data, data$name == person & data[health] == : dims [product 4706] do not match the length of object [0]` – dustin Jan 12 '15 at 20:23
  • what is the expected output from `list <- data[data$name == person & data[health],]`? – Marat Talipov Jan 12 '15 at 20:31
  • @MaratTalipov if I call heightweight("Bob", "weight"), then the list should be Bob and all his weights. – dustin Jan 12 '15 at 20:32

2 Answers

Unless I'm missing something, if you only need the lowest weight or height for a given name, the last three lines of code are a bit redundant.

Here's a simple way to get the minimum health measurement for a given person:

min(data[data$name==person, "height"])

The first part, before the comma, acts as a row index: it selects only the rows of data that correspond to that person. The second part, after the comma, selects only the desired variable (column). Once you have selected the desired data, you take the minimum of that subset.

An example to illustrate the result:

data<-data.frame(name=as.character(c(rep("carlos",2),rep("marta",3),rep("johny",2),"sara")))
set.seed(1)
data$height <- rnorm(8,68,3)
data$weight <- rnorm(8,160,10)

The corresponding data frame:

   name   height   weight
1 carlos 66.12064 165.7578
2 carlos 68.55093 156.9461
3  marta 65.49311 175.1178
4  marta 72.78584 163.8984
5  marta 68.98852 153.7876
6  johny 65.53859 137.8530
7  johny 69.46229 171.2493
8   sara 70.21497 159.5507

Let's say we want the minimum weight for marta:

person <- "marta"
health <- "weight"

The minimum "weight" for "marta" is,

min(data[data$name==person,health])

which gives the desired result:

[1] 153.7876
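
If the same expression comes back empty inside the question's function, one possible cause (an assumption based on the renaming step shown in the question, not something confirmed there) is that the columns are called `Name`, `Height` and `Weight`. Column names in R are case-sensitive, so `data$name` is `NULL` and the row index has length zero. A minimal sketch with a hypothetical toy data frame `df`:

```r
# Hypothetical toy data using the capitalised column names from the question
df <- data.frame(Name = c("Bob", "Bob", "Ann"),
                 Weight = c(180, 175, 140),
                 stringsAsFactors = FALSE)

# The lower-case name does not match the column "Name"
is.null(df$name)

# Comparing NULL with a string yields a zero-length logical vector,
# so nothing is selected when it is used as a row index
length(df$name == "Bob")

# Matching the case of both the column and the health argument
health <- "Weight"
result <- min(df[df$Name == "Bob", health])
result
```

Checking `names(data)` just before the subsetting line is a quick way to confirm which spelling the function actually sees.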
AleMorales
  • Can you expand a bit? You need to clarify exactly what you are looking for. Do you want the minimum of a measure or the whole list of measurements? It is not completely clear from the question above. – AleMorales Jan 12 '15 at 21:23
  • Your data frame is similar to mine, but your code, `min(data[data$name==person,health])`, doesn't work in my program. It returns: `In min(data[data$name == person, health]) : no non-missing arguments to min; returning Inf`. Also, I have introduced zeros to get rid of NAs, so if it did work it would return zero, since everyone has at least one zero. – dustin Jan 12 '15 at 21:27
  • I have also commented out `data[is.na(data)] <- 0` and added `na.rm=TRUE` to `min` but the error is the same. – dustin Jan 12 '15 at 21:34
  • Can you edit your question, to put a very simple example of what's expected from your function? Like: heightweight("Bob","weight") and the expected answer. Could you do a copy-paste of the head of the data.frame before you obtain the list inside the function? To make sure it is what you are expecting. – AleMorales Jan 12 '15 at 21:36
  • that error message means that the argument to `min` is empty. See [link](http://stackoverflow.com/questions/24282550/no-non-missing-arguments-warning-when-using-min-or-max-in-reshape2) – AleMorales Jan 12 '15 at 21:45
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/68694/discussion-between-alemorales-and-dustin). – AleMorales Jan 13 '15 at 02:58

Here is the simplified analogue of your function:

heightweight <- function(person,health) {
  data.set <- data.frame(names=rep(letters[1:5],each=3),height=171:185,weight=seq(95,81,by=-1))
  d1 <- data.set[data.set$names == person,]
  d2 <- d1[d1[,health]==min(d1[,health]),]
  d2[,c('names',health)]    
}

The first line produces a sample data set. The second line selects all records for the given person. The third line keeps the record(s) whose health value equals that person's minimum, and the last line returns the names column together with the requested measure.

heightweight('b','height')
#   names height
# 4     b    174
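
A variant of the same idea can be sketched with `which.min`, which the question was trying to use. The function name `min_record` and the choice to pass the data frame in as an argument are illustrative assumptions, not part of the original code. Note that `which.min` returns only the first minimum, whereas the `==` comparison above keeps ties.

```r
# Hypothetical helper: return the row with the lowest value of `health`
# for `person`, assuming a data frame with a `names` column
min_record <- function(data, person, health) {
  # Keep only the rows for the requested person
  rows <- data[data$names == person, ]
  # which.min gives the position of the first minimum within those rows
  rows[which.min(rows[[health]]), c("names", health)]
}

# Same sample data as in the function above
sample_data <- data.frame(names = rep(letters[1:5], each = 3),
                          height = 171:185,
                          weight = seq(95, 81, by = -1))

res <- min_record(sample_data, "b", "weight")
res
```

If the person is not present, `rows` has no rows and the function returns an empty data frame rather than an error, so a validity check like the one in the question is still useful.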
Marat Talipov