I have a set of athlete records from openpowerlifting.org and I want to retrieve all of the athletes from a certain division. The entries are of the form "Meet ID Name Sex Equipment Age Division ..." and I wish to extract all those who participated in a certain division. Here is my code:

powerlift <- read.csv("openpowerlifting.csv",header = TRUE,fill = TRUE,stringsAsFactors = FALSE )

n = length(powerlift$TotalKg)

UPA_Open = as.data.frame(matrix(c(rep(0,n*17)),ncol=17))
j=1

for(i in 1:n){
    if(powerlift$Divison[i]=="UPA Open"){
        UPA_Open[j,] = powerlift[i,]
        j = j + 1
    }
 }

I encounter the following problem:

Error in if (powerlift$Divison[i] == "UPA Open") { : 
  argument is of length zero

Investigating the data set after execution gives:

> i
[1] 1
> powerlift$Division[i]
[1] "Mst 45-49"
> powerlift$Division[i] == "Mst 45-49"
[1] TRUE

So it stopped while attempting the first iteration, claiming that the data was null, which it was not. What is going on?

HereBeeBees
  • Can you include some sample data? See here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Dodge May 11 '18 at 13:59

1 Answer

To avoid the XY problem, and considering that you "want to retrieve all of the athletes from a certain division", here is an alternative approach to your problem:

# Simulating your data
Division <- c("UPA Open", "DEF", "GHI", "UPA Open", "UPA Open")
someColumn <- c("athlete1", "athlete2", "athlete3", "athlete4" , "athlete5")
otherColumn <- c(11, 22, 33, 44, 55)
powerlift <- data.frame(someColumn, otherColumn, Division)
print(powerlift)

# The actual solution
UPA_Open <- powerlift[powerlift$Division == "UPA Open", ]
print(UPA_Open)
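
One caveat, offered as a sketch rather than something the question confirms: if the Division column can contain NA, plain logical indexing returns an extra all-NA row for each NA. The data frame `lifters_with_na` below is made up for illustration and is not part of the real data set; `which()` and `subset()` are two base R ways to drop those rows.

```r
# Hypothetical data with a missing Division (not from the real file)
lifters_with_na <- data.frame(
  someColumn = c("athlete1", "athlete2", "athlete3", "athlete4"),
  Division = c("UPA Open", NA, "GHI", "UPA Open"),
  stringsAsFactors = FALSE
)

# Logical indexing: the NA comparison yields NA, which becomes an all-NA row
by_logical <- lifters_with_na[lifters_with_na$Division == "UPA Open", ]
print(nrow(by_logical)) # 3

# which() keeps only the positions that are TRUE, so the NA is dropped
by_which <- lifters_with_na[which(lifters_with_na$Division == "UPA Open"), ]
print(nrow(by_which)) # 2

# subset() also treats NA in the condition as FALSE
by_subset <- subset(lifters_with_na, Division == "UPA Open")
print(nrow(by_subset)) # 2
```

If the column has no NA values, all three forms return the same rows.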

Explanation:

# Explanation line by line
pos <- powerlift$Division == "UPA Open" # pos is a logical vector: TRUE for each row whose Division equals "UPA Open", FALSE otherwise
print(pos) # check the contents of pos
UPA_Open <- powerlift[pos, ] # keep only the rows where pos is TRUE. The general form is powerlift[<<rows>>, <<columns>>]; leaving the column slot empty keeps every column
print(UPA_Open) # print the result
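
As for why the original loop fails: judging from the code as posted, the `if` condition reads `powerlift$Divison` (missing an "i"), while the console check uses `powerlift$Division`. The sketch below reproduces the error on a small stand-in data frame; `powerlift_demo` and its values are made up for illustration.

```r
# Small stand-in for the real data; the column is spelled "Division"
powerlift_demo <- data.frame(
  Name = c("athlete1", "athlete2"),
  Division = c("Mst 45-49", "UPA Open"),
  stringsAsFactors = FALSE
)

# "Divison" matches no column, so $ silently returns NULL instead of failing
misspelled <- powerlift_demo$Divison
print(is.null(misspelled)) # TRUE

# Indexing NULL gives NULL, and comparing NULL with a string gives logical(0)
cmp <- misspelled[1] == "UPA Open"
print(length(cmp)) # 0

# if() needs exactly one TRUE or FALSE, so a zero-length condition is an error
msg <- tryCatch(
  {
    if (cmp) "match" else "no match"
  },
  error = function(e) conditionMessage(e)
)
print(msg) # "argument is of length zero"

# A cheap guard: check that the name exists before using it
print("Divison" %in% names(powerlift_demo)) # FALSE
print("Division" %in% names(powerlift_demo)) # TRUE
```

So the data was not null; the misspelled column name was. Correcting the spelling in the `if` condition removes this particular error, although the vectorized version above is still simpler.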

Hope it helps! :)

tk3
  • Yeah, that works! Thanks, much simpler. However, I am still really curious as to why the above happens, it seems really strange. – HereBeeBees May 11 '18 at 14:30
  • @HereBeeBees I will edit my answer explaining the code. Thanks for the feedback. – tk3 May 11 '18 at 14:32
  • @HereBeeBees Also, this tutorial may help you understand more about accessing R data: http://www.r-tutor.com/r-introduction/data-frame/data-frame-row-slice – tk3 May 11 '18 at 14:57