I'm trying to manipulate some data frames using R, and I keep hitting a roadblock. I began with a data frame femaleDF with three columns: Product Name, Person.1, Person.2.

The data frame had values in rows like:

'E0001', 'John Smith-M', 'Jane Smith-F'

What I want is to separate the genders into two different data frames. I was able to strip out males from the female list using grepl searches for '-M', so now I have something like:

'E0001', NA, 'Jane Smith-F'

But now I want to shift the female values from Person.2 into Person.1 whenever Person.1 is NA. Here's what I have so far:

female.strip <- function(femaleDF) {
  increment <- 1
  repeat {
    if (is.na(femaleDF$Person.1[increment]) == TRUE) {
      femaleDF$Person.1[increment] <- femaleDF$Person.2[increment]
      femaleDF$Person.2[increment] <- NA
      increment <- increment + 1
    } else {
      increment <- increment + 1
    }
    if (increment > nrow(femaleDF)) {
      break
    }
  }
}
femaleDF <- female.strip(femaleDF)

I think I'm going wrong somewhere in the line where I try to reassign values: femaleDF$Person.1[increment] <- femaleDF$Person.2[increment]

I get warnings that look like:
1: In `[<-.factor`(`*tmp*`, increment, value = structure(c(282L, ... : invalid factor level, NA generated

I'm sure if I knew more about R basics this would be a breeze, but can anyone help me out? Thanks!

edit: not the same as this question. I didn't create the data frame, I imported it with read.csv.

nathan.hunt
  • Possible duplicate of [invalid factor level, NA generated](http://stackoverflow.com/questions/16819956/invalid-factor-level-na-generated) – mt1022 Nov 18 '16 at 18:41
  • You can use the `stringsAsFactors=F` argument, as the linked answer suggests, in your `read.csv` call. – paqmo Nov 18 '16 at 19:00
  • I tried using `read.csv` with `stringsAsFactors=FALSE`, and it didn't fix it. – nathan.hunt Nov 18 '16 at 19:36
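For reference, the warning quoted in the question can be reproduced in isolation. This is only a minimal sketch with made-up values (the vectors `p1` and `p2` are hypothetical stand-ins for the two columns), assuming the columns were read in as factors, which was the `read.csv` default before R 4.0.0. Assigning a value that is not already one of the target factor's levels triggers the warning and stores `NA`; converting to character first avoids it.

```r
# Hypothetical stand-ins for Person.1 and Person.2, stored as factors
p1 <- factor(c(NA, "Mary Jones-F"))
p2 <- factor(c("Jane Smith-F", NA))

# "Jane Smith-F" is not a level of p1, so the assignment warns.
# tryCatch captures the warning text instead of printing it.
msg <- tryCatch({
  p1[1] <- p2[1]
  "no warning"
}, warning = function(w) conditionMessage(w))
print(msg)

# Converting both vectors to character makes the same assignment safe
c1 <- as.character(p1)
c2 <- as.character(p2)
c1[1] <- c2[1]
print(c1)
```

Note that this only explains the warning. It is separate from whether the function hands back the modified data frame.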

1 Answer

Well, I've solved it, but I'm totally lost as to how and why I solved it. The following code works:

female.strip <- function(femaleDF) {
  increment <- 1
  repeat {
    # Shift Person.2 into Person.1 when Person.1 is missing
    if (is.na(femaleDF$Person.1[increment])) {
      femaleDF$Person.1[increment] <- femaleDF$Person.2[increment]
      femaleDF$Person.2[increment] <- NA
    }
    increment <- increment + 1
    if (increment > nrow(femaleDF)) {
      break
    }
  }
  # Return the modified copy; without this the function returns NULL
  return(femaleDF)
}

I feel dumb that I forgot to return my data frame.
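As far as I can tell, the return matters because the assignments inside the function only change a local copy of the data frame, and a `repeat` loop that ends with `break` evaluates to `NULL`. Without the return, `femaleDF <- female.strip(femaleDF)` overwrites the data frame with `NULL`. The same shift can probably also be done without a loop. Here is a rough sketch, assuming the columns are character rather than factor; the sample rows and the name `shift_left` are made up for illustration.

```r
# Hypothetical sample data mirroring the question's columns
femaleDF <- data.frame(
  Product.Name = c("E0001", "E0002", "E0003"),
  Person.1 = c(NA, "Mary Jones-F", NA),
  Person.2 = c("Jane Smith-F", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: move Person.2 into Person.1 wherever Person.1 is NA
shift_left <- function(df) {
  empty <- is.na(df$Person.1)               # rows that need shifting
  df$Person.1[empty] <- df$Person.2[empty]  # copy the values across
  df$Person.2[empty] <- NA                  # clear the old position
  df                                        # last expression is the return value
}

shifted <- shift_left(femaleDF)
print(shifted)
```

Logical indexing handles every row at once, so there is no counter to maintain and no `break` condition to get wrong.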

nathan.hunt