
I have a CSV file containing trainee names and another CSV file containing a list of publications, including a variable with the author's name. I'd like R to add a dummy variable to the publications data frame indicating whether the author name in the publication matches any of the trainee names ("peeps") in the trainee file. The following code isn't working for me, and I can't figure out why. The error I receive is "object 'i' not found". Am I going about this all wrong? Thanks!

publications <- read.csv("publications.csv", header = TRUE, stringsAsFactors = FALSE)
trainees <- read.csv("TraineeRoster.csv", header = TRUE, stringsAsFactors = FALSE)

peeps <- trainee$LastName

publications["TraineePub"]
for (i in 1:nrow(publications)) {
    if (publications$AuthorLast[i] == peeps) {
        publications$TraineePub[i]
    } else {
        publications$TraineePub[i]
    }
}

2 Answers


You may try this. Since your example is not reproducible, I made up some data.

set.seed(123)
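# stringsAsFactors = FALSE keeps TraineePub as character, so "yes" can be assigned later
publications <- data.frame(AuthorLast = sample(letters[1:10]), TraineePub = "no", stringsAsFactors = FALSE)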
peeps <- letters[1:5]

publications$TraineePub[publications$AuthorLast %in% peeps] <- "yes"
publications

#    AuthorLast TraineePub
# 1           c        yes
# 2           h         no
# 3           d        yes
# 4           g         no
# 5           f         no
# 6           a        yes
# 7           j         no
# 8           i         no
# 9           b        yes
# 10          e        yes
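
If a numeric 0/1 dummy is preferred over "yes"/"no" labels, the same `%in%` test can be converted directly. This is only a minimal sketch; the data below is made up for illustration and does not come from the question's files.

```r
# Made-up data for illustration only
publications <- data.frame(
  AuthorLast = c("c", "h", "d", "g"),
  stringsAsFactors = FALSE
)
peeps <- c("a", "b", "c", "d", "e")

# %in% returns one TRUE/FALSE per row; as.integer() turns that into 1/0
publications$TraineePub <- as.integer(publications$AuthorLast %in% peeps)
publications$TraineePub
# [1] 1 0 1 0
```
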
Henrik

You should probably look through some R tutorials, as your code doesn't do anything apart from reading the original tables. The code should look like this.

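publications <- read.csv("publications.csv", header = TRUE, stringsAsFactors = FALSE)
trainees <- read.csv("TraineeRoster.csv", header = TRUE, stringsAsFactors = FALSE)
# Note: the data frame read above is trainees, not trainee
peeps <- trainees$LastName

# 1 if both the last and the first name appear in the trainee roster, 0 otherwise
publications$IsTrainee <- 1 * (publications$AuthorLast %in% peeps & publications$AuthorFirst %in% trainees$FirstName)

# Save the result, including the new column
write.csv(publications, file = "PublicationsTrainee.csv")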
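
Note that the `IsTrainee` line tests first and last names separately, so an author could be flagged when the last name matches one trainee and the first name matches a different one. One possible way around this is to compare combined name keys. The sketch below assumes the column names used in this answer (`AuthorFirst`, `AuthorLast`, `FirstName`, `LastName`); the sample names are hypothetical.

```r
# Hypothetical data; column names follow the answer above
trainees <- data.frame(
  FirstName = c("Ann", "Bob"),
  LastName  = c("Lee", "Ray"),
  stringsAsFactors = FALSE
)
publications <- data.frame(
  AuthorFirst = c("Ann", "Ann", "Bob"),
  AuthorLast  = c("Lee", "Ray", "Ray"),
  stringsAsFactors = FALSE
)

# Build one key per person so first and last name must match together
trainee_key <- paste(trainees$LastName, trainees$FirstName, sep = "|")
author_key  <- paste(publications$AuthorLast, publications$AuthorFirst, sep = "|")

publications$IsTrainee <- as.integer(author_key %in% trainee_key)
publications$IsTrainee
# [1] 1 0 1
```

Here "Ann Ray" is not flagged, although "Ann" and "Ray" each appear somewhere in the roster.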

A few things are wrong with the code in your question, though:

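publications["TraineePub"] doesn't create anything. It only tries to select a column named "TraineePub" (adding a comma lets you reference rows or columns by that name), and I don't know if that variable even exists. If it doesn't, R stops with an "undefined columns selected" error.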

publications$TraineePub[i] retrieves a value, but it doesn't do anything with it (unless you call print, which will print the value). To store a result, you need to assign to it with <-.
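
For comparison, a loop along the lines of the one in the question can work once the column is created first, each branch assigns a value, and the comparison uses `%in%` so that `if()` receives a single TRUE/FALSE. This is a sketch with made-up data, not the asker's actual files.

```r
# Made-up data for illustration only
publications <- data.frame(
  AuthorLast = c("Lee", "Kim", "Ray"),
  stringsAsFactors = FALSE
)
peeps <- c("Lee", "Ray")

# Create the column first so the loop has something to fill in
publications$TraineePub <- NA_integer_

for (i in seq_len(nrow(publications))) {
  # %in% yields a single TRUE/FALSE here, which is what if() expects
  if (publications$AuthorLast[i] %in% peeps) {
    publications$TraineePub[i] <- 1L
  } else {
    publications$TraineePub[i] <- 0L
  }
}
publications$TraineePub
# [1] 1 0 1
```
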

Edit: Also, you should try to avoid using for loops as much as you can. Learn to use the apply family or plain vector operations (e.g. c(1,2,3,4,5)+c(2,0,3,1,3) returns the same vector as c(3,2,6,5,8)).
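
To illustrate the point about vector operations, the sketch below checks the addition from the example and shows an `sapply` call giving the same result as the vectorized `%in%`. The names are made up for illustration.

```r
# Element-wise addition needs no loop
x <- c(1, 2, 3, 4, 5) + c(2, 0, 3, 1, 3)
x
# [1] 3 2 6 5 8

# Made-up names for illustration
authors <- c("Lee", "Kim", "Ray")
peeps   <- c("Lee", "Ray")

# sapply version: applies the test to one author at a time
via_sapply <- sapply(authors, function(a) a %in% peeps, USE.NAMES = FALSE)

# Vectorized version: tests every author in one call
via_vector <- authors %in% peeps

identical(via_sapply, via_vector)
# [1] TRUE
```
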

Max Candocia
  • That makes a lot of sense. I can't believe I've made it through 2 MOOCs and O'Reilly's R Cookbook, and I'm still running into basic stumbling blocks like this. Your feedback really helps. Thank you! @user1362215 – Super Niemie Apr 08 '14 at 20:19