My data looks like this:

   id first  middle  last       Age
    1 Carol  Jenny   Smith      15
    2 Sarah  Carol   Roberts    20
    3 Josh   David   Richardson 22

I have a function that creates a new column indicating, for each row, whether a given name appears in any of the columns I specify (the 2nd through 4th columns, i.e. 'first':'last'). The function and the result it produces are shown below:

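funname <- function(df, cols, value, newcolumn) {
  # 1 if `value` appears in any of `cols` for that row, otherwise 0;
  # `newcolumn` is a string, so use [[ ]] rather than $
  df[[newcolumn]] <- as.integer(rowSums(df[cols] == value) > 0)
  df
}

   id first  middle  last       Age  Carol
    1 Carol  Jenny   Smith      15   1
    2 Sarah  Carol   Roberts    20   1
    3 Josh   David   Richardson 22   0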

But my real data is more complicated, and I want to create at least 20 new columns (e.g. Carol, Robert, Jenny, Anna, Richard, Daniel, Eric, ...). How can I make the existing function produce multiple new columns? The only thing I can think of is function(df, cols, value, newcolumn1, newcolumn2, newcolumn3, ...), but that would be unworkable if I later want a hundred columns. Any help? Thank you in advance! :)

EDIT:

funname <- function(df, cols, value, newcol) {
  # `newcol` holds the column name as a string, so index with [[ ]]
  df[[newcol]] <- as.integer(rowSums(df[cols] == value) > 0)
  df
}

I read the comments below, but let me rephrase my question: how would I map this function so that it generates multiple columns, each with a new name that I assign?

JNB
  • [My answer to your previous question](https://stackoverflow.com/a/55402325/496488) shows how to do this. Also, you can adapt [Ronak Shah's answer](https://stackoverflow.com/a/55402161/496488) to that question (which is the one you're using above) in a similar way. By the way, if Ronak Shah's previous answer resolved that question, you should click the "accept" check mark next to his answer to let people know that a solution was found. – eipi10 Apr 02 '19 at 20:50
  • @eipi10 got it! thank you! – JNB Apr 02 '19 at 20:56
  • Hmm, it's not quite working with the function I wrote in the post. If anyone can help, please do! – JNB Apr 02 '19 at 21:20
  • Please provide a data sample using `dput()` (e.g., paste into your question the output of `dput(df[1:3, ])`) and also paste into your question the code you're running to try and add multiple columns. – eipi10 Apr 02 '19 at 21:23
  • I think this is just one giant `table` operation if you get your data converted to two long vectors, one representing row number and the other the value. E.g.: `table(row(dat[2:4]), unlist(dat[2:4]))` As per usual, long-form, 'tidy' data is conceptually simpler to work with (whether you use 'tidy' packages or not...). – thelatemail Apr 02 '19 at 22:00
  • @thelatemail I just tried it but I think it's doing something different.. – JNB Apr 02 '19 at 22:06
  • @Molly - how is it doing something different? For me, it gives a table with 8 columns with the counts from each row, named after each of the values in `first/middle/last`. Is that not what you want? – thelatemail Apr 02 '19 at 22:08
  • @thelatemail Oh yes, you're right. It works for this dataset for sure. But my real dataset is longer and wider, and when I run it on mine the output comes out differently. That's what I meant. Sorry for the confusion. – JNB Apr 02 '19 at 22:13

1 Answer

I think this is just one giant table operation if you get your data converted to two long vectors, one representing row number and the other the value:

# Cross-tabulate the row index against every value found in columns 2 to 4
tab <- as.data.frame.matrix(table(row(dat[2:4]), unlist(dat[2:4])))
cbind(dat, tab)
#  id first middle       last Age Carol David Jenny Josh Richardson Roberts Sarah Smith
#1  1 Carol  Jenny      Smith  15     1     0     1    0          0       0     0     1
#2  2 Sarah  Carol    Roberts  20     1     0     0    0          0       1     1     0
#3  3  Josh  David Richardson  22     0     1     0    1          1       0     0     0
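
If only a chosen set of names is needed (possibly including names that never occur in the data), or a 0/1 flag is wanted rather than a count, one possible variation is to fix the factor levels before tabulating. This is a minimal sketch, assuming the sample data from the question stored as character columns; `wanted` and `flags` are illustrative names that do not appear in the question:

```r
# Sample data from the question (character columns, not factors)
dat <- data.frame(
  id     = 1:3,
  first  = c("Carol", "Sarah", "Josh"),
  middle = c("Jenny", "Carol", "David"),
  last   = c("Smith", "Roberts", "Richardson"),
  Age    = c(15, 20, 22),
  stringsAsFactors = FALSE
)

cols   <- c("first", "middle", "last")
wanted <- c("Carol", "Jenny", "Anna")  # hypothetical names of interest

# Fixing the levels keeps only the wanted names and still yields an
# all-zero column for a name that never occurs (here "Anna")
vals <- factor(unlist(dat[cols]), levels = wanted)
tab  <- table(row(dat[cols]), vals)

# Turn the counts into 0/1 indicators, as the question's function does
flags   <- as.data.frame.matrix(tab)
flags[] <- lapply(flags, function(x) as.integer(x > 0))

res <- cbind(dat, flags)
res
```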

This method would also allow you to map the new output columns to variations of the names if required:

tab <- as.data.frame.matrix(table(row(dat[2:4]), unlist(dat[2:4])))
# Assign the counts back under new column names with an "_n" suffix
dat[paste0(colnames(tab),"_n")] <- tab
dat
#  id first middle       last Age Carol_n David_n Jenny_n Josh_n Richardson_n Roberts_n Sarah_n Smith_n
#1  1 Carol  Jenny      Smith  15       1       0       1      0            0         0       0       1
#2  2 Sarah  Carol    Roberts  20       1       0       0      0            0         1       1       0
#3  3  Josh  David Richardson  22       0       1       0      1            1         0       0       0
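
For comparison, the single-name function from the question can also be applied once per name in a loop, which may be closer to what "mapping the function" means there. This is a rough sketch, not a definitive implementation: `add_flag` and `names_wanted` are hypothetical names, and the function body assumes the intent was to use the `newcol` argument as the column name (hence `[[ ]]` instead of `$`):

```r
# Sample data from the question (character columns, not factors)
dat <- data.frame(
  id     = 1:3,
  first  = c("Carol", "Sarah", "Josh"),
  middle = c("Jenny", "Carol", "David"),
  last   = c("Smith", "Roberts", "Richardson"),
  Age    = c(15, 20, 22),
  stringsAsFactors = FALSE
)

# Assumed corrected form of the question's function: `newcol` is a string
add_flag <- function(df, cols, value, newcol) {
  df[[newcol]] <- as.integer(rowSums(df[cols] == value) > 0)
  df
}

cols         <- c("first", "middle", "last")
names_wanted <- c("Carol", "Jenny", "Anna")  # hypothetical names of interest

# Call the function once per name, reusing the name as the new column name
res <- dat
for (nm in names_wanted) {
  res <- add_flag(res, cols, nm, nm)
}
res
```

The loop scans the name columns once per requested name, whereas the `table` approach tabulates everything in a single pass, so the latter is likely to scale better when hundreds of columns are needed.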
thelatemail