How to count the presence of unique elements from each row of data set and list them in r

Question

I have a dataset like the following:

Age      Monday Tuesday Wednesday 
6-9        a     b        
6-9        b     a        c
6-9              c        a
9-10       c     c        b
9-10       c     a        b

Using R, I want to get the following data set/ results (where 1 represents the presence of the element and 0 represents the absence):

Age        a     b        c
6-9        1     1        0
6-9        1     1        1
6-9        1     0        1
9-10       0     1        1
9-10       1     1        1

For each observation, I need the presence or absence of each unique element. — fitrat, Nov 19 '19 at 20:06
Several related ideas here: https://stackoverflow.com/q/18474896/5325862. You might need to create an ID variable in order to reshape — camille, Nov 19 '19 at 20:33

score 3 · Answer 1 · answered Nov 19 '19 at 20:17

3

This can be done by melt then create the freq count by table

library(reshape2)
df['New']=row.names(df)
s=melt(df,c('Age','New'))
s=as.data.frame.matrix(table(s$New,s$value))
s$Age=df$Age
s
  a b c  Age
1 1 1 0  6-9
2 1 1 1  6-9
3 1 0 1  6-9
4 0 1 2 9-10
5 1 1 1 9-10

answered Nov 19 '19 at 20:17

BENY

317,841
20
164
234

@denis glad I can help :-) have a nice day – BENY Nov 19 '19 at 20:32
2

You could also achieve it with just `base R` `data.frame(df[1], as.data.frame.matrix(table(data.frame(1:nrow(df), stack(df[-1])[1]))))` :) – tmfmnk Nov 19 '19 at 20:45

score 1 · Answer 2 · answered Nov 19 '19 at 20:06

One option with pivot_longer and pivot_wider

library(dplyr)
library(tidyr)
df1 %>%
   mutate(rn = row_number()) %>% 
   pivot_longer(cols = -c(Age, rn)) %>%
   filter(value != '')  %>% 
   select(-name) %>%
   distinct %>%
   mutate(val = 1) %>%
   pivot_wider(names_from = value, values_from = val, 
            values_fill = list(val = 0)) %>%
   select(-rn)
# A tibble: 5 x 4
#  Age       a     b     c
#  <chr> <dbl> <dbl> <dbl>
#1 6-9       1     1     0
#2 6-9       1     1     1
#3 6-9       1     0     1
#4 9-10      0     1     1
#5 9-10      1     1     1

data

df1 <- structure(list(Age = c("6-9", "6-9", "6-9", "9-10", "9-10"), 
    Monday = c("a", "b", "", "c", "c"), Tuesday = c("b", "a", 
    "c", "c", "a"), Wednesday = c("", "c", "a", "b", "b")),
    class = "data.frame", row.names = c(NA, 
-5L))

I am getting an error for the select(-name). Error in select(., -name) : unused argument (-name) — fitrat, Nov 19 '19 at 20:27
@fitrat try by changing `select(-name)` to `dplyr::select(-name)` could be a masking issue by `select` from some other package — akrun, Nov 19 '19 at 20:29

denis · Answer 3 · 2019-11-19T20:54:15.033

1

A data.table solution, using an ID variable;

library(data.table)
library(magrittr)
df <- setDT(df)

ag = function(x){if(length(x>1)){1}else{length(x)}}

df[,idx:=.I][]%>%
  melt(id.vars = c("Age","idx")) %>%
  .[,.(Age,value,idx)]%>%
  dcast(Age+idx~value,fun.aggregate = ag)%>%
  .[,-c("idx","NA")]


    Age a b c
1:  6-9 1 1 0
2:  6-9 1 1 1
3:  6-9 1 0 1
4: 9-10 0 1 1
5: 9-10 1 1 1

The data:

df <- read.table(text = "Age      Monday Tuesday Wednesday 
6-9        a     b        NA
6-9        b     a        c
6-9       NA     c        a
9-10       c     c        b
9-10       c     a        b",header = T)

edited Nov 19 '19 at 20:54

answered Nov 19 '19 at 20:09

denis

5,580
1
13
40

Thank you very much for your help. However, instead of just giving me the presence or absence of each unique element per row, it is giving me the total number of unique elements per row. It would be appreciated if you suggest me how can I just get the presence or absence of each unique element per row? – fitrat Nov 19 '19 at 20:35
Ho, right I didn't realized, sorry. I edited my answer, but WeNYoBen anser's is way better and simpler I think. – denis Nov 19 '19 at 20:55

How to count the presence of unique elements from each row of data set and list them in r

3 Answers3

data