How can I count the unique observations within each group? Here is a reproducible example.

mydata=structure(list(N = c(111L, 111L, 111L, 111L, 112L, 112L, 112L, 
111L, 111L, 111L, 111L, 112L, 112L, 112L), group = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "control group", class = "factor"), 
    char = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L), .Label = c("bad", "good"), class = "factor")), .Names = c("N", 
"group", "char"), class = "data.frame", row.names = c(NA, -14L
))

I need to count only the unique values of N for each level of the char variable. N contains prisoner numbers, and char records good or bad behavior, so I need the total count of unique prisoner numbers in the good and bad categories. There are two groups, control and test; I have only shown the control group here. As we can see, the unique values here are 111 and 112.

Here is the output I want:

        number of unique prisoners for control group
bad     2
good    2

How can I do this?

Edit

mydata=structure(list(N = c(111L, 111L, 111L, 111L, 112L, 112L, 112L, 
111L, 111L, 111L, 111L, 112L, 112L, 112L, 111L, 111L, 111L, 111L, 
112L, 112L, 112L, 111L, 111L, 111L, 111L, 112L, 112L, 112L), 
    group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L), .Label = c("control group", "test group"), class = "factor"), 
    char = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L), .Label = c("bad", "good"), class = "factor")), .Names = c("N", 
"group", "char"), class = "data.frame", row.names = c(NA, -28L
))

Desired output, split by group:

     control group test group
bad    2             2
good   2             2
psysky
  • `library(dplyr); mydata %>% group_by(char) %>% summarise(N_unique = n_distinct(N))` – AntoniosK Jul 06 '18 at 12:22
  • @AntoniosK, I edited the post. Can you help me do it by group? See the updated output. – psysky Jul 06 '18 at 12:30
  • `library(tidyverse); mydata %>% group_by(group, char) %>% summarise(N_unique = n_distinct(N)) %>% ungroup() %>% spread(group, N_unique)` – AntoniosK Jul 06 '18 at 12:34
  • @AntoniosK, very good. One last request: is it possible to also show a percentage for each value, e.g. 2/100 = 2%, displayed as 2(2,00%)? – psysky Jul 06 '18 at 12:42

1 Answer

Using the data.table and dplyr packages:

library(data.table)
library(dplyr)

mydata %>% 
  group_by(char) %>% 
  summarise(Unique = uniqueN(N))  # uniqueN() comes from data.table
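
The same count can also be obtained without any packages. The following is a minimal base R sketch, not part of the original answer; it assumes the edited `mydata` from the question, rebuilt here in compact form so the block runs on its own:

```r
# Rebuild the edited example data compactly (same values as the dput() in the question)
mydata <- data.frame(
  N     = rep(c(111L, 111L, 111L, 111L, 112L, 112L, 112L), times = 4),
  group = factor(rep(c("control group", "test group"), each = 14)),
  char  = factor(rep(rep(c("bad", "good"), each = 7), times = 2))
)

# Count distinct prisoner numbers for every char x group combination
counts <- tapply(mydata$N, list(mydata$char, mydata$group),
                 function(x) length(unique(x)))
counts
```

`tapply()` returns a matrix with the `char` levels as rows and the `group` levels as columns, which already has the shape of the requested table.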

Or, for your final question (counts per group with the percentage in parentheses):

library(data.table)
library(dplyr)

mydata %>% 
  group_by(char) %>% 
  # subset N inside each char group; the denominator 100 follows the "2/100 = 2%" request
  summarise(Control = paste(uniqueN(N[group == 'control group']), "(",
                            formatC(100 * uniqueN(N[group == 'control group'])/100, format = "f", digits = 2), "%", ")", sep = ""), 
            Test = paste(uniqueN(N[group == 'test group']), "(",
                         formatC(100 * uniqueN(N[group == 'test group'])/100, format = "f", digits = 2), "%", ")", sep = ""))
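
For comparison, here is a sketch of the same table built by grouping on both variables and reshaping, along the lines of the `group_by(group, char)` suggestion in the comments on the question. It is an illustration only. The denominator `total` is an assumption taken from the "2/100 = 2%" request, and `pivot_wider()` (tidyr 1.0 or later) is swapped in for `spread()`:

```r
library(dplyr)
library(tidyr)

# Rebuild the edited example data compactly (same values as the dput() in the question)
mydata <- data.frame(
  N     = rep(c(111L, 111L, 111L, 111L, 112L, 112L, 112L), times = 4),
  group = factor(rep(c("control group", "test group"), each = 14)),
  char  = factor(rep(rep(c("bad", "good"), each = 7), times = 2))
)

total <- 100  # assumed denominator, taken from the "2/100 = 2%" comment

result <- mydata %>%
  group_by(group, char) %>%
  summarise(n_unique = n_distinct(N), .groups = "drop") %>%
  # format as count followed by the percentage in parentheses
  mutate(label = sprintf("%d(%.2f%%)", n_unique, 100 * n_unique / total)) %>%
  select(-n_unique) %>%
  pivot_wider(names_from = group, values_from = label)

result
```

With the example data every cell comes out as `2(2.00%)`. Grouping on `group` as well means new groups appear as columns automatically, without writing one `summarise()` expression per group.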
Maylo
  • uniqueN is part of the data.table package, I think. Please correct me if I'm wrong. – Maylo Jul 06 '18 at 12:24
  • Yes, it is from data.table, but I updated the desired output; see the edited post. How can I do it split by group? – psysky Jul 06 '18 at 12:31
  • `library(data.table); library(dplyr); mydata %>% group_by(char) %>% summarise(Control = uniqueN(mydata[mydata$group == 'control group',]$N), Test = uniqueN(mydata[mydata$group == 'test group',]$N))` – Maylo Jul 06 '18 at 12:42
  • @Maylo, is it possible to also calculate a percentage for each value, e.g. 2/100 = 2%, shown as 2(2,00%)? – psysky Jul 06 '18 at 12:43
  • @varimax simply add '/100' after each `uniqueN(...)` call: `library(data.table); library(dplyr); mydata %>% group_by(char) %>% summarise(Control = uniqueN(mydata[mydata$group == 'control group',]$N)/100, Test = uniqueN(mydata[mydata$group == 'test group',]$N)/100)` – Maylo Jul 06 '18 at 12:45
  • @Maylo Yes, but that outputs bad 0.0200 0.0200, and the needed output is 2(0.02) 2(0.02), with the percentages in parentheses; that is what I mean. Then I will accept your answer. Thank you. – psysky Jul 06 '18 at 12:50
  • @varimax I've edited my answer above – Maylo Jul 06 '18 at 12:57