
I'd like to get the number and percentage of cases that meet a certain condition, grouped by another column.

The groups are the cities, and the condition is hour <= 6.

For example, given

  city hour
    A    7
    A    6
    A    3
    B    2
    C    7

I'd like to get the count per city:

  city hour <= 6
    A         2
    B         1
    C         0

and then the proportion of each city's cases that meet the condition:

  city hour <= 6 (proportion)
    A      0.6666667
    B      1.0000000
    C      0.0000000

I think I'm almost there. With

aggregate(hours, list(city), mean)

I get the mean of hour by city, but I don't understand how to get the other results.

Magal

2 Answers

Using the dplyr package.

data:

df1<-data.frame(city=c(rep("A",3), "B","C"), hour = c(7,6,3,2,7))

code:

library(dplyr)

df1 %>%
  group_by(city) %>%
  summarise(hourLHE6 = sum(hour <= 6), hourPCT = sum(hour <= 6) / length(hour))

result:

## A tibble: 3 x 3
#  city  hourLHE6 hourPCT
#  <fct>    <int>   <dbl>
#1 A            2   0.667
#2 B            1   1    
#3 C            0   0    
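
A possible shortening of the pipeline above, offered as a sketch rather than as part of the original answer: the mean of a logical vector is the proportion of TRUE values, so mean(hour <= 6) gives the share directly, and n() gives the group size. The column names n_total, n_le6 and pct_le6 are made up for illustration.

```r
library(dplyr)

# Same example data as in the answer above
df1 <- data.frame(city = c(rep("A", 3), "B", "C"), hour = c(7, 6, 3, 2, 7))

res <- df1 %>%
  group_by(city) %>%
  summarise(
    n_total = n(),             # number of rows per city
    n_le6   = sum(hour <= 6),  # TRUE counts as 1, so this counts matching rows
    pct_le6 = mean(hour <= 6)  # share of matching rows per city
  )

print(res)
```

For a different condition, only the comparison inside sum() and mean() needs to change.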
Andre Elrico

Try this:

x <- structure(list(city = c("A", "A", "A", "B", "C"), hour = c(7, 
6, 3, 2, 7)), row.names = c(NA, -5L), class = "data.frame")

> x
  city hour
1    A    7
2    A    6
3    A    3
4    B    2
5    C    7

> aggregate(x$hour, by = list(city = x$city), function(z) length(z[z<=6]))
  city x
1    A 2
2    B 1
3    C 0

> aggregate(x$hour, by = list(city = x$city), function(z) length(z[z<=6]) / length(z))
  city         x
1    A 0.6666667
2    B 1.0000000
3    C 0.0000000
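
If both columns are wanted in one data frame, one base R option is tapply on a logical vector. This is only a sketch under the same example data; the names below, counts, shares and result are illustrative and not part of the answer above.

```r
x <- data.frame(city = c("A", "A", "A", "B", "C"), hour = c(7, 6, 3, 2, 7))

# TRUE where the condition holds
below <- x$hour <= 6

# sum() of a logical counts the TRUEs, mean() gives their share
counts <- tapply(below, x$city, sum)
shares <- tapply(below, x$city, mean)

result <- data.frame(
  city    = names(counts),
  n_le6   = as.vector(counts),
  pct_le6 = as.vector(shares)
)

print(result)
```

The same pair of functions also works inside aggregate, for example function(z) mean(z <= 6) in place of length(z[z<=6]) / length(z).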
Cettt