frequency count of a column based on two other columns

Question

I am relatively new to R and have the following problem that I hope you can help me with.

I have a table with a column RANDOM.s. Another column shows the year, and a third column holds some values or NA.

RANDOM <- sample(c("A","B","C","D"), size = 100, replace = TRUE)
Year <- sample(c(2008,2009,2010), 100, TRUE)
Value <- sample(c(0.22, NA), 100, TRUE)

I am looking for the following solution:

       Year  2008 2009 2010 ...
Ticker
 A             9     11   7
 B             11    2    6
 C
 D

I want a table like this one, for example, that tells me how often a value appeared in the column "Value" for each "RANDOM" in a given year, e.g. 2008.

So far I could only get a table that shows how often each combination of RANDOM and Year occurs, but not the count of my third column. Like this:

     Year 2008 2009 2010 ...  
 RANDOM
 A        4    5    6
 B
 C

I would be really grateful if you could help me out on this. Thank you! :)

This will be fairly easy to solve using aggregate() or dplyr, however, please provide a reproducible example. — yrx1702, May 05 '18 at 16:06
Please don't post data as screenshots. Use `dput` to include a minimal representative dataset. Also include any code attempt. — Maurits Evers, May 05 '18 at 16:06

score 1 · Accepted Answer · answered May 05 '18 at 16:22

You are actually close to the solution. I also want to stress that you should first check out how to make a good reproducible example (at least for your next question); see "How to make a great R reproducible example?"

Here is an example of how it might look for your data:

    # Make up some demo data
    Ticker <- sample(c("A", "B", "C", "D"), size = 100, replace = TRUE)
    Year <- sample(c(2008, 2009, 2010), size = 100, replace = TRUE)
    Value <- sample(c(0.22, NA), size = 100, replace = TRUE)

    data <- data.frame(Ticker, Year, Value)

    # Load the dplyr library
    library(dplyr)

    # Group data by Ticker and Year and count the Values that are not NA
    data %>%
      group_by(Ticker, Year) %>%
      summarise(count = sum(!is.na(Value)))

   Ticker  Year count
   <fctr> <dbl> <int>
1       A  2008     9
2       A  2009    11
3       A  2010     7
4       B  2008    11
5       B  2009     2
6       B  2010     6
7       C  2008     7
8       C  2009    10
9       C  2010     9
10      D  2008     5
11      D  2009    12
12      D  2010    11
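The result above is in long format, while the question asks for one row per Ticker and one column per Year. A minimal base R sketch of that wide layout, assuming the same column names as above, is to drop the NA rows and cross-tabulate with `table()`. The small fixed data frame and the names `present` and `wide` are made up here so the counts can be checked by hand:

```r
# Small fixed data set (illustrative only, not the random demo data above)
data <- data.frame(
  Ticker = c("A", "A", "A", "B", "B", "B"),
  Year   = c(2008, 2008, 2009, 2008, 2009, 2009),
  Value  = c(0.22, NA, 0.22, NA, 0.22, 0.22)
)

# Keep only the rows where Value is present
present <- data[!is.na(data$Value), ]

# Cross-tabulate: rows are Tickers, columns are Years,
# cells are the number of non-NA Values
wide <- table(Ticker = present$Ticker, Year = present$Year)
wide
```

Note that a Ticker or Year with no non-NA Value at all would not appear in this table, since its rows are dropped before tabulating.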

You are welcome! It would be nice if you could mark the question as answered and upvote! Thanks and have a nice weekend! — FAMG, May 05 '18 at 18:45

score 0 · Answer 2 · answered May 05 '18 at 17:34

You can also use count() without summarise(); it creates a new variable called n.

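# load the packages used below
library(dplyr)
library(tidyr)

# some example data
df <- tibble(
    Ticker = c(LETTERS[1:5], LETTERS[1:5]),
    y2008 = sample(1:3, 10, replace = TRUE),
    y2009 = sample(1:3, 10, replace = TRUE),
    y2010 = sample(1:3, 10, replace = TRUE)
)

# reshape to long format, then count each Ticker/key/value combination
df %>% 
    gather(key, value, -Ticker) %>% 
    group_by(Ticker, key, value) %>% 
    count()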
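For data shaped like the question's (one row per observation, with Ticker, Year and Value columns), the same idea might look like the sketch below. It assumes dplyr is installed; the small fixed data frame and the name `counts` are made up for illustration, and the NA rows are filtered out first so that only present Values are counted:

```r
library(dplyr)

# Small fixed data set (illustrative only)
data <- data.frame(
  Ticker = c("A", "A", "A", "B", "B", "B"),
  Year   = c(2008, 2008, 2009, 2008, 2009, 2009),
  Value  = c(0.22, NA, 0.22, NA, 0.22, 0.22)
)

# Drop rows with a missing Value, then count rows per Ticker/Year pair;
# count() stores the result in a column called n
counts <- data %>%
  filter(!is.na(Value)) %>%
  count(Ticker, Year)
counts
```

Ticker/Year pairs whose Values are all NA do not show up in `counts` at all, rather than appearing with a count of 0.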
