1

I am using R.

I have a database that looks like this :

Price A       Price B   
1              3           
3              2            
1              3            
2              4          
3              2           
3              2            
3              3           
       

What I would like to do is to count the number of occurence of each couple (Price A, Price B) and know which couples is the most prominent (let's say the three more prominent couples) and their frequency.

I really don't know how to do it.

For instance here:

(3,2) : 3 times

(1,3) : 2 times

(2,4) : 1 time

(3,3) : 1 time

Nimantha
  • 6,405
  • 6
  • 28
  • 69
Marc
  • 71
  • 7

4 Answers4

1

A dplyr possibility could be:

df %>%
 group_by_all() %>%
 tally()

  Price_A Price_B     n
    <int>   <int> <int>
1       1       3     2
2       2       4     1
3       3       2     3
4       3       3     1

Or the same result with count():

df %>%
 count(Price_A, Price_B)

Or if you want the combinations of the two columns as one column:

df %>%
 mutate(Price_comb = paste(Price_A, Price_B, sep = ",")) %>%
 count(Price_comb) 

  Price_comb     n
  <chr>      <int>
1 1,3            2
2 2,4            1
3 3,2            3
4 3,3            1
tmfmnk
  • 38,881
  • 4
  • 47
  • 67
1

With base R : df.new <- as.data.frame(with(df, table(Price.A, Price.B))) or just df.new <- as.data.frame(table(df)) as pointed out by @Ronak Shah

# output df.new
  Price.A Price.B Freq
1       1       2    0
2       2       2    0
3       3       2    3
4       1       3    2
5       2       3    0
6       3       3    1
7       1       4    0
8       2       4    1
9       3       4    0

df.new[df.new$Freq != 0, ]
#   Price.A Price.B Freq
3       3       2    3
4       1       3    2
6       3       3    1
8       2       4    1

Data

df <- structure(list(Price.A = c(1L, 3L, 1L, 2L, 3L, 3L, 3L), Price.B = c(3L, 
2L, 3L, 4L, 2L, 2L, 3L)), .Names = c("Price.A", "Price.B"), class = "data.frame", row.names = c(NA, 
-7L))
nghauran
  • 6,648
  • 2
  • 20
  • 29
0

This can easily be done with a SQL query, so let's try using the sqldf package:

library(sqldf)

sql <- "SELECT price_a, price_b, COUNT(*) AS frequency
        FROM your_df
        GROUP BY price_a, price_b
        ORDER BY frequency DESC"

result <- sqldf(sql)

enter image description here

Demo

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
0

A different tidyverse way using summarise allowing you to specify the name of the new column.

Your data:

df <- structure(list(PriceA = c(1, 3, 1, 2, 3, 3, 3), PriceB = c(3, 
                                                             2, 3, 4, 2, 2, 3)), class = c("spec_tbl_df", "tbl_df", "tbl", 
                                                                                           "data.frame"), row.names = c(NA, -7L), spec = structure(list(
                                                                                             cols = list(PriceA = structure(list(), class = c("collector_double", 
                                                                                                                                              "collector")), PriceB = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                  "collector"))), default = structure(list(), class = c("collector_guess", 
                                                                                                                                                                                                                                                        "collector")), skip = 2), class = "col_spec"))

Solution:

library(tidyverse)
df %>% 
  group_by(PriceA, PriceB) %>% 
  summarise(Freq = n())

# A tibble: 4 x 3
# Groups:   PriceA [3]
  PriceA PriceB  Freq
   <dbl>  <dbl> <int>
1      1      3     2
2      2      4     1
3      3      2     3
4      3      3     1
nycrefugee
  • 1,629
  • 1
  • 10
  • 23