
I have a frequency table of codes and counts, and I would like to keep only the three rows with the highest Freq, together with their Var1 values. For the sample below, the result should be a table containing only AZ 520, then AE 488, then AU 399.

   Var1 Freq
1    AE  488
2    AR   12
3    AU  399
4    AW   56
5    AZ  520
6    BA    2
7    BB   84
8    BG  246
9    BH   85
10   BI    6




as.data.frame(table(training.data.raw$destinationcountry))

1 Answer


Recreating your data as follows, assuming the column names are name and value:

# data_frame() is deprecated; tibble() (re-exported by dplyr) is its replacement.
# The dplyr:: prefix lets this run before library(dplyr) is called.
training.data.raw <- dplyr::tibble(
  name  = c("IN", "IS", "IT", "JO", "JP", "KZ", "MA", "MZ", "NG", "NO", "NZ", "PE", "PH",
            "PR", "RO", "RU", "SA", "SE", "SY", "TM", "TN", "TR", "UK", "US", "WS"),
  value = c(999, 1, 1885, 1098, 2, 584, 858, 11, 10, 522, 193, 29, 2,
            1, 1603, 353, 6, 2, 4, 33, 228, 3201, 852, 1363, 1)
)

You can use the top_n function in the dplyr package to get the desired result (details in the help file, ?top_n). If no wt argument is given, it ranks by the last column of the data frame:

library(dplyr)
top_3 <- top_n(x = training.data.raw, n = 3)
top_3
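In dplyr 1.0.0 and later, top_n() is marked as superseded by slice_max(). A minimal sketch of the same selection with that function, assuming a recent dplyr is installed; the `scores` tibble is a hypothetical, shortened copy of the data above, not the asker's real table:

```r
library(dplyr)

# Hypothetical, shortened copy of the illustrative data above
scores <- tibble(
  name  = c("IN", "IT", "JO", "RO", "TR", "US"),
  value = c(999, 1885, 1098, 1603, 3201, 1363)
)

# slice_max() keeps the n rows with the largest `value`
# and returns them sorted from largest to smallest
top_3 <- slice_max(scores, order_by = value, n = 3)
top_3
```

Unlike top_n(), slice_max() returns the rows already ordered, so no separate arrange() step is needed. Both keep tied rows by default, so either can return more than n rows when values are tied.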

EDIT BASED ON COMMENT: If your name column is a factor rather than a regular character vector, you can convert it to character first with mutate:

training.data.characters <- mutate(training.data.raw, name = as.character(name))

# Now top_n() will take it
# You can also state the wt argument explicitly to tell it to rank by value
top_3 <- top_n(x = training.data.characters, n = 3, wt = value)
top_3
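One plausible reading of the tbl_vars error reported in the comments is that top_n() was handed a single factor column rather than a whole data frame. A sketch of applying the same idea directly to the frequency table from the question, assuming it has the Var1 and Freq columns shown there; the `freq` object is rebuilt by hand from the ten pasted rows and stands in for as.data.frame(table(...)):

```r
library(dplyr)

# First ten rows of the frequency table pasted in the question
freq <- data.frame(
  Var1 = factor(c("AE", "AR", "AU", "AW", "AZ", "BA", "BB", "BG", "BH", "BI")),
  Freq = c(488, 12, 399, 56, 520, 2, 84, 246, 85, 6)
)

# Pass the whole data frame, rank by Freq, then sort largest first
top_3 <- freq %>%
  top_n(n = 3, wt = Freq) %>%
  arrange(desc(Freq))
top_3

# Base R equivalent without dplyr: order by descending Freq, keep 3 rows
top_3_base <- head(freq[order(-freq$Freq), ], 3)
```

With this sample the rows come back as AZ, AE, AU, which is the order the question asks for. Var1 can stay a factor here, since only Freq is used for ranking.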
Mekki MacAulay
  • Thanks, but I received this message: `Error in UseMethod("tbl_vars") : no applicable method for 'tbl_vars' applied to an object of class "factor"` –  Jan 06 '16 at 15:50
  • Ok, that means your named variables are `factors`. Which is awkward. You can transform them first with `mutate`. I'll update the answer. – Mekki MacAulay Jan 06 '16 at 15:51
  • Thanks! I will check it out –  Jan 06 '16 at 15:52
  • I have edited the question, removed the image and pasted the first 10 results :) just in case –  Jan 06 '16 at 15:55
  • May I ask why I am getting downvotes on my question? Is it that weird? –  Jan 06 '16 at 15:58
  • It's generally considered bad practice not to provide data. It forces respondents to recreate the data manually, which is time-consuming. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example for how to write a good-quality reproducible question. – Mekki MacAulay Jan 06 '16 at 15:59