I would like to list the departments that use the same vendor, identified by vendor code, in a very large dataset. I guess I will need a loop for that, but I am not sure how to start.
-
Please provide us with a sample dataset through `dput` and also provide your attempt at creating the desired output. – Mossa Nov 13 '21 at 11:09
-
Please make a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) or [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) with a sample input and your expected output. This is needed to create, test and verify possible solutions. – Martin Gal Nov 13 '21 at 11:11
-
Try `library(dplyr)`: If your dataframe is named `df`, you could use `df %>% group_by(vendor_code) %>% filter(n() >= 3)`. – Martin Gal Nov 13 '21 at 11:14
-
@Mossa Hello, I posted a sample of the dataset. I haven't started yet; I don't know how to start. – Imane Nov 13 '21 at 11:26
-
What I want to show exactly is this: if a vendor code exists in 3 or more departments, I want to list them. – Imane Nov 13 '21 at 11:28
-
@MartinGal What does `filter(n() >= 3)` mean, and how is it linked to department? – Imane Nov 13 '21 at 11:29
-
We group by `vendor_code` and then keep only the groups with three or more rows. That's what `filter(n() >= 3)` does. – Martin Gal Nov 13 '21 at 11:32
-
Posting your data as images isn't a good way of sharing data. There are even people who deny that this _is_ a way of sharing data. Please take a look at the links posted on how to make a reproducible example. There you will find examples of how to put data into your question. – Martin Gal Nov 13 '21 at 11:35
1 Answer
Here's a base R solution.

    # get the repeated values
    dat_tb <- table(dat$vendor_code)
    # select for the condition and print from the whole data set
    dat[ dat$vendor_code %in% names(dat_tb[ dat_tb > 2 ]), ]

      vendor_code department
    2        9966      dept2
    3        9966      dept3
    8        9966      dept8
    9        9966      dept9

Data:

    dat <- data.frame( vendor_code=rep(c(3344,9966,9966,3444,5566,3388),2),
                       department=paste0("dept",1:12))
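
If a vendor can appear more than once within the same department, counting rows may overstate the number of departments. The following is a minimal sketch of one way to count distinct departments per vendor instead. It assumes the same column names as above; the data frame `dat2` and its values are hypothetical and only serve as an illustration.

```r
# Hypothetical sample data: vendor 9966 appears in three distinct departments,
# vendor 3344 appears three times but only in two departments.
dat2 <- data.frame(
  vendor_code = c(9966, 9966, 9966, 3344, 3344, 3344, 5566),
  department  = c("dept1", "dept2", "dept3", "dept1", "dept1", "dept2", "dept4")
)

# drop repeated vendor/department pairs so each pair is counted once
pairs <- unique(dat2[, c("vendor_code", "department")])

# number of distinct departments per vendor code
dept_count <- table(pairs$vendor_code)

# vendor codes used by three or more departments
shared <- names(dept_count[dept_count >= 3])

# list those vendors together with their departments
result <- pairs[pairs$vendor_code %in% shared, ]
result
```

With this sample, only vendor 9966 should be kept, since 3344 has three rows but just two distinct departments. No explicit loop is needed; `unique()` and `table()` do the grouping.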

Andre Wildberg