Suppose this is my data frame, df:

> df
gen A   B   C   D
M1  1   2   3   4
M1  8   6   5   3
M1  4   8   6   0
M1  8   5   6   3
M2  8   5   6   0
M2  0   2   8   6
M3  3   8   9   2
M3  8   9   5   6
M4  3   7   8   5
M4  5   6   3   2

How can I subset the rows that share the same value in the first column (gen)? For example, all of the M1 rows:

M1  1   2   3   4
M1  8   6   5   3
M1  4   8   6   0
M1  8   5   6   3

Many thanks

ramesh

1 Answer


Use subset() with a condition on the gen column:

> d
   gen A B C D
1  M1   1 2 3 4
2  M1   8 6 5 3
3  M1   4 8 6 0
4  M1   8 5 6 3
5  M2   8 5 6 0
6  M2   0 2 8 6
7  M3   3 8 9 2
8  M3   8 9 5 6
9  M4   3 7 8 5
10 M4   5 6 3 2
> subset(d, (gen %in% c("M1")))
  gen A B C D
1 M1   1 2 3 4
2 M1   8 6 5 3
3 M1   4 8 6 0
4 M1   8 5 6 3
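
To try this yourself, the data frame can be rebuilt from the table in the question. This is only a minimal sketch: it assumes gen is read as a character column, and the name m1 is introduced here for illustration. It also shows that plain bracket indexing selects the same rows as the subset() call above.

```r
# Rebuild the example data (values copied from the question).
d <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
gen A B C D
M1 1 2 3 4
M1 8 6 5 3
M1 4 8 6 0
M1 8 5 6 3
M2 8 5 6 0
M2 0 2 8 6
M3 3 8 9 2
M3 8 9 5 6
M4 3 7 8 5
M4 5 6 3 2
")

# Bracket indexing: keep the rows where gen equals "M1", all columns.
m1 <- d[d$gen == "M1", ]
print(m1)
```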

To loop over each group programmatically, without hard-coding "M1":

for (i in unique(d$gen)) { print(subset(d, (gen %in% c(i)))) } 
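
If each group needs to be processed rather than just printed, split() may be a convenient alternative to the loop. The sketch below uses a small stand-in for d (four rows taken from the question), and process_group is a hypothetical placeholder for whatever per-group work is needed.

```r
# Small stand-in for the data frame d (rows taken from the question).
d <- data.frame(
  gen = c("M1", "M1", "M2", "M3"),
  A = c(1, 8, 8, 3),
  B = c(2, 6, 5, 8),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per distinct gen value.
groups <- split(d, d$gen)

# Hypothetical per-group processing: column sums of A and B.
process_group <- function(g) colSums(g[, c("A", "B")])

# Apply the processing to every group; results is a list named by gen.
results <- lapply(groups, process_group)
print(results)
```

Each element can then be accessed by name, for example results$M1 or groups[["M2"]].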
Manoj Awasthi
  • Is it possible to subset without specifying "M1"? I'd like to do this inside a for loop: first subset "M1" and process it, then "M2", and so on. Thanks! – ramesh Mar 22 '14 at 05:25
  • Updated the answer: `for (i in unique(d$gen)) { print(subset(d, (gen %in% c(i)))) }` – Manoj Awasthi Mar 23 '14 at 07:34