0

Taking the data-frame df, I would like to extract the unique value according to the following preferred conditions per each Field:

1- if C1 exists, extract the respective value and ignore the others

2- if C2 exists, extract the respective value and ignore the others

... and so on up to C5

data:

df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
              Cond=rep(c("C1","C2","C3","C4","C5"),3),
              Value=c(1:15))

the desired output:

output <-  data.frame (F= c("F1","F2","F3","F4","F5"),
                   C= c("C1","C1","C2","C1","C3"),
                   Value= c(1,6,7,11,13))

(note1: the values were only set as such to a matter of exemplification, the real data values are not ordered)

(note2: The real conditional column is not ordered alphabetically at all. my though was to have something like, if A exists than chose "A value", otherwise pass to the next condition "if B exists ..." and so on)

10 Rep
  • 2,217
  • 7
  • 19
  • 33
Rui
  • 187
  • 1
  • 11

2 Answers2

2

If you can sort the data.frame before processing, this is fairly easy. Note that this works for this particular case. If your Cond values change, alphabetic sorting may go out the window.

library(dplyr)
df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
                  Cond=rep(c("C1","C2","C3","C4","C5"),3),
                  Value=c(1:15))

df <- df[with(df, order(Field, Cond)), ]
res <- df %>%
  group_by(Field) %>%
  filter(row_number() == 1)

Source: local data frame [5 x 3]
Groups: Field [5]

   Field   Cond Value
  <fctr> <fctr> <int>
1     F1     C1     1
2     F2     C1     6
3     F3     C2     7
4     F4     C1    11
5     F5     C3    13

Here is another, more genereric way, of doing this. Order of sorting is defined in so (see this question). Notice how I mangled up the values for Cond to show that it's not being sorted alphabetically.

df <- data.frame (Field=rep(c("F1","F2","F3","F4","F5"),each=3),
                  Cond=rep(c("rg1","kl2","xy3","rq4","ab5"),3),
                  Value=c(1:15))

so <- c("rg1","kl2","xy3","rq4","ab5")

df %>%
  group_by(Field) %>%
  slice(match(so, Cond)) %>%
  filter(row_number() == 1)

   Field   Cond Value
  <fctr> <fctr> <int>
1     F1    rg1     1
2     F2    rg1     6
3     F3    kl2     7
4     F4    rg1    11
5     F5    xy3    13
Community
  • 1
  • 1
Roman Luštrik
  • 69,533
  • 24
  • 154
  • 197
  • indeed you are right, I dint think that would be a problem. My "conditional" column is not ordered alphabetically at all. my though was to have something like, if A exists than chose "A value", otherwise pass to the next condition "if B exists ..." and so on – Rui Jan 29 '17 at 21:39
  • @Rui I modified my answer and I think it should work for your general case now. – Roman Luštrik Jan 29 '17 at 21:53
1

Another option is using data.table

library(data.table)
setDT(df)[order(Field, Cond), head(.SD, 1), by = Field]
#    Field Cond Value
#1:    F1   C1     1
#2:    F2   C1     6
#3:    F3   C2     7
#4:    F4   C1    11
#5:    F5   C3    13
akrun
  • 874,273
  • 37
  • 540
  • 662
  • yes it works indeed if "Cond" is to be sorted alphabetically. however if the "sort" column is not ordered alphabetically that example would not work. – Rui Jan 30 '17 at 08:56
  • @Rui In that case, convert it to `factor` and specify the `levels` in the order you want `order(Field, factor(Cond, levels = lvl))` – akrun Jan 30 '17 at 09:10