Could anyone answer this? I have a data frame (thousands of rows) with a column called Name, and I want to add new columns based on the values in Name. For example, here is a sample data frame:

df1<-data.frame(Id=c(1,2,3,4,5,6,7,8,9,10), 
       Name=c('Plant_A','Plant_A','Plant_A','Plant_A','Plant_B','Plant_B','Plant_B','Plant_C','Plant_C','Plant_C'), 
       Value=c(100,100,100,100,55,55,55,90,90,90),
       stringsAsFactors=FALSE)

Now, based on the Name column, two new columns, Availability and Status, should be added and populated with the values shown in the data frame df2 below. The first row for each Name should get Yes and 0, and all remaining rows for the same Name should get No and an empty string ''.

df2<-data.frame(Id=c(1,2,3,4,5,6,7,8,9,10), 
                Name=c('Plant_A','Plant_A','Plant_A','Plant_A','Plant_B','Plant_B','Plant_B','Plant_C','Plant_C','Plant_C'), 
                Value=c(100,100,100,100,55,55,55,90,90,90),
                Availability=c('Yes','No','No','No','Yes','No','No','Yes','No','No'),
                Status =c(0,'','','',0,'','',0,'',''),
                stringsAsFactors=FALSE)
            

I can only add a single constant value to every row, like this:

df1$Availability<-'Yes'
df1$Status<-0

But I don't understand how to populate df1 in order to get df2. Can anyone help me? Thank you.

mitco

3 Answers

A dplyr pipe can do it by grouping and mutating the data set. Note that Status ends up as a character column, because c(0, "") coerces 0 to "0".

library(dplyr)

df1 %>%
  group_by(Name) %>%
  mutate(Availability = c("Yes", rep("No", n() - 1)), 
         Status = c(0, rep("", n() - 1)))
## A tibble: 10 x 5
## Groups:   Name [3]
#      Id Name    Value Availability Status
#   <dbl> <chr>   <dbl> <chr>        <chr> 
# 1     1 Plant_A   100 Yes          "0"   
# 2     2 Plant_A   100 No           ""    
# 3     3 Plant_A   100 No           ""    
# 4     4 Plant_A   100 No           ""    
# 5     5 Plant_B    55 Yes          "0"   
# 6     6 Plant_B    55 No           ""    
# 7     7 Plant_B    55 No           ""    
# 8     8 Plant_C    90 Yes          "0"   
# 9     9 Plant_C    90 No           ""    
#10    10 Plant_C    90 No           "" 
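
A variant of the same idea, as a minimal sketch: instead of building the vectors with rep(), flag the first row of each group with row_number() and map the flag to the two values. It assumes "first" means first in the current row order within each Name. The sample data is repeated so the block runs on its own, and first_row and res are illustrative names, not part of the question.

```r
library(dplyr)

# Sample data from the question, rebuilt so the block is self-contained
df1 <- data.frame(
  Id = 1:10,
  Name = rep(c("Plant_A", "Plant_B", "Plant_C"), times = c(4, 3, 3)),
  Value = rep(c(100, 55, 90), times = c(4, 3, 3)),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(Name) %>%
  mutate(
    first_row = row_number() == 1,                   # TRUE only on the first row of each Name
    Availability = if_else(first_row, "Yes", "No"),
    Status = if_else(first_row, "0", "")             # character, to match df2
  ) %>%
  ungroup() %>%                                      # drop the grouping before further use
  select(-first_row)                                 # remove the helper column
```

The ungroup() at the end is optional, but it avoids later operations silently running per Name.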
Rui Barradas

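You can use match against the unique values of Name. match returns, for each element of its first argument, the position of its first occurrence in the second argument, so here it gives the row index of the first row for every Name.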

df1$Availability = 'No'
df1$Status = ''


df1$Availability[match(unique(df1$Name), df1$Name)] <- 'Yes'
df1$Status[match(unique(df1$Name), df1$Name)] <- 0
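
To make the indexing concrete, here is a small sketch on the question's sample data showing what match() returns, together with an equivalent base R formulation using duplicated(). The names first_idx and is_first are illustrative only.

```r
# Sample data from the question, rebuilt so the block is self-contained
df1 <- data.frame(
  Id = 1:10,
  Name = rep(c("Plant_A", "Plant_B", "Plant_C"), times = c(4, 3, 3)),
  Value = rep(c(100, 55, 90), times = c(4, 3, 3)),
  stringsAsFactors = FALSE
)

# Row positions of the first occurrence of each distinct Name
first_idx <- match(unique(df1$Name), df1$Name)
first_idx
# 1 5 8

# duplicated() marks every repeat of a value, so its negation marks first occurrences
is_first <- !duplicated(df1$Name)

df1$Availability <- ifelse(is_first, "Yes", "No")
df1$Status <- ifelse(is_first, "0", "")   # character, to match df2
```

Both forms pick the first occurrence in row order, so they also work when rows with the same Name are not adjacent.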
Ric
  • did you try with triple colon ::: operator? https://stackoverflow.com/questions/41582136/r-what-do-you-call-the-and-operators-and-how-do-they-differ – Ric Oct 12 '20 at 15:38

Here is a data.table solution for posterity:

library(data.table)

df1<-data.table(
  Id=c(1,2,3,4,5,6,7,8,9,10), 
  Name=c('Plant_A','Plant_A','Plant_A','Plant_A','Plant_B','Plant_B','Plant_B','Plant_C','Plant_C','Plant_C'), 
  Value=c(100,100,100,100,55,55,55,90,90,90))

# Add both columns by reference, one group per Name
df1[, `:=` (Availability = c("Yes", rep("No", .N-1)),
            Status = c(0, rep("", .N-1))),
    by="Name"]

df1[]
#>     Id    Name Value Availability Status
#>  1:  1 Plant_A   100          Yes      0
#>  2:  2 Plant_A   100           No
#>  3:  3 Plant_A   100           No
#>  4:  4 Plant_A   100           No
#>  5:  5 Plant_B    55          Yes      0
#>  6:  6 Plant_B    55           No
#>  7:  7 Plant_B    55           No
#>  8:  8 Plant_C    90          Yes      0
#>  9:  9 Plant_C    90           No
#> 10: 10 Plant_C    90           No
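
An alternative data.table sketch, assuming a reasonably recent data.table (rowid() and fifelse() are both exported by the package): number the rows within each Name with rowid() and derive both columns from that flag. The object name dt and the helper column is_first are illustrative, not from the question.

```r
library(data.table)

# Sample data from the question, rebuilt so the block is self-contained
dt <- data.table(
  Id = 1:10,
  Name = rep(c("Plant_A", "Plant_B", "Plant_C"), times = c(4, 3, 3)),
  Value = rep(c(100, 55, 90), times = c(4, 3, 3))
)

# rowid(Name) counts 1, 2, 3, ... within each distinct Name, in row order
dt[, is_first := rowid(Name) == 1L]

# Derive both columns from the flag; Status is character, to match df2
dt[, `:=`(Availability = fifelse(is_first, "Yes", "No"),
          Status = fifelse(is_first, "0", ""))]

# Drop the helper column
dt[, is_first := NULL]
```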
Vincent