
I have several big data frames which would be much easier to further analyse if I could add certain numbers in an additional column called "event".

E.g. rows 19 to 53 should all have the number 1, as they belong to event no. 1, rows 65 to 92 the number 2, etc. Not every row gets a number, because some data will be left out of further analysis (left-out rows could get 0 or NA if the column needs a value in every row).

Thanks in advance!

I just know how to add a column but not how to attach specific data for certain rows.

Kingpin96
  • Try `library(dplyr);df1 %>% mutate(grp = case_when(row_number() %in% 19:53 ~ 1, row_number() %in% 65:92 ~ 2))` – akrun Apr 27 '23 at 12:31
  • Adding a column is easy: simply `data$new_columns <- x`. But your question seems to have more to do with how to obtain `x`, that is, what's the logic of the vector that has the data of events. Please, explain a little more about your data and how you want to classify it. Check this post on how to make reproducible examples: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. – Santiago Apr 27 '23 at 12:32
  • If you know the row numbers, you could just use the `rep()` function: `data$event = c(rep(0, 18), rep(1, 35), rep(0, 11), rep(2, 28))`, meaning you repeat '0' 18 times (0 being the 'no event' code), then '1' 35 times (rows 19 to 53), and so on; the resulting vector has to be as long as the data frame has rows – Dimitri Apr 27 '23 at 12:33
  • @akrun, that worked, thanks! Not sure if it's the easiest way, but I get what I want. :) – Kingpin96 Apr 27 '23 at 12:41

1 Answer


We could use `case_when` to assign a value to specific row indices; rows that match none of the conditions get `NA`.

library(dplyr)
df1 %>%
   mutate(grp = case_when(row_number() %in% 19:53 ~ 1, 
   row_number() %in% 65:92 ~ 2))
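
To check the result on a small example, here is a minimal sketch. The toy data frame with 100 rows and its `value` column are assumptions for illustration only, and the fallback `TRUE ~ 0` is one way to give left-out rows a 0 instead of `NA`, as the question asks.

```r
library(dplyr)

# Hypothetical toy data: 100 rows with a single placeholder column
df1 <- data.frame(value = seq_len(100))

df1 <- df1 %>%
  mutate(event = case_when(
    row_number() %in% 19:53 ~ 1,
    row_number() %in% 65:92 ~ 2,
    TRUE ~ 0  # rows outside every event; drop this line to get NA instead
  ))

# Count how many rows fall into each event code
table(df1$event)
```

Since all three right-hand sides are plain numbers, the new column is numeric; if `NA` is preferred for left-out rows, the fallback line can simply be removed.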

Or, using base R, create the column first and then fill the specific indices by replicating each value once per row in its range, i.e. 53 - 19 + 1 = 35 times and 92 - 65 + 1 = 28 times:

df1$grp <- NA_real_
df1$grp[c(19:53, 65:92)] <- rep(c(1, 2), c(35, 28))
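
With many events, counting rows by hand gets error-prone. One possible variant, sketched here under assumptions, keeps the row ranges in a small lookup table and fills the column in a loop. The `events` table with its `id`, `start` and `end` columns and the 100-row toy data frame are hypothetical names for illustration, not part of the original data.

```r
# Hypothetical toy data and event table (first and last row of each event)
df1 <- data.frame(value = seq_len(100))
events <- data.frame(id = c(1, 2), start = c(19, 65), end = c(53, 92))

# Start with NA everywhere, so rows outside every event stay NA
df1$event <- NA_real_

# Fill each event's row range with its id
for (i in seq_len(nrow(events))) {
  rows <- events$start[i]:events$end[i]
  df1$event[rows] <- events$id[i]
}
```

The same `events` table could be reused across several data frames, as long as the row ranges apply to each of them.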
akrun