We create a logical index for rows that have no numbers ('i1'), get the cumulative sum of that, split
the dataset with that grouping index, extract the 'year', 'month' with (sub
), and the 'Country' as the first element, create a data.frame
, and rbind
the list
contents.
i1 <- grepl('^[^0-9]+$', exampledf$Col1)
lst <- lapply(split(exampledf, cumsum(i1)), function(x)
data.frame(year= as.numeric(sub('\\/.*', '', x[-1,1])),
month = as.numeric(sub('.*\\/', '', x[-1,1])),
Country = x[1,1] ) )
res <- do.call(rbind, lst)
row.names(res) <- NULL
res
# year month Country
#1 2005 12 Argentina
#2 2005 11 Argentina
#3 2006 12 Bolivia
Or using data.table
, we convert the 'data.frame' to 'data.table' (setDT(exampledf)
), grouped by the cumsum
of the index (from above), we split ( tstrsplit
) on the 'Col1' (removing the first element) with delimiter (/
). We get two columns out of that. Then, concatenate the first element to create three columns and change the column names with setnames
. If we don't need the grouping variable, it can be assigned (:=
) to NULL.
library(data.table)
res1 <- setDT(exampledf)[, c(tstrsplit(Col1[-1],
'/'),Country = Col1[1L]), .(i2=cumsum(i1))][,i2:= NULL][]
setnames(res1, 1:2, c('year', 'month'))
data
exampledf<-data.frame(Col1=c("Argentina","2005/12","2005/11",
"Bolivia","2006/12"),stringsAsFactors=FALSE)