Based on the input shown, the values in the second column may be a single string. One option is to extract the letters from the 'ram' column with str_extract_all (stringr), stack the result into a two-column data.frame, and get the frequency count with table after converting the 'values' column to a factor with levels specified, so that we get 0 for every level that is not found in the dataset. Finally, reshape it to 'long' format with as.data.frame.
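library(stringr)
# extract every lowercase letter per row, name each list element by row number, then stack to columns (ind, values)
df2 <- stack(setNames(str_extract_all(df1$ram, '[a-z]'), seq_len(nrow(df1))))[2:1]
# count letters per row; fixed factor levels keep unseen letters with a count of 0, then drop the row-id column
out <- as.data.frame(table(df2$ind, factor(df2$values, levels = letters[1:6])))[-1]
names(out) <- names(df1)
out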
# items ram
#1 a 1
#2 b 1
#3 c 1
#4 d 0
#5 e 0
#6 f 0
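If adding a stringr dependency is not desirable, the same steps can be sketched in base R. This is only an alternative under the same assumptions as above (lowercase letters inside a bracketed string, levels a to f); it swaps str_extract_all for regmatches with gregexpr, and the intermediate name lst is made up for illustration.

```r
# same example data as in the 'data' section
df1 <- data.frame(items = 'x1', ram = '[a,b,c]', stringsAsFactors = FALSE)

# base R stand-in for str_extract_all: one character vector of letters per row
lst <- regmatches(df1$ram, gregexpr('[a-z]', df1$ram))

# stack into (ind, values), count with fixed levels, drop the row-id column
df2 <- stack(setNames(lst, seq_len(nrow(df1))))[2:1]
out <- as.data.frame(table(df2$ind, factor(df2$values, levels = letters[1:6])))[-1]
names(out) <- names(df1)
out
```

The rest of the pipeline is unchanged, so the letters that do not occur in the string still show up with a count of 0.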
data
df1 <- data.frame(items = 'x1', ram = '[a,b,c]', stringsAsFactors = FALSE)