
I want to split the data so that each element separated by ";" goes into its own column

I have tried strsplit(df$data), but then I run into an unequal-length problem I can't solve. I don't know exactly how many elements will be in each row, but it should be fewer than 6. There are a lot of rows, and I can't figure out how to deal with the uneven nature of the data and make it fit into a rectangle. Sample data:

data = c("1;Donor;Constituent;Blog Subscriber", "2;Donor;Constituent;Blog Subscriber", "3;Donor;Constituent", "4;Donor;Constituent;Blog Subscriber",  "5;Donor;Constituent", "6;NA")
df <- data.frame(data)

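# strsplit() needs a character vector and a split pattern; the result is a list of unequal-length vectors
messy = strsplit(as.character(df$data), ";")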

How do I make this so that each element separated by a ";" has its own column, with one row per original string?

Spruce Island

1 Answer


No need for data.table hieroglyphics, especially since it's unlikely you need the large-scale data that data.table was really meant for:

bits <- c("1;Donor;Constituent;Blog Subscriber", "2;Donor;Constituent;Blog Subscriber", "3;Donor;Constituent", "4;Donor;Constituent;Blog Subscriber",  "5;Donor;Constituent", "6;NA")
df <- data.frame(bits)

tidyr::separate(df, bits, sprintf("X%d", 1:4), ";")
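
The unequal-length problem from the question can also be handled in base R alone. The following is only a minimal sketch, not the approach used above: it pads every split vector with NA up to the width of the longest row and then binds the rows together. The names pieces, n_cols, padded and wide are made up for illustration.

```r
# Sample data from the question
bits <- c("1;Donor;Constituent;Blog Subscriber", "2;Donor;Constituent;Blog Subscriber",
          "3;Donor;Constituent", "4;Donor;Constituent;Blog Subscriber",
          "5;Donor;Constituent", "6;NA")

# Split each string on a literal ";" -> list of character vectors of unequal length
pieces <- strsplit(bits, ";", fixed = TRUE)

# The longest row decides how many columns are needed
n_cols <- max(lengths(pieces))

# Pad each vector with NA up to n_cols; vapply() returns an n_cols x n_rows
# matrix, so transpose it to get one row per input string
padded <- t(vapply(
  pieces,
  function(p) c(p, rep(NA_character_, n_cols - length(p))),
  character(n_cols)
))

# Convert to a data frame and name the columns X1, X2, ...
wide <- as.data.frame(padded, stringsAsFactors = FALSE)
names(wide) <- sprintf("X%d", seq_len(n_cols))
```

Because n_cols is computed from the data, the number of columns does not have to be known in advance. Short rows get a real NA in the trailing columns, while the "NA" in "6;NA" stays a literal string unless it is converted separately.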
hrbrmstr
  • 1
    Are you for real? `setDT(df)[, tstrsplit(data, ";")]` is considered *hieroglyphics* while `tidyr::separate(df, bits, sprintf("X%d", 1:4), ";")` is much better? How come? – David Arenburg Oct 21 '16 at 11:53
  • 2
    The vast majority of data.table shortcuts & short names are absolutely hieroglyphs. Their idioms aren't readily apparent to most new R users. It's a fine pkg but encourages extremely unreadable human code. That's an opinion and as valid as any you are going to counter with on the pro data.table side. – hrbrmstr Oct 21 '16 at 11:58
  • 2
    What shortcut do you see here? Please explain. `t` stands for *transpose* in base R. `strsplit` is a function from base R too. Hence `tstrsplit` would make perfect sense to someone who knows base R. Or are you suggesting that new R users are supposed to know Hadley's packages and nothing else? Maybe we will just remove base R and all the rest of the packages altogether and just make `tidyverse` the new R and make Hadley our one and only king? – David Arenburg Oct 21 '16 at 12:02
  • 1
    It's an _opinion_ and I'm totally allowed to have and express one just as you are. My _opinion_ is that encouraging data.table use == encouraging writing unreadable code for a large percentage of R users. It's valid. It's shared by others. Your opinions of data.table are valid too and shared by others. Much of base R has been drastically improved by the new idioms you referred to (not the data.table ones) and, yes, I encourage/teach the "tidyverse" to new R folks all the time. When a new/better idiom arises, I'll teach/use that. Post your answer, rage-minus mine if need be, and let the OP choose. – hrbrmstr Oct 21 '16 at 12:09
  • 1
    (comment length reached) But I'm not replying to any further ones. It's a poor use of my time. – hrbrmstr Oct 21 '16 at 12:10