
So, I have some data that looks like this:
Current Data Sample

dput(df)
structure(list(Parties = structure(c(2L, 3L, 1L), .Label = c("City (Plaintiff) Doe, John (Defendant)", 
"John Doe Enterprises Inc (Plaintiff) Doe, John (Defendant) Does 1-5, John (Defendant)", 
"John Doe Properties LLC (Plaintiff) Doe, Jane (Defendant) Doe, Jane (Defendant)"
), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

And I'm hoping to split it out so that I have a couple columns for plaintiff and a couple for defendant, like this (or something similar):

[Image: Goal Data Sample]

But I'm having trouble figuring out what my R code should look like to do this. Any thoughts?

  • Done @akrun. Still figuring out how this website works... – Lantz McGinnis-Brown Apr 22 '20 at 21:25
  • Your expected output seems to have multiple columns with a single value. Can you please elaborate? We cannot have a data.frame with cell values merged across columns for some rows. – akrun Apr 22 '20 at 21:26
  • Lantz, if you mouse over the [tag:r] tag under your question, it makes several suggestions including the use of `dput(...)`. I know that most people don't really pay attention to the welcome [tour] or look at [help] or even [ask] or [MCVE], but the help section of StackOverflow *does* have some hints here and there. The best not included in that bunch (in my opinion) is https://stackoverflow.com/q/5963269. (This is certainly not meant as criticism, just offering other places to look to learn how SO tends to do things. Thanks!) – r2evans Apr 22 '20 at 21:28
  • @akrun, if I understand your point correctly, I am looking for the contents of the original Parties column to be broken out by the individual people/organizations within each cell of that column, with new columns being created to designate the titles given in parentheses. So a cell that says "Person 1 (A), Person 2 (B)" would be split into an "A" column with "Person 1" in the first row, and a "B" column with "Person 2" in the first row. The tricky part is that some cells have multiple entries of the same type (two defendants, for example). – Lantz McGinnis-Brown Apr 22 '20 at 22:13
  • Here's a possible approach: `library(splitstackshape); library(data.table); getanID(cSplit(cSplit(data.table(df, id = 1:nrow(df)), "Parties", ")", "long"), "Parties", "("), c("id", "Parties_2"))[, dcast(.SD, id ~ Parties_2 + sub_id, value.var = "Parties_1")]` – A5C1D2H2I1M1N2O1R2T1 May 02 '20 at 06:14
  • Here's another: `library(tidyverse); rownames_to_column(df, var = "id") %>% separate_rows(Parties, sep = "\\)") %>% filter(Parties != "") %>% separate(Parties, into = c("Party", "Group"), sep = " \\(") %>% group_by(id, Group) %>% mutate(time = sequence(n())) %>% unite(grp, Group, time) %>% pivot_wider(names_from = grp, values_from = Party)` – A5C1D2H2I1M1N2O1R2T1 May 02 '20 at 06:14
  • @A5C1D2H2I1M1N2O1R2T1 Thank you! That worked! – Lantz McGinnis-Brown May 07 '20 at 22:32
  • @LantzMcGinnis-Brown, feel free to add it as a self-answer and accept the answer to "close" the question. – A5C1D2H2I1M1N2O1R2T1 May 09 '20 at 20:20
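
For readability, here is the tidyverse one-liner from the comments above spelled out step by step. It is only a sketch of that approach, not a definitive solution. Three steps are additions that were not in the original comment: converting the factor to character up front, trimming leading whitespace from the party names, and calling `ungroup()` before `unite()`. The name `wide` is chosen here purely for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Rebuild the sample data from the dput() output in the question
df <- data.frame(
  Parties = c(
    "John Doe Enterprises Inc (Plaintiff) Doe, John (Defendant) Does 1-5, John (Defendant)",
    "John Doe Properties LLC (Plaintiff) Doe, Jane (Defendant) Doe, Jane (Defendant)",
    "City (Plaintiff) Doe, John (Defendant)"
  ),
  stringsAsFactors = TRUE
)

wide <- df %>%
  # Assumption: work on character data rather than a factor
  mutate(Parties = as.character(Parties)) %>%
  # Keep track of which original row each party came from
  rownames_to_column(var = "id") %>%
  # One row per "Name (Role" chunk, splitting on the closing parenthesis
  separate_rows(Parties, sep = "\\)") %>%
  # The split leaves an empty string after the last ")"
  filter(Parties != "") %>%
  # Split "Name (Role" into the name and the role
  separate(Parties, into = c("Party", "Group"), sep = " \\(") %>%
  # Assumption: strip the leading space left on the 2nd, 3rd, ... party
  mutate(Party = trimws(Party)) %>%
  # Number repeated roles within each original row (Defendant 1, 2, ...)
  group_by(id, Group) %>%
  mutate(time = sequence(n())) %>%
  ungroup() %>%
  # Build column names such as "Defendant_2"
  unite(grp, Group, time) %>%
  # One column per role/number combination, one row per original row
  pivot_wider(names_from = grp, values_from = Party)

print(wide)
```

On the sample data this should give one row per original row, with columns `id`, `Plaintiff_1`, `Defendant_1` and `Defendant_2`; rows with only one defendant get `NA` in `Defendant_2`.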

0 Answers