
This follows on from another question, "Extracting from Nested list to data frame".

Using the updated answer there, I get the data frame that I start with.

I then use df <- data.frame(start = df3[5,])

So I'm left with:

dput(df)
structure(list(start.X1_1 = structure(4L, .Names = "experience.start", .Label = c("", 
" ", "1", "2015"), class = "factor"), start.X2_2 = structure(3L, .Names = "experience.start", .Label = c(" ", 
"1", "2011"), class = "factor"), start.X3_2 = structure(3L, .Names = "experience.start", .Label = c(" ", 
"1", "2007"), class = "factor"), start.X4_2 = structure(NA_integer_, .Names = "experience.start", .Label = c(" ", 
"1"), class = "factor"), start.X5_2 = structure(NA_integer_, .Names = "experience.start", .Label = c(" ", 
"1"), class = "factor"), start.X6_2 = structure(NA_integer_, .Names = "experience.start", .Label = c(" ", 
"1"), class = "factor"), start.X7_2 = structure(NA_integer_, .Names = "experience.start", .Label = c(" ", 
"1"), class = "factor"), start.X8_2 = structure(NA_integer_, .Names = "experience.start", .Label = c(" ", 
"1"), class = "factor"), start.X9_2 = structure(NA_integer_, .Names = "experience.start", .Label = c(" ", 
"1"), class = "factor"), start.X10_3 = structure(3L, .Names = "experience.start", .Label = c(" ", 
"1", "2016", "3000"), class = "factor"), start.X11_3 = structure(3L, .Names = "experience.start", .Label = c(" ", 
"1", "2015", "3000"), class = "factor"), start.X12_3 = structure(4L, .Names = "experience.start", .Label = c("", 
" ", "1", "2015", "2016", "EE"), class = "factor"), start.X13_3 = structure(4L, .Names = "experience.start", .Label = c("", 
" ", "1", "2014", "2015"), class = "factor"), start.X14_3 = structure(3L, .Names = "experience.start", .Label = c(" ", 
"1", "2013", "2014"), class = "factor"), start.X15_3 = structure(3L, .Names = "experience.start", .Label = c(" ", 
"1", "2010", "2011", "Virtusa"), class = "factor")), .Names = c("start.X1_1", 
"start.X2_2", "start.X3_2", "start.X4_2", "start.X5_2", "start.X6_2", 
"start.X7_2", "start.X8_2", "start.X9_2", "start.X10_3", "start.X11_3", 
"start.X12_3", "start.X13_3", "start.X14_3", "start.X15_3"), row.names = "experience.start", class = "data.frame")

Now I'd like to get to the format:

  v1   v2   v3   v4   v5   v6   v7   v8
1 2015
2 2011 2007 null null null null null null
3 2016 2015 2015 2014 2013 2010

I can use the following to find the columns that match

sR <- function(x, n){
    substr(x, nchar(x)-n+1, nchar(x))}

 sR(names(df),2)
 [1] "_1" "_2" "_2" "_2" "_2" "_2" "_2" "_2" "_2" "_3" "_3" "_3" "_3" "_3" "_3"

So I think there must be a way to get from here to my desired output.

Or I'm sure someone will show me a better way.

Olivia

1 Answer

The main idea is to split your data frame on the suffix after the underscore in each column name. That gives you a list with 3 elements, one for each suffix (in your case 1, 2 and 3).

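# Convert the factor columns to character so that stack() can handle them
df[] <- lapply(df[], as.character)
# Stack to long format, then split the values on the numeric suffix of the column name
l1 <- lapply(split(stack(df), as.numeric(sub('.*_', '', stack(df)[,2]))), '[', 1)
# Show the first two rows of each group
lapply(l1, head, 2)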

#$`1`
#  values
#1   2015

#$`2`
#  values
#2   2011
#3   2007

#$`3`
#   values
#10   2016
#11   2015

Now all we need to do is cbind those 3 elements together, which is a bit tricky because their lengths differ. Luckily there are great answers here on SO that take care of that problem (see the disclaimer below).

t(do.call(cbindPad, l1))

#       1      2      3      4      5      6      7  8 
#values "2015" NA     NA     NA     NA     NA     NA NA
#values "2011" "2007" NA     NA     NA     NA     NA NA
#values "2016" "2015" "2015" "2014" "2013" "2010" NA NA

DISCLAIMER

The function cbindPad was taken from @Joran's answer in this post
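
Since cbindPad is not part of base R, it has to be defined before the call above will run. If you would rather not copy it, the following is a minimal base R sketch of the same padding idea. It is not @Joran's original function; the helper name pad_rows and the toy groups list are made up for illustration, with values mirroring the three groups above.

```r
# Hypothetical helper (an assumption, not from the linked post):
# pad every vector with NA up to the longest length, then bind them as rows.
pad_rows <- function(vectors) {
  n <- max(lengths(vectors))
  padded <- lapply(vectors, function(v) {
    length(v) <- n  # extending a vector fills the new slots with NA
    v
  })
  do.call(rbind, padded)
}

# Toy input: one character vector per suffix group
groups <- list(
  `1` = "2015",
  `2` = c("2011", "2007", rep(NA, 6)),
  `3` = c("2016", "2015", "2015", "2014", "2013", "2010")
)

res <- pad_rows(groups)
print(res)
```

With the l1 from above, something like pad_rows(lapply(l1, '[[', 'values')) should produce the same 3 x 8 layout, assuming each element of l1 is a one-column data frame named values.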

Alternatively, the rbind.fill function in the plyr package can be used after transposing each element, which gives a sort of cbind.fill result.

plyr::rbind.fill(lapply(l1, function(i) as.data.frame(t(i))))

#     1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
#1 2015 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#2 <NA> 2011 2007 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 2016 2015 2015 2014 2013 2010
Sotos