I have a data set like the one below and am trying to write R code to transform it. It is an ego network: the first column (Ego) holds two people, each of whom listed their connections in columns A1, A2 and A3. Columns 5 through 10 hold the relationships among the people in A1, A2 and A3:

d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane","Sam","Jane","Sally","NA","John","Jim","NA","Jane","Sam","NA","NA","Tom"),2,10))
names(d)<-c("Ego","A1","A2","A3","A1Connection1","A1Connection2","A2Connection1","A2Connection2","A3Connection1","A3Connection2")
d

My challenge is to take columns 2 through 10 and make them look like this:

ReshapedData<-data.frame(matrix(c("John","John","Sam","Sam","Sally","Sally","Jim","Jim","Tom","Tom","Jane","Jane",
            "Sam","Sally","John","NA","Sam","NA","Jane","NA","Jim","Jane","NA","Tom"),12,2))
names(ReshapedData)<-c("Alter", "Alter_Alter")
ReshapedData

I don't need the ego name, at least at this stage; the key is to get the rest first. So far the best I can come up with is transposing columns 5-10 in each row, using rbind to create one long column, and then using cbind to join that with the list of alters in A1, A2 and A3. There has to be a more streamlined way to do this.

Thanks

Bogdan

  • Please do not add data as a picture link. Instead, copy and paste the data with indents for code formatting. For more on [How to make a great example, see this link](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Pierre L Jan 06 '17 at 18:14
  • Thanks for the note. Just updated. – Bogdan Jan 06 '17 at 21:45

1 Answer

Use the melt() function from the reshape package and match items on a common index:

d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane","Sam","Jane","Sally","NA","John","Jim","NA","Jane","Sam","NA","NA","Tom"),2,10))
names(d)<-c("Ego","A1","A2","A3","A1Connection1","A1Connection2","A2Connection1","A2Connection2","A3Connection1","A3Connection2")
d

library(reshape)

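# Stack the alter columns; key each alter as "<A-column> <row number>" (d has two rows)
a <- melt(d, id.vars = NULL, measure.vars = c("A1", "A2", "A3"))
a$match <- paste(a[, 1], 1:2)
# Stack the connection columns (5 onward); ncol(d) replaces dim(df)[2], which referred to an undefined data frame
b <- melt(d, id.vars = NULL, measure.vars = 5:ncol(d))
# Reduce e.g. "A1Connection2" to "A1" so the key lines up with a$match
b$match <- paste(gsub(pattern = ".*A([0-9]+).*", replacement = "A\\1", x = b[, 1]),
                 1:2)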

df.final <- data.frame(Alter=a$value[match(b$match,a$match)], Alter_Alter=b$value)

# Reorder so that all rows for the first ego come before those for the second
index <- matrix(1:nrow(df.final), nrow = nrow(df.final) / 2, byrow = TRUE)

df.final <- df.final[as.vector(index),]

df.final
   Alter Alter_Alter
1   John         Sam
3   John       Sally
5    Sam        John
7    Sam          NA
9  Sally         Sam
11 Sally          NA
2    Jim        Jane
4    Jim          NA
6    Tom         Jim
8    Tom        Jane
10  Jane          NA
12  Jane         Tom
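
The row labels 1, 3, 5, ... in the output above come from the index step: filling a matrix by row and then flattening it by column (which is what as.vector() does) lists the odd positions first and the even positions second. A minimal standalone sketch of just that step, assuming 12 stacked rows that alternate between two egos (the name order_idx is illustrative, not part of the answer):

```r
n <- 12  # number of rows in the stacked result

# Fill by row: each matrix row holds one (ego 1, ego 2) pair of positions
index <- matrix(seq_len(n), nrow = n / 2, byrow = TRUE)

# Flatten by column: all ego-1 positions first, then all ego-2 positions
order_idx <- as.vector(index)
print(order_idx)
# [1]  1  3  5  7  9 11  2  4  6  8 10 12
```

With more than two egos the same idea should carry over by giving the matrix one column per ego, though that case is not covered by the example data.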

# Test

ReshapedData<-data.frame(matrix(c("John","John","Sam","Sam","Sally","Sally","Jim","Jim","Tom","Tom","Jane","Jane",
            "Sam","Sally","John","NA","Sam","NA","Jane","NA","Jim","Jane","NA","Tom"),12,2))
names(ReshapedData)<-c("Alter", "Alter_Alter")

df.final==ReshapedData

   Alter Alter_Alter
1   TRUE        TRUE
3   TRUE        TRUE
5   TRUE        TRUE
7   TRUE        TRUE
9   TRUE        TRUE
11  TRUE        TRUE
2   TRUE        TRUE
4   TRUE        TRUE
6   TRUE        TRUE
8   TRUE        TRUE
10  TRUE        TRUE
12  TRUE        TRUE
aldo_tapia
  • Thank You so much. Perfect! – Bogdan Jan 06 '17 at 22:13
  • There is one issue I see. In the reshaped data frame John in column 1 should have two connections Sam and Sally. In the df.final it's pulling Sam and Jane, so going down the column instead of going across the columns 5 and 6. Is that something that can be easily fixed with indexing? – Bogdan Jan 07 '17 at 00:13
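
For reading across the connection columns row by row, as the last comment asks, one option that needs no extra package is to transpose and flatten, close to the transposing idea in the question. This is only a sketch under two assumptions: every alter has the same number of connection columns, and those columns are ordered A1's first, then A2's, then A3's. The names alters, conns, n_conn and reshaped are illustrative.

```r
# Rebuild the example data from the question (strings kept as character)
d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane",
                         "Sam","Jane","Sally","NA","John","Jim","NA","Jane",
                         "Sam","NA","NA","Tom"), 2, 10),
                stringsAsFactors = FALSE)
names(d) <- c("Ego","A1","A2","A3","A1Connection1","A1Connection2",
              "A2Connection1","A2Connection2","A3Connection1","A3Connection2")

alters <- as.matrix(d[, c("A1", "A2", "A3")])  # one row per ego, one column per alter
conns  <- as.matrix(d[, 5:ncol(d)])            # connection columns, grouped by alter
n_conn <- ncol(conns) / ncol(alters)           # connections per alter (2 here)

# t() turns each ego's row into a column, so as.vector() reads across the original rows
reshaped <- data.frame(
  Alter       = rep(as.vector(t(alters)), each = n_conn),
  Alter_Alter = as.vector(t(conns)),
  stringsAsFactors = FALSE
)
reshaped
```

On the example data this gives John paired with Sam and Sally, then Sam with John and "NA", and so on, in the same order as ReshapedData. Note that the "NA" entries are the literal strings from the example, not missing values.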