split and add columns to a dataframe - R

Question

I have a dataframe (built from consecutive pairs) with the following structure:

Name    Var1    Var2    Comments
A       0.5     4       Zone1
A1      0.9     5       Zone2
B       0.3     55      Zone1
B1      0.1     7       Zone2
C       2       8       Zone1
C       3       9       Zone2
D       0.2     0.6     Zone1
D       3       4       Zone2

I wish to split it and save it in new dataframe with the following structure:

Name    Var1    Var2    Comments    Name.1  Var3    Var4    Comments.1
A       0.5     4       Zone1       A1      0.9     5       Zone2
B       0.3     55      Zone1       B1      0.1     7       Zone2
C       2       8       Zone1       C       3       9       Zone2
D       0.2     0.6     Zone1       D       3       4       Zone1

Is it possible? can some one help me.

Thanks

score 2 · Accepted Answer · answered Feb 10 '17 at 07:24

2

You should be able to split the data by Zone (I'm assuming the Zone1 in Comments.1 is a mistake...) and then cbind them back together:

wide<-cbind(df[df$Comments=="Zone1",],df[df$Comments=="Zone2",])

Then you just need to change the column names if you want as in your example (e.g., to get Name.1 for the 2nd Name column)

answered Feb 10 '17 at 07:24

MPhD

456
2
9

Thanks, Indeed, Zone1 was an error. As I get duplicate headers (Var1, Var2 and Comments) is it possible that they will be named Var1.1, Var2.1 and Comments.1 (It will be easier to run sertnames)? – hs100 Feb 10 '17 at 07:52
This works: colnames(wide)<-c(colnames(df), paste(colnames(df),".1", sep="")). Though there may be a better way to do all this with reshape or spread...the tricky bit to me is that you don't seem to have a clear identifier as to which Zone 1 and Zone 2 match...other than doing something that matches the first letter of the Name.... – MPhD Feb 10 '17 at 08:17
This way leaves potential issues with your data getting mis-sorted. It would be better with a key variable linking matched Zones, then using tidyr or reshape to go to wide format from there...based on the link from Akrun to the duplicate answer by Arun, you might be able to use: library(data.table) df<-data.table(df) df$ID<-substr(df$Name,1,1) dcast(wide2, ID~rowid(ID), value.var=c("Name","Var1", "Var2", "Comments", sep=".")) Though I can't get rowid to work; you might need a different/more recent version of data.table? But I don't want to restart R right now to find out... ;) – MPhD Feb 10 '17 at 08:38

split and add columns to a dataframe - R

1 Answers1