
I am relatively new to R. I looked for an answer on the forum without success. I need to align values coming from 2 different columns. For example, I have:

Educ Wage
a    e
a    j
e    a
j    a

and I need:

Educ Wage
a    a
a    a
j    j
e    e

Note that I have to do this on a large dataset, and that the columns "Educ" and "Wage" contain a different number of values.

I want to thank the Stack Overflow community, and again I am very sorry if my question has already been asked.

J. Perez
  • check this one out: https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right – ygaft May 24 '18 at 08:47
  • Your problem is not sufficiently defined. If both columns really contain exactly the same values as in your example, you can just copy one column to have it twice. – Roland May 24 '18 at 08:50
  • @ygaft I already looked at this post. The problem is that it works for multiple datasets, but I am working on a single dataset. – J. Perez May 24 '18 at 08:54
  • @Roland, I need code to automatically align values which are the same in two different columns. Or you might know a trick in Excel. I think my problem is very trivial. – J. Perez May 24 '18 at 08:56
  • Following @Roland's point, what would you want to happen when the columns have different data, e.g. `data.frame(x = c(1, 2, 1), y = c(3, 1, 2))`? – Mikko Marttila May 24 '18 at 09:05
  • I suspect you want to `merge` two data.frames. Difficult to say without a representative example. – Roland May 24 '18 at 09:24
  • @MikkoMarttila I want to put an NA if the x column does not match the y column. – J. Perez May 24 '18 at 09:40
  • @Roland I am working on a single dataset. – J. Perez May 24 '18 at 09:41

2 Answers

If I understand your question correctly, try adapting this code:

df_ord <- data.frame(cbind(Educ = as.character(df[order(df$Educ), "Educ"]), Wage = as.character(df[order(df$Wage), "Wage"])))
df_ord
  Educ Wage
1    a    a
2    a    a
3    e    e
4    j    j

Input data:

df<-data.frame(Educ=c("a","a","e","j"),Wage=c("e","j","a","a"))
> df
  Educ Wage
1    a    e
2    a    j
3    e    a
4    j    a

WARNING: This approach works only if both columns contain the same values, each occurring the same number of times.
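
That precondition can be checked before sorting. The following is only a minimal sketch: the helper name `same_values` and the example vectors are hypothetical and not part of the code above. It assumes that comparing the sorted values of both columns is a good enough test.

```r
# Hypothetical helper: TRUE when both vectors hold the same values the same
# number of times, regardless of order. sort() drops NA entries by default,
# so NAs are ignored by this check.
same_values <- function(a, b) {
  identical(sort(as.character(a)), sort(as.character(b)))
}

educ <- c("a", "a", "e", "j")
wage <- c("e", "j", "a", "a")

same_values(educ, wage)                # same values, different order
same_values(c("a", "b"), c("a", "c"))  # values differ
```

If the check fails, sorting each column independently would pair up values that do not match.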

Terru_theTerror

If your columns contain exactly the same data but in a different order, then this would work:

df <- data.frame(Educ = c("a","a","e","j"), Wage = c("e","j","a","a"), stringsAsFactors = FALSE)
df$Educ <- sort(df$Educ)
df$Wage <- sort(df$Wage)

EDIT: If the values are not the same and the column lengths differ, then I can think of this data.table solution:

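library(data.table)

# Two columns of different length with partly different values
dt.left <- data.table(x = c("a","z","a","b","c","d","b","c"))
dt.rght <- data.table(x = c("a","e","c","b","f"))

# Number each repeated value (a1, a2, ...) to get a unique join key
dt.left[ , dum := paste0(x, seq_len(.N)), by = "x"]
dt.rght[ , dum := paste0(x, seq_len(.N)), by = "x"]

# Full outer join: unmatched values get NA on the other side
merge(dt.left, dt.rght, by = "dum", all = TRUE)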

You can also use `all.x` or `all.y` instead of `all` to keep unmatched rows from only one side. Since this uses data.table, it will also be fast on big data sets.
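
For two plain vectors of different length, the same idea of numbering repeated values can also be sketched in base R. This is only an illustration under assumptions: the vectors `educ` and `wage`, the helper `key`, and the column names are hypothetical stand-ins for the data described in the question.

```r
# Two vectors of different length (hypothetical example data).
educ <- c("a", "a", "e", "j", "z")
wage <- c("e", "j", "a")

# Number each repeated value (a1, a2, ...) so duplicates pair up one-to-one.
key <- function(v) paste0(v, ave(seq_along(v), v, FUN = seq_along))

left  <- data.frame(key = key(educ), Educ = educ, stringsAsFactors = FALSE)
right <- data.frame(key = key(wage), Wage = wage, stringsAsFactors = FALSE)

# Full outer join: values present on one side only get NA on the other.
aligned <- merge(left, right, by = "key", all = TRUE)
aligned[, c("Educ", "Wage")]
```

If the columns are padded with NA, as described in the comments, one option is to drop those entries first, e.g. `educ <- educ[!is.na(educ)]`, before building the keys.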

moooh
  • Thanks for your reply. I just edited my post: the number of observations in the 2 columns is different. – J. Perez May 24 '18 at 09:19
  • Hey @J.Perez. Since columns in a data frame are always of the same length, do you have vectors instead? Or do your columns contain NAs? – moooh May 24 '18 at 09:25
  • I want to be able to put NAs if the left column does not match the right column. At the moment, I just have two columns of different lengths which contain NAs. – J. Perez May 24 '18 at 09:45