
I have two data vectors (datA and datB) that need to be combined into a single data frame. It looked straightforward, but my attempt below failed:

datA <- c("uuw", "aat", "auyt", "uut")
datB <- c("mmu", "asty", "wou")

XX <- data.frame(m=rep(NA, datA),y=rep(NA, datB))

My attempt generated the following errors:

Error in rep(NA, datA) : invalid 'times' argument
In addition: Warning message:
In data.frame(m = rep(NA, datA), y = rep(NA, datB)) :
NAs introduced by coercion

Please help!

user27976

5 Answers

Here is a simple version that takes advantage of length<-:

cols <- list(m=datA, y=datB)
as.data.frame(lapply(cols, `length<-`, max(sapply(cols, length)))) 

Produces

     m    y
1  uuw  mmu
2  aat asty
3 auyt  wou
4  uut <NA>
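
As a small illustration of why the one-liner above works, the sketch below applies the `length<-` replacement function to a single vector. It only uses base R and the datB vector from the question; the names padded and same are made up for this example.

```r
# Example vector taken from the question
datB <- c("mmu", "asty", "wou")

# Assigning a longer length pads the end of the vector with NA
padded <- datB
length(padded) <- 4
print(padded)

# The backtick form calls the same replacement function directly,
# which is what lets lapply() apply it to every element of a list
same <- `length<-`(datB, 4)
print(identical(padded, same))
```

Because every vector is padded to the same length, as.data.frame() no longer sees columns of different sizes.
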
BrodieG

If you want to combine the vectors into a data frame without recycling the values of datB, you can define a cbind.fill helper function:

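cbind.fill <- function(...) {
    nm <- lapply(list(...), as.matrix)   # treat every input as a matrix
    n <- max(sapply(nm, nrow))           # row count of the longest input
    # pad each matrix with NA rows up to n rows, then bind the columns
    do.call(cbind, lapply(nm, function(x)
        rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}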

XX <- data.frame(cbind.fill(datA,datB))
colnames(XX) <- c("m","y")
Jonas Tundo
  • Quick question: I was not able to get rid of 'NA' in the cbind.fill output using XX[is.na(XX)] <- "". Is there a better way? – user27976 Mar 26 '14 at 13:00
  • check http://stackoverflow.com/questions/8161836/how-do-i-replace-na-values-with-zeros-in-r – Jonas Tundo Mar 26 '14 at 13:05
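
On the follow-up about blanking out NA: the thread does not show what went wrong, so the cause here is an assumption. In R versions before 4.0.0, data.frame() converted strings to factors by default, and "" is not an existing factor level, so the assignment cannot insert it. A minimal sketch that keeps the columns as character, padding by indexing past the end of the shorter vector rather than with cbind.fill:

```r
datA <- c("uuw", "aat", "auyt", "uut")
datB <- c("mmu", "asty", "wou")

# Indexing beyond the end of a vector yields NA, so both columns get n rows
n <- max(length(datA), length(datB))
XX <- data.frame(m = datA[seq_len(n)],
                 y = datB[seq_len(n)],
                 stringsAsFactors = FALSE)  # keep strings as character

# With character columns, NA can be replaced by an empty string
XX[is.na(XX)] <- ""
print(XX)
```

The same stringsAsFactors = FALSE argument can be passed to the data.frame() call that wraps cbind.fill.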

Not sure why you are trying to create a data.frame filled with NAs, but this should work:

datA <- c("uuw", "aat", "auyt", "uut")
datB <- c("mmu", "asty", "wou")
n <- max(length(datA), length(datB))
XX <- data.frame(m = rep(NA, n), y = rep(NA, n))
David Arenburg

One can't create an uneven data.frame. If you would like a "jagged" data structure in R, lists are the way to go. Their elements can be named, much like the columns of a data.frame.

XX <- list( datA = c("uuw", "aat", "auyt", "uut"), datB = c("mmu", "asty", "wou"))
XX
$datA
[1] "uuw"  "aat"  "auyt" "uut" 

$datB
[1] "mmu"  "asty" "wou"
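
Since each component of the list keeps its own length, functions can be applied to them one at a time. A brief sketch, assuming R 3.2.0 or later for lengths(); the names sizes and upper are only illustrative:

```r
XX <- list(datA = c("uuw", "aat", "auyt", "uut"),
           datB = c("mmu", "asty", "wou"))

# Number of elements in each component of the list
sizes <- lengths(XX)
print(sizes)

# Apply a function to every component, regardless of its length
upper <- lapply(XX, toupper)
print(upper$datB)
```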

Elements can then be accessed as:

XX$datA[1]
[1] "uuw"
XX[["datA"]][2]
[1] "aat"

In your example, as Roland mentioned, you're filling your data.frame with NAs. You also have a bug: you pass datA and datB themselves to rep rather than length(datA) and length(datB).

Dave's solution solves your problem by introducing NAs into the data.frame. Which solution to choose depends on your usage.
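
To make the rep bug concrete, here is a rough sketch using the vectors from the question. It shows that passing lengths fixes the call to rep, but that data.frame still refuses columns of different lengths; the variable res is just a name chosen here to hold the captured error message.

```r
datA <- c("uuw", "aat", "auyt", "uut")
datB <- c("mmu", "asty", "wou")

# rep() expects a count for its second argument, so pass the lengths
m <- rep(NA, length(datA))
y <- rep(NA, length(datB))

# The columns still differ in length (4 vs 3), so data.frame() raises an
# error; tryCatch() captures its message instead of stopping the script
res <- tryCatch(data.frame(m = m, y = y),
                error = function(e) conditionMessage(e))
print(res)
```

This is why the other answers either pad the shorter vector with NA or fall back to a list.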

M M
  • Thanks everyone for your useful suggestions and code. cbind.fill was useful for my purpose. Thanks again JT85! – user27976 Mar 26 '14 at 12:45

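Use the lists as rows (index) instead of columns and transpose afterwards. Note that this example uses pandas in Python, not R:

import pandas as pd

l1 = [1, 1]
l2 = [2, 2, 2, 2]

# each list becomes a row; shorter rows are padded with NaN
df = pd.DataFrame([l1, l2], index=['l1', 'l2'])
df.T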

#     l1   l2
# 0  1.0  2.0
# 1  1.0  2.0
# 2  NaN  2.0
# 3  NaN  2.0
Sebastian