
Does anyone know how to combine/join files that have different numbers of rows and columns using R?

Thanks

gccd
  • Please add an example with sample input data and the expected output. There are multiple ways to combine data depending on what you actually want. We can't tell whether you need to bind rows, bind columns, join, or something else. – Adam Sampson Aug 01 '18 at 21:50
  • See [How to join (merge) data frames (inner, outer, left, right)?](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – Maurits Evers Aug 01 '18 at 22:04

1 Answer


You should use `merge`, but beforehand you need to read the files into memory, e.g. with `read.csv`. Assuming the data is already loaded, see below:

# simulation of data to be merged
set.seed(123)
x <- data.frame(id = letters[1:10], valx = rnorm(10))
dim(x)
# [1] 10  2
# 10 rows, 2 columns

y <- data.frame(id = sample(letters[1:10], 5), valy = rnorm(5), valz = LETTERS[3:7]) 
dim(y)
# [1] 5 3
# 5 rows, 3 columns

merge(x, y, by = "id", all.x = TRUE)

Data frame x:

   id        valx
1   a -0.56047565
2   b -0.23017749
3   c  1.55870831
4   d  0.07050839
5   e  0.12928774
6   f  1.71506499
7   g  0.46091621
8   h -1.26506123
9   i -0.68685285
10  j -0.44566197

Data frame y:

  id       valy valz
1  i  0.5490967    C
2  g  0.2382129    D
3  f -1.0488931    E
4  j  1.2947633    F
5  d  0.8255398    G

Merged data frame (all rows of the first data frame `x` are preserved):

   id        valx       valy valz
1   a -0.56047565         NA <NA>
2   b -0.23017749         NA <NA>
3   c  1.55870831         NA <NA>
4   d  0.07050839  0.8255398    G
5   e  0.12928774         NA <NA>
6   f  1.71506499 -1.0488931    E
7   g  0.46091621  0.2382129    D
8   h -1.26506123         NA <NA>
9   i -0.68685285  0.5490967    C
10  j -0.44566197  1.2947633    F
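
The example above starts from data frames that are already in memory. As a rough sketch of the full round trip, assuming the two tables live in CSV files, the block below writes two small files, reads them back with `read.csv`, and compares three `merge` variants. The file names `x.csv` and `y.csv`, the temporary directory, and the variable names are made up for illustration and are not part of the original data.

```r
# Hypothetical demo files in a temporary directory (names are illustrative only)
demo_dir <- tempfile("merge_demo")
dir.create(demo_dir)
write.csv(data.frame(id = c("a", "b", "c", "d"), valx = c(1.5, 2.5, 3.5, 4.5)),
          file.path(demo_dir, "x.csv"), row.names = FALSE)
write.csv(data.frame(id = c("b", "d", "e"), valy = c(10, 20, 30),
                     valz = c("P", "Q", "R")),
          file.path(demo_dir, "y.csv"), row.names = FALSE)

# Read both files into memory before merging
xf <- read.csv(file.path(demo_dir, "x.csv"), stringsAsFactors = FALSE)
yf <- read.csv(file.path(demo_dir, "y.csv"), stringsAsFactors = FALSE)

inner <- merge(xf, yf, by = "id")                # only ids present in both files
left  <- merge(xf, yf, by = "id", all.x = TRUE)  # every row of xf, NA where yf has no match
full  <- merge(xf, yf, by = "id", all = TRUE)    # every id from either file

dim(inner)  # 2 rows, 4 columns
dim(left)   # 4 rows, 4 columns
dim(full)   # 5 rows, 4 columns
```

Which variant is appropriate depends on whether unmatched rows should be kept; `all.x = TRUE` corresponds to the merge shown in this answer.
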
Artem
  • Thank you. Before I merge I have to read the data, and I get the error: no lines available in input. Please see the code below – gccd Aug 15 '18 at 18:48
  • `for (i in 1:length(fileL)) { for (j in 1:length(fileL[[i]])) { if (j == 1) { newFile <- read.delim(paste(dataFnsDir, fileL[[i]][j], sep = "/"), as.is = TRUE); print(fileL[[i]][j]) } else { newFile <- dplyr::bind_rows(newFile, read.delim(paste(dataFnsDir, fileL[[i]][j], sep = "/"), as.is = TRUE)) } }; tmpFn <- paste(dataFnsDir, "/", new_dataFns[i], ".tsv", sep = "") }` – gccd Aug 15 '18 at 18:48
  • Use `tryCatch`. Some files in the `fileL` list are probably missing or empty at the location you are loading from. It is better to post another question, as comments are too cramped for code: `for (i in 1:length(fileL)) { for (j in 1:length(fileL[[i]])) { tryCatch({ if (j == 1) { newFile <- read.delim(paste(dataFnsDir, fileL[[i]][j], sep = "/"), as.is = TRUE); print(fileL[[i]][j]) } else { newFile <- dplyr::bind_rows(newFile, read.delim(paste(dataFnsDir, fileL[[i]][j], sep = "/"), as.is = TRUE)) } }, error = function(e) NULL) } }` – Artem Aug 15 '18 at 20:10
  • Without a reproducible example it is quite difficult to help. Please have a look at [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Artem Aug 15 '18 at 20:15
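
As far as I know, `read.delim` raises "no lines available in input" when a file exists but has no content, so it may help to check for missing and empty files before reading. The sketch below is one possible way to do that in base R and then stack tables whose columns differ. The helpers `read_if_possible` and `stack_rows` and the demo file names are hypothetical, invented for this illustration; they are not from the question's code.

```r
# Hypothetical helper: return a data frame, or NULL if the file is missing,
# empty, or otherwise unreadable.
read_if_possible <- function(path) {
  if (!file.exists(path) || file.size(path) == 0) return(NULL)
  tryCatch(read.delim(path, as.is = TRUE),
           error = function(e) {
             message("Skipping ", path, ": ", conditionMessage(e))
             NULL
           })
}

# Hypothetical helper: row-bind data frames whose columns differ,
# filling the columns a table lacks with NA (base R only).
stack_rows <- function(tables) {
  tables <- Filter(Negate(is.null), tables)         # drop skipped files
  all_cols <- unique(unlist(lapply(tables, names)))
  padded <- lapply(tables, function(d) {
    for (col in setdiff(all_cols, names(d))) d[[col]] <- NA
    d[all_cols]                                     # same column order everywhere
  })
  do.call(rbind, padded)
}

# Demo with made-up files: two readable, one empty, one missing
demo_dir <- tempfile("read_demo")
dir.create(demo_dir)
write.table(data.frame(id = 1:2, a = c("u", "v")), file.path(demo_dir, "one.tsv"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(id = 3L, b = 9.5), file.path(demo_dir, "two.tsv"),
            sep = "\t", row.names = FALSE, quote = FALSE)
file.create(file.path(demo_dir, "empty.tsv"))

paths <- file.path(demo_dir, c("one.tsv", "empty.tsv", "two.tsv", "missing.tsv"))
combined <- stack_rows(lapply(paths, read_if_possible))
dim(combined)  # 3 rows, 3 columns (id, a, b)
```

If dplyr is available, `dplyr::bind_rows` can replace `stack_rows`, since it also fills missing columns with `NA`. Unlike an `error = function(e) NULL` handler on its own, this version reports which files were skipped.
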