
I am trying to create a new data frame, new_df, with columns taken from different data frames.

The columns are of unequal length, which I presume can be solved by replacing empty 'cells' with NA? However, this is above my current skill level, so any help will be much appreciated!

Packages:

library(tidyverse)
library(ggplot2)
library(here)
library(readxl)
library(gt)

I want to create new_df with columns from the following subsets:

Kube_liten$Unit_cm
Kube_Stor$Unit_cm
jpsmith
Novice
  • Novice, welcome to Stack Overflow. You will find a lot of friends here if you provide a minimal reproducible example of your problem ... and explain what you tried, what does not work, or why the output is not what you expect. Your question is overly broad, and the `libraries` you mention have nothing to do with your problem, i.e. combining vectors of different length. It is also not clear how your column names help us understand what you are aiming at. Only if you give us a bit to work with can we help. – Ray Nov 06 '22 at 15:45
  • Does this answer your question? [How to cbind or rbind different lengths vectors without repeating the elements of the shorter vectors?](https://stackoverflow.com/questions/3699405/how-to-cbind-or-rbind-different-lengths-vectors-without-repeating-the-elements-o) – jpsmith Nov 06 '22 at 15:50
  • Is there a common feature in those dataframes that could be used to link columns, i.e. a name or ID? If yes, then the operation is generally called joining - https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right (covers `merge` from base and join operations from dplyr) – margusl Nov 06 '22 at 15:51
  • You might wanna see this https://stackoverflow.com/questions/6988184/combining-two-data-frames-of-different-lengths – Aaqib Gulzar Nov 06 '22 at 16:02

2 Answers


Novice, we appreciate that you are new to R. But please study a few basics, in particular how vectors are recycled.

Your problem:

vec1 <- c(1,2,3)
vec2 <- c("A","B","C","D","E")

df <- data.frame(var1 = vec1, var2 = vec2)
Error in data.frame(var1 = vec1, var2 = vec2) : 
  arguments imply differing number of rows: 3, 5

You may "glue" vectors together with cbind - check out the warning. The problem of different vector length is not gone.

df <- cbind(vec1, vec2)
Warning message:
In cbind(vec1, vec2) :
  number of rows of result is not a multiple of vector length (arg 1)

What you get: vec1 is "recycled". In principle, R assumes you want to fill the missing places by repeating the values (which might not be what you want).

df
     vec1 vec2
[1,] "1"  "A" 
[2,] "2"  "B" 
[3,] "3"  "C" 
[4,] "1"  "D" 
[5,] "2"  "E" 

## you can convert this to a data frame, if you prefer that object structure
df <- as.data.frame(cbind(vec1, vec2))
Warning message:
In cbind(vec1, vec2) :
  number of rows of result is not a multiple of vector length (arg 1)
> df
  vec1 vec2
1    1    A
2    2    B
3    3    C
4    1    D
5    2    E
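
As a side note on the recycling shown above, the warning only appears when the longer length is not a multiple of the shorter one. The small sketch below (the vectors `a` and `b` are made up for illustration) shows that recycling happens silently when the lengths divide evenly, which is easy to miss:

```r
# Made-up vectors: 6 is an exact multiple of 2
a <- 1:2
b <- 1:6

# cbind() recycles `a` three times and gives no warning
m <- cbind(a, b)

# data.frame() also recycles silently in this case
d <- data.frame(a, b)

print(m)
print(d)
```

So the absence of a warning does not mean the columns had equal length to begin with.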

So your approach of padding the shorter vector with NA is valid (and possibly what you want). You are on the right track.

  1. determine the length of your longest vector
  2. insert NAs where needed (mind you, you may not always want them at the end)
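
The two steps above can be sketched as follows, reusing vec1 and vec2 from earlier. This is one possible way to do it, assuming the NAs should go at the end: assigning to `length()` in base R pads a vector with NA when it is lengthened.

```r
vec1 <- c(1, 2, 3)
vec2 <- c("A", "B", "C", "D", "E")

# Step 1: length of the longest vector
n <- max(length(vec1), length(vec2))

# Step 2: lengthen both vectors to n; the new positions are filled with NA
length(vec1) <- n
length(vec2) <- n

# Both vectors now have 5 elements, so data.frame() no longer complains
df <- data.frame(var1 = vec1, var2 = vec2)
print(df)
```

If the NAs belong somewhere other than the end, the positions have to be set by hand, since R cannot know which rows are meant to line up.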

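This problem has been asked on Stack Overflow before. Check out [How to cbind or rbind different lengths vectors without repeating the elements of the shorter vectors?](https://stackoverflow.com/questions/3699405/how-to-cbind-or-rbind-different-lengths-vectors-without-repeating-the-elements-o)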

Ray
  • Ray, thank you for your answer and feedback! The solution offered by @jpsmith turned out to work for me. I see a lot of the more experienced users asking for more 'studying of basics', like how vectors recycle. I consider this human interaction a part of my studying, as a last resort when other options, like Google, YouTube and books, fall short. I decided to re-study some of the basic concepts and that helped a lot. Thank you! – Novice Nov 07 '22 at 22:41

You can try the following, which extends the "short" vector with NA values:

col1 <- 1:9
col2 <- 1:12

col1[setdiff(col2, col1)] <- NA

data_comb <- data.frame(col1, col2)
# or
# data_comb <- cbind(col1, col2)

Output:

   col1 col2
1     1    1
2     2    2
3     3    3
4     4    4
5     5    5
6     6    6
7     7    7
8     8    8
9     9    9
10   NA   10
11   NA   11
12   NA   12

Since you didn't provide sample data or a desired output, I don't know if this will be the exact approach for your data.
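
One caveat: indexing with setdiff() works here because the values in col1 and col2 happen to equal their positions. For arbitrary data, padding by length is safer. Below is a rough sketch for the two columns named in the question. The data frames and their values are made up, since the real data was not posted, and the column names liten_cm and stor_cm are my own choice.

```r
# Hypothetical stand-ins for the asker's data frames (values are invented)
Kube_liten <- data.frame(Unit_cm = c(1.2, 1.5, 1.1))
Kube_Stor  <- data.frame(Unit_cm = c(4.8, 5.1, 5.0, 4.9, 5.3))

# Collect the columns in a named list
cols <- list(liten_cm = Kube_liten$Unit_cm,
             stor_cm  = Kube_Stor$Unit_cm)

# Length of the longest column
n <- max(lengths(cols))

# Lengthen every column to n (padded with NA), then build the data frame
new_df <- as.data.frame(lapply(cols, function(x) {
  length(x) <- n
  x
}))

print(new_df)
```

This places all NAs at the bottom of the shorter column. If the rows of the two data frames are meant to match on some ID, a join as suggested in the comments is probably the better tool.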

jpsmith