
I want to create a new data frame from existing ones for visualisation. However, the datasets I want to combine vary in length. Is there a way of combining them without losing any data points from the longer datasets?

info_summary <- data.frame(
  Croydon    = Readability.Croydon$Flesch,
  Kingston   = Readability.Kingston$Flesch,
  Lambeth    = Readability.Lambeth$Flesch,
  Merton     = Readability.Merton$Flesch,
  Richmond   = Readability.Richmond$Flesch,
  Sutton     = Readability.Sutton$Flesch,
  Wandsworth = Readability.Wandsworth$Flesch
)

When I ran that it came back with:

Error in data.frame(Croydon = Readability.Croydon$Flesch, Kingston = Readability.Kingston$Flesch,  : 
  arguments imply differing number of rows: 9567, 3197, 9583, 6392, 3198, 3199, 9598

If anyone has any suggestions on how to combine these into a new dataset, that would really help.

I hope to visualise this with a box plot comparing readability levels across the seven boroughs.

sarah laid
  • To combine the columns from the different datasets, you will need some common "ID"-type variable that you can then "merge" the datasets on. However, if all you want is to plot the data from different data.frames on the same visualisation, you don't really need to combine the data frames. The mechanics of how to do this will depend on the plotting library you are using. – AdroMine Mar 24 '22 at 20:00
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Mar 24 '22 at 20:13
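
As a minimal sketch of the first comment's point, assuming base R graphics: `boxplot()` accepts a named list of numeric vectors, so vectors of different lengths can be plotted side by side without building a data frame. The `flesch` list below is made-up stand-in data for the `Readability.<Borough>$Flesch` columns, not the asker's real values.

```r
# Hypothetical stand-ins for Readability.<Borough>$Flesch; lengths differ on purpose
set.seed(1)
flesch <- list(
  Croydon  = rnorm(12, mean = 50, sd = 10),
  Kingston = rnorm(5,  mean = 55, sd = 10),
  Lambeth  = rnorm(9,  mean = 45, sd = 10)
)

# boxplot() takes a named list directly: one box per element, no padding needed
stats <- boxplot(flesch, ylab = "Flesch score", main = "Readability by borough")

# Alternative: stack the list into long format (one row per score),
# with columns `values` and `ind`, which is the shape ggplot2 expects
long <- stack(flesch)
```

With the real data, the list would presumably be built from the seven `$Flesch` columns in the same way. The long format keeps every data point and avoids any padding.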

1 Answer


This is a simple solution that pads the shorter vectors with NA so they all have the same length; it could be extended into a function.

# Assuming these are your columns
a <- c(1, 5, 10)
b <- c(1, 4, 3, 12)
c <- c(1, 3)

# Create a named list of these vectors
var_list <- list(a = a, b = b, c = c)

# Find the length of the longest vector
n <- max(sapply(var_list, length))

# Extend each vector to length n; the new positions are filled with NA
var_list <- lapply(var_list, `length<-`, n)

# Convert the list to a data frame
df <- data.frame(var_list)
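
The steps above could be wrapped into a function along these lines. This is only a sketch: `pad_to_data_frame` is a hypothetical name, not an existing R function, and it assumes the input is a named list of atomic vectors.

```r
# Hypothetical helper: pad every vector in a named list with NA up to the
# length of the longest one, then combine them into a data frame
pad_to_data_frame <- function(vectors) {
  n <- max(lengths(vectors))
  padded <- lapply(vectors, function(v) {
    length(v) <- n   # extending a vector fills the new positions with NA
    v
  })
  data.frame(padded)
}

df <- pad_to_data_frame(list(a = c(1, 5, 10), b = c(1, 4, 3, 12), c = c(1, 3)))
```

Note that the rows of the result do not represent matched observations; the NA values are only padding. `boxplot(df)` ignores them when drawing one box per column.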
dvera