0

I have imported 90 excel files by creating a directory (dir) and using (lapply). I am not sure if this is called a nested list or not. After importing the excel files in a list, I can see that some files have four columns and some other files have five columns. My questions are:

  1. How can I cut the files with four columns from this list and paste it into a new list.
  2. How can I then combine all the similar column files within a list into a single file.
  3. How can I save those two files as two different excel sheets?
library(foreign)
setwd("F:\\Spring 2019\\Thesis_data\\Kam_Thesis\\data\\nontidal_water_level")

nontidal_list <- dir(pattern= ".xlsx")

nontidal_water_level <- lapply(nontidal_list, read_excel)
Heikki
  • 2,214
  • 19
  • 34
  • 1
    How do you define similar? Similar due to names, data or some other criterion? – NelsonGon Apr 19 '19 at 18:01
  • 2
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. You should ask just one question at a time. – MrFlick Apr 19 '19 at 18:09

2 Answers2

0

Since you didn't provide an example dataset, we'll pretend it looks like this:

> nontidal_water_level <- lapply(1:4, function(i) data.frame(matrix(1:12 * i, ncol = 3 + (i %% 2))))
> str(nontidal_water_level)
List of 4
 $ :'data.frame':   3 obs. of  4 variables:
  ..$ X1: int [1:3] 1 2 3
  ..$ X2: int [1:3] 4 5 6
  ..$ X3: int [1:3] 7 8 9
  ..$ X4: int [1:3] 10 11 12
 $ :'data.frame':   4 obs. of  3 variables:
  ..$ X1: int [1:4] 2 4 6 8
  ..$ X2: int [1:4] 10 12 14 16
  ..$ X3: int [1:4] 18 20 22 24
 $ :'data.frame':   3 obs. of  4 variables:
  ..$ X1: int [1:3] 3 6 9
  ..$ X2: int [1:3] 12 15 18
  ..$ X3: int [1:3] 21 24 27
  ..$ X4: int [1:3] 30 33 36
 $ :'data.frame':   4 obs. of  3 variables:
  ..$ X1: int [1:4] 4 8 12 16
  ..$ X2: int [1:4] 20 24 28 32
  ..$ X3: int [1:4] 36 40 44 48

For question number 1, you can find out the columns of each by looping over and using ncol, then putting the dataframes with the same number of columns into individual lists:

> ncolumns <- lapply(nontidal_water_level, ncol)
> nontidal_3col <- nontidal_water_level[ncolumns == 3]
> nontidal_4col <- nontidal_water_level[ncolumns == 4]

For question number 2, you can call rbind over all elements of each list:

> (nontidal_3col <- do.call(rbind, nontidal_3col))
  X1 X2 X3
1  2 10 18
2  4 12 20
3  6 14 22
4  8 16 24
5  4 20 36
6  8 24 40
7 12 28 44
8 16 32 48
> (nontidal_4col <- do.call(rbind, nontidal_4col))
  X1 X2 X3 X4
1  1  4  7 10
2  2  5  8 11
3  3  6  9 12
4  3 12 21 30
5  6 15 24 33
6  9 18 27 36

Finally, for question 3, you could use a package like writexl, but I would recommend just writing it out as a .csv:

write.csv(nontidal_3col, "my_3_column_file.csv")
write.csv(nontidal_4col, "my_4_column_file.csv")
C. Braun
  • 5,061
  • 19
  • 47
0

You can create a loop that does the job.

for the three steps:

if!require(plyr) install.packages(plyr) 
df_4col <- data.frame() # a dataframe that after function must contain all four columns data
df_5col <- data.frame() # the same eith five colums
df_4list <- c() # a list with the filenames
df_5list <- c() # a list with the filenames

for (i in 1:length(nontidal_list)){
df <- read_excel(nontidal_list[i])
if (length(df)==4) {
    df_4col<-plyr::rbind.fill(df_4col, df) 
    df_4_list<-append(df_4list, nontidal_list[i]} else
if (length(df)==5) {
    df_5col<-plyr::rbind.fill(df_4col, df) 
    df_5list<-append(df_5list, nontidal_list[i]} 
}