1

having a small issue binding some dfs together.

I used the following procedure to create simplified dfs:

PWI_1 <- read.xlsx("/Users/myname/Desktop/PrelimPWI/PWI_1_V1A.xlsx", colNames = TRUE, startRow = 2)
Slim_1 <- data.frame(PWI_1$ExperimentName,PWI_1$Subject, PWI_1$error,PWI_1$Block, PWI_1$Category, PWI_1$Trial,PWI_1$prompt1.RT)

...etc for the following files.

I then used this code to try and bind the dfs:

merged <- bind_rows(list(Slim_1,Slim_10,Slim_11...))

However, the dfs are concatenated to the right, instead of appended on the end in one long format df. *Note the PWI_V1x is the name of the experiment version, which needs to be lined up

enter image description here

I think the error is caused by the variable trimming process (ie creating a 'slim' df), but unfortunately the untrimmed files have different numbers of columns, so I get an error when trying to bind the original dfs. Any advice would be appreciated!

Machavity
  • 30,841
  • 27
  • 92
  • 100
  • 1
    If the columns have different names, bind_rows will 'concatenate to the right' as you say. Each df needs indentical column names. – SmokeyShakers Nov 26 '19 at 21:17
  • 1
    I think what is happening is that if you define data frames that way, the column names will be prefixed with the data frame name they came from. This is probably causing your column names not to line up when you call `bind_rows`. I would suggest using dplyr's `select` to subset the columns instead. – svenhalvorson Nov 26 '19 at 21:18
  • Convert your each data frame to long format then bind them together. It's probably a good idea to write a function then use `map` to loop through all files https://stackoverflow.com/a/59035745/786542 & https://stackoverflow.com/a/48105838/786542 – Tung Nov 26 '19 at 22:13

1 Answers1

1

bind_rows requires columns names to be the same. Instead of "slimming" your data frame, use dplyr::select so you pick out the same column names every time.

    Slim_PWI_1 <- read.xlsx("/Users/myname/Desktop/PrelimPWI/PWI_1_V1A.xlsx", colNames = TRUE, startRow = 2) %>% 
    select(ExperimentName, Subject, error, Block, Category, Trial, prompt1.RT)

then this should work:

merged <- bind_rows(Slim_PWI_1, ...)

Edit: If you have multiple files as you indicate, you can read and slim them all together like this:

    Slim_PWI_list <- dir(path = "/Users/myname/Desktop/PrelimPWI/", pattern = "PWI.*xlsx", full.names = TRUE) %>% 
map(~read.xlsx(., colNames = TRUE, startRow = 2)) %>% 
map(~select(., ExperimentName, Subject, error, Block, Category, Trial, prompt1.RT))
    merged <- bind_rows(Slim_PWI_list)
Arthur Yip
  • 5,810
  • 2
  • 31
  • 50