The code so far looks like this:

library(rio)  # provides import_list()
abc <- import_list(dir("MyData/", pattern = "*.xlsx",
        full.names = TRUE), rbind = TRUE, rbind_label = "source")

Using the "rio" package, this code imports many Excel files at once and stacks the tables one under the other. With rbind = TRUE the columns are matched by name, which avoids data ending up in the wrong columns (e.g. if some tables have more columns than others).

I want to have a FIRST column that contains the name of the Excel file, so that I know where each row comes from. However, there are two problems with rbind_label = "source":

  1. It creates a column, but the column holds the file's whole path (pretty long) rather than just the file name.
  2. The column is not at the beginning of the newly created table, but somewhere in the middle.

How can I solve these two problems?

camille
shymilk
  • I do similar operations pretty often using `purrr`, where I get a list of paths, then use `set_names` and regex to name the list by some extracted subset. `purrr::map_dfr` then lets me create a column of those names with its `.id` argument – camille Nov 22 '19 at 15:05
  • Similar to this post: https://stackoverflow.com/q/46299777/5325862 – camille Nov 22 '19 at 15:09

1 Answer

Assuming the source column is named source, this will make it the first column:

abc <- abc[c('source', setdiff(names(abc),'source'))]

This will change that column's values from the full path to the file name:

abc$source <- basename(abc$source)
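Here is a minimal sketch of both steps on a toy data frame, so it runs without any Excel files. The paths, column names and values are made up for illustration and only stand in for whatever import_list() returns:

```r
# Toy stand-in for the imported table; paths and values are hypothetical
abc <- data.frame(
  id     = 1:3,
  source = c("MyData/a.xlsx", "MyData/a.xlsx", "MyData/b.xlsx"),
  value  = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Step 1: move 'source' to the front, keeping the other columns in their order
abc <- abc[c("source", setdiff(names(abc), "source"))]

# Step 2: keep only the file name, dropping the directory part of the path
abc$source <- basename(abc$source)

print(names(abc))
print(unique(abc$source))
```

If the file extension should go as well, tools::file_path_sans_ext() can be applied to the same column afterwards.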
SmokeyShakers
  • Is it possible to include the basename(abc$source) command somehow when creating the list? I tried rbind_label = basename("Source"), but that doesn't work – shymilk Nov 22 '19 at 14:49
  • I haven't used the rio package, but it doesn't appear so to me. Looks like you need to import first, then alter with the code I provided. Sorry. – SmokeyShakers Nov 22 '19 at 15:05
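One possible way to label the rows while the tables are being read is to loop over the paths yourself and add the file name before binding. The sketch below is only an illustration of that idea, not something rio does for you: it uses base R, and it reads small temporary CSV files with read.csv() as a stand-in for the Excel reader so that it runs without any workbooks or extra packages. The folder, file names and columns are hypothetical.

```r
# Create two small CSV files in a temporary folder (stand-ins for the Excel files)
data_dir <- file.path(tempdir(), "MyData")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2, y = c("a", "b")),
          file.path(data_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(x = 3L, y = "c"),
          file.path(data_dir, "second.csv"), row.names = FALSE)

paths <- dir(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file and put its file name into a leading 'source' column
tables <- lapply(paths, function(p) {
  tab <- read.csv(p, stringsAsFactors = FALSE)
  cbind(source = basename(p), tab, stringsAsFactors = FALSE)
})

# Stack the tables; plain rbind() requires the same columns in every file
abc <- do.call(rbind, tables)
print(abc)
```

Because 'source' is added first inside the loop, it is already the first column after binding. If the files have different sets of columns, plain rbind() will fail, and a binder that fills missing columns, such as data.table::rbindlist(fill = TRUE) or dplyr::bind_rows(), would be needed instead.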